
1

Data Mining & Intrusion Detection

Shan Bai
Instructor: Dr. Yingshu Li

CSC 8712, Spring 08

2

Outline

Intrusion Detection

Data Mining

Data Mining in Intrusion Detection

Reference

3

What is an intrusion?

An intrusion can be defined as “any set of actions that attempt to compromise the integrity, confidentiality or availability of a resource”.

[Figure: Incidents reported to the Computer Emergency Response Team/Coordination Center per year, 1990-2002; vertical axis from 0 to 90,000]

[Figure: Spread of the SQL Slammer worm 10 minutes after its deployment]

4

Intrusion Examples

• Trojan horse
• Worm
• Address spoofing: a malicious user uses a fake IP address to send malicious packets to a target
• Many others…

• DoS: denial of service
• R2L: unauthorized access from a remote machine, e.g. guessing a password
• U2R: unauthorized access to local superuser (root) privileges, e.g. various "buffer overflow" attacks
• Probing: surveillance and other probing, e.g. port scanning

5

Intrusion Detection System (IDS)

Intrusion Detection System: a combination of software and hardware that attempts to perform intrusion detection and raises an alarm when a possible intrusion happens.

6

IDS Categories

Intrusion detection systems are split into two groups:
• Anomaly detection systems: identify malicious traffic based on deviations from established normal network behavior
• Misuse detection systems: identify intrusions based on a known pattern (signature) for the malicious activity

7

Anomaly Detection

[Figure: activity measures (CPU, process size) on a 0-90 scale, comparing the normal profile with abnormal values that indicate a probable intrusion]

• Baseline the normal traffic, then look for things that are out of the norm
• Relatively high false positive rate: anomalies can just be new normal activities

8

Misuse Detection

[Figure: activities are pattern-matched against known intrusion patterns; a match is reported as an intrusion]

• Look for known indicators: ICMP scans, port scans, connection attempts; CPU, RAM and I/O utilization; file system activity; modification of system files; permission modifications
• Can’t detect new attacks
• Example: if (src_ip == dst_ip) then “land attack”
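The land-attack rule above can be written as a tiny signature matcher. This is only a minimal sketch of the idea, not how any particular IDS implements it; the `Packet` record, the `SIGNATURES` table and the `match_signatures` helper are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    """Hypothetical, simplified packet header used only for this illustration."""
    src_ip: str
    dst_ip: str
    dst_port: int


# Each signature is a (name, predicate) pair.
# Only the land-attack rule comes from the slide; the structure is an assumption.
SIGNATURES = [
    ("land attack", lambda p: p.src_ip == p.dst_ip),
]


def match_signatures(packet):
    """Return the names of all known signatures that the packet matches."""
    return [name for name, predicate in SIGNATURES if predicate(packet)]
```

A packet whose source and destination addresses are equal is reported as a land attack; anything that matches no stored signature passes silently, which is exactly why misuse detection misses new attacks.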

9

Goal of Intrusion Detection Systems (IDS): to detect an intrusion as it happens and be able to respond to it.

• False positives: a false positive is a situation where something abnormal (as defined by the IDS) happens, but it is not an intrusion. With too many false positives, the user will quit monitoring the IDS because of the noise.
• False negatives: a false negative is a situation where an intrusion is really happening, but the IDS doesn't catch it.
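The two error types can be measured once a set of events has been labeled by hand. The sketch below assumes two parallel lists of booleans (ground truth and IDS alarms); the function name `alarm_error_rates` is hypothetical.

```python
def alarm_error_rates(is_intrusion, alarm_raised):
    """Return (false_positive_rate, false_negative_rate) for paired observations.

    is_intrusion[i] is the ground truth for event i,
    alarm_raised[i] says whether the IDS raised an alarm for it.
    """
    pairs = list(zip(is_intrusion, alarm_raised))
    false_positives = sum(1 for truth, alarm in pairs if alarm and not truth)
    false_negatives = sum(1 for truth, alarm in pairs if truth and not alarm)
    normal_events = sum(1 for truth, _ in pairs if not truth)
    intrusions = sum(1 for truth, _ in pairs if truth)
    # Guard against empty classes so the function never divides by zero.
    fp_rate = false_positives / normal_events if normal_events else 0.0
    fn_rate = false_negatives / intrusions if intrusions else 0.0
    return fp_rate, fn_rate
```

Because intrusions are rare, even a small false positive rate over the many normal events can produce far more false alarms than true ones.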

10

Outline

Intrusion Detection

Data Mining

Data Mining in Intrusion Detection

Reference

11

Why do we need Data Mining?

• Despite the enormous amount of data, particular events of interest are still quite rare: their frequency ranges from 0.1% to less than 10%
• We are drowning in data but starving for knowledge

12

Data Mining vs. KDD

• Knowledge Discovery in Databases (KDD): the whole process of finding useful information and patterns in data
• Data Mining: the use of algorithms to extract the information and patterns derived by the KDD process
• Data mining is the core of the knowledge discovery process

13

KDD Process

• Selection: obtain data from various sources
• Preprocessing: cleanse data
• Transformation: convert to a common format; transform to a new format
• Data Mining: obtain desired results
• Interpretation/Evaluation: present results to the user in a meaningful manner

14

Data Mining: A KDD Process

Data mining: core of the knowledge discovery process

[Figure: KDD pipeline: Databases → Data Cleaning and Data Integration → Data Warehouse → Selection → Task-relevant Data → Data Mining → Pattern Evaluation]

15

Typical Data Mining Architecture

[Figure: components of the architecture: Graphical user interface; Pattern evaluation; Data mining engine; Knowledge-base; Database or data warehouse server; Data cleaning & data integration, filtering; Databases; Data Warehouse]

16

Outline

Intrusion Detection

Data Mining

Data Mining in Intrusion Detection

Reference

17

Network intrusion detection

The number of intrusions on the network is typically a very small fraction of the total network traffic.

18

Why Can Data Mining Help?

• Learn from traffic data
  - Supervised learning: learn precise models from past intrusions
  - Unsupervised learning: identify suspicious activities
• Maintain models on dynamic data
• Correlation of suspicious events across network sites
  - Helps detect sophisticated attacks not identifiable by single-site analyses
• Analysis of long-term data (months/years)
  - Uncover suspicious stealth activities (e.g. insiders leaking/modifying information)

19

Intrusion Detection

Traditional intrusion detection system (IDS) tools (e.g. SNORT) are based on signatures of known attacks.

Limitations:
• The signature database has to be manually revised for each new type of discovered intrusion
• They cannot detect emerging cyber threats
• Substantial latency in the deployment of newly created signatures across the computer system

20

Data Mining for Intrusion Detection: Techniques and Applications

• Frequent pattern mining
• Classification
• Clustering
• Mining data streams

21

Frequent pattern mining

• Frequent patterns: patterns that occur frequently in a database
• Mining frequent patterns: finding regularities
• Process of mining frequent patterns for intrusion detection
  - Phase I: mine a repository of normal frequent itemsets for attack-free data
  - Phase II: find frequent itemsets in the last n connections and compare the patterns to the normal profile
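The two phases can be sketched as follows. This is an illustrative assumption of how the comparison might look, not the method of any specific system: itemsets are enumerated by brute force up to a small size (no Apriori pruning), each connection is a tuple of `attribute=value` strings, and the names `frequent_itemsets` and `suspicious_itemsets` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(records, min_support, max_size=2):
    """Brute-force count of all itemsets up to max_size; keep those meeting min_support."""
    counts = Counter()
    for record in records:
        items = sorted(set(record))
        for size in range(1, max_size + 1):
            for combo in combinations(items, size):
                counts[frozenset(combo)] += 1
    threshold = min_support * len(records)
    return {itemset for itemset, count in counts.items() if count >= threshold}


def suspicious_itemsets(normal_records, recent_records, min_support):
    """Phase I builds the normal profile; Phase II reports recent itemsets missing from it."""
    normal_profile = frequent_itemsets(normal_records, min_support)   # Phase I
    recent = frequent_itemsets(recent_records, min_support)           # Phase II
    return recent - normal_profile
```

Itemsets that are frequent in the last n connections but absent from the attack-free profile are the candidates worth a closer look.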

22

Frequent pattern mining

Apriori
• Any subset of a frequent itemset must also be frequent (an anti-monotone property)
  - A transaction containing {beer, diaper, nuts} also contains {beer, diaper}
  - If {beer, diaper, nuts} is frequent, {beer, diaper} must also be frequent
• No superset of any infrequent itemset should be generated or tested
  - Many item combinations can be pruned
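A compact level-wise sketch of Apriori shows where the pruning happens. It is a teaching version under simple assumptions (transactions as Python sets, an absolute minimum count instead of a relative support); the function name `apriori` and its parameters are chosen here for illustration.

```python
from itertools import combinations


def apriori(transactions, min_count):
    """Return {itemset: count} for every itemset occurring in at least min_count transactions."""
    transactions = [frozenset(t) for t in transactions]
    # Level 1 candidates: every single item.
    current = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while current:
        # Count each candidate of size k.
        counts = {c: sum(1 for t in transactions if c <= t) for c in current}
        level = {c: n for c, n in counts.items() if n >= min_count}
        frequent.update(level)
        k += 1
        # Join step: combine frequent itemsets into candidates of size k.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: drop any candidate that has an infrequent (k-1)-subset.
        current = {
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent
```

The prune step is the anti-monotone property in action: a candidate is never counted if one of its subsets already failed the threshold.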

23

Sequential Pattern Analysis

• Models sequence patterns
• (Temporal) order is important in many situations
  - Time-series databases and sequence databases
  - Frequent patterns → (frequent) sequential patterns
• Sequential patterns for intrusion detection
  - Capture the signatures for attacks in a series of packets

24

Sequential Pattern Mining

Given a set of sequences, find the complete set of frequent subsequences
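The basic operation behind this task is counting how many sequences contain a candidate pattern as an ordered (not necessarily contiguous) subsequence. The sketch below shows only that support count, not a full mining algorithm; `is_subsequence` and `sequence_support` are hypothetical helper names.

```python
def is_subsequence(pattern, sequence):
    """True if the events of pattern appear in sequence in the same order (gaps allowed)."""
    remaining = iter(sequence)
    # 'event in remaining' advances the iterator, so order is enforced.
    return all(event in remaining for event in pattern)


def sequence_support(pattern, sequences):
    """Number of sequences that contain pattern as a subsequence."""
    return sum(1 for sequence in sequences if is_subsequence(pattern, sequence))
```

A pattern is frequent when its support count reaches a chosen threshold; a miner would generate candidate patterns and keep those that pass this test.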

25

Apriori Property in Sequences

26

Classification: A Two-Step Process

• Model construction: describe a set of predetermined classes
  - Training dataset: tuples for model construction
  - Each tuple/sample belongs to a predefined class
  - Classification rules, decision trees, or math formulae
• Model application: classify unseen objects
  - Estimate the accuracy of the model using an independent test set
  - If the accuracy is acceptable, apply the model to classify data tuples with unknown class labels

27

Classification

28

Classification: Decision Tree

• A node in the tree: a test of some attribute
• A branch: a possible value of the attribute
• Classification
  - Start at the root
  - Test the attribute
  - Move down the tree branch
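The classification walk can be sketched in a few lines. The tree below is hand-built and entirely hypothetical (attribute names, values and class labels are made up for illustration); a real tree would be learned from training data.

```python
# Hypothetical hand-built tree: internal nodes are dicts, leaves are class labels.
TREE = {
    "attribute": "protocol",
    "branches": {
        "icmp": "probe",
        "tcp": {
            "attribute": "flag",
            "branches": {"SYN": "suspicious", "ACK": "normal"},
        },
    },
}


def classify(node, record):
    """Walk from the root to a leaf, following the branch for each tested attribute."""
    while isinstance(node, dict):
        value = record[node["attribute"]]   # test the attribute at this node
        node = node["branches"][value]      # move down the matching branch
    return node
```

Each loop iteration is one "test the attribute, move down the branch" step; the loop ends when a leaf label is reached.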

29

Neural classification: HIDE

“A hierarchical network intrusion detection system using statistical processing and neural network classification” by Zheng et al.

Five major components:
• Probes collect traffic data
• Event preprocessor preprocesses traffic data and feeds the statistical model
• Statistical processor maintains a model for normal activities and generates vectors for new events
• Neural network classifies the vectors of new events
• Post processor generates reports

30

Clustering

What Is Clustering?
• Group data into clusters
  - Similar to one another within the same cluster
  - Dissimilar to the objects in other clusters
  - Unsupervised learning: no predefined classes

31

Clustering

What Is a Good Clustering?
• High intra-class similarity and low inter-class similarity
  - Depending on the similarity measure
• The ability to discover some or all of the hidden patterns

32

Clustering

Clustering Approaches
• Partitioning algorithms
  - Partition the objects into k clusters
  - Iteratively reallocate objects to improve the clustering
• Hierarchy algorithms
  - Agglomerative: each object is a cluster; merge clusters to form larger ones
  - Divisive: all objects are in a cluster; split it up into smaller clusters

33

Clustering

K-Means Example
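The original slide showed K-Means as a figure. As a stand-in, here is a minimal pure-Python sketch of the algorithm; it assumes the caller supplies the initial centroids (real implementations usually pick them randomly) and runs a fixed number of iterations. The function name `k_means` and its parameters are chosen for illustration.

```python
def k_means(points, centroids, iterations=10):
    """Plain k-means on tuples of numbers; returns (centroids, clusters)."""
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: put every point in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for point in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(point, centroids[i])),
            )
            clusters[nearest].append(point)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            tuple(sum(coords) / len(cluster) for coords in zip(*cluster))
            if cluster else centroid
            for cluster, centroid in zip(clusters, centroids)
        ]
    return centroids, clusters
```

This is the "iteratively reallocate objects" idea of partitioning algorithms: the two steps repeat until the assignment stops changing, which the fixed iteration count approximates here.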

34

Mining Data Streams for Intrusion Detection

• Maintaining profiles of normal activities
  - The profiles of normal activities may drift
• Identifying novel attacks
  - Identifying clusters and outliers in traffic data streams
• Reduce the future alarm load by writing filtering rules that automatically discard well-understood false positives

35

Data Mining for Intrusion Detection

Misuse detection
• Predictive models are built from labeled data sets (instances are labeled as “normal” or “intrusive”)
• These models can be more sophisticated and precise than manually created signatures
• Recent research, e.g. JAM (Java Agents for Metalearning)

36

Misuse Detection

[Figure: activities are pattern-matched against known intrusion patterns; a match is reported as an intrusion]

• Look for known indicators: ICMP scans, port scans, connection attempts; CPU, RAM and I/O utilization; file system activity; modification of system files; permission modifications
• Can’t detect new attacks
• Example: if (src_ip == dst_ip) then “land attack”

37

JAM (Java Agents for Metalearning)

JAM (developed at Columbia University) uses data mining techniques to discover patterns of intrusions. It then applies a meta-learning classifier to learn the signature of attacks.

The association rules algorithm determines relationships between fields in the audit trail records, and the frequent episodes algorithm models sequential patterns of audit events. Features are then extracted from both algorithms and used to compute models of intrusion behavior.

The classifiers build the signature of attacks; thus, data mining in JAM builds a misuse detection model.

Classifiers in JAM are generated by running a rule learning program on training data of system usage. After training, the resulting classification rules are used to recognize anomalies and detect known intrusions.

The system has been tested with data from Sendmail-based attacks and with network attacks using tcpdump data.

38

Data Mining for Intrusion Detection

Anomaly detection
• Identifies anomalies as deviations from “normal” behavior
• E.g. ADAM (Audit Data Analysis and Mining), MINDS (MINnesota INtrusion Detection System)

39

Anomaly Detection

[Figure: activity measures (CPU, process size) on a 0-90 scale, comparing the normal profile with abnormal values that indicate a probable intrusion]

• Baseline the normal traffic, then look for things that are out of the norm
• Relatively high false positive rate: anomalies can just be new normal activities

40

ADAM: Audit Data Analysis and Mining

Detecting Intrusion by Data Mining

• Combination of association rules and classification rules
• First, ADAM collects known frequent datasets with an off-line algorithm
• Second, ADAM runs an online algorithm
  - Finds the last frequent connection records
  - Compares them with the known mined data
  - Discards those which seem to be normal
  - Forwards the suspicious ones to the classifier
• The trained classifier then classifies the suspicious data as one of the following:
  - Known type of attack
  - Unknown type of attack
  - False alarm

41

ADAM: Detecting Intrusion by Data Mining

42

ADAM: Audit Data Analysis and Mining

ADAM has two phases in its model:

• 1st phase: train the classifier
  - Offline process
  - Takes place only once, before the main experiment
• 2nd phase: use the trained classifier
  - The trained classifier is then used to detect anomalies
  - Online process

43

The MINDS Project

MINDS: MINnesota INtrusion Detection System

• Learning from Rare Class: building rare class prediction models
• Anomaly/outlier detection
• Summarization of attacks using association pattern analysis

TID  Items
1    Bread, Coke, Milk
2    Beer, Bread
3    Beer, Coke, Diaper, Milk
4    Beer, Bread, Diaper, Milk
5    Coke, Diaper, Milk

Rules discovered:
{Milk} --> {Coke}
{Diaper, Milk} --> {Beer}
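The strength of the two discovered rules can be checked against the five transactions in the table. The sketch uses the standard definitions of support and confidence; the helper names `support` and `confidence` are chosen for illustration.

```python
TRANSACTIONS = [
    {"Bread", "Coke", "Milk"},
    {"Beer", "Bread"},
    {"Beer", "Coke", "Diaper", "Milk"},
    {"Beer", "Bread", "Diaper", "Milk"},
    {"Coke", "Diaper", "Milk"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item of itemset."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def confidence(lhs, rhs, transactions):
    """Of the transactions containing lhs, the fraction that also contain rhs."""
    return support(lhs | rhs, transactions) / support(lhs, transactions)
```

With this data, {Milk} --> {Coke} holds in 3 of the 4 transactions that contain Milk, and {Diaper, Milk} --> {Beer} holds in 2 of the 3 transactions that contain both Diaper and Milk.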

44

MINDS - Learning from Rare Class

• Problem: building models for rare network attacks (mining a needle in a haystack)
  - Standard data mining models are not suitable for rare classes
  - Models must be able to handle skewed class distributions
  - Learning from data streams: intrusions are sequences of events

45

MINDS - Anomaly Detection

• Detect novel attacks/intrusions by identifying them as deviations from “normal”, i.e. anomalous, behavior
  - Identify normal behavior
  - Construct a useful set of features
  - Define a similarity function
  - Use an outlier detection algorithm
• Outlier detection algorithms:
  - Nearest neighbor approach
  - Density-based schemes
  - Unsupervised Support Vector Machines (SVM)
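As an illustration of the nearest neighbor approach, the sketch below scores each point by the distance to its k-th nearest neighbor, so isolated points get high scores. This is a generic textbook scheme under the assumption of numeric feature vectors and Euclidean distance; it is not claimed to be the exact algorithm used in MINDS, and the name `knn_outlier_scores` is hypothetical.

```python
import math


def knn_outlier_scores(points, k=2):
    """Score each point by the Euclidean distance to its k-th nearest neighbor.

    Requires more than k points. Larger scores mean more isolated points.
    """
    scores = []
    for i, point in enumerate(points):
        distances = sorted(
            math.dist(point, other) for j, other in enumerate(points) if j != i
        )
        scores.append(distances[k - 1])
    return scores
```

Connections can then be ranked by score, and only the top-ranked ones handed to an analyst.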

46

Experimental Evaluation

[Figure: MINDS processing pipeline: network → net-flow data collected using CISCO routers → data preprocessing → MINDS anomaly detection → anomaly scores → association pattern analysis. An open source signature-based network IDS (www.snort.org) is also part of the setup.]

• 10-minute cycle, 2 million connections
• Anomaly detection is applied 4 times a day, on a 10-minute time window

• Publicly available data set: DARPA 1998 Intrusion Detection Evaluation Data Set, prepared and managed by MIT Lincoln Lab, which includes a wide variety of intrusions simulated in a military network environment
• Real network data from the University of Minnesota

47

MINDS - Framework for Mining Associations

[Figure: MINDS association analysis module. The anomaly detection system outputs ranked connections, labeled attack or normal; a discriminating association pattern generator derives rules from them, and the rules update the knowledge base.]

R1: TCP, DstPort=1863 → Attack
…
R100: TCP, DstPort=80 → Normal

1. Build normal profile
2. Study changes in normal behavior
3. Create attack summary
4. Detect misuse behavior
5. Understand nature of the attack

48

Discovered Real-life Association Patterns

Rule 1: SrcIP=XXXX, DstPort=80, Protocol=TCP, Flag=SYN, NoPackets: 3, NoBytes: 120…180 (c1=256, c2=1)

Rule 2: SrcIP=XXXX, DstIP=YYYY, DstPort=80, Protocol=TCP, Flag=SYN, NoPackets: 3, NoBytes: 120…180 (c1=177, c2=0)

• At first glance, Rule 1 appears to describe a Web scan
• Rule 2 indicates an attack on a specific machine
• Both rules together indicate that a scan is performed first, followed by an attack on a specific machine identified as vulnerable by the attacker

49

Discovered Real-life Association Patterns

DstIP=ZZZZ, DstPort=8888, Protocol=TCP (c1=369, c2=0)
DstIP=ZZZZ, DstPort=8888, Protocol=TCP, Flag=SYN (c1=291, c2=0)

• This pattern indicates an anomalously high number of TCP connections on port 8888 involving machine ZZZZ
• Follow-up analysis of connections covered by the pattern indicates that this could be a machine running a variation of the Kazaa file-sharing protocol
• Having an unauthorized application increases the vulnerability of the system

50

Discovered Real-life Association Patterns… (ctd)

SrcIP=XXXX, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=4, NoBytes=189…200 (c1=582, c2=2)
SrcIP=XXXX, DstPort=12345, NoPackets=4, NoBytes=189…200 (c1=580, c2=3)
SrcIP=YYYY, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=3, NoBytes=144 (c1=694, c2=3)
……

• This pattern indicates a large number of scans on ports 27374 (which is a signature for the SubSeven worm) and 12345 (which is a signature for the NetBus worm)
• Further analysis showed that no fewer than five machines were scanning for one or both of these ports in any time window

51

Discovered Real-life Association Patterns… (ctd)

DstPort=6667, Protocol=TCP (c1=254, c2=1)

• This pattern indicates an unusually large number of connections on port 6667 detected by the anomaly detector
• Port 6667 is where IRC (Internet Relay Chat) is typically run
• Further analysis reveals that there are many small packets from/to various IRC servers around the world
• Although IRC traffic is not unusual, the fact that it is flagged as anomalous is interesting
• This might indicate that the IRC server has been taken down (by a DoS attack, for example) or that it is a rogue IRC server (it could be involved in some hacking activity)

52

Discovered Real-life Association Patterns… (ctd)

DstPort=1863, Protocol=TCP, Flag=0, NoPackets=1, NoBytes<139 (c1=498, c2=6)
DstPort=1863, Protocol=TCP, Flag=0 (c1=587, c2=6)
DstPort=1863, Protocol=TCP (c1=606, c2=8)

• This pattern indicates a large number of anomalous TCP connections on port 1863
• Further analysis reveals that the remote IP block is owned by Hotmail
• Flag=0 is unusual for TCP traffic

53

MINDS Conclusion
• Data mining based algorithms are capable of detecting intrusions that cannot be detected by state-of-the-art signature-based methods
• SNORT has static knowledge, manually updated by human analysts
• MINDS anomaly detection algorithms are adaptive in nature
• MINDS anomaly detection algorithms can also be effective in detecting anomalous behavior originating from a compromised or infected machine

MINDS Research
• Defining normal behavior
• Feature extraction
• Similarity functions
• Outlier detection
• Result summarization
• Detection of attacks originating from multiple sites

• Worm/virus detection after infection
• Insider attack: policy violation
• Outsider attack: network intrusion

54

IDS Using both Misuse and Anomaly Detection: RIDS-100

• RIDS (Rising Intrusion Detection System) is provided by Rising Tech, a leader in antivirus and content security software and services in China
• The company is a leading provider of client, gateway and server security solutions for virus protection, firewall and intrusion detection technologies, and of security services to enterprises and service providers around China
• RIDS makes use of both intrusion detection techniques: misuse detection and anomaly detection
• A distance-based outlier detection algorithm is used to detect deviating behavior among the collected network data
• For misuse detection, it has a very large set of collected data patterns which can be matched against scanned network data
• This large set of data patterns is scanned using a data mining classification algorithm (decision tree)

http://www.rising-global.com

55

A cooperative anomaly and intrusion detection system (CAIDS)

Built with a network-based intrusion detection system (NIDS) and an anomaly detection system (ADS) operating interactively through a signature generator

56

A cooperative anomaly and intrusion detection system (CAIDS)

• A frequent episode rule (FER) is generated out of a collection of frequent episodes. The FER is defined over episode sequences with multiple connection events.
• For example, we envision a window where we observe a 3-event sequence: E, D and F. An FER is generated as E → D, F with confidence level freq(a ∪ b)/freq(a) = 0.8, where a represents the event E on the LHS and b corresponds to the two events D and F on the RHS of the rule.
• If a occurs with a frequency of 5% and the joint event a and b occurs with a frequency of 4%, there is a (0.04/0.05) = 80% chance that D and F will follow in the same window.
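The confidence of an episode rule can be estimated by counting windows. The sketch below assumes the standard definition (windows containing the whole episode divided by windows containing the LHS) and represents each window as a list of event names; `contains_in_order` and `fer_confidence_and_support` are hypothetical helpers.

```python
def contains_in_order(events, window):
    """True if events occur in window in the given order (other events may intervene)."""
    remaining = iter(window)
    return all(event in remaining for event in events)


def fer_confidence_and_support(lhs, rhs, windows):
    """Return (confidence, support) of the episode rule lhs -> rhs over the windows."""
    lhs_count = sum(1 for w in windows if contains_in_order(lhs, w))
    joint_count = sum(1 for w in windows if contains_in_order(lhs + rhs, w))
    confidence = joint_count / lhs_count if lhs_count else 0.0
    support = joint_count / len(windows) if windows else 0.0
    return confidence, support
```

If E appears in five windows and is followed by D and then F in four of them, the rule E → D, F gets a confidence of 4/5 = 0.8, matching the ratio in the example.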

57

A cooperative anomaly and intrusion detection system (CAIDS)

• In practice, the event E could be an authentication service characterized by two attributes (service = authentication, flag = SF)
• The events D, F may be two sequential smtp requests, denoted by (service = smtp)
• Thus we can derive an FER with a confidence level of c = 80% that two smtp services will follow the authentication service within a window w = 2 sec. The three joint traffic events account for a support level s = 10% out of all the network connections being evaluated. This FER is formally stated as follows:

(service = authentication) → (service = smtp), (service = smtp) (0.8, 0.1, 2 sec)    (1)

58

A cooperative anomaly and intrusion detection system (CAIDS)

• An association rule is aimed at finding interesting intra-relationships inside a single connection record
• In general, an FER is specified by the following expression:

L1, L2, …, Ln → R1, …, Rm (c, s, window)    (2)

• Li (1 ≤ i ≤ n) and Rj (1 ≤ j ≤ m) are ordered traffic connection events
• We call L1, L2, …, Ln the LHS episode and R1, …, Rm the RHS of the episode rule

59

A cooperative anomaly and intrusion detection system (CAIDS)

[Figure: Architecture of the CAIDS simulator, built with a 2000-signature Snort and an anomaly detection subsystem (ADS) with 60 FERs after 2 weeks of rule training over the Lincoln Lab IDS evaluation dataset]

60

Conclusion

• In this report we have studied the basic concepts and some classic system models in this area, such as ADAM and MINDS
• We have summarized those system models, their technologies and their validation methods
• We hope to give an overview of current developments in this area and of how data mining is evolving into the field of network intrusion detection

61

Reference

• DARPA 1998 data set. A cleansed set is in KDD Cup ’99. The DARPA 1991 data set is also available: http://www.ll.mit.edu/IST/ideval/data/data_index.html
• Daniel Barbara, Julia Couto, Sushil Jajodia, Leonard Popyack, Ningning Wu. “ADAM: Detecting Intrusions by Data Mining”. Proceedings of the 2001 IEEE Workshop on Information Assurance and Security, United States Military Academy, West Point, NY, 5-6 June 2001.
• Zhang, J. and Zulkernine, M. 2006. A Hybrid Network Intrusion Detection Technique Using Random Forests. In Proceedings of the First International Conference on Availability, Reliability and Security (April 20-22, 2006).
• W. Lee et al. A data mining framework for building intrusion detection models. In Information and System Security, Vol. 3, No. 4, 2000.
• Ertoz, L. et al. MINDS - Minnesota Intrusion Detection System. Next Generation Data Mining, Chapter 3, 2004.
• Lu, C.-T., Boedihardjo, A. P., Manalwar, P. Exploiting efficient data mining techniques to enhance intrusion detection systems. IEEE International Conference on Information Reuse and Integration (IRI-2005), 15-17 Aug. 2005, pp. 512-517.
• Sal Stolfo, Andreas Prodromidis, Shelley Tselepis, Wenke Lee, Dave Fan, and Phil Chan. (Honorable mention (runner-up) for Best Paper Award in Applied Research Category.) In Proceedings of the Third International Conference on Knowledge Discovery and Data Mining (KDD 97), Newport Beach, CA, August 1997.

62

Questions & Comments

2

Outline

Intrusion Detection

Data Mining

Data Mining in Intrusion Detection

Reference

3

What is an intrusion

An intrusion can be defined as ldquoany set of actions that attempt to compromise the Integrity confidentiality or availability of a resourcerdquo

0

10000

20000

30000

40000

50000

60000

70000

80000

90000

1 2 3 4 5 6 7 8 9 10 11 12 13 1990 1991 1992 1993 1994 1995 1996 1997 1998 1999 2000 2001 2002

Incidents Reported to Computer Emergency Response

TeamCoordination Center

Spread of SQL Slammer worm 10 minutes

after its deployment

4

Intrusion Examples Trojan horse worm Address spoofing

a malicious user uses a fake IP address to send malicious packets to a target

Many othershellip

DOS denial-of-service

R2L unauthorized access from a

remote machine eg guessing password

U2R unauthorized access to local

super user (root) privileges eg various ``buffer overflow attacks

Probing surveillance and other probing

eg port scanning

5

Intrusion Detection System (IDS)

Intrusion Detection System combination of software and hardware that attempts to

perform intrusion detection raises the alarm when possible intrusion happens

6

IDS Categories

Intrusion detection systems are split into two groups Anomaly detection systems

Identify malicious traffic based on deviations from established normal network

Misuse detection systems Identify intrusions based on a known pattern

(signatures) for the malicious activity

7

Anomaly Detection

activity measures

0102030405060708090

CPU ProcessSize

normal profileabnormal

probable intrusion

Relatively high false positive rate - anomalies can just be new normal activities

baseline the normal traffic and then look for things that are out of the norm

8

Misuse Detection

Intrusion Patterns

activities

pattern matching

intrusion

Canrsquot detect new attacks

Example if (src_ip == dst_ip) then ldquoland attackrdquo

look for known indicators ICMP Scans port scans connection attempts CPU RAM IO Utilization File system activity modification of system files permission modifications

9

Goal of Intrusion Detection Systems (IDS) To detect an intrusion as it happens and be able to respond to it

False positives A false positive is a situation where something abnormal (as

defined by the IDS) happens but it is not an intrusion Too many false positives

User will quit monitoring IDS because of noise False negatives

A false negative is a situation where an intrusion is really happening but IDS doesnt catch it

10

Outline

Intrusion Detection

Data Mining

Data Mining in Intrusion Detection

Reference

11

Why do we need Data Mining

Despite the enormous amount of data particular events of interest are still quite rare frequency ranges from 01 to less than 10

We are drowning in data but starving for knowledge1048714

12

Data Mining vs KDD

Knowledge Discovery in Databases (KDD) The whole process of finding useful information and patterns in data

Data Mining Use of algorithms to extract the information and patterns derived by the KDD process

Data mining is the core of the knowledge discovery process

13

KDD Process

Selection Obtain data from various sources Preprocessing Cleanse data Transformation Convert to common format Transform

to new format Data Mining Obtain desired results InterpretationEvaluation Present results to user in

meaningful manner

14

Data Mining A KDD Processndash Data mining core of

knowledge discovery process

Data Cleaning

Data Integration

Databases

Data Warehouse

Task-relevant Data

Selection

Data Mining

Pattern Evaluation

15

Typical Data Mining Architecture

Data Warehouse

Data cleaning amp data integration Filtering

Databases

Database or data warehouse server

Data mining engine

Pattern evaluation

Graphical user interface

Knowledge-base

16

Outline

Intrusion Detection

Data Mining

Data Mining in Intrusion Detection

Reference

17

Network intrusion detection

Number of intrusions on the network is typically a very small fraction of the total network traffic

18

Why Can Data Mining Help

Learn from traffic data Supervised learning learn precise models from past intrusions Unsupervised learning identify suspicious activities

Maintain models on dynamic data

Correlation of suspicious events across network sites Helps detect sophisticated attacks not identifiable by single site analyses

Analysis of long term data (monthsyears) Uncover suspicious stealth activities (eg insiders leakingmodifying

information)

19

Intrusion Detection

Traditional intrusion detection system IDS tools (eg SNORT) are based on signatures of known attacks

LimitationsSignature database has to be manually revised

for each new type of discovered intrusionThey cannot detect emerging cyber threatsSubstantial latency in deployment of newly created

signatures across the computer system

20

Data Mining for Intrusion Detection Techniques and Applications

Frequent pattern mining Classification Clustering Mining data streams

21

Patterns that occur frequently in a database

Mining Frequent patterns ndash finding regularities

Process of Mining Frequent patterns for intrusion de

tection Phase I mine a repository of normal frequent itemsets for a

ttack-free data

Phase II find frequent itemsets in the last n connections an

d compare the patterns to the normal profile

Frequent pattern mining

22

Frequent pattern mining

Apriori bull Any subset of a frequent itemset must be also freque

nt mdash an anti-monotone propertyndash A transaction containing beer diaper nuts alsocontains beer diaperndash beer diaper nuts is frequent beer diaper mustalso be frequent

bull No superset of any infrequent itemset should be generated or tested

ndash Many item combinations can be pruned

23

Sequential Pattern Analysis

Models sequence patterns (Temporal) order is important in many situations

Time-series databases and sequence databases

Frequent patterns (frequent) sequential patterns

Sequential patterns for intrusion detection Capture the signatures for attacks in a series of packets

24

Sequential Pattern Mining

Given a set of sequences find the complete set of frequent subsequences

25

Apriori Property in Sequences

26

Classification A Two-Step Process Model construction describe a set of predetermined

classes Training dataset tuples for model construction

Each tuplesample belongs to a predefined class Classification rules decision trees or math formulae

Model application classify unseen objects Estimate accuracy of the model using an independent test

set Acceptable accuracy apply the model to classify data tu

ples with unknown class labels

27

Classification

28

Classification Decision Tree

A node in the tree a test of some attribute A branch a possible value of the attribute Classification

Start at the root Test the attribute Move down the tree branch

29

Neural classification HIDE

ldquoA hierarchical network intrusion detection system using statistical processing and neural network classificationrdquo by Zheng et al

Five major components Probes collect traffic data Event preprocessor preprocesses traffic data and feeds the

statistical model Statistical processor maintains a model for normal activities

and generates vectors for new events Neural network classifies the vectors of new events Post processor generates reports

30

Clustering

What Is Clustering Group data into clusters

ndash Similar to one another within the same cluster ndash Dissimilar to the objects in other clusters ndash Unsupervised learning no predefined classes

31

Clustering

What Is A Good Clustering High intra-class similarity and low interclasssimilar

ity Depending on the similarity measure

The ability to discover some or all of the hidden patterns

32

Clustering

Clustering Approaches Partitioning algorithms

ndash Partition the objects into k clusters ndash Iteratively reallocate objects to improve the clusterin

g Hierarchy algorithms

ndash Agglomerative each object is a cluster merge clusters to form larger ones

ndash Divisive all objects are in a cluster split it up into smaller clusters

33

Clustering

K-Means Example

34

Mining Data Streams for Intrusion Detection

Maintaining profiles of normal activities The profiles of normal activities may drift

Identifying novel attacks Identifying clusters and outliers in traffic data

streams Reduce the future alarm load by writing

filtering rules that automatically discard well-understood false positives

35

Data Mining for Intrusion Detection

Misuse detectionPredictive models are built from labeled data sets (instances

are labeled as ldquonormalrdquo or ldquointrusiverdquo)These models can be more sophisticated and precise than manually

created signatures Recent research eg JAM (Java Agents for Metalearning)

36

Misuse Detection

Intrusion Patterns

activities

pattern matching

intrusion

Canrsquot detect new attacks

Example if (src_ip == dst_ip) then ldquoland attackrdquo

look for known indicators ICMP Scans port scans connection attempts CPU RAM IO Utilization File system activity modification of system files permission modifications

37

JAM (Java Agents for Metalearning)

JAM (developed at Columbia University) uses data mining techniques to discover patterns of intrusions. It then applies a meta-learning classifier to learn the signatures of attacks.

The association rules algorithm determines relationships between fields in the audit trail records, and the frequent episodes algorithm models sequential patterns of audit events. Features are then extracted from both algorithms and used to compute models of intrusion behavior.

The classifiers build the signatures of attacks. Data mining in JAM thus builds a misuse detection model.

Classifiers in JAM are generated by running a rule-learning program on training data of system usage. After training, the resulting classification rules are used to recognize anomalies and detect known intrusions.

The system has been tested with data from Sendmail-based attacks and with network attacks using tcpdump data.

38

Data Mining for Intrusion Detection

Anomaly detection
- Identifies anomalies as deviations from "normal" behavior
- E.g., ADAM (Audit Data Analysis and Mining) and MINDS (MINnesota INtrusion Detection System)

39

Anomaly Detection

[Figure: bar chart of activity measures (e.g., CPU, process size) on a 0 to 90 scale, contrasting the normal profile with abnormal values that indicate a probable intrusion]
Relatively high false positive rate: anomalies can just be new normal activities
Baseline the normal traffic and then look for things that are out of the norm
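One simple way to "baseline and look for deviations" is a mean and standard deviation profile with a threshold. The sketch below assumes a single numeric activity measure and a 3-standard-deviation cutoff; both are illustrative choices, and the sample values are made up.

```python
import statistics

def build_profile(samples):
    """Summarize normal activity as a mean and a sample standard deviation."""
    return statistics.mean(samples), statistics.stdev(samples)

def is_anomalous(value, profile, threshold=3.0):
    """Flag a value more than `threshold` standard deviations from the mean."""
    mean, stdev = profile
    return abs(value - mean) > threshold * stdev

normal_cpu = [20, 22, 19, 21, 23, 20, 18, 22]  # hypothetical CPU utilization (%)
profile = build_profile(normal_cpu)
flags = [is_anomalous(v, profile) for v in (21, 85)]
```

A legitimate but new workload would also exceed the threshold, which illustrates the false positive problem noted on the slide.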

40

ADAM Audit Data Analysis and Mining

Detecting intrusions by data mining
Combination of association rules and classification rules
First, ADAM collects known frequent itemsets with an off-line algorithm
Second, ADAM runs an online algorithm
- Finds the frequent itemsets in the most recent connection records
- Compares them with the known mined data
- Discards those that seem to be normal
- Forwards the suspicious ones to the classifier
The trained classifier then classifies the suspicious data as one of the following
- Known type of attack
- Unknown type of attack
- False alarm
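The off-line/online split described above can be sketched roughly as follows. This is not ADAM's actual implementation: the record fields, the support threshold, the limit to itemsets of at most two items, and the function names are all assumptions made for illustration, and the final classification step is omitted.

```python
from collections import Counter
from itertools import combinations

def frequent_itemsets(records, min_support, max_size=2):
    """Return itemsets (up to max_size items) whose support reaches min_support."""
    counts = Counter()
    for record in records:
        items = sorted(record.items())
        for size in range(1, max_size + 1):
            for itemset in combinations(items, size):
                counts[frozenset(itemset)] += 1
    return {s for s, c in counts.items() if c / len(records) >= min_support}

def suspicious_itemsets(window, normal_profile, min_support):
    """Online step: keep frequent itemsets of the window that the normal profile lacks."""
    return frequent_itemsets(window, min_support) - normal_profile

# Off-line step on attack-free data (hypothetical records).
training = [{"dst_port": 80, "flag": "SF"}] * 8 + [{"dst_port": 25, "flag": "SF"}] * 2
profile = frequent_itemsets(training, min_support=0.2)

# Online step on the most recent connections.
window = [{"dst_port": 80, "flag": "SF"}] * 5 + [{"dst_port": 27374, "flag": "S0"}] * 5
suspects = suspicious_itemsets(window, profile, min_support=0.2)
```

Only the itemsets involving port 27374 and flag S0 survive the comparison; in the scheme on the slide, those would be forwarded to the trained classifier.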

41

ADAM Detecting Intrusion by Data Mining

42

ADAM Audit Data Analysis and Mining

ADAM has two phases in its model
1st phase: train the classifier
- Offline process
- Takes place only once, before the main experiment
2nd phase: use the trained classifier
- The trained classifier is then used to detect anomalies
- Online process

43

The MINDS Project

MINDS: MINnesota INtrusion Detection System
- Learning from Rare Class: building rare class prediction models
- Anomaly/outlier detection
- Summarization of attacks using association pattern analysis

TID  Items
1    Bread, Coke, Milk
2    Beer, Bread
3    Beer, Coke, Diaper, Milk
4    Beer, Bread, Diaper, Milk
5    Coke, Diaper, Milk

Rules discovered: {Milk} --> {Coke}; {Diaper, Milk} --> {Beer}
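To make the two rules concrete, their support and confidence can be computed directly from the five transactions in the table. The helper names below are illustrative; the definitions are the standard ones for association rules.

```python
transactions = [
    {"Bread", "Coke", "Milk"},
    {"Beer", "Bread"},
    {"Beer", "Coke", "Diaper", "Milk"},
    {"Beer", "Bread", "Diaper", "Milk"},
    {"Coke", "Diaper", "Milk"},
]

def support(itemset):
    """Fraction of transactions that contain every item in the itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(lhs, rhs):
    """Estimated probability of rhs given lhs: support(lhs and rhs) / support(lhs)."""
    return support(lhs | rhs) / support(lhs)

# {Milk} --> {Coke}: 3 of the 4 transactions with Milk also contain Coke.
conf_milk_coke = confidence({"Milk"}, {"Coke"})
# {Diaper, Milk} --> {Beer}: 2 of the 3 transactions with Diaper and Milk contain Beer.
conf_diaper_milk_beer = confidence({"Diaper", "Milk"}, {"Beer"})
```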

44

MINDS - Learning from Rare Class

Problem: building models for rare network attacks (mining a needle in a haystack)

Standard data mining models are not suitable for rare classes

Models must be able to handle skewed class distributions

Learning from data streams - intrusions are sequences of events

45

MINDS - Anomaly Detection

Detect novel attacks/intrusions by identifying them as deviations from "normal", i.e., anomalous, behavior
- Identify normal behavior
- Construct a useful set of features
- Define a similarity function
- Use an outlier detection algorithm
  - Nearest neighbor approach
  - Density-based schemes
  - Unsupervised Support Vector Machines (SVM)
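As an illustration of the nearest neighbor approach, a point can be scored by its distance to its k-th nearest neighbor: the farther it sits from its neighbors, the more anomalous it is. The feature vectors, the Euclidean distance as similarity function, and k = 2 below are assumptions for the sketch, not the settings used by MINDS.

```python
import math

def knn_outlier_score(point, others, k=2):
    """Score a point by its distance to its k-th nearest neighbor among `others`."""
    distances = sorted(math.dist(point, o) for o in others)
    return distances[k - 1]

# Hypothetical feature vectors: (packets per connection, bytes per packet / 100).
connections = [(3, 1.2), (3, 1.3), (4, 1.2), (3, 1.1), (40, 9.0)]
scores = [
    knn_outlier_score(c, connections[:i] + connections[i + 1:])
    for i, c in enumerate(connections)
]
most_anomalous = connections[scores.index(max(scores))]
```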

46

Experimental Evaluation

[Figure: evaluation pipeline: network -> net-flow data collected using Cisco routers -> data preprocessing -> MINDS anomaly detection -> anomaly scores -> association pattern analysis]
- Snort: open source signature-based network IDS (www.snort.org)
- 10-minute cycle, 2 million connections
- Anomaly detection is applied 4 times a day, with a 10-minute time window
• Publicly available data set: DARPA 1998 Intrusion Detection Evaluation Data Set, prepared and managed by MIT Lincoln Lab; includes a wide variety of intrusions simulated in a military network environment
• Real network data from the University of Minnesota

47

MINDS - Framework for Mining Associations

[Figure: MINDS association analysis module. The anomaly detection system produces ranked connections labeled attack or normal; a discriminating association pattern generator derives rules from them (e.g., R1: TCP, DstPort=1863 -> Attack; ...; R100: TCP, DstPort=80 -> Normal) and updates the knowledge base]
1. Build normal profile
2. Study changes in normal behavior
3. Create attack summary
4. Detect misuse behavior
5. Understand nature of the attack

48

Discovered Real-life Association Patterns

- At first glance, Rule 1 appears to describe a Web scan
- Rule 2 indicates an attack on a specific machine
- Both rules together indicate that a scan is performed first, followed by an attack on a specific machine identified as vulnerable by the attacker

Rule 1: SrcIP=XXXX, DstPort=80, Protocol=TCP, Flag=SYN, NoPackets: 3, NoBytes: 120...180 (c1=256, c2=1)
Rule 2: SrcIP=XXXX, DstIP=YYYY, DstPort=80, Protocol=TCP, Flag=SYN, NoPackets: 3, NoBytes: 120...180 (c1=177, c2=0)

49

Discovered Real-life Association Patterns

DstIP=ZZZZ, DstPort=8888, Protocol=TCP (c1=369, c2=0)
DstIP=ZZZZ, DstPort=8888, Protocol=TCP, Flag=SYN (c1=291, c2=0)

- This pattern indicates an anomalously high number of TCP connections on port 8888 involving machine ZZZZ
- Follow-up analysis of connections covered by the pattern indicates that this could be a machine running a variation of the Kazaa file-sharing protocol
- Having an unauthorized application increases the vulnerability of the system

50

Discovered Real-life Association Patterns (ctd)

SrcIP=XXXX, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=4, NoBytes=189...200 (c1=582, c2=2)
SrcIP=XXXX, DstPort=12345, NoPackets=4, NoBytes=189...200 (c1=580, c2=3)
SrcIP=YYYY, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=3, NoBytes=144 (c1=694, c2=3)
...

- This pattern indicates a large number of scans on ports 27374 (which is a signature for the SubSeven worm) and 12345 (which is a signature for the NetBus worm)
- Further analysis showed that no fewer than five machines were scanning for one or both of these ports in any time window

51

Discovered Real-life Association Patterns (ctd)

DstPort=6667, Protocol=TCP (c1=254, c2=1)

- This pattern indicates an unusually large number of connections on port 6667 detected by the anomaly detector
- Port 6667 is where IRC (Internet Relay Chat) is typically run
- Further analysis reveals that there are many small packets from/to various IRC servers around the world
- Although IRC traffic is not unusual, the fact that it is flagged as anomalous is interesting
- This might indicate that the IRC server has been taken down (by a DoS attack, for example) or that it is a rogue IRC server (it could be involved in some hacking activity)

52

Discovered Real-life Association Patterns (ctd)

DstPort=1863, Protocol=TCP, Flag=0, NoPackets=1, NoBytes<139 (c1=498, c2=6)
DstPort=1863, Protocol=TCP, Flag=0 (c1=587, c2=6)
DstPort=1863, Protocol=TCP (c1=606, c2=8)

- This pattern indicates a large number of anomalous TCP connections on port 1863
- Further analysis reveals that the remote IP block is owned by Hotmail
- Flag=0 is unusual for TCP traffic

53

MINDS Conclusion
- Data mining based algorithms are capable of detecting intrusions that cannot be detected by state-of-the-art signature-based methods
- SNORT has static knowledge, manually updated by human analysts
- MINDS anomaly detection algorithms are adaptive in nature
- MINDS anomaly detection algorithms can also be effective in detecting anomalous behavior originating from a compromised or infected machine
MINDS Research
- Defining normal behavior
- Feature extraction
- Similarity functions
- Outlier detection
- Result summarization
- Detection of attacks originating from multiple sites
[Figure labels: worm/virus detection after infection; insider attack (policy violation); outsider attack (network intrusion)]

54

IDS Using Both Misuse and Anomaly Detection: RIDS-100

RIDS (Rising Intrusion Detection System) is provided by Rising Tech, a leader in antivirus and content security software and services in China.

The company is a leading provider of client, gateway, and server security solutions for virus protection, firewall, and intrusion detection technologies, and of security services to enterprises and service providers around China.

RIDS makes use of both intrusion detection techniques: misuse detection and anomaly detection.

A distance-based outlier detection algorithm is used to detect deviating behavior among the collected network data.

For misuse detection, it has a very large set of collected data patterns that can be matched against scanned network data.

This large set of data patterns is scanned using a data mining classification algorithm (decision tree).

http://www.rising-global.com

55

A cooperative anomaly and intrusion detection system (CAIDS)

CAIDS is built with a network-based intrusion detection system (NIDS) and an anomaly detection system (ADS) operating interactively through a signature generator.

56

A cooperative anomaly and intrusion detection system (CAIDS)

A frequent episode rule (FER) is generated out of a collection of frequent episodes. The FER is defined over episode sequences with multiple connection events.

For example, we envision a window where we observe a 3-event sequence: E, D, and F.

An FER is generated as E → D, F with confidence level freq(a ∪ b) / freq(a) = 0.8, where a represents the event E on the LHS and b corresponds to the two events D and F on the RHS of the rule.

If a occurs with a frequency of 5% and the joint event a and b occurs with a frequency of 4%, there is a 0.04/0.05 = 80% chance that D and F will follow E in the same window.

57

A cooperative anomaly and intrusion detection system (CAIDS)

In practice, the event E could be an authentication service characterized by two attributes (service = authentication, flag = SF). The events D and F may be two sequential smtp requests, denoted by (service = smtp). Thus, we can derive an FER with a confidence level of c = 80% that two smtp services will follow the authentication service within a window of w = 2 sec. The three joint traffic events account for a support level of s = 10% out of all the network connections being evaluated. This FER is formally stated as follows:

(service = authentication) → (service = smtp), (service = smtp) (0.8, 0.1, 2 sec)    (1)
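A rough sketch of how the support and confidence of such a rule could be estimated from a time-ordered event list is shown below. It assumes a simplified model: one window is anchored at each LHS event, confidence is the fraction of LHS occurrences followed by the RHS events in order within the window, and support is the number of such joint occurrences divided by the total number of events. The function name and the sample trace are hypothetical.

```python
def episode_rule_stats(events, lhs, rhs, window):
    """Estimate support and confidence of the rule lhs -> rhs within a time window.

    events: list of (timestamp_seconds, service) tuples sorted by time.
    lhs: service name on the left-hand side; rhs: ordered list of services.
    """
    lhs_count = 0
    joint_count = 0
    for i, (t0, service) in enumerate(events):
        if service != lhs:
            continue
        lhs_count += 1
        # Look for the RHS services, in order, among later events inside the window.
        remaining = list(rhs)
        for t, s in events[i + 1:]:
            if t - t0 > window:
                break
            if remaining and s == remaining[0]:
                remaining.pop(0)
        if not remaining:
            joint_count += 1
    support = joint_count / len(events)
    confidence = joint_count / lhs_count if lhs_count else 0.0
    return support, confidence

# Hypothetical trace: two of the three authentications are followed by two smtp requests.
events = [
    (0.0, "authentication"), (0.5, "smtp"), (1.0, "smtp"),
    (5.0, "authentication"), (5.5, "http"), (9.0, "smtp"),
    (10.0, "authentication"), (10.4, "smtp"), (11.9, "smtp"),
    (20.0, "http"),
]
support, confidence = episode_rule_stats(events, "authentication", ["smtp", "smtp"], window=2.0)
```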

58

A cooperative anomaly and intrusion detection system (CAIDS)

An association rule is aimed at finding interesting intra-relationships inside a single connection record.

In general, an FER is specified by the following expression:

L1, L2, ..., Ln → R1, ..., Rm (c, s, window)    (2)

Li (1 ≤ i ≤ n) and Rj (1 ≤ j ≤ m) are ordered traffic connection events.

We call L1, L2, ..., Ln the LHS episode and R1, ..., Rm the RHS episode of the rule.

59

A cooperative anomaly and intrusion detection system (CAIDS)

[Figure: architecture of the CAIDS simulator, built with a 2000-signature Snort and an anomaly detection subsystem (ADS) with 60 FERs after 2 weeks of rule training over the Lincoln Lab IDS evaluation dataset]

60

Conclusion

In this report, we have studied the basic concepts and some classic system models in this area, such as ADAM and MINDS.

We summarized those system models, their technologies, and their validation methods.

We hope to give an overview of current developments in this area and of how data mining is evolving into the field of network intrusion detection.

61

Reference
- DARPA 1998 data set. A cleansed set is in KDD Cup '99; the DARPA 1991 data set is also available: http://www.ll.mit.edu/IST/ideval/data/data_index.html
- Daniel Barbara, Julia Couto, Sushil Jajodia, Leonard Popyack, Ningning Wu. "ADAM: Detecting Intrusions by Data Mining." Proceedings of the 2001 IEEE Workshop on Information Assurance and Security, United States Military Academy, West Point, NY, 5-6 June 2001.
- Zhang, J. and Zulkernine, M. 2006. A Hybrid Network Intrusion Detection Technique Using Random Forests. In Proceedings of the First International Conference on Availability, Reliability and Security (April 20-22, 2006).
- W. Lee et al. A data mining framework for building intrusion detection models. In Information and System Security, Vol. 3, No. 4, 2000.
- Ertoz, L. et al. MINDS - Minnesota Intrusion Detection System. Next Generation Data Mining, Chapter 3, 2004.
- Lu, C.-T., Boedihardjo, A. P., Manalwar, P. Exploiting efficient data mining techniques to enhance intrusion detection systems. IEEE International Conference on Information Reuse and Integration (IRI-2005), 15-17 Aug. 2005, pp. 512-517.
- Sal Stolfo, Andreas Prodromidis, Shelley Tselepis, Wenke Lee, Dave Fan, and Phil Chan. (Honorable mention (runner-up) for Best Paper Award in Applied Research Category.) In Proceedings of the Third International Conference on Knowledge Discovery and Data Mining (KDD 97), Newport Beach, CA, August 1997.

62

Questions & Comments

9

Goal of Intrusion Detection Systems (IDS)
- To detect an intrusion as it happens and be able to respond to it
False positives
- A false positive is a situation where something abnormal (as defined by the IDS) happens, but it is not an intrusion
- Too many false positives: the user will quit monitoring the IDS because of the noise
False negatives
- A false negative is a situation where an intrusion is really happening, but the IDS doesn't catch it
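The two error types can be quantified once alarms are compared with ground truth. The sketch below uses made-up alarm and label lists; the function name and the choice to report rates (rather than raw counts) are illustrative.

```python
def detection_rates(alerts, truths):
    """Compute false positive and false negative rates from paired booleans.

    alerts[i] is True when the IDS raised an alarm for event i;
    truths[i] is True when event i really was an intrusion.
    """
    fp = sum(a and not t for a, t in zip(alerts, truths))
    fn = sum(t and not a for a, t in zip(alerts, truths))
    negatives = sum(not t for t in truths)
    positives = sum(truths)
    fp_rate = fp / negatives if negatives else 0.0
    fn_rate = fn / positives if positives else 0.0
    return fp_rate, fn_rate

# Hypothetical run: 8 events, 2 real intrusions, 3 alarms.
alerts = [True, True, False, False, True, False, False, False]
truths = [True, False, False, False, False, True, False, False]
fp_rate, fn_rate = detection_rates(alerts, truths)
```

Here two of the six benign events raise alarms and one of the two intrusions is missed.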

10

Outline

Intrusion Detection

Data Mining

Data Mining in Intrusion Detection

Reference

11

Why Do We Need Data Mining?
- Despite the enormous amount of data, particular events of interest are still quite rare (frequency ranges from 0.1% to less than 10%)
- We are drowning in data but starving for knowledge

12

Data Mining vs KDD

Knowledge Discovery in Databases (KDD): the whole process of finding useful information and patterns in data

Data Mining: the use of algorithms to extract the information and patterns derived by the KDD process

Data mining is the core of the knowledge discovery process

13

KDD Process

- Selection: obtain data from various sources
- Preprocessing: cleanse data
- Transformation: convert to common format; transform to new format
- Data Mining: obtain desired results
- Interpretation/Evaluation: present results to user in meaningful manner

14

Data Mining: A KDD Process
- Data mining: core of the knowledge discovery process
[Figure: Databases -> Data Cleaning and Data Integration -> Data Warehouse -> Selection -> Task-relevant Data -> Data Mining -> Pattern Evaluation]

15

Typical Data Mining Architecture

[Figure: layered architecture, bottom to top: databases and data warehouse (with data cleaning, data integration, and filtering) -> database or data warehouse server -> data mining engine -> pattern evaluation -> graphical user interface, with a knowledge base alongside]

16

Outline

Intrusion Detection

Data Mining

Data Mining in Intrusion Detection

Reference

17

Network intrusion detection

Number of intrusions on the network is typically a very small fraction of the total network traffic

18

Why Can Data Mining Help?
Learn from traffic data
- Supervised learning: learn precise models from past intrusions
- Unsupervised learning: identify suspicious activities
Maintain models on dynamic data
Correlation of suspicious events across network sites
- Helps detect sophisticated attacks not identifiable by single-site analyses
Analysis of long-term data (months/years)
- Uncover suspicious stealth activities (e.g., insiders leaking/modifying information)

19

Intrusion Detection

Traditional intrusion detection system (IDS) tools (e.g., SNORT) are based on signatures of known attacks
Limitations
- The signature database has to be manually revised for each new type of discovered intrusion
- They cannot detect emerging cyber threats
- Substantial latency in deployment of newly created signatures across the computer system

20

Data Mining for Intrusion Detection: Techniques and Applications
- Frequent pattern mining
- Classification
- Clustering
- Mining data streams

21

Frequent pattern mining
- Patterns that occur frequently in a database
- Mining frequent patterns: finding regularities
Process of mining frequent patterns for intrusion detection
- Phase I: mine a repository of normal frequent itemsets for attack-free data
- Phase II: find frequent itemsets in the last n connections and compare the patterns to the normal profile

22

Frequent pattern mining

Apriori
• Any subset of a frequent itemset must also be frequent (an anti-monotone property)
  - A transaction containing {beer, diaper, nuts} also contains {beer, diaper}
  - If {beer, diaper, nuts} is frequent, {beer, diaper} must also be frequent
• No superset of any infrequent itemset should be generated or tested
  - Many item combinations can be pruned
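The pruning idea can be seen in a compact level-wise implementation. The following is a minimal sketch rather than an optimized Apriori: the transactions and the minimum count are made up, and candidates are generated by a simple pairwise join.

```python
from itertools import combinations

def apriori(transactions, min_count):
    """Return {itemset: count} for every itemset found in at least min_count transactions."""
    items = {item for t in transactions for item in t}
    candidates = [frozenset([item]) for item in items]
    frequent = {}
    size = 1
    while candidates:
        # Count each surviving candidate against the transactions.
        counts = {c: sum(c <= t for t in transactions) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_count}
        frequent.update(level)
        # Join frequent itemsets into candidates that are one item larger...
        size += 1
        joined = {a | b for a in level for b in level if len(a | b) == size}
        # ...and prune any candidate that has an infrequent subset.
        candidates = [
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, size - 1))
        ]
    return frequent

transactions = [
    {"beer", "diaper", "nuts"},
    {"beer", "diaper"},
    {"beer", "diaper", "nuts"},
    {"nuts", "milk"},
]
frequent = apriori(transactions, min_count=2)
```

Because {milk} is infrequent, no itemset containing milk is ever generated or counted at the later levels.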

23

Sequential Pattern Analysis

Models sequence patterns
(Temporal) order is important in many situations
- Time-series databases and sequence databases
- Frequent patterns → (frequent) sequential patterns
Sequential patterns for intrusion detection
- Capture the signatures for attacks in a series of packets

24

Sequential Pattern Mining

Given a set of sequences, find the complete set of frequent subsequences
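The building block of sequential pattern mining is testing whether a candidate pattern is a subsequence of each sequence and counting how often it is. The sketch below shows only that support count, not a full miner; the event names and sessions are hypothetical.

```python
def is_subsequence(pattern, sequence):
    """True if the items of pattern appear in sequence in the same order (gaps allowed)."""
    remaining = iter(sequence)
    # Each `in` test consumes the iterator up to the match, which enforces the order.
    return all(item in remaining for item in pattern)

def support_count(pattern, sequences):
    """Number of sequences that contain the pattern as a subsequence."""
    return sum(is_subsequence(pattern, s) for s in sequences)

sessions = [
    ["syn", "login_fail", "login_fail", "login_ok"],
    ["syn", "login_ok"],
    ["syn", "login_fail", "login_ok", "fin"],
]
count = support_count(["login_fail", "login_ok"], sessions)
```

A pattern whose count reaches the chosen minimum would be reported as frequent.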

25

Apriori Property in Sequences

26

Classification: A Two-Step Process
Model construction: describe a set of predetermined classes
- Training dataset: tuples for model construction
- Each tuple/sample belongs to a predefined class
- Classification rules, decision trees, or math formulae
Model application: classify unseen objects
- Estimate accuracy of the model using an independent test set
- Acceptable accuracy: apply the model to classify data tuples with unknown class labels

27

Classification

28

Classification: Decision Tree
- A node in the tree: a test of some attribute
- A branch: a possible value of the attribute
Classification
- Start at the root
- Test the attribute
- Move down the tree branch
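The classification steps above can be sketched as a short traversal over a nested dictionary. The tree, its attributes, and the class labels are hypothetical and hand-written here rather than learned from data.

```python
# A node is either a class label (leaf) or a dict naming the attribute to test,
# with one subtree per possible attribute value.
tree = {
    "attribute": "protocol",
    "branches": {
        "icmp": "normal",
        "tcp": {
            "attribute": "flag",
            "branches": {"SF": "normal", "S0": "intrusive"},
        },
    },
}

def classify(node, record):
    """Start at the root, test the node's attribute, and follow the matching branch."""
    while isinstance(node, dict):
        value = record[node["attribute"]]
        node = node["branches"][value]
    return node

label = classify(tree, {"protocol": "tcp", "flag": "S0"})
```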

29

Neural classification: HIDE

"A hierarchical network intrusion detection system using statistical processing and neural network classification" by Zheng et al.

Five major components
- Probes collect traffic data
- Event preprocessor preprocesses traffic data and feeds the statistical model
- Statistical processor maintains a model for normal activities and generates vectors for new events
- Neural network classifies the vectors of new events
- Post processor generates reports

30

Clustering

What Is Clustering?
Group data into clusters
- Similar to one another within the same cluster
- Dissimilar to the objects in other clusters
- Unsupervised learning: no predefined classes

31

Clustering

What Is A Good Clustering High intra-class similarity and low interclasssimilar

ity Depending on the similarity measure

The ability to discover some or all of the hidden patterns

32

Clustering

Clustering Approaches Partitioning algorithms

ndash Partition the objects into k clusters ndash Iteratively reallocate objects to improve the clusterin

g Hierarchy algorithms

ndash Agglomerative each object is a cluster merge clusters to form larger ones

ndash Divisive all objects are in a cluster split it up into smaller clusters

33

Clustering

K-Means Example

34

Mining Data Streams for Intrusion Detection

Maintaining profiles of normal activities The profiles of normal activities may drift

Identifying novel attacks Identifying clusters and outliers in traffic data

streams Reduce the future alarm load by writing

filtering rules that automatically discard well-understood false positives

35

Data Mining for Intrusion Detection

Misuse detectionPredictive models are built from labeled data sets (instances

are labeled as ldquonormalrdquo or ldquointrusiverdquo)These models can be more sophisticated and precise than manually

created signatures Recent research eg JAM (Java Agents for Metalearning)

36

Misuse Detection

Intrusion Patterns

activities

pattern matching

intrusion

Canrsquot detect new attacks

Example if (src_ip == dst_ip) then ldquoland attackrdquo

look for known indicators ICMP Scans port scans connection attempts CPU RAM IO Utilization File system activity modification of system files permission modifications

37

JAM (Java Agents for Metalearning)

JAM (developed at Columbia University) uses data mining techniques to discover patterns of intrusions It then applies a meta-learning classifier to learn the signature of attacks

The association rules algorithm determines relationships between fields in the audit trail records and the frequent episodes algorithm models sequential patterns of audit events Features are then extracted from both algorithms and used to compute models of intrusion behavior

The classifiers build the signature of attacks So thus data mining in JAM builds misuse detection model

Classifiers in the JAM are generated by using rule learning program on training data of system usage After training resulting classification rules is used to recognize anomalies and detect known intrusions

The system has been tested with data from Sendmail-based attacks and with network attacks using TCP dump data

38

Data Mining for Intrusion Detection

Anomaly detection Identifies anomalies as deviations from ldquonormalrdquo behavior Eg ADAM Audit Data Analysis and Mining MINDS ndash MINnesota INt

rusion Detection System

39

Anomaly Detection

activity measures

0102030405060708090

CPU ProcessSize

normal profileabnormal

probable intrusion

Relatively high false positive rate - anomalies can just be new normal activities

baseline the normal traffic and then look for things that are out of the norm

40

ADAM Audit Data Analysis and Mining

Detecting Intrusion by Data MiningCombination of Association Rule and Classification Rule Firstly ADAM collects known frequent datasetsan off-line

algorithm Secondly ADAM runs an online algorithm

Finds last frequent connection records Compare them with known mined data Discards those which seems to be normal Suspicious ones are forwarded to the classifier Trained classifier then classify the suspicious data as one

of the following Known type of attack Unknown type of attack False alarm

41

ADAM Detecting Intrusion by Data Mining

42

ADAM Audit Data Analysis and Mining

ADAM has two phases in their model

1st Phase Train the classifier Offline process Takes place only once Before the main experiment

2nd Phase Using the trained classifier Trained classifier is then used to detect anomalies Online process

43

The MINDS Project

MINDS ndash MINnesota INtrusion Detection System

Learning from Rare Class ndash Building rare class prediction models

Anomalyoutlier detection

Summarization of attacks using association pattern analysis

TID Items

1 Bread Coke Milk

2 Beer Bread

3 Beer Coke Diaper Milk

4 Beer Bread Diaper Milk

5 Coke Diaper Milk

Rules Discovered Milk --gt Coke Diaper Milk --gt Beer

Rules Discovered Milk --gt Coke Diaper Milk --gt Beer

44

MINDS - Learning from Rare Class

Problem Building models for rare network attacks (Mining needle in a haystack)

Standard data mining models are not suitable for rare classes

Models must be able to handle skewed class distributions

Learning from data streams - intrusions are sequences of events

45

MINDS - Anomaly Detection

Detect novel attacksintrusions by identifying them as deviations from ldquonormalrdquo ie anomalous behaviorIdentify normal behaviorConstruct useful set of featuresDefine similarity functionUse outlier detection algorithm

Nearest neighbor approach

Density based schemes

Unsupervised Support Vector Machines (SVM)

46

Experimental Evaluation

network

net-flow data using CISCO

routers

Data preprocessing

MINDSanomaly detection

helliphellip

Anomaly

scores Association pattern analysis

Open source signature-based network IDS wwwsnortor

g

10 minutes cycle

2 millions connections

Anomaly detection is applied

4 times a day

10 minutes time window

bull Publicly available data setDARPA 1998 Intrusion Detection Evaluation Data Set prepared and managed by MIT Lincoln Lab includes a wide variety of intrusions simulated in a military network environment

bull Real network data from University of Minnesota

47

MINDS - Framework for Mining Associations

Anomaly Detection System

attack

normal

R1 TCP DstPort=1863 Attack

hellip

hellip

hellip

hellip

R100 TCP DstPort=80 Normal

Discriminating Association

Pattern Generator

1 Build normal profile

2 Study changes in normal behavior

3 Create attack summary

4 Detect misuse behavior

5 Understand nature of the attack

update

Knowledge Base

Ranked connections

MINDS association analysis module

48

Discovered Real-life Association Patterns

At first glance Rule 1 appears to describe a Web scan Rule 2 indicates an attack on a specific machine Both rules together indicate that a scan is performed first

followed by an attack on a specific machine identified as vulnerable by the attacker

Rule 1 SrcIP=XXXX DstPort=80 Protocol=TCP Flag=SYN NoPackets 3 NoBytes120hellip180 (c1=256 c2 = 1)

Rule 2 SrcIP=XXXX DstIP=YYYY DstPort=80 Protocol=TCPFlag=SYN NoPackets 3 NoBytes 120hellip180 (c1=177 c2 = 0)

49

DstIP=ZZZZ DstPort=8888 Protocol=TCP (c1=369 c2=0)DstIP=ZZZZ DstPort=8888 Protocol=TCP Flag=SYN (c1=291 c2=0)

This pattern indicates an anomalously high number of TCP connections on port 8888 involving machine ZZZZ

Follow-up analysis of connections covered by the pattern indicates that this could be a machine running a variation of the Kazaa file-sharing protocol

Having an unauthorized application increases the vulnerability of the system

Discovered Real-life Association Patterns

50

SrcIP=XXXX DstPort=27374 Protocol=TCP Flag=SYN NoPackets=4 NoBytes=189hellip200 (c1=582 c2=2)

SrcIP=XXXX DstPort=12345 NoPackets=4 NoBytes=189hellip200 (c1=580 c2=3)

SrcIP=YYYY DstPort=27374 Protocol=TCP Flag=SYN NoPackets=3 NoBytes=144 (c1=694 c2=3)

helliphellip This pattern indicates a large number of scans on ports 27374 (which is a signature for the SubSeven worm) and 12345 (which is a signature for NetBus worm)

Further analysis showed that no fewer than five machines scanning for one or both of these ports in any time window

Discovered Real-life Association Patternshellip(ctd)

51

DstPort=6667 Protocol=TCP (c1=254 c2=1)

This pattern indicates an unusually large number of connections on port 6667 detected by the anomaly detector

Port 6667 is where IRC (Internet Relay Chat) is typically run Further analysis reveals that there are many small packets

fromto various IRC servers around the world Although IRC traffic is not unusual the fact that it is flagged

as anomalous is interesting This might indicate that the IRC server has been taken down (by a

DOS attack for example) or it is a rogue IRC server (it could be involved in some hacking activity)

Discovered Real-life Association Patternshellip(ctd)

52

DstPort=1863 Protocol=TCP Flag=0 NoPackets=1 NoByteslt139 (c1=498 c2=6)

DstPort=1863 Protocol=TCP Flag=0 (c1=587 c2=6)

DstPort=1863 Protocol=TCP (c1=606 c2=8)

This pattern indicates a large number of anomalous TCP connections on port 1863

Further analysis reveals that the remote IP block is owned by Hotmail

Flag=0 is unusual for TCP traffic

Discovered Real-life Association Patternshellip(ctd)

53

MINDS Conclusion Data mining based algorithms are capable of detecting intrusions that cannot be

detected by state-of-the-art signature based methods

SNORT has static knowledge manually updated by human analysts

MINDS anomaly detection algorithms are adaptive in nature

MINDS anomaly detection algorithms can also be effective in detecting anomalous behavior originating from a compromised or infected machine

MINDS Research Defining normal behavior Feature extraction Similarity functions Outlier detection Result summarization Detection of attacks

originating from multiple sites

Wormvirus detectionafter infection

Insider attack Policy violation

Outsider attack Network intrusion

54

IDS Using both Misuse and Anomaly DetectionRIDS-100

RIDS( Rising Intrusion Detection System) is provided by Rising Tech It is a leader in antivirus and content security software and services in China

The company is a leading provider of client gateway and server security solutions for virus protection firewall and intrusion detection technologies and security services to enterprises and service providers around China

RIDS make the use of both intrusion detection technique misuse and anomaly detection

Distance based outlier detection algorithm is used for detection deviational behavior among collected network data

For misuse detection it has very vast set of collected data pattern which can be matched with scanned network data for misuse detection

This large amount of data pattern is scanned using data mining classification Decision Tree algorithm

httpwwwrising-globalcom

55

A cooperative anomaly and intrusiondetection system (CAIDS)

built with a network-based intrusion detection system (NIDS) and an anomaly detection system (ADS) operating interactively through a signature generator

56

A cooperative anomaly and intrusiondetection system (CAIDS)

A frequent episode rule (FER) is generated out of a collection of frequent episodes The FER is defined over episode sequences with multiple connection events

For an example we envision a window where we observe a 3-event sequence

E D and F An FER is generated as E rarr D F confidence level freq (a U b)freq (b)=08 where a represents the event E on the LHS and b corresponds t

o the two events D and F on the RHS of the rule

If the b occurs with 5 and the joint event a and b has 4 to occur there is a (004005) = 80 chance that D and F will follow in the same window

57

A cooperative anomaly and intrusiondetection system (CAIDS)

In practice the event E could be an authentication service characterized by two attributes

(service =authentication flag=SF) The events D F may be two sequential smtp requests denoted b

y (service = smtp) Thus we can derive an FER with a confidence level of c = 80 t

hat two smtp services will follow the authentication service within a window w = 2 sec The three joint traffic events accounts with a support level s = 10 out of all the network connections being evaluated This FER is formally stated as follows

(service = authentication) rarr (service = smtp) (service = smtp) (08 01 2 sec) (1)

58

A cooperative anomaly and intrusiondetection system (CAIDS)

An association rule is aimed at finding interesting intra-relationship inside a single connection record

In general an FER is specified by the following expression

L1 L2hellip Ln R1hellip Rm (c s window) (2)

Li (1 le i le n) and Rj (1 le j le m) are ordered traffic connection events

We call L1 L2hellip Ln the LHS episode and R1hellip Rm the RHS of the episode rule

59

A cooperative anomaly and intrusiondetection system (CAIDS)

Architecture of the CAIDS simulator built with a 2000-signature Snortand an anomaly detection subsystem (ADS) with 60 FERs after 2 weeksof rule training over the Lincoln Lab IDS evaluation dataset

60

Conclusion

In this report we have studied the basic concepts and some classic system models in this area, such as ADAM and MINDS.

We summarized those system models, their technologies and their validation methods.

We hope to have given an overview of current developments in this area and of how data mining is evolving into the field of network intrusion detection.

61

Reference

DARPA 1998 data set. A cleansed set in KDD Cup '99. DARPA 1991 data set is also available. http://www.ll.mit.edu/IST/ideval/data/data_index.html

Daniel Barbara, Julia Couto, Sushil Jajodia, Leonard Popyack, Ningning Wu. "ADAM: Detecting Intrusions by Data Mining". Proceedings of the 2001 IEEE Workshop on Information Assurance and Security, United States Military Academy, West Point, NY, 5-6 June 2001.

Zhang, J. and Zulkernine, M. 2006. A Hybrid Network Intrusion Detection Technique Using Random Forests. In Proceedings of the First International Conference on Availability, Reliability and Security (April 20-22, 2006).

W. Lee et al. A data mining framework for building intrusion detection models. In Information and System Security, Vol. 3, No. 4, 2000.

Ertoz, L. et al. MINDS - Minnesota Intrusion Detection System. Next Generation Data Mining, Chapter 3, 2004.

Lu, C.-T., Boedihardjo, A. P., Manalwar, P. Exploiting efficient data mining techniques to enhance intrusion detection systems. IEEE International Conference on Information Reuse and Integration (IRI-2005), 15-17 Aug. 2005, pp. 512-517.

Sal Stolfo, Andreas Prodromidis, Shelley Tselepis, Wenke Lee, Dave Fan and Phil Chan. (Honorable mention (runner-up) for Best Paper Award in Applied Research Category.) In Proceedings of the Third International Conference on Knowledge Discovery and Data Mining (KDD 97), Newport Beach, CA, August 1997.

62

Questions & Comments


7

Anomaly Detection

[Figure: bar chart of activity measures (CPU, process size) on a 0-90 scale, comparing the normal profile with abnormal activity; a large deviation marks a probable intrusion]

• Baseline the normal traffic and then look for things that are out of the norm

• Relatively high false positive rate: anomalies can just be new normal activities
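The "baseline, then look for deviations" idea can be illustrated with a small sketch. It assumes a single numeric activity measure (CPU usage here) and a z-score threshold; the threshold of 3 standard deviations and all names are illustrative assumptions, not part of any system described in these slides.

```python
from statistics import mean, stdev
from typing import List, Tuple


def build_baseline(samples: List[float]) -> Tuple[float, float]:
    """Summarize attack-free measurements as (mean, standard deviation)."""
    return mean(samples), stdev(samples)


def is_anomalous(value: float, baseline: Tuple[float, float], threshold: float = 3.0) -> bool:
    """Flag a measurement that lies more than `threshold` standard deviations from the mean."""
    mu, sigma = baseline
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > threshold


# Hypothetical CPU usage (%) observed during normal operation.
normal_cpu = [20, 22, 19, 21, 23, 20, 18, 22]
baseline = build_baseline(normal_cpu)

print(is_anomalous(21, baseline))  # within the normal profile
print(is_anomalous(85, baseline))  # far outside the normal profile
```

A new but legitimate workload would also trip this check, which is the false positive problem noted on the slide.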

8

Misuse Detection

[Figure: activities are pattern-matched against known intrusion patterns; a match signals an intrusion]

• Can't detect new attacks

• Example: if (src_ip == dst_ip) then "land attack"

• Look for known indicators: ICMP scans, port scans, connection attempts; CPU, RAM and I/O utilization; file system activity, modification of system files, permission modifications
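The land attack rule above is simple enough to write out directly. The sketch below is only an illustration of signature matching: the `Packet` fields and the signature table are hypothetical, and real signature engines such as Snort use their own rule language.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Packet:
    src_ip: str
    dst_ip: str
    dst_port: int


def land_attack(packet: Packet) -> bool:
    """Signature from the slide: source and destination address are identical."""
    return packet.src_ip == packet.dst_ip


# Known patterns, keyed by the alert name they raise (only one here).
SIGNATURES: Dict[str, Callable[[Packet], bool]] = {"land attack": land_attack}


def match_signatures(packet: Packet) -> List[str]:
    """Return the names of all known signatures the packet matches."""
    return [name for name, rule in SIGNATURES.items() if rule(packet)]


print(match_signatures(Packet("10.0.0.5", "10.0.0.5", 139)))
print(match_signatures(Packet("10.0.0.7", "10.0.0.5", 80)))
```

An attack with no entry in the table returns an empty list, which is exactly why misuse detection cannot detect new attacks.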

9

Goal of Intrusion Detection Systems (IDS)

To detect an intrusion as it happens and be able to respond to it

• False positives
– A false positive is a situation where something abnormal (as defined by the IDS) happens, but it is not an intrusion
– Too many false positives: the user will quit monitoring the IDS because of noise

• False negatives
– A false negative is a situation where an intrusion is really happening, but the IDS doesn't catch it
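To make the two error types concrete, the sketch below counts them from a labeled log. The parallel boolean lists and the function name are assumptions made for illustration only.

```python
from typing import List, Tuple


def alarm_rates(alarms: List[bool], intrusions: List[bool]) -> Tuple[float, float]:
    """Return (false positive rate, false negative rate) for parallel event lists."""
    false_pos = sum(a and not i for a, i in zip(alarms, intrusions))
    false_neg = sum(i and not a for a, i in zip(alarms, intrusions))
    benign = sum(not i for i in intrusions)
    attacks = sum(intrusions)
    fp_rate = false_pos / benign if benign else 0.0
    fn_rate = false_neg / attacks if attacks else 0.0
    return fp_rate, fn_rate


# Hypothetical log of six events: did the IDS raise an alarm, and was it a real intrusion?
alarms = [True, True, False, False, True, False]
intrusions = [True, False, False, False, False, True]
fp_rate, fn_rate = alarm_rates(alarms, intrusions)
print(fp_rate, fn_rate)
```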

10

Outline

Intrusion Detection

Data Mining

Data Mining in Intrusion Detection

Reference

11

Why do we need Data Mining?

Despite the enormous amount of data, particular events of interest are still quite rare: frequency ranges from 0.1% to less than 10%

We are drowning in data but starving for knowledge

12

Data Mining vs. KDD

Knowledge Discovery in Databases (KDD): The whole process of finding useful information and patterns in data

Data Mining: Use of algorithms to extract the information and patterns derived by the KDD process

Data mining is the core of the knowledge discovery process

13

KDD Process

• Selection: Obtain data from various sources
• Preprocessing: Cleanse data
• Transformation: Convert to common format; transform to new format
• Data Mining: Obtain desired results
• Interpretation/Evaluation: Present results to user in a meaningful manner

14

Data Mining: A KDD Process

– Data mining: core of the knowledge discovery process

[Figure: Databases → Data Cleaning and Data Integration → Data Warehouse → Selection → Task-relevant Data → Data Mining → Pattern Evaluation]

15

Typical Data Mining Architecture

[Figure: layered architecture, from bottom to top: Databases and Data Warehouse; Data cleaning & data integration, Filtering; Database or data warehouse server; Data mining engine; Pattern evaluation; Graphical user interface; with a Knowledge-base alongside]

16

Outline

Intrusion Detection

Data Mining

Data Mining in Intrusion Detection

Reference

17

Network intrusion detection

Number of intrusions on the network is typically a very small fraction of the total network traffic

18

Why Can Data Mining Help?

• Learn from traffic data
– Supervised learning: learn precise models from past intrusions
– Unsupervised learning: identify suspicious activities

• Maintain models on dynamic data

• Correlation of suspicious events across network sites
– Helps detect sophisticated attacks not identifiable by single-site analyses

• Analysis of long-term data (months/years)
– Uncover suspicious stealth activities (e.g. insiders leaking/modifying information)

19

Intrusion Detection

Traditional intrusion detection system (IDS) tools (e.g. SNORT) are based on signatures of known attacks

Limitations
• Signature database has to be manually revised for each new type of discovered intrusion
• They cannot detect emerging cyber threats
• Substantial latency in deployment of newly created signatures across the computer system

20

Data Mining for Intrusion Detection: Techniques and Applications

• Frequent pattern mining
• Classification
• Clustering
• Mining data streams

21

Frequent pattern mining

Patterns that occur frequently in a database

Mining frequent patterns – finding regularities

Process of mining frequent patterns for intrusion detection
• Phase I: mine a repository of normal frequent itemsets for attack-free data
• Phase II: find frequent itemsets in the last n connections and compare the patterns to the normal profile
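The two phases can be sketched as follows. This is a deliberately naive illustration: it enumerates itemsets by brute force instead of using a dedicated mining algorithm, and the attribute strings, the support threshold and the function name are assumptions made for the example.

```python
from collections import Counter
from itertools import combinations
from typing import Iterable, Set, Tuple

Itemset = Tuple[str, ...]


def frequent_itemsets(records: Iterable[Tuple[str, ...]], min_support: float,
                      max_size: int = 2) -> Set[Itemset]:
    """Return all itemsets of up to `max_size` items whose support reaches `min_support`."""
    records = list(records)
    counts: Counter = Counter()
    for record in records:
        items = sorted(set(record))
        for size in range(1, max_size + 1):
            counts.update(combinations(items, size))
    threshold = min_support * len(records)
    return {itemset for itemset, n in counts.items() if n >= threshold}


# Phase I: mine the normal profile from attack-free connection records.
attack_free = [
    ("proto=tcp", "port=80"),
    ("proto=tcp", "port=80"),
    ("proto=tcp", "port=25"),
    ("proto=udp", "port=53"),
]
normal_profile = frequent_itemsets(attack_free, min_support=0.5)

# Phase II: mine the last n connections and keep patterns absent from the profile.
last_n = [("proto=tcp", "port=8888")] * 3 + [("proto=tcp", "port=80")]
suspicious = frequent_itemsets(last_n, min_support=0.5) - normal_profile
print(sorted(suspicious))
```

Only the patterns that are frequent now but were not frequent in attack-free data survive the comparison, which is what Phase II hands on for further inspection.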

22

Frequent pattern mining

Apriori

• Any subset of a frequent itemset must also be frequent (an anti-monotone property)
– A transaction containing {beer, diaper, nuts} also contains {beer, diaper}
– If {beer, diaper, nuts} is frequent, {beer, diaper} must also be frequent

• No superset of any infrequent itemset should be generated or tested
– Many item combinations can be pruned
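A compact sketch of level-wise Apriori shows where the pruning happens. It is a teaching version under simple assumptions (in-memory transactions, an absolute minimum count), not an optimized implementation; the toy transactions reuse the items from the slide.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Iterable, Set


def apriori(transactions: Iterable[Set[str]], min_count: int) -> Dict[FrozenSet[str], int]:
    """Return every itemset contained in at least `min_count` transactions, with its count."""
    baskets = [frozenset(t) for t in transactions]
    current = {frozenset([item]) for basket in baskets for item in basket}
    frequent: Dict[FrozenSet[str], int] = {}
    k = 1
    while current:
        # Count each candidate of size k and keep the frequent ones.
        counts = {c: sum(c <= basket for basket in baskets) for c in current}
        level = {c: n for c, n in counts.items() if n >= min_count}
        frequent.update(level)
        k += 1
        # Join step: combine frequent itemsets into candidates of size k.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: drop any candidate with an infrequent (k-1)-subset.
        current = {
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


transactions = [
    {"beer", "diaper", "nuts"},
    {"beer", "diaper"},
    {"beer", "diaper", "nuts"},
    {"milk", "nuts"},
]
result = apriori(transactions, min_count=2)
print(result[frozenset({"beer", "diaper", "nuts"})], result[frozenset({"beer", "diaper"})])
```

Because {milk} is infrequent, no itemset containing milk is ever generated, which is the pruning the second bullet describes.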

23

Sequential Pattern Analysis

Models sequence patterns

• (Temporal) order is important in many situations
– Time-series databases and sequence databases
– Frequent patterns → (frequent) sequential patterns

• Sequential patterns for intrusion detection
– Capture the signatures for attacks in a series of packets

24

Sequential Pattern Mining

Given a set of sequences find the complete set of frequent subsequences
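As a small illustration of the problem statement, the sketch below finds frequent ordered pairs only. Restricting patterns to length 2 is a simplification made for brevity; the event names are hypothetical.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, List, Tuple


def frequent_pairs(sequences: List[List[str]], min_count: int) -> Dict[Tuple[str, str], int]:
    """Count ordered pairs (x before y) and keep those supported by `min_count` sequences."""
    counts: Counter = Counter()
    for sequence in sequences:
        # combinations() preserves input order, so each pair is an ordered subsequence;
        # the set makes a sequence support a given pair at most once.
        counts.update(set(combinations(sequence, 2)))
    return {pair: n for pair, n in counts.items() if n >= min_count}


sequences = [
    ["syn", "ack", "fin"],
    ["syn", "fin"],
    ["ack", "syn", "fin"],
]
pairs = frequent_pairs(sequences, min_count=2)
print(pairs)
```

Unlike itemsets, ("syn", "ack") and ("ack", "syn") are different patterns here, because order matters.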

25

Apriori Property in Sequences

26

Classification: A Two-Step Process

• Model construction: describe a set of predetermined classes
– Training dataset: tuples for model construction
– Each tuple/sample belongs to a predefined class
– Classification rules, decision trees or math formulae

• Model application: classify unseen objects
– Estimate accuracy of the model using an independent test set
– Acceptable accuracy → apply the model to classify data tuples with unknown class labels

27

Classification

28

Classification: Decision Tree

• A node in the tree: a test of some attribute
• A branch: a possible value of the attribute
• Classification
– Start at the root
– Test the attribute
– Move down the tree branch
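The traversal described above can be sketched with a hand-built tree. The tree, its attributes and the test records are invented for illustration (they are not learned from data), and the accuracy helper mirrors the "independent test set" step of the two-step process.

```python
from typing import Dict, List, Tuple, Union

# An internal node is (attribute, {value: subtree}); a leaf is a class label.
Tree = Union[str, Tuple[str, Dict[str, "Tree"]]]

# Hypothetical tree over connection attributes.
TREE: Tree = ("service", {
    "http": ("flag", {"SF": "normal", "S0": "intrusive"}),
    "smtp": "normal",
    "telnet": "intrusive",
})


def classify(tree: Tree, record: Dict[str, str], default: str = "unknown") -> str:
    """Start at the root, test the node's attribute, and follow the matching branch."""
    node = tree
    while not isinstance(node, str):
        attribute, branches = node
        node = branches.get(record.get(attribute), default)
    return node


def accuracy(tree: Tree, labeled: List[Tuple[Dict[str, str], str]]) -> float:
    """Fraction of independent test records whose predicted class matches the label."""
    return sum(classify(tree, rec) == label for rec, label in labeled) / len(labeled)


test_set = [
    ({"service": "http", "flag": "SF"}, "normal"),
    ({"service": "http", "flag": "S0"}, "intrusive"),
    ({"service": "telnet"}, "normal"),
    ({"service": "smtp"}, "normal"),
]
print(classify(TREE, {"service": "http", "flag": "S0"}))
print(accuracy(TREE, test_set))
```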

29

Neural classification: HIDE

"A hierarchical network intrusion detection system using statistical processing and neural network classification" by Zheng et al.

Five major components
• Probes collect traffic data
• Event preprocessor preprocesses traffic data and feeds the statistical model
• Statistical processor maintains a model for normal activities and generates vectors for new events
• Neural network classifies the vectors of new events
• Post processor generates reports

30

Clustering

What Is Clustering?

Group data into clusters
– Similar to one another within the same cluster
– Dissimilar to the objects in other clusters
– Unsupervised learning: no predefined classes

31

Clustering

What Is a Good Clustering?

• High intra-class similarity and low inter-class similarity
– Depending on the similarity measure

• The ability to discover some or all of the hidden patterns

32

Clustering

Clustering Approaches

• Partitioning algorithms
– Partition the objects into k clusters
– Iteratively reallocate objects to improve the clustering

• Hierarchy algorithms
– Agglomerative: each object is a cluster; merge clusters to form larger ones
– Divisive: all objects are in a cluster; split it up into smaller clusters

33

Clustering

K-Means Example
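The slide's figure did not survive extraction, so here is a minimal k-means sketch in its place. It assumes points are numeric tuples and uses Euclidean distance with randomly sampled initial centroids; the data and parameter names are illustrative only.

```python
import math
import random
from typing import List, Tuple

Point = Tuple[float, ...]


def kmeans(points: List[Point], k: int, iterations: int = 20,
           seed: int = 0) -> Tuple[List[Point], List[List[Point]]]:
    """Partition `points` into k clusters by iteratively reallocating them."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    clusters: List[List[Point]] = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, clusters


points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, clusters = kmeans(points, k=2)
print(sorted(len(c) for c in clusters))
```

This is the partitioning approach from the previous slide: fix k, then alternate between reallocating objects and recomputing cluster centers until nothing changes.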

34

Mining Data Streams for Intrusion Detection

• Maintaining profiles of normal activities
– The profiles of normal activities may drift

• Identifying novel attacks
– Identifying clusters and outliers in traffic data streams

• Reduce the future alarm load by writing filtering rules that automatically discard well-understood false positives
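One simple way to let a normal profile drift is an exponentially weighted moving average, sketched below. This is an illustrative assumption rather than a method named in the slides; the class name, the smoothing factor `alpha` and the relative `tolerance` are all hypothetical.

```python
class DriftingProfile:
    """Running estimate of a normal activity level that adapts to slow change."""

    def __init__(self, initial: float, alpha: float = 0.1, tolerance: float = 0.5) -> None:
        self.level = float(initial)
        self.alpha = alpha          # how quickly the profile follows new data
        self.tolerance = tolerance  # allowed relative deviation from the profile

    def observe(self, value: float) -> bool:
        """Return True if `value` is an outlier; otherwise fold it into the profile."""
        is_outlier = abs(value - self.level) > self.tolerance * self.level
        if not is_outlier:
            # Only normal-looking values update the profile, so outliers do not poison it.
            self.level += self.alpha * (value - self.level)
        return is_outlier


# Hypothetical stream of connections per minute with one burst.
profile = DriftingProfile(initial=100.0)
flags = [profile.observe(v) for v in [102, 105, 110, 400, 115]]
print(flags)
```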

35

Data Mining for Intrusion Detection

Misuse detection
• Predictive models are built from labeled data sets (instances are labeled as "normal" or "intrusive")
• These models can be more sophisticated and precise than manually created signatures
• Recent research, e.g. JAM (Java Agents for Metalearning)

36

Misuse Detection

[Figure: activities are pattern-matched against known intrusion patterns; a match signals an intrusion]

• Can't detect new attacks

• Example: if (src_ip == dst_ip) then "land attack"

• Look for known indicators: ICMP scans, port scans, connection attempts; CPU, RAM and I/O utilization; file system activity, modification of system files, permission modifications

37

JAM (Java Agents for Metalearning)

JAM (developed at Columbia University) uses data mining techniques to discover patterns of intrusions. It then applies a meta-learning classifier to learn the signatures of attacks.

The association rules algorithm determines relationships between fields in the audit trail records, and the frequent episodes algorithm models sequential patterns of audit events. Features are then extracted from both algorithms and used to compute models of intrusion behavior.

The classifiers build the signatures of attacks, so data mining in JAM builds a misuse detection model.

Classifiers in JAM are generated by running a rule learning program on training data of system usage. After training, the resulting classification rules are used to recognize anomalies and detect known intrusions.

The system has been tested with data from Sendmail-based attacks and with network attacks using TCP dump data.

38

Data Mining for Intrusion Detection

Anomaly detection
• Identifies anomalies as deviations from "normal" behavior
• E.g. ADAM: Audit Data Analysis and Mining; MINDS – MINnesota INtrusion Detection System

39

Anomaly Detection

[Figure: bar chart of activity measures (CPU, process size) on a 0-90 scale, comparing the normal profile with abnormal activity; a large deviation marks a probable intrusion]

• Baseline the normal traffic and then look for things that are out of the norm

• Relatively high false positive rate: anomalies can just be new normal activities

40

ADAM: Audit Data Analysis and Mining

Detecting Intrusion by Data Mining

Combination of association rules and classification rules

• Firstly, ADAM collects known frequent datasets with an off-line algorithm

• Secondly, ADAM runs an online algorithm
– Finds the last frequent connection records
– Compares them with known mined data
– Discards those which seem to be normal
– Suspicious ones are forwarded to the classifier

• The trained classifier then classifies the suspicious data as one of the following
– Known type of attack
– Unknown type of attack
– False alarm

41

ADAM Detecting Intrusion by Data Mining

42

ADAM: Audit Data Analysis and Mining

ADAM has two phases in its model

1st phase: train the classifier
• Offline process
• Takes place only once, before the main experiment

2nd phase: use the trained classifier
• The trained classifier is then used to detect anomalies
• Online process

43

The MINDS Project

MINDS – MINnesota INtrusion Detection System

• Learning from Rare Class – Building rare class prediction models

• Anomaly/outlier detection

• Summarization of attacks using association pattern analysis

TID   Items
1     Bread, Coke, Milk
2     Beer, Bread
3     Beer, Coke, Diaper, Milk
4     Beer, Bread, Diaper, Milk
5     Coke, Diaper, Milk

Rules Discovered: {Milk} --> {Coke}; {Diaper, Milk} --> {Beer}
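The two rules can be checked against the five transactions in the table. The sketch below computes support and confidence in the standard way; the helper names are illustrative.

```python
from typing import FrozenSet, List, Set

TRANSACTIONS: List[Set[str]] = [
    {"Bread", "Coke", "Milk"},
    {"Beer", "Bread"},
    {"Beer", "Coke", "Diaper", "Milk"},
    {"Beer", "Bread", "Diaper", "Milk"},
    {"Coke", "Diaper", "Milk"},
]


def count(itemset: Set[str]) -> int:
    """Number of transactions that contain every item of `itemset`."""
    return sum(itemset <= t for t in TRANSACTIONS)


def support(lhs: Set[str], rhs: Set[str]) -> float:
    """Fraction of all transactions containing both sides of the rule."""
    return count(lhs | rhs) / len(TRANSACTIONS)


def confidence(lhs: Set[str], rhs: Set[str]) -> float:
    """Fraction of the transactions containing `lhs` that also contain `rhs`."""
    return count(lhs | rhs) / count(lhs)


for lhs, rhs in [({"Milk"}, {"Coke"}), ({"Diaper", "Milk"}, {"Beer"})]:
    print(sorted(lhs), "-->", sorted(rhs),
          f"support={support(lhs, rhs):.2f} confidence={confidence(lhs, rhs):.2f}")
```

Milk appears in four transactions and three of them also contain Coke, so the first rule holds with confidence 3/4; the second holds in two of the three transactions containing both Diaper and Milk.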

44

MINDS - Learning from Rare Class

Problem: building models for rare network attacks (mining a needle in a haystack)

• Standard data mining models are not suitable for rare classes

• Models must be able to handle skewed class distributions

• Learning from data streams: intrusions are sequences of events

45

MINDS - Anomaly Detection

Detect novel attacks/intrusions by identifying them as deviations from "normal", i.e. anomalous behavior

• Identify normal behavior
• Construct useful set of features
• Define similarity function
• Use outlier detection algorithm
– Nearest neighbor approach
– Density based schemes
– Unsupervised Support Vector Machines (SVM)
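The nearest neighbor approach can be sketched by scoring each point with the distance to its k-th nearest neighbor. This is a generic illustration, not the MINDS algorithm itself; the two-feature points and the choice of Euclidean distance as the similarity function are assumptions.

```python
import math
from typing import List, Tuple

Point = Tuple[float, ...]


def knn_outlier_scores(points: List[Point], k: int = 2) -> List[float]:
    """Score each point by the distance to its k-th nearest neighbor (larger = more anomalous)."""
    if not 1 <= k < len(points):
        raise ValueError("k must be between 1 and len(points) - 1")
    scores = []
    for i, p in enumerate(points):
        distances = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(distances[k - 1])
    return scores


# Hypothetical feature vectors for five connections; the last one is far from the rest.
connections = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (2.0, 2.0), (10.0, 10.0)]
scores = knn_outlier_scores(connections, k=2)
most_anomalous = scores.index(max(scores))
print(most_anomalous, [round(s, 2) for s in scores])
```

Ranking connections by such a score, rather than thresholding it, matches the "ranked connections" output that an analyst would then review.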

46

Experimental Evaluation

[Figure: network → net-flow data collected using CISCO routers → data preprocessing → MINDS anomaly detection → anomaly scores → association pattern analysis, alongside Snort, an open source signature-based network IDS (www.snort.org). Annotations: 10 minutes cycle, 2 million connections; anomaly detection is applied 4 times a day on a 10 minutes time window]

• Publicly available data set: DARPA 1998 Intrusion Detection Evaluation Data Set, prepared and managed by MIT Lincoln Lab, includes a wide variety of intrusions simulated in a military network environment

• Real network data from University of Minnesota

47

MINDS - Framework for Mining Associations

[Figure: MINDS association analysis module. The anomaly detection system produces ranked connections, split into attack and normal; a discriminating association pattern generator derives rules such as "R1: TCP, DstPort=1863 → Attack" … "R100: TCP, DstPort=80 → Normal", which update the knowledge base]

1. Build normal profile
2. Study changes in normal behavior
3. Create attack summary
4. Detect misuse behavior
5. Understand nature of the attack

48

Discovered Real-life Association Patterns

Rule 1: SrcIP=XXXX, DstPort=80, Protocol=TCP, Flag=SYN, NoPackets: 3, NoBytes: 120…180 (c1=256, c2=1)

Rule 2: SrcIP=XXXX, DstIP=YYYY, DstPort=80, Protocol=TCP, Flag=SYN, NoPackets: 3, NoBytes: 120…180 (c1=177, c2=0)

• At first glance, Rule 1 appears to describe a Web scan
• Rule 2 indicates an attack on a specific machine
• Both rules together indicate that a scan is performed first, followed by an attack on a specific machine identified as vulnerable by the attacker

49

Discovered Real-life Association Patterns

DstIP=ZZZZ, DstPort=8888, Protocol=TCP (c1=369, c2=0)

DstIP=ZZZZ, DstPort=8888, Protocol=TCP, Flag=SYN (c1=291, c2=0)

• This pattern indicates an anomalously high number of TCP connections on port 8888 involving machine ZZZZ
• Follow-up analysis of connections covered by the pattern indicates that this could be a machine running a variation of the Kazaa file-sharing protocol
• Having an unauthorized application increases the vulnerability of the system

50

Discovered Real-life Association Patterns… (ctd)

SrcIP=XXXX, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=4, NoBytes=189…200 (c1=582, c2=2)

SrcIP=XXXX, DstPort=12345, NoPackets=4, NoBytes=189…200 (c1=580, c2=3)

SrcIP=YYYY, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=3, NoBytes=144 (c1=694, c2=3)

……

• This pattern indicates a large number of scans on ports 27374 (which is a signature for the SubSeven worm) and 12345 (which is a signature for the NetBus worm)
• Further analysis showed that no fewer than five machines were scanning for one or both of these ports in any time window

51

Discovered Real-life Association Patterns… (ctd)

DstPort=6667, Protocol=TCP (c1=254, c2=1)

• This pattern indicates an unusually large number of connections on port 6667 detected by the anomaly detector
• Port 6667 is where IRC (Internet Relay Chat) is typically run
• Further analysis reveals that there are many small packets from/to various IRC servers around the world
• Although IRC traffic is not unusual, the fact that it is flagged as anomalous is interesting
• This might indicate that the IRC server has been taken down (by a DOS attack, for example) or that it is a rogue IRC server (it could be involved in some hacking activity)

52

Discovered Real-life Association Patterns… (ctd)

DstPort=1863, Protocol=TCP, Flag=0, NoPackets=1, NoBytes<139 (c1=498, c2=6)

DstPort=1863, Protocol=TCP, Flag=0 (c1=587, c2=6)

DstPort=1863, Protocol=TCP (c1=606, c2=8)

• This pattern indicates a large number of anomalous TCP connections on port 1863
• Further analysis reveals that the remote IP block is owned by Hotmail
• Flag=0 is unusual for TCP traffic

53

MINDS: Conclusion

• Data mining based algorithms are capable of detecting intrusions that cannot be detected by state-of-the-art signature based methods
• SNORT has static knowledge manually updated by human analysts
• MINDS anomaly detection algorithms are adaptive in nature
• MINDS anomaly detection algorithms can also be effective in detecting anomalous behavior originating from a compromised or infected machine

MINDS Research
• Defining normal behavior
• Feature extraction
• Similarity functions
• Outlier detection
• Result summarization
• Detection of attacks originating from multiple sites

[Figure labels: Worm/virus detection after infection; Insider attack, policy violation; Outsider attack, network intrusion]

54

IDS Using both Misuse and Anomaly Detection: RIDS-100

RIDS (Rising Intrusion Detection System) is provided by Rising Tech, a leader in antivirus and content security software and services in China.

The company is a leading provider of client, gateway and server security solutions for virus protection, firewall and intrusion detection technologies, and of security services to enterprises and service providers around China.

RIDS makes use of both intrusion detection techniques: misuse detection and anomaly detection.

A distance-based outlier detection algorithm is used to detect deviating behavior among the collected network data.

For misuse detection, it has a very large set of collected data patterns that can be matched against scanned network data.

This large set of data patterns is scanned using a data mining classification algorithm, the decision tree.

http://www.rising-global.com

55

A cooperative anomaly and intrusion detection system (CAIDS)

Built with a network-based intrusion detection system (NIDS) and an anomaly detection system (ADS), operating interactively through a signature generator


5

Intrusion Detection System (IDS)

Intrusion Detection System combination of software and hardware that attempts to

perform intrusion detection raises the alarm when possible intrusion happens

6

IDS Categories

Intrusion detection systems are split into two groups Anomaly detection systems

Identify malicious traffic based on deviations from established normal network

Misuse detection systems Identify intrusions based on a known pattern

(signatures) for the malicious activity

7

Anomaly Detection

activity measures

0102030405060708090

CPU ProcessSize

normal profileabnormal

probable intrusion

Relatively high false positive rate - anomalies can just be new normal activities

baseline the normal traffic and then look for things that are out of the norm

8

Misuse Detection

Intrusion Patterns

activities

pattern matching

intrusion

Canrsquot detect new attacks

Example if (src_ip == dst_ip) then ldquoland attackrdquo

look for known indicators ICMP Scans port scans connection attempts CPU RAM IO Utilization File system activity modification of system files permission modifications

9

Goal of Intrusion Detection Systems (IDS) To detect an intrusion as it happens and be able to respond to it

False positives A false positive is a situation where something abnormal (as

defined by the IDS) happens but it is not an intrusion Too many false positives

User will quit monitoring IDS because of noise False negatives

A false negative is a situation where an intrusion is really happening but IDS doesnt catch it

10

Outline

Intrusion Detection

Data Mining

Data Mining in Intrusion Detection

Reference

11

Why do we need Data Mining

Despite the enormous amount of data particular events of interest are still quite rare frequency ranges from 01 to less than 10

We are drowning in data but starving for knowledge1048714

12

Data Mining vs KDD

Knowledge Discovery in Databases (KDD) The whole process of finding useful information and patterns in data

Data Mining Use of algorithms to extract the information and patterns derived by the KDD process

Data mining is the core of the knowledge discovery process

13

KDD Process

Selection Obtain data from various sources Preprocessing Cleanse data Transformation Convert to common format Transform

to new format Data Mining Obtain desired results InterpretationEvaluation Present results to user in

meaningful manner

14

Data Mining A KDD Processndash Data mining core of

knowledge discovery process

Data Cleaning

Data Integration

Databases

Data Warehouse

Task-relevant Data

Selection

Data Mining

Pattern Evaluation

15

Typical Data Mining Architecture

Data Warehouse

Data cleaning amp data integration Filtering

Databases

Database or data warehouse server

Data mining engine

Pattern evaluation

Graphical user interface

Knowledge-base

16

Outline

Intrusion Detection

Data Mining

Data Mining in Intrusion Detection

Reference

17

Network intrusion detection

Number of intrusions on the network is typically a very small fraction of the total network traffic

18

Why Can Data Mining Help

Learn from traffic data Supervised learning learn precise models from past intrusions Unsupervised learning identify suspicious activities

Maintain models on dynamic data

Correlation of suspicious events across network sites Helps detect sophisticated attacks not identifiable by single site analyses

Analysis of long term data (monthsyears) Uncover suspicious stealth activities (eg insiders leakingmodifying

information)

19

Intrusion Detection

Traditional intrusion detection system IDS tools (eg SNORT) are based on signatures of known attacks

LimitationsSignature database has to be manually revised

for each new type of discovered intrusionThey cannot detect emerging cyber threatsSubstantial latency in deployment of newly created

signatures across the computer system

20

Data Mining for Intrusion Detection Techniques and Applications

Frequent pattern mining Classification Clustering Mining data streams

21

Patterns that occur frequently in a database

Mining Frequent patterns ndash finding regularities

Process of Mining Frequent patterns for intrusion de

tection Phase I mine a repository of normal frequent itemsets for a

ttack-free data

Phase II find frequent itemsets in the last n connections an

d compare the patterns to the normal profile

Frequent pattern mining

22

Frequent pattern mining

Apriori bull Any subset of a frequent itemset must be also freque

nt mdash an anti-monotone propertyndash A transaction containing beer diaper nuts alsocontains beer diaperndash beer diaper nuts is frequent beer diaper mustalso be frequent

bull No superset of any infrequent itemset should be generated or tested

ndash Many item combinations can be pruned

23

Sequential Pattern Analysis

Models sequence patterns (Temporal) order is important in many situations

Time-series databases and sequence databases

Frequent patterns (frequent) sequential patterns

Sequential patterns for intrusion detection Capture the signatures for attacks in a series of packets

24

Sequential Pattern Mining

Given a set of sequences find the complete set of frequent subsequences

25

Apriori Property in Sequences

26

Classification A Two-Step Process Model construction describe a set of predetermined

classes Training dataset tuples for model construction

Each tuplesample belongs to a predefined class Classification rules decision trees or math formulae

Model application classify unseen objects Estimate accuracy of the model using an independent test

set Acceptable accuracy apply the model to classify data tu

ples with unknown class labels

27

Classification

28

Classification Decision Tree

A node in the tree a test of some attribute A branch a possible value of the attribute Classification

Start at the root Test the attribute Move down the tree branch

29

Neural classification HIDE

ldquoA hierarchical network intrusion detection system using statistical processing and neural network classificationrdquo by Zheng et al

Five major components Probes collect traffic data Event preprocessor preprocesses traffic data and feeds the

statistical model Statistical processor maintains a model for normal activities

and generates vectors for new events Neural network classifies the vectors of new events Post processor generates reports

30

Clustering

What Is Clustering Group data into clusters

ndash Similar to one another within the same cluster ndash Dissimilar to the objects in other clusters ndash Unsupervised learning no predefined classes

31

Clustering

What Is A Good Clustering High intra-class similarity and low interclasssimilar

ity Depending on the similarity measure

The ability to discover some or all of the hidden patterns

32

Clustering

Clustering Approaches Partitioning algorithms

ndash Partition the objects into k clusters ndash Iteratively reallocate objects to improve the clusterin

g Hierarchy algorithms

ndash Agglomerative each object is a cluster merge clusters to form larger ones

ndash Divisive all objects are in a cluster split it up into smaller clusters

33

Clustering

K-Means Example

34

Mining Data Streams for Intrusion Detection

Maintaining profiles of normal activities The profiles of normal activities may drift

Identifying novel attacks Identifying clusters and outliers in traffic data

streams Reduce the future alarm load by writing

filtering rules that automatically discard well-understood false positives

35

Data Mining for Intrusion Detection

Misuse detectionPredictive models are built from labeled data sets (instances

are labeled as ldquonormalrdquo or ldquointrusiverdquo)These models can be more sophisticated and precise than manually

created signatures Recent research eg JAM (Java Agents for Metalearning)

36

Misuse Detection

Intrusion Patterns

activities

pattern matching

intrusion

Canrsquot detect new attacks

Example if (src_ip == dst_ip) then ldquoland attackrdquo

look for known indicators ICMP Scans port scans connection attempts CPU RAM IO Utilization File system activity modification of system files permission modifications

37

JAM (Java Agents for Metalearning)

JAM (developed at Columbia University) uses data mining techniques to discover patterns of intrusions. It then applies a meta-learning classifier to learn the signatures of attacks.

The association rules algorithm determines relationships between fields in the audit trail records, and the frequent episodes algorithm models sequential patterns of audit events. Features are then extracted from both algorithms and used to compute models of intrusion behavior.

The classifiers build the signatures of attacks, so data mining in JAM builds a misuse detection model.

Classifiers in JAM are generated by running a rule learning program on training data of system usage. After training, the resulting classification rules are used to recognize anomalies and detect known intrusions.

The system has been tested with data from Sendmail-based attacks and with network attacks using TCP dump data.

38

Data Mining for Intrusion Detection

Anomaly detection

- Identifies anomalies as deviations from "normal" behavior

- E.g. ADAM (Audit Data Analysis and Mining), MINDS (MINnesota INtrusion Detection System)

39

Anomaly Detection

[Figure: bar chart of activity measures (CPU, process size; scale 0 to 90) comparing the normal profile with abnormal values, which indicate a probable intrusion]

Relatively high false positive rate: anomalies can just be new normal activities

Baseline the normal traffic and then look for things that are out of the norm
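
One simple way to "baseline and look for deviations" is a mean and standard deviation profile with a z-score threshold. The slides do not prescribe a particular statistic, so the threshold of 3 and the CPU readings below are assumptions for illustration.

```python
from statistics import mean, stdev

def build_profile(baseline):
    """Summarise normal activity as (mean, sample standard deviation)."""
    return mean(baseline), stdev(baseline)

def is_anomalous(value, profile, threshold=3.0):
    """Flag a measurement more than `threshold` standard deviations from the mean."""
    mu, sigma = profile
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > threshold

# Hypothetical CPU utilisation (%) observed during attack-free operation
cpu_baseline = [20, 22, 19, 21, 20, 23, 18, 21]
profile = build_profile(cpu_baseline)
```

A legitimate but new workload would also exceed the threshold, which is the source of the relatively high false positive rate mentioned above.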

40

ADAM: Audit Data Analysis and Mining

Detecting intrusion by data mining: a combination of association rules and classification rules

First, ADAM collects known frequent datasets with an off-line algorithm

Second, ADAM runs an online algorithm

- Finds the latest frequent connection records

- Compares them with the known mined data

- Discards those that seem to be normal

- Forwards the suspicious ones to the classifier

The trained classifier then classifies the suspicious data as one of the following

- Known type of attack

- Unknown type of attack

- False alarm

41

ADAM: Detecting Intrusion by Data Mining

42

ADAM: Audit Data Analysis and Mining

ADAM has two phases in its model

1st phase: train the classifier

- Offline process

- Takes place only once, before the main experiment

2nd phase: use the trained classifier

- The trained classifier is then used to detect anomalies

- Online process

43

The MINDS Project

MINDS: MINnesota INtrusion Detection System

Learning from Rare Class: building rare class prediction models

Anomaly/outlier detection

Summarization of attacks using association pattern analysis

TID   Items
1     Bread, Coke, Milk
2     Beer, Bread
3     Beer, Coke, Diaper, Milk
4     Beer, Bread, Diaper, Milk
5     Coke, Diaper, Milk

Rules discovered: {Milk} --> {Coke}; {Diaper, Milk} --> {Beer}
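
To make the two discovered rules concrete, the sketch below computes support and confidence over the five transactions in the table. The function names are illustrative; the definitions used are the usual ones (support is the fraction of transactions containing an itemset, confidence is the fraction of transactions containing the left-hand side that also contain the right-hand side).

```python
# The five market-basket transactions from the table above
transactions = [
    {"Bread", "Coke", "Milk"},
    {"Beer", "Bread"},
    {"Beer", "Coke", "Diaper", "Milk"},
    {"Beer", "Bread", "Diaper", "Milk"},
    {"Coke", "Diaper", "Milk"},
]

def count(itemset):
    """Number of transactions that contain every item of `itemset`."""
    return sum(itemset <= t for t in transactions)

def support(itemset):
    return count(itemset) / len(transactions)

def confidence(lhs, rhs):
    """Fraction of transactions containing lhs that also contain rhs."""
    return count(lhs | rhs) / count(lhs)

milk_to_coke = confidence({"Milk"}, {"Coke"})                   # 3 of 4 Milk baskets
diaper_milk_to_beer = confidence({"Diaper", "Milk"}, {"Beer"})  # 2 of 3 baskets
```

In the intrusion detection setting the "items" become connection attributes such as destination port or protocol, and the "transactions" become connection records.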

44

MINDS - Learning from Rare Class

Problem: building models for rare network attacks (mining a needle in a haystack)

Standard data mining models are not suitable for rare classes

Models must be able to handle skewed class distributions

Learning from data streams - intrusions are sequences of events

45

MINDS - Anomaly Detection

Detect novel attacks/intrusions by identifying them as deviations from "normal", i.e. anomalous behavior

- Identify normal behavior

- Construct useful set of features

- Define similarity function

- Use outlier detection algorithm

Nearest neighbor approach

Density based schemes

Unsupervised Support Vector Machines (SVM)
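
As an illustration of the nearest neighbor approach, the sketch below scores each point by its distance to its k-th nearest neighbor; isolated points get large scores. This is a generic textbook formulation, not necessarily the exact scheme MINDS uses, and the feature vectors are made up.

```python
import math

def knn_outlier_scores(points, k=2):
    """Score each point by the Euclidean distance to its k-th nearest neighbour."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores

# Hypothetical 2-feature connection summaries; the last one is far from the rest
features = [(1, 1), (1, 2), (2, 1), (2, 2), (10, 10)]
scores = knn_outlier_scores(features, k=2)
most_anomalous = max(range(len(features)), key=scores.__getitem__)
```

The choice of similarity function matters here: Euclidean distance is used only because the example features are numeric.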

46

Experimental Evaluation

[Figure: evaluation pipeline: network -> net-flow data collected using CISCO routers -> data preprocessing -> MINDS anomaly detection -> anomaly scores -> association pattern analysis. Also labeled: open source signature-based network IDS (www.snort.org)]

- 10-minute cycle, 2 million connections

- Anomaly detection is applied 4 times a day on a 10-minute time window

- Publicly available data set: the DARPA 1998 Intrusion Detection Evaluation Data Set, prepared and managed by MIT Lincoln Lab, includes a wide variety of intrusions simulated in a military network environment

- Real network data from University of Minnesota

47

MINDS - Framework for Mining Associations

[Figure: MINDS association analysis module. Ranked connections from the anomaly detection system are labeled attack or normal and fed to the discriminating association pattern generator, which outputs rules such as "R1: TCP, DstPort=1863 -> Attack" ... "R100: TCP, DstPort=80 -> Normal" and updates the knowledge base]

1. Build normal profile

2. Study changes in normal behavior

3. Create attack summary

4. Detect misuse behavior

5. Understand nature of the attack

48

Discovered Real-life Association Patterns

At first glance, Rule 1 appears to describe a Web scan

Rule 2 indicates an attack on a specific machine

Both rules together indicate that a scan is performed first, followed by an attack on a specific machine identified as vulnerable by the attacker

Rule 1: SrcIP=XXXX, DstPort=80, Protocol=TCP, Flag=SYN, NoPackets: 3, NoBytes: 120…180 (c1=256, c2=1)

Rule 2: SrcIP=XXXX, DstIP=YYYY, DstPort=80, Protocol=TCP, Flag=SYN, NoPackets: 3, NoBytes: 120…180 (c1=177, c2=0)

49

DstIP=ZZZZ, DstPort=8888, Protocol=TCP (c1=369, c2=0)

DstIP=ZZZZ, DstPort=8888, Protocol=TCP, Flag=SYN (c1=291, c2=0)

This pattern indicates an anomalously high number of TCP connections on port 8888 involving machine ZZZZ

Follow-up analysis of connections covered by the pattern indicates that this could be a machine running a variation of the Kazaa file-sharing protocol

Having an unauthorized application increases the vulnerability of the system

Discovered Real-life Association Patterns

50

SrcIP=XXXX, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=4, NoBytes=189…200 (c1=582, c2=2)

SrcIP=XXXX, DstPort=12345, NoPackets=4, NoBytes=189…200 (c1=580, c2=3)

SrcIP=YYYY, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=3, NoBytes=144 (c1=694, c2=3)

……

This pattern indicates a large number of scans on ports 27374 (which is a signature for the SubSeven worm) and 12345 (which is a signature for the NetBus worm)

Further analysis showed that no fewer than five machines were scanning for one or both of these ports in any time window

Discovered Real-life Association Patterns… (ctd)

51

DstPort=6667, Protocol=TCP (c1=254, c2=1)

This pattern indicates an unusually large number of connections on port 6667 detected by the anomaly detector

Port 6667 is where IRC (Internet Relay Chat) is typically run

Further analysis reveals that there are many small packets from/to various IRC servers around the world

Although IRC traffic is not unusual, the fact that it is flagged as anomalous is interesting

This might indicate that the IRC server has been taken down (by a DOS attack, for example) or that it is a rogue IRC server (it could be involved in some hacking activity)

Discovered Real-life Association Patterns… (ctd)

52

DstPort=1863, Protocol=TCP, Flag=0, NoPackets=1, NoBytes<139 (c1=498, c2=6)

DstPort=1863, Protocol=TCP, Flag=0 (c1=587, c2=6)

DstPort=1863, Protocol=TCP (c1=606, c2=8)

This pattern indicates a large number of anomalous TCP connections on port 1863

Further analysis reveals that the remote IP block is owned by Hotmail

Flag=0 is unusual for TCP traffic

Discovered Real-life Association Patterns… (ctd)

53

MINDS Conclusion

- Data mining based algorithms are capable of detecting intrusions that cannot be detected by state-of-the-art signature based methods

- SNORT has static knowledge, manually updated by human analysts

- MINDS anomaly detection algorithms are adaptive in nature

- MINDS anomaly detection algorithms can also be effective in detecting anomalous behavior originating from a compromised or infected machine

MINDS Research

- Defining normal behavior

- Feature extraction

- Similarity functions

- Outlier detection

- Result summarization

- Detection of attacks originating from multiple sites

[Figure labels: worm/virus detection after infection; insider attack: policy violation; outsider attack: network intrusion]

54

IDS Using Both Misuse and Anomaly Detection: RIDS-100

RIDS (Rising Intrusion Detection System) is provided by Rising Tech, a leader in antivirus and content security software and services in China

The company is a leading provider of client, gateway and server security solutions for virus protection, firewall and intrusion detection technologies, and of security services to enterprises and service providers around China

RIDS makes use of both intrusion detection techniques: misuse detection and anomaly detection

A distance-based outlier detection algorithm is used to detect deviational behavior in the collected network data

For misuse detection, it has a very large set of collected data patterns that can be matched against scanned network data

This large set of data patterns is scanned using a data mining classification algorithm (decision tree)

http://www.rising-global.com

55

A cooperative anomaly and intrusion detection system (CAIDS)

Built with a network-based intrusion detection system (NIDS) and an anomaly detection system (ADS) operating interactively through a signature generator

56

A cooperative anomaly and intrusion detection system (CAIDS)

A frequent episode rule (FER) is generated out of a collection of frequent episodes. The FER is defined over episode sequences with multiple connection events

For example, we envision a window in which we observe a 3-event sequence: E, D and F

An FER is generated as E → D, F with confidence level freq(a ∪ b)/freq(b) = 0.8, where a represents the event E on the LHS and b corresponds to the two events D and F on the RHS of the rule

If b occurs with 5% probability and the joint event a and b has a 4% probability of occurring, there is a (0.04/0.05) = 80% chance that D and F will follow in the same window

57

A cooperative anomaly and intrusion detection system (CAIDS)

In practice, the event E could be an authentication service characterized by two attributes (service = authentication, flag = SF)

The events D, F may be two sequential smtp requests, denoted by (service = smtp)

Thus we can derive an FER with a confidence level of c = 80% that two smtp services will follow the authentication service within a window w = 2 sec. The three joint traffic events account for a support level of s = 10% of all the network connections being evaluated. This FER is formally stated as follows

(service = authentication) → (service = smtp), (service = smtp) (0.8, 0.1, 2 sec)    (1)
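
To show what checking a rule like (1) against traffic could look like, here is a minimal sketch that scans a time-ordered event list and measures how often the RHS events follow the LHS event within the window. It assumes the conventional reading of confidence (the fraction of LHS occurrences that are followed by the RHS), and the event list and function name are hypothetical; it is not the CAIDS implementation.

```python
def episode_confidence(events, lhs, rhs, window):
    """events: list of (timestamp_seconds, service), sorted by time.
    lhs: service name on the rule's left-hand side.
    rhs: ordered list of services expected after lhs within `window` seconds."""
    lhs_count = 0
    joint_count = 0
    for i, (t0, service) in enumerate(events):
        if service != lhs:
            continue
        lhs_count += 1
        remaining = list(rhs)
        for t, s in events[i + 1:]:
            if t - t0 > window:
                break  # past the end of the window for this LHS occurrence
            if remaining and s == remaining[0]:
                remaining.pop(0)
        if not remaining:  # every RHS event was seen, in order, inside the window
            joint_count += 1
    return joint_count / lhs_count if lhs_count else 0.0

# Hypothetical connection events
events = [
    (0.0, "authentication"), (0.5, "smtp"), (1.2, "smtp"),
    (5.0, "authentication"), (5.4, "smtp"), (9.0, "smtp"),
    (12.0, "http"),
]
conf = episode_confidence(events, "authentication", ["smtp", "smtp"], window=2.0)
```

Only the first authentication event is followed by two smtp requests within 2 seconds, so the measured confidence on this toy trace is one half.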

58

A cooperative anomaly and intrusion detection system (CAIDS)

An association rule is aimed at finding interesting intra-relationships inside a single connection record

In general, an FER is specified by the following expression

L1, L2, …, Ln → R1, …, Rm (c, s, window)    (2)

Li (1 ≤ i ≤ n) and Rj (1 ≤ j ≤ m) are ordered traffic connection events

We call L1, L2, …, Ln the LHS episode and R1, …, Rm the RHS of the episode rule

59

A cooperative anomaly and intrusion detection system (CAIDS)

Architecture of the CAIDS simulator, built with a 2000-signature Snort and an anomaly detection subsystem (ADS) with 60 FERs after 2 weeks of rule training over the Lincoln Lab IDS evaluation dataset

60

Conclusion

In this report we have studied the basic concepts and some classic system models in this area, such as ADAM and MINDS

We summarized these system models, their technologies and their validation methods

We hope this gives an overview of current development in this area and of how data mining is evolving into the field of network intrusion detection

61

Reference

DARPA 1998 data set. A cleansed set in KDD Cup '99; DARPA 1991 data set is also available. http://www.ll.mit.edu/IST/ideval/data/data_index.html

Daniel Barbara, Julia Couto, Sushil Jajodia, Leonard Popyack, Ningning Wu. "ADAM: Detecting Intrusions by Data Mining". Proceedings of the 2001 IEEE Workshop on Information Assurance and Security, United States Military Academy, West Point, NY, 5-6 June 2001.

Zhang, J. and Zulkernine, M. 2006. A Hybrid Network Intrusion Detection Technique Using Random Forests. In Proceedings of the First International Conference on Availability, Reliability and Security (April 20-22, 2006).

W. Lee et al. A data mining framework for building intrusion detection models. In Information and System Security, Vol. 3, No. 4, 2000.

Ertoz, L. et al. MINDS - Minnesota Intrusion Detection System. Next Generation Data Mining, Chapter 3, 2004.

Lu, C.-T., Boedihardjo, A. P., Manalwar, P. Exploiting efficient data mining techniques to enhance intrusion detection systems. IEEE International Conference on Information Reuse and Integration (IRI-2005), 15-17 Aug. 2005, pp. 512-517.

Sal Stolfo, Andreas Prodromidis, Shelley Tselepis, Wenke Lee, Dave Fan, and Phil Chan. (Honorable mention (runner-up) for Best Paper Award in Applied Research Category.) In Proceedings of the Third International Conference on Knowledge Discovery and Data Mining (KDD 97), Newport Beach, CA, August 1997.

62

Questions & Comments


7

Anomaly Detection

[Figure: bar chart of activity measures (CPU, process size; scale 0 to 90) comparing the normal profile with abnormal values, which indicate a probable intrusion]

Relatively high false positive rate: anomalies can just be new normal activities

Baseline the normal traffic and then look for things that are out of the norm

8

Misuse Detection

[Figure: activities are pattern-matched against known intrusion patterns; a match is reported as an intrusion]

Can't detect new attacks

Example: if (src_ip == dst_ip) then "land attack"

Look for known indicators: ICMP scans, port scans, connection attempts; CPU, RAM and I/O utilization; file system activity; modification of system files; permission modifications

9

Goal of Intrusion Detection Systems (IDS): to detect an intrusion as it happens and be able to respond to it

False positives

- A false positive is a situation where something abnormal (as defined by the IDS) happens, but it is not an intrusion

- Too many false positives: the user will quit monitoring the IDS because of noise

False negatives

- A false negative is a situation where an intrusion is really happening, but the IDS doesn't catch it
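
The two error types can be counted directly once alerts are compared against ground truth. The sketch below is illustrative only; the alert and truth lists are invented, and the rates are the standard false positive rate (false alarms over all non-intrusions) and false negative rate (missed intrusions over all intrusions).

```python
def alarm_rates(alerts, truths):
    """alerts[i]: did the IDS raise an alarm; truths[i]: was it really an intrusion.
    Returns (false_positive_rate, false_negative_rate)."""
    false_pos = sum(a and not t for a, t in zip(alerts, truths))
    false_neg = sum(t and not a for a, t in zip(alerts, truths))
    negatives = sum(not t for t in truths)
    positives = sum(bool(t) for t in truths)
    fpr = false_pos / negatives if negatives else 0.0
    fnr = false_neg / positives if positives else 0.0
    return fpr, fnr

# Hypothetical outcome for six monitored events
alerts = [True, True, False, False, True, False]
truths = [True, False, False, False, False, True]
rates = alarm_rates(alerts, truths)
```

Because intrusions are a tiny share of traffic, even a small false positive rate can produce far more false alarms than true detections, which is why the noise problem above matters in practice.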

10

Outline

Intrusion Detection

Data Mining

Data Mining in Intrusion Detection

Reference

11

Why do we need Data Mining?

Despite the enormous amount of data, particular events of interest are still quite rare: their frequency ranges from 0.1% to less than 10%

We are drowning in data, but starving for knowledge

12

Data Mining vs KDD

Knowledge Discovery in Databases (KDD): the whole process of finding useful information and patterns in data

Data Mining: use of algorithms to extract the information and patterns derived by the KDD process

Data mining is the core of the knowledge discovery process

13

KDD Process

- Selection: obtain data from various sources

- Preprocessing: cleanse data

- Transformation: convert to common format; transform to new format

- Data Mining: obtain desired results

- Interpretation/Evaluation: present results to user in meaningful manner

14

Data Mining: A KDD Process

- Data mining: core of knowledge discovery process

[Figure: KDD pipeline: Databases -> Data Cleaning and Data Integration -> Data Warehouse -> Selection -> Task-relevant Data -> Data Mining -> Pattern Evaluation]

15

Typical Data Mining Architecture

[Figure: layered architecture with these components: Databases and Data Warehouse (data cleaning & data integration, filtering); database or data warehouse server; data mining engine; pattern evaluation; graphical user interface; knowledge base]

16

Outline

Intrusion Detection

Data Mining

Data Mining in Intrusion Detection

Reference

17

Network intrusion detection

Number of intrusions on the network is typically a very small fraction of the total network traffic

18

Why Can Data Mining Help?

Learn from traffic data

- Supervised learning: learn precise models from past intrusions

- Unsupervised learning: identify suspicious activities

Maintain models on dynamic data

Correlation of suspicious events across network sites

- Helps detect sophisticated attacks not identifiable by single-site analyses

Analysis of long-term data (months/years)

- Uncover suspicious stealth activities (e.g. insiders leaking/modifying information)

19

Intrusion Detection

Traditional intrusion detection system (IDS) tools (e.g. SNORT) are based on signatures of known attacks

Limitations

- The signature database has to be manually revised for each new type of discovered intrusion

- They cannot detect emerging cyber threats

- Substantial latency in deployment of newly created signatures across the computer system

20

Data Mining for Intrusion Detection: Techniques and Applications

- Frequent pattern mining

- Classification

- Clustering

- Mining data streams

21

Frequent pattern mining

Patterns that occur frequently in a database

Mining frequent patterns: finding regularities

Process of mining frequent patterns for intrusion detection

- Phase I: mine a repository of normal frequent itemsets for attack-free data

- Phase II: find frequent itemsets in the last n connections and compare the patterns to the normal profile

22

Frequent pattern mining

Apriori

- Any subset of a frequent itemset must also be frequent (an anti-monotone property)

  - A transaction containing {beer, diaper, nuts} also contains {beer, diaper}

  - If {beer, diaper, nuts} is frequent, {beer, diaper} must also be frequent

- No superset of any infrequent itemset should be generated or tested

  - Many item combinations can be pruned
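
The pruning idea can be sketched as a level-wise search: candidates of size k are built only from frequent itemsets of size k-1, and any candidate with an infrequent subset is dropped before it is counted. The transactions and the `min_count` threshold below are made up for illustration; this is a compact teaching version, not an optimised Apriori.

```python
from itertools import combinations

def apriori(transactions, min_count):
    """Return {frozenset: count} for every itemset seen in at least min_count transactions."""
    items = sorted({item for t in transactions for item in t})
    level = [frozenset([i]) for i in items]
    frequent = {}
    k = 1
    while level:
        counts = {c: sum(c <= t for t in transactions) for c in level}
        current = {c: n for c, n in counts.items() if n >= min_count}
        frequent.update(current)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: keep a candidate only if all of its (k-1)-subsets are frequent
        level = [
            c for c in candidates
            if all(frozenset(s) in current for s in combinations(c, k - 1))
        ]
    return frequent

# Hypothetical transactions
baskets = [
    {"beer", "diaper", "nuts"},
    {"beer", "diaper"},
    {"beer", "nuts"},
    {"diaper", "milk"},
]
frequent_itemsets = apriori(baskets, min_count=2)
```

Here {beer, diaper, nuts} is never counted, because its subset {diaper, nuts} is already known to be infrequent.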

23

Sequential Pattern Analysis

Models sequence patterns

(Temporal) order is important in many situations

- Time-series databases and sequence databases

- Frequent patterns → (frequent) sequential patterns

Sequential patterns for intrusion detection

- Capture the signatures for attacks in a series of packets

24

Sequential Pattern Mining

Given a set of sequences find the complete set of frequent subsequences
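
The building block of sequential pattern mining is counting how many sequences contain a candidate pattern as an order-preserving (not necessarily contiguous) subsequence. The sketch below shows only that counting step, with invented packet-flag sequences; a full miner would also generate and prune candidates.

```python
def is_subsequence(pattern, sequence):
    """True if the items of `pattern` appear in `sequence` in the same order."""
    it = iter(sequence)
    # `item in it` advances the iterator, so order is enforced
    return all(item in it for item in pattern)

def sequence_support(pattern, sequences):
    """Number of sequences that contain `pattern` as a subsequence."""
    return sum(is_subsequence(pattern, s) for s in sequences)

# Hypothetical per-connection packet flag sequences
flag_sequences = [
    ["syn", "syn", "ack", "fin"],
    ["syn", "rst"],
    ["syn", "ack", "rst"],
]
syn_then_ack = sequence_support(["syn", "ack"], flag_sequences)
ack_then_syn = sequence_support(["ack", "syn"], flag_sequences)
```

A pattern is reported as frequent when its support reaches a chosen minimum, and order matters: the same items in the reverse order can have very different support.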

25

Apriori Property in Sequences

26

Classification: A Two-Step Process

Model construction: describe a set of predetermined classes

- Training dataset: tuples for model construction

- Each tuple/sample belongs to a predefined class

- Classification rules, decision trees, or math formulae

Model application: classify unseen objects

- Estimate accuracy of the model using an independent test set

- Acceptable accuracy: apply the model to classify data tuples with unknown class labels

27

Classification

28

Classification: Decision Tree

A node in the tree: a test of some attribute

A branch: a possible value of the attribute

Classification

- Start at the root

- Test the attribute

- Move down the tree branch
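
The three classification steps map directly onto a small traversal loop. The tree below is a hand-written, hypothetical example (its attributes and labels are not from the slides); in practice the tree would be learned from labeled training data.

```python
# Hypothetical decision tree: inner nodes test an attribute, leaves carry a class label
tree = {
    "attribute": "protocol",
    "branches": {
        "icmp": {"label": "normal"},
        "tcp": {
            "attribute": "flag",
            "branches": {
                "SYN": {"label": "intrusive"},
                "ACK": {"label": "normal"},
            },
        },
    },
}

def classify(node, record):
    """Start at the root, test the node's attribute, follow the matching branch."""
    while "label" not in node:
        value = record[node["attribute"]]
        node = node["branches"][value]
    return node["label"]

verdict = classify(tree, {"protocol": "tcp", "flag": "SYN"})
```

A record with an attribute value that has no branch raises a `KeyError` in this sketch; a production classifier would need a default branch.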

29

Neural classification: HIDE

"A hierarchical network intrusion detection system using statistical processing and neural network classification" by Zheng et al

Five major components

- Probes collect traffic data

- Event preprocessor preprocesses traffic data and feeds the statistical model

- Statistical processor maintains a model for normal activities and generates vectors for new events

- Neural network classifies the vectors of new events

- Post processor generates reports

30

Clustering

What Is Clustering? Group data into clusters

- Similar to one another within the same cluster

- Dissimilar to the objects in other clusters

- Unsupervised learning: no predefined classes

31

Clustering

What Is A Good Clustering?

- High intra-class similarity and low inter-class similarity (depending on the similarity measure)

- The ability to discover some or all of the hidden patterns

32

Clustering

Clustering Approaches

Partitioning algorithms

- Partition the objects into k clusters

- Iteratively reallocate objects to improve the clustering

Hierarchy algorithms

- Agglomerative: each object starts as its own cluster; clusters are merged to form larger ones

- Divisive: all objects start in one cluster, which is split up into smaller clusters

33

Clustering

K-Means Example

34

Mining Data Streams for Intrusion Detection

Maintaining profiles of normal activities The profiles of normal activities may drift

Identifying novel attacks Identifying clusters and outliers in traffic data

streams Reduce the future alarm load by writing

filtering rules that automatically discard well-understood false positives

35

Data Mining for Intrusion Detection

Misuse detectionPredictive models are built from labeled data sets (instances

are labeled as ldquonormalrdquo or ldquointrusiverdquo)These models can be more sophisticated and precise than manually

created signatures Recent research eg JAM (Java Agents for Metalearning)

36

Misuse Detection

Intrusion Patterns

activities

pattern matching

intrusion

Canrsquot detect new attacks

Example if (src_ip == dst_ip) then ldquoland attackrdquo

look for known indicators ICMP Scans port scans connection attempts CPU RAM IO Utilization File system activity modification of system files permission modifications

37

JAM (Java Agents for Metalearning)

JAM (developed at Columbia University) uses data mining techniques to discover patterns of intrusions It then applies a meta-learning classifier to learn the signature of attacks

The association rules algorithm determines relationships between fields in the audit trail records and the frequent episodes algorithm models sequential patterns of audit events Features are then extracted from both algorithms and used to compute models of intrusion behavior

The classifiers build the signature of attacks So thus data mining in JAM builds misuse detection model

Classifiers in the JAM are generated by using rule learning program on training data of system usage After training resulting classification rules is used to recognize anomalies and detect known intrusions

The system has been tested with data from Sendmail-based attacks and with network attacks using TCP dump data

38

Data Mining for Intrusion Detection

Anomaly detection Identifies anomalies as deviations from ldquonormalrdquo behavior Eg ADAM Audit Data Analysis and Mining MINDS ndash MINnesota INt

rusion Detection System

39

Anomaly Detection

activity measures

0102030405060708090

CPU ProcessSize

normal profileabnormal

probable intrusion

Relatively high false positive rate - anomalies can just be new normal activities

baseline the normal traffic and then look for things that are out of the norm

40

ADAM Audit Data Analysis and Mining

Detecting Intrusion by Data MiningCombination of Association Rule and Classification Rule Firstly ADAM collects known frequent datasetsan off-line

algorithm Secondly ADAM runs an online algorithm

Finds last frequent connection records Compare them with known mined data Discards those which seems to be normal Suspicious ones are forwarded to the classifier Trained classifier then classify the suspicious data as one

of the following Known type of attack Unknown type of attack False alarm

41

ADAM Detecting Intrusion by Data Mining

42

ADAM Audit Data Analysis and Mining

ADAM has two phases in their model

1st Phase Train the classifier Offline process Takes place only once Before the main experiment

2nd Phase Using the trained classifier Trained classifier is then used to detect anomalies Online process

43

The MINDS Project

MINDS ndash MINnesota INtrusion Detection System

Learning from Rare Class ndash Building rare class prediction models

Anomalyoutlier detection

Summarization of attacks using association pattern analysis

TID Items

1 Bread Coke Milk

2 Beer Bread

3 Beer Coke Diaper Milk

4 Beer Bread Diaper Milk

5 Coke Diaper Milk

Rules Discovered Milk --gt Coke Diaper Milk --gt Beer

Rules Discovered Milk --gt Coke Diaper Milk --gt Beer

44

MINDS - Learning from Rare Class

Problem Building models for rare network attacks (Mining needle in a haystack)

Standard data mining models are not suitable for rare classes

Models must be able to handle skewed class distributions

Learning from data streams - intrusions are sequences of events

45

MINDS - Anomaly Detection

Detect novel attacksintrusions by identifying them as deviations from ldquonormalrdquo ie anomalous behaviorIdentify normal behaviorConstruct useful set of featuresDefine similarity functionUse outlier detection algorithm

Nearest neighbor approach

Density based schemes

Unsupervised Support Vector Machines (SVM)

46

Experimental Evaluation

network

net-flow data using CISCO

routers

Data preprocessing

MINDSanomaly detection

helliphellip

Anomaly

scores Association pattern analysis

Open source signature-based network IDS wwwsnortor

g

10 minutes cycle

2 millions connections

Anomaly detection is applied

4 times a day

10 minutes time window

bull Publicly available data setDARPA 1998 Intrusion Detection Evaluation Data Set prepared and managed by MIT Lincoln Lab includes a wide variety of intrusions simulated in a military network environment

bull Real network data from University of Minnesota

47

MINDS - Framework for Mining Associations

Anomaly Detection System

attack

normal

R1 TCP DstPort=1863 Attack

hellip

hellip

hellip

hellip

R100 TCP DstPort=80 Normal

Discriminating Association

Pattern Generator

1 Build normal profile

2 Study changes in normal behavior

3 Create attack summary

4 Detect misuse behavior

5 Understand nature of the attack

update

Knowledge Base

Ranked connections

MINDS association analysis module

48

Discovered Real-life Association Patterns

At first glance Rule 1 appears to describe a Web scan Rule 2 indicates an attack on a specific machine Both rules together indicate that a scan is performed first

followed by an attack on a specific machine identified as vulnerable by the attacker

Rule 1 SrcIP=XXXX DstPort=80 Protocol=TCP Flag=SYN NoPackets 3 NoBytes120hellip180 (c1=256 c2 = 1)

Rule 2 SrcIP=XXXX DstIP=YYYY DstPort=80 Protocol=TCPFlag=SYN NoPackets 3 NoBytes 120hellip180 (c1=177 c2 = 0)

49

DstIP=ZZZZ DstPort=8888 Protocol=TCP (c1=369 c2=0)DstIP=ZZZZ DstPort=8888 Protocol=TCP Flag=SYN (c1=291 c2=0)

This pattern indicates an anomalously high number of TCP connections on port 8888 involving machine ZZZZ

Follow-up analysis of connections covered by the pattern indicates that this could be a machine running a variation of the Kazaa file-sharing protocol

Having an unauthorized application increases the vulnerability of the system

Discovered Real-life Association Patterns

50

SrcIP=XXXX DstPort=27374 Protocol=TCP Flag=SYN NoPackets=4 NoBytes=189hellip200 (c1=582 c2=2)

SrcIP=XXXX DstPort=12345 NoPackets=4 NoBytes=189hellip200 (c1=580 c2=3)

SrcIP=YYYY DstPort=27374 Protocol=TCP Flag=SYN NoPackets=3 NoBytes=144 (c1=694 c2=3)

helliphellip This pattern indicates a large number of scans on ports 27374 (which is a signature for the SubSeven worm) and 12345 (which is a signature for NetBus worm)

Further analysis showed that no fewer than five machines scanning for one or both of these ports in any time window

Discovered Real-life Association Patternshellip(ctd)

51

DstPort=6667 Protocol=TCP (c1=254 c2=1)

This pattern indicates an unusually large number of connections on port 6667 detected by the anomaly detector

Port 6667 is where IRC (Internet Relay Chat) is typically run Further analysis reveals that there are many small packets

fromto various IRC servers around the world Although IRC traffic is not unusual the fact that it is flagged

as anomalous is interesting This might indicate that the IRC server has been taken down (by a

DOS attack for example) or it is a rogue IRC server (it could be involved in some hacking activity)

Discovered Real-life Association Patternshellip(ctd)

52

DstPort=1863 Protocol=TCP Flag=0 NoPackets=1 NoByteslt139 (c1=498 c2=6)

DstPort=1863 Protocol=TCP Flag=0 (c1=587 c2=6)

DstPort=1863 Protocol=TCP (c1=606 c2=8)

This pattern indicates a large number of anomalous TCP connections on port 1863

Further analysis reveals that the remote IP block is owned by Hotmail

Flag=0 is unusual for TCP traffic

Discovered Real-life Association Patternshellip(ctd)

53

MINDS Conclusion Data mining based algorithms are capable of detecting intrusions that cannot be

detected by state-of-the-art signature based methods

SNORT has static knowledge manually updated by human analysts

MINDS anomaly detection algorithms are adaptive in nature

MINDS anomaly detection algorithms can also be effective in detecting anomalous behavior originating from a compromised or infected machine

MINDS Research Defining normal behavior Feature extraction Similarity functions Outlier detection Result summarization Detection of attacks

originating from multiple sites

Wormvirus detectionafter infection

Insider attack Policy violation

Outsider attack Network intrusion

54

IDS Using both Misuse and Anomaly DetectionRIDS-100

RIDS( Rising Intrusion Detection System) is provided by Rising Tech It is a leader in antivirus and content security software and services in China

The company is a leading provider of client gateway and server security solutions for virus protection firewall and intrusion detection technologies and security services to enterprises and service providers around China

RIDS make the use of both intrusion detection technique misuse and anomaly detection

Distance based outlier detection algorithm is used for detection deviational behavior among collected network data

For misuse detection it has very vast set of collected data pattern which can be matched with scanned network data for misuse detection

This large amount of data pattern is scanned using data mining classification Decision Tree algorithm

httpwwwrising-globalcom

55

A cooperative anomaly and intrusiondetection system (CAIDS)

built with a network-based intrusion detection system (NIDS) and an anomaly detection system (ADS) operating interactively through a signature generator

56

A cooperative anomaly and intrusiondetection system (CAIDS)

A frequent episode rule (FER) is generated out of a collection of frequent episodes The FER is defined over episode sequences with multiple connection events

For an example we envision a window where we observe a 3-event sequence

E D and F An FER is generated as E rarr D F confidence level freq (a U b)freq (b)=08 where a represents the event E on the LHS and b corresponds t

o the two events D and F on the RHS of the rule

If the b occurs with 5 and the joint event a and b has 4 to occur there is a (004005) = 80 chance that D and F will follow in the same window

57

A cooperative anomaly and intrusiondetection system (CAIDS)

In practice the event E could be an authentication service characterized by two attributes

(service =authentication flag=SF) The events D F may be two sequential smtp requests denoted b

y (service = smtp) Thus we can derive an FER with a confidence level of c = 80 t

hat two smtp services will follow the authentication service within a window w = 2 sec The three joint traffic events accounts with a support level s = 10 out of all the network connections being evaluated This FER is formally stated as follows

(service = authentication) rarr (service = smtp) (service = smtp) (08 01 2 sec) (1)

58

A cooperative anomaly and intrusiondetection system (CAIDS)

An association rule is aimed at finding interesting intra-relationship inside a single connection record

In general an FER is specified by the following expression

L1 L2hellip Ln R1hellip Rm (c s window) (2)

Li (1 le i le n) and Rj (1 le j le m) are ordered traffic connection events

We call L1 L2hellip Ln the LHS episode and R1hellip Rm the RHS of the episode rule

59

A cooperative anomaly and intrusiondetection system (CAIDS)

Architecture of the CAIDS simulator built with a 2000-signature Snortand an anomaly detection subsystem (ADS) with 60 FERs after 2 weeksof rule training over the Lincoln Lab IDS evaluation dataset

60

Conclusion

In this report we have studied basic concept and some classic system models like ADAM MINDSin this area

To make summary of those system models their technologies and their validation methods

Hope to a overview on currently development in this area and how data mining is evolving into the field of network intrusion detection

61

Reference

DARPA 1998 data set; a cleansed set in KDD Cup '99. DARPA 1991 data set is also available: http://www.ll.mit.edu/IST/ideval/data/data_index.html

Daniel Barbara, Julia Couto, Sushil Jajodia, Leonard Popyack, Ningning Wu. "ADAM: Detecting Intrusions by Data Mining." Proceedings of the 2001 IEEE Workshop on Information Assurance and Security, United States Military Academy, West Point, NY, 5-6 June 2001.

Zhang, J. and Zulkernine, M. 2006. A Hybrid Network Intrusion Detection Technique Using Random Forests. In Proceedings of the First International Conference on Availability, Reliability and Security (April 20-22, 2006).

W. Lee et al. A data mining framework for building intrusion detection models. In Information and System Security, Vol. 3, No. 4, 2000.

Ertoz, L. et al. MINDS - Minnesota Intrusion Detection System. Next Generation Data Mining, Chapter 3, 2004.

Lu, C.-T., Boedihardjo, A. P., Manalwar, P. Exploiting efficient data mining techniques to enhance intrusion detection systems. IEEE International Conference on Information Reuse and Integration (IRI-2005), 15-17 Aug. 2005, pp. 512-517.

Sal Stolfo, Andreas Prodromidis, Shelley Tselepis, Wenke Lee, Dave Fan, and Phil Chan. (Honorable mention (runner-up) for Best Paper Award in Applied Research Category.) In Proceedings of the Third International Conference on Knowledge Discovery and Data Mining (KDD 97), Newport Beach, CA, August 1997.

62

Questions & Comments

7

Anomaly Detection

[Figure: bar chart of activity measures (scale 0 to 90) for CPU and process size, comparing the normal profile with abnormal values; a large deviation marks a probable intrusion]

Relatively high false positive rate: anomalies can just be new normal activities

Baseline the normal traffic and then look for things that are out of the norm
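The baseline-then-deviate idea can be sketched in a few lines. This is a minimal illustration, not any particular IDS: the function names, the sample CPU readings, and the 3-standard-deviation threshold are all assumptions made for the example.

```python
from statistics import mean, stdev

def build_profile(normal_samples):
    """Summarize attack-free measurements as (mean, standard deviation)."""
    return mean(normal_samples), stdev(normal_samples)

def is_anomalous(value, profile, threshold=3.0):
    """Flag a measurement that deviates from the profile by more than
    `threshold` standard deviations (the threshold is an assumed setting)."""
    mu, sigma = profile
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > threshold

# Hypothetical CPU utilization readings (percent) taken during normal operation.
cpu_profile = build_profile([20, 22, 19, 21, 23, 20, 18, 22])
```

A reading such as 85 falls far outside this profile and is flagged, while 21 is not. A new but legitimate workload would be flagged too, which is the false positive problem noted above.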

8

Misuse Detection

[Figure: activities are pattern-matched against known intrusion patterns; a match is reported as an intrusion]

Can't detect new attacks

Example: if (src_ip == dst_ip) then "land attack"

Look for known indicators: ICMP scans, port scans, connection attempts; CPU, RAM, and I/O utilization; file system activity, modification of system files, permission modifications
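The land-attack rule from the slide can be written as a tiny signature table. This is only a sketch of the pattern-matching idea; the `Packet` fields, the `SIGNATURES` table, and the sample addresses are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Packet:
    src_ip: str
    dst_ip: str
    dst_port: int

def land_attack(pkt):
    """Signature from the slide: source and destination address are identical."""
    return pkt.src_ip == pkt.dst_ip

# Each known attack is a named predicate; an attack with no entry here goes unnoticed.
SIGNATURES = {"land attack": land_attack}

def match_signatures(pkt):
    """Return the names of all signatures that the packet matches."""
    return [name for name, rule in SIGNATURES.items() if rule(pkt)]
```

Because detection is limited to the entries in the table, a previously unseen attack produces an empty match list, which is the limitation stated above.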

9

Goal of Intrusion Detection Systems (IDS): to detect an intrusion as it happens and be able to respond to it

False positives: a false positive is a situation where something abnormal (as defined by the IDS) happens, but it is not an intrusion

Too many false positives: the user will quit monitoring the IDS because of the noise

False negatives: a false negative is a situation where an intrusion is really happening, but the IDS doesn't catch it
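To make the two error types concrete, the sketch below computes a false positive rate and a false negative rate from labeled events. The event counts are invented for illustration and do not come from any evaluation in these slides.

```python
def alarm_rates(events):
    """events: list of (is_intrusion, alarm_raised) boolean pairs.
    Returns (false positive rate, false negative rate)."""
    fp = sum(1 for intrusion, alarm in events if alarm and not intrusion)
    fn = sum(1 for intrusion, alarm in events if intrusion and not alarm)
    normal = sum(1 for intrusion, _ in events if not intrusion)
    attacks = sum(1 for intrusion, _ in events if intrusion)
    fpr = fp / normal if normal else 0.0
    fnr = fn / attacks if attacks else 0.0
    return fpr, fnr

# Hypothetical day of traffic: 95 normal events (5 raise alarms), 5 intrusions (1 missed).
events = (
    [(False, False)] * 90 + [(False, True)] * 5
    + [(True, True)] * 4 + [(True, False)] * 1
)
fpr, fnr = alarm_rates(events)
```

With these made-up numbers, 5 of the 9 alarms are false, even though only about 5% of normal events trigger one. Because intrusions are rare, a modest false positive rate can still dominate what the operator sees.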

10

Outline

Intrusion Detection

Data Mining

Data Mining in Intrusion Detection

Reference

11

Why do we need Data Mining?

Despite the enormous amount of data, particular events of interest are still quite rare: frequency ranges from 0.1% to less than 10%

We are drowning in data but starving for knowledge

12

Data Mining vs. KDD

Knowledge Discovery in Databases (KDD): the whole process of finding useful information and patterns in data

Data Mining: use of algorithms to extract the information and patterns derived by the KDD process

Data mining is the core of the knowledge discovery process

13

KDD Process

Selection: obtain data from various sources

Preprocessing: cleanse data

Transformation: convert to common format; transform to new format

Data Mining: obtain desired results

Interpretation/Evaluation: present results to user in meaningful manner

14

Data Mining: A KDD Process

Data mining: core of knowledge discovery process

[Figure: KDD pipeline: Databases, Data Cleaning and Data Integration, Data Warehouse, Selection, Task-relevant Data, Data Mining, Pattern Evaluation]

15

Typical Data Mining Architecture

[Figure: layered architecture, bottom to top: Databases and Data Warehouse (with data cleaning, data integration, and filtering); Database or data warehouse server; Data mining engine; Pattern evaluation; Graphical user interface; with a Knowledge-base component]

16

Outline

Intrusion Detection

Data Mining

Data Mining in Intrusion Detection

Reference

17

Network intrusion detection

Number of intrusions on the network is typically a very small fraction of the total network traffic

18

Why Can Data Mining Help?

Learn from traffic data: supervised learning learns precise models from past intrusions; unsupervised learning identifies suspicious activities

Maintain models on dynamic data

Correlation of suspicious events across network sites: helps detect sophisticated attacks not identifiable by single-site analyses

Analysis of long-term data (months/years): uncover suspicious stealth activities (e.g., insiders leaking/modifying information)

19

Intrusion Detection

Traditional intrusion detection system (IDS) tools (e.g., SNORT) are based on signatures of known attacks

Limitations:

Signature database has to be manually revised for each new type of discovered intrusion

They cannot detect emerging cyber threats

Substantial latency in deployment of newly created signatures across the computer system

20

Data Mining for Intrusion Detection: Techniques and Applications

Frequent pattern mining

Classification

Clustering

Mining data streams

21

Frequent pattern mining

Patterns that occur frequently in a database

Mining frequent patterns: finding regularities

Process of mining frequent patterns for intrusion detection

Phase I: mine a repository of normal frequent itemsets from attack-free data

Phase II: find frequent itemsets in the last n connections and compare the patterns to the normal profile
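The two phases can be sketched as follows. This is an illustration under stated assumptions: connections are modeled as sets of hypothetical `attribute=value` items, itemsets are limited to size two, and the 0.5 support threshold is arbitrary.

```python
from collections import Counter
from itertools import combinations

def frequent_itemsets(records, min_support, max_size=2):
    """Return itemsets (as frozensets) whose relative support >= min_support."""
    counts = Counter()
    for record in records:
        items = sorted(record)
        for size in range(1, max_size + 1):
            for combo in combinations(items, size):
                counts[frozenset(combo)] += 1
    n = len(records)
    return {s for s, c in counts.items() if c / n >= min_support}

# Phase I: attack-free training connections (hypothetical attribute=value items).
normal = [
    {"service=http", "flag=SF"},
    {"service=http", "flag=SF"},
    {"service=smtp", "flag=SF"},
    {"service=http", "flag=SF"},
]
normal_profile = frequent_itemsets(normal, min_support=0.5)

# Phase II: the last n connections from live traffic.
recent = [
    {"service=telnet", "flag=S0"},
    {"service=telnet", "flag=S0"},
    {"service=http", "flag=SF"},
    {"service=telnet", "flag=S0"},
]
# Itemsets that are frequent now but absent from the normal profile are suspicious.
suspicious = frequent_itemsets(recent, min_support=0.5) - normal_profile
```

Here the telnet/S0 itemsets are frequent in the recent window but were never frequent in the attack-free data, so they are the ones reported.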

22

Frequent pattern mining

Apriori

Any subset of a frequent itemset must also be frequent (an anti-monotone property)

A transaction containing {beer, diaper, nuts} also contains {beer, diaper}

If {beer, diaper, nuts} is frequent, {beer, diaper} must also be frequent

No superset of any infrequent itemset should be generated or tested

Many item combinations can be pruned
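A compact level-wise sketch of the pruning idea is shown below. It follows the general Apriori scheme (join frequent itemsets, drop candidates with an infrequent subset) rather than any specific published implementation; the basket data and the `min_count` value are made up.

```python
from itertools import combinations

def apriori(transactions, min_count):
    """Return {itemset: count} for every itemset appearing in at least
    `min_count` transactions, growing candidates one item at a time."""
    transactions = [frozenset(t) for t in transactions]
    items = {item for t in transactions for item in t}
    current = {frozenset([i]) for i in items}
    frequent = {}
    k = 1
    while current:
        counts = {c: sum(1 for t in transactions if c <= t) for c in current}
        level = {c: n for c, n in counts.items() if n >= min_count}
        frequent.update(level)
        k += 1
        # Join frequent (k-1)-itemsets into k-item candidates ...
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # ... and prune any candidate that has an infrequent (k-1)-subset.
        current = {
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent

baskets = [
    {"beer", "diaper", "nuts"},
    {"beer", "diaper"},
    {"beer", "nuts"},
    {"diaper", "milk"},
]
result = apriori(baskets, min_count=2)
```

In this data {diaper, nuts} occurs only once, so the candidate {beer, diaper, nuts} is pruned without ever being counted against the transactions.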

23

Sequential Pattern Analysis

Models sequence patterns: (temporal) order is important in many situations

Time-series databases and sequence databases

Frequent patterns → (frequent) sequential patterns

Sequential patterns for intrusion detection: capture the signatures for attacks in a series of packets

24

Sequential Pattern Mining

Given a set of sequences find the complete set of frequent subsequences
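As a rough illustration of the problem statement, the sketch below enumerates short candidate subsequences by brute force and counts how many sequences contain each one. It does not use Apriori-style pruning, so it is only practical for toy inputs; the event names and the `max_len` limit are assumptions.

```python
from itertools import product

def is_subsequence(pattern, sequence):
    """True if `pattern` occurs in `sequence` in order (gaps allowed)."""
    it = iter(sequence)
    return all(event in it for event in pattern)

def frequent_subsequences(sequences, min_count, max_len=2):
    """Return {pattern: count} for patterns contained in >= min_count sequences."""
    alphabet = sorted({e for s in sequences for e in s})
    found = {}
    for length in range(1, max_len + 1):
        for pattern in product(alphabet, repeat=length):
            count = sum(is_subsequence(pattern, s) for s in sequences)
            if count >= min_count:
                found[pattern] = count
    return found

# Hypothetical event sequences, one per session.
sessions = [
    ["syn", "login_fail", "login_fail", "login_ok"],
    ["syn", "login_fail", "login_ok"],
    ["syn", "login_ok"],
]
patterns = frequent_subsequences(sessions, min_count=2)
```

Order matters here: ("syn", "login_ok") is supported by all three sessions, while ("login_ok", "syn") is supported by none.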

25

Apriori Property in Sequences

26

Classification: A Two-Step Process

Model construction: describe a set of predetermined classes

Training dataset: tuples for model construction

Each tuple/sample belongs to a predefined class

Classification rules, decision trees, or math formulae

Model application: classify unseen objects

Estimate accuracy of the model using an independent test set

Acceptable accuracy: apply the model to classify data tuples with unknown class labels

27

Classification

28

Classification: Decision Tree

A node in the tree: a test of some attribute

A branch: a possible value of the attribute

Classification: start at the root, test the attribute, move down the tree branch
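The traversal described above can be sketched with a hand-built tree. The tree itself, its attributes, and its class labels are hypothetical; a real tree would be learned from training data.

```python
# Internal nodes are (attribute, {value: subtree}); leaves are class labels.
# The attributes and labels below are hypothetical.
TREE = ("protocol", {
    "icmp": "probe",
    "tcp": ("flag", {
        "S0": "dos",
        "SF": "normal",
    }),
    "udp": "normal",
})

def classify(tree, record, default="unknown"):
    """Start at the root, test the node's attribute, and follow the branch
    matching the record's value until a leaf (class label) is reached."""
    while isinstance(tree, tuple):
        attribute, branches = tree
        tree = branches.get(record.get(attribute), default)
    return tree
```

For example, `classify(TREE, {"protocol": "tcp", "flag": "S0"})` walks the `tcp` branch and then the `S0` branch; a record with a value that has no branch falls through to the default label.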

29

Neural classification: HIDE

"A hierarchical network intrusion detection system using statistical processing and neural network classification" by Zheng et al.

Five major components:

Probes: collect traffic data

Event preprocessor: preprocesses traffic data and feeds the statistical model

Statistical processor: maintains a model for normal activities and generates vectors for new events

Neural network: classifies the vectors of new events

Post processor: generates reports

30

Clustering

What Is Clustering? Group data into clusters

Similar to one another within the same cluster

Dissimilar to the objects in other clusters

Unsupervised learning: no predefined classes

31

Clustering

What Is A Good Clustering?

High intra-class similarity and low inter-class similarity

Depending on the similarity measure

The ability to discover some or all of the hidden patterns

32

Clustering

Clustering Approaches

Partitioning algorithms

Partition the objects into k clusters

Iteratively reallocate objects to improve the clustering

Hierarchy algorithms

Agglomerative: each object is a cluster; merge clusters to form larger ones

Divisive: all objects are in a cluster; split it up into smaller clusters
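The partitioning approach (fix k, then iteratively reallocate objects) is what k-means does. The sketch below is a plain Lloyd-style loop on 2-D points with caller-supplied starting centroids; the feature values and the iteration count are assumptions chosen for the example.

```python
def kmeans(points, centroids, iterations=10):
    """Lloyd-style k-means on 2-D points with caller-supplied initial centroids."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            # Assign each point to its nearest centroid (squared Euclidean distance).
            nearest = min(
                range(len(centroids)),
                key=lambda i: (p[0] - centroids[i][0]) ** 2 + (p[1] - centroids[i][1]) ** 2,
            )
            clusters[nearest].append(p)
        # Recompute each centroid as the mean of its cluster (keep it if empty).
        centroids = [
            (sum(x for x, _ in c) / len(c), sum(y for _, y in c) / len(c)) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters

# Hypothetical (duration, kilobytes) features for six connections.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = kmeans(points, centroids=[(0, 0), (10, 10)])
```

The result depends on the initial centroids and on k, both of which the caller must choose; here the two well-separated groups are recovered after the first pass.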

33

Clustering

K-Means Example

34

Mining Data Streams for Intrusion Detection

Maintaining profiles of normal activities: the profiles of normal activities may drift

Identifying novel attacks: identifying clusters and outliers in traffic data streams

Reduce the future alarm load by writing filtering rules that automatically discard well-understood false positives
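One simple way to let a profile follow drifting traffic is to compute it over a sliding window of recent observations. The sketch below is an assumption-laden illustration, not a method from the cited systems: the window size, the 3-standard-deviation threshold, and the sample values are all invented.

```python
from collections import deque
from statistics import mean, stdev

class SlidingProfile:
    """Profile of normal activity computed over the most recent observations."""

    def __init__(self, size=5, threshold=3.0):
        self.window = deque(maxlen=size)  # old values drop out, so the profile can drift
        self.threshold = threshold        # assumed cut-off, in standard deviations

    def observe(self, value):
        """Return True if `value` is an outlier; otherwise add it to the profile."""
        if len(self.window) == self.window.maxlen:
            mu, sigma = mean(self.window), stdev(self.window)
            if sigma > 0 and abs(value - mu) > self.threshold * sigma:
                return True  # outliers are reported and kept out of the profile
        self.window.append(value)
        return False

profile = SlidingProfile()
# Hypothetical connections-per-minute readings from a stream.
flags = [profile.observe(v) for v in [100, 102, 98, 101, 99, 103, 100, 500, 101]]
```

Only the burst of 500 is flagged. Keeping flagged values out of the window stops an attack from being absorbed into the profile, at the cost of adapting more slowly to a genuine shift in normal traffic.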

35

Data Mining for Intrusion Detection

Misuse detection

Predictive models are built from labeled data sets (instances are labeled as "normal" or "intrusive")

These models can be more sophisticated and precise than manually created signatures

Recent research: e.g., JAM (Java Agents for Metalearning)

36

Misuse Detection

[Figure: activities are pattern-matched against known intrusion patterns; a match is reported as an intrusion]

Can't detect new attacks

Example: if (src_ip == dst_ip) then "land attack"

Look for known indicators: ICMP scans, port scans, connection attempts; CPU, RAM, and I/O utilization; file system activity, modification of system files, permission modifications

37

JAM (Java Agents for Metalearning)

JAM (developed at Columbia University) uses data mining techniques to discover patterns of intrusions. It then applies a meta-learning classifier to learn the signature of attacks.

The association rules algorithm determines relationships between fields in the audit trail records, and the frequent episodes algorithm models sequential patterns of audit events. Features are then extracted from both algorithms and used to compute models of intrusion behavior.

The classifiers build the signature of attacks. Thus, data mining in JAM builds a misuse detection model.

Classifiers in JAM are generated by running a rule learning program on training data of system usage. After training, the resulting classification rules are used to recognize anomalies and detect known intrusions.

The system has been tested with data from Sendmail-based attacks and with network attacks using TCP dump data.

38

Data Mining for Intrusion Detection

Anomaly detection: identifies anomalies as deviations from "normal" behavior

E.g., ADAM (Audit Data Analysis and Mining), MINDS (MINnesota INtrusion Detection System)

39

Anomaly Detection

[Figure: bar chart of activity measures (scale 0 to 90) for CPU and process size, comparing the normal profile with abnormal values; a large deviation marks a probable intrusion]

Relatively high false positive rate: anomalies can just be new normal activities

Baseline the normal traffic and then look for things that are out of the norm

40

ADAM: Audit Data Analysis and Mining

Detecting intrusion by data mining: a combination of association rules and classification rules

Firstly, ADAM collects known frequent datasets using an off-line algorithm

Secondly, ADAM runs an online algorithm:

Finds the last frequent connection records

Compares them with known mined data

Discards those which seem to be normal

Suspicious ones are forwarded to the classifier

The trained classifier then classifies the suspicious data as one of the following: known type of attack, unknown type of attack, false alarm
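The online steps can be pictured as a small pipeline: find what is frequent in recent traffic, discard what the normal profile already contains, and hand the rest to a classifier. The sketch below only mirrors that flow; it is not ADAM's actual algorithm, and `frequent_items`, `online_step`, `toy_classifier`, the port values, and the labels are all hypothetical stand-ins.

```python
from collections import Counter

def frequent_items(connections, min_count):
    """Items (attribute=value strings) seen in at least `min_count` connections."""
    counts = Counter(item for conn in connections for item in conn)
    return {item for item, n in counts.items() if n >= min_count}

def online_step(recent, normal_profile, classifier, min_count=2):
    """Keep frequent items that are not in the normal profile and classify them."""
    suspicious = frequent_items(recent, min_count) - normal_profile
    return {item: classifier(item) for item in suspicious}

def toy_classifier(item):
    """Stand-in for a trained classifier (hypothetical labels)."""
    if item == "dst_port=27374":
        return "known attack"
    if item == "dst_port=8080":
        return "false alarm"
    return "unknown attack"

# Off-line step: profile mined from attack-free traffic.
training = [{"dst_port=80", "proto=tcp"}] * 3
normal_profile = frequent_items(training, min_count=2)

# Online step: the most recent connections.
recent = (
    [{"dst_port=27374", "proto=tcp"}] * 2
    + [{"dst_port=8080", "proto=tcp"}] * 2
    + [{"dst_port=6000", "proto=tcp"}] * 2
    + [{"dst_port=80", "proto=tcp"}]
)
verdicts = online_step(recent, normal_profile, toy_classifier)
```

The `proto=tcp` item is frequent in the recent window but is discarded because the profile already contains it; only the three unfamiliar ports reach the classifier.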

41

ADAM: Detecting Intrusion by Data Mining

42

ADAM: Audit Data Analysis and Mining

ADAM's model has two phases

1st phase: train the classifier. Offline process; takes place only once, before the main experiment

2nd phase: use the trained classifier. The trained classifier is then used to detect anomalies. Online process

43

The MINDS Project

MINDS: MINnesota INtrusion Detection System

Learning from Rare Class: building rare class prediction models

Anomaly/outlier detection

Summarization of attacks using association pattern analysis

TID   Items
1     Bread, Coke, Milk
2     Beer, Bread
3     Beer, Coke, Diaper, Milk
4     Beer, Bread, Diaper, Milk
5     Coke, Diaper, Milk

Rules discovered: {Milk} --> {Coke}; {Diaper, Milk} --> {Beer}
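The two rules can be checked against the five transactions in the table by computing support and confidence directly. The helper names below are illustrative; the data is the table from the slide.

```python
transactions = {
    1: {"Bread", "Coke", "Milk"},
    2: {"Beer", "Bread"},
    3: {"Beer", "Coke", "Diaper", "Milk"},
    4: {"Beer", "Bread", "Diaper", "Milk"},
    5: {"Coke", "Diaper", "Milk"},
}

def count(itemset):
    """Number of transactions that contain every item in `itemset`."""
    return sum(1 for items in transactions.values() if itemset <= items)

def support(itemset):
    """Fraction of transactions that contain `itemset`."""
    return count(itemset) / len(transactions)

def confidence(lhs, rhs):
    """Of the transactions containing `lhs`, the fraction that also contain `rhs`."""
    return count(lhs | rhs) / count(lhs)

milk_coke = confidence({"Milk"}, {"Coke"})
diaper_milk_beer = confidence({"Diaper", "Milk"}, {"Beer"})
```

Milk appears in four transactions and Coke accompanies it in three, so the first rule has confidence 3/4; Diaper and Milk appear together in three transactions and Beer joins them in two, giving 2/3 for the second.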

44

MINDS - Learning from Rare Class

Problem: building models for rare network attacks (mining a needle in a haystack)

Standard data mining models are not suitable for rare classes

Models must be able to handle skewed class distributions

Learning from data streams - intrusions are sequences of events

45

MINDS - Anomaly Detection

Detect novel attacks/intrusions by identifying them as deviations from "normal", i.e., anomalous behavior

Identify normal behavior

Construct useful set of features

Define similarity function

Use outlier detection algorithm: nearest neighbor approach, density based schemes, unsupervised Support Vector Machines (SVM)
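As an illustration of the nearest neighbor approach, the sketch below scores each point by its distance to its k-th nearest neighbor, one common way to turn a similarity function into an outlier score. It is a generic example and not the specific algorithm used in MINDS; the feature vectors and `k` are assumptions.

```python
import math

def knn_outlier_scores(points, k=2):
    """Score each point by the Euclidean distance to its k-th nearest neighbor;
    larger scores mean the point sits further from the rest of the data."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores

# Hypothetical 2-D feature vectors for five connections; the last one is isolated.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
scores = knn_outlier_scores(points)
most_anomalous = scores.index(max(scores))
```

Ranking connections by such a score gives the analyst an ordered list to inspect, rather than a hard normal/attack decision.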

46

Experimental Evaluation

[Figure: MINDS evaluation pipeline: network; net-flow data collected using CISCO routers; data preprocessing; MINDS anomaly detection; anomaly scores; association pattern analysis; alongside Snort, the open source signature-based network IDS (www.snort.org). Annotations: 10-minute cycle, 2 million connections; anomaly detection is applied 4 times a day on a 10-minute time window]

Publicly available data set: DARPA 1998 Intrusion Detection Evaluation Data Set, prepared and managed by MIT Lincoln Lab; includes a wide variety of intrusions simulated in a military network environment

Real network data from University of Minnesota

47

MINDS - Framework for Mining Associations

[Figure: MINDS association analysis module. The Anomaly Detection System produces ranked connections, split into attack and normal sets; the Discriminating Association Pattern Generator derives rules such as "R1: TCP, DstPort=1863: Attack" through "R100: TCP, DstPort=80: Normal", which update the Knowledge Base]

Uses of the discovered patterns:

1. Build normal profile

2. Study changes in normal behavior

3. Create attack summary

4. Detect misuse behavior

5. Understand nature of the attack

48

Discovered Real-life Association Patterns

Rule 1: SrcIP=XXXX, DstPort=80, Protocol=TCP, Flag=SYN, NoPackets 3, NoBytes 120…180 (c1=256, c2=1)

Rule 2: SrcIP=XXXX, DstIP=YYYY, DstPort=80, Protocol=TCP, Flag=SYN, NoPackets 3, NoBytes 120…180 (c1=177, c2=0)

At first glance, Rule 1 appears to describe a Web scan

Rule 2 indicates an attack on a specific machine

Both rules together indicate that a scan is performed first, followed by an attack on a specific machine identified as vulnerable by the attacker

49

Discovered Real-life Association Patterns… (ctd)

DstIP=ZZZZ, DstPort=8888, Protocol=TCP (c1=369, c2=0)

DstIP=ZZZZ, DstPort=8888, Protocol=TCP, Flag=SYN (c1=291, c2=0)

This pattern indicates an anomalously high number of TCP connections on port 8888 involving machine ZZZZ

Follow-up analysis of connections covered by the pattern indicates that this could be a machine running a variation of the Kazaa file-sharing protocol

Having an unauthorized application increases the vulnerability of the system

50

Discovered Real-life Association Patterns… (ctd)

SrcIP=XXXX, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=4, NoBytes=189…200 (c1=582, c2=2)

SrcIP=XXXX, DstPort=12345, NoPackets=4, NoBytes=189…200 (c1=580, c2=3)

SrcIP=YYYY, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=3, NoBytes=144 (c1=694, c2=3)

……

This pattern indicates a large number of scans on ports 27374 (which is a signature for the SubSeven worm) and 12345 (which is a signature for the NetBus worm)

Further analysis showed that no fewer than five machines were scanning for one or both of these ports in any time window

51

Discovered Real-life Association Patterns… (ctd)

DstPort=6667, Protocol=TCP (c1=254, c2=1)

This pattern indicates an unusually large number of connections on port 6667 detected by the anomaly detector

Port 6667 is where IRC (Internet Relay Chat) is typically run

Further analysis reveals that there are many small packets from/to various IRC servers around the world

Although IRC traffic is not unusual, the fact that it is flagged as anomalous is interesting

This might indicate that the IRC server has been taken down (by a DOS attack, for example) or that it is a rogue IRC server (it could be involved in some hacking activity)

52

Discovered Real-life Association Patterns… (ctd)

DstPort=1863, Protocol=TCP, Flag=0, NoPackets=1, NoBytes<139 (c1=498, c2=6)

DstPort=1863, Protocol=TCP, Flag=0 (c1=587, c2=6)

DstPort=1863, Protocol=TCP (c1=606, c2=8)

This pattern indicates a large number of anomalous TCP connections on port 1863

Further analysis reveals that the remote IP block is owned by Hotmail

Flag=0 is unusual for TCP traffic

53

MINDS Conclusion

Data mining based algorithms are capable of detecting intrusions that cannot be detected by state-of-the-art signature-based methods

SNORT has static knowledge, manually updated by human analysts

MINDS anomaly detection algorithms are adaptive in nature

MINDS anomaly detection algorithms can also be effective in detecting anomalous behavior originating from a compromised or infected machine

MINDS Research: defining normal behavior; feature extraction; similarity functions; outlier detection; result summarization; detection of attacks originating from multiple sites

[Figure labels: worm/virus detection after infection; insider attack: policy violation; outsider attack: network intrusion]

54

IDS Using Both Misuse and Anomaly Detection: RIDS-100

RIDS (Rising Intrusion Detection System) is provided by Rising Tech, a leader in antivirus and content security software and services in China

The company is a leading provider of client, gateway, and server security solutions for virus protection, firewall, and intrusion detection technologies, and of security services to enterprises and service providers around China

RIDS makes use of both intrusion detection techniques: misuse detection and anomaly detection

A distance-based outlier detection algorithm is used to detect deviating behavior among collected network data

For misuse detection, it has a very large set of collected data patterns that can be matched against scanned network data

This large set of data patterns is scanned using a data mining classification algorithm (Decision Tree)

http://www.rising-global.com

55

A cooperative anomaly and intrusion detection system (CAIDS)

CAIDS is built with a network-based intrusion detection system (NIDS) and an anomaly detection system (ADS), operating interactively through a signature generator

56

A cooperative anomaly and intrusion detection system (CAIDS)

A frequent episode rule (FER) is generated out of a collection of frequent episodes. The FER is defined over episode sequences with multiple connection events.

For example, we envision a window in which we observe a 3-event sequence: E, D, and F. An FER is generated as $E \rightarrow D, F$ with confidence level $\mathrm{freq}(a \cup b)/\mathrm{freq}(b) = 0.8$, where b represents the event E on the LHS and a corresponds to the two events D and F on the RHS of the rule.

If b occurs with a frequency of 5% and the joint event of a and b occurs with a frequency of 4%, there is a (0.04/0.05) = 80% chance that D and F will follow in the same window.
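The 80% figure can be reproduced by counting windows. The sketch below treats confidence as the share of windows containing the LHS event in which the RHS events also follow it, which matches the "chance that D and F will follow" reading; the function name and the synthetic windows are assumptions.

```python
def fer_confidence(windows, lhs, rhs):
    """windows: list of event lists, one per observation window.
    Returns (windows where lhs is followed by rhs in order) / (windows with lhs)."""
    def follows(window):
        if lhs not in window:
            return False
        rest = iter(window[window.index(lhs) + 1:])
        return all(event in rest for event in rhs)

    lhs_count = sum(1 for w in windows if lhs in w)
    joint_count = sum(1 for w in windows if follows(w))
    return joint_count / lhs_count if lhs_count else 0.0

# Synthetic data: E occurs in 5% of 100 windows; E followed by D, F in 4%.
windows = [["E", "D", "F"]] * 4 + [["E", "G"]] * 1 + [["G"]] * 95
confidence = fer_confidence(windows, lhs="E", rhs=["D", "F"])
```

Dividing the joint count by the LHS count gives 4/5, the same 0.04/0.05 ratio as in the example above.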

57

A cooperative anomaly and intrusion detection system (CAIDS)

In practice, the event E could be an authentication service characterized by two attributes (service = authentication, flag = SF)

The events D, F may be two sequential smtp requests, denoted by (service = smtp)

Thus we can derive an FER with a confidence level of c = 80% that two smtp services will follow the authentication service within a window w = 2 sec. The three joint traffic events account for a support level s = 10% out of all the network connections being evaluated. This FER is formally stated as follows:

$(\text{service} = \text{authentication}) \rightarrow (\text{service} = \text{smtp}), (\text{service} = \text{smtp}) \quad (0.8, 0.1, 2\ \text{sec})$ (1)
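Rule (1) can be held in a small record and checked against timestamped traffic. This is a sketch under assumptions: the `FER` class, the `matches` helper, and the event encoding as `(timestamp, label)` pairs are invented for illustration and are not the CAIDS implementation.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FER:
    lhs: tuple         # ordered LHS events
    rhs: tuple         # ordered RHS events
    confidence: float  # c
    support: float     # s
    window: float      # w, in seconds

# Rule (1) from the text.
rule_1 = FER(
    lhs=("service=authentication",),
    rhs=("service=smtp", "service=smtp"),
    confidence=0.8,
    support=0.1,
    window=2.0,
)

def matches(rule, events):
    """events: list of (timestamp_seconds, label) sorted by time.
    True if the LHS events followed by the RHS events occur in order
    within `rule.window` seconds of the first LHS event."""
    episode = rule.lhs + rule.rhs
    for start, (t0, label) in enumerate(events):
        if label != episode[0]:
            continue
        if len(episode) == 1:
            return True
        idx = 1
        for t, lab in events[start + 1:]:
            if t - t0 > rule.window:
                break
            if lab == episode[idx]:
                idx += 1
                if idx == len(episode):
                    return True
    return False

on_time = [(0.0, "service=authentication"), (0.5, "service=smtp"), (1.2, "service=smtp")]
too_late = [(0.0, "service=authentication"), (0.5, "service=smtp"), (3.0, "service=smtp")]
```

`matches(rule_1, on_time)` holds because both smtp requests arrive within 2 seconds of the authentication event, while `too_late` does not match because the second request falls outside the window.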


8

Misuse Detection

Intrusion Patterns

activities

pattern matching

intrusion

Canrsquot detect new attacks

Example if (src_ip == dst_ip) then ldquoland attackrdquo

look for known indicators ICMP Scans port scans connection attempts CPU RAM IO Utilization File system activity modification of system files permission modifications

9

Goal of Intrusion Detection Systems (IDS) To detect an intrusion as it happens and be able to respond to it

False positives A false positive is a situation where something abnormal (as

defined by the IDS) happens but it is not an intrusion Too many false positives

User will quit monitoring IDS because of noise False negatives

A false negative is a situation where an intrusion is really happening but IDS doesnt catch it

10

Outline

Intrusion Detection

Data Mining

Data Mining in Intrusion Detection

Reference

11

Why do we need Data Mining

Despite the enormous amount of data particular events of interest are still quite rare frequency ranges from 01 to less than 10

We are drowning in data but starving for knowledge1048714

12

Data Mining vs KDD

Knowledge Discovery in Databases (KDD) The whole process of finding useful information and patterns in data

Data Mining Use of algorithms to extract the information and patterns derived by the KDD process

Data mining is the core of the knowledge discovery process

13

KDD Process

Selection Obtain data from various sources Preprocessing Cleanse data Transformation Convert to common format Transform

to new format Data Mining Obtain desired results InterpretationEvaluation Present results to user in

meaningful manner

14

Data Mining A KDD Processndash Data mining core of

knowledge discovery process

Data Cleaning

Data Integration

Databases

Data Warehouse

Task-relevant Data

Selection

Data Mining

Pattern Evaluation

15

Typical Data Mining Architecture

Data Warehouse

Data cleaning amp data integration Filtering

Databases

Database or data warehouse server

Data mining engine

Pattern evaluation

Graphical user interface

Knowledge-base

16

Outline

Intrusion Detection

Data Mining

Data Mining in Intrusion Detection

Reference

17

Network intrusion detection

Number of intrusions on the network is typically a very small fraction of the total network traffic

18

Why Can Data Mining Help

Learn from traffic data Supervised learning learn precise models from past intrusions Unsupervised learning identify suspicious activities

Maintain models on dynamic data

Correlation of suspicious events across network sites Helps detect sophisticated attacks not identifiable by single site analyses

Analysis of long term data (monthsyears) Uncover suspicious stealth activities (eg insiders leakingmodifying

information)

19

Intrusion Detection

Traditional intrusion detection system IDS tools (eg SNORT) are based on signatures of known attacks

LimitationsSignature database has to be manually revised

for each new type of discovered intrusionThey cannot detect emerging cyber threatsSubstantial latency in deployment of newly created

signatures across the computer system

20

Data Mining for Intrusion Detection Techniques and Applications

Frequent pattern mining Classification Clustering Mining data streams

21

Patterns that occur frequently in a database

Mining Frequent patterns ndash finding regularities

Process of Mining Frequent patterns for intrusion de

tection Phase I mine a repository of normal frequent itemsets for a

ttack-free data

Phase II find frequent itemsets in the last n connections an

d compare the patterns to the normal profile

Frequent pattern mining

22

Frequent pattern mining

Apriori bull Any subset of a frequent itemset must be also freque

nt mdash an anti-monotone propertyndash A transaction containing beer diaper nuts alsocontains beer diaperndash beer diaper nuts is frequent beer diaper mustalso be frequent

bull No superset of any infrequent itemset should be generated or tested

ndash Many item combinations can be pruned

23

Sequential Pattern Analysis

Models sequence patterns (Temporal) order is important in many situations

Time-series databases and sequence databases

Frequent patterns (frequent) sequential patterns

Sequential patterns for intrusion detection Capture the signatures for attacks in a series of packets

24

Sequential Pattern Mining

Given a set of sequences find the complete set of frequent subsequences

25

Apriori Property in Sequences

26

Classification A Two-Step Process Model construction describe a set of predetermined

classes Training dataset tuples for model construction

Each tuplesample belongs to a predefined class Classification rules decision trees or math formulae

Model application classify unseen objects Estimate accuracy of the model using an independent test

set Acceptable accuracy apply the model to classify data tu

ples with unknown class labels

27

Classification

28

Classification Decision Tree

A node in the tree a test of some attribute A branch a possible value of the attribute Classification

Start at the root Test the attribute Move down the tree branch

29

Neural classification HIDE

ldquoA hierarchical network intrusion detection system using statistical processing and neural network classificationrdquo by Zheng et al

Five major components Probes collect traffic data Event preprocessor preprocesses traffic data and feeds the

statistical model Statistical processor maintains a model for normal activities

and generates vectors for new events Neural network classifies the vectors of new events Post processor generates reports

30

Clustering

What Is Clustering Group data into clusters

ndash Similar to one another within the same cluster ndash Dissimilar to the objects in other clusters ndash Unsupervised learning no predefined classes

31

Clustering

What Is A Good Clustering High intra-class similarity and low interclasssimilar

ity Depending on the similarity measure

The ability to discover some or all of the hidden patterns

32

Clustering

Clustering Approaches Partitioning algorithms

ndash Partition the objects into k clusters ndash Iteratively reallocate objects to improve the clusterin

g Hierarchy algorithms

ndash Agglomerative each object is a cluster merge clusters to form larger ones

ndash Divisive all objects are in a cluster split it up into smaller clusters

33

Clustering

K-Means Example

34

Mining Data Streams for Intrusion Detection

Maintaining profiles of normal activities The profiles of normal activities may drift

Identifying novel attacks Identifying clusters and outliers in traffic data

streams Reduce the future alarm load by writing

filtering rules that automatically discard well-understood false positives

35

Data Mining for Intrusion Detection

Misuse detectionPredictive models are built from labeled data sets (instances

are labeled as ldquonormalrdquo or ldquointrusiverdquo)These models can be more sophisticated and precise than manually

created signatures Recent research eg JAM (Java Agents for Metalearning)

36

Misuse Detection

Intrusion Patterns

activities

pattern matching

intrusion

Canrsquot detect new attacks

Example if (src_ip == dst_ip) then ldquoland attackrdquo

look for known indicators ICMP Scans port scans connection attempts CPU RAM IO Utilization File system activity modification of system files permission modifications

37

JAM (Java Agents for Metalearning)

JAM (developed at Columbia University) uses data mining techniques to discover patterns of intrusions It then applies a meta-learning classifier to learn the signature of attacks

The association rules algorithm determines relationships between fields in the audit trail records and the frequent episodes algorithm models sequential patterns of audit events Features are then extracted from both algorithms and used to compute models of intrusion behavior

The classifiers build the signature of attacks So thus data mining in JAM builds misuse detection model

Classifiers in the JAM are generated by using rule learning program on training data of system usage After training resulting classification rules is used to recognize anomalies and detect known intrusions

The system has been tested with data from Sendmail-based attacks and with network attacks using TCP dump data

38

Data Mining for Intrusion Detection

Anomaly detection Identifies anomalies as deviations from ldquonormalrdquo behavior Eg ADAM Audit Data Analysis and Mining MINDS ndash MINnesota INt

rusion Detection System

39

Anomaly Detection

activity measures

0102030405060708090

CPU ProcessSize

normal profileabnormal

probable intrusion

Relatively high false positive rate - anomalies can just be new normal activities

baseline the normal traffic and then look for things that are out of the norm

40

ADAM Audit Data Analysis and Mining

Detecting Intrusion by Data MiningCombination of Association Rule and Classification Rule Firstly ADAM collects known frequent datasetsan off-line

algorithm Secondly ADAM runs an online algorithm

Finds last frequent connection records Compare them with known mined data Discards those which seems to be normal Suspicious ones are forwarded to the classifier Trained classifier then classify the suspicious data as one

of the following Known type of attack Unknown type of attack False alarm

41

ADAM Detecting Intrusion by Data Mining

42

ADAM Audit Data Analysis and Mining

ADAM has two phases in their model

1st Phase Train the classifier Offline process Takes place only once Before the main experiment

2nd Phase Using the trained classifier Trained classifier is then used to detect anomalies Online process

43

The MINDS Project

MINDS ndash MINnesota INtrusion Detection System

Learning from Rare Class ndash Building rare class prediction models

Anomalyoutlier detection

Summarization of attacks using association pattern analysis

TID Items

1 Bread Coke Milk

2 Beer Bread

3 Beer Coke Diaper Milk

4 Beer Bread Diaper Milk

5 Coke Diaper Milk

Rules Discovered Milk --gt Coke Diaper Milk --gt Beer

Rules Discovered Milk --gt Coke Diaper Milk --gt Beer

44

MINDS - Learning from Rare Class

Problem Building models for rare network attacks (Mining needle in a haystack)

Standard data mining models are not suitable for rare classes

Models must be able to handle skewed class distributions

Learning from data streams - intrusions are sequences of events

45

MINDS - Anomaly Detection

Detect novel attacksintrusions by identifying them as deviations from ldquonormalrdquo ie anomalous behaviorIdentify normal behaviorConstruct useful set of featuresDefine similarity functionUse outlier detection algorithm

Nearest neighbor approach

Density based schemes

Unsupervised Support Vector Machines (SVM)

46

Experimental Evaluation

• Publicly available data set: DARPA 1998 Intrusion Detection Evaluation Data Set, prepared and managed by MIT Lincoln Lab; includes a wide variety of intrusions simulated in a military network environment

• Real network data from University of Minnesota

[Figure: MINDS evaluation setup. Diagram labels: network; net-flow data using CISCO routers; data preprocessing; MINDS anomaly detection; anomaly scores; association pattern analysis; Snort, an open source signature-based network IDS (www.snort.org); 10 minutes cycle, 2 million connections; anomaly detection is applied 4 times a day on a 10 minutes time window]

47

MINDS - Framework for Mining Associations

[Figure: MINDS association analysis module. Diagram labels: anomaly detection system; ranked connections labeled attack or normal; discriminating association pattern generator; knowledge base (update)]

Example rules:
R1: TCP, DstPort=1863 --> Attack
...
R100: TCP, DstPort=80 --> Normal

Uses of the discovered patterns:
1. Build normal profile
2. Study changes in normal behavior
3. Create attack summary
4. Detect misuse behavior
5. Understand nature of the attack

48

Discovered Real-life Association Patterns

Rule 1: SrcIP=XXXX, DstPort=80, Protocol=TCP, Flag=SYN, NoPackets: 3, NoBytes: 120...180 (c1=256, c2=1)

Rule 2: SrcIP=XXXX, DstIP=YYYY, DstPort=80, Protocol=TCP, Flag=SYN, NoPackets: 3, NoBytes: 120...180 (c1=177, c2=0)

At first glance, Rule 1 appears to describe a Web scan. Rule 2 indicates an attack on a specific machine. Both rules together indicate that a scan is performed first, followed by an attack on a specific machine identified as vulnerable by the attacker.

49

Discovered Real-life Association Patterns

DstIP=ZZZZ, DstPort=8888, Protocol=TCP (c1=369, c2=0)

DstIP=ZZZZ, DstPort=8888, Protocol=TCP, Flag=SYN (c1=291, c2=0)

• This pattern indicates an anomalously high number of TCP connections on port 8888 involving machine ZZZZ

• Follow-up analysis of connections covered by the pattern indicates that this could be a machine running a variation of the Kazaa file-sharing protocol

• Having an unauthorized application increases the vulnerability of the system

50

Discovered Real-life Association Patterns... (ctd)

SrcIP=XXXX, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=4, NoBytes=189...200 (c1=582, c2=2)

SrcIP=XXXX, DstPort=12345, NoPackets=4, NoBytes=189...200 (c1=580, c2=3)

SrcIP=YYYY, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=3, NoBytes=144 (c1=694, c2=3)

...

• This pattern indicates a large number of scans on ports 27374 (which is a signature for the SubSeven worm) and 12345 (which is a signature for the NetBus worm)

• Further analysis showed that no fewer than five machines were scanning for one or both of these ports in any time window

51

Discovered Real-life Association Patterns... (ctd)

DstPort=6667, Protocol=TCP (c1=254, c2=1)

• This pattern indicates an unusually large number of connections on port 6667 detected by the anomaly detector

• Port 6667 is where IRC (Internet Relay Chat) is typically run

• Further analysis reveals that there are many small packets from/to various IRC servers around the world

• Although IRC traffic is not unusual, the fact that it is flagged as anomalous is interesting

• This might indicate that the IRC server has been taken down (by a DOS attack, for example) or that it is a rogue IRC server (it could be involved in some hacking activity)

52

Discovered Real-life Association Patterns... (ctd)

DstPort=1863, Protocol=TCP, Flag=0, NoPackets=1, NoBytes<139 (c1=498, c2=6)

DstPort=1863, Protocol=TCP, Flag=0 (c1=587, c2=6)

DstPort=1863, Protocol=TCP (c1=606, c2=8)

• This pattern indicates a large number of anomalous TCP connections on port 1863

• Further analysis reveals that the remote IP block is owned by Hotmail

• Flag=0 is unusual for TCP traffic

53

MINDS Conclusion

• Data mining based algorithms are capable of detecting intrusions that cannot be detected by state-of-the-art signature based methods
– SNORT has static knowledge manually updated by human analysts
– MINDS anomaly detection algorithms are adaptive in nature
– MINDS anomaly detection algorithms can also be effective in detecting anomalous behavior originating from a compromised or infected machine

MINDS Research

• Defining normal behavior
• Feature extraction
• Similarity functions
• Outlier detection
• Result summarization
• Detection of attacks originating from multiple sites

[Figure labels: worm/virus detection after infection; insider attack; policy violation; outsider attack; network intrusion]

54

IDS Using both Misuse and Anomaly Detection: RIDS-100

• RIDS (Rising Intrusion Detection System) is provided by Rising Tech, a leader in antivirus and content security software and services in China

• The company is a leading provider of client, gateway, and server security solutions for virus protection, firewall, and intrusion detection technologies, and of security services to enterprises and service providers around China

• RIDS makes use of both intrusion detection techniques: misuse detection and anomaly detection

• A distance based outlier detection algorithm is used to detect deviating behavior in the collected network data

• For misuse detection, it has a very large set of collected data patterns that can be matched against scanned network data

• This large set of data patterns is scanned using a data mining classification algorithm, Decision Tree

http://www.rising-global.com

55

A cooperative anomaly and intrusion detection system (CAIDS)

Built with a network-based intrusion detection system (NIDS) and an anomaly detection system (ADS) operating interactively through a signature generator

56

A cooperative anomaly and intrusion detection system (CAIDS)

A frequent episode rule (FER) is generated out of a collection of frequent episodes. The FER is defined over episode sequences with multiple connection events.

For example, we envision a window where we observe a 3-event sequence: E, D, and F. An FER is generated as

E → D, F   with confidence level c = freq(a ∪ b) / freq(b) = 0.8

where a represents the event E on the LHS and b corresponds to the two events D and F on the RHS of the rule.

If b occurs with 5% and the joint event a and b has 4% to occur, there is a (0.04/0.05) = 80% chance that D and F will follow in the same window.

57

A cooperative anomaly and intrusion detection system (CAIDS)

In practice, the event E could be an authentication service characterized by two attributes (service = authentication, flag = SF). The events D and F may be two sequential smtp requests, denoted by (service = smtp). Thus we can derive an FER with a confidence level of c = 80% that two smtp services will follow the authentication service within a window w = 2 sec. The three joint traffic events account for a support level s = 10% of all the network connections being evaluated. This FER is formally stated as follows:

(service = authentication) → (service = smtp), (service = smtp)   (0.8, 0.1, 2 sec)   (1)

58

A cooperative anomaly and intrusion detection system (CAIDS)

An association rule is aimed at finding interesting intra-relationships inside a single connection record.

In general, an FER is specified by the following expression:

L1, L2, ..., Ln → R1, ..., Rm   (c, s, window)   (2)

Li (1 ≤ i ≤ n) and Rj (1 ≤ j ≤ m) are ordered traffic connection events.

We call L1, L2, ..., Ln the LHS episode and R1, ..., Rm the RHS of the episode rule.

59

A cooperative anomaly and intrusion detection system (CAIDS)

[Figure: Architecture of the CAIDS simulator, built with a 2000-signature Snort and an anomaly detection subsystem (ADS) with 60 FERs after 2 weeks of rule training over the Lincoln Lab IDS evaluation dataset]

60

Conclusion

• In this report we have studied the basic concepts and some classic system models in this area, such as ADAM and MINDS

• We summarized those system models, their technologies, and their validation methods

• We hope this gives an overview of current developments in this area and of how data mining is evolving into the field of network intrusion detection

61

Reference

• DARPA 1998 data set; a cleansed set in KDDCup'99. DARPA 1991 data set is also available. http://www.ll.mit.edu/IST/ideval/data/data_index.html

• Daniel Barbara, Julia Couto, Sushil Jajodia, Leonard Popyack, Ningning Wu. "ADAM: Detecting Intrusions by Data Mining". Proceedings of the 2001 IEEE Workshop on Information Assurance and Security, United States Military Academy, West Point, NY, 5-6 June 2001

• Zhang, J. and Zulkernine, M. 2006. A Hybrid Network Intrusion Detection Technique Using Random Forests. In Proceedings of the First International Conference on Availability, Reliability and Security (April 20-22, 2006)

• W. Lee et al. A data mining framework for building intrusion detection models. In Information and System Security, Vol. 3, No. 4, 2000

• Ertoz, L. et al. MINDS - Minnesota Intrusion Detection System. Next Generation Data Mining, Chapter 3, 2004

• Lu, C.-T., Boedihardjo, A.P., Manalwar, P. Exploiting efficient data mining techniques to enhance intrusion detection systems. Information Reuse and Integration (IRI-2005), IEEE International Conference, 15-17 Aug. 2005, pages 512-517

• Sal Stolfo, Andreas Prodromidis, Shelley Tselepis, Wenke Lee, Dave Fan, and Phil Chan (Honorable mention (runner-up) for Best Paper Award in Applied Research Category). In Proceedings of the Third International Conference on Knowledge Discovery and Data Mining (KDD 97), Newport Beach, CA, August 1997

62

Questions & Comments

9

Goal of Intrusion Detection Systems (IDS): to detect an intrusion as it happens and be able to respond to it

• False positives
– A false positive is a situation where something abnormal (as defined by the IDS) happens, but it is not an intrusion
– Too many false positives: the user will quit monitoring the IDS because of noise

• False negatives
– A false negative is a situation where an intrusion is really happening, but the IDS doesn't catch it
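The two error types can be turned into rates once each event has a ground-truth label and an IDS verdict. The sketch below assumes such labeled events are available; the function name and the toy data are hypothetical.

```python
def alarm_rates(labels, alarms):
    """labels[i] is True for a real intrusion; alarms[i] is True if the IDS raised an alarm."""
    fp = sum(1 for y, a in zip(labels, alarms) if a and not y)  # alarm, no intrusion
    fn = sum(1 for y, a in zip(labels, alarms) if y and not a)  # intrusion, no alarm
    negatives = sum(1 for y in labels if not y)
    positives = sum(1 for y in labels if y)
    return {
        "false_positive_rate": fp / negatives if negatives else 0.0,
        "false_negative_rate": fn / positives if positives else 0.0,
    }

# Toy data: 8 normal events (2 of them raise alarms) and 2 intrusions (1 missed).
labels = [False] * 8 + [True] * 2
alarms = [False] * 6 + [True] * 2 + [True, False]
rates = alarm_rates(labels, alarms)
print(rates)
```

Because intrusions are rare, even a small false positive rate can produce far more false alarms than true detections, which is the noise problem described above.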

10

Outline

Intrusion Detection

Data Mining

Data Mining in Intrusion Detection

Reference

11

Why do we need Data Mining?

Despite the enormous amount of data, particular events of interest are still quite rare: their frequency ranges from 0.1% to less than 10%

We are drowning in data, but starving for knowledge

12

Data Mining vs. KDD

• Knowledge Discovery in Databases (KDD): the whole process of finding useful information and patterns in data

• Data Mining: use of algorithms to extract the information and patterns derived by the KDD process

• Data mining is the core of the knowledge discovery process

13

KDD Process

• Selection: obtain data from various sources
• Preprocessing: cleanse data
• Transformation: convert to common format, transform to new format
• Data Mining: obtain desired results
• Interpretation/Evaluation: present results to user in meaningful manner

14

Data Mining: A KDD Process

– Data mining: core of knowledge discovery process

[Figure: KDD process. Diagram labels: Databases; Data Cleaning; Data Integration; Data Warehouse; Selection; Task-relevant Data; Data Mining; Pattern Evaluation]

15

Typical Data Mining Architecture

[Figure: typical data mining architecture. Diagram labels: Databases; Data Warehouse; data cleaning & data integration; filtering; database or data warehouse server; data mining engine; pattern evaluation; graphical user interface; knowledge-base]

16

Outline

Intrusion Detection

Data Mining

Data Mining in Intrusion Detection

Reference

17

Network intrusion detection

Number of intrusions on the network is typically a very small fraction of the total network traffic

18

Why Can Data Mining Help?

• Learn from traffic data
– Supervised learning: learn precise models from past intrusions
– Unsupervised learning: identify suspicious activities

• Maintain models on dynamic data

• Correlation of suspicious events across network sites
– Helps detect sophisticated attacks not identifiable by single site analyses

• Analysis of long term data (months/years)
– Uncover suspicious stealth activities (e.g. insiders leaking/modifying information)

19

Intrusion Detection

• Traditional intrusion detection system (IDS) tools (e.g. SNORT) are based on signatures of known attacks

• Limitations
– Signature database has to be manually revised for each new type of discovered intrusion
– They cannot detect emerging cyber threats
– Substantial latency in deployment of newly created signatures across the computer system

20

Data Mining for Intrusion Detection: Techniques and Applications

• Frequent pattern mining
• Classification
• Clustering
• Mining data streams

21

Frequent pattern mining

• Patterns that occur frequently in a database

• Mining frequent patterns: finding regularities

• Process of mining frequent patterns for intrusion detection
– Phase I: mine a repository of normal frequent itemsets for attack-free data
– Phase II: find frequent itemsets in the last n connections and compare the patterns to the normal profile
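The two phases can be sketched as a set difference between itemsets that are frequent in recent traffic and itemsets already in the normal profile. This is an illustration under stated assumptions: the connection records, the attribute names, and the threshold are hypothetical, and itemsets are counted by brute force rather than with a dedicated mining algorithm.

```python
from collections import Counter
from itertools import combinations

def frequent_itemsets(connections, min_count, max_size=2):
    """Return attribute=value itemsets (up to max_size items) seen in at least min_count records."""
    counts = Counter()
    for conn in connections:
        items = sorted(conn.items())
        for size in range(1, max_size + 1):
            counts.update(combinations(items, size))
    return {itemset for itemset, n in counts.items() if n >= min_count}

# Phase I: build the normal profile from attack-free records (hypothetical data).
attack_free = [{"service": "http", "flag": "SF"}] * 3 + [{"service": "smtp", "flag": "SF"}] * 2
profile = frequent_itemsets(attack_free, min_count=2)

# Phase II: mine the last n connections and keep only patterns absent from the profile.
recent = [{"service": "http", "flag": "S0"}] * 3 + [{"service": "http", "flag": "SF"}] * 2
suspicious = frequent_itemsets(recent, min_count=2) - profile
print(sorted(suspicious))
```

Here the only patterns left over are those involving flag=S0, which never appeared frequently in the attack-free data.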

22

Frequent pattern mining

Apriori

• Any subset of a frequent itemset must also be frequent (an anti-monotone property)
– A transaction containing {beer, diaper, nuts} also contains {beer, diaper}
– If {beer, diaper, nuts} is frequent, {beer, diaper} must also be frequent

• No superset of any infrequent itemset should be generated or tested
– Many item combinations can be pruned
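A compact level-wise sketch shows where the pruning happens: candidates of size k+1 are built only from frequent k-itemsets, and any candidate with an infrequent subset is dropped before it is counted. The baskets and the `min_support` count are made up for illustration.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: count} for every itemset contained in >= min_support transactions."""
    transactions = [frozenset(t) for t in transactions]
    items = {item for t in transactions for item in t}
    candidates = [frozenset([item]) for item in items]
    frequent = {}
    k = 1
    while candidates:
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        # Join step: merge frequent k-itemsets into (k+1)-item candidates.
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: a candidate survives only if all its (k-1)-item subsets are frequent.
        candidates = [c for c in joined
                      if all(frozenset(s) in level for s in combinations(c, k - 1))]
    return frequent

baskets = [
    {"beer", "diaper", "nuts"},
    {"beer", "diaper"},
    {"beer", "nuts", "diaper"},
    {"milk", "nuts"},
]
frequent = apriori(baskets, min_support=2)
print(frequent[frozenset({"beer", "diaper", "nuts"})])
```

Since {milk} occurs only once, no itemset containing milk is ever generated as a candidate.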

23

Sequential Pattern Analysis

• Models sequence patterns
– (Temporal) order is important in many situations
– Time-series databases and sequence databases
– Frequent patterns → (frequent) sequential patterns

• Sequential patterns for intrusion detection
– Capture the signatures for attacks in a series of packets

24

Sequential Pattern Mining

Given a set of sequences find the complete set of frequent subsequences

25

Apriori Property in Sequences

26

Classification: A Two-Step Process

• Model construction: describe a set of predetermined classes
– Training dataset: tuples for model construction
– Each tuple/sample belongs to a predefined class
– Classification rules, decision trees, or math formulae

• Model application: classify unseen objects
– Estimate accuracy of the model using an independent test set
– Acceptable accuracy: apply the model to classify data tuples with unknown class labels

27

Classification

28

Classification Decision Tree

• A node in the tree: a test of some attribute
• A branch: a possible value of the attribute
• Classification
– Start at the root
– Test the attribute
– Move down the tree branch
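The traversal described above can be sketched with a hand-built tree stored as nested dictionaries. The tree, its attributes, and the class labels are hypothetical; a real tree would be learned from training data.

```python
# Hypothetical hand-built tree: internal nodes test one attribute,
# branches are attribute values, leaves are class labels.
tree = {
    "attribute": "protocol",
    "branches": {
        "icmp": "normal",
        "tcp": {
            "attribute": "flag",
            "branches": {"SF": "normal", "S0": "intrusion"},
        },
    },
}

def classify(node, record, default="unknown"):
    """Start at the root, test the node's attribute, and move down the matching branch."""
    while isinstance(node, dict):
        value = record.get(node["attribute"])
        if value not in node["branches"]:
            return default  # the tree has no branch for this value
        node = node["branches"][value]
    return node  # reached a leaf: the predicted class

print(classify(tree, {"protocol": "tcp", "flag": "S0"}))
```

Returning a default label for unseen attribute values is one design choice; another would be to fall back to the majority class of the node.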

29

Neural classification: HIDE

"A hierarchical network intrusion detection system using statistical processing and neural network classification" by Zheng et al.

Five major components
• Probes: collect traffic data
• Event preprocessor: preprocesses traffic data and feeds the statistical model
• Statistical processor: maintains a model for normal activities and generates vectors for new events
• Neural network: classifies the vectors of new events
• Post processor: generates reports

30

Clustering

What Is Clustering?

• Group data into clusters
– Similar to one another within the same cluster
– Dissimilar to the objects in other clusters
– Unsupervised learning: no predefined classes

31

Clustering

What Is A Good Clustering?

• High intra-class similarity and low inter-class similarity
– Depending on the similarity measure

• The ability to discover some or all of the hidden patterns

32

Clustering

Clustering Approaches

• Partitioning algorithms
– Partition the objects into k clusters
– Iteratively reallocate objects to improve the clustering

• Hierarchy algorithms
– Agglomerative: each object is a cluster; merge clusters to form larger ones
– Divisive: all objects are in a cluster; split it up into smaller clusters

33

Clustering

K-Means Example
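Since the slide's figure did not survive, here is a minimal k-means sketch (Lloyd's algorithm) as an instance of a partitioning algorithm. The points and the starting centroids are invented for illustration; fixed initial centroids are used so the run is repeatable.

```python
import math

def k_means(points, centroids, iterations=10):
    """Assign each point to its nearest centroid, recompute centroids, repeat until stable."""
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # New centroid = coordinate-wise mean; an empty cluster keeps its old centroid.
        new_centroids = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, clusters

points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = k_means(points, centroids=[(1, 1), (8, 8)])
print(clusters)
```

The result depends on the initial centroids, which is why practical implementations usually try several random starts.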

34

Mining Data Streams for Intrusion Detection

• Maintaining profiles of normal activities
– The profiles of normal activities may drift

• Identifying novel attacks
– Identifying clusters and outliers in traffic data streams

• Reduce the future alarm load by writing filtering rules that automatically discard well-understood false positives

35

Data Mining for Intrusion Detection

Misuse detection

• Predictive models are built from labeled data sets (instances are labeled as "normal" or "intrusive")

• These models can be more sophisticated and precise than manually created signatures

• Recent research: e.g. JAM (Java Agents for Metalearning)

36

Misuse Detection

[Figure: misuse detection. Diagram labels: intrusion patterns; activities; pattern matching; intrusion]

• Can't detect new attacks

• Example: if (src_ip == dst_ip) then "land attack"

• Look for known indicators
– ICMP scans, port scans, connection attempts
– CPU, RAM, I/O utilization
– File system activity, modification of system files, permission modifications
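The land attack rule above can be written as one entry in a small signature table. The packet representation (a dictionary with `src_ip` and `dst_ip` keys) and the table layout are assumptions for illustration, not the format of any particular IDS.

```python
# Each signature pairs an attack name with a predicate over one packet record.
SIGNATURES = [
    # The rule from the slide: source and destination address are identical.
    ("land attack", lambda pkt: pkt["src_ip"] == pkt["dst_ip"]),
]

def match_signatures(pkt):
    """Return the names of all known signatures that the packet matches."""
    return [name for name, predicate in SIGNATURES if predicate(pkt)]

print(match_signatures({"src_ip": "10.0.0.5", "dst_ip": "10.0.0.5"}))
print(match_signatures({"src_ip": "10.0.0.5", "dst_ip": "10.0.0.9"}))
```

A packet that matches no entry produces an empty list, which is exactly why misuse detection cannot flag attacks it has no pattern for.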

37

JAM (Java Agents for Metalearning)

• JAM (developed at Columbia University) uses data mining techniques to discover patterns of intrusions. It then applies a meta-learning classifier to learn the signature of attacks.

• The association rules algorithm determines relationships between fields in the audit trail records, and the frequent episodes algorithm models sequential patterns of audit events. Features are then extracted from both algorithms and used to compute models of intrusion behavior.

• The classifiers build the signatures of attacks. Thus, data mining in JAM builds a misuse detection model.

• Classifiers in JAM are generated by running a rule learning program on training data of system usage. After training, the resulting classification rules are used to recognize anomalies and detect known intrusions.

• The system has been tested with data from Sendmail-based attacks and with network attacks using TCP dump data.

38

Data Mining for Intrusion Detection

Anomaly detection

• Identifies anomalies as deviations from "normal" behavior

• E.g. ADAM (Audit Data Analysis and Mining), MINDS (MINnesota INtrusion Detection System)

39

Anomaly Detection

[Figure: bar chart of activity measures (scale 0 to 90) for CPU and Process Size, showing the normal profile and abnormal values; a deviation is marked as a probable intrusion]

• Relatively high false positive rate: anomalies can just be new normal activities

• Baseline the normal traffic and then look for things that are out of the norm
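One simple way to baseline a single activity measure is to keep its mean and standard deviation and flag values that fall too far from the mean. The measurements and the 3-standard-deviation threshold below are assumptions chosen for illustration.

```python
import statistics

def build_profile(samples):
    """Summarize attack-free measurements as (mean, sample standard deviation)."""
    return statistics.mean(samples), statistics.stdev(samples)

def is_anomalous(value, profile, threshold=3.0):
    """Flag a value that lies more than `threshold` standard deviations from the mean."""
    mean, stdev = profile
    return abs(value - mean) > threshold * stdev

# Hypothetical CPU utilization (%) observed during normal operation.
normal_cpu = [20, 22, 19, 21, 23, 20, 18, 22]
profile = build_profile(normal_cpu)

print(is_anomalous(85, profile))  # far outside the baseline
print(is_anomalous(21, profile))  # within the baseline
```

A legitimate change in workload would also be flagged by this rule, which illustrates the false positive problem noted above.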

40

ADAM: Audit Data Analysis and Mining

Detecting Intrusion by Data Mining

• Combination of association rules and classification rules

• First, ADAM collects known frequent datasets (an off-line algorithm)

• Second, ADAM runs an online algorithm
– Finds the last frequent connection records
– Compares them with the known mined data
– Discards those that seem to be normal
– Forwards suspicious ones to the classifier

• The trained classifier then classifies the suspicious data as one of the following
– Known type of attack
– Unknown type of attack
– False alarm


10

Outline

Intrusion Detection

Data Mining

Data Mining in Intrusion Detection

Reference

11

Why do we need Data Mining

Despite the enormous amount of data particular events of interest are still quite rare frequency ranges from 01 to less than 10

We are drowning in data but starving for knowledge1048714

12

Data Mining vs KDD

Knowledge Discovery in Databases (KDD) The whole process of finding useful information and patterns in data

Data Mining Use of algorithms to extract the information and patterns derived by the KDD process

Data mining is the core of the knowledge discovery process

13

KDD Process

Selection Obtain data from various sources Preprocessing Cleanse data Transformation Convert to common format Transform

to new format Data Mining Obtain desired results InterpretationEvaluation Present results to user in

meaningful manner

14

Data Mining A KDD Processndash Data mining core of

knowledge discovery process

Data Cleaning

Data Integration

Databases

Data Warehouse

Task-relevant Data

Selection

Data Mining

Pattern Evaluation

15

Typical Data Mining Architecture

Data Warehouse

Data cleaning amp data integration Filtering

Databases

Database or data warehouse server

Data mining engine

Pattern evaluation

Graphical user interface

Knowledge-base

16

Outline

Intrusion Detection

Data Mining

Data Mining in Intrusion Detection

Reference

17

Network intrusion detection

Number of intrusions on the network is typically a very small fraction of the total network traffic

18

Why Can Data Mining Help

Learn from traffic data Supervised learning learn precise models from past intrusions Unsupervised learning identify suspicious activities

Maintain models on dynamic data

Correlation of suspicious events across network sites Helps detect sophisticated attacks not identifiable by single site analyses

Analysis of long term data (monthsyears) Uncover suspicious stealth activities (eg insiders leakingmodifying

information)

19

Intrusion Detection

Traditional intrusion detection system IDS tools (eg SNORT) are based on signatures of known attacks

LimitationsSignature database has to be manually revised

for each new type of discovered intrusionThey cannot detect emerging cyber threatsSubstantial latency in deployment of newly created

signatures across the computer system

20

Data Mining for Intrusion Detection Techniques and Applications

Frequent pattern mining Classification Clustering Mining data streams

21

Patterns that occur frequently in a database

Mining Frequent patterns ndash finding regularities

Process of Mining Frequent patterns for intrusion de

tection Phase I mine a repository of normal frequent itemsets for a

ttack-free data

Phase II find frequent itemsets in the last n connections an

d compare the patterns to the normal profile

Frequent pattern mining

22

Frequent pattern mining

Apriori bull Any subset of a frequent itemset must be also freque

nt mdash an anti-monotone propertyndash A transaction containing beer diaper nuts alsocontains beer diaperndash beer diaper nuts is frequent beer diaper mustalso be frequent

bull No superset of any infrequent itemset should be generated or tested

ndash Many item combinations can be pruned

23

Sequential Pattern Analysis

Models sequence patterns (Temporal) order is important in many situations

Time-series databases and sequence databases

Frequent patterns (frequent) sequential patterns

Sequential patterns for intrusion detection Capture the signatures for attacks in a series of packets

24

Sequential Pattern Mining

Given a set of sequences, find the complete set of frequent subsequences
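As a rough illustration of this problem statement, the sketch below grows frequent subsequences one event at a time and counts how many sequences contain each candidate. The event names, the `max_len` cap, and the threshold are assumptions made for the example; real sequential pattern miners are considerably more efficient.

```python
def is_subsequence(pattern, sequence):
    """True if pattern's events appear in sequence in the same order (gaps allowed)."""
    it = iter(sequence)
    return all(event in it for event in pattern)


def support(pattern, sequences):
    """Number of sequences that contain pattern as a subsequence."""
    return sum(is_subsequence(pattern, s) for s in sequences)


def frequent_subsequences(sequences, min_count, max_len=3):
    """Level-wise search: extend each frequent pattern by one event."""
    alphabet = sorted({e for s in sequences for e in s})
    frequent = {}
    level = [()]
    for _ in range(max_len):
        next_level = []
        for prefix in level:
            for event in alphabet:
                candidate = prefix + (event,)
                count = support(candidate, sequences)
                if count >= min_count:
                    frequent[candidate] = count
                    next_level.append(candidate)
        level = next_level
    return frequent


# Hypothetical packet-level event sequences, one per session.
sessions = [
    ["syn", "syn_ack", "ack", "data"],
    ["syn", "syn_ack", "ack"],
    ["syn", "rst"],
    ["syn", "syn_ack", "rst"],
]
patterns = frequent_subsequences(sessions, min_count=3)
```

Only patterns whose prefix is already frequent are extended, which is the sequence analogue of the Apriori property.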

25

Apriori Property in Sequences

26

Classification: A Two-Step Process

Model construction: describe a set of predetermined classes
Training dataset: tuples for model construction
Each tuple/sample belongs to a predefined class
Classification rules, decision trees, or math formulae

Model application: classify unseen objects
Estimate accuracy of the model using an independent test set
Acceptable accuracy → apply the model to classify data tuples with unknown class labels

27

Classification

28

Classification: Decision Tree

A node in the tree: a test of some attribute
A branch: a possible value of the attribute

Classification
Start at the root
Test the attribute
Move down the tree branch
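The traversal can be sketched as follows. The tree itself is hand-written and hypothetical (attribute names `protocol` and `flag`, and the class labels, are assumptions for illustration); a real tree would be learned from training data.

```python
# Internal nodes test one attribute, branches are attribute values,
# and leaves are class labels.
tree = {
    "attribute": "protocol",
    "branches": {
        "icmp": "normal",
        "tcp": {
            "attribute": "flag",
            "branches": {"SF": "normal", "S0": "intrusive"},
        },
    },
}


def classify(node, record, default="unknown"):
    """Start at the root, test the node's attribute, follow the matching branch."""
    while isinstance(node, dict):
        value = record.get(node["attribute"])
        if value not in node["branches"]:
            return default  # no branch exists for this attribute value
        node = node["branches"][value]
    return node  # reached a leaf: the class label
```

For example, `classify(tree, {"protocol": "tcp", "flag": "S0"})` walks root → tcp branch → S0 branch and ends at a leaf.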

29

Neural classification: HIDE

"A hierarchical network intrusion detection system using statistical processing and neural network classification" by Zheng et al.

Five major components
Probes: collect traffic data
Event preprocessor: preprocesses traffic data and feeds the statistical model
Statistical processor: maintains a model for normal activities and generates vectors for new events
Neural network: classifies the vectors of new events
Post processor: generates reports

30

Clustering

What Is Clustering?

Group data into clusters
- Similar to one another within the same cluster
- Dissimilar to the objects in other clusters
- Unsupervised learning: no predefined classes

31

Clustering

What Is a Good Clustering?

High intra-class similarity and low inter-class similarity
Depending on the similarity measure

The ability to discover some or all of the hidden patterns

32

Clustering

Clustering Approaches

Partitioning algorithms
- Partition the objects into k clusters
- Iteratively reallocate objects to improve the clustering

Hierarchy algorithms
- Agglomerative: each object is a cluster; merge clusters to form larger ones
- Divisive: all objects are in a cluster; split it up into smaller clusters

33

Clustering

K-Means Example
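Since the slide's figure did not survive, here is a small stand-in sketch of k-means as a partitioning algorithm: assign each object to its nearest centroid, recompute the centroids, and repeat until nothing changes. The points and initial centroids are made up for the example.

```python
import math


def kmeans(points, centroids, max_iters=100):
    """Plain k-means on tuples of numbers, starting from the given centroids."""
    centroids = [tuple(c) for c in centroids]
    assignment = []
    for _ in range(max_iters):
        # Assignment step: attach each point to its nearest centroid.
        assignment = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:
                new_centroids.append(
                    tuple(sum(x) / len(members) for x in zip(*members))
                )
            else:
                new_centroids.append(old)  # keep the centroid of an empty cluster
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return centroids, assignment


points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
centroids, assignment = kmeans(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
```

The result depends on the initial centroids, which is why practical implementations usually try several starting points.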

34

Mining Data Streams for Intrusion Detection

Maintaining profiles of normal activities
The profiles of normal activities may drift

Identifying novel attacks
Identifying clusters and outliers in traffic data streams

Reduce the future alarm load by writing filtering rules that automatically discard well-understood false positives
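One simple way to let a profile of normal activity drift is an exponentially weighted mean and variance that slowly forgets old traffic. The sketch below is an assumption-laden illustration, not a method named on the slides: the class name, `alpha`, `threshold`, and `warmup` are all hypothetical parameters.

```python
class DriftingProfile:
    """Running mean/variance of one traffic measure with exponential forgetting."""

    def __init__(self, alpha=0.1, threshold=3.0, warmup=5):
        self.alpha = alpha          # weight given to the newest observation
        self.threshold = threshold  # flag values this many std-devs from the mean
        self.warmup = warmup        # observations to absorb before flagging
        self.mean = None
        self.var = 0.0
        self.n = 0

    def observe(self, x):
        """Return True if x is an outlier; otherwise fold x into the profile."""
        if self.mean is None:
            self.mean, self.n = float(x), 1
            return False
        std = self.var ** 0.5
        if (self.n >= self.warmup and std > 0
                and abs(x - self.mean) > self.threshold * std):
            return True  # outliers are reported and not learned
        delta = x - self.mean
        self.mean += self.alpha * delta
        self.var = (1 - self.alpha) * (self.var + self.alpha * delta * delta)
        self.n += 1
        return False


profile = DriftingProfile()
flags = [profile.observe(x) for x in [100, 102, 98, 101, 99, 100]]
spike = profile.observe(500)
```

Values close to the current profile gradually shift the mean, so slow drift is absorbed, while a sudden spike is flagged and left out of the profile.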

35

Data Mining for Intrusion Detection

Misuse detection

Predictive models are built from labeled data sets (instances are labeled as "normal" or "intrusive")

These models can be more sophisticated and precise than manually created signatures

Recent research, e.g., JAM (Java Agents for Metalearning)

36

Misuse Detection

Figure: activities are pattern-matched against known intrusion patterns to flag an intrusion

Can't detect new attacks

Example: if (src_ip == dst_ip) then "land attack"

Look for known indicators: ICMP scans, port scans, connection attempts; CPU, RAM, and I/O utilization; file system activity, modification of system files, permission modifications
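The land-attack example can be written as a tiny signature matcher. This is a sketch only: the packet is assumed to be a dictionary with hypothetical keys `src_ip`, `dst_ip`, and `dst_port`, and the second signature is invented to show how more rules would be added.

```python
# Each signature is a (name, predicate) pair; the predicates are illustrative.
SIGNATURES = [
    ("land attack", lambda pkt: pkt["src_ip"] == pkt["dst_ip"]),
    ("telnet connection attempt", lambda pkt: pkt["dst_port"] == 23),
]


def match_signatures(pkt):
    """Return the names of all known signatures that the packet matches."""
    return [name for name, predicate in SIGNATURES if predicate(pkt)]


alerts = match_signatures(
    {"src_ip": "10.0.0.5", "dst_ip": "10.0.0.5", "dst_port": 80}
)
```

A packet that matches no predicate yields an empty list, which is exactly why this style of detection misses attacks that have no signature yet.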

37

JAM (Java Agents for Metalearning)

JAM (developed at Columbia University) uses data mining techniques to discover patterns of intrusions. It then applies a meta-learning classifier to learn the signature of attacks.

The association rules algorithm determines relationships between fields in the audit trail records, and the frequent episodes algorithm models sequential patterns of audit events. Features are then extracted from both algorithms and used to compute models of intrusion behavior.

The classifiers build the signature of attacks. Thus, data mining in JAM builds a misuse detection model.

Classifiers in JAM are generated by running a rule learning program on training data of system usage. After training, the resulting classification rules are used to recognize anomalies and detect known intrusions.

The system has been tested with data from Sendmail-based attacks and with network attacks using TCP dump data.

38

Data Mining for Intrusion Detection

Anomaly detection

Identifies anomalies as deviations from "normal" behavior

E.g., ADAM (Audit Data Analysis and Mining), MINDS (MINnesota INtrusion Detection System)

39

Anomaly Detection

Figure: bar chart of activity measures (CPU, process size) comparing the normal profile with abnormal values; a large deviation indicates a probable intrusion

Relatively high false positive rate - anomalies can just be new normal activities

Baseline the normal traffic and then look for things that are out of the norm
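A minimal sketch of "baseline, then look for deviations", assuming each observation is a dictionary of activity measures and that "out of the norm" means more than a chosen number of standard deviations from the baseline mean. The measure names, sample values, and the threshold of 3 are assumptions for the example.

```python
from statistics import mean, stdev


def build_profile(normal_samples):
    """Per-measure (mean, standard deviation) learned from normal activity."""
    profile = {}
    for measure in normal_samples[0]:
        values = [s[measure] for s in normal_samples]
        profile[measure] = (mean(values), stdev(values))
    return profile


def deviations(profile, sample, threshold=3.0):
    """Return the measures lying more than threshold std-devs from normal."""
    flagged = []
    for measure, (mu, sigma) in profile.items():
        if sigma > 0 and abs(sample[measure] - mu) / sigma > threshold:
            flagged.append(measure)
    return flagged


normal = [
    {"cpu": 20, "process_size": 30},
    {"cpu": 25, "process_size": 35},
    {"cpu": 22, "process_size": 32},
    {"cpu": 23, "process_size": 31},
]
profile = build_profile(normal)
flagged = deviations(profile, {"cpu": 85, "process_size": 33})
```

Lowering the threshold catches more intrusions but also flags more new-but-legitimate activity, which is the false positive trade-off noted above.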

40

ADAM: Audit Data Analysis and Mining

Detecting Intrusion by Data Mining
Combination of association rules and classification rules

First, ADAM collects known frequent datasets using an off-line algorithm

Second, ADAM runs an online algorithm
Finds the latest frequent connection records
Compares them with the known mined data
Discards those that seem to be normal
Forwards the suspicious ones to the classifier

The trained classifier then classifies the suspicious data as one of the following
Known type of attack
Unknown type of attack
False alarm

41

ADAM Detecting Intrusion by Data Mining

42

ADAM: Audit Data Analysis and Mining

ADAM has two phases in its model

1st Phase: train the classifier
Offline process
Takes place only once, before the main experiment

2nd Phase: use the trained classifier
The trained classifier is then used to detect anomalies
Online process

43

The MINDS Project

MINDS - MINnesota INtrusion Detection System

Learning from Rare Class - Building rare class prediction models

Anomaly/outlier detection

Summarization of attacks using association pattern analysis

TID   Items
1     Bread, Coke, Milk
2     Beer, Bread
3     Beer, Coke, Diaper, Milk
4     Beer, Bread, Diaper, Milk
5     Coke, Diaper, Milk

Rules Discovered:
{Milk} --> {Coke}
{Diaper, Milk} --> {Beer}
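The two rules can be checked against the five transactions in the table by computing their support and confidence. The helper names below are illustrative; the data is copied from the table.

```python
transactions = {
    1: {"Bread", "Coke", "Milk"},
    2: {"Beer", "Bread"},
    3: {"Beer", "Coke", "Diaper", "Milk"},
    4: {"Beer", "Bread", "Diaper", "Milk"},
    5: {"Coke", "Diaper", "Milk"},
}


def count(itemset):
    """Number of transactions that contain every item in itemset."""
    return sum(1 for items in transactions.values() if itemset <= items)


def support(lhs, rhs):
    """Fraction of all transactions that contain both sides of the rule."""
    return count(lhs | rhs) / len(transactions)


def confidence(lhs, rhs):
    """Among transactions containing lhs, the fraction that also contain rhs."""
    return count(lhs | rhs) / count(lhs)


milk_coke = (support({"Milk"}, {"Coke"}), confidence({"Milk"}, {"Coke"}))
diaper_milk_beer = (
    support({"Diaper", "Milk"}, {"Beer"}),
    confidence({"Diaper", "Milk"}, {"Beer"}),
)
```

{Milk} --> {Coke} holds in 3 of the 4 transactions containing Milk, and {Diaper, Milk} --> {Beer} holds in 2 of the 3 transactions containing both Diaper and Milk.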

44

MINDS - Learning from Rare Class

Problem: building models for rare network attacks (mining a needle in a haystack)

Standard data mining models are not suitable for rare classes

Models must be able to handle skewed class distributions

Learning from data streams - intrusions are sequences of events

45

MINDS - Anomaly Detection

Detect novel attacks/intrusions by identifying them as deviations from "normal", i.e., anomalous behavior

Identify normal behavior

Construct useful set of features

Define similarity function

Use outlier detection algorithm

Nearest neighbor approach

Density based schemes

Unsupervised Support Vector Machines (SVM)
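To illustrate the nearest neighbor approach in general terms, the sketch below scores each connection by the distance to its k-th nearest neighbor, so isolated points get high scores. This is not the MINDS algorithm itself; the two features per flow, the Euclidean similarity function, and k = 2 are assumptions for the example.

```python
import math


def knn_outlier_scores(points, k=2):
    """Score each point by its distance to its k-th nearest neighbor."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores


# Hypothetical flows described by (number of packets, kilobytes transferred).
flows = [(3, 1.2), (3, 1.5), (4, 1.4), (3, 1.3), (40, 90.0)]
scores = knn_outlier_scores(flows)
most_anomalous = scores.index(max(scores))
```

Ranking connections by such a score is what lets an analyst look only at the top few, rather than at every connection.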

46

Experimental Evaluation

Figure: network → net-flow data using CISCO routers → data preprocessing → MINDS anomaly detection → anomaly scores → association pattern analysis

Open source signature-based network IDS: www.snort.org

10-minute cycle, 2 million connections

Anomaly detection is applied 4 times a day on a 10-minute time window

• Publicly available data set: DARPA 1998 Intrusion Detection Evaluation Data Set, prepared and managed by MIT Lincoln Lab; includes a wide variety of intrusions simulated in a military network environment

• Real network data from the University of Minnesota

47

MINDS - Framework for Mining Associations

Figure: MINDS association analysis module. The anomaly detection system produces ranked connections, labeled attack or normal. A discriminating association pattern generator turns them into rules (R1: TCP, DstPort=1863 → Attack; …; R100: TCP, DstPort=80 → Normal) that update the knowledge base.

1. Build normal profile

2. Study changes in normal behavior

3. Create attack summary

4. Detect misuse behavior

5. Understand nature of the attack

48

Discovered Real-life Association Patterns

Rule 1: SrcIP=XXXX, DstPort=80, Protocol=TCP, Flag=SYN, NoPackets: 3, NoBytes: 120…180 (c1=256, c2=1)

Rule 2: SrcIP=XXXX, DstIP=YYYY, DstPort=80, Protocol=TCP, Flag=SYN, NoPackets: 3, NoBytes: 120…180 (c1=177, c2=0)

At first glance, Rule 1 appears to describe a Web scan

Rule 2 indicates an attack on a specific machine

Both rules together indicate that a scan is performed first, followed by an attack on a specific machine identified as vulnerable by the attacker

49

Discovered Real-life Association Patterns

DstIP=ZZZZ, DstPort=8888, Protocol=TCP (c1=369, c2=0)

DstIP=ZZZZ, DstPort=8888, Protocol=TCP, Flag=SYN (c1=291, c2=0)

This pattern indicates an anomalously high number of TCP connections on port 8888 involving machine ZZZZ

Follow-up analysis of connections covered by the pattern indicates that this could be a machine running a variation of the Kazaa file-sharing protocol

Having an unauthorized application increases the vulnerability of the system

50

Discovered Real-life Association Patterns… (ctd)

SrcIP=XXXX, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=4, NoBytes=189…200 (c1=582, c2=2)

SrcIP=XXXX, DstPort=12345, NoPackets=4, NoBytes=189…200 (c1=580, c2=3)

SrcIP=YYYY, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=3, NoBytes=144 (c1=694, c2=3)

……

This pattern indicates a large number of scans on ports 27374 (which is a signature for the SubSeven worm) and 12345 (which is a signature for the NetBus worm)

Further analysis showed that no fewer than five machines were scanning for one or both of these ports in any time window

51

Discovered Real-life Association Patterns… (ctd)

DstPort=6667, Protocol=TCP (c1=254, c2=1)

This pattern indicates an unusually large number of connections on port 6667 detected by the anomaly detector

Port 6667 is where IRC (Internet Relay Chat) is typically run

Further analysis reveals that there are many small packets from/to various IRC servers around the world

Although IRC traffic is not unusual, the fact that it is flagged as anomalous is interesting

This might indicate that the IRC server has been taken down (by a DOS attack, for example) or that it is a rogue IRC server (it could be involved in some hacking activity)

52

Discovered Real-life Association Patterns… (ctd)

DstPort=1863, Protocol=TCP, Flag=0, NoPackets=1, NoBytes<139 (c1=498, c2=6)

DstPort=1863, Protocol=TCP, Flag=0 (c1=587, c2=6)

DstPort=1863, Protocol=TCP (c1=606, c2=8)

This pattern indicates a large number of anomalous TCP connections on port 1863

Further analysis reveals that the remote IP block is owned by Hotmail

Flag=0 is unusual for TCP traffic

53

MINDS Conclusion

Data mining based algorithms are capable of detecting intrusions that cannot be detected by state-of-the-art signature-based methods

SNORT has static knowledge, manually updated by human analysts

MINDS anomaly detection algorithms are adaptive in nature

MINDS anomaly detection algorithms can also be effective in detecting anomalous behavior originating from a compromised or infected machine

MINDS Research

Defining normal behavior
Feature extraction
Similarity functions
Outlier detection
Result summarization
Detection of attacks originating from multiple sites

Worm/virus detection after infection

Insider attack: policy violation

Outsider attack: network intrusion

54

IDS Using Both Misuse and Anomaly Detection: RIDS-100

RIDS (Rising Intrusion Detection System) is provided by Rising Tech, a leader in antivirus and content security software and services in China

The company is a leading provider of client, gateway, and server security solutions for virus protection, firewall, and intrusion detection technologies, and of security services to enterprises and service providers around China

RIDS makes use of both intrusion detection techniques: misuse detection and anomaly detection

A distance-based outlier detection algorithm is used to detect deviant behavior in the collected network data

For misuse detection, it has a very large set of collected data patterns that can be matched against scanned network data

This large set of data patterns is scanned using a data mining classification algorithm, the decision tree

http://www.rising-global.com

55

A cooperative anomaly and intrusion detection system (CAIDS)

Built with a network-based intrusion detection system (NIDS) and an anomaly detection system (ADS) operating interactively through a signature generator

56

A cooperative anomaly and intrusion detection system (CAIDS)

A frequent episode rule (FER) is generated out of a collection of frequent episodes. The FER is defined over episode sequences with multiple connection events.

For example, we envision a window where we observe a 3-event sequence: E, D, and F. An FER is generated as E → D, F with confidence level freq(a ∪ b)/freq(b) = 0.8, where a represents the event E on the LHS and b corresponds to the two events D and F on the RHS of the rule.

If b occurs with 5% and the joint event a and b has 4% to occur, there is a (0.04/0.05) = 80% chance that D and F will follow in the same window.

57

A cooperative anomaly and intrusion detection system (CAIDS)

In practice, the event E could be an authentication service characterized by two attributes (service = authentication, flag = SF). The events D and F may be two sequential smtp requests denoted by (service = smtp).

Thus we can derive an FER with a confidence level of c = 80% that two smtp services will follow the authentication service within a window w = 2 sec. The three joint traffic events account for a support level s = 10% out of all the network connections being evaluated. This FER is formally stated as follows:

(service = authentication) → (service = smtp), (service = smtp) (0.8, 0.1, 2 sec)   (1)

58

A cooperative anomaly and intrusion detection system (CAIDS)

An association rule is aimed at finding interesting intra-relationships inside a single connection record

In general, an FER is specified by the following expression:

L1, L2, …, Ln → R1, …, Rm (c, s, window)   (2)

Li (1 ≤ i ≤ n) and Rj (1 ≤ j ≤ m) are ordered traffic connection events

We call L1, L2, …, Ln the LHS episode and R1, …, Rm the RHS of the episode rule
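One possible way to hold an FER of form (2) in code, and to test whether a stream of connection events contains the LHS episode followed by the RHS within the window, is sketched below. The class and function names are hypothetical, and for brevity each event carries only a timestamp and a service name (the flag attribute is left out).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FER:
    lhs: tuple         # ordered LHS events L1..Ln
    rhs: tuple         # ordered RHS events R1..Rm
    confidence: float  # c
    support: float     # s
    window: float      # window length in seconds


def matches(rule, events):
    """True if the LHS events followed by the RHS events occur in order within
    rule.window seconds. events: (timestamp, service) pairs sorted by time."""
    episode = rule.lhs + rule.rhs
    for start, (t0, service) in enumerate(events):
        if service != episode[0]:
            continue
        pos = 1  # next episode event still to be matched
        for t, s in events[start + 1:]:
            if t - t0 > rule.window:
                break  # left the window that started at t0
            if pos < len(episode) and s == episode[pos]:
                pos += 1
        if pos == len(episode):
            return True
    return False


# The FER stated as expression (1).
rule = FER(lhs=("authentication",), rhs=("smtp", "smtp"),
           confidence=0.8, support=0.1, window=2.0)
seen = matches(rule, [(0.0, "authentication"), (0.4, "smtp"),
                      (1.1, "http"), (1.7, "smtp")])
```

Unrelated events inside the window (the http request here) are skipped, since an episode only constrains the order of its own events.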

59

A cooperative anomaly and intrusion detection system (CAIDS)

Architecture of the CAIDS simulator, built with a 2000-signature Snort and an anomaly detection subsystem (ADS) with 60 FERs after 2 weeks of rule training over the Lincoln Lab IDS evaluation dataset

60

Conclusion

In this report we have studied the basic concepts and some classic system models in this area, such as ADAM and MINDS

We have summarized those system models, their technologies, and their validation methods

We hope to have given an overview of current developments in this area and of how data mining is evolving into the field of network intrusion detection

61

Reference

DARPA 1998 data set. A cleansed set in KDDCup'99; DARPA 1991 data set is also available. http://www.ll.mit.edu/IST/ideval/data/data_index.html

Daniel Barbara, Julia Couto, Sushil Jajodia, Leonard Popyack, Ningning Wu, "ADAM: Detecting Intrusions by Data Mining", Proceedings of the 2001 IEEE Workshop on Information Assurance and Security, United States Military Academy, West Point, NY, 5-6 June 2001

Zhang, J. and Zulkernine, M. 2006. A Hybrid Network Intrusion Detection Technique Using Random Forests. In Proceedings of the First International Conference on Availability, Reliability and Security (April 20 - 22, 2006)

W. Lee et al., A data mining framework for building intrusion detection models. In Information and System Security, Vol. 3, No. 4, 2000

Ertoz, L. et al., MINDS - Minnesota Intrusion Detection System. Next Generation Data Mining, Chapter 3, 2004

Lu, C.-T., Boedihardjo, A.P., Manalwar, P., Exploiting efficient data mining techniques to enhance intrusion detection systems. Information Reuse and Integration (IRI-2005), IEEE International Conference on, 15-17 Aug. 2005, Page(s) 512 - 517

Sal Stolfo, Andreas Prodromidis, Shelley Tselepis, Wenke Lee, Dave Fan, and Phil Chan (Honorable mention (runner-up) for Best Paper Award in Applied Research Category). In Proceedings of the Third International Conference on Knowledge Discovery and Data Mining (KDD 97), Newport Beach, CA, August 1997

62

Questions & Comments

11

Why do we need Data Mining?

Despite the enormous amount of data, particular events of interest are still quite rare: frequency ranges from 0.1% to less than 10%

We are drowning in data, but starving for knowledge!

12

Data Mining vs KDD

Knowledge Discovery in Databases (KDD): the whole process of finding useful information and patterns in data

Data Mining: use of algorithms to extract the information and patterns derived by the KDD process

Data mining is the core of the knowledge discovery process

13

KDD Process

Selection: obtain data from various sources
Preprocessing: cleanse data
Transformation: convert to common format; transform to new format
Data Mining: obtain desired results
Interpretation/Evaluation: present results to user in meaningful manner

14

Data Mining: A KDD Process

- Data mining: core of knowledge discovery process

Figure: Databases → Data Cleaning and Data Integration → Data Warehouse → Selection → Task-relevant Data → Data Mining → Pattern Evaluation

15

Typical Data Mining Architecture

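Figure: layered architecture, from bottom to top: Databases and Data Warehouse (fed through data cleaning & data integration and filtering); Database or data warehouse server; Data mining engine; Pattern evaluation; Graphical user interface. A Knowledge-base supports the data mining engine and pattern evaluation.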

16

Outline

Intrusion Detection

Data Mining

Data Mining in Intrusion Detection

Reference

17

Network intrusion detection

Number of intrusions on the network is typically a very small fraction of the total network traffic

18

Why Can Data Mining Help

Learn from traffic data Supervised learning learn precise models from past intrusions Unsupervised learning identify suspicious activities

Maintain models on dynamic data

Correlation of suspicious events across network sites Helps detect sophisticated attacks not identifiable by single site analyses

Analysis of long term data (monthsyears) Uncover suspicious stealth activities (eg insiders leakingmodifying

information)

19

Intrusion Detection

Traditional intrusion detection system IDS tools (eg SNORT) are based on signatures of known attacks

LimitationsSignature database has to be manually revised

for each new type of discovered intrusionThey cannot detect emerging cyber threatsSubstantial latency in deployment of newly created

signatures across the computer system

20

Data Mining for Intrusion Detection Techniques and Applications

Frequent pattern mining Classification Clustering Mining data streams

21

Patterns that occur frequently in a database

Mining Frequent patterns ndash finding regularities

Process of Mining Frequent patterns for intrusion de

tection Phase I mine a repository of normal frequent itemsets for a

ttack-free data

Phase II find frequent itemsets in the last n connections an

d compare the patterns to the normal profile

Frequent pattern mining

22

Frequent pattern mining

Apriori bull Any subset of a frequent itemset must be also freque

nt mdash an anti-monotone propertyndash A transaction containing beer diaper nuts alsocontains beer diaperndash beer diaper nuts is frequent beer diaper mustalso be frequent

bull No superset of any infrequent itemset should be generated or tested

ndash Many item combinations can be pruned

23

Sequential Pattern Analysis

Models sequence patterns (Temporal) order is important in many situations

Time-series databases and sequence databases

Frequent patterns (frequent) sequential patterns

Sequential patterns for intrusion detection Capture the signatures for attacks in a series of packets

24

Sequential Pattern Mining

Given a set of sequences find the complete set of frequent subsequences

25

Apriori Property in Sequences

26

Classification A Two-Step Process Model construction describe a set of predetermined

classes Training dataset tuples for model construction

Each tuplesample belongs to a predefined class Classification rules decision trees or math formulae

Model application classify unseen objects Estimate accuracy of the model using an independent test

set Acceptable accuracy apply the model to classify data tu

ples with unknown class labels

27

Classification

28

Classification Decision Tree

A node in the tree a test of some attribute A branch a possible value of the attribute Classification

Start at the root Test the attribute Move down the tree branch

29

Neural classification HIDE

ldquoA hierarchical network intrusion detection system using statistical processing and neural network classificationrdquo by Zheng et al

Five major components Probes collect traffic data Event preprocessor preprocesses traffic data and feeds the

statistical model Statistical processor maintains a model for normal activities

and generates vectors for new events Neural network classifies the vectors of new events Post processor generates reports

30

Clustering

What Is Clustering Group data into clusters

ndash Similar to one another within the same cluster ndash Dissimilar to the objects in other clusters ndash Unsupervised learning no predefined classes

31

Clustering

What Is A Good Clustering High intra-class similarity and low interclasssimilar

ity Depending on the similarity measure

The ability to discover some or all of the hidden patterns

32

Clustering

Clustering Approaches Partitioning algorithms

ndash Partition the objects into k clusters ndash Iteratively reallocate objects to improve the clusterin

g Hierarchy algorithms

ndash Agglomerative each object is a cluster merge clusters to form larger ones

ndash Divisive all objects are in a cluster split it up into smaller clusters

33

Clustering

K-Means Example

34

Mining Data Streams for Intrusion Detection

Maintaining profiles of normal activities The profiles of normal activities may drift

Identifying novel attacks Identifying clusters and outliers in traffic data

streams Reduce the future alarm load by writing

filtering rules that automatically discard well-understood false positives

35

Data Mining for Intrusion Detection

Misuse detectionPredictive models are built from labeled data sets (instances

are labeled as ldquonormalrdquo or ldquointrusiverdquo)These models can be more sophisticated and precise than manually

created signatures Recent research eg JAM (Java Agents for Metalearning)

36

Misuse Detection

Intrusion Patterns

activities

pattern matching

intrusion

Canrsquot detect new attacks

Example if (src_ip == dst_ip) then ldquoland attackrdquo

look for known indicators ICMP Scans port scans connection attempts CPU RAM IO Utilization File system activity modification of system files permission modifications

37

JAM (Java Agents for Metalearning)

JAM (developed at Columbia University) uses data mining techniques to discover patterns of intrusions It then applies a meta-learning classifier to learn the signature of attacks

The association rules algorithm determines relationships between fields in the audit trail records and the frequent episodes algorithm models sequential patterns of audit events Features are then extracted from both algorithms and used to compute models of intrusion behavior

The classifiers build the signature of attacks So thus data mining in JAM builds misuse detection model

Classifiers in the JAM are generated by using rule learning program on training data of system usage After training resulting classification rules is used to recognize anomalies and detect known intrusions

The system has been tested with data from Sendmail-based attacks and with network attacks using TCP dump data

38

Data Mining for Intrusion Detection

Anomaly detection Identifies anomalies as deviations from ldquonormalrdquo behavior Eg ADAM Audit Data Analysis and Mining MINDS ndash MINnesota INt

rusion Detection System

39

Anomaly Detection

activity measures

0102030405060708090

CPU ProcessSize

normal profileabnormal

probable intrusion

Relatively high false positive rate - anomalies can just be new normal activities

baseline the normal traffic and then look for things that are out of the norm

40

ADAM Audit Data Analysis and Mining

Detecting Intrusion by Data MiningCombination of Association Rule and Classification Rule Firstly ADAM collects known frequent datasetsan off-line

algorithm Secondly ADAM runs an online algorithm

Finds last frequent connection records Compare them with known mined data Discards those which seems to be normal Suspicious ones are forwarded to the classifier Trained classifier then classify the suspicious data as one

of the following Known type of attack Unknown type of attack False alarm

41

ADAM Detecting Intrusion by Data Mining

42

ADAM Audit Data Analysis and Mining

ADAM has two phases in their model

1st Phase Train the classifier Offline process Takes place only once Before the main experiment

2nd Phase Using the trained classifier Trained classifier is then used to detect anomalies Online process

43

The MINDS Project

MINDS ndash MINnesota INtrusion Detection System

Learning from Rare Class ndash Building rare class prediction models

Anomalyoutlier detection

Summarization of attacks using association pattern analysis

TID Items

1 Bread Coke Milk

2 Beer Bread

3 Beer Coke Diaper Milk

4 Beer Bread Diaper Milk

5 Coke Diaper Milk

Rules Discovered Milk --gt Coke Diaper Milk --gt Beer

Rules Discovered Milk --gt Coke Diaper Milk --gt Beer

44

MINDS - Learning from Rare Class

Problem Building models for rare network attacks (Mining needle in a haystack)

Standard data mining models are not suitable for rare classes

Models must be able to handle skewed class distributions

Learning from data streams - intrusions are sequences of events

45

MINDS - Anomaly Detection

Detect novel attacksintrusions by identifying them as deviations from ldquonormalrdquo ie anomalous behaviorIdentify normal behaviorConstruct useful set of featuresDefine similarity functionUse outlier detection algorithm

Nearest neighbor approach

Density based schemes

Unsupervised Support Vector Machines (SVM)

46

Experimental Evaluation

network

net-flow data using CISCO

routers

Data preprocessing

MINDSanomaly detection

helliphellip

Anomaly

scores Association pattern analysis

Open source signature-based network IDS wwwsnortor

g

10 minutes cycle

2 millions connections

Anomaly detection is applied

4 times a day

10 minutes time window

bull Publicly available data setDARPA 1998 Intrusion Detection Evaluation Data Set prepared and managed by MIT Lincoln Lab includes a wide variety of intrusions simulated in a military network environment

bull Real network data from University of Minnesota

47

MINDS - Framework for Mining Associations

Anomaly Detection System

attack

normal

R1 TCP DstPort=1863 Attack

hellip

hellip

hellip

hellip

R100 TCP DstPort=80 Normal

Discriminating Association

Pattern Generator

1 Build normal profile

2 Study changes in normal behavior

3 Create attack summary

4 Detect misuse behavior

5 Understand nature of the attack

update

Knowledge Base

Ranked connections

MINDS association analysis module

48

Discovered Real-life Association Patterns

At first glance Rule 1 appears to describe a Web scan Rule 2 indicates an attack on a specific machine Both rules together indicate that a scan is performed first

followed by an attack on a specific machine identified as vulnerable by the attacker

Rule 1 SrcIP=XXXX DstPort=80 Protocol=TCP Flag=SYN NoPackets 3 NoBytes120hellip180 (c1=256 c2 = 1)

Rule 2 SrcIP=XXXX DstIP=YYYY DstPort=80 Protocol=TCPFlag=SYN NoPackets 3 NoBytes 120hellip180 (c1=177 c2 = 0)

49

DstIP=ZZZZ DstPort=8888 Protocol=TCP (c1=369 c2=0)DstIP=ZZZZ DstPort=8888 Protocol=TCP Flag=SYN (c1=291 c2=0)

This pattern indicates an anomalously high number of TCP connections on port 8888 involving machine ZZZZ

Follow-up analysis of connections covered by the pattern indicates that this could be a machine running a variation of the Kazaa file-sharing protocol

Having an unauthorized application increases the vulnerability of the system

Discovered Real-life Association Patterns

50

SrcIP=XXXX DstPort=27374 Protocol=TCP Flag=SYN NoPackets=4 NoBytes=189hellip200 (c1=582 c2=2)

SrcIP=XXXX DstPort=12345 NoPackets=4 NoBytes=189hellip200 (c1=580 c2=3)

SrcIP=YYYY DstPort=27374 Protocol=TCP Flag=SYN NoPackets=3 NoBytes=144 (c1=694 c2=3)

helliphellip This pattern indicates a large number of scans on ports 27374 (which is a signature for the SubSeven worm) and 12345 (which is a signature for NetBus worm)

Further analysis showed that no fewer than five machines scanning for one or both of these ports in any time window

Discovered Real-life Association Patternshellip(ctd)

51

DstPort=6667 Protocol=TCP (c1=254 c2=1)

This pattern indicates an unusually large number of connections on port 6667 detected by the anomaly detector

Port 6667 is where IRC (Internet Relay Chat) is typically run Further analysis reveals that there are many small packets

fromto various IRC servers around the world Although IRC traffic is not unusual the fact that it is flagged

as anomalous is interesting This might indicate that the IRC server has been taken down (by a

DOS attack for example) or it is a rogue IRC server (it could be involved in some hacking activity)

Discovered Real-life Association Patternshellip(ctd)

52

DstPort=1863 Protocol=TCP Flag=0 NoPackets=1 NoByteslt139 (c1=498 c2=6)

DstPort=1863 Protocol=TCP Flag=0 (c1=587 c2=6)

DstPort=1863 Protocol=TCP (c1=606 c2=8)

This pattern indicates a large number of anomalous TCP connections on port 1863

Further analysis reveals that the remote IP block is owned by Hotmail

Flag=0 is unusual for TCP traffic

Discovered Real-life Association Patternshellip(ctd)

53

MINDS Conclusion Data mining based algorithms are capable of detecting intrusions that cannot be

detected by state-of-the-art signature based methods

SNORT has static knowledge manually updated by human analysts

MINDS anomaly detection algorithms are adaptive in nature

MINDS anomaly detection algorithms can also be effective in detecting anomalous behavior originating from a compromised or infected machine

MINDS Research Defining normal behavior Feature extraction Similarity functions Outlier detection Result summarization Detection of attacks

originating from multiple sites

Wormvirus detectionafter infection

Insider attack Policy violation

Outsider attack Network intrusion

54

IDS Using both Misuse and Anomaly DetectionRIDS-100

RIDS( Rising Intrusion Detection System) is provided by Rising Tech It is a leader in antivirus and content security software and services in China

The company is a leading provider of client gateway and server security solutions for virus protection firewall and intrusion detection technologies and security services to enterprises and service providers around China

RIDS make the use of both intrusion detection technique misuse and anomaly detection

Distance based outlier detection algorithm is used for detection deviational behavior among collected network data

For misuse detection it has very vast set of collected data pattern which can be matched with scanned network data for misuse detection

This large amount of data pattern is scanned using data mining classification Decision Tree algorithm

httpwwwrising-globalcom

55

A cooperative anomaly and intrusiondetection system (CAIDS)

built with a network-based intrusion detection system (NIDS) and an anomaly detection system (ADS) operating interactively through a signature generator

56

A cooperative anomaly and intrusiondetection system (CAIDS)

A frequent episode rule (FER) is generated out of a collection of frequent episodes The FER is defined over episode sequences with multiple connection events

For an example we envision a window where we observe a 3-event sequence

E D and F An FER is generated as E rarr D F confidence level freq (a U b)freq (b)=08 where a represents the event E on the LHS and b corresponds t

o the two events D and F on the RHS of the rule

If the b occurs with 5 and the joint event a and b has 4 to occur there is a (004005) = 80 chance that D and F will follow in the same window

A Cooperative Anomaly and Intrusion Detection System (CAIDS)

In practice, the event E could be an authentication service characterized by two attributes (service = authentication, flag = SF). The events D and F may be two sequential smtp requests, denoted by (service = smtp). Thus we can derive an FER with a confidence level of c = 80% that two smtp services will follow the authentication service within a window w = 2 sec. The three joint traffic events account for a support level s = 10% of all the network connections being evaluated. This FER is formally stated as follows:

$(\text{service} = \text{authentication}) \rightarrow (\text{service} = \text{smtp}), (\text{service} = \text{smtp}) \quad (0.8,\ 0.1,\ 2\ \text{sec}) \qquad (1)$

A Cooperative Anomaly and Intrusion Detection System (CAIDS)

An association rule is aimed at finding interesting intra-relationships inside a single connection record. An FER, by contrast, is defined over multiple connection events.

In general, an FER is specified by the following expression:

$L_1, L_2, \ldots, L_n \rightarrow R_1, \ldots, R_m \quad (c,\ s,\ \text{window}) \qquad (2)$

$L_i$ ($1 \le i \le n$) and $R_j$ ($1 \le j \le m$) are ordered traffic connection events.

We call $L_1, L_2, \ldots, L_n$ the LHS episode and $R_1, \ldots, R_m$ the RHS episode of the rule.
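
One way to hold such a rule in code is sketched below. The class name, field names and the `matches` helper are hypothetical; the sketch only checks whether the LHS events followed by the RHS events occur in order within the rule's time window of a timestamped trace.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass(frozen=True)
class FrequentEpisodeRule:
    lhs: Tuple[str, ...]      # L1 ... Ln
    rhs: Tuple[str, ...]      # R1 ... Rm
    confidence: float         # c
    support: float            # s
    window_sec: float         # window

    def matches(self, events: List[Tuple[float, str]]) -> bool:
        """events: (timestamp, label) pairs sorted by timestamp.
        True if lhs followed by rhs occurs in order within window_sec."""
        episode = self.lhs + self.rhs
        for start, (t0, label) in enumerate(events):
            if label != episode[0]:
                continue
            matched = 1
            for t, lab in events[start + 1:]:
                if t - t0 > self.window_sec:
                    break  # later events fall outside the window
                if matched < len(episode) and lab == episode[matched]:
                    matched += 1
            if matched == len(episode):
                return True
        return False

# The rule from expression (1): authentication -> smtp, smtp (0.8, 0.1, 2 sec)
rule = FrequentEpisodeRule(
    lhs=("service=authentication",),
    rhs=("service=smtp", "service=smtp"),
    confidence=0.8, support=0.1, window_sec=2.0,
)
trace = [(0.0, "service=authentication"), (0.5, "service=smtp"),
         (1.4, "service=smtp"), (5.0, "service=http")]
print(rule.matches(trace))
```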

A Cooperative Anomaly and Intrusion Detection System (CAIDS)

[Figure: Architecture of the CAIDS simulator, built with a 2000-signature Snort and an anomaly detection subsystem (ADS) with 60 FERs after 2 weeks of rule training over the Lincoln Lab IDS evaluation dataset]

Conclusion

In this report we have studied the basic concepts of this area and some classic system models, such as ADAM and MINDS.

We have summarized those system models, their technologies, and their validation methods.

We hope this gives an overview of current developments in the area and of how data mining is evolving in the field of network intrusion detection.

Reference

DARPA 1998 data set. A cleansed set is used in KDD Cup '99; the DARPA 1999 data set is also available. http://www.ll.mit.edu/IST/ideval/data/data_index.html

Daniel Barbara, Julia Couto, Sushil Jajodia, Leonard Popyack, Ningning Wu. “ADAM: Detecting Intrusions by Data Mining.” Proceedings of the 2001 IEEE Workshop on Information Assurance and Security, United States Military Academy, West Point, NY, 5-6 June 2001.

Zhang, J. and Zulkernine, M. 2006. “A Hybrid Network Intrusion Detection Technique Using Random Forests.” In Proceedings of the First International Conference on Availability, Reliability and Security (April 20-22, 2006).

W. Lee et al. “A data mining framework for building intrusion detection models.” In Information and System Security, Vol. 3, No. 4, 2000.

Ertoz, L. et al. “MINDS - Minnesota Intrusion Detection System.” Next Generation Data Mining, Chapter 3, 2004.

Lu, C.-T., Boedihardjo, A. P., Manalwar, P. “Exploiting efficient data mining techniques to enhance intrusion detection systems.” IEEE International Conference on Information Reuse and Integration (IRI-2005), 15-17 Aug. 2005, pp. 512-517.

Sal Stolfo, Andreas Prodromidis, Shelley Tselepis, Wenke Lee, Dave Fan, and Phil Chan. (Honorable mention (runner-up) for Best Paper Award in Applied Research Category.) In Proceedings of the Third International Conference on Knowledge Discovery and Data Mining (KDD 97), Newport Beach, CA, August 1997.

Questions & Comments

Data Mining vs. KDD

• Knowledge Discovery in Databases (KDD): the whole process of finding useful information and patterns in data

• Data Mining: the use of algorithms to extract the information and patterns derived by the KDD process

• Data mining is the core of the knowledge discovery process

KDD Process

• Selection: obtain data from various sources
• Preprocessing: cleanse data
• Transformation: convert to a common format; transform to a new format
• Data Mining: obtain desired results
• Interpretation/Evaluation: present results to the user in a meaningful manner

Data Mining: A KDD Process

– Data mining: core of the knowledge discovery process

[Figure: KDD process flow: Databases, Data Cleaning and Data Integration, Data Warehouse, Selection, Task-relevant Data, Data Mining, Pattern Evaluation]

Typical Data Mining Architecture

[Figure: layered architecture with these components: Databases and Data Warehouse (data cleaning, data integration, and filtering); database or data warehouse server; data mining engine; pattern evaluation; graphical user interface; knowledge base]

Outline

Intrusion Detection

Data Mining

Data Mining in Intrusion Detection

Reference

Network intrusion detection

The number of intrusions on the network is typically a very small fraction of the total network traffic

Why Can Data Mining Help?

• Learn from traffic data
  – Supervised learning: learn precise models from past intrusions
  – Unsupervised learning: identify suspicious activities
• Maintain models on dynamic data
• Correlation of suspicious events across network sites
  – Helps detect sophisticated attacks not identifiable by single-site analyses
• Analysis of long-term data (months/years)
  – Uncover suspicious stealth activities (e.g. insiders leaking/modifying information)

Intrusion Detection

Traditional intrusion detection system (IDS) tools (e.g. SNORT) are based on signatures of known attacks

Limitations
• The signature database has to be manually revised for each new type of discovered intrusion
• They cannot detect emerging cyber threats
• Substantial latency in deployment of newly created signatures across the computer system

Data Mining for Intrusion Detection: Techniques and Applications

• Frequent pattern mining
• Classification
• Clustering
• Mining data streams

Frequent pattern mining

• Patterns that occur frequently in a database
• Mining frequent patterns – finding regularities
• Process of mining frequent patterns for intrusion detection
  – Phase I: mine a repository of normal frequent itemsets for attack-free data
  – Phase II: find frequent itemsets in the last n connections and compare the patterns to the normal profile
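
The two phases can be sketched as follows. This is a brute-force illustration, assuming each connection is a set of attribute=value strings; the records, thresholds and function name are hypothetical, and itemsets are enumerated directly rather than with a pruning algorithm.

```python
from collections import Counter
from itertools import combinations

def frequent_itemsets(records, min_support, max_size=2):
    """Itemsets (frozensets) contained in at least min_support of the records."""
    counts = Counter()
    for record in records:
        items = sorted(record)
        for size in range(1, max_size + 1):
            for combo in combinations(items, size):
                counts[frozenset(combo)] += 1
    threshold = min_support * len(records)
    return {itemset for itemset, c in counts.items() if c >= threshold}

# Phase I: build the normal profile from attack-free connections.
normal = [
    {"proto=tcp", "port=80"}, {"proto=tcp", "port=80"},
    {"proto=tcp", "port=443"}, {"proto=udp", "port=53"},
]
profile = frequent_itemsets(normal, min_support=0.25)

# Phase II: mine the last n connections and keep what the profile lacks.
recent = [{"proto=tcp", "port=31337"}] * 3 + [{"proto=tcp", "port=80"}]
suspicious = frequent_itemsets(recent, min_support=0.5) - profile
print(suspicious)
```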

Frequent pattern mining

Apriori
• Any subset of a frequent itemset must also be frequent (an anti-monotone property)
  – A transaction containing {beer, diaper, nuts} also contains {beer, diaper}
  – If {beer, diaper, nuts} is frequent, {beer, diaper} must also be frequent
• No superset of any infrequent itemset should be generated or tested
  – Many item combinations can be pruned
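
A compact sketch of the Apriori idea: candidates of size k+1 are built only from frequent k-itemsets, and any candidate with an infrequent subset is pruned before it is counted. The basket data and the `min_count` threshold are made up for illustration.

```python
from itertools import combinations

def apriori(transactions, min_count):
    """Return {itemset: count} for every itemset found in at least
    min_count transactions."""
    transactions = [frozenset(t) for t in transactions]
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while candidates:
        # Count each surviving candidate against the transactions.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_count}
        frequent.update(level)
        # Join: combine frequent k-itemsets into (k+1)-itemsets.
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune: drop candidates that have an infrequent (k-1)-subset.
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent

baskets = [
    {"beer", "diaper", "nuts"},
    {"beer", "diaper"},
    {"beer", "nuts"},
    {"diaper", "milk"},
]
result = apriori(baskets, min_count=2)
for itemset, count in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

Here {beer, diaper, nuts} is never counted, because its subset {diaper, nuts} is infrequent.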

Sequential Pattern Analysis

• Models sequence patterns
• (Temporal) order is important in many situations
  – Time-series databases and sequence databases
  – Frequent patterns → (frequent) sequential patterns
• Sequential patterns for intrusion detection
  – Capture the signatures for attacks in a series of packets

Sequential Pattern Mining

Given a set of sequences, find the complete set of frequent subsequences

Apriori Property in Sequences

Classification: A Two-Step Process

• Model construction: describe a set of predetermined classes
  – Training dataset: tuples for model construction
  – Each tuple/sample belongs to a predefined class
  – Classification rules, decision trees, or math formulae
• Model application: classify unseen objects
  – Estimate the accuracy of the model using an independent test set
  – If the accuracy is acceptable, apply the model to classify data tuples with unknown class labels

Classification

Classification: Decision Tree

• A node in the tree: a test of some attribute
• A branch: a possible value of the attribute
• Classification
  – Start at the root
  – Test the attribute
  – Move down the tree branch
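
The traversal described above can be sketched with a nested dictionary standing in for the tree. The attributes, branch values and class labels below are hypothetical and hand-written; no tree is learned here.

```python
def classify(node, record):
    """Walk from the root to a leaf: test the node's attribute and follow
    the branch matching the record's value. Leaves are plain class labels."""
    while isinstance(node, dict):
        value = record[node["attribute"]]
        node = node["branches"][value]
    return node

# Hypothetical tree over two connection attributes.
tree = {
    "attribute": "protocol",
    "branches": {
        "icmp": "probe",
        "udp": "normal",
        "tcp": {
            "attribute": "flag",
            "branches": {"SF": "normal", "S0": "dos"},
        },
    },
}

label = classify(tree, {"protocol": "tcp", "flag": "S0"})
print(label)
```

An attribute value with no matching branch raises `KeyError` in this sketch; a fuller version would need a default branch.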

Neural classification: HIDE

“A hierarchical network intrusion detection system using statistical processing and neural network classification” by Zheng et al.

Five major components
• Probes collect traffic data
• Event preprocessor preprocesses traffic data and feeds the statistical model
• Statistical processor maintains a model for normal activities and generates vectors for new events
• Neural network classifies the vectors of new events
• Post processor generates reports

Clustering

What Is Clustering?
• Group data into clusters
  – Similar to one another within the same cluster
  – Dissimilar to the objects in other clusters
  – Unsupervised learning: no predefined classes

Clustering

What Is a Good Clustering?
• High intra-class similarity and low inter-class similarity
  – Depending on the similarity measure
• The ability to discover some or all of the hidden patterns

Clustering

Clustering Approaches
• Partitioning algorithms
  – Partition the objects into k clusters
  – Iteratively reallocate objects to improve the clustering
• Hierarchy algorithms
  – Agglomerative: each object is a cluster; merge clusters to form larger ones
  – Divisive: all objects are in a cluster; split it up into smaller clusters

Clustering

K-Means Example
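
The original example figure is not available in this transcript. As a stand-in, here is a minimal k-means sketch (Lloyd's algorithm) on made-up 2-D points; the data, the initial centroids and the fixed iteration count are assumptions.

```python
import math

def k_means(points, centroids, iterations=10):
    """Alternate between assigning points to the nearest centroid and
    moving each centroid to the mean of its assigned points."""
    centroids = list(centroids)
    assignments = []
    for _ in range(iterations):
        # Assignment step: index of the nearest centroid for each point.
        assignments = [
            min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: recompute each centroid; empty clusters stay put.
        for c in range(len(centroids)):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return centroids, assignments

points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
centroids, assignments = k_means(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
print(centroids, assignments)
```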

Mining Data Streams for Intrusion Detection

• Maintaining profiles of normal activities
  – The profiles of normal activities may drift
• Identifying novel attacks
  – Identifying clusters and outliers in traffic data streams
• Reduce the future alarm load by writing filtering rules that automatically discard well-understood false positives

Data Mining for Intrusion Detection

Misuse detection
• Predictive models are built from labeled data sets (instances are labeled as “normal” or “intrusive”)
• These models can be more sophisticated and precise than manually created signatures
• Recent research: e.g. JAM (Java Agents for Metalearning)

Misuse Detection

[Figure: observed activities are pattern-matched against known intrusion patterns; a match indicates an intrusion]

Can't detect new attacks

Example: if (src_ip == dst_ip) then “land attack”

Look for known indicators: ICMP scans, port scans, connection attempts; CPU, RAM, and I/O utilization; file system activity, modification of system files, permission modifications
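
The land attack example can be written as a tiny signature matcher. The packet dictionary layout and the `match_signatures` helper are hypothetical; only the `src_ip == dst_ip` rule comes from the slide.

```python
def match_signatures(packet, signatures):
    """Return the names of all signatures whose predicate holds for the packet."""
    return [name for name, predicate in signatures if predicate(packet)]

# Each signature pairs a name with a predicate over packet fields.
SIGNATURES = [
    ("land attack", lambda p: p["src_ip"] == p["dst_ip"]),
]

packet = {"src_ip": "10.0.0.5", "dst_ip": "10.0.0.5", "dst_port": 139}
alerts = match_signatures(packet, SIGNATURES)
print(alerts)
```

A packet that matches no entry in the list produces no alert, which is why this approach cannot detect new attacks.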

JAM (Java Agents for Metalearning)

JAM (developed at Columbia University) uses data mining techniques to discover patterns of intrusions. It then applies a meta-learning classifier to learn the signature of attacks.

The association rules algorithm determines relationships between fields in the audit trail records, and the frequent episodes algorithm models sequential patterns of audit events. Features are then extracted from both algorithms and used to compute models of intrusion behavior.

The classifiers build the signatures of attacks. Thus, data mining in JAM builds a misuse detection model.

Classifiers in JAM are generated by running a rule learning program on training data of system usage. After training, the resulting classification rules are used to recognize anomalies and detect known intrusions.

The system has been tested with data from Sendmail-based attacks and with network attacks using TCP dump data.

Data Mining for Intrusion Detection

Anomaly detection
• Identifies anomalies as deviations from “normal” behavior
• E.g. ADAM (Audit Data Analysis and Mining), MINDS (MINnesota INtrusion Detection System)

Anomaly Detection

[Figure: bar chart of activity measures (CPU, process size) on a 0 to 90 scale, comparing the normal profile with abnormal values that indicate a probable intrusion]

Relatively high false positive rate: anomalies can just be new normal activities

Baseline the normal traffic and then look for things that are out of the norm
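
A minimal sketch of the baseline idea, assuming a single numeric activity measure and a simple z-score test. The CPU samples and the threshold of 3 standard deviations are illustrative assumptions, not values from the slides.

```python
from statistics import mean, stdev

def build_baseline(samples):
    """Summarise normal activity as (mean, sample standard deviation)."""
    return mean(samples), stdev(samples)

def is_anomalous(value, baseline, threshold=3.0):
    """Flag values more than `threshold` standard deviations from the mean."""
    mu, sigma = baseline
    return sigma > 0 and abs(value - mu) / sigma > threshold

# Hypothetical CPU utilisation (%) observed during normal operation.
normal_cpu = [20, 22, 19, 21, 23, 20, 18, 22]
baseline = build_baseline(normal_cpu)

print(is_anomalous(85, baseline), is_anomalous(24, baseline))
```

A legitimate new workload would also be flagged by this test, which illustrates the false positive problem noted above.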

ADAM: Audit Data Analysis and Mining

Detecting intrusions by data mining
• Combination of association rules and classification rules
• First, ADAM collects known frequent datasets (an off-line algorithm)
• Second, ADAM runs an online algorithm
  – Finds the last frequent connection records
  – Compares them with the known mined data
  – Discards those that seem to be normal
  – Suspicious ones are forwarded to the classifier
• The trained classifier then classifies the suspicious data as one of the following
  – Known type of attack
  – Unknown type of attack
  – False alarm

ADAM: Detecting Intrusions by Data Mining

ADAM: Audit Data Analysis and Mining

ADAM has two phases in its model

• 1st phase: train the classifier
  – Offline process
  – Takes place only once, before the main experiment
• 2nd phase: use the trained classifier
  – The trained classifier is then used to detect anomalies
  – Online process

The MINDS Project

MINDS – MINnesota INtrusion Detection System

• Learning from Rare Class – building rare class prediction models
• Anomaly/outlier detection
• Summarization of attacks using association pattern analysis

TID   Items
1     Bread, Coke, Milk
2     Beer, Bread
3     Beer, Coke, Diaper, Milk
4     Beer, Bread, Diaper, Milk
5     Coke, Diaper, Milk

Rules discovered: {Milk} --> {Coke}; {Diaper, Milk} --> {Beer}
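
As a worked check, the sketch below computes support and confidence for the two rules from the five transactions in the table, reading the rules as {Milk} --> {Coke} and {Diaper, Milk} --> {Beer}. The helper names are hypothetical.

```python
transactions = {
    1: {"Bread", "Coke", "Milk"},
    2: {"Beer", "Bread"},
    3: {"Beer", "Coke", "Diaper", "Milk"},
    4: {"Beer", "Bread", "Diaper", "Milk"},
    5: {"Coke", "Diaper", "Milk"},
}

def count(itemset):
    """Number of transactions that contain every item in itemset."""
    return sum(1 for items in transactions.values() if itemset <= items)

def support(lhs, rhs):
    """Fraction of all transactions containing both sides of the rule."""
    return count(lhs | rhs) / len(transactions)

def confidence(lhs, rhs):
    """Of the transactions containing lhs, the fraction also containing rhs."""
    return count(lhs | rhs) / count(lhs)

milk_coke = (support({"Milk"}, {"Coke"}), confidence({"Milk"}, {"Coke"}))
diaper_milk_beer = (
    support({"Diaper", "Milk"}, {"Beer"}),
    confidence({"Diaper", "Milk"}, {"Beer"}),
)
print(milk_coke, diaper_milk_beer)
```

Milk appears in 4 transactions and Milk with Coke in 3, so the first rule has support 3/5 and confidence 3/4; Diaper with Milk appears in 3 and together with Beer in 2, giving support 2/5 and confidence 2/3.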

MINDS - Learning from Rare Class

Problem: building models for rare network attacks (mining a needle in a haystack)

• Standard data mining models are not suitable for rare classes

• Models must be able to handle skewed class distributions

• Learning from data streams: intrusions are sequences of events

MINDS - Anomaly Detection

Detect novel attacks/intrusions by identifying them as deviations from “normal”, i.e. anomalous, behavior
• Identify normal behavior
• Construct a useful set of features
• Define a similarity function
• Use an outlier detection algorithm
  – Nearest neighbor approach
  – Density-based schemes
  – Unsupervised Support Vector Machines (SVM)
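
A sketch of the nearest neighbor approach only, not of the MINDS implementation: each connection is scored by its distance to its k-th nearest neighbour, so isolated points receive high scores. The feature tuples, the Euclidean distance and `k` are assumptions.

```python
import math

def knn_anomaly_scores(points, k):
    """Score each point by the distance to its k-th nearest neighbour
    (requires k <= len(points) - 1)."""
    scores = []
    for i, p in enumerate(points):
        distances = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(distances[k - 1])
    return scores

# Hypothetical flow features: (number of packets, number of bytes)
flows = [(3.0, 150.0), (3.0, 160.0), (4.0, 155.0), (3.0, 152.0), (40.0, 9000.0)]
scores = knn_anomaly_scores(flows, k=2)
most_anomalous = max(range(len(flows)), key=scores.__getitem__)
print(most_anomalous, scores)
```

In practice the features would usually be normalised first, since the byte count dominates the distance here.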

Experimental Evaluation

[Figure: evaluation pipeline: network, net-flow data collected using Cisco routers, data preprocessing, MINDS anomaly detection, anomaly scores, association pattern analysis; alongside Snort, an open source signature-based network IDS (www.snort.org)]

• 10-minute cycle, 2 million connections
• Anomaly detection is applied 4 times a day, on a 10-minute time window
• Publicly available data set: DARPA 1998 Intrusion Detection Evaluation Data Set
  – Prepared and managed by MIT Lincoln Lab
  – Includes a wide variety of intrusions simulated in a military network environment
• Real network data from the University of Minnesota

MINDS - Framework for Mining Associations

[Figure: MINDS association analysis module. The anomaly detection system produces ranked connections labeled attack or normal; the discriminating association pattern generator derives rules such as “R1: TCP, DstPort=1863 → Attack” through “R100: TCP, DstPort=80 → Normal”, which update the knowledge base]

1. Build normal profile
2. Study changes in normal behavior
3. Create attack summary
4. Detect misuse behavior
5. Understand nature of the attack

Discovered Real-life Association Patterns

Rule 1: SrcIP=XXXX, DstPort=80, Protocol=TCP, Flag=SYN, NoPackets: 3, NoBytes: 120…180 (c1=256, c2=1)

Rule 2: SrcIP=XXXX, DstIP=YYYY, DstPort=80, Protocol=TCP, Flag=SYN, NoPackets: 3, NoBytes: 120…180 (c1=177, c2=0)

• At first glance, Rule 1 appears to describe a Web scan
• Rule 2 indicates an attack on a specific machine
• Both rules together indicate that a scan is performed first, followed by an attack on a specific machine identified as vulnerable by the attacker

Discovered Real-life Association Patterns

DstIP=ZZZZ, DstPort=8888, Protocol=TCP (c1=369, c2=0)

DstIP=ZZZZ, DstPort=8888, Protocol=TCP, Flag=SYN (c1=291, c2=0)

• This pattern indicates an anomalously high number of TCP connections on port 8888 involving machine ZZZZ
• Follow-up analysis of connections covered by the pattern indicates that this could be a machine running a variation of the Kazaa file-sharing protocol
• Having an unauthorized application increases the vulnerability of the system

Discovered Real-life Association Patterns… (ctd)

SrcIP=XXXX, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=4, NoBytes=189…200 (c1=582, c2=2)

SrcIP=XXXX, DstPort=12345, NoPackets=4, NoBytes=189…200 (c1=580, c2=3)

SrcIP=YYYY, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=3, NoBytes=144 (c1=694, c2=3)

……

• This pattern indicates a large number of scans on ports 27374 (which is a signature for the SubSeven worm) and 12345 (which is a signature for the NetBus worm)
• Further analysis showed that no fewer than five machines were scanning for one or both of these ports in any time window

Discovered Real-life Association Patterns… (ctd)

DstPort=6667, Protocol=TCP (c1=254, c2=1)

• This pattern indicates an unusually large number of connections on port 6667 detected by the anomaly detector
• Port 6667 is where IRC (Internet Relay Chat) is typically run
• Further analysis reveals that there are many small packets from/to various IRC servers around the world
• Although IRC traffic is not unusual, the fact that it is flagged as anomalous is interesting
• This might indicate that the IRC server has been taken down (by a DOS attack, for example) or that it is a rogue IRC server (it could be involved in some hacking activity)

Discovered Real-life Association Patterns… (ctd)

DstPort=1863, Protocol=TCP, Flag=0, NoPackets=1, NoBytes<139 (c1=498, c2=6)

DstPort=1863, Protocol=TCP, Flag=0 (c1=587, c2=6)

DstPort=1863, Protocol=TCP (c1=606, c2=8)

• This pattern indicates a large number of anomalous TCP connections on port 1863
• Further analysis reveals that the remote IP block is owned by Hotmail
• Flag=0 is unusual for TCP traffic

MINDS Conclusion
• Data mining based algorithms are capable of detecting intrusions that cannot be detected by state-of-the-art signature-based methods
• SNORT has static knowledge manually updated by human analysts
• MINDS anomaly detection algorithms are adaptive in nature
• MINDS anomaly detection algorithms can also be effective in detecting anomalous behavior originating from a compromised or infected machine

MINDS Research
• Defining normal behavior
• Feature extraction
• Similarity functions
• Outlier detection
• Result summarization
• Detection of attacks originating from multiple sites

• Worm/virus detection after infection
• Insider attack: policy violation
• Outsider attack: network intrusion


13

KDD Process

Selection Obtain data from various sources Preprocessing Cleanse data Transformation Convert to common format Transform

to new format Data Mining Obtain desired results InterpretationEvaluation Present results to user in

meaningful manner

14

Data Mining A KDD Processndash Data mining core of

knowledge discovery process

Data Cleaning

Data Integration

Databases

Data Warehouse

Task-relevant Data

Selection

Data Mining

Pattern Evaluation

15

Typical Data Mining Architecture

Data Warehouse

Data cleaning amp data integration Filtering

Databases

Database or data warehouse server

Data mining engine

Pattern evaluation

Graphical user interface

Knowledge-base

16

Outline

Intrusion Detection

Data Mining

Data Mining in Intrusion Detection

Reference

17

Network intrusion detection

Number of intrusions on the network is typically a very small fraction of the total network traffic

18

Why Can Data Mining Help

Learn from traffic data Supervised learning learn precise models from past intrusions Unsupervised learning identify suspicious activities

Maintain models on dynamic data

Correlation of suspicious events across network sites Helps detect sophisticated attacks not identifiable by single site analyses

Analysis of long term data (monthsyears) Uncover suspicious stealth activities (eg insiders leakingmodifying

information)

19

Intrusion Detection

Traditional intrusion detection system IDS tools (eg SNORT) are based on signatures of known attacks

LimitationsSignature database has to be manually revised

for each new type of discovered intrusionThey cannot detect emerging cyber threatsSubstantial latency in deployment of newly created

signatures across the computer system

20

Data Mining for Intrusion Detection Techniques and Applications

Frequent pattern mining Classification Clustering Mining data streams

21

Patterns that occur frequently in a database

Mining Frequent patterns ndash finding regularities

Process of Mining Frequent patterns for intrusion de

tection Phase I mine a repository of normal frequent itemsets for a

ttack-free data

Phase II find frequent itemsets in the last n connections an

d compare the patterns to the normal profile

Frequent pattern mining

22

Frequent pattern mining

Apriori bull Any subset of a frequent itemset must be also freque

nt mdash an anti-monotone propertyndash A transaction containing beer diaper nuts alsocontains beer diaperndash beer diaper nuts is frequent beer diaper mustalso be frequent

bull No superset of any infrequent itemset should be generated or tested

ndash Many item combinations can be pruned

23

Sequential Pattern Analysis

Models sequence patterns (Temporal) order is important in many situations

Time-series databases and sequence databases

Frequent patterns (frequent) sequential patterns

Sequential patterns for intrusion detection Capture the signatures for attacks in a series of packets

24

Sequential Pattern Mining

Given a set of sequences find the complete set of frequent subsequences

25

Apriori Property in Sequences

26

Classification A Two-Step Process Model construction describe a set of predetermined

classes Training dataset tuples for model construction

Each tuplesample belongs to a predefined class Classification rules decision trees or math formulae

Model application classify unseen objects Estimate accuracy of the model using an independent test

set Acceptable accuracy apply the model to classify data tu

ples with unknown class labels

27

Classification

28

Classification Decision Tree

A node in the tree a test of some attribute A branch a possible value of the attribute Classification

Start at the root Test the attribute Move down the tree branch

29

Neural classification HIDE

ldquoA hierarchical network intrusion detection system using statistical processing and neural network classificationrdquo by Zheng et al

Five major components Probes collect traffic data Event preprocessor preprocesses traffic data and feeds the

statistical model Statistical processor maintains a model for normal activities

and generates vectors for new events Neural network classifies the vectors of new events Post processor generates reports

30

Clustering

What Is Clustering Group data into clusters

ndash Similar to one another within the same cluster ndash Dissimilar to the objects in other clusters ndash Unsupervised learning no predefined classes

31

Clustering

What Is A Good Clustering High intra-class similarity and low interclasssimilar

ity Depending on the similarity measure

The ability to discover some or all of the hidden patterns

32

Clustering

Clustering Approaches Partitioning algorithms

ndash Partition the objects into k clusters ndash Iteratively reallocate objects to improve the clusterin

g Hierarchy algorithms

ndash Agglomerative each object is a cluster merge clusters to form larger ones

ndash Divisive all objects are in a cluster split it up into smaller clusters

33

Clustering

K-Means Example

34

Mining Data Streams for Intrusion Detection

Maintaining profiles of normal activities The profiles of normal activities may drift

Identifying novel attacks Identifying clusters and outliers in traffic data

streams Reduce the future alarm load by writing

filtering rules that automatically discard well-understood false positives

35

Data Mining for Intrusion Detection

Misuse detectionPredictive models are built from labeled data sets (instances

are labeled as ldquonormalrdquo or ldquointrusiverdquo)These models can be more sophisticated and precise than manually

created signatures Recent research eg JAM (Java Agents for Metalearning)

36

Misuse Detection

Intrusion Patterns

activities

pattern matching

intrusion

Canrsquot detect new attacks

Example if (src_ip == dst_ip) then ldquoland attackrdquo

look for known indicators ICMP Scans port scans connection attempts CPU RAM IO Utilization File system activity modification of system files permission modifications

37

JAM (Java Agents for Metalearning)

JAM (developed at Columbia University) uses data mining techniques to discover patterns of intrusions It then applies a meta-learning classifier to learn the signature of attacks

The association rules algorithm determines relationships between fields in the audit trail records and the frequent episodes algorithm models sequential patterns of audit events Features are then extracted from both algorithms and used to compute models of intrusion behavior

The classifiers build the signature of attacks So thus data mining in JAM builds misuse detection model

Classifiers in the JAM are generated by using rule learning program on training data of system usage After training resulting classification rules is used to recognize anomalies and detect known intrusions

The system has been tested with data from Sendmail-based attacks and with network attacks using TCP dump data

38

Data Mining for Intrusion Detection

Anomaly detection Identifies anomalies as deviations from ldquonormalrdquo behavior Eg ADAM Audit Data Analysis and Mining MINDS ndash MINnesota INt

rusion Detection System

39

Anomaly Detection

activity measures

0102030405060708090

CPU ProcessSize

normal profileabnormal

probable intrusion

Relatively high false positive rate - anomalies can just be new normal activities

baseline the normal traffic and then look for things that are out of the norm

40

ADAM Audit Data Analysis and Mining

Detecting Intrusion by Data MiningCombination of Association Rule and Classification Rule Firstly ADAM collects known frequent datasetsan off-line

algorithm Secondly ADAM runs an online algorithm

Finds last frequent connection records Compare them with known mined data Discards those which seems to be normal Suspicious ones are forwarded to the classifier Trained classifier then classify the suspicious data as one

of the following Known type of attack Unknown type of attack False alarm

41

ADAM Detecting Intrusion by Data Mining

42

ADAM Audit Data Analysis and Mining

ADAM has two phases in their model

1st Phase Train the classifier Offline process Takes place only once Before the main experiment

2nd Phase Using the trained classifier Trained classifier is then used to detect anomalies Online process

43

The MINDS Project

MINDS ndash MINnesota INtrusion Detection System

Learning from Rare Class ndash Building rare class prediction models

Anomalyoutlier detection

Summarization of attacks using association pattern analysis

TID Items

1 Bread Coke Milk

2 Beer Bread

3 Beer Coke Diaper Milk

4 Beer Bread Diaper Milk

5 Coke Diaper Milk

Rules Discovered Milk --gt Coke Diaper Milk --gt Beer

Rules Discovered Milk --gt Coke Diaper Milk --gt Beer

44

MINDS - Learning from Rare Class

Problem Building models for rare network attacks (Mining needle in a haystack)

Standard data mining models are not suitable for rare classes

Models must be able to handle skewed class distributions

Learning from data streams - intrusions are sequences of events

45

MINDS - Anomaly Detection

Detect novel attacksintrusions by identifying them as deviations from ldquonormalrdquo ie anomalous behaviorIdentify normal behaviorConstruct useful set of featuresDefine similarity functionUse outlier detection algorithm

Nearest neighbor approach

Density based schemes

Unsupervised Support Vector Machines (SVM)

46

Experimental Evaluation

network

net-flow data using CISCO

routers

Data preprocessing

MINDSanomaly detection

helliphellip

Anomaly

scores Association pattern analysis

Open source signature-based network IDS wwwsnortor

g

10 minutes cycle

2 millions connections

Anomaly detection is applied

4 times a day

10 minutes time window

bull Publicly available data setDARPA 1998 Intrusion Detection Evaluation Data Set prepared and managed by MIT Lincoln Lab includes a wide variety of intrusions simulated in a military network environment

bull Real network data from University of Minnesota

47

MINDS - Framework for Mining Associations

[Figure: MINDS association analysis module. Labels: anomaly detection system; ranked connections (attack / normal); discriminating association pattern generator; example rules R1: TCP, DstPort=1863 → Attack, …, R100: TCP, DstPort=80 → Normal; knowledge base (update)]

1. Build normal profile

2. Study changes in normal behavior

3. Create attack summary

4. Detect misuse behavior

5. Understand nature of the attack

48

Discovered Real-life Association Patterns

At first glance, Rule 1 appears to describe a Web scan, and Rule 2 indicates an attack on a specific machine. Both rules together indicate that a scan is performed first, followed by an attack on a specific machine identified as vulnerable by the attacker

Rule 1: SrcIP=XXXX, DstPort=80, Protocol=TCP, Flag=SYN, NoPackets=3, NoBytes=120…180 (c1=256, c2=1)

Rule 2: SrcIP=XXXX, DstIP=YYYY, DstPort=80, Protocol=TCP, Flag=SYN, NoPackets=3, NoBytes=120…180 (c1=177, c2=0)

49

Discovered Real-life Association Patterns

DstIP=ZZZZ, DstPort=8888, Protocol=TCP (c1=369, c2=0)

DstIP=ZZZZ, DstPort=8888, Protocol=TCP, Flag=SYN (c1=291, c2=0)

This pattern indicates an anomalously high number of TCP connections on port 8888 involving machine ZZZZ

Follow-up analysis of connections covered by the pattern indicates that this could be a machine running a variation of the Kazaa file-sharing protocol

Having an unauthorized application increases the vulnerability of the system

50

Discovered Real-life Association Patterns… (ctd)

SrcIP=XXXX, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=4, NoBytes=189…200 (c1=582, c2=2)

SrcIP=XXXX, DstPort=12345, NoPackets=4, NoBytes=189…200 (c1=580, c2=3)

SrcIP=YYYY, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=3, NoBytes=144 (c1=694, c2=3)

……

This pattern indicates a large number of scans on ports 27374 (a signature of the SubSeven Trojan) and 12345 (a signature of the NetBus Trojan)

Further analysis showed that no fewer than five machines were scanning for one or both of these ports in any time window

51

Discovered Real-life Association Patterns… (ctd)

DstPort=6667, Protocol=TCP (c1=254, c2=1)

This pattern indicates an unusually large number of connections on port 6667 detected by the anomaly detector

Port 6667 is where IRC (Internet Relay Chat) is typically run

Further analysis reveals that there are many small packets from/to various IRC servers around the world

Although IRC traffic is not unusual, the fact that it is flagged as anomalous is interesting

This might indicate that the IRC server has been taken down (by a DoS attack, for example) or that it is a rogue IRC server (it could be involved in some hacking activity)

52

Discovered Real-life Association Patterns… (ctd)

DstPort=1863, Protocol=TCP, Flag=0, NoPackets=1, NoBytes<139 (c1=498, c2=6)

DstPort=1863, Protocol=TCP, Flag=0 (c1=587, c2=6)

DstPort=1863, Protocol=TCP (c1=606, c2=8)

This pattern indicates a large number of anomalous TCP connections on port 1863

Further analysis reveals that the remote IP block is owned by Hotmail

Flag=0 is unusual for TCP traffic

53

MINDS Conclusion

Data mining based algorithms are capable of detecting intrusions that cannot be detected by state-of-the-art signature-based methods

SNORT has static knowledge, manually updated by human analysts

MINDS anomaly detection algorithms are adaptive in nature

MINDS anomaly detection algorithms can also be effective in detecting anomalous behavior originating from a compromised or infected machine

MINDS Research

Defining normal behavior

Feature extraction

Similarity functions

Outlier detection

Result summarization

Detection of attacks originating from multiple sites

Worm/virus detection after infection

Insider attack Policy violation

Outsider attack Network intrusion

54

IDS Using Both Misuse and Anomaly Detection: RIDS-100

RIDS (Rising Intrusion Detection System) is provided by Rising Tech, a leader in antivirus and content security software and services in China

The company is a leading provider of client, gateway, and server security solutions for virus protection, firewall, and intrusion detection technologies, and of security services to enterprises and service providers around China

RIDS makes use of both intrusion detection techniques: misuse detection and anomaly detection

A distance-based outlier detection algorithm is used to detect deviant behavior in the collected network data

For misuse detection, it has a very large set of collected data patterns that can be matched against scanned network data

This large set of data patterns is scanned using a data mining classification algorithm, the decision tree

http://www.rising-global.com

55

A cooperative anomaly and intrusion detection system (CAIDS)

Built with a network-based intrusion detection system (NIDS) and an anomaly detection system (ADS) operating interactively through a signature generator

56

A cooperative anomaly and intrusion detection system (CAIDS)

A frequent episode rule (FER) is generated out of a collection of frequent episodes. The FER is defined over episode sequences with multiple connection events

For example, we envision a window in which we observe a 3-event sequence E, D, and F. An FER is generated as E → D, F with confidence level freq(a ∪ b) / freq(a) = 0.8, where a represents the event E on the LHS and b corresponds to the two events D and F on the RHS of the rule

If a occurs with 5% probability and the joint event a and b occurs with 4% probability, there is a (0.04 / 0.05) = 80% chance that D and F will follow E in the same window
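The arithmetic can be reproduced on toy data. This is a simplified sketch, not the CAIDS rule miner: it uses the standard definition of confidence (joint frequency divided by the frequency of the LHS), treats each window as a short tuple of events, and only matches episodes that start at the beginning of a window. The function name and the data are hypothetical.

```python
def fer_stats(windows, lhs, rhs):
    """Support and confidence of the episode rule lhs -> rhs.

    A window matches lhs if it starts with the lhs events; it matches the
    whole rule if the rhs events follow immediately, in order.
    """
    full = lhs + rhs
    lhs_count = sum(1 for w in windows if w[:len(lhs)] == lhs)
    joint_count = sum(1 for w in windows if w[:len(full)] == full)
    support = joint_count / len(windows)
    confidence = joint_count / lhs_count if lhs_count else 0.0
    return support, confidence


# Hypothetical data: 100 windows, 5 start with E, and in 4 of those D and F follow.
windows = (
    [("E", "D", "F")] * 4
    + [("E", "G", "H")] * 1
    + [("A", "B", "C")] * 95
)
support, confidence = fer_stats(windows, ("E",), ("D", "F"))
```

With 4 joint occurrences out of 5 windows that contain E, the confidence is 0.04 / 0.05 = 0.8, matching the 80% figure on the slide.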

57

A cooperative anomaly and intrusion detection system (CAIDS)

In practice, the event E could be an authentication service characterized by two attributes (service = authentication, flag = SF). The events D and F may be two sequential smtp requests, denoted by (service = smtp). Thus we can derive an FER with a confidence level of c = 80% that two smtp services will follow the authentication service within a window w = 2 sec. The three joint traffic events account for a support level of s = 10% of all the network connections being evaluated. This FER is formally stated as follows:

(service = authentication) → (service = smtp), (service = smtp) (0.8, 0.1, 2 sec) (1)

58

A cooperative anomaly and intrusion detection system (CAIDS)

An association rule is aimed at finding interesting intra-relationships inside a single connection record

In general, an FER is specified by the following expression:

L1, L2, …, Ln → R1, …, Rm (c, s, window) (2)

Li (1 ≤ i ≤ n) and Rj (1 ≤ j ≤ m) are ordered traffic connection events

We call L1, L2, …, Ln the LHS episode and R1, …, Rm the RHS of the episode rule

59

A cooperative anomaly and intrusion detection system (CAIDS)

Architecture of the CAIDS simulator, built with a 2000-signature Snort and an anomaly detection subsystem (ADS) with 60 FERs after 2 weeks of rule training over the Lincoln Lab IDS evaluation dataset

60

Conclusion

In this report we have studied the basic concepts and some classic system models in this area, such as ADAM and MINDS

We summarized those system models, their technologies, and their validation methods

We hope to have given an overview of current developments in this area and of how data mining is evolving into the field of network intrusion detection

61

Reference

DARPA 1998 data set. A cleansed set in KDD Cup '99; DARPA 1991 data set is also available: http://www.ll.mit.edu/IST/ideval/data/data_index.html

Daniel Barbara, Julia Couto, Sushil Jajodia, Leonard Popyack, Ningning Wu. “ADAM: Detecting Intrusions by Data Mining.” Proceedings of the 2001 IEEE Workshop on Information Assurance and Security, United States Military Academy, West Point, NY, 5-6 June 2001

Zhang, J. and Zulkernine, M. 2006. A Hybrid Network Intrusion Detection Technique Using Random Forests. In Proceedings of the First International Conference on Availability, Reliability and Security (April 20-22, 2006)

W. Lee et al. A data mining framework for building intrusion detection models. In Information and System Security, Vol. 3, No. 4, 2000

Ertoz, L. et al. MINDS - Minnesota Intrusion Detection System. Next Generation Data Mining, Chapter 3, 2004

Lu, C.-T., Boedihardjo, A. P., Manalwar, P. Exploiting efficient data mining techniques to enhance intrusion detection systems. IEEE International Conference on Information Reuse and Integration (IRI-2005), 15-17 Aug. 2005, pp. 512-517

Sal Stolfo, Andreas Prodromidis, Shelley Tselepis, Wenke Lee, Dave Fan, and Phil Chan. (Honorable mention (runner-up) for Best Paper Award in Applied Research Category.) In Proceedings of the Third International Conference on Knowledge Discovery and Data Mining (KDD 97), Newport Beach, CA, August 1997

62

Questions & Comments

14

Data Mining: A KDD Process

– Data mining: core of the knowledge discovery process

[Figure: KDD process. Labels: Databases; Data Cleaning; Data Integration; Data Warehouse; Selection; Task-relevant Data; Data Mining; Pattern Evaluation]

15

Typical Data Mining Architecture

[Figure: layered data mining architecture. Labels: Graphical user interface; Pattern evaluation; Data mining engine; Knowledge-base; Database or data warehouse server; Data cleaning & data integration; Filtering; Databases; Data Warehouse]

16

Outline

Intrusion Detection

Data Mining

Data Mining in Intrusion Detection

Reference

17

Network intrusion detection

Number of intrusions on the network is typically a very small fraction of the total network traffic

18

Why Can Data Mining Help

Learn from traffic data

– Supervised learning: learn precise models from past intrusions

– Unsupervised learning: identify suspicious activities

Maintain models on dynamic data

Correlation of suspicious events across network sites

– Helps detect sophisticated attacks not identifiable by single-site analyses

Analysis of long-term data (months/years)

– Uncover suspicious stealth activities (e.g. insiders leaking/modifying information)

19

Intrusion Detection

Traditional intrusion detection system (IDS) tools (e.g. SNORT) are based on signatures of known attacks

Limitations

– The signature database has to be manually revised for each new type of discovered intrusion

– They cannot detect emerging cyber threats

– Substantial latency in deployment of newly created signatures across the computer system

20

Data Mining for Intrusion Detection: Techniques and Applications

Frequent pattern mining

Classification

Clustering

Mining data streams

21

Frequent pattern mining

Patterns that occur frequently in a database

Mining frequent patterns – finding regularities

Process of mining frequent patterns for intrusion detection

– Phase I: mine a repository of normal frequent itemsets for attack-free data

– Phase II: find frequent itemsets in the last n connections and compare the patterns to the normal profile
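The two phases can be sketched as follows. This is a minimal illustration under stated assumptions: connections are modeled as attribute/value dictionaries, itemsets are counted by brute force up to two items, and the field names, the 20% support threshold, and the sample traffic are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(connections, min_support, max_size=2):
    """Brute-force frequent itemsets (up to max_size items) over connection records.

    Each connection is a dict; an item is an (attribute, value) pair.
    """
    counts = Counter()
    for conn in connections:
        items = sorted(conn.items())
        for size in range(1, max_size + 1):
            for combo in combinations(items, size):
                counts[frozenset(combo)] += 1
    threshold = min_support * len(connections)
    return {itemset for itemset, c in counts.items() if c >= threshold}


# Phase I: mine the normal profile from (hypothetical) attack-free data.
attack_free = [{"proto": "tcp", "dst_port": 80}] * 8 + [{"proto": "udp", "dst_port": 53}] * 2
normal_profile = frequent_itemsets(attack_free, min_support=0.2)

# Phase II: mine the last n connections and keep itemsets absent from the profile.
recent = [{"proto": "tcp", "dst_port": 80}] * 5 + [{"proto": "tcp", "dst_port": 27374}] * 5
suspicious = frequent_itemsets(recent, min_support=0.2) - normal_profile
```

Only the itemsets involving port 27374 survive the comparison; everything that was already frequent in the attack-free data is filtered out.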

22

Frequent pattern mining

Apriori

• Any subset of a frequent itemset must also be frequent (an anti-monotone property)

– A transaction containing {beer, diaper, nuts} also contains {beer, diaper}

– If {beer, diaper, nuts} is frequent, {beer, diaper} must also be frequent

• No superset of any infrequent itemset should be generated or tested

– Many item combinations can be pruned
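A compact sketch of how the Apriori property drives level-wise mining: candidates of size k+1 are built only from frequent k-itemsets, and any candidate with an infrequent subset is pruned before it is counted. The baskets and the minimum count are made up for illustration.

```python
from itertools import combinations


def apriori(transactions, min_count):
    """Level-wise frequent itemset mining with Apriori pruning."""
    transactions = [frozenset(t) for t in transactions]
    items = {item for t in transactions for item in t}
    level = {frozenset([i]) for i in items}
    frequent = {}
    k = 1
    while level:
        # Count the size-k candidates and keep the frequent ones.
        counts = {c: sum(1 for t in transactions if c <= t) for c in level}
        current = {c: n for c, n in counts.items() if n >= min_count}
        frequent.update(current)
        # Join step: build (k+1)-item candidates from frequent k-itemsets.
        k += 1
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: drop candidates that have an infrequent (k-1)-item subset.
        level = {
            c for c in candidates
            if all(frozenset(s) in current for s in combinations(c, k - 1))
        }
    return frequent


baskets = [
    {"beer", "diaper", "nuts"},
    {"beer", "diaper"},
    {"beer", "diaper", "nuts"},
    {"milk", "nuts"},
]
result = apriori(baskets, min_count=2)
```

Because {milk} is infrequent here, no itemset containing milk is ever generated, which is the pruning the slide refers to.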

23

Sequential Pattern Analysis

Models sequence patterns

– (Temporal) order is important in many situations

– Time-series databases and sequence databases

– Frequent patterns → (frequent) sequential patterns

Sequential patterns for intrusion detection

– Capture the signatures for attacks in a series of packets

24

Sequential Pattern Mining

Given a set of sequences find the complete set of frequent subsequences
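The core test behind this definition is whether a candidate pattern occurs as a subsequence (same order, gaps allowed) in enough of the input sequences. The sketch below only counts support for a given pattern; it does not enumerate all frequent subsequences, and the event names are hypothetical.

```python
def is_subsequence(pattern, sequence):
    """True if the pattern's events appear in sequence in the same order (gaps allowed)."""
    remaining = iter(sequence)
    # Each membership test consumes the iterator up to the matching event.
    return all(event in remaining for event in pattern)


def support(pattern, sequences):
    """Number of sequences that contain pattern as a subsequence."""
    return sum(1 for s in sequences if is_subsequence(pattern, s))


# Hypothetical packet-event sequences, one per connection.
sequences = [
    ["SYN", "SYN-ACK", "ACK", "DATA", "FIN"],
    ["SYN", "RST"],
    ["SYN", "SYN-ACK", "ACK", "FIN"],
    ["SYN", "SYN", "SYN"],
]
```

A pattern is frequent when its support reaches a chosen minimum; for example, `support(["SYN", "ACK", "FIN"], sequences)` counts the two connections that complete a handshake and close.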

25

Apriori Property in Sequences

26

Classification: A Two-Step Process

Model construction: describe a set of predetermined classes

– Training dataset: tuples for model construction

– Each tuple/sample belongs to a predefined class

– Classification rules, decision trees, or math formulae

Model application: classify unseen objects

– Estimate accuracy of the model using an independent test set

– If the accuracy is acceptable, apply the model to classify data tuples with unknown class labels

27

Classification

28

Classification: Decision Tree

A node in the tree: a test of some attribute

A branch: a possible value of the attribute

Classification

– Start at the root

– Test the attribute

– Move down the tree branch
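The traversal described above can be sketched with a small hand-built tree. The tree, its attributes (`protocol`, `flag`), and the class labels are hypothetical and chosen only to show the walk from root to leaf; a real tree would be learned from labeled training data.

```python
# Internal nodes test one attribute; leaves carry a class label.
TREE = {
    "attribute": "protocol",
    "branches": {
        "icmp": {"label": "normal"},
        "tcp": {
            "attribute": "flag",
            "branches": {
                "SF": {"label": "normal"},
                "S0": {"label": "intrusive"},
            },
        },
    },
}


def classify(node, record, default="unknown"):
    """Walk from the root to a leaf, following the branch for each tested attribute."""
    while "label" not in node:
        value = record.get(node["attribute"])
        if value not in node["branches"]:
            return default  # the tree has no branch for this value
        node = node["branches"][value]
    return node["label"]
```

For instance, a record with `protocol="tcp"` and `flag="S0"` ends at the leaf labeled "intrusive", while a protocol the tree has never seen falls back to the default.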

29

Neural classification: HIDE

“A hierarchical network intrusion detection system using statistical processing and neural network classification” by Zheng et al.

Five major components

– Probes collect traffic data

– Event preprocessor preprocesses traffic data and feeds the statistical model

– Statistical processor maintains a model for normal activities and generates vectors for new events

– Neural network classifies the vectors of new events

– Post processor generates reports

30

Clustering

What Is Clustering?

Group data into clusters

– Similar to one another within the same cluster

– Dissimilar to the objects in other clusters

– Unsupervised learning: no predefined classes

31

Clustering

What Is a Good Clustering?

High intra-class similarity and low inter-class similarity

– Depending on the similarity measure

The ability to discover some or all of the hidden patterns

32

Clustering

Clustering Approaches

Partitioning algorithms

– Partition the objects into k clusters

– Iteratively reallocate objects to improve the clustering

Hierarchy algorithms

– Agglomerative: each object is a cluster; merge clusters to form larger ones

– Divisive: all objects are in a cluster; split it up into smaller clusters

33

Clustering

K-Means Example
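The slide's original figure is not recoverable, so here is a small stand-in example of k-means as a partitioning algorithm: assign each point to its nearest centroid, recompute the centroids, and repeat until nothing moves. The points and the hand-picked initial centroids are hypothetical.

```python
import math


def k_means(points, centroids, max_iter=100):
    """Lloyd's algorithm: alternate nearest-centroid assignment and centroid update."""
    centroids = list(centroids)
    assignment = None
    for _ in range(max_iter):
        new_assignment = [
            min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # converged: no point changed cluster
        assignment = new_assignment
        for c in range(len(centroids)):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return centroids, assignment


# Hypothetical 2-D points forming two obvious groups; initial centroids chosen by hand.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
centroids, assignment = k_means(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
```

The first three points end up in one cluster and the last three in the other, with each centroid at the mean of its members.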

34

Mining Data Streams for Intrusion Detection

Maintaining profiles of normal activities

– The profiles of normal activities may drift

Identifying novel attacks

– Identifying clusters and outliers in traffic data streams

Reduce the future alarm load by writing filtering rules that automatically discard well-understood false positives

35

Data Mining for Intrusion Detection

Misuse detection

– Predictive models are built from labeled data sets (instances are labeled as “normal” or “intrusive”)

– These models can be more sophisticated and precise than manually created signatures

– Recent research, e.g. JAM (Java Agents for Metalearning)

36

Misuse Detection

[Figure: misuse detection. Labels: intrusion patterns; activities; pattern matching; intrusion]

Can’t detect new attacks

Example: if (src_ip == dst_ip) then “land attack”

Look for known indicators: ICMP scans, port scans, connection attempts; CPU, RAM, I/O utilization; file system activity, modification of system files, permission modifications
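The land attack rule above can be written as a tiny signature matcher. This is a sketch of the idea only: the `SIGNATURES` table, the packet field names, and the function name are assumptions for illustration, not the rule format of any real IDS.

```python
from typing import Callable, Dict, List

Packet = Dict[str, object]

# Each signature pairs a name with a predicate over one packet record.
SIGNATURES: Dict[str, Callable[[Packet], bool]] = {
    "land attack": lambda pkt: pkt["src_ip"] == pkt["dst_ip"],
}


def match_signatures(packet: Packet) -> List[str]:
    """Return the names of all signatures that the packet matches."""
    return [name for name, rule in SIGNATURES.items() if rule(packet)]
```

A packet whose source and destination addresses are identical matches "land attack"; any attack without an entry in the table goes unnoticed, which is the limitation noted on the slide.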

37

JAM (Java Agents for Metalearning)

JAM (developed at Columbia University) uses data mining techniques to discover patterns of intrusions. It then applies a meta-learning classifier to learn the signature of attacks

The association rules algorithm determines relationships between fields in the audit trail records, and the frequent episodes algorithm models sequential patterns of audit events. Features are then extracted from both algorithms and used to compute models of intrusion behavior

The classifiers build the signatures of attacks; thus data mining in JAM builds a misuse detection model

Classifiers in JAM are generated by running a rule learning program on training data of system usage. After training, the resulting classification rules are used to recognize anomalies and detect known intrusions

The system has been tested with data from Sendmail-based attacks and with network attacks using tcpdump data

38

Data Mining for Intrusion Detection

Anomaly detection

– Identifies anomalies as deviations from “normal” behavior

– E.g. ADAM (Audit Data Analysis and Mining), MINDS – MINnesota INtrusion Detection System

39

Anomaly Detection

[Figure: chart of activity measures, axis 0 to 90, categories CPU and Process Size; series: normal profile, abnormal; annotation: probable intrusion]

Relatively high false positive rate: anomalies can just be new normal activities

Baseline the normal traffic and then look for things that are out of the norm
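One simple way to baseline a single activity measure is to record its mean and standard deviation during normal operation and flag values that fall far outside that range. The sketch below is an assumption-laden illustration: the three-sigma threshold, the function names, and the CPU readings are all hypothetical.

```python
import statistics


def build_baseline(samples):
    """Summarize normal activity by its mean and sample standard deviation."""
    return statistics.mean(samples), statistics.stdev(samples)


def is_anomalous(value, baseline, n_sigmas=3.0):
    """Flag values more than n_sigmas standard deviations from the baseline mean."""
    mean, stdev = baseline
    return abs(value - mean) > n_sigmas * stdev


# Hypothetical CPU utilization (%) observed during normal operation.
normal_cpu = [20, 22, 19, 21, 23, 20, 18, 22, 21, 20]
baseline = build_baseline(normal_cpu)
```

A reading of 85% is flagged while 22% is not. A legitimate but new workload would be flagged in the same way, which is the false positive problem mentioned above.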

40

ADAM: Audit Data Analysis and Mining

Detecting Intrusion by Data Mining

Combination of association rules and classification rules

First, ADAM collects known frequent datasets (an off-line algorithm)

Second, ADAM runs an online algorithm

– Finds the last frequent connection records

– Compares them with the known mined data

– Discards those that seem to be normal

– Suspicious ones are forwarded to the classifier

The trained classifier then classifies the suspicious data as one of the following

– Known type of attack

– Unknown type of attack

– False alarm

41

ADAM Detecting Intrusion by Data Mining

42

ADAM: Audit Data Analysis and Mining

ADAM has two phases in its model

1st Phase: Train the classifier. Offline process; takes place only once, before the main experiment

2nd Phase: Use the trained classifier. The trained classifier is then used to detect anomalies. Online process


15

Typical Data Mining Architecture

Data Warehouse

Data cleaning amp data integration Filtering

Databases

Database or data warehouse server

Data mining engine

Pattern evaluation

Graphical user interface

Knowledge-base

16

Outline

Intrusion Detection

Data Mining

Data Mining in Intrusion Detection

Reference

17

Network intrusion detection

Number of intrusions on the network is typically a very small fraction of the total network traffic

18

Why Can Data Mining Help

Learn from traffic data Supervised learning learn precise models from past intrusions Unsupervised learning identify suspicious activities

Maintain models on dynamic data

Correlation of suspicious events across network sites Helps detect sophisticated attacks not identifiable by single site analyses

Analysis of long term data (monthsyears) Uncover suspicious stealth activities (eg insiders leakingmodifying

information)

19

Intrusion Detection

Traditional intrusion detection system IDS tools (eg SNORT) are based on signatures of known attacks

LimitationsSignature database has to be manually revised

for each new type of discovered intrusionThey cannot detect emerging cyber threatsSubstantial latency in deployment of newly created

signatures across the computer system

20

Data Mining for Intrusion Detection Techniques and Applications

Frequent pattern mining Classification Clustering Mining data streams

21

Patterns that occur frequently in a database

Mining Frequent patterns ndash finding regularities

Process of Mining Frequent patterns for intrusion de

tection Phase I mine a repository of normal frequent itemsets for a

ttack-free data

Phase II find frequent itemsets in the last n connections an

d compare the patterns to the normal profile

Frequent pattern mining

22

Frequent pattern mining

Apriori bull Any subset of a frequent itemset must be also freque

nt mdash an anti-monotone propertyndash A transaction containing beer diaper nuts alsocontains beer diaperndash beer diaper nuts is frequent beer diaper mustalso be frequent

bull No superset of any infrequent itemset should be generated or tested

ndash Many item combinations can be pruned

23

Sequential Pattern Analysis

Models sequence patterns (Temporal) order is important in many situations

Time-series databases and sequence databases

Frequent patterns (frequent) sequential patterns

Sequential patterns for intrusion detection Capture the signatures for attacks in a series of packets

24

Sequential Pattern Mining

Given a set of sequences find the complete set of frequent subsequences

25

Apriori Property in Sequences

26

Classification A Two-Step Process Model construction describe a set of predetermined

classes Training dataset tuples for model construction

Each tuplesample belongs to a predefined class Classification rules decision trees or math formulae

Model application classify unseen objects Estimate accuracy of the model using an independent test

set Acceptable accuracy apply the model to classify data tu

ples with unknown class labels

27

Classification

28

Classification Decision Tree

A node in the tree a test of some attribute A branch a possible value of the attribute Classification

Start at the root Test the attribute Move down the tree branch

29

Neural classification HIDE

ldquoA hierarchical network intrusion detection system using statistical processing and neural network classificationrdquo by Zheng et al

Five major components Probes collect traffic data Event preprocessor preprocesses traffic data and feeds the

statistical model Statistical processor maintains a model for normal activities

and generates vectors for new events Neural network classifies the vectors of new events Post processor generates reports

30

Clustering

What Is Clustering Group data into clusters

ndash Similar to one another within the same cluster ndash Dissimilar to the objects in other clusters ndash Unsupervised learning no predefined classes

31

Clustering

What Is A Good Clustering High intra-class similarity and low interclasssimilar

ity Depending on the similarity measure

The ability to discover some or all of the hidden patterns

32

Clustering

Clustering Approaches Partitioning algorithms

ndash Partition the objects into k clusters ndash Iteratively reallocate objects to improve the clusterin

g Hierarchy algorithms

ndash Agglomerative each object is a cluster merge clusters to form larger ones

ndash Divisive all objects are in a cluster split it up into smaller clusters

33

Clustering

K-Means Example

34

Mining Data Streams for Intrusion Detection

Maintaining profiles of normal activities The profiles of normal activities may drift

Identifying novel attacks Identifying clusters and outliers in traffic data

streams Reduce the future alarm load by writing

filtering rules that automatically discard well-understood false positives

35

Data Mining for Intrusion Detection

Misuse detection
– Predictive models are built from labeled data sets (instances are labeled as "normal" or "intrusive")
– These models can be more sophisticated and precise than manually created signatures
– Recent research, e.g. JAM (Java Agents for Metalearning)

36

Misuse Detection

[Figure: activities are compared against known intrusion patterns by pattern matching; a match is reported as an intrusion]

Can't detect new attacks

Example: if (src_ip == dst_ip) then "land attack"

Look for known indicators:
– ICMP scans, port scans, connection attempts
– CPU, RAM, I/O utilization
– File system activity, modification of system files, permission modifications
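The land attack rule above can be written as one entry in a small signature table. This is only a sketch: the packet is assumed to be a dictionary with hypothetical src_ip and dst_ip keys, and real signature engines are far richer than a list of predicates.

```python
# Each signature is a name plus a predicate over one packet record.
SIGNATURES = [
    ("land attack", lambda pkt: pkt["src_ip"] == pkt["dst_ip"]),
]


def match_signatures(pkt):
    """Return the names of all known signatures that the packet matches."""
    return [name for name, predicate in SIGNATURES if predicate(pkt)]
```

A packet that matches no entry yields an empty list, which is exactly why this style of detection misses attacks that have no signature yet.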

37

JAM (Java Agents for Metalearning)

JAM (developed at Columbia University) uses data mining techniques to discover patterns of intrusions. It then applies a meta-learning classifier to learn the signatures of attacks.

The association rules algorithm determines relationships between fields in the audit trail records, and the frequent episodes algorithm models sequential patterns of audit events. Features are then extracted from both algorithms and used to compute models of intrusion behavior.

The classifiers build the signatures of attacks. In this way, data mining in JAM builds a misuse detection model.

Classifiers in JAM are generated by running a rule learning program on training data of system usage. After training, the resulting classification rules are used to recognize anomalies and detect known intrusions.

The system has been tested with data from Sendmail-based attacks and with network attacks using TCP dump data.

38

Data Mining for Intrusion Detection

Anomaly detection
– Identifies anomalies as deviations from "normal" behavior
– E.g. ADAM (Audit Data Analysis and Mining), MINDS (MINnesota INtrusion Detection System)

39

Anomaly Detection

[Figure: bar chart of activity measures (CPU, process size, ...) on a 0 to 90 scale, comparing the normal profile with abnormal values that indicate a probable intrusion]

Relatively high false positive rate: anomalies can just be new normal activities

Baseline the normal traffic and then look for things that are out of the norm
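A minimal sketch of "baseline, then look for deviations", assuming a single numeric activity measure and a simple standard-deviation test. The function names and the threshold of three standard deviations are illustrative choices, not part of any system described here.

```python
from statistics import mean, stdev


def build_profile(normal_values):
    """Baseline: summarize attack-free observations of one activity measure."""
    return mean(normal_values), stdev(normal_values)


def is_anomalous(value, profile, threshold=3.0):
    """Flag a value lying more than `threshold` standard deviations from the baseline."""
    mu, sigma = profile
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > threshold
```

A legitimate but previously unseen workload would also exceed the threshold, which is the false positive problem noted on the slide.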

40

ADAM: Audit Data Analysis and Mining

Detecting intrusion by data mining: a combination of association rules and classification rules

First, ADAM builds a repository of known frequent itemsets with an off-line algorithm

Second, ADAM runs an online algorithm:
– Finds the frequent itemsets in the latest connection records
– Compares them with the known mined data
– Discards those that seem to be normal
– Forwards the suspicious ones to the classifier

The trained classifier then classifies the suspicious data as one of the following:
– Known type of attack
– Unknown type of attack
– False alarm
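The compare-and-discard step can be sketched as a set difference between what is frequent now and what was frequent in attack-free data. This is a heavy simplification under stated assumptions: each connection is reduced to one hashable tuple such as (destination port, flag), whereas ADAM mines multi-attribute itemsets; the function names and the support threshold are hypothetical.

```python
from collections import Counter


def frequent_items(connections, min_support):
    """Items whose share of the connection records reaches min_support."""
    counts = Counter(connections)
    total = len(connections)
    return {item for item, n in counts.items() if n / total >= min_support}


def suspicious_items(normal_profile, recent_connections, min_support):
    """Frequent in the recent window but absent from the normal profile."""
    return frequent_items(recent_connections, min_support) - normal_profile
```

Everything returned by suspicious_items would then be handed to the trained classifier, which decides between known attack, unknown attack and false alarm.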

41

ADAM Detecting Intrusion by Data Mining

42

ADAM: Audit Data Analysis and Mining

ADAM has two phases in its model

1st phase: train the classifier
– Offline process
– Takes place only once, before the main experiment

2nd phase: use the trained classifier
– The trained classifier is used to detect anomalies
– Online process

43

The MINDS Project

MINDS: MINnesota INtrusion Detection System
– Learning from rare class: building rare class prediction models
– Anomaly/outlier detection
– Summarization of attacks using association pattern analysis

TID   Items
1     Bread, Coke, Milk
2     Beer, Bread
3     Beer, Coke, Diaper, Milk
4     Beer, Bread, Diaper, Milk
5     Coke, Diaper, Milk

Rules discovered:
{Milk} --> {Coke}
{Diaper, Milk} --> {Beer}
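The two rules can be checked against the five transactions by counting. The sketch below uses the usual definitions: support is the share of transactions containing every item of the rule, and confidence is the share of transactions containing the left-hand side that also contain the right-hand side. The function names are illustrative.

```python
TRANSACTIONS = [
    {"Bread", "Coke", "Milk"},
    {"Beer", "Bread"},
    {"Beer", "Coke", "Diaper", "Milk"},
    {"Beer", "Bread", "Diaper", "Milk"},
    {"Coke", "Diaper", "Milk"},
]


def count(itemset, transactions):
    """Number of transactions that contain every item of the itemset."""
    return sum(1 for t in transactions if itemset <= t)


def support(lhs, rhs, transactions):
    """Share of all transactions that contain both sides of the rule."""
    return count(lhs | rhs, transactions) / len(transactions)


def confidence(lhs, rhs, transactions):
    """Share of the transactions containing lhs that also contain rhs."""
    return count(lhs | rhs, transactions) / count(lhs, transactions)
```

For {Milk} --> {Coke}, Milk appears in four transactions and three of those also contain Coke, so the confidence is 3/4 with support 3/5. For {Diaper, Milk} --> {Beer}, the counts are three and two, giving confidence 2/3 with support 2/5.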

44

MINDS - Learning from Rare Class

Problem: building models for rare network attacks (mining a needle in a haystack)

Standard data mining models are not suitable for rare classes

Models must be able to handle skewed class distributions

Learning from data streams - intrusions are sequences of events

45

MINDS - Anomaly Detection

Detect novel attacks/intrusions by identifying them as deviations from "normal", i.e. anomalous, behavior
– Identify normal behavior
– Construct a useful set of features
– Define a similarity function
– Use an outlier detection algorithm:
  – Nearest neighbor approach
  – Density based schemes
  – Unsupervised Support Vector Machines (SVM)
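As an illustration of the nearest neighbor approach, the sketch below scores every record by the distance to its k-th nearest neighbor, so isolated records get high scores. It assumes numeric feature vectors and Euclidean distance as the similarity function; it is a generic example and is not claimed to be the scoring MINDS itself uses.

```python
import math


def knn_outlier_scores(data, k=2):
    """Score each point by the distance to its k-th nearest neighbour."""
    scores = []
    for i, p in enumerate(data):
        distances = sorted(math.dist(p, q) for j, q in enumerate(data) if j != i)
        scores.append(distances[k - 1])
    return scores
```

Ranking records by this score and inspecting the top of the list is one simple way to produce the "ranked connections" that later analysis steps consume.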

46

Experimental Evaluation

[Figure: MINDS processing pipeline. Net-flow data is collected from the network using Cisco routers, preprocessed, and passed to MINDS anomaly detection; the resulting anomaly scores feed association pattern analysis. The figure also labels Snort, an open source signature-based network IDS (www.snort.org). Annotations: 10-minute cycle, 2 million connections; anomaly detection is applied 4 times a day on a 10-minute time window]

• Publicly available data set: DARPA 1998 Intrusion Detection Evaluation Data Set, prepared and managed by MIT Lincoln Lab; includes a wide variety of intrusions simulated in a military network environment

• Real network data from the University of Minnesota

47

MINDS - Framework for Mining Associations

[Figure: MINDS association analysis module. The anomaly detection system outputs ranked connections, labeled attack or normal. A discriminating association pattern generator turns them into rules (R1: TCP, DstPort=1863 → Attack; …; R100: TCP, DstPort=80 → Normal) that update the knowledge base]

1. Build normal profile
2. Study changes in normal behavior
3. Create attack summary
4. Detect misuse behavior
5. Understand nature of the attack

48

Discovered Real-life Association Patterns

At first glance, Rule 1 appears to describe a Web scan. Rule 2 indicates an attack on a specific machine. Both rules together indicate that a scan is performed first, followed by an attack on a specific machine identified as vulnerable by the attacker.

Rule 1: SrcIP=XXXX, DstPort=80, Protocol=TCP, Flag=SYN, NoPackets: 3, NoBytes: 120…180 (c1=256, c2=1)

Rule 2: SrcIP=XXXX, DstIP=YYYY, DstPort=80, Protocol=TCP, Flag=SYN, NoPackets: 3, NoBytes: 120…180 (c1=177, c2=0)

49

DstIP=ZZZZ, DstPort=8888, Protocol=TCP (c1=369, c2=0)
DstIP=ZZZZ, DstPort=8888, Protocol=TCP, Flag=SYN (c1=291, c2=0)

This pattern indicates an anomalously high number of TCP connections on port 8888 involving machine ZZZZ

Follow-up analysis of connections covered by the pattern indicates that this could be a machine running a variation of the Kazaa file-sharing protocol

Having an unauthorized application increases the vulnerability of the system

Discovered Real-life Association Patterns

50

SrcIP=XXXX, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=4, NoBytes=189…200 (c1=582, c2=2)

SrcIP=XXXX, DstPort=12345, NoPackets=4, NoBytes=189…200 (c1=580, c2=3)

SrcIP=YYYY, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=3, NoBytes=144 (c1=694, c2=3)

…

This pattern indicates a large number of scans on ports 27374 (which is a signature for the SubSeven worm) and 12345 (which is a signature for the NetBus worm)

Further analysis showed that no fewer than five machines were scanning for one or both of these ports in any time window

Discovered Real-life Association Patterns… (ctd)

51

DstPort=6667, Protocol=TCP (c1=254, c2=1)

This pattern indicates an unusually large number of connections on port 6667 detected by the anomaly detector

Port 6667 is where IRC (Internet Relay Chat) is typically run

Further analysis reveals that there are many small packets from/to various IRC servers around the world

Although IRC traffic is not unusual, the fact that it is flagged as anomalous is interesting

This might indicate that the IRC server has been taken down (by a DoS attack, for example) or that it is a rogue IRC server (it could be involved in some hacking activity)

Discovered Real-life Association Patterns… (ctd)

52

DstPort=1863, Protocol=TCP, Flag=0, NoPackets=1, NoBytes<139 (c1=498, c2=6)

DstPort=1863, Protocol=TCP, Flag=0 (c1=587, c2=6)

DstPort=1863, Protocol=TCP (c1=606, c2=8)

This pattern indicates a large number of anomalous TCP connections on port 1863

Further analysis reveals that the remote IP block is owned by Hotmail

Flag=0 is unusual for TCP traffic

Discovered Real-life Association Patterns… (ctd)

53

MINDS conclusion
– Data mining based algorithms are capable of detecting intrusions that cannot be detected by state-of-the-art signature based methods
– SNORT has static knowledge, manually updated by human analysts
– MINDS anomaly detection algorithms are adaptive in nature
– MINDS anomaly detection algorithms can also be effective in detecting anomalous behavior originating from a compromised or infected machine

MINDS research
– Defining normal behavior
– Feature extraction
– Similarity functions
– Outlier detection
– Result summarization
– Detection of attacks originating from multiple sites

Worm/virus detection after infection

Insider attack, policy violation

Outsider attack, network intrusion

54

IDS Using both Misuse and Anomaly Detection: RIDS-100

RIDS (Rising Intrusion Detection System) is provided by Rising Tech, a leader in antivirus and content security software and services in China

The company is a leading provider of client, gateway and server security solutions for virus protection, firewall and intrusion detection technologies, and security services to enterprises and service providers around China

RIDS makes use of both intrusion detection techniques: misuse detection and anomaly detection

A distance based outlier detection algorithm is used to detect deviating behavior in the collected network data

For misuse detection, it has a very large set of collected data patterns that can be matched against scanned network data

This large set of data patterns is scanned using a data mining classification algorithm, the decision tree

http://www.rising-global.com

55

A cooperative anomaly and intrusion detection system (CAIDS)

Built with a network-based intrusion detection system (NIDS) and an anomaly detection system (ADS) operating interactively through a signature generator

56

A cooperative anomaly and intrusion detection system (CAIDS)

A frequent episode rule (FER) is generated out of a collection of frequent episodes. The FER is defined over episode sequences with multiple connection events

For example, we envision a window where we observe a 3-event sequence: E, D and F. An FER is generated as

E → D, F with confidence level c = freq(a ∪ b) / freq(a) = 0.8

where a represents the event E on the LHS and b corresponds to the two events D and F on the RHS of the rule

If the event a occurs with a frequency of 5% and the joint event a and b occurs with a frequency of 4%, there is a (0.04/0.05) = 80% chance that D and F will follow in the same window
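The confidence and support of a frequent episode rule can be estimated by counting windows. The sketch below takes confidence as the share of windows containing the LHS episode that also contain the full episode in order, and support as the share of all windows containing the full episode. It assumes each window is already given as an ordered tuple of event labels; the function names are hypothetical.

```python
def contains_in_order(window, episode):
    """True if the events of `episode` occur in `window` in the same order."""
    remaining = iter(window)
    # `event in remaining` advances the iterator, so order is enforced.
    return all(event in remaining for event in episode)


def fer_stats(windows, lhs, rhs):
    """Return (confidence, support) of the rule lhs -> rhs over the windows."""
    n_lhs = sum(1 for w in windows if contains_in_order(w, lhs))
    n_joint = sum(1 for w in windows if contains_in_order(w, lhs + rhs))
    confidence = n_joint / n_lhs if n_lhs else 0.0
    support = n_joint / len(windows)
    return confidence, support
```

With ten windows, five of which contain E and four of which contain E followed by D and F, the rule E → D, F comes out with confidence 4/5 and support 4/10.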

57

A cooperative anomaly and intrusion detection system (CAIDS)

In practice, the event E could be an authentication service characterized by two attributes (service = authentication, flag = SF)

The events D, F may be two sequential smtp requests, denoted by (service = smtp)

Thus we can derive an FER with a confidence level of c = 80% that two smtp services will follow the authentication service within a window w = 2 sec. The three joint traffic events account for a support level s = 10% out of all the network connections being evaluated. This FER is formally stated as follows:

(service = authentication) → (service = smtp), (service = smtp) (0.8, 0.1, 2 sec) (1)

58

A cooperative anomaly and intrusion detection system (CAIDS)

An association rule is aimed at finding interesting intra-relationships inside a single connection record

In general, an FER is specified by the following expression:

L1, L2, …, Ln → R1, …, Rm (c, s, window) (2)

Li (1 ≤ i ≤ n) and Rj (1 ≤ j ≤ m) are ordered traffic connection events

We call L1, L2, …, Ln the LHS episode and R1, …, Rm the RHS of the episode rule

59

A cooperative anomaly and intrusion detection system (CAIDS)

Architecture of the CAIDS simulator, built with a 2000-signature Snort and an anomaly detection subsystem (ADS) with 60 FERs after 2 weeks of rule training over the Lincoln Lab IDS evaluation dataset

60

Conclusion

In this report we have studied the basic concepts and some classic system models in this area, such as ADAM and MINDS

We have summarized those system models, their technologies and their validation methods

We hope to give an overview of current developments in this area and of how data mining is moving into the field of network intrusion detection

61

Reference

DARPA 1998 data set. A cleansed set is in KDD Cup'99. DARPA 1991 data set is also available: http://www.ll.mit.edu/IST/ideval/data/data_index.html

Daniel Barbara, Julia Couto, Sushil Jajodia, Leonard Popyack, Ningning Wu. "ADAM: Detecting Intrusions by Data Mining". Proceedings of the 2001 IEEE Workshop on Information Assurance and Security, United States Military Academy, West Point, NY, 5-6 June 2001

Zhang, J. and Zulkernine, M. 2006. A Hybrid Network Intrusion Detection Technique Using Random Forests. In Proceedings of the First International Conference on Availability, Reliability and Security (April 20-22, 2006)

W. Lee et al. A data mining framework for building intrusion detection models. In Information and System Security, Vol. 3, No. 4, 2000

Ertoz, L. et al. MINDS - Minnesota Intrusion Detection System. Next Generation Data Mining, Chapter 3, 2004

Lu, C.-T., Boedihardjo, A. P., Manalwar, P. Exploiting efficient data mining techniques to enhance intrusion detection systems. IEEE International Conference on Information Reuse and Integration (IRI-2005), 15-17 Aug. 2005, pp. 512-517

Sal Stolfo, Andreas Prodromidis, Shelley Tselepis, Wenke Lee, Dave Fan and Phil Chan. (Honorable mention (runner-up) for Best Paper Award in Applied Research Category.) In Proceedings of the Third International Conference on Knowledge Discovery and Data Mining (KDD 97), Newport Beach, CA, August 1997

62

Questions & Comments

16

Outline

Intrusion Detection

Data Mining

Data Mining in Intrusion Detection

Reference

17

Network intrusion detection

The number of intrusions on the network is typically a very small fraction of the total network traffic

18

Why Can Data Mining Help?

Learn from traffic data
– Supervised learning: learn precise models from past intrusions
– Unsupervised learning: identify suspicious activities

Maintain models on dynamic data

Correlation of suspicious events across network sites
– Helps detect sophisticated attacks not identifiable by single site analyses

Analysis of long term data (months/years)
– Uncover suspicious stealth activities (e.g. insiders leaking/modifying information)

19

Intrusion Detection

Traditional intrusion detection system (IDS) tools (e.g. SNORT) are based on signatures of known attacks

Limitations
– Signature database has to be manually revised for each new type of discovered intrusion
– They cannot detect emerging cyber threats
– Substantial latency in deployment of newly created signatures across the computer system

20

Data Mining for Intrusion Detection: Techniques and Applications

– Frequent pattern mining
– Classification
– Clustering
– Mining data streams

21

Frequent pattern mining

Patterns that occur frequently in a database

Mining frequent patterns: finding regularities

Process of mining frequent patterns for intrusion detection
– Phase I: mine a repository of normal frequent itemsets for attack-free data
– Phase II: find frequent itemsets in the last n connections and compare the patterns to the normal profile

22

Frequent pattern mining

Apriori
• Any subset of a frequent itemset must also be frequent (an anti-monotone property)
  – A transaction containing {beer, diaper, nuts} also contains {beer, diaper}
  – If {beer, diaper, nuts} is frequent, {beer, diaper} must also be frequent
• No superset of any infrequent itemset should be generated or tested
  – Many item combinations can be pruned
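The pruning rule above drives the level-wise Apriori search. The sketch below is a compact, unoptimized version that assumes transactions are given as Python sets and that the threshold is an absolute count; the function name and parameters are illustrative.

```python
from itertools import combinations


def apriori(transactions, min_count):
    """Return {itemset: count} for itemsets found in at least min_count transactions."""
    items = {item for t in transactions for item in t}
    candidates = [frozenset([item]) for item in items]
    frequent = {}
    k = 1
    while candidates:
        # Count each candidate and keep the frequent ones at this level.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_count}
        frequent.update(level)
        k += 1
        # Join step: unions of two frequent (k-1)-itemsets that have exactly k items.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: drop a candidate if any (k-1)-subset is not frequent.
        candidates = [
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        ]
    return frequent
```

Because a candidate is discarded as soon as one of its subsets is infrequent, most item combinations are never counted against the data at all.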

23

Sequential Pattern Analysis

Models sequence patterns
– (Temporal) order is important in many situations
– Time-series databases and sequence databases
– Frequent patterns → (frequent) sequential patterns

Sequential patterns for intrusion detection
– Capture the signatures for attacks in a series of packets

24

Sequential Pattern Mining

Given a set of sequences find the complete set of frequent subsequences
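To make the problem statement concrete, the sketch below finds all frequent subsequences of length two by brute force. It is deliberately naive, an assumption-laden illustration rather than a real sequential pattern miner, and the function names are hypothetical.

```python
from itertools import product


def is_subsequence(pattern, sequence):
    """True if the items of `pattern` appear in `sequence` in order, gaps allowed."""
    remaining = iter(sequence)
    return all(item in remaining for item in pattern)


def frequent_pairs(sequences, min_count):
    """All length-2 subsequences supported by at least min_count sequences."""
    alphabet = sorted({item for s in sequences for item in s})
    result = {}
    for pair in product(alphabet, repeat=2):
        n = sum(1 for s in sequences if is_subsequence(pair, s))
        if n >= min_count:
            result[pair] = n
    return result
```

Unlike itemsets, the pairs (a, b) and (b, a) are different patterns here, because order matters.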

25

Apriori Property in Sequences

26

Classification: A Two-Step Process

Model construction: describe a set of predetermined classes
– Training dataset: tuples for model construction
– Each tuple/sample belongs to a predefined class
– Classification rules, decision trees or math formulae

Model application: classify unseen objects
– Estimate accuracy of the model using an independent test set
– If the accuracy is acceptable, apply the model to classify data tuples with unknown class labels


17

Network intrusion detection

Number of intrusions on the network is typically a very small fraction of the total network traffic

18

Why Can Data Mining Help

Learn from traffic data Supervised learning learn precise models from past intrusions Unsupervised learning identify suspicious activities

Maintain models on dynamic data

Correlation of suspicious events across network sites Helps detect sophisticated attacks not identifiable by single site analyses

Analysis of long term data (monthsyears) Uncover suspicious stealth activities (eg insiders leakingmodifying

information)

19

Intrusion Detection

Traditional intrusion detection system (IDS) tools (e.g., SNORT) are based on signatures of known attacks

Limitations:
The signature database has to be manually revised for each new type of discovered intrusion
They cannot detect emerging cyber threats
Substantial latency in deployment of newly created signatures across the computer system

20

Data Mining for Intrusion Detection: Techniques and Applications

Frequent pattern mining
Classification
Clustering
Mining data streams

21

Frequent pattern mining

Patterns that occur frequently in a database

Mining frequent patterns: finding regularities

Process of mining frequent patterns for intrusion detection:
Phase I: mine a repository of normal frequent itemsets from attack-free data
Phase II: find frequent itemsets in the last n connections and compare the patterns to the normal profile
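
The two phases above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the connection records, the attribute strings such as "dport=23", the 0.5 support threshold, and the helper name frequent_itemsets are all invented here, and the counting is a plain enumeration of itemsets of size 1 and 2 rather than any particular system's miner.

```python
from collections import Counter
from itertools import combinations

def frequent_itemsets(records, min_support, max_size=2):
    """Return itemsets (as frozensets) whose support is at least min_support."""
    counts = Counter()
    for record in records:
        items = sorted(set(record))
        for size in range(1, max_size + 1):
            for combo in combinations(items, size):
                counts[frozenset(combo)] += 1
    threshold = min_support * len(records)
    return {itemset for itemset, n in counts.items() if n >= threshold}

# Phase I: profile mined from attack-free traffic (toy data, invented).
attack_free = [
    ("proto=tcp", "dport=80", "flag=SF"),
    ("proto=tcp", "dport=80", "flag=SF"),
    ("proto=udp", "dport=53", "flag=SF"),
    ("proto=tcp", "dport=443", "flag=SF"),
]
normal_profile = frequent_itemsets(attack_free, min_support=0.5)

# Phase II: itemsets frequent in the last n connections but absent from the profile.
recent = [
    ("proto=tcp", "dport=23", "flag=S0"),
    ("proto=tcp", "dport=23", "flag=S0"),
    ("proto=tcp", "dport=80", "flag=SF"),
]
suspicious = frequent_itemsets(recent, min_support=0.5) - normal_profile
```

In this toy run the itemset {dport=23, flag=S0} is frequent in the recent window but not in the normal profile, so it ends up in suspicious, while {proto=tcp} is filtered out because it is also frequent in attack-free traffic.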

22

Frequent pattern mining

Apriori
• Any subset of a frequent itemset must also be frequent (an anti-monotone property)
– A transaction containing {beer, diaper, nuts} also contains {beer, diaper}
– If {beer, diaper, nuts} is frequent, {beer, diaper} must also be frequent
• No superset of any infrequent itemset should be generated or tested
– Many item combinations can be pruned
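
A minimal sketch of level-wise Apriori mining, showing where the anti-monotone property prunes candidates. The function name apriori, the min_count parameter, and the basket data are assumptions made for this example; it is a textbook-style illustration, not an optimized implementation.

```python
from itertools import combinations

def apriori(transactions, min_count):
    """Return {itemset: count} for every itemset found in at least min_count transactions."""
    transactions = [frozenset(t) for t in transactions]
    items = {item for t in transactions for item in t}
    candidates = {frozenset([item]) for item in items}
    frequent = {}
    k = 1
    while candidates:
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_count}
        frequent.update(level)
        k += 1
        # Join frequent (k-1)-itemsets into k-itemset candidates ...
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # ... and prune any candidate that has an infrequent (k-1)-subset.
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent

# Toy baskets, invented for illustration.
baskets = [
    {"beer", "diaper", "nuts"},
    {"beer", "diaper"},
    {"beer", "diaper", "nuts"},
    {"beer", "milk"},
]
result = apriori(baskets, min_count=2)
```

Because "milk" occurs only once, no itemset containing it is ever generated as a candidate, which is the pruning the slide describes.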

23

Sequential Pattern Analysis

Models sequence patterns
(Temporal) order is important in many situations

Time-series databases and sequence databases

Frequent patterns → (frequent) sequential patterns

Sequential patterns for intrusion detection
Capture the signatures for attacks in a series of packets

24

Sequential Pattern Mining

Given a set of sequences, find the complete set of frequent subsequences
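
As a rough illustration of the problem statement, the sketch below enumerates every order-preserving subsequence up to a fixed length and keeps those supported by enough sequences. It is a brute-force enumeration chosen for clarity, not an Apriori-style sequence miner; the event names, the max_len cap, and the function name are assumptions.

```python
from itertools import combinations

def frequent_subsequences(sequences, min_count, max_len=3):
    """Count order-preserving (not necessarily contiguous) subsequences."""
    support = {}
    for seq in sequences:
        seen = set()
        for length in range(1, max_len + 1):
            for idx in combinations(range(len(seq)), length):
                seen.add(tuple(seq[i] for i in idx))
        for sub in seen:  # each sequence supports a pattern at most once
            support[sub] = support.get(sub, 0) + 1
    return {sub: n for sub, n in support.items() if n >= min_count}

# Toy event sequences, invented for illustration.
sessions = [
    ["syn", "login_fail", "login_fail", "login_ok"],
    ["syn", "login_fail", "login_ok"],
    ["syn", "login_ok"],
]
patterns = frequent_subsequences(sessions, min_count=2)
```

Here the subsequence (syn, login_fail, login_ok) is supported by two of the three sessions, while (login_fail, login_fail) appears in only one and is dropped.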

25

Apriori Property in Sequences

26

Classification: A Two-Step Process

Model construction: describe a set of predetermined classes
Training dataset: tuples for model construction
Each tuple/sample belongs to a predefined class
Classification rules, decision trees, or math formulae

Model application: classify unseen objects
Estimate accuracy of the model using an independent test set
Acceptable accuracy: apply the model to classify data tuples with unknown class labels
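
The two steps can be illustrated with a deliberately simple model. In this sketch the "model" is just the majority class seen for each (protocol, flag) pair; the feature choice, the labels, the default class, and all data are assumptions made for the example.

```python
from collections import Counter, defaultdict

def train(rows):
    """Step 1: learn one rule per feature tuple, namely its majority class."""
    votes = defaultdict(Counter)
    for features, label in rows:
        votes[features][label] += 1
    return {features: counts.most_common(1)[0][0] for features, counts in votes.items()}

def predict(model, features, default="normal"):
    """Step 2: classify an unseen tuple; unknown feature tuples get the default class."""
    return model.get(features, default)

def accuracy(model, rows):
    """Fraction of an independent test set that the model labels correctly."""
    return sum(predict(model, f) == label for f, label in rows) / len(rows)

# Toy labeled data, invented for illustration.
training = [
    (("tcp", "SF"), "normal"),
    (("tcp", "SF"), "normal"),
    (("tcp", "S0"), "intrusive"),
    (("tcp", "S0"), "intrusive"),
    (("icmp", "SF"), "normal"),
]
test_set = [
    (("tcp", "SF"), "normal"),
    (("tcp", "S0"), "intrusive"),
    (("udp", "SF"), "normal"),
    (("icmp", "SF"), "intrusive"),
]
model = train(training)
test_accuracy = accuracy(model, test_set)
```

On this toy split the model gets three of the four test tuples right, so test_accuracy is 0.75; whether that is acceptable is the judgment the second step calls for.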

27

Classification

28

Classification: Decision Tree

A node in the tree: a test of some attribute
A branch: a possible value of the attribute
Classification:
Start at the root
Test the attribute
Move down the tree branch
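
The traversal described above can be sketched as follows. The tree itself is hand-written for illustration (the attributes, branch values, and class labels are assumptions, loosely modeled on the attack categories used earlier in the deck); a real tree would be learned from training data.

```python
# Internal nodes are (attribute, {value: subtree}); leaves are class labels.
tree = ("protocol", {
    "icmp": "probe",
    "udp": "normal",
    "tcp": ("flag", {
        "SF": "normal",
        "S0": ("dst_port", {"80": "dos", "23": "r2l"}),
    }),
})

def classify(node, record, default="unknown"):
    """Start at the root, test the attribute, and follow the matching branch."""
    while isinstance(node, tuple):
        attribute, branches = node
        node = branches.get(record.get(attribute), default)
    return node

label = classify(tree, {"protocol": "tcp", "flag": "S0", "dst_port": "80"})
```

A record whose attribute value has no branch falls through to the default label, which is one simple way to handle values never seen during training.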

29

Neural classification: HIDE

“A hierarchical network intrusion detection system using statistical processing and neural network classification” by Zheng et al.

Five major components:
Probes: collect traffic data
Event preprocessor: preprocesses traffic data and feeds the statistical model
Statistical processor: maintains a model for normal activities and generates vectors for new events
Neural network: classifies the vectors of new events
Post processor: generates reports

30

Clustering

What Is Clustering?
Group data into clusters
– Similar to one another within the same cluster
– Dissimilar to the objects in other clusters
– Unsupervised learning: no predefined classes

31

Clustering

What Is a Good Clustering?
High intra-class similarity and low inter-class similarity
Depending on the similarity measure
The ability to discover some or all of the hidden patterns

32

Clustering

Clustering Approaches

Partitioning algorithms
– Partition the objects into k clusters
– Iteratively reallocate objects to improve the clustering

Hierarchy algorithms
– Agglomerative: each object is a cluster; merge clusters to form larger ones
– Divisive: all objects are in a cluster; split it up into smaller clusters

33

Clustering

K-Means Example
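
The original slide showed a figure here. As a stand-in, the following sketch runs k-means (Lloyd's algorithm) on a handful of 2-D points. The points, the initial centroids, and the function name are invented for illustration, and the initial centroids are passed in explicitly so the run is deterministic.

```python
import math

def kmeans(points, centroids, max_iter=100):
    """Assign each point to its nearest centroid, then recompute centroids until stable."""
    centroids = [tuple(c) for c in centroids]
    for _ in range(max_iter):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        updated = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if updated == centroids:  # no centroid moved: converged
            break
        centroids = updated
    return centroids, clusters

points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
centroids, clusters = kmeans(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
```

With these inputs the three points near (1, 1) form one cluster and the three near (8, 8) form the other. This is the partitioning approach from the previous slide: objects are iteratively reallocated until the clustering stops improving.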

34

Mining Data Streams for Intrusion Detection

Maintaining profiles of normal activities
The profiles of normal activities may drift

Identifying novel attacks
Identifying clusters and outliers in traffic data streams

Reduce the future alarm load by writing filtering rules that automatically discard well-understood false positives

35

Data Mining for Intrusion Detection

Misuse detection
Predictive models are built from labeled data sets (instances are labeled as “normal” or “intrusive”)
These models can be more sophisticated and precise than manually created signatures
Recent research, e.g., JAM (Java Agents for Metalearning)

36

Misuse Detection

[Figure: activities are compared with known intrusion patterns by pattern matching to flag an intrusion]

Can’t detect new attacks

Example: if (src_ip == dst_ip) then “land attack”

Look for known indicators: ICMP scans, port scans, connection attempts; CPU, RAM, I/O utilization; file system activity, modification of system files, permission modifications
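
The land-attack example can be written as a tiny signature matcher. The first rule comes from the slide; the second rule, the field names (src_ip, dst_ip, dst_port, flag), and the list name SIGNATURES are assumptions added only to show how several signatures would sit side by side.

```python
# Each signature pairs a name with a predicate over one connection record.
SIGNATURES = [
    ("land attack", lambda c: c["src_ip"] == c["dst_ip"]),
    # Hypothetical second rule, not from the slides.
    ("telnet connection attempt", lambda c: c["dst_port"] == 23 and c["flag"] == "S0"),
]

def match_signatures(connection):
    """Return the names of all signatures that the connection matches."""
    return [name for name, rule in SIGNATURES if rule(connection)]

alerts = match_signatures(
    {"src_ip": "10.0.0.5", "dst_ip": "10.0.0.5", "dst_port": 80, "flag": "SF"}
)
```

A connection that matches no rule yields an empty list, which is exactly the limitation noted above: an attack with no signature goes unreported.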

37

JAM (Java Agents for Metalearning)

JAM (developed at Columbia University) uses data mining techniques to discover patterns of intrusions. It then applies a meta-learning classifier to learn the signature of attacks.

The association rules algorithm determines relationships between fields in the audit trail records, and the frequent episodes algorithm models sequential patterns of audit events. Features are then extracted from both algorithms and used to compute models of intrusion behavior.

The classifiers build the signature of attacks. Thus, data mining in JAM builds a misuse detection model.

Classifiers in JAM are generated by running a rule learning program on training data of system usage. After training, the resulting classification rules are used to recognize anomalies and detect known intrusions.

The system has been tested with data from Sendmail-based attacks and with network attacks using tcpdump data.

38

Data Mining for Intrusion Detection

Anomaly detection
Identifies anomalies as deviations from “normal” behavior
E.g., ADAM (Audit Data Analysis and Mining), MINDS (MINnesota INtrusion Detection System)

39

Anomaly Detection

[Figure: bar chart of activity measures (CPU, process size, ...) on a 0 to 90 scale, comparing the normal profile with abnormal values that indicate a probable intrusion]

Relatively high false positive rate: anomalies can just be new normal activities

Baseline the normal traffic and then look for things that are out of the norm
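
One simple way to "baseline and look for deviations" is a per-measure mean and standard deviation with a z-score threshold. This is a sketch under assumptions: the measure names, sample values, and the threshold of 3 standard deviations are invented, and real systems use richer profiles than this.

```python
from statistics import mean, stdev

def build_profile(samples):
    """Summarize normal behavior by the mean and standard deviation of each measure."""
    return {name: (mean(values), stdev(values)) for name, values in samples.items()}

def anomalous_measures(profile, observation, threshold=3.0):
    """Return the measures lying more than `threshold` standard deviations from the baseline."""
    flagged = []
    for name, value in observation.items():
        mu, sigma = profile[name]
        if sigma > 0 and abs(value - mu) / sigma > threshold:
            flagged.append(name)
    return flagged

# Toy baseline measurements, invented for illustration.
normal = {"cpu": [20, 22, 19, 21, 23], "process_size": [50, 52, 49, 51, 48]}
profile = build_profile(normal)
flagged = anomalous_measures(profile, {"cpu": 85, "process_size": 51})
```

Here only the CPU reading is flagged. A legitimate but new workload would be flagged in the same way, which is the source of the high false positive rate mentioned above.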

40

ADAM: Audit Data Analysis and Mining

Detecting intrusion by data mining
Combination of association rules and classification rules
First, ADAM collects known frequent datasets with an off-line algorithm
Second, ADAM runs an online algorithm:
Finds the latest frequent connection records
Compares them with the known mined data
Discards those that seem to be normal
Forwards the suspicious ones to the classifier
The trained classifier then classifies the suspicious data as one of the following:
Known type of attack
Unknown type of attack
False alarm
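
The online stage can be pictured as a filter followed by a classifier. The sketch below is only a schematic of that flow: the itemsets are invented, and toy_classifier is a hand-written stand-in for the trained classifier, so none of its rules should be read as ADAM's actual model.

```python
# Itemsets assumed to have been mined off-line from attack-free data (invented).
normal_itemsets = {
    frozenset({"dport=80", "flag=SF"}),
    frozenset({"dport=53", "proto=udp"}),
}

def toy_classifier(itemset):
    """Stand-in for the trained classifier; these rules are made up for the example."""
    if "flag=S0" in itemset and "dport=80" in itemset:
        return "known attack"
    if "dport=8080" in itemset:
        return "false alarm"
    return "unknown attack"

def adam_online(window_itemsets, normal_itemsets, classify):
    """Discard itemsets that look normal, then classify the suspicious remainder."""
    suspicious = [s for s in window_itemsets if s not in normal_itemsets]
    return {s: classify(s) for s in suspicious}

# Frequent itemsets from the most recent window of connections (invented).
window = [
    frozenset({"dport=80", "flag=SF"}),
    frozenset({"dport=80", "flag=S0"}),
    frozenset({"dport=8080", "flag=SF"}),
    frozenset({"dport=31337", "flag=S0"}),
]
verdicts = adam_online(window, normal_itemsets, toy_classifier)
```

The first itemset is dropped because it matches the normal profile; the remaining three each receive one of the three labels listed above.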

41

ADAM Detecting Intrusion by Data Mining

42

ADAM: Audit Data Analysis and Mining

ADAM has two phases in its model

1st phase: train the classifier
Offline process
Takes place only once, before the main experiment

2nd phase: use the trained classifier
The trained classifier is then used to detect anomalies
Online process

43

The MINDS Project

MINDS – MINnesota INtrusion Detection System

Learning from rare class – building rare class prediction models

Anomaly/outlier detection

Summarization of attacks using association pattern analysis

TID   Items
1     Bread, Coke, Milk
2     Beer, Bread
3     Beer, Coke, Diaper, Milk
4     Beer, Bread, Diaper, Milk
5     Coke, Diaper, Milk

Rules discovered:
{Milk} --> {Coke}
{Diaper, Milk} --> {Beer}
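
The support and confidence of the two rules can be checked directly against the five transactions in the table. The helper name rule_metrics is an assumption; the definitions used are the standard ones (support = fraction of transactions containing both sides, confidence = that count divided by the number containing the left-hand side).

```python
transactions = {
    1: {"Bread", "Coke", "Milk"},
    2: {"Beer", "Bread"},
    3: {"Beer", "Coke", "Diaper", "Milk"},
    4: {"Beer", "Bread", "Diaper", "Milk"},
    5: {"Coke", "Diaper", "Milk"},
}

def rule_metrics(lhs, rhs):
    """Return (support, confidence) of the rule lhs --> rhs over the table."""
    lhs, rhs = set(lhs), set(rhs)
    n_lhs = sum(1 for items in transactions.values() if lhs <= items)
    n_both = sum(1 for items in transactions.values() if lhs | rhs <= items)
    return n_both / len(transactions), n_both / n_lhs

milk_coke = rule_metrics({"Milk"}, {"Coke"})
diaper_milk_beer = rule_metrics({"Diaper", "Milk"}, {"Beer"})
```

{Milk} --> {Coke} holds in 3 of 5 transactions and in 3 of the 4 that contain Milk (support 0.6, confidence 0.75); {Diaper, Milk} --> {Beer} holds in 2 of 5 and in 2 of the 3 that contain both Diaper and Milk (support 0.4, confidence about 0.67).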

44

MINDS - Learning from Rare Class

Problem: building models for rare network attacks (mining a needle in a haystack)

Standard data mining models are not suitable for rare classes

Models must be able to handle skewed class distributions

Learning from data streams - intrusions are sequences of events

45

MINDS - Anomaly Detection

Detect novel attacks/intrusions by identifying them as deviations from “normal”, i.e., anomalous behavior
Identify normal behavior
Construct a useful set of features
Define a similarity function
Use an outlier detection algorithm:
Nearest neighbor approach
Density-based schemes
Unsupervised Support Vector Machines (SVM)
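
Of the three outlier detection options, the nearest neighbor approach is the easiest to sketch: score each connection by its distance to its k-th nearest neighbor. The feature vectors, the choice k=2, and the Euclidean distance are assumptions for this example, and this is the plain distance-based variant rather than a reproduction of the MINDS detector.

```python
import math

def knn_outlier_scores(points, k=2):
    """Score each point by its distance to its k-th nearest neighbor (larger = more anomalous)."""
    scores = []
    for i, p in enumerate(points):
        distances = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(distances[k - 1])
    return scores

# Toy feature vectors, e.g. (packets, bytes / 100); values are invented.
connections = [(3, 1.2), (3, 1.5), (4, 1.4), (3, 1.3), (40, 90.0)]
scores = knn_outlier_scores(connections, k=2)
most_anomalous = max(range(len(scores)), key=scores.__getitem__)
```

The last connection sits far from the other four and receives by far the largest score. In practice the features would need scaling so that one attribute does not dominate the distance, which is part of what "define a similarity function" refers to.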

46

Experimental Evaluation

[Figure: evaluation setup. Labels: network; NetFlow data using Cisco routers; data preprocessing; MINDS anomaly detection; anomaly scores; association pattern analysis; open source signature-based network IDS (www.snort.org); 10-minute cycle, 2 million connections; anomaly detection is applied 4 times a day on a 10-minute time window]

• Publicly available data set: DARPA 1998 Intrusion Detection Evaluation Data Set, prepared and managed by MIT Lincoln Lab; includes a wide variety of intrusions simulated in a military network environment

• Real network data from the University of Minnesota

47

MINDS - Framework for Mining Associations

[Figure: MINDS association analysis module. Labels: Anomaly Detection System; ranked connections (attack, normal); Discriminating Association Pattern Generator; rules R1: TCP, DstPort=1863 → Attack … R100: TCP, DstPort=80 → Normal; update; Knowledge Base]

1. Build normal profile
2. Study changes in normal behavior
3. Create attack summary
4. Detect misuse behavior
5. Understand nature of the attack

48

Discovered Real-life Association Patterns

Rule 1: SrcIP=XXXX, DstPort=80, Protocol=TCP, Flag=SYN, NoPackets: 3, NoBytes: 120…180 (c1=256, c2=1)

Rule 2: SrcIP=XXXX, DstIP=YYYY, DstPort=80, Protocol=TCP, Flag=SYN, NoPackets: 3, NoBytes: 120…180 (c1=177, c2=0)

At first glance, Rule 1 appears to describe a Web scan
Rule 2 indicates an attack on a specific machine
Both rules together indicate that a scan is performed first, followed by an attack on a specific machine identified as vulnerable by the attacker

49

Discovered Real-life Association Patterns

DstIP=ZZZZ, DstPort=8888, Protocol=TCP (c1=369, c2=0)
DstIP=ZZZZ, DstPort=8888, Protocol=TCP, Flag=SYN (c1=291, c2=0)

This pattern indicates an anomalously high number of TCP connections on port 8888 involving machine ZZZZ

Follow-up analysis of connections covered by the pattern indicates that this could be a machine running a variation of the Kazaa file-sharing protocol

Having an unauthorized application increases the vulnerability of the system

50

Discovered Real-life Association Patterns… (ctd)

SrcIP=XXXX, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=4, NoBytes=189…200 (c1=582, c2=2)

SrcIP=XXXX, DstPort=12345, NoPackets=4, NoBytes=189…200 (c1=580, c2=3)

SrcIP=YYYY, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=3, NoBytes=144 (c1=694, c2=3)

……

This pattern indicates a large number of scans on ports 27374 (which is a signature for the SubSeven worm) and 12345 (which is a signature for the NetBus worm)

Further analysis showed that no fewer than five machines were scanning for one or both of these ports in any time window

51

Discovered Real-life Association Patterns… (ctd)

DstPort=6667, Protocol=TCP (c1=254, c2=1)

This pattern indicates an unusually large number of connections on port 6667 detected by the anomaly detector

Port 6667 is where IRC (Internet Relay Chat) is typically run

Further analysis reveals that there are many small packets from/to various IRC servers around the world

Although IRC traffic is not unusual, the fact that it is flagged as anomalous is interesting

This might indicate that the IRC server has been taken down (by a DoS attack, for example) or that it is a rogue IRC server (it could be involved in some hacking activity)

52

Discovered Real-life Association Patterns… (ctd)

DstPort=1863, Protocol=TCP, Flag=0, NoPackets=1, NoBytes<139 (c1=498, c2=6)

DstPort=1863, Protocol=TCP, Flag=0 (c1=587, c2=6)

DstPort=1863, Protocol=TCP (c1=606, c2=8)

This pattern indicates a large number of anomalous TCP connections on port 1863

Further analysis reveals that the remote IP block is owned by Hotmail

Flag=0 is unusual for TCP traffic

53

MINDS Conclusion

Data mining based algorithms are capable of detecting intrusions that cannot be detected by state-of-the-art signature-based methods

SNORT has static knowledge manually updated by human analysts

MINDS anomaly detection algorithms are adaptive in nature

MINDS anomaly detection algorithms can also be effective in detecting anomalous behavior originating from a compromised or infected machine

MINDS Research
Defining normal behavior
Feature extraction
Similarity functions
Outlier detection
Result summarization
Detection of attacks originating from multiple sites

Worm/virus detection after infection

Insider attack: policy violation

Outsider attack: network intrusion

54

IDS Using Both Misuse and Anomaly Detection: RIDS-100

RIDS (Rising Intrusion Detection System) is provided by Rising Tech, a leader in antivirus and content security software and services in China.

The company is a leading provider of client, gateway, and server security solutions for virus protection, firewall, and intrusion detection technologies, and of security services to enterprises and service providers around China.

RIDS makes use of both intrusion detection techniques: misuse and anomaly detection.

A distance-based outlier detection algorithm is used to detect deviant behavior in the collected network data.

For misuse detection, it has a very large set of collected data patterns that can be matched against scanned network data.

This large set of data patterns is scanned using a data mining classification algorithm, the decision tree.

http://www.rising-global.com

55

A cooperative anomaly and intrusion detection system (CAIDS)

Built with a network-based intrusion detection system (NIDS) and an anomaly detection system (ADS) operating interactively through a signature generator

56

A cooperative anomaly and intrusion detection system (CAIDS)

A frequent episode rule (FER) is generated out of a collection of frequent episodes. The FER is defined over episode sequences with multiple connection events.

For example, we envision a window where we observe a 3-event sequence: E, D, and F. An FER is generated as E → D, F with confidence level freq(a ∪ b) / freq(b) = 0.8, where a represents the event E on the LHS and b corresponds to the two events D and F on the RHS of the rule.

If b occurs with 5% and the joint event a and b has 4% to occur, there is a (0.04/0.05) = 80% chance that D and F will follow in the same window.

57

A cooperative anomaly and intrusion detection system (CAIDS)

In practice, the event E could be an authentication service characterized by two attributes (service = authentication, flag = SF).

The events D, F may be two sequential smtp requests denoted by (service = smtp).

Thus we can derive an FER with a confidence level of c = 80% that two smtp services will follow the authentication service within a window w = 2 sec. The three joint traffic events account for a support level s = 10% out of all the network connections being evaluated. This FER is formally stated as follows:

(service = authentication) → (service = smtp), (service = smtp) (0.8, 0.1, 2 sec)   (1)

58

A cooperative anomaly and intrusion detection system (CAIDS)

An association rule is aimed at finding interesting intra-relationships inside a single connection record.

In general, an FER is specified by the following expression:

L1, L2, …, Ln → R1, …, Rm (c, s, window)   (2)

Li (1 ≤ i ≤ n) and Rj (1 ≤ j ≤ m) are ordered traffic connection events.

We call L1, L2, …, Ln the LHS episode and R1, …, Rm the RHS of the episode rule.
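
To make the (c, s, window) notation concrete, the sketch below stores an FER as a small record and checks whether its LHS followed by its RHS occurs, in order, within the rule's window. The class name EpisodeRule, the function episode_occurs, and the event timestamps are assumptions for illustration; only the example rule's values (0.8, 0.1, 2 sec) come from the slides.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class EpisodeRule:
    lhs: tuple          # ordered events L1..Ln
    rhs: tuple          # ordered events R1..Rm
    confidence: float   # c
    support: float      # s
    window: float       # window length in seconds

# The authentication/smtp example rule: (0.8, 0.1, 2 sec).
rule1 = EpisodeRule(("authentication",), ("smtp", "smtp"), 0.8, 0.1, 2.0)

def episode_occurs(rule, events):
    """True if lhs followed by rhs appears in order within rule.window seconds.

    `events` is a time-sorted list of (timestamp, service) pairs.
    """
    pattern = rule.lhs + rule.rhs
    for start, (t0, service) in enumerate(events):
        if service != pattern[0]:
            continue
        matched = 1
        for t, s in events[start + 1:]:
            if t - t0 > rule.window:
                break
            if matched < len(pattern) and s == pattern[matched]:
                matched += 1
        if matched == len(pattern):
            return True
    return False

trace = [(0.0, "authentication"), (0.5, "smtp"), (1.2, "http"), (1.8, "smtp")]
found = episode_occurs(rule1, trace)
```

In the sample trace both smtp requests arrive within 2 seconds of the authentication event, so the episode is found; unrelated events such as the http request in between are simply skipped.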

59

A cooperative anomaly and intrusion detection system (CAIDS)

Architecture of the CAIDS simulator built with a 2000-signature Snort and an anomaly detection subsystem (ADS) with 60 FERs after 2 weeks of rule training over the Lincoln Lab IDS evaluation dataset

60

Conclusion

In this report we have studied the basic concepts and some classic system models in this area, such as ADAM and MINDS.

We summarized those system models, their technologies, and their validation methods.

We hope to give an overview of current developments in this area and of how data mining is evolving into the field of network intrusion detection.

61

Reference

DARPA 1998 data set
A cleansed set in KDDCup’99
DARPA 1991 data set is also available: http://www.ll.mit.edu/IST/ideval/data/data_index.html

Daniel Barbara, Julia Couto, Sushil Jajodia, Leonard Popyack, Ningning Wu. “ADAM: Detecting Intrusions by Data Mining.” Proceedings of the 2001 IEEE Workshop on Information Assurance and Security, United States Military Academy, West Point, NY, 5-6 June 2001.

Zhang, J. and Zulkernine, M. 2006. A Hybrid Network Intrusion Detection Technique Using Random Forests. In Proceedings of the First International Conference on Availability, Reliability and Security (April 20-22, 2006).

W. Lee et al. A data mining framework for building intrusion detection models. In Information and System Security, Vol. 3, No. 4, 2000.

Ertoz, L. et al. MINDS - Minnesota Intrusion Detection System. Next Generation Data Mining, Chapter 3, 2004.

Lu, C.-T., Boedihardjo, A.P., Manalwar, P. Exploiting efficient data mining techniques to enhance intrusion detection systems. IEEE International Conference on Information Reuse and Integration (IRI-2005), 15-17 Aug. 2005, pp. 512-517.

Sal Stolfo, Andreas Prodromidis, Shelley Tselepis, Wenke Lee, Dave Fan, and Phil Chan. (Honorable mention (runner-up) for Best Paper Award in Applied Research Category.) In Proceedings of the Third International Conference on Knowledge Discovery and Data Mining (KDD 97), Newport Beach, CA, August 1997.

62

Questions & Comments

18

Why Can Data Mining Help

Learn from traffic data Supervised learning learn precise models from past intrusions Unsupervised learning identify suspicious activities

Maintain models on dynamic data

Correlation of suspicious events across network sites Helps detect sophisticated attacks not identifiable by single site analyses

Analysis of long term data (monthsyears) Uncover suspicious stealth activities (eg insiders leakingmodifying

information)

19

Intrusion Detection

Traditional intrusion detection system IDS tools (eg SNORT) are based on signatures of known attacks

LimitationsSignature database has to be manually revised

for each new type of discovered intrusionThey cannot detect emerging cyber threatsSubstantial latency in deployment of newly created

signatures across the computer system

20

Data Mining for Intrusion Detection Techniques and Applications

Frequent pattern mining Classification Clustering Mining data streams

21

Patterns that occur frequently in a database

Mining Frequent patterns ndash finding regularities

Process of Mining Frequent patterns for intrusion de

tection Phase I mine a repository of normal frequent itemsets for a

ttack-free data

Phase II find frequent itemsets in the last n connections an

d compare the patterns to the normal profile

Frequent pattern mining

22

Frequent pattern mining

Apriori bull Any subset of a frequent itemset must be also freque

nt mdash an anti-monotone propertyndash A transaction containing beer diaper nuts alsocontains beer diaperndash beer diaper nuts is frequent beer diaper mustalso be frequent

bull No superset of any infrequent itemset should be generated or tested

ndash Many item combinations can be pruned

23

Sequential Pattern Analysis

Models sequence patterns (Temporal) order is important in many situations

Time-series databases and sequence databases

Frequent patterns (frequent) sequential patterns

Sequential patterns for intrusion detection Capture the signatures for attacks in a series of packets

24

Sequential Pattern Mining

Given a set of sequences find the complete set of frequent subsequences

25

Apriori Property in Sequences

26

Classification A Two-Step Process Model construction describe a set of predetermined

classes Training dataset tuples for model construction

Each tuplesample belongs to a predefined class Classification rules decision trees or math formulae

Model application classify unseen objects Estimate accuracy of the model using an independent test

set Acceptable accuracy apply the model to classify data tu

ples with unknown class labels

27

Classification

28

Classification Decision Tree

A node in the tree a test of some attribute A branch a possible value of the attribute Classification

Start at the root Test the attribute Move down the tree branch

29

Neural classification HIDE

ldquoA hierarchical network intrusion detection system using statistical processing and neural network classificationrdquo by Zheng et al

Five major components Probes collect traffic data Event preprocessor preprocesses traffic data and feeds the

statistical model Statistical processor maintains a model for normal activities

and generates vectors for new events Neural network classifies the vectors of new events Post processor generates reports

30

Clustering

What Is Clustering Group data into clusters

ndash Similar to one another within the same cluster ndash Dissimilar to the objects in other clusters ndash Unsupervised learning no predefined classes

31

Clustering

What Is A Good Clustering High intra-class similarity and low interclasssimilar

ity Depending on the similarity measure

The ability to discover some or all of the hidden patterns

32

Clustering

Clustering Approaches Partitioning algorithms

ndash Partition the objects into k clusters ndash Iteratively reallocate objects to improve the clusterin

g Hierarchy algorithms

ndash Agglomerative each object is a cluster merge clusters to form larger ones

ndash Divisive all objects are in a cluster split it up into smaller clusters

33

Clustering

K-Means Example

34

Mining Data Streams for Intrusion Detection

Maintaining profiles of normal activities The profiles of normal activities may drift

Identifying novel attacks Identifying clusters and outliers in traffic data

streams Reduce the future alarm load by writing

filtering rules that automatically discard well-understood false positives

35

Data Mining for Intrusion Detection

Misuse detectionPredictive models are built from labeled data sets (instances

are labeled as ldquonormalrdquo or ldquointrusiverdquo)These models can be more sophisticated and precise than manually

created signatures Recent research eg JAM (Java Agents for Metalearning)

36

Misuse Detection

Intrusion Patterns

activities

pattern matching

intrusion

Canrsquot detect new attacks

Example if (src_ip == dst_ip) then ldquoland attackrdquo

look for known indicators ICMP Scans port scans connection attempts CPU RAM IO Utilization File system activity modification of system files permission modifications

37

JAM (Java Agents for Metalearning)

JAM (developed at Columbia University) uses data mining techniques to discover patterns of intrusions It then applies a meta-learning classifier to learn the signature of attacks

The association rules algorithm determines relationships between fields in the audit trail records and the frequent episodes algorithm models sequential patterns of audit events Features are then extracted from both algorithms and used to compute models of intrusion behavior

The classifiers build the signature of attacks So thus data mining in JAM builds misuse detection model

Classifiers in the JAM are generated by using rule learning program on training data of system usage After training resulting classification rules is used to recognize anomalies and detect known intrusions

The system has been tested with data from Sendmail-based attacks and with network attacks using TCP dump data

38

Data Mining for Intrusion Detection

Anomaly detection Identifies anomalies as deviations from ldquonormalrdquo behavior Eg ADAM Audit Data Analysis and Mining MINDS ndash MINnesota INt

rusion Detection System

39

Anomaly Detection

activity measures

0102030405060708090

CPU ProcessSize

normal profileabnormal

probable intrusion

Relatively high false positive rate - anomalies can just be new normal activities

baseline the normal traffic and then look for things that are out of the norm

40

ADAM Audit Data Analysis and Mining

Detecting Intrusion by Data MiningCombination of Association Rule and Classification Rule Firstly ADAM collects known frequent datasetsan off-line

algorithm Secondly ADAM runs an online algorithm

Finds last frequent connection records Compare them with known mined data Discards those which seems to be normal Suspicious ones are forwarded to the classifier Trained classifier then classify the suspicious data as one

of the following Known type of attack Unknown type of attack False alarm

41

ADAM Detecting Intrusion by Data Mining

42

ADAM Audit Data Analysis and Mining

ADAM has two phases in their model

1st Phase Train the classifier Offline process Takes place only once Before the main experiment

2nd Phase Using the trained classifier Trained classifier is then used to detect anomalies Online process

43

The MINDS Project

MINDS ndash MINnesota INtrusion Detection System

Learning from Rare Class ndash Building rare class prediction models

Anomalyoutlier detection

Summarization of attacks using association pattern analysis

TID Items

1 Bread Coke Milk

2 Beer Bread

3 Beer Coke Diaper Milk

4 Beer Bread Diaper Milk

5 Coke Diaper Milk

Rules Discovered Milk --gt Coke Diaper Milk --gt Beer

Rules Discovered Milk --gt Coke Diaper Milk --gt Beer

44

MINDS - Learning from Rare Class

Problem Building models for rare network attacks (Mining needle in a haystack)

Standard data mining models are not suitable for rare classes

Models must be able to handle skewed class distributions

Learning from data streams - intrusions are sequences of events

45

MINDS - Anomaly Detection

Detect novel attacksintrusions by identifying them as deviations from ldquonormalrdquo ie anomalous behaviorIdentify normal behaviorConstruct useful set of featuresDefine similarity functionUse outlier detection algorithm

Nearest neighbor approach

Density based schemes

Unsupervised Support Vector Machines (SVM)

46

Experimental Evaluation

network

net-flow data using CISCO

routers

Data preprocessing

MINDSanomaly detection

helliphellip

Anomaly

scores Association pattern analysis

Open source signature-based network IDS wwwsnortor

g

10 minutes cycle

2 millions connections

Anomaly detection is applied

4 times a day

10 minutes time window

bull Publicly available data setDARPA 1998 Intrusion Detection Evaluation Data Set prepared and managed by MIT Lincoln Lab includes a wide variety of intrusions simulated in a military network environment

bull Real network data from University of Minnesota

47

MINDS - Framework for Mining Associations

Anomaly Detection System

attack

normal

R1 TCP DstPort=1863 Attack

hellip

hellip

hellip

hellip

R100 TCP DstPort=80 Normal

Discriminating Association

Pattern Generator

1 Build normal profile

2 Study changes in normal behavior

3 Create attack summary

4 Detect misuse behavior

5 Understand nature of the attack

update

Knowledge Base

Ranked connections

MINDS association analysis module

48

Discovered Real-life Association Patterns

At first glance Rule 1 appears to describe a Web scan Rule 2 indicates an attack on a specific machine Both rules together indicate that a scan is performed first

followed by an attack on a specific machine identified as vulnerable by the attacker

Rule 1 SrcIP=XXXX DstPort=80 Protocol=TCP Flag=SYN NoPackets 3 NoBytes120hellip180 (c1=256 c2 = 1)

Rule 2 SrcIP=XXXX DstIP=YYYY DstPort=80 Protocol=TCPFlag=SYN NoPackets 3 NoBytes 120hellip180 (c1=177 c2 = 0)

49

DstIP=ZZZZ DstPort=8888 Protocol=TCP (c1=369 c2=0)DstIP=ZZZZ DstPort=8888 Protocol=TCP Flag=SYN (c1=291 c2=0)

This pattern indicates an anomalously high number of TCP connections on port 8888 involving machine ZZZZ

Follow-up analysis of connections covered by the pattern indicates that this could be a machine running a variation of the Kazaa file-sharing protocol

Having an unauthorized application increases the vulnerability of the system

Discovered Real-life Association Patterns

50

SrcIP=XXXX DstPort=27374 Protocol=TCP Flag=SYN NoPackets=4 NoBytes=189hellip200 (c1=582 c2=2)

SrcIP=XXXX DstPort=12345 NoPackets=4 NoBytes=189hellip200 (c1=580 c2=3)

SrcIP=YYYY DstPort=27374 Protocol=TCP Flag=SYN NoPackets=3 NoBytes=144 (c1=694 c2=3)

helliphellip This pattern indicates a large number of scans on ports 27374 (which is a signature for the SubSeven worm) and 12345 (which is a signature for NetBus worm)

Further analysis showed that no fewer than five machines scanning for one or both of these ports in any time window

Discovered Real-life Association Patternshellip(ctd)

51

DstPort=6667 Protocol=TCP (c1=254 c2=1)

This pattern indicates an unusually large number of connections on port 6667 detected by the anomaly detector

Port 6667 is where IRC (Internet Relay Chat) is typically run Further analysis reveals that there are many small packets

fromto various IRC servers around the world Although IRC traffic is not unusual the fact that it is flagged

as anomalous is interesting This might indicate that the IRC server has been taken down (by a

DOS attack for example) or it is a rogue IRC server (it could be involved in some hacking activity)

Discovered Real-life Association Patternshellip(ctd)

52

DstPort=1863 Protocol=TCP Flag=0 NoPackets=1 NoByteslt139 (c1=498 c2=6)

DstPort=1863 Protocol=TCP Flag=0 (c1=587 c2=6)

DstPort=1863 Protocol=TCP (c1=606 c2=8)

This pattern indicates a large number of anomalous TCP connections on port 1863

Further analysis reveals that the remote IP block is owned by Hotmail

Flag=0 is unusual for TCP traffic

Discovered Real-life Association Patternshellip(ctd)

53

MINDS Conclusion Data mining based algorithms are capable of detecting intrusions that cannot be

detected by state-of-the-art signature based methods

SNORT has static knowledge manually updated by human analysts

MINDS anomaly detection algorithms are adaptive in nature

MINDS anomaly detection algorithms can also be effective in detecting anomalous behavior originating from a compromised or infected machine

MINDS Research Defining normal behavior Feature extraction Similarity functions Outlier detection Result summarization Detection of attacks

originating from multiple sites

Wormvirus detectionafter infection

Insider attack Policy violation

Outsider attack Network intrusion

54

IDS Using both Misuse and Anomaly DetectionRIDS-100

RIDS( Rising Intrusion Detection System) is provided by Rising Tech It is a leader in antivirus and content security software and services in China

The company is a leading provider of client gateway and server security solutions for virus protection firewall and intrusion detection technologies and security services to enterprises and service providers around China

RIDS make the use of both intrusion detection technique misuse and anomaly detection

Distance based outlier detection algorithm is used for detection deviational behavior among collected network data

For misuse detection it has very vast set of collected data pattern which can be matched with scanned network data for misuse detection

This large amount of data pattern is scanned using data mining classification Decision Tree algorithm

httpwwwrising-globalcom

55

A cooperative anomaly and intrusiondetection system (CAIDS)

built with a network-based intrusion detection system (NIDS) and an anomaly detection system (ADS) operating interactively through a signature generator

56

A cooperative anomaly and intrusiondetection system (CAIDS)

A frequent episode rule (FER) is generated out of a collection of frequent episodes The FER is defined over episode sequences with multiple connection events

For an example we envision a window where we observe a 3-event sequence

E D and F An FER is generated as E rarr D F confidence level freq (a U b)freq (b)=08 where a represents the event E on the LHS and b corresponds t

o the two events D and F on the RHS of the rule

If the b occurs with 5 and the joint event a and b has 4 to occur there is a (004005) = 80 chance that D and F will follow in the same window

57

A Cooperative Anomaly and Intrusion Detection System (CAIDS)

In practice, the event E could be an authentication service characterized by two attributes (service = authentication, flag = SF). The events D and F may be two sequential smtp requests, denoted by (service = smtp). Thus we can derive an FER with a confidence level of c = 80% that two smtp services will follow the authentication service within a window w = 2 sec. The three joint traffic events have a support level of s = 10% out of all the network connections being evaluated. This FER is formally stated as follows:

$(\text{service} = \text{authentication}) \rightarrow (\text{service} = \text{smtp}),\ (\text{service} = \text{smtp}) \quad (0.8,\ 0.1,\ 2\ \text{sec})$ (1)

58

A Cooperative Anomaly and Intrusion Detection System (CAIDS)

An association rule aims to find interesting relationships inside a single connection record.

In general, an FER is specified by the following expression:

$L_1, L_2, \ldots, L_n \rightarrow R_1, \ldots, R_m \quad (c,\ s,\ \text{window})$ (2)

$L_i$ ($1 \le i \le n$) and $R_j$ ($1 \le j \le m$) are ordered traffic connection events.

We call $L_1, L_2, \ldots, L_n$ the LHS episode and $R_1, \ldots, R_m$ the RHS of the episode rule.
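One way to hold the general form (2) in code is sketched below. The class and function names are hypothetical, not taken from CAIDS. Each event is a set of attribute constraints, and the check asks whether the LHS events followed by the RHS events occur in order inside one time window of a timestamped trace.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Event = Dict[str, str]  # attribute -> value, e.g. {"service": "smtp"}

@dataclass
class EpisodeRule:
    lhs: List[Event]    # L1 ... Ln, in order
    rhs: List[Event]    # R1 ... Rm, in order
    confidence: float   # c
    support: float      # s
    window_sec: float   # window

def event_matches(pattern: Event, record: Event) -> bool:
    """A record matches when it carries every attribute value the pattern asks for."""
    return all(record.get(key) == value for key, value in pattern.items())

def episode_occurs(rule: EpisodeRule, records: List[Tuple[float, Event]]) -> bool:
    """records: (timestamp_sec, record) pairs sorted by time."""
    episode = rule.lhs + rule.rhs
    for i, (start, first) in enumerate(records):
        if not event_matches(episode[0], first):
            continue
        matched = 1
        for timestamp, record in records[i + 1:]:
            if timestamp - start > rule.window_sec:
                break  # the window that opened at `start` has closed
            if matched < len(episode) and event_matches(episode[matched], record):
                matched += 1
        if matched == len(episode):
            return True
    return False

# Rule (1) from the previous slide, with a hypothetical trace.
rule = EpisodeRule(
    lhs=[{"service": "authentication"}],
    rhs=[{"service": "smtp"}, {"service": "smtp"}],
    confidence=0.8, support=0.1, window_sec=2.0,
)
trace = [
    (0.0, {"service": "authentication", "flag": "SF"}),
    (0.5, {"service": "smtp"}),
    (1.5, {"service": "smtp"}),
]
late_trace = trace[:2] + [(3.0, {"service": "smtp"})]
in_window = episode_occurs(rule, trace)
out_of_window = episode_occurs(rule, late_trace)
```

The first trace completes the episode within 2 seconds; in the second, the last smtp request arrives after the window has closed.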

59

A Cooperative Anomaly and Intrusion Detection System (CAIDS)

Architecture of the CAIDS simulator, built with a 2000-signature Snort and an anomaly detection subsystem (ADS) with 60 FERs after 2 weeks of rule training over the Lincoln Lab IDS evaluation dataset

60

Conclusion

In this report we studied the basic concepts and some classic system models in this area, such as ADAM and MINDS.

We summarized those system models, their technologies, and their validation methods.

We hope this gives an overview of current developments in this area and of how data mining is moving into the field of network intrusion detection.

61

Reference

DARPA 1998 data set. A cleansed set is in KDD Cup ’99. The DARPA 1991 data set is also available: http://www.ll.mit.edu/IST/ideval/data/data_index.html

Daniel Barbara, Julia Couto, Sushil Jajodia, Leonard Popyack, Ningning Wu. “ADAM: Detecting Intrusions by Data Mining.” Proceedings of the 2001 IEEE Workshop on Information Assurance and Security, United States Military Academy, West Point, NY, 5-6 June 2001.

Zhang, J. and Zulkernine, M. 2006. A Hybrid Network Intrusion Detection Technique Using Random Forests. In Proceedings of the First International Conference on Availability, Reliability and Security (April 20-22, 2006).

W. Lee et al. A data mining framework for building intrusion detection models. In Information and System Security, Vol. 3, No. 4, 2000.

Ertoz, L. et al. MINDS - Minnesota Intrusion Detection System. Next Generation Data Mining, Chapter 3, 2004.

Lu, C.-T., Boedihardjo, A. P., Manalwar, P. Exploiting efficient data mining techniques to enhance intrusion detection systems. IEEE International Conference on Information Reuse and Integration (IRI-2005), 15-17 Aug. 2005, pp. 512-517.

Sal Stolfo, Andreas Prodromidis, Shelley Tselepis, Wenke Lee, Dave Fan, and Phil Chan. (Honorable mention (runner-up) for Best Paper Award in Applied Research Category.) In Proceedings of the Third International Conference on Knowledge Discovery and Data Mining (KDD 97), Newport Beach, CA, August 1997.

62

Questions & Comments

19

Intrusion Detection

Traditional intrusion detection system (IDS) tools (e.g., SNORT) are based on signatures of known attacks.

Limitations:
• The signature database has to be manually revised for each new type of discovered intrusion
• They cannot detect emerging cyber threats
• Substantial latency in deployment of newly created signatures across the computer system

20

Data Mining for Intrusion Detection: Techniques and Applications

• Frequent pattern mining
• Classification
• Clustering
• Mining data streams

21

Frequent pattern mining

Frequent patterns: patterns that occur frequently in a database

Mining frequent patterns – finding regularities

Process of mining frequent patterns for intrusion detection:
• Phase I: mine a repository of normal frequent itemsets for attack-free data
• Phase II: find frequent itemsets in the last n connections and compare the patterns to the normal profile
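The two phases can be sketched as follows. The connection attributes and thresholds are made up for illustration, and the itemsets are enumerated by brute force up to size 2 rather than with a real mining algorithm. Anything frequent in the recent connections but absent from the normal profile is treated as suspicious.

```python
from collections import Counter
from itertools import combinations

def frequent_itemsets(transactions, min_support, max_size=2):
    """Brute-force enumeration of itemsets (as sorted tuples) up to max_size
    whose relative support is at least min_support."""
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))
        for size in range(1, max_size + 1):
            counts.update(combinations(items, size))
    threshold = min_support * len(transactions)
    return {itemset for itemset, count in counts.items() if count >= threshold}

# Phase I: normal profile mined from (hypothetical) attack-free connections.
attack_free = [
    ["tcp", "port=80"], ["tcp", "port=80"], ["tcp", "port=443"], ["udp", "port=53"],
]
normal_profile = frequent_itemsets(attack_free, min_support=0.25)

# Phase II: frequent itemsets in the last n connections, compared to the profile.
last_n = [["tcp", "port=27374"], ["tcp", "port=27374"], ["tcp", "port=80"]]
suspicious = frequent_itemsets(last_n, min_support=0.5) - normal_profile
```

In this toy data, the itemsets involving `port=27374` are frequent in the recent connections but not in the profile, so they are the ones left in `suspicious`.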

22

Frequent pattern mining

Apriori
• Any subset of a frequent itemset must also be frequent (an anti-monotone property)
  – A transaction containing {beer, diaper, nuts} also contains {beer, diaper}
  – If {beer, diaper, nuts} is frequent, {beer, diaper} must also be frequent
• No superset of any infrequent itemset should be generated or tested
  – Many item combinations can be pruned
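A compact sketch of level-wise Apriori, written to show the pruning described above rather than to be efficient. The baskets and the `min_count` threshold are illustrative assumptions. Candidates of size k are built only from frequent itemsets of size k-1, and any candidate with an infrequent subset is dropped before it is counted.

```python
from itertools import combinations

def apriori(transactions, min_count):
    """Return {itemset (frozenset): count} for all itemsets occurring in at
    least min_count transactions."""
    transactions = [frozenset(t) for t in transactions]
    current = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while current:
        counts = {c: sum(c <= t for t in transactions) for c in current}
        level = {c: n for c, n in counts.items() if n >= min_count}
        frequent.update(level)
        k += 1
        # Join step: combine frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        current = {
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent

baskets = [
    {"beer", "diaper", "nuts"},
    {"beer", "diaper"},
    {"beer", "diaper", "nuts"},
    {"milk", "nuts"},
]
result = apriori(baskets, min_count=2)
```

With these baskets, `milk` is infrequent, so no itemset containing it is ever generated, while {beer, diaper, nuts} and all of its subsets are reported as frequent.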

23

Sequential Pattern Analysis

Models sequence patterns
• (Temporal) order is important in many situations
  – Time-series databases and sequence databases
  – Frequent patterns → (frequent) sequential patterns

Sequential patterns for intrusion detection
• Capture the signatures for attacks in a series of packets

24

Sequential Pattern Mining

Given a set of sequences, find the complete set of frequent subsequences.
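A small sketch of this task, assuming sequences of single events and a made-up packet-flag alphabet. It grows patterns one event at a time and only extends patterns that are already frequent, which is the Apriori property applied to sequences. It is a naive illustration, not an optimized miner.

```python
from itertools import product

def is_subsequence(pattern, sequence):
    """True if `pattern` occurs in `sequence` in order (gaps allowed)."""
    remaining = iter(sequence)
    return all(item in remaining for item in pattern)

def frequent_subsequences(sequences, min_count, max_len=3):
    """Return {pattern (tuple): count} for patterns up to max_len that occur
    as a subsequence of at least min_count sequences."""
    alphabet = sorted({item for s in sequences for item in s})
    frequent = {}
    level = [()]
    for _ in range(max_len):
        next_level = []
        for prefix, item in product(level, alphabet):
            candidate = prefix + (item,)
            count = sum(is_subsequence(candidate, s) for s in sequences)
            if count >= min_count:
                frequent[candidate] = count
                next_level.append(candidate)  # only frequent patterns are extended
        level = next_level
    return frequent

# Hypothetical packet sequences.
sequences = [
    ["syn", "ack", "data", "fin"],
    ["syn", "ack", "fin"],
    ["syn", "rst"],
]
patterns = frequent_subsequences(sequences, min_count=2)
```

Here `("syn", "ack", "fin")` is found in two of the three sequences, while `("syn", "rst")` appears only once and is discarded.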

25

Apriori Property in Sequences

26

Classification: A Two-Step Process

Model construction: describe a set of predetermined classes
• Training dataset: tuples for model construction
  – Each tuple/sample belongs to a predefined class
• Classification rules, decision trees, or math formulae

Model application: classify unseen objects
• Estimate accuracy of the model using an independent test set
• Acceptable accuracy → apply the model to classify data tuples with unknown class labels
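The two steps can be sketched with a deliberately simple model. A nearest-centroid classifier is used here only because it fits in a few lines; the feature values and labels are invented. Step 1 builds the model from labeled tuples; step 2 estimates accuracy on an independent test set and then labels an unseen tuple.

```python
import math
from collections import defaultdict

def train_centroids(samples):
    """Step 1 (model construction): one mean feature vector per class."""
    sums = {}
    counts = defaultdict(int)
    for features, label in samples:
        previous = sums.get(label, (0.0,) * len(features))
        sums[label] = tuple(s + f for s, f in zip(previous, features))
        counts[label] += 1
    return {label: tuple(s / counts[label] for s in total) for label, total in sums.items()}

def classify(model, features):
    """Step 2 (model application): pick the class with the nearest centroid."""
    return min(model, key=lambda label: math.dist(model[label], tuple(features)))

def accuracy(model, samples):
    return sum(classify(model, f) == label for f, label in samples) / len(samples)

# Hypothetical features: (failed logins, bytes transferred).
training = [((1, 200), "normal"), ((2, 250), "normal"),
            ((40, 20), "intrusive"), ((50, 30), "intrusive")]
test_set = [((3, 220), "normal"), ((45, 25), "intrusive")]

model = train_centroids(training)
test_accuracy = accuracy(model, test_set)
predicted = classify(model, (48, 10))  # tuple with unknown class label
```

If the accuracy on the independent test set is acceptable, the same `classify` call is applied to tuples whose class label is unknown.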

27

Classification

28

Classification: Decision Tree

• A node in the tree: a test of some attribute
• A branch: a possible value of the attribute
• Classification
  – Start at the root
  – Test the attribute
  – Move down the tree branch
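The classification walk can be sketched with a tree stored as nested dictionaries. The tree below is a hand-written, hypothetical example, not one learned from data: inner nodes name the attribute to test, branches are attribute values, and leaves are class labels.

```python
# Hypothetical tree: inner nodes test an attribute, leaves are class labels.
tree = {
    "attribute": "protocol",
    "branches": {
        "icmp": "normal",
        "tcp": {
            "attribute": "flag",
            "branches": {"SF": "normal", "S0": "intrusive"},
        },
    },
}

def classify(node, record):
    """Start at the root, test the node's attribute, follow the matching branch."""
    while isinstance(node, dict):
        value = record[node["attribute"]]
        node = node["branches"][value]
    return node

label = classify(tree, {"protocol": "tcp", "flag": "S0"})
```

The record is routed through the `protocol` test and then the `flag` test, ending at the leaf `intrusive`. An attribute value with no branch would raise a `KeyError` in this sketch.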

29

Neural classification: HIDE

“A hierarchical network intrusion detection system using statistical processing and neural network classification” by Zheng et al.

Five major components:
• Probes collect traffic data
• Event preprocessor preprocesses traffic data and feeds the statistical model
• Statistical processor maintains a model for normal activities and generates vectors for new events
• Neural network classifies the vectors of new events
• Post processor generates reports

30

Clustering

What Is Clustering?
• Group data into clusters
  – Similar to one another within the same cluster
  – Dissimilar to the objects in other clusters
  – Unsupervised learning: no predefined classes

31

Clustering

What Is A Good Clustering?
• High intra-class similarity and low inter-class similarity
  – Depending on the similarity measure
• The ability to discover some or all of the hidden patterns

32

Clustering

Clustering Approaches
• Partitioning algorithms
  – Partition the objects into k clusters
  – Iteratively reallocate objects to improve the clustering
• Hierarchy algorithms
  – Agglomerative: each object is a cluster; merge clusters to form larger ones
  – Divisive: all objects are in a cluster; split it up into smaller clusters

33

Clustering

K-Means Example
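Since the slide's figure is not available in this transcript, here is a minimal k-means sketch on made-up 2-D points. The initial centroids are fixed by hand so the run is deterministic; a real run would normally pick them at random. Each iteration assigns every point to its nearest centroid and then moves each centroid to the mean of its cluster.

```python
import math

def k_means(points, centroids, iterations=10):
    """Plain k-means: alternate assignment and centroid update."""
    centroids = [tuple(c) for c in centroids]
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster (keep it if the cluster is empty).
        centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters

points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
centroids, clusters = k_means(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
```

The six points separate into the two obvious groups, with the centroids settling at the mean of each group.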

34

Mining Data Streams for Intrusion Detection

• Maintaining profiles of normal activities
  – The profiles of normal activities may drift
• Identifying novel attacks
  – Identifying clusters and outliers in traffic data streams
• Reduce the future alarm load by writing filtering rules that automatically discard well-understood false positives
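One way to keep a profile that follows drift is an exponentially weighted mean and variance, sketched below. This is an illustrative choice, not a method named in the slides; `alpha`, `threshold`, and `warmup` are assumed parameters. Values far from the current profile are flagged and are not allowed to update it.

```python
class DriftingProfile:
    """Exponentially weighted profile of one traffic measure."""

    def __init__(self, alpha=0.2, threshold=3.0, warmup=5):
        self.alpha = alpha          # how quickly the profile follows new data
        self.threshold = threshold  # flag values beyond this many standard deviations
        self.warmup = warmup        # observations to absorb before flagging anything
        self.mean = None
        self.var = 0.0
        self.seen = 0

    def update(self, value):
        """Return True if `value` is an outlier against the current profile."""
        if self.mean is None:
            self.mean = float(value)
            self.seen = 1
            return False
        deviation = value - self.mean
        std = self.var ** 0.5
        is_outlier = (
            self.seen >= self.warmup and std > 0 and abs(deviation) > self.threshold * std
        )
        if not is_outlier:
            # Only normal-looking values are allowed to move the profile.
            self.mean += self.alpha * deviation
            self.var = (1 - self.alpha) * (self.var + self.alpha * deviation ** 2)
        self.seen += 1
        return is_outlier

# Hypothetical stream: connections per second, with one burst.
profile = DriftingProfile()
flags = [profile.update(v) for v in [100, 102, 98, 101, 99, 100, 500, 101]]
```

Only the burst of 500 is flagged. Because flagged values never update the profile, a genuine long-term shift would keep raising alarms until an analyst resets or retrains it, which is one of the trade-offs of this simple design.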

35

Data Mining for Intrusion Detection

Misuse detection
• Predictive models are built from labeled data sets (instances are labeled as “normal” or “intrusive”)
• These models can be more sophisticated and precise than manually created signatures
• Recent research: e.g., JAM (Java Agents for Metalearning)

36

Misuse Detection

[Figure: activities are compared against intrusion patterns by pattern matching; a match is reported as an intrusion]

Can’t detect new attacks

Example: if (src_ip == dst_ip) then “land attack”

Look for known indicators:
• ICMP scans, port scans, connection attempts
• CPU, RAM, I/O utilization
• File system activity, modification of system files, permission modifications
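The land-attack rule above can be written as a tiny signature table. The packet field names and the table layout are assumptions made for illustration; real signature engines such as Snort use their own rule language.

```python
# Each signature is a (name, predicate) pair; only the land-attack rule comes from the slide.
SIGNATURES = [
    ("land attack", lambda packet: packet["src_ip"] == packet["dst_ip"]),
]

def match_signatures(packet, signatures=SIGNATURES):
    """Return the names of all known signatures that the packet matches."""
    return [name for name, predicate in signatures if predicate(packet)]

alerts = match_signatures({"src_ip": "10.0.0.5", "dst_ip": "10.0.0.5"})
clean = match_signatures({"src_ip": "10.0.0.5", "dst_ip": "10.0.0.9"})
```

A packet that matches no entry in the table produces no alert, which is exactly why this approach cannot detect new attacks.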

37

JAM (Java Agents for Metalearning)

JAM (developed at Columbia University) uses data mining techniques to discover patterns of intrusions. It then applies a meta-learning classifier to learn the signature of attacks.

The association rules algorithm determines relationships between fields in the audit trail records, and the frequent episodes algorithm models sequential patterns of audit events. Features are then extracted from both algorithms and used to compute models of intrusion behavior.

The classifiers build the signature of attacks. Thus, data mining in JAM builds a misuse detection model.

Classifiers in JAM are generated by running a rule learning program on training data of system usage. After training, the resulting classification rules are used to recognize anomalies and detect known intrusions.

The system has been tested with data from Sendmail-based attacks and with network attacks using TCP dump data.

38

Data Mining for Intrusion Detection

Anomaly detection
• Identifies anomalies as deviations from “normal” behavior
• E.g., ADAM (Audit Data Analysis and Mining) and MINDS (MINnesota INtrusion Detection System)

39

Anomaly Detection

[Figure: bar chart of activity measures (CPU, process size) on a 0 to 90 scale, comparing the normal profile with abnormal activity; a large deviation marks a probable intrusion]

Relatively high false positive rate: anomalies can just be new normal activities

Baseline the normal traffic and then look for things that are out of the norm
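Baselining can be sketched as a per-measure mean and standard deviation taken from normal operation. The measures and values are invented, and the three-sigma cutoff `k` is an assumed convention, not something stated in the slides.

```python
from statistics import mean, stdev

def build_profile(baseline):
    """baseline: measure name -> values observed during normal operation."""
    return {measure: (mean(values), stdev(values)) for measure, values in baseline.items()}

def abnormal_measures(profile, observation, k=3.0):
    """Measures whose observed value lies more than k standard deviations from the baseline mean."""
    return [
        measure for measure, value in observation.items()
        if abs(value - profile[measure][0]) > k * profile[measure][1]
    ]

baseline = {"cpu": [20, 22, 19, 21, 23], "process_size": [50, 52, 49, 51, 48]}
profile = build_profile(baseline)
flagged = abnormal_measures(profile, {"cpu": 85, "process_size": 51})
```

Only the CPU reading is far outside its baseline here. A legitimate new workload would be flagged in the same way, which illustrates the false positive problem noted above.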

40

ADAM: Audit Data Analysis and Mining

Detecting Intrusion by Data Mining
• Combination of association rules and classification rules
• First, ADAM collects known frequent datasets with an off-line algorithm
• Second, ADAM runs an online algorithm
  – Finds the latest frequent connection records
  – Compares them with the known mined data
  – Discards those which seem to be normal
  – Suspicious ones are forwarded to the classifier
  – The trained classifier then classifies the suspicious data as one of the following: known type of attack, unknown type of attack, or false alarm

41

ADAM: Detecting Intrusion by Data Mining

42

ADAM: Audit Data Analysis and Mining

ADAM has two phases in its model:

1st phase: train the classifier
• Offline process
• Takes place only once, before the main experiment

2nd phase: use the trained classifier
• The trained classifier is then used to detect anomalies
• Online process

43

The MINDS Project

MINDS – MINnesota INtrusion Detection System

• Learning from Rare Class – building rare class prediction models
• Anomaly/outlier detection
• Summarization of attacks using association pattern analysis

TID  Items
1    Bread, Coke, Milk
2    Beer, Bread
3    Beer, Coke, Diaper, Milk
4    Beer, Bread, Diaper, Milk
5    Coke, Diaper, Milk

Rules discovered:
{Milk} --> {Coke}
{Diaper, Milk} --> {Beer}
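The two rules can be checked against the five transactions in the table. The sketch below uses the standard definitions of support and confidence; the function names are just for illustration.

```python
transactions = [
    {"Bread", "Coke", "Milk"},
    {"Beer", "Bread"},
    {"Beer", "Coke", "Diaper", "Milk"},
    {"Beer", "Bread", "Diaper", "Milk"},
    {"Coke", "Diaper", "Milk"},
]

def support(itemset):
    """Fraction of transactions that contain every item of the itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(lhs, rhs):
    """Of the transactions containing lhs, the fraction that also contain rhs."""
    return support(lhs | rhs) / support(lhs)

milk_to_coke = confidence({"Milk"}, {"Coke"})
diaper_milk_to_beer = confidence({"Diaper", "Milk"}, {"Beer"})
```

Milk appears in four transactions and three of them also contain Coke, so the first rule has confidence 3/4. Diaper and Milk appear together in three transactions, two of which contain Beer, giving 2/3 for the second rule.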

44

MINDS - Learning from Rare Class

Problem: building models for rare network attacks (mining a needle in a haystack)

Standard data mining models are not suitable for rare classes

Models must be able to handle skewed class distributions

Learning from data streams - intrusions are sequences of events

45

MINDS - Anomaly Detection

Detect novel attacks/intrusions by identifying them as deviations from “normal”, i.e., anomalous behavior
• Identify normal behavior
• Construct useful set of features
• Define similarity function
• Use outlier detection algorithm
  – Nearest neighbor approach
  – Density based schemes
  – Unsupervised Support Vector Machines (SVM)
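The nearest neighbor approach can be sketched by scoring each connection with the distance to its k-th nearest neighbor. This is a generic illustration, not the MINDS scoring function; the feature vectors and `k` are made up, and Euclidean distance stands in for whatever similarity function is chosen.

```python
import math

def knn_anomaly_scores(points, k=2):
    """Anomaly score of each point = distance to its k-th nearest neighbor."""
    scores = []
    for i, p in enumerate(points):
        distances = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(distances[k - 1])
    return scores

# Hypothetical flow features after preprocessing.
flows = [(1.0, 1.0), (1.1, 0.9), (0.9, 1.2), (1.2, 1.1), (6.0, 7.0)]
scores = knn_anomaly_scores(flows, k=2)
most_anomalous = max(range(len(flows)), key=scores.__getitem__)
```

Connections can then be ranked by score, so that an analyst looks at the most anomalous ones first. Here that is the last flow, which sits far from the rest.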

46

Experimental Evaluation

[Figure: MINDS evaluation pipeline: network → net-flow data collected using Cisco routers → data preprocessing → MINDS anomaly detection → anomaly scores → association pattern analysis]

• Snort: open source signature-based network IDS (www.snort.org)
• 10-minute cycle, 2 million connections
• Anomaly detection is applied 4 times a day, on a 10-minute time window
• Publicly available data set: the DARPA 1998 Intrusion Detection Evaluation Data Set, prepared and managed by MIT Lincoln Lab, includes a wide variety of intrusions simulated in a military network environment
• Real network data from University of Minnesota

47

MINDS - Framework for Mining Associations

[Figure: MINDS association analysis module. The anomaly detection system outputs ranked connections labeled attack or normal (R1: TCP, DstPort=1863, Attack; …; R100: TCP, DstPort=80, Normal). The discriminating association pattern generator mines them and updates the knowledge base.]

1. Build normal profile
2. Study changes in normal behavior
3. Create attack summary
4. Detect misuse behavior
5. Understand nature of the attack

48

Discovered Real-life Association Patterns

At first glance, Rule 1 appears to describe a Web scan. Rule 2 indicates an attack on a specific machine. Both rules together indicate that a scan is performed first, followed by an attack on a specific machine identified as vulnerable by the attacker.

Rule 1: SrcIP=XXXX, DstPort=80, Protocol=TCP, Flag=SYN, NoPackets=3, NoBytes=120…180 (c1=256, c2=1)

Rule 2: SrcIP=XXXX, DstIP=YYYY, DstPort=80, Protocol=TCP, Flag=SYN, NoPackets=3, NoBytes=120…180 (c1=177, c2=0)

49

Discovered Real-life Association Patterns

DstIP=ZZZZ, DstPort=8888, Protocol=TCP (c1=369, c2=0)
DstIP=ZZZZ, DstPort=8888, Protocol=TCP, Flag=SYN (c1=291, c2=0)

• This pattern indicates an anomalously high number of TCP connections on port 8888 involving machine ZZZZ
• Follow-up analysis of connections covered by the pattern indicates that this could be a machine running a variation of the Kazaa file-sharing protocol
• Having an unauthorized application increases the vulnerability of the system

50

Discovered Real-life Association Patterns… (ctd)

SrcIP=XXXX, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=4, NoBytes=189…200 (c1=582, c2=2)
SrcIP=XXXX, DstPort=12345, NoPackets=4, NoBytes=189…200 (c1=580, c2=3)
SrcIP=YYYY, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=3, NoBytes=144 (c1=694, c2=3)
……

• This pattern indicates a large number of scans on ports 27374 (a signature for the SubSeven worm) and 12345 (a signature for the NetBus worm)
• Further analysis showed that no fewer than five machines were scanning for one or both of these ports in any time window

51

Discovered Real-life Association Patterns… (ctd)

DstPort=6667, Protocol=TCP (c1=254, c2=1)

• This pattern indicates an unusually large number of connections on port 6667 detected by the anomaly detector
• Port 6667 is where IRC (Internet Relay Chat) is typically run
• Further analysis reveals that there are many small packets from/to various IRC servers around the world
• Although IRC traffic is not unusual, the fact that it is flagged as anomalous is interesting
  – This might indicate that the IRC server has been taken down (by a DoS attack, for example) or that it is a rogue IRC server (it could be involved in some hacking activity)

52

Discovered Real-life Association Patterns… (ctd)

DstPort=1863, Protocol=TCP, Flag=0, NoPackets=1, NoBytes<139 (c1=498, c2=6)
DstPort=1863, Protocol=TCP, Flag=0 (c1=587, c2=6)
DstPort=1863, Protocol=TCP (c1=606, c2=8)

• This pattern indicates a large number of anomalous TCP connections on port 1863
• Further analysis reveals that the remote IP block is owned by Hotmail
• Flag=0 is unusual for TCP traffic

53

MINDS Conclusion
• Data mining based algorithms are capable of detecting intrusions that cannot be detected by state-of-the-art signature based methods
• SNORT has static knowledge manually updated by human analysts
• MINDS anomaly detection algorithms are adaptive in nature
• MINDS anomaly detection algorithms can also be effective in detecting anomalous behavior originating from a compromised or infected machine

MINDS Research
• Defining normal behavior
• Feature extraction
• Similarity functions
• Outlier detection
• Result summarization
• Detection of attacks originating from multiple sites

Worm/virus detection after infection

Insider attack: policy violation

Outsider attack: network intrusion

54

IDS Using both Misuse and Anomaly Detection: RIDS-100

RIDS (Rising Intrusion Detection System) is provided by Rising Tech, a leader in antivirus and content security software and services in China.


20

Data Mining for Intrusion Detection Techniques and Applications

Frequent pattern mining Classification Clustering Mining data streams

21

Patterns that occur frequently in a database

Mining Frequent patterns ndash finding regularities

Process of Mining Frequent patterns for intrusion de

tection Phase I mine a repository of normal frequent itemsets for a

ttack-free data

Phase II find frequent itemsets in the last n connections an

d compare the patterns to the normal profile

Frequent pattern mining

22

Frequent pattern mining

Apriori bull Any subset of a frequent itemset must be also freque

nt mdash an anti-monotone propertyndash A transaction containing beer diaper nuts alsocontains beer diaperndash beer diaper nuts is frequent beer diaper mustalso be frequent

bull No superset of any infrequent itemset should be generated or tested

ndash Many item combinations can be pruned

23

Sequential Pattern Analysis

Models sequence patterns (Temporal) order is important in many situations

Time-series databases and sequence databases

Frequent patterns (frequent) sequential patterns

Sequential patterns for intrusion detection Capture the signatures for attacks in a series of packets

24

Sequential Pattern Mining

Given a set of sequences find the complete set of frequent subsequences

25

Apriori Property in Sequences

26

Classification A Two-Step Process Model construction describe a set of predetermined

classes Training dataset tuples for model construction

Each tuplesample belongs to a predefined class Classification rules decision trees or math formulae

Model application classify unseen objects Estimate accuracy of the model using an independent test

set Acceptable accuracy apply the model to classify data tu

ples with unknown class labels

27

Classification

28

Classification Decision Tree

A node in the tree a test of some attribute A branch a possible value of the attribute Classification

Start at the root Test the attribute Move down the tree branch

29

Neural classification HIDE

ldquoA hierarchical network intrusion detection system using statistical processing and neural network classificationrdquo by Zheng et al

Five major components Probes collect traffic data Event preprocessor preprocesses traffic data and feeds the

statistical model Statistical processor maintains a model for normal activities

and generates vectors for new events Neural network classifies the vectors of new events Post processor generates reports

30

Clustering

What Is Clustering Group data into clusters

ndash Similar to one another within the same cluster ndash Dissimilar to the objects in other clusters ndash Unsupervised learning no predefined classes

31

Clustering

What Is A Good Clustering High intra-class similarity and low interclasssimilar

ity Depending on the similarity measure

The ability to discover some or all of the hidden patterns

32

Clustering

Clustering Approaches Partitioning algorithms

ndash Partition the objects into k clusters ndash Iteratively reallocate objects to improve the clusterin

g Hierarchy algorithms

ndash Agglomerative each object is a cluster merge clusters to form larger ones

ndash Divisive all objects are in a cluster split it up into smaller clusters

33

Clustering

K-Means Example

34

Mining Data Streams for Intrusion Detection

Maintaining profiles of normal activities The profiles of normal activities may drift

Identifying novel attacks Identifying clusters and outliers in traffic data

streams Reduce the future alarm load by writing

filtering rules that automatically discard well-understood false positives

35

Data Mining for Intrusion Detection

Misuse detectionPredictive models are built from labeled data sets (instances

are labeled as ldquonormalrdquo or ldquointrusiverdquo)These models can be more sophisticated and precise than manually

created signatures Recent research eg JAM (Java Agents for Metalearning)

36

Misuse Detection

Intrusion Patterns

activities

pattern matching

intrusion

Canrsquot detect new attacks

Example if (src_ip == dst_ip) then ldquoland attackrdquo

look for known indicators ICMP Scans port scans connection attempts CPU RAM IO Utilization File system activity modification of system files permission modifications

37

JAM (Java Agents for Metalearning)

JAM (developed at Columbia University) uses data mining techniques to discover patterns of intrusions It then applies a meta-learning classifier to learn the signature of attacks

The association rules algorithm determines relationships between fields in the audit trail records and the frequent episodes algorithm models sequential patterns of audit events Features are then extracted from both algorithms and used to compute models of intrusion behavior

The classifiers build the signature of attacks So thus data mining in JAM builds misuse detection model

Classifiers in the JAM are generated by using rule learning program on training data of system usage After training resulting classification rules is used to recognize anomalies and detect known intrusions

The system has been tested with data from Sendmail-based attacks and with network attacks using TCP dump data

38

Data Mining for Intrusion Detection

Anomaly detection Identifies anomalies as deviations from ldquonormalrdquo behavior Eg ADAM Audit Data Analysis and Mining MINDS ndash MINnesota INt

rusion Detection System

39

Anomaly Detection

activity measures

0102030405060708090

CPU ProcessSize

normal profileabnormal

probable intrusion

Relatively high false positive rate - anomalies can just be new normal activities

baseline the normal traffic and then look for things that are out of the norm

40

ADAM Audit Data Analysis and Mining

Detecting Intrusion by Data MiningCombination of Association Rule and Classification Rule Firstly ADAM collects known frequent datasetsan off-line

algorithm Secondly ADAM runs an online algorithm

Finds last frequent connection records Compare them with known mined data Discards those which seems to be normal Suspicious ones are forwarded to the classifier Trained classifier then classify the suspicious data as one

of the following Known type of attack Unknown type of attack False alarm

41

ADAM Detecting Intrusion by Data Mining

42

ADAM Audit Data Analysis and Mining

ADAM has two phases in their model

1st Phase Train the classifier Offline process Takes place only once Before the main experiment

2nd Phase Using the trained classifier Trained classifier is then used to detect anomalies Online process

43

The MINDS Project

MINDS ndash MINnesota INtrusion Detection System

Learning from Rare Class ndash Building rare class prediction models

Anomalyoutlier detection

Summarization of attacks using association pattern analysis

TID Items

1 Bread Coke Milk

2 Beer Bread

3 Beer Coke Diaper Milk

4 Beer Bread Diaper Milk

5 Coke Diaper Milk

Rules Discovered Milk --gt Coke Diaper Milk --gt Beer

Rules Discovered Milk --gt Coke Diaper Milk --gt Beer

44

MINDS - Learning from Rare Class

Problem Building models for rare network attacks (Mining needle in a haystack)

Standard data mining models are not suitable for rare classes

Models must be able to handle skewed class distributions

Learning from data streams - intrusions are sequences of events

45

MINDS - Anomaly Detection

Detect novel attacksintrusions by identifying them as deviations from ldquonormalrdquo ie anomalous behaviorIdentify normal behaviorConstruct useful set of featuresDefine similarity functionUse outlier detection algorithm

Nearest neighbor approach

Density based schemes

Unsupervised Support Vector Machines (SVM)

46

Experimental Evaluation

network

net-flow data using CISCO

routers

Data preprocessing

MINDSanomaly detection

helliphellip

Anomaly

scores Association pattern analysis

Open source signature-based network IDS wwwsnortor

g

10 minutes cycle

2 millions connections

Anomaly detection is applied

4 times a day

10 minutes time window

bull Publicly available data setDARPA 1998 Intrusion Detection Evaluation Data Set prepared and managed by MIT Lincoln Lab includes a wide variety of intrusions simulated in a military network environment

bull Real network data from University of Minnesota

47

MINDS - Framework for Mining Associations

Anomaly Detection System

attack

normal

R1 TCP DstPort=1863 Attack

hellip

hellip

hellip

hellip

R100 TCP DstPort=80 Normal

Discriminating Association

Pattern Generator

1 Build normal profile

2 Study changes in normal behavior

3 Create attack summary

4 Detect misuse behavior

5 Understand nature of the attack

update

Knowledge Base

Ranked connections

MINDS association analysis module

48

Discovered Real-life Association Patterns

At first glance Rule 1 appears to describe a Web scan Rule 2 indicates an attack on a specific machine Both rules together indicate that a scan is performed first

followed by an attack on a specific machine identified as vulnerable by the attacker

Rule 1 SrcIP=XXXX DstPort=80 Protocol=TCP Flag=SYN NoPackets 3 NoBytes120hellip180 (c1=256 c2 = 1)

Rule 2 SrcIP=XXXX DstIP=YYYY DstPort=80 Protocol=TCPFlag=SYN NoPackets 3 NoBytes 120hellip180 (c1=177 c2 = 0)

49

DstIP=ZZZZ DstPort=8888 Protocol=TCP (c1=369 c2=0)DstIP=ZZZZ DstPort=8888 Protocol=TCP Flag=SYN (c1=291 c2=0)

This pattern indicates an anomalously high number of TCP connections on port 8888 involving machine ZZZZ

Follow-up analysis of connections covered by the pattern indicates that this could be a machine running a variation of the Kazaa file-sharing protocol

Having an unauthorized application increases the vulnerability of the system

Discovered Real-life Association Patterns

50

SrcIP=XXXX DstPort=27374 Protocol=TCP Flag=SYN NoPackets=4 NoBytes=189hellip200 (c1=582 c2=2)

SrcIP=XXXX DstPort=12345 NoPackets=4 NoBytes=189hellip200 (c1=580 c2=3)

SrcIP=YYYY DstPort=27374 Protocol=TCP Flag=SYN NoPackets=3 NoBytes=144 (c1=694 c2=3)

helliphellip This pattern indicates a large number of scans on ports 27374 (which is a signature for the SubSeven worm) and 12345 (which is a signature for NetBus worm)

Further analysis showed that no fewer than five machines scanning for one or both of these ports in any time window

Discovered Real-life Association Patternshellip(ctd)

51

DstPort=6667 Protocol=TCP (c1=254 c2=1)

This pattern indicates an unusually large number of connections on port 6667 detected by the anomaly detector

Port 6667 is where IRC (Internet Relay Chat) is typically run Further analysis reveals that there are many small packets

fromto various IRC servers around the world Although IRC traffic is not unusual the fact that it is flagged

as anomalous is interesting This might indicate that the IRC server has been taken down (by a

DOS attack for example) or it is a rogue IRC server (it could be involved in some hacking activity)

Discovered Real-life Association Patternshellip(ctd)

52

DstPort=1863 Protocol=TCP Flag=0 NoPackets=1 NoByteslt139 (c1=498 c2=6)

DstPort=1863 Protocol=TCP Flag=0 (c1=587 c2=6)

DstPort=1863 Protocol=TCP (c1=606 c2=8)

This pattern indicates a large number of anomalous TCP connections on port 1863

Further analysis reveals that the remote IP block is owned by Hotmail

Flag=0 is unusual for TCP traffic

Discovered Real-life Association Patternshellip(ctd)

53

MINDS Conclusion Data mining based algorithms are capable of detecting intrusions that cannot be

detected by state-of-the-art signature based methods

SNORT has static knowledge manually updated by human analysts

MINDS anomaly detection algorithms are adaptive in nature

MINDS anomaly detection algorithms can also be effective in detecting anomalous behavior originating from a compromised or infected machine

MINDS Research Defining normal behavior Feature extraction Similarity functions Outlier detection Result summarization Detection of attacks

originating from multiple sites

Wormvirus detectionafter infection

Insider attack Policy violation

Outsider attack Network intrusion

54

IDS Using both Misuse and Anomaly Detection: RIDS-100

RIDS (Rising Intrusion Detection System) is provided by Rising Tech, a leader in antivirus and content security software and services in China.

The company is a leading provider of client, gateway, and server security solutions for virus protection, firewall and intrusion detection technologies, and security services to enterprises and service providers around China.

RIDS makes use of both intrusion detection techniques: misuse detection and anomaly detection.

A distance-based outlier detection algorithm is used to detect deviant behavior among the collected network data.

For misuse detection, it has a very large set of collected data patterns that can be matched against scanned network data.

This large set of data patterns is scanned using a data mining classification algorithm (Decision Tree).

http://www.rising-global.com

55

A cooperative anomaly and intrusion detection system (CAIDS)

Built with a network-based intrusion detection system (NIDS) and an anomaly detection system (ADS) operating interactively through a signature generator

56

A cooperative anomaly and intrusion detection system (CAIDS)

A frequent episode rule (FER) is generated out of a collection of frequent episodes. The FER is defined over episode sequences with multiple connection events.

For example, we envision a window where we observe a 3-event sequence: E, D, and F. An FER is generated as E → D, F with a confidence level freq(a ∪ b) / freq(b) = 0.8, where a represents the event E on the LHS and b corresponds to the two events D and F on the RHS of the rule.

If b occurs with a frequency of 5% and the joint event a and b occurs with a frequency of 4%, there is a (0.04 / 0.05) = 80% chance that D and F will follow in the same window.

57

A cooperative anomaly and intrusion detection system (CAIDS)

In practice, the event E could be an authentication service characterized by two attributes (service = authentication, flag = SF). The events D, F may be two sequential smtp requests, denoted by (service = smtp). Thus we can derive an FER with a confidence level of c = 80% that two smtp services will follow the authentication service within a window w = 2 sec. The three joint traffic events account for a support level s = 10% out of all the network connections being evaluated. This FER is formally stated as follows:

(service = authentication) → (service = smtp), (service = smtp) (0.8, 0.1, 2 sec)   (1)

58

A cooperative anomaly and intrusion detection system (CAIDS)

An association rule is aimed at finding interesting intra-relationships inside a single connection record.

In general, an FER is specified by the following expression:

L1, L2, …, Ln → R1, …, Rm (c, s, window)   (2)

Li (1 ≤ i ≤ n) and Rj (1 ≤ j ≤ m) are ordered traffic connection events.

We call L1, L2, …, Ln the LHS episode and R1, …, Rm the RHS of the episode rule.
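The rule form in expression (2) can be held in a small data structure and checked against a stream of connection events. The following is a minimal sketch under stated assumptions, not the CAIDS implementation: the `FER` class, its `matches` method, the `(timestamp, service)` event format, and the sample events are all hypothetical, and events are assumed to be sorted by time.

```python
from dataclasses import dataclass
from typing import List, Tuple

Event = Tuple[float, str]  # (timestamp in seconds, service name); assumed format


@dataclass
class FER:
    lhs: List[str]      # ordered LHS episode L1..Ln
    rhs: List[str]      # ordered RHS episode R1..Rm
    confidence: float   # c
    support: float      # s
    window: float       # window size in seconds

    def matches(self, events: List[Event], start: int) -> bool:
        """True if the LHS followed by the RHS occurs, in order, beginning
        at events[start] and completing within `window` seconds."""
        pattern = self.lhs + self.rhs
        if events[start][1] != pattern[0]:
            return False
        t0 = events[start][0]
        k = 1  # index of the next pattern element to match
        for t, service in events[start + 1:]:
            if t - t0 > self.window:
                break
            if k < len(pattern) and service == pattern[k]:
                k += 1
        return k == len(pattern)


# Rule (1) from the slides: authentication -> smtp, smtp (0.8, 0.1, 2 sec)
rule = FER(["authentication"], ["smtp", "smtp"], 0.8, 0.1, 2.0)

# Hypothetical event stream
events = [
    (0.0, "authentication"), (0.5, "smtp"), (1.2, "http"),
    (1.8, "smtp"), (5.0, "authentication"), (8.0, "smtp"),
]
hits = [i for i in range(len(events)) if rule.matches(events, i)]
```

Here only the episode starting at index 0 completes inside the 2-second window; the second authentication event is not followed by two smtp events in time.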

59

A cooperative anomaly and intrusion detection system (CAIDS)

[Figure: Architecture of the CAIDS simulator, built with a 2000-signature Snort and an anomaly detection subsystem (ADS) with 60 FERs after 2 weeks of rule training over the Lincoln Lab IDS evaluation dataset]

60

Conclusion

In this report we have studied the basic concepts and some classic system models in this area, such as ADAM and MINDS.

We summarized those system models, their technologies, and their validation methods.

We hope to have given an overview of current developments in this area and of how data mining is evolving into the field of network intrusion detection.

61

Reference

DARPA 1998 data set
  A cleansed set in KDDCup'99
DARPA 1991 data set is also available
http://www.ll.mit.edu/IST/ideval/data/data_index.html

Daniel Barbara, Julia Couto, Sushil Jajodia, Leonard Popyack, Ningning Wu. “ADAM: Detecting Intrusions by Data Mining”. Proceedings of the 2001 IEEE Workshop on Information Assurance and Security, United States Military Academy, West Point, NY, 5-6 June 2001.

Zhang, J. and Zulkernine, M. 2006. A Hybrid Network Intrusion Detection Technique Using Random Forests. In Proceedings of the First International Conference on Availability, Reliability and Security (April 20-22, 2006).

W. Lee et al. A data mining framework for building intrusion detection models. In Information and System Security, Vol. 3, No. 4, 2000.

Ertoz, L. et al. MINDS - Minnesota Intrusion Detection System. Next Generation Data Mining, Chapter 3, 2004.

Lu, C.-T., Boedihardjo, A.P., Manalwar, P. Exploiting efficient data mining techniques to enhance intrusion detection systems. Information Reuse and Integration, Conf, 2005. IRI-2005 IEEE International Conference on, 15-17 Aug. 2005, Page(s) 512-517.

Sal Stolfo, Andreas Prodromidis, Shelley Tselepis, Wenke Lee, Dave Fan, and Phil Chan. (Honorable mention (runner-up) for Best Paper Award in Applied Research Category.) In Proceedings of the Third International Conference on Knowledge Discovery and Data Mining (KDD 97), Newport Beach, CA, August 1997.

62

Questions & Comments

21

Patterns that occur frequently in a database

Mining frequent patterns – finding regularities

Process of mining frequent patterns for intrusion detection
  Phase I: mine a repository of normal frequent itemsets for attack-free data
  Phase II: find frequent itemsets in the last n connections and compare the patterns to the normal profile
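The two phases can be sketched in a few lines. This is an illustration only: the `frequent_itemsets` helper, the `attribute=value` item encoding, the sample connections, and the 0.5 support threshold are assumptions, and the itemsets are counted by brute force rather than by an optimized miner.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(records, min_support, max_size=2):
    """Return the itemsets (as frozensets) contained in at least a
    min_support fraction of the records. Brute-force counting."""
    counts = Counter()
    for record in records:
        items = sorted(record)
        for size in range(1, max_size + 1):
            for combo in combinations(items, size):
                counts[frozenset(combo)] += 1
    n = len(records)
    return {s for s, c in counts.items() if c / n >= min_support}


# Phase I: normal profile mined from attack-free connections (hypothetical data)
normal = [
    {"proto=tcp", "port=80"},
    {"proto=tcp", "port=80"},
    {"proto=tcp", "port=443"},
    {"proto=udp", "port=53"},
]
profile = frequent_itemsets(normal, min_support=0.5)

# Phase II: itemsets frequent in the last n connections but absent
# from the normal profile are reported as suspicious
recent = [
    {"proto=tcp", "port=27374"},
    {"proto=tcp", "port=27374"},
    {"proto=tcp", "port=80"},
]
suspicious = frequent_itemsets(recent, min_support=0.5) - profile
```

With this toy data, `suspicious` holds the itemsets involving port 27374, since `proto=tcp` alone is already part of the normal profile.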

Frequent pattern mining

22

Frequent pattern mining

Apriori
• Any subset of a frequent itemset must also be frequent: an anti-monotone property
  – A transaction containing {beer, diaper, nuts} also contains {beer, diaper}
  – {beer, diaper, nuts} is frequent → {beer, diaper} must also be frequent
• No superset of any infrequent itemset should be generated or tested
  – Many item combinations can be pruned
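A compact sketch of the level-wise search with this pruning rule follows. It is a teaching-sized illustration rather than an optimized implementation; the function name `apriori`, the absolute `min_count` threshold, and the sample baskets are assumptions.

```python
from itertools import combinations


def apriori(transactions, min_count):
    """Level-wise frequent itemset mining with Apriori pruning.
    Returns {itemset: count} for itemsets found in >= min_count transactions."""
    transactions = [frozenset(t) for t in transactions]
    items = {i for t in transactions for i in t}
    level = {frozenset([i]) for i in items}  # candidate 1-itemsets
    frequent = {}
    k = 1
    while level:
        counts = {c: sum(1 for t in transactions if c <= t) for c in level}
        current = {c: n for c, n in counts.items() if n >= min_count}
        frequent.update(current)
        k += 1
        # Join step: unions of frequent (k-1)-itemsets that have exactly k items
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: drop any candidate that has an infrequent (k-1)-subset
        level = {
            c for c in candidates
            if all(frozenset(s) in current for s in combinations(c, k - 1))
        }
    return frequent


baskets = [
    {"beer", "diaper", "nuts"},
    {"beer", "diaper"},
    {"beer", "nuts"},
    {"diaper", "milk"},
]
result = apriori(baskets, min_count=2)
```

In this run {beer, diaper, nuts} is never counted against the data: its subset {diaper, nuts} is infrequent, so the candidate is pruned before the counting pass.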

23

Sequential Pattern Analysis

Models sequence patterns
  (Temporal) order is important in many situations
  Time-series databases and sequence databases
  Frequent patterns → (frequent) sequential patterns

Sequential patterns for intrusion detection
  Capture the signatures for attacks in a series of packets

24

Sequential Pattern Mining

Given a set of sequences, find the complete set of frequent subsequences
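The basic building block of this task is counting how many sequences contain a candidate subsequence. The sketch below shows only that support count, not a complete miner; the helper names and the sample sessions are hypothetical.

```python
def is_subsequence(pattern, sequence):
    """True if the items of pattern appear in sequence in the same
    order, not necessarily contiguously."""
    remaining = iter(sequence)
    # `item in remaining` advances the iterator, so order is respected
    return all(item in remaining for item in pattern)


def support(pattern, sequences):
    """Number of sequences that contain pattern as a subsequence."""
    return sum(1 for s in sequences if is_subsequence(pattern, s))


# Hypothetical event sequences, one per session
sessions = [
    ["syn", "login_fail", "login_fail", "login_ok"],
    ["syn", "login_ok", "logout"],
    ["syn", "login_fail", "login_ok", "logout"],
]
count = support(["login_fail", "login_ok"], sessions)
```

A subsequence is frequent when this count reaches a chosen minimum support; a complete miner would enumerate candidate subsequences and keep those that pass.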

25

Apriori Property in Sequences

26

Classification: A Two-Step Process

Model construction: describe a set of predetermined classes
  Training dataset: tuples for model construction
  Each tuple/sample belongs to a predefined class
  Classification rules, decision trees, or math formulae

Model application: classify unseen objects
  Estimate accuracy of the model using an independent test set
  Acceptable accuracy → apply the model to classify data tuples with unknown class labels

27

Classification

28

Classification: Decision Tree

A node in the tree: a test of some attribute
A branch: a possible value of the attribute

Classification
  Start at the root
  Test the attribute
  Move down the tree branch
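The classification walk described above can be sketched as follows. The tree itself, its attribute names, and the class labels are made up for illustration; only the traversal (start at the root, test the attribute, move down the matching branch) comes from the slide.

```python
# Internal node: (attribute, {value: subtree}); leaf: a class label string.
# This tree is a hypothetical example, not a learned model.
tree = ("protocol", {
    "icmp": "normal",
    "tcp": ("flag", {
        "SYN": "probe",
        "SF": "normal",
    }),
})


def classify(node, record, default="unknown"):
    """Walk from the root to a leaf, following the branch that matches
    the record's value for each tested attribute."""
    while not isinstance(node, str):
        attribute, branches = node
        node = branches.get(record.get(attribute), default)
    return node


label = classify(tree, {"protocol": "tcp", "flag": "SYN"})
fallback = classify(tree, {"protocol": "udp"})
```

A record whose attribute value has no branch falls back to the `default` label, a choice made here to keep the sketch total.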

29

Neural classification: HIDE

“A hierarchical network intrusion detection system using statistical processing and neural network classification” by Zheng et al.

Five major components
  Probes collect traffic data
  Event preprocessor preprocesses traffic data and feeds the statistical model
  Statistical processor maintains a model for normal activities and generates vectors for new events
  Neural network classifies the vectors of new events
  Post processor generates reports

30

Clustering

What Is Clustering?
Group data into clusters
  – Similar to one another within the same cluster
  – Dissimilar to the objects in other clusters
  – Unsupervised learning: no predefined classes

31

Clustering

What Is A Good Clustering?
High intra-class similarity and low inter-class similarity
  Depending on the similarity measure

The ability to discover some or all of the hidden patterns

32

Clustering

Clustering Approaches

Partitioning algorithms
  – Partition the objects into k clusters
  – Iteratively reallocate objects to improve the clustering

Hierarchy algorithms
  – Agglomerative: each object is a cluster; merge clusters to form larger ones
  – Divisive: all objects are in a cluster; split it up into smaller clusters

33

Clustering

K-Means Example
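Since the figure for this slide did not survive extraction, here is a small stand-in sketch of k-means as a partitioning algorithm. The data points and the two initial centroids are invented for illustration, and passing the initial centroids in explicitly (instead of choosing them at random) is an assumption made to keep the run deterministic.

```python
import math


def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then
    move each centroid to the mean of the points assigned to it."""
    centroids = [tuple(c) for c in centroids]
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        updated = [
            tuple(sum(c) / len(cl) for c in zip(*cl)) if cl else centroids[i]
            for i, cl in enumerate(clusters)
        ]
        if updated == centroids:  # converged: no centroid moved
            break
        centroids = updated
    return centroids, clusters


points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
centroids, clusters = kmeans(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
```

The loop is the "iteratively reallocate objects" step of a partitioning algorithm: it stops as soon as a pass leaves every centroid unchanged.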

34

Mining Data Streams for Intrusion Detection

Maintaining profiles of normal activities
  The profiles of normal activities may drift

Identifying novel attacks
  Identifying clusters and outliers in traffic data streams

Reduce the future alarm load by writing filtering rules that automatically discard well-understood false positives

35

Data Mining for Intrusion Detection

Misuse detection
  Predictive models are built from labeled data sets (instances are labeled as “normal” or “intrusive”)
  These models can be more sophisticated and precise than manually created signatures
  Recent research, e.g. JAM (Java Agents for Metalearning)

36

Misuse Detection

[Figure: intrusion patterns and activities → pattern matching → intrusion]

Can't detect new attacks

Example: if (src_ip == dst_ip) then “land attack”

Look for known indicators: ICMP scans, port scans, connection attempts; CPU, RAM, I/O utilization; file system activity, modification of system files, permission modifications
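The land attack example can be written as a tiny signature matcher. This is a sketch of the idea only: the dictionary-based packet format, the `SIGNATURES` table, and the function names are assumptions, and real signature engines use far richer rule languages.

```python
def land_attack(packet):
    """Signature from the slide: source and destination IP are equal."""
    return packet["src_ip"] == packet["dst_ip"]


# Signature table: name -> predicate. Only "land attack" comes from the
# slide; further entries would follow the same shape.
SIGNATURES = {"land attack": land_attack}


def match_signatures(packet):
    """Return the names of all known signatures the packet matches."""
    return [name for name, rule in SIGNATURES.items() if rule(packet)]


alerts = match_signatures({"src_ip": "10.0.0.5", "dst_ip": "10.0.0.5"})
clean = match_signatures({"src_ip": "10.0.0.5", "dst_ip": "10.0.0.9"})
```

The limitation noted on the slide is visible here: a packet can only raise an alert if someone has already added a matching entry to the table.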

37

JAM (Java Agents for Metalearning)

JAM (developed at Columbia University) uses data mining techniques to discover patterns of intrusions. It then applies a meta-learning classifier to learn the signature of attacks.

The association rules algorithm determines relationships between fields in the audit trail records, and the frequent episodes algorithm models sequential patterns of audit events. Features are then extracted from both algorithms and used to compute models of intrusion behavior.

The classifiers build the signature of attacks. Thus, data mining in JAM builds a misuse detection model.

Classifiers in JAM are generated by running a rule learning program on training data of system usage. After training, the resulting classification rules are used to recognize anomalies and detect known intrusions.

The system has been tested with data from Sendmail-based attacks and with network attacks using TCP dump data.

38

Data Mining for Intrusion Detection

Anomaly detection
  Identifies anomalies as deviations from “normal” behavior
  E.g. ADAM: Audit Data Analysis and Mining; MINDS – MINnesota INtrusion Detection System

39

Anomaly Detection

[Figure: chart of activity measures (CPU, Process Size; axis 0 to 90) comparing the normal profile with abnormal values, which are marked as a probable intrusion]

Relatively high false positive rate - anomalies can just be new normal activities

baseline the normal traffic and then look for things that are out of the norm
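Baselining and then looking for deviations can be sketched with a simple statistical profile. The three-standard-deviation threshold, the function names, and the CPU samples below are assumptions chosen for illustration; the slides do not prescribe a particular test.

```python
from statistics import mean, stdev


def build_profile(samples):
    """Baseline of one activity measure: mean and sample standard deviation."""
    return mean(samples), stdev(samples)


def is_anomalous(value, profile, k=3.0):
    """Flag values more than k standard deviations away from the baseline."""
    mu, sigma = profile
    return abs(value - mu) > k * sigma


# Hypothetical CPU utilization (%) observed during normal operation
cpu_baseline = [20, 22, 19, 21, 23, 20, 18, 22]
profile = build_profile(cpu_baseline)

spike = is_anomalous(85, profile)
usual = is_anomalous(22, profile)
```

The false positive problem mentioned above shows up directly: if normal behavior later shifts, for example after a new service is deployed, legitimate values will be flagged until the profile is rebuilt.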

40

ADAM: Audit Data Analysis and Mining

Detecting Intrusion by Data Mining
Combination of Association Rule and Classification Rule

Firstly, ADAM collects known frequent datasets: an off-line algorithm

Secondly, ADAM runs an online algorithm
  Finds last frequent connection records
  Compares them with known mined data
  Discards those which seem to be normal
  Suspicious ones are forwarded to the classifier

Trained classifier then classifies the suspicious data as one of the following
  Known type of attack
  Unknown type of attack
  False alarm

41

ADAM Detecting Intrusion by Data Mining

42

ADAM: Audit Data Analysis and Mining

ADAM has two phases in its model

1st Phase: Train the classifier
  Offline process
  Takes place only once, before the main experiment

2nd Phase: Using the trained classifier
  Trained classifier is then used to detect anomalies
  Online process

43

The MINDS Project

MINDS – MINnesota INtrusion Detection System

Learning from Rare Class – Building rare class prediction models

Anomaly/outlier detection

Summarization of attacks using association pattern analysis

TID  Items
1    Bread, Coke, Milk
2    Beer, Bread
3    Beer, Coke, Diaper, Milk
4    Beer, Bread, Diaper, Milk
5    Coke, Diaper, Milk

Rules Discovered: {Milk} --> {Coke}; {Diaper, Milk} --> {Beer}
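The strength of the two discovered rules can be checked directly on the five transactions. The sketch below uses the usual definitions (support is the fraction of transactions containing both sides, confidence is that count divided by the number of transactions containing the left-hand side); the helper name `rule_stats` is an assumption.

```python
transactions = [
    {"Bread", "Coke", "Milk"},
    {"Beer", "Bread"},
    {"Beer", "Coke", "Diaper", "Milk"},
    {"Beer", "Bread", "Diaper", "Milk"},
    {"Coke", "Diaper", "Milk"},
]


def rule_stats(lhs, rhs, transactions):
    """Return (support, confidence) of the rule lhs --> rhs."""
    lhs, rhs = set(lhs), set(rhs)
    n_lhs = sum(1 for t in transactions if lhs <= t)
    n_both = sum(1 for t in transactions if (lhs | rhs) <= t)
    return n_both / len(transactions), n_both / n_lhs


milk_coke = rule_stats({"Milk"}, {"Coke"}, transactions)
diaper_milk_beer = rule_stats({"Diaper", "Milk"}, {"Beer"}, transactions)
```

On this table {Milk} --> {Coke} has support 3/5 and confidence 3/4, and {Diaper, Milk} --> {Beer} has support 2/5 and confidence 2/3.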

44

MINDS - Learning from Rare Class

Problem: building models for rare network attacks (mining a needle in a haystack)

Standard data mining models are not suitable for rare classes

Models must be able to handle skewed class distributions

Learning from data streams - intrusions are sequences of events

45

MINDS - Anomaly Detection

Detect novel attacks/intrusions by identifying them as deviations from “normal”, i.e. anomalous behavior
  Identify normal behavior
  Construct useful set of features
  Define similarity function
  Use outlier detection algorithm

Nearest neighbor approach

Density based schemes

Unsupervised Support Vector Machines (SVM)
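Of the three options, the nearest neighbor approach is the simplest to sketch: score each connection by its distance to its k-th nearest neighbor and treat the highest scores as the most anomalous. This is an illustration, not the MINDS algorithm itself; Euclidean distance as the similarity function, k = 2, and the two-feature sample flows are assumptions.

```python
import math


def knn_scores(points, k=2):
    """Anomaly score of each point = distance to its k-th nearest neighbor."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores


# Hypothetical normalized flow features
flows = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1), (1.1, 1.0), (6.0, 7.0)]
scores = knn_scores(flows, k=2)
most_anomalous = max(range(len(flows)), key=scores.__getitem__)
```

Ranking connections by score, rather than applying a hard threshold, leaves it to the analyst to decide how far down the list to investigate.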

46

Experimental Evaluation

[Figure: network → net-flow data collected using Cisco routers → data preprocessing → MINDS anomaly detection → anomaly scores → association pattern analysis]

Open source signature-based network IDS: www.snort.org

10-minute cycle, 2 million connections

Anomaly detection is applied 4 times a day, on a 10-minute time window

• Publicly available data set: DARPA 1998 Intrusion Detection Evaluation Data Set, prepared and managed by MIT Lincoln Lab; includes a wide variety of intrusions simulated in a military network environment

• Real network data from University of Minnesota

47

MINDS - Framework for Mining Associations

[Figure: MINDS association analysis module. Labels: Anomaly Detection System; Ranked connections (attack / normal); Discriminating Association Pattern Generator; rules R1: TCP, DstPort=1863 → Attack … R100: TCP, DstPort=80 → Normal; update; Knowledge Base]

1. Build normal profile
2. Study changes in normal behavior
3. Create attack summary
4. Detect misuse behavior
5. Understand nature of the attack

48

Discovered Real-life Association Patterns

At first glance, Rule 1 appears to describe a Web scan. Rule 2 indicates an attack on a specific machine. Both rules together indicate that a scan is performed first, followed by an attack on a specific machine identified as vulnerable by the attacker.

Rule 1: SrcIP=XXXX, DstPort=80, Protocol=TCP, Flag=SYN, NoPackets: 3, NoBytes: 120…180 (c1=256, c2=1)

Rule 2: SrcIP=XXXX, DstIP=YYYY, DstPort=80, Protocol=TCP, Flag=SYN, NoPackets: 3, NoBytes: 120…180 (c1=177, c2=0)

49

DstIP=ZZZZ, DstPort=8888, Protocol=TCP (c1=369, c2=0)

DstIP=ZZZZ, DstPort=8888, Protocol=TCP, Flag=SYN (c1=291, c2=0)

This pattern indicates an anomalously high number of TCP connections on port 8888 involving machine ZZZZ.

Follow-up analysis of connections covered by the pattern indicates that this could be a machine running a variation of the Kazaa file-sharing protocol.

Having an unauthorized application increases the vulnerability of the system.

Discovered Real-life Association Patterns

50

SrcIP=XXXX, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=4, NoBytes=189…200 (c1=582, c2=2)

SrcIP=XXXX, DstPort=12345, NoPackets=4, NoBytes=189…200 (c1=580, c2=3)

SrcIP=YYYY, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=3, NoBytes=144 (c1=694, c2=3)


22

Frequent pattern mining

Apriori bull Any subset of a frequent itemset must be also freque

nt mdash an anti-monotone propertyndash A transaction containing beer diaper nuts alsocontains beer diaperndash beer diaper nuts is frequent beer diaper mustalso be frequent

bull No superset of any infrequent itemset should be generated or tested

ndash Many item combinations can be pruned

23

Sequential Pattern Analysis

Models sequence patterns (Temporal) order is important in many situations

Time-series databases and sequence databases

Frequent patterns (frequent) sequential patterns

Sequential patterns for intrusion detection Capture the signatures for attacks in a series of packets

24

Sequential Pattern Mining

Given a set of sequences find the complete set of frequent subsequences

25

Apriori Property in Sequences

26

Classification A Two-Step Process Model construction describe a set of predetermined

classes Training dataset tuples for model construction

Each tuplesample belongs to a predefined class Classification rules decision trees or math formulae

Model application classify unseen objects Estimate accuracy of the model using an independent test

set Acceptable accuracy apply the model to classify data tu

ples with unknown class labels

27

Classification

28

Classification Decision Tree

A node in the tree a test of some attribute A branch a possible value of the attribute Classification

Start at the root Test the attribute Move down the tree branch

29

Neural classification HIDE

ldquoA hierarchical network intrusion detection system using statistical processing and neural network classificationrdquo by Zheng et al

Five major components Probes collect traffic data Event preprocessor preprocesses traffic data and feeds the

statistical model Statistical processor maintains a model for normal activities

and generates vectors for new events Neural network classifies the vectors of new events Post processor generates reports

30

Clustering

What Is Clustering Group data into clusters

ndash Similar to one another within the same cluster ndash Dissimilar to the objects in other clusters ndash Unsupervised learning no predefined classes

31

Clustering

What Is A Good Clustering High intra-class similarity and low interclasssimilar

ity Depending on the similarity measure

The ability to discover some or all of the hidden patterns

32

Clustering

Clustering Approaches Partitioning algorithms

ndash Partition the objects into k clusters ndash Iteratively reallocate objects to improve the clusterin

g Hierarchy algorithms

ndash Agglomerative each object is a cluster merge clusters to form larger ones

ndash Divisive all objects are in a cluster split it up into smaller clusters

33

Clustering

K-Means Example

34

Mining Data Streams for Intrusion Detection

Maintaining profiles of normal activities The profiles of normal activities may drift

Identifying novel attacks Identifying clusters and outliers in traffic data

streams Reduce the future alarm load by writing

filtering rules that automatically discard well-understood false positives

35

Data Mining for Intrusion Detection

Misuse detectionPredictive models are built from labeled data sets (instances

are labeled as ldquonormalrdquo or ldquointrusiverdquo)These models can be more sophisticated and precise than manually

created signatures Recent research eg JAM (Java Agents for Metalearning)

36

Misuse Detection

Intrusion Patterns

activities

pattern matching

intrusion

Canrsquot detect new attacks

Example if (src_ip == dst_ip) then ldquoland attackrdquo

look for known indicators ICMP Scans port scans connection attempts CPU RAM IO Utilization File system activity modification of system files permission modifications

37

JAM (Java Agents for Metalearning)

JAM (developed at Columbia University) uses data mining techniques to discover patterns of intrusions It then applies a meta-learning classifier to learn the signature of attacks

The association rules algorithm determines relationships between fields in the audit trail records and the frequent episodes algorithm models sequential patterns of audit events Features are then extracted from both algorithms and used to compute models of intrusion behavior

The classifiers build the signature of attacks So thus data mining in JAM builds misuse detection model

Classifiers in the JAM are generated by using rule learning program on training data of system usage After training resulting classification rules is used to recognize anomalies and detect known intrusions

The system has been tested with data from Sendmail-based attacks and with network attacks using TCP dump data

38

Data Mining for Intrusion Detection

Anomaly detection Identifies anomalies as deviations from ldquonormalrdquo behavior Eg ADAM Audit Data Analysis and Mining MINDS ndash MINnesota INt

rusion Detection System

39

Anomaly Detection

activity measures

0102030405060708090

CPU ProcessSize

normal profileabnormal

probable intrusion

Relatively high false positive rate - anomalies can just be new normal activities

baseline the normal traffic and then look for things that are out of the norm

40

ADAM Audit Data Analysis and Mining

Detecting Intrusion by Data MiningCombination of Association Rule and Classification Rule Firstly ADAM collects known frequent datasetsan off-line

algorithm Secondly ADAM runs an online algorithm

Finds last frequent connection records Compare them with known mined data Discards those which seems to be normal Suspicious ones are forwarded to the classifier Trained classifier then classify the suspicious data as one

of the following Known type of attack Unknown type of attack False alarm

41

ADAM Detecting Intrusion by Data Mining

42

ADAM Audit Data Analysis and Mining

ADAM has two phases in their model

1st Phase Train the classifier Offline process Takes place only once Before the main experiment

2nd Phase Using the trained classifier Trained classifier is then used to detect anomalies Online process

43

The MINDS Project

MINDS ndash MINnesota INtrusion Detection System

Learning from Rare Class ndash Building rare class prediction models

Anomalyoutlier detection

Summarization of attacks using association pattern analysis

TID Items

1 Bread Coke Milk

2 Beer Bread

3 Beer Coke Diaper Milk

4 Beer Bread Diaper Milk

5 Coke Diaper Milk

Rules Discovered Milk --gt Coke Diaper Milk --gt Beer

Rules Discovered Milk --gt Coke Diaper Milk --gt Beer

44

MINDS - Learning from Rare Class

Problem Building models for rare network attacks (Mining needle in a haystack)

Standard data mining models are not suitable for rare classes

Models must be able to handle skewed class distributions

Learning from data streams - intrusions are sequences of events

45

MINDS - Anomaly Detection

Detect novel attacksintrusions by identifying them as deviations from ldquonormalrdquo ie anomalous behaviorIdentify normal behaviorConstruct useful set of featuresDefine similarity functionUse outlier detection algorithm

Nearest neighbor approach

Density based schemes

Unsupervised Support Vector Machines (SVM)

46

Experimental Evaluation

network

net-flow data using CISCO

routers

Data preprocessing

MINDSanomaly detection

helliphellip

Anomaly

scores Association pattern analysis

Open source signature-based network IDS wwwsnortor

g

10 minutes cycle

2 millions connections

Anomaly detection is applied

4 times a day

10 minutes time window

bull Publicly available data setDARPA 1998 Intrusion Detection Evaluation Data Set prepared and managed by MIT Lincoln Lab includes a wide variety of intrusions simulated in a military network environment

bull Real network data from University of Minnesota

47

MINDS - Framework for Mining Associations

Anomaly Detection System

attack

normal

R1 TCP DstPort=1863 Attack

hellip

hellip

hellip

hellip

R100 TCP DstPort=80 Normal

Discriminating Association

Pattern Generator

1 Build normal profile

2 Study changes in normal behavior

3 Create attack summary

4 Detect misuse behavior

5 Understand nature of the attack

update

Knowledge Base

Ranked connections

MINDS association analysis module

48

Discovered Real-life Association Patterns

At first glance Rule 1 appears to describe a Web scan Rule 2 indicates an attack on a specific machine Both rules together indicate that a scan is performed first

followed by an attack on a specific machine identified as vulnerable by the attacker

Rule 1 SrcIP=XXXX DstPort=80 Protocol=TCP Flag=SYN NoPackets 3 NoBytes120hellip180 (c1=256 c2 = 1)

Rule 2 SrcIP=XXXX DstIP=YYYY DstPort=80 Protocol=TCPFlag=SYN NoPackets 3 NoBytes 120hellip180 (c1=177 c2 = 0)

49

DstIP=ZZZZ DstPort=8888 Protocol=TCP (c1=369 c2=0)DstIP=ZZZZ DstPort=8888 Protocol=TCP Flag=SYN (c1=291 c2=0)

This pattern indicates an anomalously high number of TCP connections on port 8888 involving machine ZZZZ

Follow-up analysis of connections covered by the pattern indicates that this could be a machine running a variation of the Kazaa file-sharing protocol

Having an unauthorized application increases the vulnerability of the system

Discovered Real-life Association Patterns

50

SrcIP=XXXX DstPort=27374 Protocol=TCP Flag=SYN NoPackets=4 NoBytes=189hellip200 (c1=582 c2=2)

SrcIP=XXXX DstPort=12345 NoPackets=4 NoBytes=189hellip200 (c1=580 c2=3)

SrcIP=YYYY DstPort=27374 Protocol=TCP Flag=SYN NoPackets=3 NoBytes=144 (c1=694 c2=3)

helliphellip This pattern indicates a large number of scans on ports 27374 (which is a signature for the SubSeven worm) and 12345 (which is a signature for NetBus worm)

Further analysis showed that no fewer than five machines scanning for one or both of these ports in any time window

Discovered Real-life Association Patternshellip(ctd)

51

DstPort=6667 Protocol=TCP (c1=254 c2=1)

This pattern indicates an unusually large number of connections on port 6667 detected by the anomaly detector

Port 6667 is where IRC (Internet Relay Chat) is typically run. Further analysis reveals that there are many small packets from/to various IRC servers around the world. Although IRC traffic is not unusual, the fact that it is flagged as anomalous is interesting. This might indicate that the IRC server has been taken down (by a DoS attack, for example) or that it is a rogue IRC server (it could be involved in some hacking activity)

Discovered Real-life Association Patterns… (ctd)

52

DstPort=1863 Protocol=TCP Flag=0 NoPackets=1 NoBytes<139 (c1=498, c2=6)

DstPort=1863 Protocol=TCP Flag=0 (c1=587 c2=6)

DstPort=1863 Protocol=TCP (c1=606 c2=8)

This pattern indicates a large number of anomalous TCP connections on port 1863

Further analysis reveals that the remote IP block is owned by Hotmail

Flag=0 is unusual for TCP traffic

Discovered Real-life Association Patterns… (ctd)

53

MINDS Conclusion

Data mining based algorithms are capable of detecting intrusions that cannot be detected by state-of-the-art signature-based methods

SNORT has static knowledge manually updated by human analysts

MINDS anomaly detection algorithms are adaptive in nature

MINDS anomaly detection algorithms can also be effective in detecting anomalous behavior originating from a compromised or infected machine

MINDS Research

- Defining normal behavior
- Feature extraction
- Similarity functions
- Outlier detection
- Result summarization
- Detection of attacks originating from multiple sites

[Figure labels: Worm/virus detection after infection; Insider attack, Policy violation; Outsider attack, Network intrusion]

54

IDS Using both Misuse and Anomaly Detection: RIDS-100

RIDS (Rising Intrusion Detection System) is provided by Rising Tech, a leader in antivirus and content security software and services in China

The company is a leading provider of client, gateway and server security solutions for virus protection, firewall and intrusion detection technologies, and of security services to enterprises and service providers around China

RIDS makes use of both intrusion detection techniques: misuse detection and anomaly detection

A distance-based outlier detection algorithm is used to detect deviational behavior among the collected network data

For misuse detection, it has a very large set of collected data patterns that can be matched against scanned network data

This large set of data patterns is scanned using a data mining classification algorithm (Decision Tree)

http://www.rising-global.com

55

A cooperative anomaly and intrusion detection system (CAIDS)

Built with a network-based intrusion detection system (NIDS) and an anomaly detection system (ADS) operating interactively through a signature generator

56

A cooperative anomaly and intrusion detection system (CAIDS)

A frequent episode rule (FER) is generated out of a collection of frequent episodes. The FER is defined over episode sequences with multiple connection events

For example, we envision a window where we observe a 3-event sequence: E, D and F. An FER is generated as E → D, F with confidence level freq(a ∪ b) / freq(b) = 0.8, where a represents the event E on the LHS and b corresponds to the two events D and F on the RHS of the rule

If b occurs with a probability of 5% and the joint event a and b occurs with a probability of 4%, there is a (0.04 / 0.05) = 80% chance that D and F will follow in the same window
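The arithmetic on this slide can be checked with a minimal sketch. It follows the ratio exactly as the slide writes it, freq(a ∪ b) / freq(b); the function name `fer_confidence` and its parameter names are illustrative assumptions, not part of CAIDS.

```python
def fer_confidence(freq_joint: float, freq_b: float) -> float:
    """Confidence ratio as written on the slide: freq(a U b) / freq(b)."""
    if freq_b <= 0:
        raise ValueError("freq_b must be positive")
    return freq_joint / freq_b


# Slide values: b occurs in 5% of windows, a and b jointly in 4%.
confidence = fer_confidence(freq_joint=0.04, freq_b=0.05)
print(f"confidence = {confidence:.0%}")
```

The same helper can be reused for any other pair of window frequencies, as long as both are measured over the same set of windows.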

57

A cooperative anomaly and intrusion detection system (CAIDS)

In practice, the event E could be an authentication service characterized by two attributes (service = authentication, flag = SF). The events D, F may be two sequential smtp requests denoted by (service = smtp). Thus we can derive an FER with a confidence level of c = 80% that two smtp services will follow the authentication service within a window w = 2 sec. The three joint traffic events account for a support level s = 10% out of all the network connections being evaluated. This FER is formally stated as follows:

(service = authentication) → (service = smtp), (service = smtp) (0.8, 0.1, 2 sec) (1)

58

A cooperative anomaly and intrusion detection system (CAIDS)

An association rule is aimed at finding interesting intra-relationships inside a single connection record

In general, an FER is specified by the following expression:

L1, L2, …, Ln → R1, …, Rm (c, s, window) (2)

Li (1 ≤ i ≤ n) and Rj (1 ≤ j ≤ m) are ordered traffic connection events

We call L1, L2, …, Ln the LHS episode and R1, …, Rm the RHS of the episode rule

59

A cooperative anomaly and intrusion detection system (CAIDS)

Architecture of the CAIDS simulator, built with a 2000-signature Snort and an anomaly detection subsystem (ADS) with 60 FERs after 2 weeks of rule training over the Lincoln Lab IDS evaluation dataset

60

Conclusion

In this report we have studied the basic concepts and some classic system models in this area, such as ADAM and MINDS

We summarized those system models, their technologies and their validation methods

We hope to have given an overview of current developments in this area and of how data mining is evolving into the field of network intrusion detection

61

Reference

DARPA 1998 data set. A cleansed set is available in KDD Cup '99; the DARPA 1991 data set is also available. http://www.ll.mit.edu/IST/ideval/data/data_index.html

Daniel Barbara, Julia Couto, Sushil Jajodia, Leonard Popyack, Ningning Wu. "ADAM: Detecting Intrusions by Data Mining". Proceedings of the 2001 IEEE Workshop on Information Assurance and Security, United States Military Academy, West Point, NY, 5-6 June 2001

Zhang, J. and Zulkernine, M. 2006. A Hybrid Network Intrusion Detection Technique Using Random Forests. In Proceedings of the First International Conference on Availability, Reliability and Security (April 20-22, 2006)

W. Lee et al. A data mining framework for building intrusion detection models. In Information and System Security, Vol. 3, No. 4, 2000

Ertoz, L. et al. MINDS - Minnesota Intrusion Detection System. Next Generation Data Mining, Chapter 3, 2004

Lu, C.-T., Boedihardjo, A.P., Manalwar, P. Exploiting efficient data mining techniques to enhance intrusion detection systems. Information Reuse and Integration, IRI-2005, IEEE International Conference on, 15-17 Aug. 2005, pages 512-517

Sal Stolfo, Andreas Prodromidis, Shelley Tselepis, Wenke Lee, Dave Fan, and Phil Chan. (Honorable mention (runner-up) for Best Paper Award in Applied Research Category.) In Proceedings of the Third International Conference on Knowledge Discovery and Data Mining (KDD 97), Newport Beach, CA, August 1997

62

Questions & Comments

23

Sequential Pattern Analysis

Models sequence patterns

- (Temporal) order is important in many situations
- Time-series databases and sequence databases
- Frequent patterns, (frequent) sequential patterns

Sequential patterns for intrusion detection

- Capture the signatures for attacks in a series of packets

24

Sequential Pattern Mining

Given a set of sequences find the complete set of frequent subsequences
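To make the problem statement concrete, here is a minimal brute-force sketch. It enumerates every candidate subsequence up to a length limit and keeps those contained in at least `min_support` sequences; it is not an Apriori-style or otherwise optimized miner. The event names and the `max_len` limit are illustrative assumptions.

```python
from itertools import combinations


def is_subsequence(pattern, sequence):
    """True if the items of pattern occur in sequence in order (gaps allowed)."""
    it = iter(sequence)
    return all(item in it for item in pattern)


def frequent_subsequences(sequences, min_support, max_len=3):
    """Return {subsequence: support} for subsequences found in >= min_support sequences."""
    candidates = set()
    for seq in sequences:
        for k in range(1, min(max_len, len(seq)) + 1):
            for idx in combinations(range(len(seq)), k):
                candidates.add(tuple(seq[i] for i in idx))
    result = {}
    for cand in candidates:
        support = sum(is_subsequence(cand, seq) for seq in sequences)
        if support >= min_support:
            result[cand] = support
    return result


# Hypothetical connection-event sequences, one per session.
sessions = [
    ["syn", "login_fail", "login_fail", "login_ok"],
    ["syn", "login_fail", "login_ok"],
    ["syn", "login_ok"],
]
patterns = frequent_subsequences(sessions, min_support=3)
print(patterns)
```

With `min_support=3`, only subsequences present in all three sessions survive; lowering the threshold to 2 also admits the patterns that involve `login_fail`.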

25

Apriori Property in Sequences

26

Classification: A Two-Step Process

Model construction: describe a set of predetermined classes

- Training dataset: tuples for model construction
- Each tuple/sample belongs to a predefined class
- Classification rules, decision trees, or math formulae

Model application: classify unseen objects

- Estimate accuracy of the model using an independent test set
- Acceptable accuracy: apply the model to classify data tuples with unknown class labels

27

Classification

28

Classification: Decision Tree

A node in the tree: a test of some attribute

A branch: a possible value of the attribute

Classification

- Start at the root
- Test the attribute
- Move down the tree branch
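The three classification steps above can be sketched as a short loop over a nested dictionary. The tree below, including its attributes (`flag`, `service`) and class labels, is a hypothetical example and not a tree learned from any data set.

```python
def classify(node, record):
    """Walk from the root: test the node's attribute, follow the matching branch."""
    while isinstance(node, dict):
        value = record[node["attribute"]]
        node = node["branches"][value]
    return node  # a leaf holds the class label


# Hypothetical hand-built tree: internal nodes are dicts, leaves are labels.
tree = {
    "attribute": "flag",
    "branches": {
        "SF": "normal",
        "S0": {
            "attribute": "service",
            "branches": {"http": "intrusive", "smtp": "normal"},
        },
    },
}

label = classify(tree, {"flag": "S0", "service": "http"})
print(label)
```

A record whose attribute value has no branch raises `KeyError` here; a real classifier would need a default branch or a majority-class fallback.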

29

Neural classification: HIDE

"A hierarchical network intrusion detection system using statistical processing and neural network classification" by Zheng et al

Five major components

- Probes collect traffic data
- Event preprocessor preprocesses traffic data and feeds the statistical model
- Statistical processor maintains a model for normal activities and generates vectors for new events
- Neural network classifies the vectors of new events
- Post processor generates reports

30

Clustering

What Is Clustering? Group data into clusters

- Similar to one another within the same cluster
- Dissimilar to the objects in other clusters
- Unsupervised learning: no predefined classes

31

Clustering

What Is A Good Clustering?

High intra-class similarity and low inter-class similarity

- Depending on the similarity measure

The ability to discover some or all of the hidden patterns

32

Clustering

Clustering Approaches

Partitioning algorithms

- Partition the objects into k clusters
- Iteratively reallocate objects to improve the clustering

Hierarchy algorithms

- Agglomerative: each object is a cluster, merge clusters to form larger ones
- Divisive: all objects are in a cluster, split it up into smaller clusters

33

Clustering

K-Means Example
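As one instance of a partitioning algorithm, a minimal k-means sketch is shown here. It assumes Euclidean distance and, to stay deterministic, uses the first k points as initial centroids (real implementations usually pick them at random); the sample points are made up.

```python
import math


def kmeans(points, k, iterations=20):
    """Plain k-means: assign each point to its nearest centroid, then recompute centroids."""
    centroids = [points[i] for i in range(k)]  # assumption: first k points as seeds
    assignment = []
    for _ in range(iterations):
        assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        new_centroids = []
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                new_centroids.append(
                    tuple(sum(xs) / len(members) for xs in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return centroids, assignment


points = [(1.0, 1.0), (9.0, 9.0), (1.0, 2.0), (2.0, 1.0), (8.0, 9.0), (9.0, 8.0)]
centroids, assignment = kmeans(points, k=2)
print(assignment)
```

The loop is the "iteratively reallocate objects" step from the partitioning slide: each pass reassigns points and stops once the centroids no longer change.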

34

Mining Data Streams for Intrusion Detection

Maintaining profiles of normal activities

- The profiles of normal activities may drift

Identifying novel attacks

- Identifying clusters and outliers in traffic data streams

Reduce the future alarm load by writing filtering rules that automatically discard well-understood false positives

35

Data Mining for Intrusion Detection

Misuse detection

- Predictive models are built from labeled data sets (instances are labeled as "normal" or "intrusive")
- These models can be more sophisticated and precise than manually created signatures
- Recent research, e.g. JAM (Java Agents for Metalearning)

36

Misuse Detection

[Figure: activities are pattern-matched against known intrusion patterns to flag an intrusion]

Can't detect new attacks

Example: if (src_ip == dst_ip) then "land attack"

Look for known indicators: ICMP scans, port scans, connection attempts; CPU, RAM, I/O utilization; file system activity, modification of system files, permission modifications
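The land-attack example can be written as a tiny signature table. This is a sketch of the idea only: the packet is a plain dictionary with hypothetical field names, and `SIGNATURES` contains just the one rule from the slide.

```python
# Each signature is a (name, predicate) pair; only the slide's rule is included.
SIGNATURES = [
    ("land attack", lambda pkt: pkt["src_ip"] == pkt["dst_ip"]),
]


def match_signatures(packet):
    """Return the names of all known signatures that the packet matches."""
    return [name for name, predicate in SIGNATURES if predicate(packet)]


suspicious = {"src_ip": "10.0.0.5", "dst_ip": "10.0.0.5"}
ordinary = {"src_ip": "10.0.0.5", "dst_ip": "10.0.0.9"}
print(match_signatures(suspicious))
print(match_signatures(ordinary))
```

The sketch also shows the limitation stated on the slide: a packet from an attack with no entry in `SIGNATURES` returns an empty list and goes undetected.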

37

JAM (Java Agents for Metalearning)

JAM (developed at Columbia University) uses data mining techniques to discover patterns of intrusions. It then applies a meta-learning classifier to learn the signatures of attacks

The association rules algorithm determines relationships between fields in the audit trail records, and the frequent episodes algorithm models sequential patterns of audit events. Features are then extracted from both algorithms and used to compute models of intrusion behavior

The classifiers build the signatures of attacks. Thus, data mining in JAM builds a misuse detection model

Classifiers in JAM are generated by running a rule learning program on training data of system usage. After training, the resulting classification rules are used to recognize anomalies and detect known intrusions

The system has been tested with data from Sendmail-based attacks and with network attacks using TCP dump data

38

Data Mining for Intrusion Detection

Anomaly detection

- Identifies anomalies as deviations from "normal" behavior
- E.g. ADAM (Audit Data Analysis and Mining), MINDS (MINnesota INtrusion Detection System)

39

Anomaly Detection

[Figure: bar chart of activity measures (CPU, Process Size) on a 0-90 scale, comparing the normal profile with abnormal values; abnormal values indicate a probable intrusion]

Relatively high false positive rate: anomalies can just be new normal activities

Baseline the normal traffic and then look for things that are out of the norm
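A minimal sketch of the baselining idea: build a profile (mean and standard deviation) from values observed during normal operation, then flag new values that fall far from it. The three-standard-deviation threshold and the CPU readings are assumptions chosen for illustration; the slides do not specify a scoring rule.

```python
from statistics import mean, stdev


def build_profile(normal_values):
    """Summarize normal activity as (mean, sample standard deviation)."""
    return mean(normal_values), stdev(normal_values)


def is_anomalous(value, profile, threshold=3.0):
    """Flag values more than `threshold` standard deviations from the baseline."""
    mu, sigma = profile
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > threshold


# Hypothetical CPU utilization (%) sampled during normal operation.
cpu_baseline = [20, 22, 19, 21, 23, 20, 18, 22]
profile = build_profile(cpu_baseline)
print(is_anomalous(85, profile), is_anomalous(21, profile))
```

A legitimately new workload would also be flagged by this rule, which is the source of the relatively high false positive rate noted above.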

40

ADAM Audit Data Analysis and Mining

Detecting Intrusion by Data Mining

Combination of Association Rule and Classification Rule

Firstly, ADAM collects known frequent datasets (an off-line algorithm)

Secondly, ADAM runs an online algorithm

- Finds the last frequent connection records
- Compares them with the known mined data
- Discards those which seem to be normal
- Forwards suspicious ones to the classifier

The trained classifier then classifies the suspicious data as one of the following

- Known type of attack
- Unknown type of attack
- False alarm

41

ADAM Detecting Intrusion by Data Mining

42

ADAM Audit Data Analysis and Mining

ADAM's model has two phases

1st Phase: Train the classifier

- Offline process
- Takes place only once
- Before the main experiment

2nd Phase: Using the trained classifier

- Trained classifier is then used to detect anomalies
- Online process

43

The MINDS Project

MINDS - MINnesota INtrusion Detection System

Learning from Rare Class - Building rare class prediction models

Anomaly/outlier detection

Summarization of attacks using association pattern analysis

TID  Items
1    Bread, Coke, Milk
2    Beer, Bread
3    Beer, Coke, Diaper, Milk
4    Beer, Bread, Diaper, Milk
5    Coke, Diaper, Milk

Rules Discovered: {Milk} --> {Coke}; {Diaper, Milk} --> {Beer}
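The two discovered rules can be checked against the five transactions with the standard definitions of support and confidence. The sketch below is only a verification of this toy table; the helper names are illustrative and it is not the MINDS association analysis module.

```python
transactions = [
    {"Bread", "Coke", "Milk"},
    {"Beer", "Bread"},
    {"Beer", "Coke", "Diaper", "Milk"},
    {"Beer", "Bread", "Diaper", "Milk"},
    {"Coke", "Diaper", "Milk"},
]


def count(itemset):
    """Number of transactions that contain every item of itemset."""
    return sum(itemset <= t for t in transactions)


def support(itemset):
    """Fraction of all transactions that contain itemset."""
    return count(itemset) / len(transactions)


def confidence(lhs, rhs):
    """Fraction of transactions containing lhs that also contain rhs."""
    return count(lhs | rhs) / count(lhs)


for lhs, rhs in [({"Milk"}, {"Coke"}), ({"Diaper", "Milk"}, {"Beer"})]:
    print(sorted(lhs), "-->", sorted(rhs),
          f"support={support(lhs | rhs):.2f}",
          f"confidence={confidence(lhs, rhs):.2f}")
```

For this table, {Milk} --> {Coke} has support 3/5 and confidence 3/4, and {Diaper, Milk} --> {Beer} has support 2/5 and confidence 2/3.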

44

MINDS - Learning from Rare Class

Problem: Building models for rare network attacks (mining a needle in a haystack)

Standard data mining models are not suitable for rare classes

Models must be able to handle skewed class distributions

Learning from data streams - intrusions are sequences of events

45

MINDS - Anomaly Detection

Detect novel attacks/intrusions by identifying them as deviations from "normal", i.e. anomalous behavior

Identify normal behavior

Construct useful set of features

Define similarity function

Use outlier detection algorithm

Nearest neighbor approach

Density based schemes

Unsupervised Support Vector Machines (SVM)
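A minimal sketch of the nearest neighbor approach: score each point by its distance to its k-th nearest neighbor, so isolated points receive high scores. Euclidean distance, k = 2 and the sample points are assumptions; the slides do not state which similarity function or parameters MINDS uses.

```python
import math


def knn_outlier_scores(points, k=2):
    """Outlier score of each point = distance to its k-th nearest other point."""
    scores = []
    for i, p in enumerate(points):
        distances = sorted(
            math.dist(p, q) for j, q in enumerate(points) if j != i
        )
        scores.append(distances[k - 1])
    return scores


# Hypothetical 2-feature connection records: four similar ones and one far away.
records = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (10.0, 10.0)]
scores = knn_outlier_scores(records, k=2)
most_anomalous = scores.index(max(scores))
print(most_anomalous, [round(s, 2) for s in scores])
```

Ranking connections by such a score, rather than applying a hard cutoff, lets an analyst review only the top of the list.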

46

Experimental Evaluation

[Figure: MINDS evaluation pipeline. Labels: network; net-flow data using Cisco routers; data preprocessing; MINDS anomaly detection; anomaly scores; association pattern analysis; Snort, an open source signature-based network IDS (www.snort.org)]

- 10 minutes cycle
- 2 million connections
- Anomaly detection is applied 4 times a day
- 10 minutes time window

• Publicly available data set: DARPA 1998 Intrusion Detection Evaluation Data Set, prepared and managed by MIT Lincoln Lab; includes a wide variety of intrusions simulated in a military network environment

• Real network data from University of Minnesota

47

MINDS - Framework for Mining Associations

[Figure: MINDS association analysis module. Labels: Anomaly Detection System; ranked connections (attack / normal); Discriminating Association Pattern Generator; example rules R1: TCP, DstPort=1863, Attack … R100: TCP, DstPort=80, Normal; update; Knowledge Base]

1 Build normal profile

2 Study changes in normal behavior

3 Create attack summary

4 Detect misuse behavior

5 Understand nature of the attack

24

Sequential Pattern Mining

Given a set of sequences find the complete set of frequent subsequences

25

Apriori Property in Sequences

26

Classification A Two-Step Process Model construction describe a set of predetermined

classes Training dataset tuples for model construction

Each tuplesample belongs to a predefined class Classification rules decision trees or math formulae

Model application classify unseen objects Estimate accuracy of the model using an independent test

set Acceptable accuracy apply the model to classify data tu

ples with unknown class labels

27

Classification

28

Classification Decision Tree

A node in the tree a test of some attribute A branch a possible value of the attribute Classification

Start at the root Test the attribute Move down the tree branch

29

Neural classification HIDE

ldquoA hierarchical network intrusion detection system using statistical processing and neural network classificationrdquo by Zheng et al

Five major components Probes collect traffic data Event preprocessor preprocesses traffic data and feeds the

statistical model Statistical processor maintains a model for normal activities

and generates vectors for new events Neural network classifies the vectors of new events Post processor generates reports

30

Clustering

What Is Clustering Group data into clusters

ndash Similar to one another within the same cluster ndash Dissimilar to the objects in other clusters ndash Unsupervised learning no predefined classes

31

Clustering

What Is A Good Clustering High intra-class similarity and low interclasssimilar

ity Depending on the similarity measure

The ability to discover some or all of the hidden patterns

32

Clustering

Clustering Approaches Partitioning algorithms

ndash Partition the objects into k clusters ndash Iteratively reallocate objects to improve the clusterin

g Hierarchy algorithms

ndash Agglomerative each object is a cluster merge clusters to form larger ones

ndash Divisive all objects are in a cluster split it up into smaller clusters

33

Clustering

K-Means Example

34

Mining Data Streams for Intrusion Detection

Maintaining profiles of normal activities The profiles of normal activities may drift

Identifying novel attacks Identifying clusters and outliers in traffic data

streams Reduce the future alarm load by writing

filtering rules that automatically discard well-understood false positives

35

Data Mining for Intrusion Detection

Misuse detectionPredictive models are built from labeled data sets (instances

are labeled as ldquonormalrdquo or ldquointrusiverdquo)These models can be more sophisticated and precise than manually

created signatures Recent research eg JAM (Java Agents for Metalearning)

36

Misuse Detection

Intrusion Patterns

activities

pattern matching

intrusion

Canrsquot detect new attacks

Example if (src_ip == dst_ip) then ldquoland attackrdquo

look for known indicators ICMP Scans port scans connection attempts CPU RAM IO Utilization File system activity modification of system files permission modifications

37

JAM (Java Agents for Metalearning)

JAM (developed at Columbia University) uses data mining techniques to discover patterns of intrusions It then applies a meta-learning classifier to learn the signature of attacks

The association rules algorithm determines relationships between fields in the audit trail records and the frequent episodes algorithm models sequential patterns of audit events Features are then extracted from both algorithms and used to compute models of intrusion behavior

The classifiers build the signature of attacks So thus data mining in JAM builds misuse detection model

Classifiers in the JAM are generated by using rule learning program on training data of system usage After training resulting classification rules is used to recognize anomalies and detect known intrusions

The system has been tested with data from Sendmail-based attacks and with network attacks using TCP dump data

38

Data Mining for Intrusion Detection

Anomaly detection Identifies anomalies as deviations from ldquonormalrdquo behavior Eg ADAM Audit Data Analysis and Mining MINDS ndash MINnesota INt

rusion Detection System

39

Anomaly Detection

activity measures

0102030405060708090

CPU ProcessSize

normal profileabnormal

probable intrusion

Relatively high false positive rate - anomalies can just be new normal activities

baseline the normal traffic and then look for things that are out of the norm

40

ADAM Audit Data Analysis and Mining

Detecting Intrusion by Data MiningCombination of Association Rule and Classification Rule Firstly ADAM collects known frequent datasetsan off-line

algorithm Secondly ADAM runs an online algorithm

Finds last frequent connection records Compare them with known mined data Discards those which seems to be normal Suspicious ones are forwarded to the classifier Trained classifier then classify the suspicious data as one

of the following Known type of attack Unknown type of attack False alarm

41

ADAM Detecting Intrusion by Data Mining

42

ADAM Audit Data Analysis and Mining

ADAM has two phases in their model

1st Phase Train the classifier Offline process Takes place only once Before the main experiment

2nd Phase Using the trained classifier Trained classifier is then used to detect anomalies Online process

43

The MINDS Project

MINDS ndash MINnesota INtrusion Detection System

Learning from Rare Class ndash Building rare class prediction models

Anomalyoutlier detection

Summarization of attacks using association pattern analysis

TID Items

1 Bread Coke Milk

2 Beer Bread

3 Beer Coke Diaper Milk

4 Beer Bread Diaper Milk

5 Coke Diaper Milk

Rules Discovered Milk --gt Coke Diaper Milk --gt Beer

Rules Discovered Milk --gt Coke Diaper Milk --gt Beer

44

MINDS - Learning from Rare Class

Problem Building models for rare network attacks (Mining needle in a haystack)

Standard data mining models are not suitable for rare classes

Models must be able to handle skewed class distributions

Learning from data streams - intrusions are sequences of events

45

MINDS - Anomaly Detection

Detect novel attacksintrusions by identifying them as deviations from ldquonormalrdquo ie anomalous behaviorIdentify normal behaviorConstruct useful set of featuresDefine similarity functionUse outlier detection algorithm

Nearest neighbor approach

Density based schemes

Unsupervised Support Vector Machines (SVM)

46

Experimental Evaluation

network

net-flow data using CISCO

routers

Data preprocessing

MINDSanomaly detection

helliphellip

Anomaly

scores Association pattern analysis

Open source signature-based network IDS wwwsnortor

g

10 minutes cycle

2 millions connections

Anomaly detection is applied

4 times a day

10 minutes time window

bull Publicly available data setDARPA 1998 Intrusion Detection Evaluation Data Set prepared and managed by MIT Lincoln Lab includes a wide variety of intrusions simulated in a military network environment

bull Real network data from University of Minnesota

47

MINDS - Framework for Mining Associations

Anomaly Detection System

attack

normal

R1 TCP DstPort=1863 Attack

hellip

hellip

hellip

hellip

R100 TCP DstPort=80 Normal

Discriminating Association

Pattern Generator

1 Build normal profile

2 Study changes in normal behavior

3 Create attack summary

4 Detect misuse behavior

5 Understand nature of the attack

update

Knowledge Base

Ranked connections

MINDS association analysis module

48

Discovered Real-life Association Patterns

At first glance Rule 1 appears to describe a Web scan Rule 2 indicates an attack on a specific machine Both rules together indicate that a scan is performed first

followed by an attack on a specific machine identified as vulnerable by the attacker

Rule 1 SrcIP=XXXX DstPort=80 Protocol=TCP Flag=SYN NoPackets 3 NoBytes120hellip180 (c1=256 c2 = 1)

Rule 2 SrcIP=XXXX DstIP=YYYY DstPort=80 Protocol=TCPFlag=SYN NoPackets 3 NoBytes 120hellip180 (c1=177 c2 = 0)

49

DstIP=ZZZZ DstPort=8888 Protocol=TCP (c1=369 c2=0)DstIP=ZZZZ DstPort=8888 Protocol=TCP Flag=SYN (c1=291 c2=0)

This pattern indicates an anomalously high number of TCP connections on port 8888 involving machine ZZZZ

Follow-up analysis of connections covered by the pattern indicates that this could be a machine running a variation of the Kazaa file-sharing protocol

Having an unauthorized application increases the vulnerability of the system

Discovered Real-life Association Patterns

50

SrcIP=XXXX DstPort=27374 Protocol=TCP Flag=SYN NoPackets=4 NoBytes=189hellip200 (c1=582 c2=2)

SrcIP=XXXX DstPort=12345 NoPackets=4 NoBytes=189hellip200 (c1=580 c2=3)

SrcIP=YYYY DstPort=27374 Protocol=TCP Flag=SYN NoPackets=3 NoBytes=144 (c1=694 c2=3)

helliphellip This pattern indicates a large number of scans on ports 27374 (which is a signature for the SubSeven worm) and 12345 (which is a signature for NetBus worm)

Further analysis showed that no fewer than five machines scanning for one or both of these ports in any time window

Discovered Real-life Association Patternshellip(ctd)

51

DstPort=6667 Protocol=TCP (c1=254 c2=1)

This pattern indicates an unusually large number of connections on port 6667 detected by the anomaly detector

Port 6667 is where IRC (Internet Relay Chat) is typically run Further analysis reveals that there are many small packets

fromto various IRC servers around the world Although IRC traffic is not unusual the fact that it is flagged

as anomalous is interesting This might indicate that the IRC server has been taken down (by a

DOS attack for example) or it is a rogue IRC server (it could be involved in some hacking activity)

Discovered Real-life Association Patternshellip(ctd)

52

DstPort=1863 Protocol=TCP Flag=0 NoPackets=1 NoByteslt139 (c1=498 c2=6)

DstPort=1863 Protocol=TCP Flag=0 (c1=587 c2=6)

DstPort=1863 Protocol=TCP (c1=606 c2=8)

This pattern indicates a large number of anomalous TCP connections on port 1863

Further analysis reveals that the remote IP block is owned by Hotmail

Flag=0 is unusual for TCP traffic

Discovered Real-life Association Patterns… (ctd)

53

MINDS Conclusion

Data mining based algorithms are capable of detecting intrusions that cannot be detected by state-of-the-art signature-based methods

SNORT has static knowledge manually updated by human analysts

MINDS anomaly detection algorithms are adaptive in nature

MINDS anomaly detection algorithms can also be effective in detecting anomalous behavior originating from a compromised or infected machine

MINDS Research

– Defining normal behavior
– Feature extraction
– Similarity functions
– Outlier detection
– Result summarization
– Detection of attacks originating from multiple sites

Worm/virus detection after infection

Insider attack Policy violation

Outsider attack Network intrusion

54

IDS Using Both Misuse and Anomaly Detection: RIDS-100

RIDS (Rising Intrusion Detection System) is provided by Rising Tech, a leader in antivirus and content security software and services in China

The company is a leading provider of client, gateway, and server security solutions for virus protection, firewall, and intrusion detection technologies, and of security services to enterprises and service providers around China

RIDS makes use of both intrusion detection techniques: misuse detection and anomaly detection

A distance-based outlier detection algorithm is used to detect deviating behavior in the collected network data

For misuse detection, it has a very large set of collected data patterns that can be matched against scanned network data

This large set of data patterns is scanned using a data mining classification algorithm (decision tree)

http://www.rising-global.com

55

A Cooperative Anomaly and Intrusion Detection System (CAIDS)

CAIDS is built with a network-based intrusion detection system (NIDS) and an anomaly detection system (ADS) operating interactively through a signature generator

56

A Cooperative Anomaly and Intrusion Detection System (CAIDS)

A frequent episode rule (FER) is generated out of a collection of frequent episodes. The FER is defined over episode sequences with multiple connection events

For example, we envision a window in which we observe a 3-event sequence: E, D, and F

An FER is generated as E → D, F with confidence level freq(a ∪ b) / freq(a) = 0.8, where a represents the event E on the LHS and b corresponds to the two events D and F on the RHS of the rule

If a occurs with a frequency of 5% and the joint event of a and b with a frequency of 4%, there is a (0.04 / 0.05) = 80% chance that D and F will follow E in the same window
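The confidence arithmetic on this slide can be checked with a small sketch. It assumes the usual rule-confidence definition (joint frequency of LHS and RHS divided by the frequency of the LHS episode); the function name and the toy windows are illustrative and are not taken from CAIDS.

```python
def fer_confidence(windows, lhs, rhs):
    """Confidence of the episode rule lhs -> rhs over a list of event windows.

    A window supports an episode if the episode's events occur in it in order.
    """
    def contains(window, episode):
        it = iter(window)
        return all(event in it for event in episode)  # ordered subsequence test

    lhs_count = sum(contains(w, lhs) for w in windows)
    joint_count = sum(contains(w, lhs + rhs) for w in windows)
    return joint_count / lhs_count if lhs_count else 0.0


# Toy data: 100 windows; E occurs in 5 of them, E followed by D, F in 4.
windows = [["E", "D", "F"]] * 4 + [["E", "G"]] + [["G", "H"]] * 95
confidence = fer_confidence(windows, ["E"], ["D", "F"])
print(confidence)
```

With these toy windows the joint episode has 4% frequency and E has 5%, which mirrors the numbers used in the slide.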

57

A Cooperative Anomaly and Intrusion Detection System (CAIDS)

In practice, the event E could be an authentication service characterized by two attributes (service = authentication, flag = SF)

The events D and F may be two sequential smtp requests, each denoted by (service = smtp)

Thus we can derive an FER with a confidence level of c = 80% that two smtp services will follow the authentication service within a window of w = 2 sec

The three joint traffic events account for a support level of s = 10% of all the network connections being evaluated

This FER is formally stated as follows:

(service = authentication) → (service = smtp), (service = smtp) (0.8, 0.1, 2 sec) (1)

58

A Cooperative Anomaly and Intrusion Detection System (CAIDS)

An association rule aims at finding interesting intra-relationships inside a single connection record

In general, an FER is specified by the following expression:

L1, L2, …, Ln → R1, …, Rm (c, s, window) (2)

Li (1 ≤ i ≤ n) and Rj (1 ≤ j ≤ m) are ordered traffic connection events

We call L1, L2, …, Ln the LHS episode and R1, …, Rm the RHS episode of the rule
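One way to hold a rule of this general form in code is a small record type. The sketch below is only an illustration: the class and field names are hypothetical, and the instance encodes the authentication/smtp example with confidence 0.8, support 0.1, and a 2-second window.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class FrequentEpisodeRule:
    lhs: Tuple[str, ...]   # ordered LHS events L1..Ln
    rhs: Tuple[str, ...]   # ordered RHS events R1..Rm
    confidence: float      # c
    support: float         # s
    window_sec: float      # window length in seconds

    def __str__(self):
        left = ", ".join(self.lhs)
        right = ", ".join(self.rhs)
        return f"{left} -> {right} ({self.confidence}, {self.support}, {self.window_sec} sec)"


rule1 = FrequentEpisodeRule(
    lhs=("service=authentication",),
    rhs=("service=smtp", "service=smtp"),
    confidence=0.8,
    support=0.1,
    window_sec=2,
)
print(rule1)
```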

59

A Cooperative Anomaly and Intrusion Detection System (CAIDS)

Architecture of the CAIDS simulator, built with a 2000-signature Snort and an anomaly detection subsystem (ADS) with 60 FERs after 2 weeks of rule training over the Lincoln Lab IDS evaluation dataset

60

Conclusion

In this report we have studied the basic concepts of this area and some classic system models, such as ADAM and MINDS

We have summarized those system models, their technologies, and their validation methods

We hope to have given an overview of current developments in this area and of how data mining is evolving into the field of network intrusion detection

61

Reference

DARPA 1998 data set. A cleansed set in KDDCup'99; DARPA 1991 data set is also available. http://www.ll.mit.edu/IST/ideval/data/data_index.html

Daniel Barbara, Julia Couto, Sushil Jajodia, Leonard Popyack, Ningning Wu. “ADAM: Detecting Intrusions by Data Mining.” Proceedings of the 2001 IEEE Workshop on Information Assurance and Security, United States Military Academy, West Point, NY, 5-6 June 2001

Zhang, J. and Zulkernine, M. 2006. A Hybrid Network Intrusion Detection Technique Using Random Forests. In Proceedings of the First International Conference on Availability, Reliability and Security (April 20-22, 2006)

W. Lee et al. A data mining framework for building intrusion detection models. In Information and System Security, Vol. 3, No. 4, 2000

Ertoz, L. et al. MINDS - Minnesota Intrusion Detection System. Next Generation Data Mining, Chapter 3, 2004

Lu, C.-T., Boedihardjo, A. P., Manalwar, P. Exploiting efficient data mining techniques to enhance intrusion detection systems. IEEE International Conference on Information Reuse and Integration (IRI-2005), 15-17 Aug. 2005, pp. 512-517

Sal Stolfo, Andreas Prodromidis, Shelley Tselepis, Wenke Lee, Dave Fan, and Phil Chan. (Honorable mention (runner-up) for Best Paper Award in Applied Research Category.) In Proceedings of the Third International Conference on Knowledge Discovery and Data Mining (KDD 97), Newport Beach, CA, August 1997

62

Questions & Comments

25

Apriori Property in Sequences

26

Classification: A Two-Step Process

Model construction: describe a set of predetermined classes
– Training dataset: tuples for model construction
– Each tuple/sample belongs to a predefined class
– Classification rules, decision trees, or math formulae

Model application: classify unseen objects
– Estimate accuracy of the model using an independent test set
– Acceptable accuracy: apply the model to classify data tuples with unknown class labels
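The two steps can be sketched with a deliberately tiny learner. The single-threshold model, the feature (failed logins per connection), the data, and the 0.75 acceptance level are all assumptions made for illustration; the slide does not prescribe a particular classifier.

```python
def train_stump(samples):
    """Model construction: pick the threshold on one numeric feature that
    best separates 'normal' from 'intrusive' in the training set."""
    best_threshold, best_correct = None, -1
    for threshold in sorted({x for x, _ in samples}):
        correct = sum((x >= threshold) == (label == "intrusive") for x, label in samples)
        if correct > best_correct:
            best_threshold, best_correct = threshold, correct
    return best_threshold


def classify(threshold, x):
    return "intrusive" if x >= threshold else "normal"


def accuracy(threshold, samples):
    return sum(classify(threshold, x) == label for x, label in samples) / len(samples)


# Hypothetical feature: failed logins per connection.
train = [(0, "normal"), (1, "normal"), (2, "normal"), (6, "intrusive"), (8, "intrusive")]
test = [(1, "normal"), (7, "intrusive"), (3, "normal"), (9, "intrusive")]

threshold = train_stump(train)               # step 1: model construction
test_accuracy = accuracy(threshold, test)    # step 2: estimate accuracy on an independent test set
if test_accuracy >= 0.75:                    # apply the model only if accuracy is acceptable
    print(classify(threshold, 5))
```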

27

Classification

28

Classification: Decision Tree

A node in the tree: a test of some attribute
A branch: a possible value of the attribute
Classification
– Start at the root
– Test the attribute
– Move down the tree branch
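The traversal described above can be written in a few lines. This is a minimal sketch: the nested-dictionary layout, the attribute names, and the class labels are hypothetical, and the tree is hand-built rather than learned.

```python
# Internal node: {"attribute": name, "branches": {value: subtree}}; a leaf is a class label.
tree = {
    "attribute": "protocol",
    "branches": {
        "icmp": "normal",
        "tcp": {
            "attribute": "flag",
            "branches": {"SYN": "probe", "SF": "normal"},
        },
    },
}


def classify(node, record):
    """Start at the root, test the node's attribute, and follow the matching branch."""
    while isinstance(node, dict):
        value = record[node["attribute"]]
        node = node["branches"][value]
    return node


label = classify(tree, {"protocol": "tcp", "flag": "SYN"})
print(label)
```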

29

Neural classification: HIDE

“A hierarchical network intrusion detection system using statistical processing and neural network classification” by Zheng et al

Five major components
– Probes: collect traffic data
– Event preprocessor: preprocesses traffic data and feeds the statistical model
– Statistical processor: maintains a model for normal activities and generates vectors for new events
– Neural network: classifies the vectors of new events
– Post processor: generates reports

30

Clustering

What Is Clustering?

Group data into clusters
– Similar to one another within the same cluster
– Dissimilar to the objects in other clusters
– Unsupervised learning: no predefined classes

31

Clustering

What Is a Good Clustering?

High intra-class similarity and low inter-class similarity
– Depending on the similarity measure

The ability to discover some or all of the hidden patterns

32

Clustering

Clustering Approaches

Partitioning algorithms
– Partition the objects into k clusters
– Iteratively reallocate objects to improve the clustering

Hierarchy algorithms
– Agglomerative: each object is a cluster; merge clusters to form larger ones
– Divisive: all objects are in a cluster; split it up into smaller clusters

33

Clustering

K-Means Example
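Since the slide's figure is not available in this transcript, here is a minimal k-means sketch of the partitioning idea (assign objects to k clusters, then iteratively reallocate). The points and the initial centroids are made up, and the initial centroids are passed in explicitly so the run is deterministic.

```python
def kmeans(points, centroids, iterations=10):
    """Plain k-means on 2-D points with caller-supplied initial centroids."""
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [
            (sum(x for x, _ in cl) / len(cl), sum(y for _, y in cl) / len(cl)) if cl else c
            for cl, c in zip(clusters, centroids)
        ]
    return centroids, clusters


points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = kmeans(points, centroids=[(0, 0), (10, 10)])
print(centroids)
```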

34

Mining Data Streams for Intrusion Detection

Maintaining profiles of normal activities
– The profiles of normal activities may drift

Identifying novel attacks
– Identifying clusters and outliers in traffic data streams

Reduce the future alarm load by writing filtering rules that automatically discard well-understood false positives

35

Data Mining for Intrusion Detection

Misuse detection
– Predictive models are built from labeled data sets (instances are labeled as “normal” or “intrusive”)
– These models can be more sophisticated and precise than manually created signatures
– Recent research, e.g. JAM (Java Agents for Metalearning)

36

Misuse Detection

[Figure: activities are pattern-matched against known intrusion patterns to flag an intrusion]

Can't detect new attacks

Example: if (src_ip == dst_ip) then “land attack”

Look for known indicators: ICMP scans, port scans, connection attempts; CPU, RAM, I/O utilization; file system activity, modification of system files, permission modifications
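The land-attack example can be turned into a tiny signature matcher. This is a sketch only: the record fields, the second signature, and its threshold of 100 ports are hypothetical, and a real misuse detector would carry far more signatures.

```python
# Each signature is a (name, predicate) pair over a connection record.
# The land-attack rule comes from the slide; the port-scan threshold is a made-up value.
SIGNATURES = [
    ("land attack", lambda r: r["src_ip"] == r["dst_ip"]),
    ("port scan", lambda r: r.get("distinct_dst_ports", 0) > 100),
]


def match_signatures(record):
    """Return the names of all known signatures that the record matches."""
    return [name for name, predicate in SIGNATURES if predicate(record)]


alerts = match_signatures({"src_ip": "10.0.0.5", "dst_ip": "10.0.0.5"})
print(alerts)
```

A record that matches no signature yields an empty list, which is exactly why this approach cannot detect new attacks.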

37

JAM (Java Agents for Metalearning)

JAM (developed at Columbia University) uses data mining techniques to discover patterns of intrusions. It then applies a meta-learning classifier to learn the signature of attacks

The association rules algorithm determines relationships between fields in the audit trail records, and the frequent episodes algorithm models sequential patterns of audit events. Features are then extracted from both algorithms and used to compute models of intrusion behavior

The classifiers build the signatures of attacks, so data mining in JAM builds a misuse detection model

Classifiers in JAM are generated by running a rule learning program on training data of system usage. After training, the resulting classification rules are used to recognize anomalies and detect known intrusions

The system has been tested with data from Sendmail-based attacks and with network attacks using tcpdump data

38

Data Mining for Intrusion Detection

Anomaly detection
– Identifies anomalies as deviations from “normal” behavior
– E.g. ADAM: Audit Data Analysis and Mining; MINDS – MINnesota INtrusion Detection System

39

Anomaly Detection

[Figure: bar chart of activity measures (e.g. CPU, process size) comparing the normal profile with abnormal values; a large deviation marks a probable intrusion]

Relatively high false positive rate: anomalies can just be new normal activities

Baseline the normal traffic and then look for things that are out of the norm
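A minimal sketch of "baseline, then look for deviations", assuming a single activity measure summarized by its mean and standard deviation. The CPU readings and the three-sigma cutoff are illustrative choices, not values from the slides.

```python
from statistics import mean, stdev


def build_profile(baseline):
    """Summarize normal behavior of one activity measure by mean and standard deviation."""
    return mean(baseline), stdev(baseline)


def is_anomalous(value, profile, k=3.0):
    """Flag values more than k standard deviations away from the normal mean."""
    mu, sigma = profile
    return abs(value - mu) > k * sigma


cpu_baseline = [20, 22, 19, 21, 23, 20, 18, 22]   # hypothetical CPU % readings
profile = build_profile(cpu_baseline)
print(is_anomalous(21, profile), is_anomalous(85, profile))
```

A legitimate but new workload would also exceed the cutoff, which illustrates the false-positive problem noted above.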

40

ADAM Audit Data Analysis and Mining

Detecting Intrusion by Data Mining

Combination of association rules and classification rules

First, ADAM collects known frequent datasets with an off-line algorithm

Second, ADAM runs an online algorithm
– Finds the latest frequent connection records
– Compares them with the known mined data
– Discards those that seem to be normal
– Forwards the suspicious ones to the classifier

The trained classifier then classifies the suspicious data as one of the following
– Known type of attack
– Unknown type of attack
– False alarm
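The online filtering step can be sketched as a set difference between itemsets that are frequent in the current window and itemsets already known from attack-free data. This is a simplified illustration, not ADAM's actual implementation: the function names, the fixed itemset size, the support threshold, and the sample traffic are all assumptions, and the classifier stage is omitted.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(connections, min_support, size=2):
    """Attribute-value itemsets of a given size whose support reaches min_support."""
    counts = Counter()
    for conn in connections:
        for itemset in combinations(sorted(conn.items()), size):
            counts[itemset] += 1
    n = len(connections)
    return {itemset for itemset, c in counts.items() if c / n >= min_support}


def suspicious_itemsets(window, normal_profile, min_support=0.5):
    """Online step: keep itemsets frequent in the window but absent from the normal profile."""
    return frequent_itemsets(window, min_support) - normal_profile


training = [{"dst_port": 80, "flag": "SF"}] * 10              # attack-free traffic (off-line step)
normal_profile = frequent_itemsets(training, min_support=0.5)
window = [{"dst_port": 80, "flag": "SF"}] * 3 + [{"dst_port": 27374, "flag": "S0"}] * 5
suspects = suspicious_itemsets(window, normal_profile)
print(suspects)
```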

41

ADAM Detecting Intrusion by Data Mining

42

ADAM Audit Data Analysis and Mining

ADAM has two phases in its model

1st phase: train the classifier
– Offline process
– Takes place only once, before the main experiment

2nd phase: use the trained classifier
– The trained classifier is then used to detect anomalies
– Online process

43

The MINDS Project

MINDS – MINnesota INtrusion Detection System

Learning from Rare Class – Building rare class prediction models

Anomaly/outlier detection

Summarization of attacks using association pattern analysis

TID  Items
1    Bread, Coke, Milk
2    Beer, Bread
3    Beer, Coke, Diaper, Milk
4    Beer, Bread, Diaper, Milk
5    Coke, Diaper, Milk

Rules Discovered: {Milk} --> {Coke}; {Diaper, Milk} --> {Beer}
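The support and confidence of the two discovered rules can be computed directly from the five transactions. The sketch below uses the standard definitions (support as the fraction of transactions containing an itemset, confidence as the joint support divided by the support of the left-hand side); the function names are illustrative.

```python
transactions = [
    {"Bread", "Coke", "Milk"},
    {"Beer", "Bread"},
    {"Beer", "Coke", "Diaper", "Milk"},
    {"Beer", "Bread", "Diaper", "Milk"},
    {"Coke", "Diaper", "Milk"},
]


def support(itemset):
    """Fraction of transactions that contain every item in the itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(lhs, rhs):
    """Support of lhs and rhs together, divided by the support of lhs."""
    return support(lhs | rhs) / support(lhs)


print(support({"Milk", "Coke"}), confidence({"Milk"}, {"Coke"}))
print(support({"Diaper", "Milk", "Beer"}), confidence({"Diaper", "Milk"}, {"Beer"}))
```

Milk appears in four transactions and Milk with Coke in three, so the first rule has confidence 3/4; Diaper with Milk appears in three transactions and with Beer in two, so the second has confidence 2/3.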

44

MINDS - Learning from Rare Class

Problem Building models for rare network attacks (Mining needle in a haystack)

Standard data mining models are not suitable for rare classes

Models must be able to handle skewed class distributions

Learning from data streams - intrusions are sequences of events

45

MINDS - Anomaly Detection

Detect novel attacks/intrusions by identifying them as deviations from “normal”, i.e. anomalous, behavior
– Identify normal behavior
– Construct useful set of features
– Define similarity function
– Use outlier detection algorithm
  – Nearest neighbor approach
  – Density based schemes
  – Unsupervised Support Vector Machines (SVM)
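As an illustration of the nearest neighbor approach, the sketch below scores each connection by the distance to its k-th nearest neighbor, so isolated points get high scores. It is a generic technique, not the specific outlier detector used in MINDS; the feature vectors and k are hypothetical, and Euclidean distance stands in for the similarity function.

```python
import math


def knn_outlier_scores(points, k=2):
    """Score each point by its distance to its k-th nearest neighbor."""
    scores = []
    for i, p in enumerate(points):
        distances = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(distances[k - 1])
    return scores


# Hypothetical feature vectors: (packets, bytes / 100) per connection.
points = [(3, 1.2), (3, 1.5), (4, 1.8), (3, 1.4), (40, 90.0)]
scores = knn_outlier_scores(points)
most_anomalous = points[scores.index(max(scores))]
print(most_anomalous)
```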

46

Experimental Evaluation

[Figure: evaluation pipeline. Network → net-flow data collected using Cisco routers → data preprocessing → MINDS anomaly detection → anomaly scores → association pattern analysis. Also shown: Snort, the open source signature-based network IDS (www.snort.org)]

10-minute cycle, 2 million connections

Anomaly detection is applied 4 times a day with a 10-minute time window

• Publicly available data set: DARPA 1998 Intrusion Detection Evaluation Data Set, prepared and managed by MIT Lincoln Lab; includes a wide variety of intrusions simulated in a military network environment

• Real network data from University of Minnesota

47

MINDS - Framework for Mining Associations

[Figure: MINDS association analysis module. The anomaly detection system outputs ranked connections labeled attack or normal; the discriminating association pattern generator turns them into rules (R1: TCP, DstPort=1863 → Attack; … R100: TCP, DstPort=80 → Normal) that update the knowledge base]

1. Build normal profile
2. Study changes in normal behavior
3. Create attack summary
4. Detect misuse behavior
5. Understand nature of the attack

48

Discovered Real-life Association Patterns

At first glance, Rule 1 appears to describe a Web scan

Rule 2 indicates an attack on a specific machine

Both rules together indicate that a scan is performed first, followed by an attack on a specific machine identified as vulnerable by the attacker

Rule 1: SrcIP=XXXX, DstPort=80, Protocol=TCP, Flag=SYN, NoPackets: 3, NoBytes: 120…180 (c1=256, c2=1)

Rule 2: SrcIP=XXXX, DstIP=YYYY, DstPort=80, Protocol=TCP, Flag=SYN, NoPackets: 3, NoBytes: 120…180 (c1=177, c2=0)
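The slides do not define c1 and c2. The sketch below assumes they count the connections covered by a pattern among those ranked anomalous and among those ranked normal, respectively, so a large c1 with a small c2 marks a discriminating pattern. The helper names and sample connections are hypothetical.

```python
def matches(pattern, conn):
    """A connection matches when it agrees with every attribute in the pattern."""
    return all(conn.get(attr) == value for attr, value in pattern.items())


def pattern_counts(pattern, anomalous, normal):
    """Assumed meaning: c1 = matches among anomalous connections, c2 = among normal ones."""
    c1 = sum(matches(pattern, c) for c in anomalous)
    c2 = sum(matches(pattern, c) for c in normal)
    return c1, c2


pattern = {"DstPort": 80, "Protocol": "TCP", "Flag": "SYN"}
anomalous = [
    {"SrcIP": "X", "DstPort": 80, "Protocol": "TCP", "Flag": "SYN"},
    {"SrcIP": "X", "DstPort": 80, "Protocol": "TCP", "Flag": "SYN"},
    {"SrcIP": "Y", "DstPort": 22, "Protocol": "TCP", "Flag": "SYN"},
]
normal = [{"SrcIP": "Z", "DstPort": 80, "Protocol": "TCP", "Flag": "ACK"}]
c1, c2 = pattern_counts(pattern, anomalous, normal)
print(c1, c2)
```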



57

A cooperative anomaly and intrusiondetection system (CAIDS)

In practice the event E could be an authentication service characterized by two attributes

(service =authentication flag=SF) The events D F may be two sequential smtp requests denoted b

y (service = smtp) Thus we can derive an FER with a confidence level of c = 80 t

hat two smtp services will follow the authentication service within a window w = 2 sec The three joint traffic events accounts with a support level s = 10 out of all the network connections being evaluated This FER is formally stated as follows

(service = authentication) rarr (service = smtp) (service = smtp) (08 01 2 sec) (1)

58

A cooperative anomaly and intrusiondetection system (CAIDS)

An association rule is aimed at finding interesting intra-relationship inside a single connection record

In general an FER is specified by the following expression

L1 L2hellip Ln R1hellip Rm (c s window) (2)

Li (1 le i le n) and Rj (1 le j le m) are ordered traffic connection events

We call L1 L2hellip Ln the LHS episode and R1hellip Rm the RHS of the episode rule

59

A cooperative anomaly and intrusiondetection system (CAIDS)

Architecture of the CAIDS simulator built with a 2000-signature Snortand an anomaly detection subsystem (ADS) with 60 FERs after 2 weeksof rule training over the Lincoln Lab IDS evaluation dataset

60

Conclusion

In this report we have studied basic concept and some classic system models like ADAM MINDSin this area

To make summary of those system models their technologies and their validation methods

Hope to a overview on currently development in this area and how data mining is evolving into the field of network intrusion detection

61

Reference DARPA 1998 data set

A cleansed set in KDDCuprsquo99 DARPA 1991 data set is also available httpwwwllmiteduISTidevaldatadata_indexhtml

Daniel Barbara Julia Couto Sushil Jajodia Leonard Popyack Ningning Wu ldquoADAM Detecting Intrusions by Data Miningrdquo Proceedings of the 2001 IEEE Workshop on Information Assurance and Security United States Military Academy West Point NY 5-6 June 2001

Zhang J and Zulkernine M 2006 A Hybrid Network Intrusion Detection Technique Using Random Forests In Proceedings of the First international Conference on Availability Reliability and Security (April 20 - 22 2006)

W Lee et al A data mining framework for building intrusion detection models In Information and System Security Vol 3 No 4 2000

Ertoz L et Al MINDS - Minnesota Intrusion Detection System Next Generation Data Mining Chapter 3 2004

Exploiting efficient data mining techniques to enhance intrusion detection systems Lu C-T Boedihardjo AP Manalwar P Information Reuse and Integration Conf 2005 IRI -2005 IEEE International Conference on Volume Issue 15-17 Aug 2005 Page(s) 512 - 517

Sal Stolfo Andreas Prodromidis Shelley Tselepis Wenke Lee Dave Fan and Phil Chan (Honorable mention (runner-up) for Best Paper Award in Applied Research Category) In Proceedings of the Third International Conference on Knowledge Discovery and Data Mining (KDD 97) Newport Beach CA August 1997

62

Questions amp Comments

27

Classification

28

Classification: Decision Tree

A node in the tree: a test of some attribute

A branch: a possible value of the attribute

Classification:
– Start at the root
– Test the attribute
– Move down the tree branch
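
The three classification steps above can be sketched in a few lines of Python. This is only an illustration: the `Node` class, the `classify` function, and the example tree over `protocol` and `flag` are hypothetical and are not taken from any system discussed in these slides.

```python
from dataclasses import dataclass, field
from typing import Dict, Union


@dataclass
class Node:
    """An internal node: tests one attribute, with one branch per value."""
    attribute: str
    branches: Dict[str, Union["Node", str]] = field(default_factory=dict)


def classify(tree: Union[Node, str], record: Dict[str, str]) -> str:
    """Start at the root, test the attribute, move down the matching branch."""
    node = tree
    while isinstance(node, Node):
        value = record[node.attribute]   # test the attribute
        node = node.branches[value]      # move down the branch for that value
    return node                          # a leaf holds the class label


# Hypothetical tree over connection records (illustrative values only).
TREE = Node("protocol", {
    "icmp": "normal",
    "tcp": Node("flag", {"SF": "normal", "S0": "intrusion"}),
})

print(classify(TREE, {"protocol": "tcp", "flag": "S0"}))
```

A record with an attribute value that has no branch raises a `KeyError` here; a real classifier would need a policy for unseen values.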

29

Neural classification: HIDE

"A hierarchical network intrusion detection system using statistical processing and neural network classification" by Zheng et al.

Five major components:
– Probes: collect traffic data
– Event preprocessor: preprocesses traffic data and feeds the statistical model
– Statistical processor: maintains a model for normal activities and generates vectors for new events
– Neural network: classifies the vectors of new events
– Post processor: generates reports

30

Clustering

What is clustering? Group data into clusters:
– Similar to one another within the same cluster
– Dissimilar to the objects in other clusters
– Unsupervised learning: no predefined classes

31

Clustering

What is a good clustering?
– High intra-class similarity and low inter-class similarity (depending on the similarity measure)
– The ability to discover some or all of the hidden patterns

32

Clustering

Clustering approaches

Partitioning algorithms:
– Partition the objects into k clusters
– Iteratively reallocate objects to improve the clustering

Hierarchy algorithms:
– Agglomerative: each object is a cluster; merge clusters to form larger ones
– Divisive: all objects are in one cluster; split it up into smaller clusters

33

Clustering

K-Means Example
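
The figure for this slide did not survive in the transcript. As a stand-in, here is a minimal sketch of a partitioning algorithm in the k-means style: assign each object to its nearest centroid, recompute the centroids, and repeat until nothing changes. The function names and the requirement that the caller passes initial centroids are assumptions made for illustration.

```python
import math


def assign(points, centroids):
    """Return, for each point, the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            for p in points]


def kmeans(points, centroids, max_iterations=100):
    """Lloyd-style k-means; `centroids` are the caller's initial guesses."""
    centroids = [tuple(c) for c in centroids]
    for _ in range(max_iterations):
        labels = assign(points, centroids)
        updated = []
        for i, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:
                # New centroid is the coordinate-wise mean of the members.
                updated.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                updated.append(old)  # keep the centroid of an empty cluster
        if updated == centroids:     # no centroid moved: converged
            break
        centroids = updated
    return centroids, assign(points, centroids)


POINTS = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.0), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
CENTROIDS, LABELS = kmeans(POINTS, [(0.0, 0.0), (10.0, 10.0)])
print(LABELS)
```

The result depends on the initial centroids, which is why they are an explicit argument in this sketch.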

34

Mining Data Streams for Intrusion Detection

Maintaining profiles of normal activities
– The profiles of normal activities may drift

Identifying novel attacks
– Identifying clusters and outliers in traffic data streams

Reduce the future alarm load by writing filtering rules that automatically discard well-understood false positives

35

Data Mining for Intrusion Detection

Misuse detection
– Predictive models are built from labeled data sets (instances are labeled as "normal" or "intrusive")
– These models can be more sophisticated and precise than manually created signatures
– Recent research, e.g., JAM (Java Agents for Metalearning)

36

Misuse Detection

[Figure: activities are compared against stored intrusion patterns by pattern matching; a match is reported as an intrusion]

Can't detect new attacks

Example: if (src_ip == dst_ip) then "land attack"

Look for known indicators:
– ICMP scans, port scans, connection attempts
– CPU, RAM, I/O utilization
– File system activity, modification of system files, permission modifications
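
The land-attack example above can be written as a tiny signature matcher. This is a sketch under assumptions: packets are plain dictionaries, and `SIGNATURES` and `match_signatures` are hypothetical names, not the interface of any real IDS.

```python
def is_land_attack(packet):
    """Signature from the slide: source and destination address are equal."""
    return packet["src_ip"] == packet["dst_ip"]


# Signature table: name -> predicate over a packet dictionary.
SIGNATURES = {
    "land attack": is_land_attack,
}


def match_signatures(packet, signatures=SIGNATURES):
    """Return the names of all known signatures that the packet matches."""
    return [name for name, matches in signatures.items() if matches(packet)]


print(match_signatures({"src_ip": "10.0.0.5", "dst_ip": "10.0.0.5"}))
print(match_signatures({"src_ip": "10.0.0.5", "dst_ip": "10.0.0.9"}))
```

A packet that matches no entry in the table produces an empty list, which is exactly why this approach cannot detect new attacks.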

37

JAM (Java Agents for Metalearning)

JAM (developed at Columbia University) uses data mining techniques to discover patterns of intrusions. It then applies a meta-learning classifier to learn the signatures of attacks.

The association rules algorithm determines relationships between fields in the audit trail records, and the frequent episodes algorithm models sequential patterns of audit events. Features are then extracted from both algorithms and used to compute models of intrusion behavior.

The classifiers build the signatures of attacks. Thus, data mining in JAM builds a misuse detection model.

Classifiers in JAM are generated by running a rule learning program on training data of system usage. After training, the resulting classification rules are used to recognize anomalies and detect known intrusions.

The system has been tested with data from Sendmail-based attacks and with network attacks using tcpdump data.

38

Data Mining for Intrusion Detection

Anomaly detection
– Identifies anomalies as deviations from "normal" behavior
– E.g., ADAM (Audit Data Analysis and Mining), MINDS (MINnesota INtrusion Detection System)

39

Anomaly Detection

[Figure: bar chart of activity measures (such as CPU and process size, on a 0 to 90 scale) comparing the normal profile with abnormal values; a large deviation marks a probable intrusion]

Relatively high false positive rate: anomalies can just be new normal activities

Baseline the normal traffic, then look for things that are out of the norm
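
One simple way to "baseline, then look for things out of the norm" is a per-measure mean and standard deviation with a z-score threshold. The sketch below assumes this particular statistic and a threshold of 3; the slides do not say which measure of deviation is used, and the measure names and values are invented for illustration.

```python
from statistics import mean, stdev


def build_profile(history):
    """history: measure name -> values observed during normal operation."""
    return {name: (mean(values), stdev(values)) for name, values in history.items()}


def deviations(profile, observation, threshold=3.0):
    """Return the measures whose z-score against the profile exceeds threshold."""
    flagged = {}
    for name, value in observation.items():
        mu, sigma = profile[name]
        if sigma == 0:
            score = 0.0 if value == mu else float("inf")
        else:
            score = abs(value - mu) / sigma
        if score > threshold:
            flagged[name] = score
    return flagged


# Hypothetical baseline for two activity measures.
PROFILE = build_profile({
    "cpu": [10, 12, 11, 9, 13],
    "process_size": [40, 42, 41, 39, 43],
})
print(deviations(PROFILE, {"cpu": 80, "process_size": 41}))
```

A legitimate but new workload would be flagged in the same way, which illustrates the false positive problem noted above.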

40

ADAM: Audit Data Analysis and Mining

Detecting intrusions by data mining: a combination of association rules and classification rules

First, ADAM collects known frequent datasets with an off-line algorithm

Second, ADAM runs an online algorithm:
– Finds the last frequent connection records
– Compares them with the known mined data
– Discards those which seem to be normal
– Forwards the suspicious ones to the classifier

The trained classifier then classifies the suspicious data as one of the following:
– Known type of attack
– Unknown type of attack
– False alarm

41

ADAM: Detecting Intrusion by Data Mining

42

ADAM: Audit Data Analysis and Mining

ADAM has two phases in its model

1st phase: train the classifier
– Offline process
– Takes place only once, before the main experiment

2nd phase: use the trained classifier
– The trained classifier is then used to detect anomalies
– Online process

43

The MINDS Project

MINDS – MINnesota INtrusion Detection System
– Learning from rare class: building rare class prediction models
– Anomaly/outlier detection
– Summarization of attacks using association pattern analysis

TID  Items
1    Bread, Coke, Milk
2    Beer, Bread
3    Beer, Coke, Diaper, Milk
4    Beer, Bread, Diaper, Milk
5    Coke, Diaper, Milk

Rules discovered: {Milk} --> {Coke}, {Diaper, Milk} --> {Beer}
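
To make the market-basket example concrete, the sketch below computes support and confidence for the two rules from the five transactions in the table. The function names are illustrative; the definitions are the usual ones (support is the fraction of transactions containing an itemset, and confidence is the support of both sides together divided by the support of the left-hand side).

```python
TRANSACTIONS = [
    {"Bread", "Coke", "Milk"},
    {"Beer", "Bread"},
    {"Beer", "Coke", "Diaper", "Milk"},
    {"Beer", "Bread", "Diaper", "Milk"},
    {"Coke", "Diaper", "Milk"},
]


def support(itemset, transactions=TRANSACTIONS):
    """Fraction of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def confidence(lhs, rhs, transactions=TRANSACTIONS):
    """Estimated probability of `rhs` given `lhs`: support(lhs and rhs) / support(lhs)."""
    return support(lhs | rhs, transactions) / support(lhs, transactions)


for lhs, rhs in [({"Milk"}, {"Coke"}), ({"Diaper", "Milk"}, {"Beer"})]:
    print(sorted(lhs), "-->", sorted(rhs),
          "support=%.2f" % support(lhs | rhs),
          "confidence=%.2f" % confidence(lhs, rhs))
```

On this table, {Milk} --> {Coke} holds in 3 of the 4 transactions that contain Milk, and {Diaper, Milk} --> {Beer} in 2 of the 3 that contain both Diaper and Milk.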

44

MINDS - Learning from Rare Class

Problem: building models for rare network attacks (mining a needle in a haystack)

Standard data mining models are not suitable for rare classes

Models must be able to handle skewed class distributions

Learning from data streams - intrusions are sequences of events

45

MINDS - Anomaly Detection

Detect novel attacks/intrusions by identifying them as deviations from "normal", i.e., anomalous behavior
– Identify normal behavior
– Construct a useful set of features
– Define a similarity function
– Use an outlier detection algorithm:
   – Nearest neighbor approach
   – Density based schemes
   – Unsupervised Support Vector Machines (SVM)
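
As an illustration of the nearest neighbor approach, the sketch below scores each connection by the distance to its k-th nearest neighbor, so isolated points get high scores. This is a generic technique chosen for brevity, not necessarily the scheme MINDS itself uses; the feature vectors and the choice of Euclidean distance as the (dis)similarity function are assumptions.

```python
import math


def knn_outlier_scores(points, k=2):
    """Score each point by the distance to its k-th nearest other point."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores


# Hypothetical two-feature vectors; the last one lies far from the rest.
FEATURES = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
SCORES = knn_outlier_scores(FEATURES, k=2)
print(max(range(len(SCORES)), key=SCORES.__getitem__))  # index of the top outlier
```

Ranking connections by such a score gives the kind of anomaly ordering that can then be summarized by association pattern analysis.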

46

Experimental Evaluation

[Figure: experimental setup. Labels: network; net-flow data using Cisco routers; data preprocessing; MINDS anomaly detection; anomaly scores; association pattern analysis; open source signature-based network IDS (www.snort.org); 10-minute cycle, 2 million connections]

Anomaly detection is applied 4 times a day with a 10-minute time window

• Publicly available data set: DARPA 1998 Intrusion Detection Evaluation Data Set, prepared and managed by MIT Lincoln Lab; includes a wide variety of intrusions simulated in a military network environment

• Real network data from University of Minnesota

47

MINDS - Framework for Mining Associations

[Figure: MINDS association analysis module. Labels: anomaly detection system; ranked connections labeled attack or normal; discriminating association pattern generator; example rules R1 (TCP, DstPort=1863, Attack) … R100 (TCP, DstPort=80, Normal); knowledge base (update)]

1. Build normal profile

2. Study changes in normal behavior

3. Create attack summary

4. Detect misuse behavior

5. Understand nature of the attack

48

Discovered Real-life Association Patterns

At first glance, Rule 1 appears to describe a Web scan. Rule 2 indicates an attack on a specific machine. Both rules together indicate that a scan is performed first, followed by an attack on a specific machine identified as vulnerable by the attacker.

Rule 1: SrcIP=XXXX DstPort=80 Protocol=TCP Flag=SYN NoPackets=3 NoBytes=120…180 (c1=256 c2=1)

Rule 2: SrcIP=XXXX DstIP=YYYY DstPort=80 Protocol=TCP Flag=SYN NoPackets=3 NoBytes=120…180 (c1=177 c2=0)
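
The slides do not define c1 and c2. The sketch below assumes they count the connections covered by a pattern among those ranked anomalous (c1) and among those ranked normal (c2), so that a pattern with large c1 and small c2 discriminates well. The record layout, the `label` field, and the function names are hypothetical.

```python
def covers(pattern, connection):
    """True if the connection agrees with every field of the pattern."""
    for key, wanted in pattern.items():
        value = connection.get(key)
        if isinstance(wanted, range):
            if value not in wanted:      # numeric interval, e.g. NoBytes
                return False
        elif value != wanted:
            return False
    return True


def coverage_counts(pattern, connections):
    """Return (c1, c2): covered connections labeled anomalous and normal."""
    c1 = sum(1 for c in connections if c["label"] == "anomalous" and covers(pattern, c))
    c2 = sum(1 for c in connections if c["label"] == "normal" and covers(pattern, c))
    return c1, c2


# Pattern shaped like Rule 1, without the anonymized source address.
RULE_1 = {"DstPort": 80, "Protocol": "TCP", "Flag": "SYN",
          "NoPackets": 3, "NoBytes": range(120, 181)}

# Tiny made-up sample of labeled connections.
SAMPLE = [
    {"DstPort": 80, "Protocol": "TCP", "Flag": "SYN", "NoPackets": 3, "NoBytes": 150, "label": "anomalous"},
    {"DstPort": 80, "Protocol": "TCP", "Flag": "SYN", "NoPackets": 3, "NoBytes": 130, "label": "anomalous"},
    {"DstPort": 80, "Protocol": "TCP", "Flag": "SYN", "NoPackets": 3, "NoBytes": 500, "label": "anomalous"},
    {"DstPort": 80, "Protocol": "TCP", "Flag": "SYN", "NoPackets": 3, "NoBytes": 160, "label": "normal"},
    {"DstPort": 22, "Protocol": "TCP", "Flag": "SYN", "NoPackets": 3, "NoBytes": 150, "label": "normal"},
]
print(coverage_counts(RULE_1, SAMPLE))
```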

49

DstIP=ZZZZ DstPort=8888 Protocol=TCP (c1=369 c2=0)

DstIP=ZZZZ DstPort=8888 Protocol=TCP Flag=SYN (c1=291 c2=0)

This pattern indicates an anomalously high number of TCP connections on port 8888 involving machine ZZZZ

Follow-up analysis of connections covered by the pattern indicates that this could be a machine running a variation of the Kazaa file-sharing protocol

Having an unauthorized application increases the vulnerability of the system

Discovered Real-life Association Patterns

50

SrcIP=XXXX DstPort=27374 Protocol=TCP Flag=SYN NoPackets=4 NoBytes=189…200 (c1=582 c2=2)

SrcIP=XXXX DstPort=12345 NoPackets=4 NoBytes=189…200 (c1=580 c2=3)

SrcIP=YYYY DstPort=27374 Protocol=TCP Flag=SYN NoPackets=3 NoBytes=144 (c1=694 c2=3)

……

This pattern indicates a large number of scans on ports 27374 (which is a signature for the SubSeven worm) and 12345 (which is a signature for the NetBus worm)

Further analysis showed that no fewer than five machines were scanning for one or both of these ports in any time window

Discovered Real-life Association Patterns… (ctd)

51

DstPort=6667 Protocol=TCP (c1=254 c2=1)

This pattern indicates an unusually large number of connections on port 6667 detected by the anomaly detector

Port 6667 is where IRC (Internet Relay Chat) is typically run. Further analysis reveals that there are many small packets from/to various IRC servers around the world. Although IRC traffic is not unusual, the fact that it is flagged as anomalous is interesting. This might indicate that the IRC server has been taken down (by a DoS attack, for example) or that it is a rogue IRC server (it could be involved in some hacking activity)

Discovered Real-life Association Patterns… (ctd)

52

DstPort=1863 Protocol=TCP Flag=0 NoPackets=1 NoBytes<139 (c1=498 c2=6)

DstPort=1863 Protocol=TCP Flag=0 (c1=587 c2=6)

DstPort=1863 Protocol=TCP (c1=606 c2=8)

This pattern indicates a large number of anomalous TCP connections on port 1863

Further analysis reveals that the remote IP block is owned by Hotmail

Flag=0 is unusual for TCP traffic

Discovered Real-life Association Patterns… (ctd)

53

MINDS Conclusion
– Data mining based algorithms are capable of detecting intrusions that cannot be detected by state-of-the-art signature based methods
– SNORT has static knowledge, manually updated by human analysts
– MINDS anomaly detection algorithms are adaptive in nature
– MINDS anomaly detection algorithms can also be effective in detecting anomalous behavior originating from a compromised or infected machine

MINDS Research
– Defining normal behavior
– Feature extraction
– Similarity functions
– Outlier detection
– Result summarization
– Detection of attacks originating from multiple sites

[Figure labels: worm/virus detection after infection; insider attack; policy violation; outsider attack; network intrusion]

54

IDS Using both Misuse and Anomaly Detection: RIDS-100

RIDS (Rising Intrusion Detection System) is provided by Rising Tech, a leader in antivirus and content security software and services in China

The company is a leading provider of client, gateway, and server security solutions for virus protection, firewall, and intrusion detection technologies, and of security services to enterprises and service providers around China

RIDS makes use of both intrusion detection techniques: misuse and anomaly detection

A distance-based outlier detection algorithm is used to detect deviating behavior among the collected network data

For misuse detection, it has a very large set of collected data patterns that can be matched against scanned network data

This large set of data patterns is scanned using a data mining classification algorithm (decision tree)

http://www.rising-global.com

55

A cooperative anomaly and intrusion detection system (CAIDS)

Built with a network-based intrusion detection system (NIDS) and an anomaly detection system (ADS) operating interactively through a signature generator

56

A cooperative anomaly and intrusion detection system (CAIDS)

A frequent episode rule (FER) is generated out of a collection of frequent episodes. The FER is defined over episode sequences with multiple connection events

For example, we envision a window where we observe a 3-event sequence E, D, and F. An FER is generated as E → D, F with confidence level c = freq(a ∪ b) / freq(a) = 0.8, where a represents the event E on the LHS and b corresponds to the two events D and F on the RHS of the rule

If a occurs in 5% of the windows and the joint event a and b occurs in 4%, there is a 0.04 / 0.05 = 80% chance that D and F will follow in the same window
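
The confidence of an episode rule can be estimated by counting windows. The sketch below takes confidence as the fraction of windows containing the LHS episode that also contain the full LHS-then-RHS sequence in order, which is the usual definition of rule confidence. Representing a window as a list of event names, and the function names, are assumptions for illustration.

```python
def contains_episode(window, episode):
    """True if the events of `episode` occur in `window` in the same order."""
    remaining = iter(window)
    # `event in remaining` consumes the iterator up to the first match,
    # so later events must be found after earlier ones.
    return all(event in remaining for event in episode)


def fer_confidence(windows, lhs, rhs):
    """Fraction of windows containing `lhs` that also contain `lhs` then `rhs`."""
    lhs_count = sum(1 for w in windows if contains_episode(w, lhs))
    joint_count = sum(1 for w in windows if contains_episode(w, lhs + rhs))
    return joint_count / lhs_count if lhs_count else 0.0


# Made-up windows: E occurs in 5 of them, and D, F follow E in 4 of those.
WINDOWS = [["E", "D", "F"]] * 4 + [["E", "D"]] + [["D", "F"]] * 3
print(fer_confidence(WINDOWS, ["E"], ["D", "F"]))
```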

57

A cooperative anomaly and intrusion detection system (CAIDS)

In practice, the event E could be an authentication service characterized by two attributes (service = authentication, flag = SF). The events D and F may be two sequential smtp requests denoted by (service = smtp). Thus we can derive an FER with a confidence level of c = 80% that two smtp services will follow the authentication service within a window w = 2 sec. The three joint traffic events account for a support level of s = 10% out of all the network connections being evaluated. This FER is formally stated as follows:

(service = authentication) → (service = smtp)(service = smtp) (0.8, 0.1, 2 sec) (1)

58

A cooperative anomaly and intrusion detection system (CAIDS)

An association rule is aimed at finding interesting intra-relationships inside a single connection record

In general, an FER is specified by the following expression:

L1, L2, …, Ln → R1, …, Rm (c, s, window) (2)

Li (1 ≤ i ≤ n) and Rj (1 ≤ j ≤ m) are ordered traffic connection events

We call L1, L2, …, Ln the LHS episode and R1, …, Rm the RHS of the episode rule
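
Expression (2) maps naturally onto a small record type. The sketch below stores an FER as ordered LHS and RHS events plus (c, s, window) and checks whether the whole episode occurs, in order, within the time window in a stream of timestamped events. The `FER` class, the `occurs` function, and the (time, service) event format are hypothetical, and the stream is assumed to be sorted by time.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class FER:
    lhs: Tuple[str, ...]   # L1 ... Ln, in order
    rhs: Tuple[str, ...]   # R1 ... Rm, in order
    confidence: float      # c
    support: float         # s
    window: float          # window length in seconds


def occurs(rule: FER, events: List[Tuple[float, str]]) -> bool:
    """True if LHS followed by RHS occurs in order within `rule.window` seconds."""
    episode = rule.lhs + rule.rhs
    for start, (t0, service) in enumerate(events):
        if service != episode[0]:
            continue
        matched = 1
        for t, svc in events[start + 1:]:
            if t - t0 > rule.window:
                break                      # left the time window
            if matched < len(episode) and svc == episode[matched]:
                matched += 1
        if matched == len(episode):
            return True
    return False


# Rule (1) from the previous slide.
RULE = FER(("authentication",), ("smtp", "smtp"), 0.8, 0.1, 2.0)
print(occurs(RULE, [(0.0, "authentication"), (0.5, "smtp"), (1.5, "smtp")]))
```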

59

A cooperative anomaly and intrusion detection system (CAIDS)

Architecture of the CAIDS simulator built with a 2000-signature Snort and an anomaly detection subsystem (ADS) with 60 FERs after 2 weeks of rule training over the Lincoln Lab IDS evaluation dataset

60

Conclusion

In this report we have studied the basic concepts and some classic system models in this area, such as ADAM and MINDS

We have summarized those system models, their technologies, and their validation methods

We hope to have given an overview of current development in this area and of how data mining is evolving into the field of network intrusion detection

61

Reference

DARPA 1998 data set. A cleansed set is in KDD Cup '99. The DARPA 1999 data set is also available: http://www.ll.mit.edu/IST/ideval/data/data_index.html

Daniel Barbara, Julia Couto, Sushil Jajodia, Leonard Popyack, Ningning Wu. "ADAM: Detecting Intrusions by Data Mining." Proceedings of the 2001 IEEE Workshop on Information Assurance and Security, United States Military Academy, West Point, NY, 5-6 June 2001.

Zhang, J. and Zulkernine, M. 2006. A Hybrid Network Intrusion Detection Technique Using Random Forests. In Proceedings of the First International Conference on Availability, Reliability and Security (April 20-22, 2006).

W. Lee et al. A data mining framework for building intrusion detection models. In Information and System Security, Vol. 3, No. 4, 2000.

Ertoz, L. et al. MINDS - Minnesota Intrusion Detection System. Next Generation Data Mining, Chapter 3, 2004.

Lu, C.-T., Boedihardjo, A. P., Manalwar, P. Exploiting efficient data mining techniques to enhance intrusion detection systems. IEEE International Conference on Information Reuse and Integration (IRI-2005), 15-17 Aug. 2005, pp. 512-517.

Sal Stolfo, Andreas Prodromidis, Shelley Tselepis, Wenke Lee, Dave Fan, and Phil Chan. (Honorable mention (runner-up) for Best Paper Award in Applied Research Category.) In Proceedings of the Third International Conference on Knowledge Discovery and Data Mining (KDD 97), Newport Beach, CA, August 1997.

62

Questions & Comments


Daniel Barbara Julia Couto Sushil Jajodia Leonard Popyack Ningning Wu ldquoADAM Detecting Intrusions by Data Miningrdquo Proceedings of the 2001 IEEE Workshop on Information Assurance and Security United States Military Academy West Point NY 5-6 June 2001

Zhang J and Zulkernine M 2006 A Hybrid Network Intrusion Detection Technique Using Random Forests In Proceedings of the First international Conference on Availability Reliability and Security (April 20 - 22 2006)

W Lee et al A data mining framework for building intrusion detection models In Information and System Security Vol 3 No 4 2000

Ertoz L et Al MINDS - Minnesota Intrusion Detection System Next Generation Data Mining Chapter 3 2004

Exploiting efficient data mining techniques to enhance intrusion detection systems Lu C-T Boedihardjo AP Manalwar P Information Reuse and Integration Conf 2005 IRI -2005 IEEE International Conference on Volume Issue 15-17 Aug 2005 Page(s) 512 - 517

Sal Stolfo Andreas Prodromidis Shelley Tselepis Wenke Lee Dave Fan and Phil Chan (Honorable mention (runner-up) for Best Paper Award in Applied Research Category) In Proceedings of the Third International Conference on Knowledge Discovery and Data Mining (KDD 97) Newport Beach CA August 1997

62

Questions amp Comments

29

Neural classification: HIDE

“A hierarchical network intrusion detection system using statistical processing and neural network classification” by Zheng et al.

Five major components:
– Probes: collect traffic data
– Event preprocessor: preprocesses traffic data and feeds the statistical model
– Statistical processor: maintains a model for normal activities and generates vectors for new events
– Neural network: classifies the vectors of new events
– Post processor: generates reports

30

Clustering

What is clustering? Group data into clusters:
– Similar to one another within the same cluster
– Dissimilar to the objects in other clusters
– Unsupervised learning: no predefined classes

31

Clustering

What is a good clustering?
– High intra-class similarity and low inter-class similarity, depending on the similarity measure
– The ability to discover some or all of the hidden patterns

32

Clustering

Clustering approaches

Partitioning algorithms
– Partition the objects into k clusters
– Iteratively reallocate objects to improve the clustering

Hierarchical algorithms
– Agglomerative: each object is a cluster; merge clusters to form larger ones
– Divisive: all objects are in one cluster; split it up into smaller clusters

33

Clustering

K-Means Example
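The figure for this slide is not part of the transcript. As a stand-in, here is a minimal sketch of the partitioning approach (k-means): assign each object to its nearest centroid, recompute the centroids, and repeat. The function name `kmeans`, the fixed iteration count, and the choice of the first k points as initial centroids are assumptions made for illustration, not something taken from the slides.

```python
import math

def kmeans(points, k, iterations=10):
    """Partition points into k clusters by iterative reallocation."""
    # Assumption: the first k points serve as the initial centroids.
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels, centroids

points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centroids = kmeans(points, k=2)
print(labels)
```

With two well-separated groups such as the toy points above, the three points near the origin end up in one cluster and the three points near (10, 10) in the other.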

34

Mining Data Streams for Intrusion Detection

– Maintaining profiles of normal activities: the profiles of normal activities may drift
– Identifying novel attacks: identifying clusters and outliers in traffic data streams
– Reducing the future alarm load by writing filtering rules that automatically discard well-understood false positives

35

Data Mining for Intrusion Detection

Misuse detection
– Predictive models are built from labeled data sets (instances are labeled as “normal” or “intrusive”)
– These models can be more sophisticated and precise than manually created signatures
– Recent research: e.g. JAM (Java Agents for Metalearning)

36

Misuse Detection

[Diagram: observed activities are pattern-matched against known intrusion patterns; a match indicates an intrusion]

– Can’t detect new attacks
– Example: if (src_ip == dst_ip) then “land attack”
– Look for known indicators: ICMP scans, port scans, connection attempts; CPU, RAM, and I/O utilization; file system activity, modification of system files, permission modifications
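The land attack rule above can be written as a tiny signature matcher. This is only a sketch: the record layout (a dict with `src_ip` and `dst_ip` keys) and the names `SIGNATURES` and `match_signatures` are hypothetical, chosen to mirror the slide's example.

```python
def land_attack(conn):
    """Signature from the slide: source and destination address are identical."""
    return conn["src_ip"] == conn["dst_ip"]

# Hypothetical signature table: name -> predicate over one connection record.
SIGNATURES = {"land attack": land_attack}

def match_signatures(conn):
    """Return the names of all known signatures that the connection matches."""
    return [name for name, rule in SIGNATURES.items() if rule(conn)]

suspicious = {"src_ip": "10.0.0.5", "dst_ip": "10.0.0.5"}
ordinary = {"src_ip": "10.0.0.5", "dst_ip": "10.0.0.9"}
print(match_signatures(suspicious), match_signatures(ordinary))
```

A connection that matches no entry in the table passes silently, which is why this style of detection cannot flag attacks it has no signature for.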

37

JAM (Java Agents for Metalearning)

JAM (developed at Columbia University) uses data mining techniques to discover patterns of intrusions. It then applies a meta-learning classifier to learn the signatures of attacks.

The association rules algorithm determines relationships between fields in the audit trail records, and the frequent episodes algorithm models sequential patterns of audit events. Features are then extracted from both algorithms and used to compute models of intrusion behavior.

The classifiers build the signatures of attacks. Thus, data mining in JAM builds a misuse detection model.

Classifiers in JAM are generated by running a rule learning program on training data of system usage. After training, the resulting classification rules are used to recognize anomalies and detect known intrusions.

The system has been tested with data from Sendmail-based attacks and with network attacks using tcpdump data.

38

Data Mining for Intrusion Detection

Anomaly detection
– Identifies anomalies as deviations from “normal” behavior
– E.g. ADAM (Audit Data Analysis and Mining) and MINDS (MINnesota INtrusion Detection System)

39

Anomaly Detection

[Figure: bar chart of activity measures (e.g. CPU, process size) on a 0–90 scale, comparing the normal profile with abnormal values; an abnormal value marks a probable intrusion]

– Relatively high false positive rate: anomalies can just be new normal activities
– Baseline the normal traffic and then look for things that are out of the norm
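One simple way to "baseline and look for deviations" is a mean and standard deviation profile per activity measure. The sketch below assumes that approach; the three-sigma threshold, the function names, and the sample CPU readings are illustrative assumptions, not values from the slides.

```python
from statistics import mean, stdev

def build_profile(normal_values):
    """Baseline one activity measure from observations believed to be normal."""
    return mean(normal_values), stdev(normal_values)

def is_anomalous(value, profile, threshold=3.0):
    """Flag a value lying more than `threshold` standard deviations from the baseline."""
    mu, sigma = profile
    return abs(value - mu) > threshold * sigma

# Hypothetical CPU utilisation readings (percent) collected during normal operation.
cpu_profile = build_profile([20, 22, 19, 21, 23, 20, 22])
print(is_anomalous(85, cpu_profile), is_anomalous(21, cpu_profile))
```

A legitimate but new workload would also exceed the threshold, which illustrates the false positive problem noted above.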

40

ADAM: Audit Data Analysis and Mining

Detecting intrusion by data mining: a combination of association rules and classification rules
– First, ADAM collects known frequent datasets with an off-line algorithm
– Second, ADAM runs an online algorithm that:
  – Finds the most recent frequent connection records
  – Compares them with the known mined data
  – Discards those that seem to be normal
  – Forwards the suspicious ones to the classifier
– The trained classifier then classifies the suspicious data as one of the following:
  – Known type of attack
  – Unknown type of attack
  – False alarm
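The online filtering step can be sketched as a set difference between what is frequent in the current window and what was mined as normal offline. This is a simplified illustration, not ADAM's actual implementation: each whole record is treated as one itemset, and the names `suspicious_itemsets`, `normal_profile`, and `min_support` are assumptions. The classifier stage is left out.

```python
from collections import Counter

def suspicious_itemsets(window, normal_profile, min_support=0.3):
    """Return itemsets that are frequent in the window but absent from the normal profile."""
    # Simplification: each connection record contributes exactly one itemset.
    counts = Counter(frozenset(record.items()) for record in window)
    frequent = {items for items, n in counts.items() if n / len(window) >= min_support}
    # Frequent itemsets already known from the offline phase are discarded as normal.
    return frequent - normal_profile

# Hypothetical profile mined offline from attack-free traffic.
normal_profile = {frozenset({"dst_port": 80, "flag": "SF"}.items())}

window = (
    [{"dst_port": 80, "flag": "SF"}] * 3
    + [{"dst_port": 27374, "flag": "S0"}] * 3
    + [{"dst_port": 22, "flag": "SF"}]
)
flagged = suspicious_itemsets(window, normal_profile)
print(flagged)
```

In this toy window the port 80 traffic is frequent but known, the single port 22 record is too rare to count, and only the port 27374 itemset would be forwarded to a classifier.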

41

ADAM Detecting Intrusion by Data Mining

42

ADAM: Audit Data Analysis and Mining

ADAM has two phases in its model:

1st phase: train the classifier
– Offline process
– Takes place only once, before the main experiment

2nd phase: use the trained classifier
– The trained classifier is then used to detect anomalies
– Online process

43

The MINDS Project

MINDS – MINnesota INtrusion Detection System
– Learning from rare class: building rare class prediction models
– Anomaly/outlier detection
– Summarization of attacks using association pattern analysis

TID   Items
1     Bread, Coke, Milk
2     Beer, Bread
3     Beer, Coke, Diaper, Milk
4     Beer, Bread, Diaper, Milk
5     Coke, Diaper, Milk

Rules discovered:
{Milk} --> {Coke}
{Diaper, Milk} --> {Beer}
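To make the two discovered rules concrete, the sketch below computes their support and confidence over the five transactions in the table. The helper names `count`, `support`, and `confidence` are illustrative; the definitions used are the standard ones (support is the fraction of transactions containing all items of the rule, confidence is that count divided by the count of transactions containing the left-hand side).

```python
transactions = [
    {"Bread", "Coke", "Milk"},
    {"Beer", "Bread"},
    {"Beer", "Coke", "Diaper", "Milk"},
    {"Beer", "Bread", "Diaper", "Milk"},
    {"Coke", "Diaper", "Milk"},
]

def count(itemset):
    """Number of transactions that contain every item of the itemset."""
    return sum(itemset <= t for t in transactions)

def support(lhs, rhs):
    """Fraction of all transactions that contain both sides of the rule."""
    return count(lhs | rhs) / len(transactions)

def confidence(lhs, rhs):
    """Among transactions containing lhs, the fraction that also contain rhs."""
    return count(lhs | rhs) / count(lhs)

print(support({"Milk"}, {"Coke"}), confidence({"Milk"}, {"Coke"}))
print(support({"Diaper", "Milk"}, {"Beer"}), confidence({"Diaper", "Milk"}, {"Beer"}))
```

For {Milk} --> {Coke} this gives support 3/5 and confidence 3/4; for {Diaper, Milk} --> {Beer} it gives support 2/5 and confidence 2/3.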

44

MINDS - Learning from Rare Class

Problem: building models for rare network attacks (mining a needle in a haystack)
– Standard data mining models are not suitable for rare classes
– Models must be able to handle skewed class distributions
– Learning from data streams: intrusions are sequences of events

45

MINDS - Anomaly Detection

Detect novel attacks/intrusions by identifying them as deviations from “normal”, i.e. anomalous, behavior
– Identify normal behavior
– Construct a useful set of features
– Define a similarity function
– Use an outlier detection algorithm:
  – Nearest neighbor approach
  – Density based schemes
  – Unsupervised Support Vector Machines (SVM)
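As an illustration of the nearest neighbor approach, the sketch below scores each point by the distance to its k-th nearest neighbor, so isolated points receive high scores. It is a generic textbook formulation, not the MINDS implementation; the Euclidean distance, the value of k, and the toy feature vectors are assumptions.

```python
import math

def knn_outlier_scores(points, k=2):
    """Score each point by the distance to its k-th nearest neighbor (larger = more anomalous)."""
    scores = []
    for i, p in enumerate(points):
        # Distances from p to every other point, smallest first.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores

# Hypothetical 2-D feature vectors: four similar connections and one far from the rest.
features = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
scores = knn_outlier_scores(features, k=2)
print(scores.index(max(scores)))
```

Ranking connections by such a score, rather than applying a hard cut-off, lets an analyst review only the top of the list.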

46

Experimental Evaluation

[Diagram: network → net-flow data collected using Cisco routers → data preprocessing → MINDS anomaly detection → anomaly scores → association pattern analysis. Snort, an open source signature-based network IDS (www.snort.org), is shown alongside. Annotations: 10-minute cycle, 2 million connections.]

Anomaly detection is applied 4 times a day, on a 10-minute time window.

• Publicly available data set: the DARPA 1998 Intrusion Detection Evaluation Data Set, prepared and managed by MIT Lincoln Lab, includes a wide variety of intrusions simulated in a military network environment
• Real network data from the University of Minnesota

47

MINDS - Framework for Mining Associations

[Diagram of the MINDS association analysis module. Components: anomaly detection system, ranked connections (labeled attack or normal), discriminating association pattern generator, knowledge base (with an update arrow).]

Example rules shown in the diagram:
R1: TCP, DstPort=1863 --> Attack
…
R100: TCP, DstPort=80 --> Normal

Numbered items in the diagram:
1. Build normal profile
2. Study changes in normal behavior
3. Create attack summary
4. Detect misuse behavior
5. Understand nature of the attack

48

Discovered Real-life Association Patterns

At first glance, Rule 1 appears to describe a Web scan, and Rule 2 indicates an attack on a specific machine. Both rules together indicate that a scan is performed first, followed by an attack on a specific machine identified as vulnerable by the attacker.

Rule 1: SrcIP=XXXX, DstPort=80, Protocol=TCP, Flag=SYN, NoPackets=3, NoBytes=120…180 (c1=256, c2=1)

Rule 2: SrcIP=XXXX, DstIP=YYYY, DstPort=80, Protocol=TCP, Flag=SYN, NoPackets=3, NoBytes=120…180 (c1=177, c2=0)

49

Discovered Real-life Association Patterns

DstIP=ZZZZ, DstPort=8888, Protocol=TCP (c1=369, c2=0)
DstIP=ZZZZ, DstPort=8888, Protocol=TCP, Flag=SYN (c1=291, c2=0)

– This pattern indicates an anomalously high number of TCP connections on port 8888 involving machine ZZZZ
– Follow-up analysis of connections covered by the pattern indicates that this could be a machine running a variation of the Kazaa file-sharing protocol
– Having an unauthorized application increases the vulnerability of the system

50

Discovered Real-life Association Patterns… (ctd)

SrcIP=XXXX, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=4, NoBytes=189…200 (c1=582, c2=2)
SrcIP=XXXX, DstPort=12345, NoPackets=4, NoBytes=189…200 (c1=580, c2=3)
SrcIP=YYYY, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=3, NoBytes=144 (c1=694, c2=3)
……

– This pattern indicates a large number of scans on ports 27374 (which is a signature for the SubSeven worm) and 12345 (which is a signature for the NetBus worm)
– Further analysis showed that no fewer than five machines were scanning for one or both of these ports in any time window

51

Discovered Real-life Association Patterns… (ctd)

DstPort=6667, Protocol=TCP (c1=254, c2=1)

– This pattern indicates an unusually large number of connections on port 6667 detected by the anomaly detector
– Port 6667 is where IRC (Internet Relay Chat) is typically run
– Further analysis reveals that there are many small packets from/to various IRC servers around the world
– Although IRC traffic is not unusual, the fact that it is flagged as anomalous is interesting
– This might indicate that the IRC server has been taken down (by a DoS attack, for example) or that it is a rogue IRC server (it could be involved in some hacking activity)

52

Discovered Real-life Association Patterns… (ctd)

DstPort=1863, Protocol=TCP, Flag=0, NoPackets=1, NoBytes<139 (c1=498, c2=6)
DstPort=1863, Protocol=TCP, Flag=0 (c1=587, c2=6)
DstPort=1863, Protocol=TCP (c1=606, c2=8)

– This pattern indicates a large number of anomalous TCP connections on port 1863
– Further analysis reveals that the remote IP block is owned by Hotmail
– Flag=0 is unusual for TCP traffic

53

MINDS Conclusion

– Data mining based algorithms are capable of detecting intrusions that cannot be detected by state-of-the-art signature based methods
– SNORT has static knowledge, manually updated by human analysts
– MINDS anomaly detection algorithms are adaptive in nature
– MINDS anomaly detection algorithms can also be effective in detecting anomalous behavior originating from a compromised or infected machine

MINDS Research
– Defining normal behavior
– Feature extraction
– Similarity functions
– Outlier detection
– Result summarization
– Detection of attacks originating from multiple sites

[Diagram labels: worm/virus detection after infection; insider attack; policy violation; outsider attack; network intrusion]

54

IDS Using Both Misuse and Anomaly Detection: RIDS-100

RIDS (Rising Intrusion Detection System) is provided by Rising Tech, a leader in antivirus and content security software and services in China.

The company is a leading provider of client, gateway, and server security solutions for virus protection, firewall, and intrusion detection technologies, and of security services to enterprises and service providers around China.

RIDS makes use of both intrusion detection techniques: misuse detection and anomaly detection.

A distance-based outlier detection algorithm is used to detect deviating behavior among the collected network data.

For misuse detection, it has a very large set of collected data patterns that can be matched against scanned network data.

This large set of data patterns is scanned using a data mining classification algorithm, the decision tree.

http://www.rising-global.com

55

A cooperative anomaly and intrusion detection system (CAIDS)

CAIDS is built with a network-based intrusion detection system (NIDS) and an anomaly detection system (ADS) operating interactively through a signature generator.

56

A cooperative anomaly and intrusion detection system (CAIDS)

A frequent episode rule (FER) is generated out of a collection of frequent episodes. The FER is defined over episode sequences with multiple connection events.

For example, we envision a window in which we observe a 3-event sequence: E, D, and F. An FER is generated as E → D, F with confidence level freq(a ∪ b) / freq(a) = 0.8, where a represents the event E on the LHS and b corresponds to the two events D and F on the RHS of the rule.

If a occurs with a frequency of 5% and the joint event a and b occurs with a frequency of 4%, there is a (0.04 / 0.05) = 80% chance that D and F will follow in the same window.
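The 80% figure can be reproduced with a small counting sketch. It uses the standard definition of rule confidence, the number of windows containing the whole episode divided by the number of windows containing the LHS, which is an assumption about how the numbers on the slide are meant to be read. The helper names and the toy windows (100 windows, 5 containing E, 4 of those followed by D and F) are illustrative.

```python
def contains_in_order(window, episode):
    """True if the events of `episode` appear in `window` in the given order."""
    it = iter(window)
    # `event in it` advances the iterator, so later events must occur after earlier ones.
    return all(event in it for event in episode)

def fer_confidence(windows, lhs, rhs):
    """Windows containing LHS followed by RHS, divided by windows containing LHS."""
    lhs_count = sum(contains_in_order(w, lhs) for w in windows)
    joint_count = sum(contains_in_order(w, lhs + rhs) for w in windows)
    return joint_count / lhs_count if lhs_count else 0.0

# Toy data: E occurs in 5 of 100 windows; in 4 of them D and F follow.
windows = [["E", "D", "F"]] * 4 + [["E", "G"]] + [["G"]] * 95
conf = fer_confidence(windows, lhs=["E"], rhs=["D", "F"])
print(conf)
```

Counting whole windows rather than dividing two percentages gives the same 4/5 ratio as the 0.04 / 0.05 calculation.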

57

A cooperative anomaly and intrusion detection system (CAIDS)

In practice, the event E could be an authentication service characterized by two attributes (service = authentication, flag = SF). The events D and F may be two sequential smtp requests denoted by (service = smtp). Thus we can derive an FER with a confidence level of c = 80% that two smtp services will follow the authentication service within a window w = 2 sec. The three joint traffic events account for a support level s = 10% out of all the network connections being evaluated. This FER is formally stated as follows:

(service = authentication) → (service = smtp), (service = smtp) (0.8, 0.1, 2 sec) (1)
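Rule (1) can be held in a small record and checked against a time-stamped connection stream. The sketch below is only an illustration of the idea: the `FER` dataclass, the `episode_occurs` helper, and the event format (timestamp in seconds, service name) are all assumptions, and the confidence and support values are simply carried along as rule metadata.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FER:
    lhs: tuple          # ordered services on the left-hand side
    rhs: tuple          # ordered services on the right-hand side
    confidence: float
    support: float
    window: float       # window length in seconds

def episode_occurs(rule, events):
    """True if LHS followed by RHS occurs, in order, within rule.window seconds.

    `events` is a list of (timestamp_seconds, service) pairs sorted by time.
    """
    episode = rule.lhs + rule.rhs
    for start, (t0, service) in enumerate(events):
        if service != episode[0]:
            continue
        matched = 1
        if matched == len(episode):
            return True
        for t, s in events[start + 1:]:
            if t - t0 > rule.window:
                break  # remaining events fall outside the window
            if s == episode[matched]:
                matched += 1
                if matched == len(episode):
                    return True
    return False

rule_1 = FER(("authentication",), ("smtp", "smtp"), confidence=0.8, support=0.1, window=2.0)
inside = [(0.0, "authentication"), (0.5, "smtp"), (1.2, "smtp")]
too_late = [(0.0, "authentication"), (0.5, "smtp"), (3.0, "smtp")]
print(episode_occurs(rule_1, inside), episode_occurs(rule_1, too_late))
```

The second stream fails only because the last smtp request arrives after the 2-second window has closed.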

58

A cooperative anomaly and intrusion detection system (CAIDS)

An association rule is aimed at finding interesting intra-relationships inside a single connection record.

In general, an FER is specified by the following expression:

L1, L2, …, Ln → R1, …, Rm (c, s, window) (2)

Li (1 ≤ i ≤ n) and Rj (1 ≤ j ≤ m) are ordered traffic connection events.

We call L1, L2, …, Ln the LHS episode and R1, …, Rm the RHS of the episode rule.

59

A cooperative anomaly and intrusion detection system (CAIDS)

Architecture of the CAIDS simulator, built with a 2000-signature Snort and an anomaly detection subsystem (ADS) with 60 FERs after 2 weeks of rule training over the Lincoln Lab IDS evaluation dataset

60

Conclusion

In this report we have studied the basic concepts and some classic system models in this area, such as ADAM and MINDS.

We have summarized those system models, their technologies, and their validation methods.

We hope this gives an overview of current developments in this area and of how data mining is evolving into the field of network intrusion detection.

61

Reference

DARPA 1998 data set. A cleansed set in KDD Cup ’99. DARPA 1991 data set is also available: http://www.ll.mit.edu/IST/ideval/data/data_index.html

Daniel Barbara, Julia Couto, Sushil Jajodia, Leonard Popyack, Ningning Wu. “ADAM: Detecting Intrusions by Data Mining.” Proceedings of the 2001 IEEE Workshop on Information Assurance and Security, United States Military Academy, West Point, NY, 5-6 June 2001.

J. Zhang and M. Zulkernine. “A Hybrid Network Intrusion Detection Technique Using Random Forests.” In Proceedings of the First International Conference on Availability, Reliability and Security, April 20-22, 2006.

W. Lee et al. “A data mining framework for building intrusion detection models.” In Information and System Security, Vol. 3, No. 4, 2000.

L. Ertoz et al. “MINDS - Minnesota Intrusion Detection System.” Next Generation Data Mining, Chapter 3, 2004.

C.-T. Lu, A. P. Boedihardjo, P. Manalwar. “Exploiting efficient data mining techniques to enhance intrusion detection systems.” IEEE International Conference on Information Reuse and Integration (IRI-2005), 15-17 Aug. 2005, pp. 512-517.

Sal Stolfo, Andreas Prodromidis, Shelley Tselepis, Wenke Lee, Dave Fan, and Phil Chan. (Honorable mention (runner-up) for Best Paper Award in Applied Research Category.) In Proceedings of the Third International Conference on Knowledge Discovery and Data Mining (KDD 97), Newport Beach, CA, August 1997.

62

Questions & Comments

30

Clustering

What Is Clustering Group data into clusters

ndash Similar to one another within the same cluster ndash Dissimilar to the objects in other clusters ndash Unsupervised learning no predefined classes

31

Clustering

What Is A Good Clustering High intra-class similarity and low interclasssimilar

ity Depending on the similarity measure

The ability to discover some or all of the hidden patterns

32

Clustering

Clustering Approaches Partitioning algorithms

ndash Partition the objects into k clusters ndash Iteratively reallocate objects to improve the clusterin

g Hierarchy algorithms

ndash Agglomerative each object is a cluster merge clusters to form larger ones

ndash Divisive all objects are in a cluster split it up into smaller clusters

33

Clustering

K-Means Example

34

Mining Data Streams for Intrusion Detection

Maintaining profiles of normal activities The profiles of normal activities may drift

Identifying novel attacks Identifying clusters and outliers in traffic data

streams Reduce the future alarm load by writing

filtering rules that automatically discard well-understood false positives

35

Data Mining for Intrusion Detection

Misuse detectionPredictive models are built from labeled data sets (instances

are labeled as ldquonormalrdquo or ldquointrusiverdquo)These models can be more sophisticated and precise than manually

created signatures Recent research eg JAM (Java Agents for Metalearning)

36

Misuse Detection

Intrusion Patterns

activities

pattern matching

intrusion

Canrsquot detect new attacks

Example if (src_ip == dst_ip) then ldquoland attackrdquo

look for known indicators ICMP Scans port scans connection attempts CPU RAM IO Utilization File system activity modification of system files permission modifications

37

JAM (Java Agents for Metalearning)

JAM (developed at Columbia University) uses data mining techniques to discover patterns of intrusions It then applies a meta-learning classifier to learn the signature of attacks

The association rules algorithm determines relationships between fields in the audit trail records and the frequent episodes algorithm models sequential patterns of audit events Features are then extracted from both algorithms and used to compute models of intrusion behavior

The classifiers build the signature of attacks So thus data mining in JAM builds misuse detection model

Classifiers in the JAM are generated by using rule learning program on training data of system usage After training resulting classification rules is used to recognize anomalies and detect known intrusions

The system has been tested with data from Sendmail-based attacks and with network attacks using TCP dump data

38

Data Mining for Intrusion Detection

Anomaly detection Identifies anomalies as deviations from ldquonormalrdquo behavior Eg ADAM Audit Data Analysis and Mining MINDS ndash MINnesota INt

rusion Detection System

39

Anomaly Detection

activity measures

0102030405060708090

CPU ProcessSize

normal profileabnormal

probable intrusion

Relatively high false positive rate - anomalies can just be new normal activities

baseline the normal traffic and then look for things that are out of the norm

40

ADAM Audit Data Analysis and Mining

Detecting Intrusion by Data MiningCombination of Association Rule and Classification Rule Firstly ADAM collects known frequent datasetsan off-line

algorithm Secondly ADAM runs an online algorithm

Finds last frequent connection records Compare them with known mined data Discards those which seems to be normal Suspicious ones are forwarded to the classifier Trained classifier then classify the suspicious data as one

of the following Known type of attack Unknown type of attack False alarm

41

ADAM Detecting Intrusion by Data Mining

42

ADAM Audit Data Analysis and Mining

ADAM has two phases in their model

1st Phase Train the classifier Offline process Takes place only once Before the main experiment

2nd Phase Using the trained classifier Trained classifier is then used to detect anomalies Online process

43

The MINDS Project

MINDS ndash MINnesota INtrusion Detection System

Learning from Rare Class ndash Building rare class prediction models

Anomalyoutlier detection

Summarization of attacks using association pattern analysis

TID Items

1 Bread Coke Milk

2 Beer Bread

3 Beer Coke Diaper Milk

4 Beer Bread Diaper Milk

5 Coke Diaper Milk

Rules Discovered Milk --gt Coke Diaper Milk --gt Beer

Rules Discovered Milk --gt Coke Diaper Milk --gt Beer

44

MINDS - Learning from Rare Class

Problem Building models for rare network attacks (Mining needle in a haystack)

Standard data mining models are not suitable for rare classes

Models must be able to handle skewed class distributions

Learning from data streams - intrusions are sequences of events

45

MINDS - Anomaly Detection

Detect novel attacksintrusions by identifying them as deviations from ldquonormalrdquo ie anomalous behaviorIdentify normal behaviorConstruct useful set of featuresDefine similarity functionUse outlier detection algorithm

Nearest neighbor approach

Density based schemes

Unsupervised Support Vector Machines (SVM)

46

Experimental Evaluation

network

net-flow data using CISCO

routers

Data preprocessing

MINDSanomaly detection

helliphellip

Anomaly

scores Association pattern analysis

Open source signature-based network IDS wwwsnortor

g

10 minutes cycle

2 millions connections

Anomaly detection is applied

4 times a day

10 minutes time window

bull Publicly available data setDARPA 1998 Intrusion Detection Evaluation Data Set prepared and managed by MIT Lincoln Lab includes a wide variety of intrusions simulated in a military network environment

bull Real network data from University of Minnesota

47

MINDS - Framework for Mining Associations

Anomaly Detection System

attack

normal

R1 TCP DstPort=1863 Attack

hellip

hellip

hellip

hellip

R100 TCP DstPort=80 Normal

Discriminating Association

Pattern Generator

1 Build normal profile

2 Study changes in normal behavior

3 Create attack summary

4 Detect misuse behavior

5 Understand nature of the attack

update

Knowledge Base

Ranked connections

MINDS association analysis module

48

Discovered Real-life Association Patterns

At first glance Rule 1 appears to describe a Web scan Rule 2 indicates an attack on a specific machine Both rules together indicate that a scan is performed first

followed by an attack on a specific machine identified as vulnerable by the attacker

Rule 1 SrcIP=XXXX DstPort=80 Protocol=TCP Flag=SYN NoPackets 3 NoBytes120hellip180 (c1=256 c2 = 1)

Rule 2 SrcIP=XXXX DstIP=YYYY DstPort=80 Protocol=TCPFlag=SYN NoPackets 3 NoBytes 120hellip180 (c1=177 c2 = 0)

49

DstIP=ZZZZ DstPort=8888 Protocol=TCP (c1=369 c2=0)DstIP=ZZZZ DstPort=8888 Protocol=TCP Flag=SYN (c1=291 c2=0)

This pattern indicates an anomalously high number of TCP connections on port 8888 involving machine ZZZZ

Follow-up analysis of connections covered by the pattern indicates that this could be a machine running a variation of the Kazaa file-sharing protocol

Having an unauthorized application increases the vulnerability of the system

Discovered Real-life Association Patterns

50

SrcIP=XXXX DstPort=27374 Protocol=TCP Flag=SYN NoPackets=4 NoBytes=189hellip200 (c1=582 c2=2)

SrcIP=XXXX DstPort=12345 NoPackets=4 NoBytes=189hellip200 (c1=580 c2=3)

SrcIP=YYYY DstPort=27374 Protocol=TCP Flag=SYN NoPackets=3 NoBytes=144 (c1=694 c2=3)

helliphellip This pattern indicates a large number of scans on ports 27374 (which is a signature for the SubSeven worm) and 12345 (which is a signature for NetBus worm)

Further analysis showed that no fewer than five machines scanning for one or both of these ports in any time window

Discovered Real-life Association Patternshellip(ctd)

51

DstPort=6667 Protocol=TCP (c1=254 c2=1)

This pattern indicates an unusually large number of connections on port 6667 detected by the anomaly detector

Port 6667 is where IRC (Internet Relay Chat) is typically run Further analysis reveals that there are many small packets

fromto various IRC servers around the world Although IRC traffic is not unusual the fact that it is flagged

as anomalous is interesting This might indicate that the IRC server has been taken down (by a

DOS attack for example) or it is a rogue IRC server (it could be involved in some hacking activity)

Discovered Real-life Association Patternshellip(ctd)

52

DstPort=1863 Protocol=TCP Flag=0 NoPackets=1 NoByteslt139 (c1=498 c2=6)

DstPort=1863 Protocol=TCP Flag=0 (c1=587 c2=6)

DstPort=1863 Protocol=TCP (c1=606 c2=8)

This pattern indicates a large number of anomalous TCP connections on port 1863

Further analysis reveals that the remote IP block is owned by Hotmail

Flag=0 is unusual for TCP traffic

Discovered Real-life Association Patternshellip(ctd)

53

MINDS Conclusion Data mining based algorithms are capable of detecting intrusions that cannot be

detected by state-of-the-art signature based methods

SNORT has static knowledge manually updated by human analysts

MINDS anomaly detection algorithms are adaptive in nature

MINDS anomaly detection algorithms can also be effective in detecting anomalous behavior originating from a compromised or infected machine

MINDS Research Defining normal behavior Feature extraction Similarity functions Outlier detection Result summarization Detection of attacks

originating from multiple sites

Wormvirus detectionafter infection

Insider attack Policy violation

Outsider attack Network intrusion

54

IDS Using both Misuse and Anomaly DetectionRIDS-100

RIDS( Rising Intrusion Detection System) is provided by Rising Tech It is a leader in antivirus and content security software and services in China

The company is a leading provider of client gateway and server security solutions for virus protection firewall and intrusion detection technologies and security services to enterprises and service providers around China

RIDS make the use of both intrusion detection technique misuse and anomaly detection

Distance based outlier detection algorithm is used for detection deviational behavior among collected network data

For misuse detection it has very vast set of collected data pattern which can be matched with scanned network data for misuse detection

This large amount of data pattern is scanned using data mining classification Decision Tree algorithm

httpwwwrising-globalcom

55

A cooperative anomaly and intrusiondetection system (CAIDS)

built with a network-based intrusion detection system (NIDS) and an anomaly detection system (ADS) operating interactively through a signature generator

56

A cooperative anomaly and intrusiondetection system (CAIDS)

A frequent episode rule (FER) is generated out of a collection of frequent episodes The FER is defined over episode sequences with multiple connection events

For an example we envision a window where we observe a 3-event sequence

E D and F An FER is generated as E rarr D F confidence level freq (a U b)freq (b)=08 where a represents the event E on the LHS and b corresponds t

o the two events D and F on the RHS of the rule

If the b occurs with 5 and the joint event a and b has 4 to occur there is a (004005) = 80 chance that D and F will follow in the same window

57

A cooperative anomaly and intrusiondetection system (CAIDS)

In practice the event E could be an authentication service characterized by two attributes

(service =authentication flag=SF) The events D F may be two sequential smtp requests denoted b

y (service = smtp) Thus we can derive an FER with a confidence level of c = 80 t

hat two smtp services will follow the authentication service within a window w = 2 sec The three joint traffic events accounts with a support level s = 10 out of all the network connections being evaluated This FER is formally stated as follows

(service = authentication) rarr (service = smtp) (service = smtp) (08 01 2 sec) (1)

58

A cooperative anomaly and intrusiondetection system (CAIDS)

An association rule is aimed at finding interesting intra-relationship inside a single connection record

In general an FER is specified by the following expression

L1 L2hellip Ln R1hellip Rm (c s window) (2)

Li (1 le i le n) and Rj (1 le j le m) are ordered traffic connection events

We call L1 L2hellip Ln the LHS episode and R1hellip Rm the RHS of the episode rule

59

A cooperative anomaly and intrusiondetection system (CAIDS)

Architecture of the CAIDS simulator built with a 2000-signature Snortand an anomaly detection subsystem (ADS) with 60 FERs after 2 weeksof rule training over the Lincoln Lab IDS evaluation dataset

60

Conclusion

In this report we have studied basic concept and some classic system models like ADAM MINDSin this area

To make summary of those system models their technologies and their validation methods

Hope to a overview on currently development in this area and how data mining is evolving into the field of network intrusion detection

61

Reference

DARPA 1998 data set; a cleansed set in KDD Cup '99. DARPA 1991 data set is also available: http://www.ll.mit.edu/IST/ideval/data/data_index.html

Daniel Barbara, Julia Couto, Sushil Jajodia, Leonard Popyack, Ningning Wu. "ADAM: Detecting Intrusions by Data Mining." Proceedings of the 2001 IEEE Workshop on Information Assurance and Security, United States Military Academy, West Point, NY, 5-6 June 2001.

Zhang, J. and Zulkernine, M. 2006. A Hybrid Network Intrusion Detection Technique Using Random Forests. In Proceedings of the First International Conference on Availability, Reliability and Security (April 20-22, 2006).

W. Lee et al. A data mining framework for building intrusion detection models. In Information and System Security, Vol. 3, No. 4, 2000.

Ertoz, L. et al. MINDS - Minnesota Intrusion Detection System. Next Generation Data Mining, Chapter 3, 2004.

Lu, C.-T., Boedihardjo, A.P., Manalwar, P. Exploiting efficient data mining techniques to enhance intrusion detection systems. IEEE International Conference on Information Reuse and Integration (IRI-2005), 15-17 Aug. 2005, pp. 512-517.

Sal Stolfo, Andreas Prodromidis, Shelley Tselepis, Wenke Lee, Dave Fan, and Phil Chan. (Honorable mention (runner-up) for Best Paper Award in Applied Research Category.) In Proceedings of the Third International Conference on Knowledge Discovery and Data Mining (KDD 97), Newport Beach, CA, August 1997.

62

Questions & Comments

31

Clustering

What is a good clustering?

– High intra-class similarity and low inter-class similarity, depending on the similarity measure

– The ability to discover some or all of the hidden patterns

32

Clustering

Clustering approaches

Partitioning algorithms

– Partition the objects into k clusters

– Iteratively reallocate objects to improve the clustering

Hierarchy algorithms

– Agglomerative: each object is a cluster; merge clusters to form larger ones

– Divisive: all objects are in a cluster; split it up into smaller clusters

33

Clustering

K-Means Example
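The example figure for this slide did not survive extraction. As a stand-in, here is a minimal sketch of k-means as a partitioning algorithm, assuming 2-D points, Euclidean distance, and caller-supplied starting centroids. The function name `k_means` and the sample points are illustrative and not taken from the slides.

```python
import math

def k_means(points, centroids, max_iter=100):
    """Partition points into len(centroids) clusters by iterative reallocation."""
    centroids = [tuple(c) for c in centroids]
    assignment = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_assignment = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # no point changed cluster, so the partition is stable
        assignment = new_assignment
        # Update step: move each centroid to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, a in zip(points, assignment) if a == k]
            if members:  # an empty cluster keeps its previous centroid
                centroids[k] = tuple(sum(x) / len(members) for x in zip(*members))
    return assignment, centroids

points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
assignment, centroids = k_means(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
```

With these points the first three land in cluster 0 and the last three in cluster 1. Results in general depend on the starting centroids, which is why they are passed in explicitly here.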

34

Mining Data Streams for Intrusion Detection

Maintaining profiles of normal activities

– The profiles of normal activities may drift

Identifying novel attacks

– Identifying clusters and outliers in traffic data streams

Reduce the future alarm load by writing filtering rules that automatically discard well-understood false positives

35

Data Mining for Intrusion Detection

Misuse detection

– Predictive models are built from labeled data sets (instances are labeled as "normal" or "intrusive")

– These models can be more sophisticated and precise than manually created signatures

– Recent research, e.g. JAM (Java Agents for Metalearning)

36

Misuse Detection

[Figure: activities are compared against known intrusion patterns by pattern matching; a match indicates an intrusion]

Can't detect new attacks

Example: if (src_ip == dst_ip) then "land attack"

Look for known indicators: ICMP scans, port scans, connection attempts; CPU, RAM, I/O utilization; file system activity, modification of system files, permission modifications
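The land-attack rule above can be written as a small signature matcher. This is a sketch only: the `Packet` fields, the `SIGNATURES` list, and the sample addresses are hypothetical, and a real misuse detector would carry many more signatures.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Packet:
    src_ip: str
    dst_ip: str
    dst_port: int

# Each signature is a (name, predicate) pair; only known patterns are listed,
# which is why an attack without a signature goes undetected.
SIGNATURES = [
    ("land attack", lambda p: p.src_ip == p.dst_ip),
]

def match_signatures(packet):
    """Return the names of all known signatures that the packet matches."""
    return [name for name, predicate in SIGNATURES if predicate(packet)]

suspicious = Packet(src_ip="10.0.0.5", dst_ip="10.0.0.5", dst_port=139)
benign = Packet(src_ip="10.0.0.7", dst_ip="10.0.0.5", dst_port=80)
```

`match_signatures(suspicious)` reports the land attack, while `match_signatures(benign)` returns an empty list.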

37

JAM (Java Agents for Metalearning)

JAM (developed at Columbia University) uses data mining techniques to discover patterns of intrusions. It then applies a meta-learning classifier to learn the signatures of attacks.

The association rules algorithm determines relationships between fields in the audit trail records, and the frequent episodes algorithm models sequential patterns of audit events. Features are then extracted from both algorithms and used to compute models of intrusion behavior.

The classifiers build the signatures of attacks. In this way, data mining in JAM builds a misuse detection model.

Classifiers in JAM are generated by running a rule learning program on training data of system usage. After training, the resulting classification rules are used to recognize anomalies and detect known intrusions.

The system has been tested with data from Sendmail-based attacks and with network attacks using tcpdump data.
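The meta-learning idea, a classifier trained on the outputs of other classifiers, can be sketched as follows. This is not JAM's implementation: the base classifiers, the record fields, and the table-based combiner `train_meta` are all hypothetical stand-ins chosen to keep the example small.

```python
from collections import Counter, defaultdict

def train_meta(base_classifiers, records, labels):
    """Learn, for each combination of base predictions, the most common true label."""
    votes = defaultdict(Counter)
    for record, label in zip(records, labels):
        key = tuple(clf(record) for clf in base_classifiers)
        votes[key][label] += 1
    table = {key: counts.most_common(1)[0][0] for key, counts in votes.items()}
    default = Counter(labels).most_common(1)[0][0]  # unseen combinations
    return lambda record: table.get(
        tuple(clf(record) for clf in base_classifiers), default)

# Hypothetical base classifiers, standing in for separately learned models.
by_failed_logins = lambda r: "intrusive" if r["failed_logins"] > 3 else "normal"
by_root_shell = lambda r: "intrusive" if r["root_shell"] else "normal"

records = [
    {"failed_logins": 0, "root_shell": False},
    {"failed_logins": 5, "root_shell": False},
    {"failed_logins": 5, "root_shell": True},
    {"failed_logins": 1, "root_shell": True},
]
labels = ["normal", "normal", "intrusive", "intrusive"]
meta = train_meta([by_failed_logins, by_root_shell], records, labels)
```

On this toy data the meta-classifier learns that the failed-login model alone is unreliable: `meta({"failed_logins": 9, "root_shell": False})` yields "normal", while a record with a root shell is labeled "intrusive".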

38

Data Mining for Intrusion Detection

Anomaly detection

– Identifies anomalies as deviations from "normal" behavior

– E.g. ADAM (Audit Data Analysis and Mining), MINDS (MINnesota INtrusion Detection System)

39

Anomaly Detection

[Figure: bar chart of activity measures (e.g. CPU, process size) on a 0 to 90 scale, comparing the normal profile with abnormal values that indicate a probable intrusion]

Relatively high false positive rate: anomalies can just be new normal activities

Baseline the normal traffic and then look for things that are out of the norm
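A minimal sketch of baselining follows, assuming the profile is a per-measure mean and standard deviation and that "out of the norm" means more than three standard deviations from the mean. The measure names, the threshold, and the sample values are illustrative assumptions.

```python
import statistics

def build_profile(normal_samples):
    """Per-measure (mean, standard deviation) learned from normal activity."""
    measures = normal_samples[0].keys()
    return {
        m: (statistics.mean(s[m] for s in normal_samples),
            statistics.stdev(s[m] for s in normal_samples))
        for m in measures
    }

def deviations(profile, observation, threshold=3.0):
    """Measures whose value lies more than `threshold` deviations from the mean."""
    return [
        m for m, (mean, std) in profile.items()
        if std > 0 and abs(observation[m] - mean) / std > threshold
    ]

normal = [
    {"cpu": 20, "process_size": 30},
    {"cpu": 25, "process_size": 28},
    {"cpu": 22, "process_size": 33},
    {"cpu": 18, "process_size": 31},
]
profile = build_profile(normal)
flagged = deviations(profile, {"cpu": 85, "process_size": 31})
quiet = deviations(profile, {"cpu": 23, "process_size": 29})
```

Here `flagged` contains only "cpu" and `quiet` is empty. A legitimate but new workload would be flagged in the same way, which is the false-positive problem noted above.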

40

ADAM Audit Data Analysis and Mining

Detecting intrusions by data mining: a combination of association rules and classification rules

First, ADAM collects known frequent datasets (an off-line algorithm)

Second, ADAM runs an online algorithm:

– Finds the most recent frequent connection records

– Compares them with the known mined data

– Discards those which seem to be normal

– Forwards the suspicious ones to the classifier

The trained classifier then classifies the suspicious data as one of the following:

– Known type of attack

– Unknown type of attack

– False alarm
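The off-line and on-line steps can be sketched in a heavily simplified form. The assumptions here are significant: each connection is reduced to a single hypothetical (destination IP, destination port) item, `frequent_items` and the support threshold are illustrative, and the final classifier stage is left out.

```python
from collections import Counter

def frequent_items(connections, min_support):
    """Items whose share of the connections is at least min_support."""
    counts = Counter(connections)
    total = len(connections)
    return {item for item, n in counts.items() if n / total >= min_support}

# Off-line step: profile mined once from attack-free traffic.
training = [("10.0.0.2", 80)] * 6 + [("10.0.0.3", 25)] * 4
normal_profile = frequent_items(training, min_support=0.3)

# On-line step: mine the latest window and discard what the profile explains.
window = [("10.0.0.2", 80)] * 5 + [("10.0.0.9", 23)] * 5
suspicious = frequent_items(window, min_support=0.3) - normal_profile
```

Only the item for port 23 on 10.0.0.9 remains in `suspicious`. In the scheme described above, such items would then be passed to the trained classifier.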

41

ADAM Detecting Intrusion by Data Mining

42

ADAM Audit Data Analysis and Mining

ADAM has two phases in its model:

1st phase: train the classifier

– Offline process

– Takes place only once, before the main experiment

2nd phase: use the trained classifier

– The trained classifier is then used to detect anomalies

– Online process

43

The MINDS Project

MINDS – MINnesota INtrusion Detection System

Learning from Rare Class – Building rare class prediction models

Anomaly/outlier detection

Summarization of attacks using association pattern analysis

TID   Items
1     Bread, Coke, Milk
2     Beer, Bread
3     Beer, Coke, Diaper, Milk
4     Beer, Bread, Diaper, Milk
5     Coke, Diaper, Milk

Rules discovered:

{Milk} --> {Coke}

{Diaper, Milk} --> {Beer}
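The strength of the two rules can be checked directly against the five transactions. The sketch below uses the standard definitions of support and confidence; the helper names `count`, `support`, and `confidence` are illustrative.

```python
transactions = [
    {"Bread", "Coke", "Milk"},
    {"Beer", "Bread"},
    {"Beer", "Coke", "Diaper", "Milk"},
    {"Beer", "Bread", "Diaper", "Milk"},
    {"Coke", "Diaper", "Milk"},
]

def count(itemset):
    """Number of transactions that contain every item in itemset."""
    return sum(itemset <= t for t in transactions)

def support(itemset):
    """Fraction of all transactions that contain itemset."""
    return count(itemset) / len(transactions)

def confidence(lhs, rhs):
    """Of the transactions containing lhs, the fraction that also contain rhs."""
    return count(lhs | rhs) / count(lhs)

milk_coke = (support({"Milk", "Coke"}), confidence({"Milk"}, {"Coke"}))
diaper_milk_beer = (support({"Diaper", "Milk", "Beer"}),
                    confidence({"Diaper", "Milk"}, {"Beer"}))
```

{Milk} --> {Coke} has support 3/5 = 0.6 and confidence 3/4 = 0.75. {Diaper, Milk} --> {Beer} has support 2/5 = 0.4 and confidence 2/3, about 0.67.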

44

MINDS - Learning from Rare Class

Problem: building models for rare network attacks (mining a needle in a haystack)

Standard data mining models are not suitable for rare classes

Models must be able to handle skewed class distributions

Learning from data streams - intrusions are sequences of events

45

MINDS - Anomaly Detection

Detect novel attacks/intrusions by identifying them as deviations from "normal", i.e. anomalous behavior

– Identify normal behavior

– Construct a useful set of features

– Define a similarity function

– Use an outlier detection algorithm: nearest neighbor approach, density based schemes, unsupervised Support Vector Machines (SVM)
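As one simple instance of the nearest neighbor approach, the sketch below scores each connection by its distance to its k-th nearest neighbor. It is an illustration rather than the MINDS algorithm itself: the feature vectors, the Euclidean similarity function, and the name `knn_outlier_scores` are assumptions.

```python
import math

def knn_outlier_scores(points, k=2):
    """Score each point by its distance to its k-th nearest neighbor."""
    scores = []
    for i, p in enumerate(points):
        distances = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(distances[k - 1])
    return scores

# Hypothetical feature vectors: (packets, bytes / 100) per connection.
connections = [(3, 1.2), (3, 1.5), (4, 1.8), (3, 1.4), (40, 95.0)]
scores = knn_outlier_scores(connections, k=2)
most_anomalous = max(range(len(connections)), key=scores.__getitem__)
```

The last connection sits far from every other point, so it receives by far the highest score and `most_anomalous` is its index, 4.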

46

Experimental Evaluation

[Figure: MINDS evaluation pipeline: network, net-flow data collected using Cisco routers, data preprocessing, MINDS anomaly detection, anomaly scores, association pattern analysis]

Open source signature-based network IDS: www.snort.org

10-minute cycle: 2 million connections

Anomaly detection is applied 4 times a day on a 10-minute time window

• Publicly available data set: DARPA 1998 Intrusion Detection Evaluation Data Set, prepared and managed by MIT Lincoln Lab; includes a wide variety of intrusions simulated in a military network environment

• Real network data from the University of Minnesota

47

MINDS - Framework for Mining Associations

[Figure: MINDS association analysis module. The anomaly detection system produces ranked connections labeled attack or normal; the discriminating association pattern generator derives rules from them, from "R1: TCP, DstPort=1863 --> Attack" through "R100: TCP, DstPort=80 --> Normal", and these update the knowledge base]

1. Build normal profile

2. Study changes in normal behavior

3. Create attack summary

4. Detect misuse behavior

5. Understand nature of the attack

48

Discovered Real-life Association Patterns

At first glance, Rule 1 appears to describe a Web scan. Rule 2 indicates an attack on a specific machine. Both rules together indicate that a scan is performed first, followed by an attack on a specific machine identified as vulnerable by the attacker.

Rule 1: SrcIP=XXXX, DstPort=80, Protocol=TCP, Flag=SYN, NoPackets: 3, NoBytes: 120…180 (c1=256, c2=1)

Rule 2: SrcIP=XXXX, DstIP=YYYY, DstPort=80, Protocol=TCP, Flag=SYN, NoPackets: 3, NoBytes: 120…180 (c1=177, c2=0)

49

Discovered Real-life Association Patterns

DstIP=ZZZZ, DstPort=8888, Protocol=TCP (c1=369, c2=0)

DstIP=ZZZZ, DstPort=8888, Protocol=TCP, Flag=SYN (c1=291, c2=0)

This pattern indicates an anomalously high number of TCP connections on port 8888 involving machine ZZZZ.

Follow-up analysis of connections covered by the pattern indicates that this could be a machine running a variation of the Kazaa file-sharing protocol.

Having an unauthorized application increases the vulnerability of the system.

50

Discovered Real-life Association Patterns… (ctd)

SrcIP=XXXX, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=4, NoBytes=189…200 (c1=582, c2=2)

SrcIP=XXXX, DstPort=12345, NoPackets=4, NoBytes=189…200 (c1=580, c2=3)

SrcIP=YYYY, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=3, NoBytes=144 (c1=694, c2=3)

……

This pattern indicates a large number of scans on ports 27374 (which is a signature for the SubSeven worm) and 12345 (which is a signature for the NetBus worm).

Further analysis showed that no fewer than five machines were scanning for one or both of these ports in any time window.

51

Discovered Real-life Association Patterns… (ctd)

DstPort=6667, Protocol=TCP (c1=254, c2=1)

This pattern indicates an unusually large number of connections on port 6667 detected by the anomaly detector.

Port 6667 is where IRC (Internet Relay Chat) is typically run.

Further analysis reveals that there are many small packets from/to various IRC servers around the world.

Although IRC traffic is not unusual, the fact that it is flagged as anomalous is interesting. This might indicate that the IRC server has been taken down (by a DoS attack, for example) or that it is a rogue IRC server (it could be involved in some hacking activity).

52

Discovered Real-life Association Patterns… (ctd)

DstPort=1863, Protocol=TCP, Flag=0, NoPackets=1, NoBytes<139 (c1=498, c2=6)

DstPort=1863, Protocol=TCP, Flag=0 (c1=587, c2=6)

DstPort=1863, Protocol=TCP (c1=606, c2=8)

This pattern indicates a large number of anomalous TCP connections on port 1863.

Further analysis reveals that the remote IP block is owned by Hotmail.

Flag=0 is unusual for TCP traffic.

53

MINDS Conclusion

Data mining based algorithms are capable of detecting intrusions that cannot be detected by state-of-the-art signature based methods

– SNORT has static knowledge, manually updated by human analysts

– MINDS anomaly detection algorithms are adaptive in nature

– MINDS anomaly detection algorithms can also be effective in detecting anomalous behavior originating from a compromised or infected machine

MINDS Research: defining normal behavior, feature extraction, similarity functions, outlier detection, result summarization, detection of attacks originating from multiple sites

[Figure: attack categories: worm/virus detection after infection; insider attack (policy violation); outsider attack (network intrusion)]

54

IDS Using both Misuse and Anomaly Detection: RIDS-100

RIDS (Rising Intrusion Detection System) is provided by Rising Tech, a leader in antivirus and content security software and services in China.

The company is a leading provider of client, gateway, and server security solutions for virus protection, firewall, and intrusion detection technologies, and of security services to enterprises and service providers around China.

RIDS makes use of both intrusion detection techniques: misuse detection and anomaly detection.

A distance-based outlier detection algorithm is used to detect deviating behavior among the collected network data.

For misuse detection, it has a very large set of collected data patterns that can be matched against scanned network data.

This large set of data patterns is scanned using a data mining classification algorithm, the decision tree.

http://www.rising-global.com

55

A cooperative anomaly and intrusion detection system (CAIDS)

Built with a network-based intrusion detection system (NIDS) and an anomaly detection system (ADS) operating interactively through a signature generator

56

A cooperative anomaly and intrusion detection system (CAIDS)

A frequent episode rule (FER) is generated out of a collection of frequent episodes. The FER is defined over episode sequences with multiple connection events.

For example, we envision a window where we observe a 3-event sequence E, D, and F. An FER is generated as $E \rightarrow D, F$ with confidence level $\text{freq}(a \cup b) / \text{freq}(a) = 0.8$, where a represents the event E on the LHS and b corresponds to the two events D and F on the RHS of the rule.

If a occurs with a frequency of 5% and the joint event of a and b occurs with a frequency of 4%, there is a 0.04 / 0.05 = 80% chance that D and F will follow in the same window.
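The arithmetic above can be reproduced on a toy set of windows, taking confidence as the frequency of the joint episode divided by the frequency of the LHS episode. The 100 hypothetical windows and the helper names `occurs_in_order` and `fer_confidence` are assumptions made for illustration.

```python
def occurs_in_order(window, events):
    """True if events appear in window in the given order (gaps allowed)."""
    remaining = iter(window)
    return all(event in remaining for event in events)

def fer_confidence(windows, lhs, rhs):
    """Return (confidence, support) of the episode rule lhs -> rhs."""
    lhs_count = sum(occurs_in_order(w, lhs) for w in windows)
    joint_count = sum(occurs_in_order(w, lhs + rhs) for w in windows)
    return joint_count / lhs_count, joint_count / len(windows)

# 100 hypothetical windows: E occurs in 5 of them and is followed by D, F in 4.
windows = [["E", "D", "F"]] * 4 + [["E", "G"]] + [["G"]] * 95
confidence, support = fer_confidence(windows, lhs=["E"], rhs=["D", "F"])
```

E occurs in 5% of the windows and the joint episode in 4%, so `confidence` is 0.8 and `support` is 0.04.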

57

A cooperative anomaly and intrusion detection system (CAIDS)

In practice, the event E could be an authentication service characterized by two attributes (service = authentication, flag = SF). The events D and F may be two sequential smtp requests, denoted by (service = smtp). Thus we can derive an FER with a confidence level of c = 80% that two smtp services will follow the authentication service within a window w = 2 sec. The three joint traffic events account for a support level of s = 10% of all the network connections being evaluated. This FER is formally stated as follows:

$(\text{service} = \text{authentication}) \rightarrow (\text{service} = \text{smtp}),\ (\text{service} = \text{smtp}) \quad (0.8,\ 0.1,\ 2\ \text{sec})$ (1)

33

Clustering

K-Means Example

34

Mining Data Streams for Intrusion Detection

Maintaining profiles of normal activities The profiles of normal activities may drift

Identifying novel attacks Identifying clusters and outliers in traffic data

streams Reduce the future alarm load by writing

filtering rules that automatically discard well-understood false positives

35

Data Mining for Intrusion Detection

Misuse detectionPredictive models are built from labeled data sets (instances

are labeled as ldquonormalrdquo or ldquointrusiverdquo)These models can be more sophisticated and precise than manually

created signatures Recent research eg JAM (Java Agents for Metalearning)

36

Misuse Detection

Intrusion Patterns

activities

pattern matching

intrusion

Canrsquot detect new attacks

Example if (src_ip == dst_ip) then ldquoland attackrdquo

look for known indicators ICMP Scans port scans connection attempts CPU RAM IO Utilization File system activity modification of system files permission modifications

37

JAM (Java Agents for Metalearning)

JAM (developed at Columbia University) uses data mining techniques to discover patterns of intrusions It then applies a meta-learning classifier to learn the signature of attacks

The association rules algorithm determines relationships between fields in the audit trail records and the frequent episodes algorithm models sequential patterns of audit events Features are then extracted from both algorithms and used to compute models of intrusion behavior

The classifiers build the signature of attacks So thus data mining in JAM builds misuse detection model

Classifiers in the JAM are generated by using rule learning program on training data of system usage After training resulting classification rules is used to recognize anomalies and detect known intrusions

The system has been tested with data from Sendmail-based attacks and with network attacks using TCP dump data

38

Data Mining for Intrusion Detection

Anomaly detection Identifies anomalies as deviations from ldquonormalrdquo behavior Eg ADAM Audit Data Analysis and Mining MINDS ndash MINnesota INt

rusion Detection System

39

Anomaly Detection

activity measures

0102030405060708090

CPU ProcessSize

normal profileabnormal

probable intrusion

Relatively high false positive rate - anomalies can just be new normal activities

baseline the normal traffic and then look for things that are out of the norm

40

ADAM Audit Data Analysis and Mining

Detecting Intrusion by Data MiningCombination of Association Rule and Classification Rule Firstly ADAM collects known frequent datasetsan off-line

algorithm Secondly ADAM runs an online algorithm

Finds last frequent connection records Compare them with known mined data Discards those which seems to be normal Suspicious ones are forwarded to the classifier Trained classifier then classify the suspicious data as one

of the following Known type of attack Unknown type of attack False alarm

41

ADAM Detecting Intrusion by Data Mining

42

ADAM: Audit Data Analysis and Mining

ADAM has two phases in its model:

1st phase: train the classifier. This is an offline process that takes place only once, before the main experiment.

2nd phase: use the trained classifier. The trained classifier is then used to detect anomalies. This is an online process.

43

The MINDS Project

MINDS: MINnesota INtrusion Detection System

• Learning from rare class: building rare class prediction models

• Anomaly/outlier detection

• Summarization of attacks using association pattern analysis

TID   Items
1     Bread, Coke, Milk
2     Beer, Bread
3     Beer, Coke, Diaper, Milk
4     Beer, Bread, Diaper, Milk
5     Coke, Diaper, Milk

Rules discovered: {Milk} --> {Coke}; {Diaper, Milk} --> {Beer}
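To make the two discovered rules concrete, the sketch below computes their support and confidence over the five transactions listed on this slide. The helper names are illustrative; the definitions used are the standard ones (support is the fraction of transactions containing all items of the rule, confidence is that support divided by the support of the left-hand side).

```python
transactions = {
    1: {"Bread", "Coke", "Milk"},
    2: {"Beer", "Bread"},
    3: {"Beer", "Coke", "Diaper", "Milk"},
    4: {"Beer", "Bread", "Diaper", "Milk"},
    5: {"Coke", "Diaper", "Milk"},
}

def support(itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    hits = sum(1 for items in transactions.values() if itemset <= items)
    return hits / len(transactions)

def confidence(lhs, rhs):
    """Support of the whole rule divided by the support of its left-hand side."""
    return support(lhs | rhs) / support(lhs)

for lhs, rhs in [({"Milk"}, {"Coke"}), ({"Diaper", "Milk"}, {"Beer"})]:
    print(sorted(lhs), "-->", sorted(rhs),
          f"support={support(lhs | rhs):.2f}",
          f"confidence={confidence(lhs, rhs):.2f}")
```

Milk appears in four transactions and Coke joins it in three of them, so {Milk} --> {Coke} has support 3/5 and confidence 3/4; {Diaper, Milk} --> {Beer} has support 2/5 and confidence 2/3.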

44

MINDS - Learning from Rare Class

Problem: building models for rare network attacks (mining a needle in a haystack)

• Standard data mining models are not suitable for rare classes

• Models must be able to handle skewed class distributions

• Learning from data streams: intrusions are sequences of events
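A small worked example shows why skewed class distributions defeat standard models and metrics. The numbers are hypothetical (10 attacks among 1000 connections) and the function name is illustrative.

```python
def accuracy_and_recall(labels, predictions, positive="attack"):
    """Return (overall accuracy, recall on the positive class)."""
    pairs = list(zip(labels, predictions))
    correct = sum(label == pred for label, pred in pairs)
    true_pos = sum(label == positive and pred == positive for label, pred in pairs)
    positives = sum(label == positive for label in labels)
    recall = true_pos / positives if positives else 0.0
    return correct / len(labels), recall

# Hypothetical data set: 10 attacks hidden among 990 normal connections.
labels = ["attack"] * 10 + ["normal"] * 990

# A trivial model that labels every connection as normal.
always_normal = ["normal"] * len(labels)

acc, rec = accuracy_and_recall(labels, always_normal)
print(f"accuracy={acc:.2%} recall={rec:.2%}")
```

The trivial model is right on 990 of 1000 connections yet finds none of the attacks, which is why rare class models need to be judged on measures such as recall on the attack class rather than on overall accuracy.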

45

MINDS - Anomaly Detection

Detect novel attacks/intrusions by identifying them as deviations from “normal”, i.e. anomalous, behavior:

• Identify normal behavior

• Construct a useful set of features

• Define a similarity function

• Use an outlier detection algorithm:

  - Nearest neighbor approach

  - Density based schemes

  - Unsupervised Support Vector Machines (SVM)
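As an illustration of the nearest neighbor approach listed above, the sketch below scores each connection by its distance to its k-th nearest neighbor. It is a generic textbook scheme, not the MINDS implementation: the two features, the Euclidean similarity function, and the sample values are all assumptions, and real features would need scaling before distances are meaningful.

```python
import math

def knn_outlier_scores(points, k=2):
    """Score each point by the distance to its k-th nearest neighbour.
    A larger score means the point lies farther from the rest of the data."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores

# Hypothetical connection features: (number of packets, kilobytes transferred).
connections = [(3, 1.2), (4, 1.5), (3, 1.4), (5, 1.6), (4, 1.3), (60, 0.2)]

scores = knn_outlier_scores(connections, k=2)
ranked = sorted(zip(scores, connections), reverse=True)
print(ranked[0][1])  # the connection with the highest anomaly score
```

Ranking connections by score instead of applying a hard threshold lets an analyst review only the top of the list.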

46

Experimental Evaluation

[Figure: MINDS experimental setup. Labels: network; net-flow data collected using Cisco routers; data preprocessing; MINDS anomaly detection; anomaly scores; association pattern analysis; open source signature-based network IDS (www.snort.org); 10-minute cycle, 2 million connections; anomaly detection is applied 4 times a day on a 10-minute time window]

• Publicly available data set: the DARPA 1998 Intrusion Detection Evaluation Data Set, prepared and managed by MIT Lincoln Lab, includes a wide variety of intrusions simulated in a military network environment

• Real network data from the University of Minnesota

47

MINDS - Framework for Mining Associations

[Figure: MINDS association analysis module. Labels: anomaly detection system; ranked connections (attack, normal); discriminating association pattern generator; rules R1: TCP, DstPort=1863, Attack … R100: TCP, DstPort=80, Normal; knowledge base (update)]

Uses shown in the figure:

1. Build normal profile

2. Study changes in normal behavior

3. Create attack summary

4. Detect misuse behavior

5. Understand nature of the attack

48

Discovered Real-life Association Patterns

At first glance, Rule 1 appears to describe a Web scan. Rule 2 indicates an attack on a specific machine. Both rules together indicate that a scan is performed first, followed by an attack on a specific machine identified as vulnerable by the attacker.

Rule 1: SrcIP=XXXX, DstPort=80, Protocol=TCP, Flag=SYN, NoPackets: 3, NoBytes: 120…180 (c1=256, c2=1)

Rule 2: SrcIP=XXXX, DstIP=YYYY, DstPort=80, Protocol=TCP, Flag=SYN, NoPackets: 3, NoBytes: 120…180 (c1=177, c2=0)
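The slides do not define c1 and c2. The sketch below assumes that c1 is the number of anomalous connections covered by a pattern and c2 the number of normal connections it covers; under that assumption, the share c1 / (c1 + c2) indicates how cleanly a pattern separates anomalous traffic from normal traffic. The function name is illustrative and the counts are the ones shown for Rule 1 and Rule 2.

```python
def anomalous_fraction(c1, c2):
    """Share of the connections covered by a pattern that were ranked anomalous.
    Assumes c1 counts anomalous and c2 counts normal connections."""
    return c1 / (c1 + c2)

# Counts copied from Rule 1 and Rule 2 on this slide.
rules = {
    "Rule 1": (256, 1),
    "Rule 2": (177, 0),
}

for name, (c1, c2) in rules.items():
    print(name, f"{anomalous_fraction(c1, c2):.3f}")
```

Under this reading, both rules cover almost exclusively anomalous connections, which is what makes them useful as attack summaries.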

49

Discovered Real-life Association Patterns

DstIP=ZZZZ, DstPort=8888, Protocol=TCP (c1=369, c2=0)

DstIP=ZZZZ, DstPort=8888, Protocol=TCP, Flag=SYN (c1=291, c2=0)

This pattern indicates an anomalously high number of TCP connections on port 8888 involving machine ZZZZ.

Follow-up analysis of connections covered by the pattern indicates that this could be a machine running a variation of the Kazaa file-sharing protocol.

Having an unauthorized application increases the vulnerability of the system.

50

Discovered Real-life Association Patterns… (ctd)

SrcIP=XXXX, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=4, NoBytes=189…200 (c1=582, c2=2)

SrcIP=XXXX, DstPort=12345, NoPackets=4, NoBytes=189…200 (c1=580, c2=3)

SrcIP=YYYY, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=3, NoBytes=144 (c1=694, c2=3)

……

This pattern indicates a large number of scans on ports 27374 (which is a signature for the SubSeven backdoor Trojan) and 12345 (which is a signature for the NetBus Trojan).

Further analysis showed that no fewer than five machines were scanning for one or both of these ports in any time window.

51

Discovered Real-life Association Patterns… (ctd)

DstPort=6667, Protocol=TCP (c1=254, c2=1)

This pattern indicates an unusually large number of connections on port 6667 detected by the anomaly detector.

Port 6667 is where IRC (Internet Relay Chat) is typically run. Further analysis reveals that there are many small packets from/to various IRC servers around the world.

Although IRC traffic is not unusual, the fact that it is flagged as anomalous is interesting. This might indicate that the IRC server has been taken down (by a DoS attack, for example) or that it is a rogue IRC server (it could be involved in some hacking activity).

52

Discovered Real-life Association Patterns… (ctd)

DstPort=1863, Protocol=TCP, Flag=0, NoPackets=1, NoBytes<139 (c1=498, c2=6)

DstPort=1863, Protocol=TCP, Flag=0 (c1=587, c2=6)

DstPort=1863, Protocol=TCP (c1=606, c2=8)

This pattern indicates a large number of anomalous TCP connections on port 1863.

Further analysis reveals that the remote IP block is owned by Hotmail.

Flag=0 is unusual for TCP traffic.

53

MINDS Conclusion

• Data mining based algorithms are capable of detecting intrusions that cannot be detected by state-of-the-art signature based methods

• SNORT has static knowledge manually updated by human analysts

• MINDS anomaly detection algorithms are adaptive in nature

• MINDS anomaly detection algorithms can also be effective in detecting anomalous behavior originating from a compromised or infected machine

MINDS Research

• Defining normal behavior

• Feature extraction

• Similarity functions

• Outlier detection

• Result summarization

• Detection of attacks originating from multiple sites

[Figure labels: worm/virus detection after infection; insider attack, policy violation; outsider attack, network intrusion]

54

IDS Using both Misuse and Anomaly Detection: RIDS-100

RIDS (Rising Intrusion Detection System) is provided by Rising Tech, a leader in antivirus and content security software and services in China.

The company is a leading provider of client, gateway, and server security solutions for virus protection, firewall and intrusion detection technologies, and security services to enterprises and service providers around China.

RIDS makes use of both intrusion detection techniques: misuse detection and anomaly detection.

A distance-based outlier detection algorithm is used to detect deviant behavior in the collected network data.

For misuse detection, it has a very large set of collected data patterns that can be matched against scanned network data.

This large set of data patterns is scanned using a data mining classification algorithm (decision tree).

http://www.rising-global.com

55

A cooperative anomaly and intrusion detection system (CAIDS)

Built with a network-based intrusion detection system (NIDS) and an anomaly detection system (ADS) operating interactively through a signature generator.

56

A cooperative anomaly and intrusion detection system (CAIDS)

A frequent episode rule (FER) is generated out of a collection of frequent episodes. The FER is defined over episode sequences with multiple connection events.

For example, consider a window in which we observe a 3-event sequence: E, D, and F. An FER is generated as E → D, F with confidence level freq(a ∪ b) / freq(a) = 0.8, where a represents the event E on the LHS and b corresponds to the two events D and F on the RHS of the rule.

If a occurs with 5% probability and the joint event a ∪ b occurs with 4% probability, there is a (0.04 / 0.05) = 80% chance that D and F will follow in the same window.

57

A cooperative anomaly and intrusion detection system (CAIDS)

In practice, the event E could be an authentication service characterized by two attributes (service = authentication, flag = SF). The events D and F may be two sequential smtp requests, denoted by (service = smtp). Thus we can derive an FER with a confidence level of c = 80% that two smtp services will follow the authentication service within a window w = 2 sec. The three joint traffic events account for a support level of s = 10% out of all the network connections being evaluated. This FER is formally stated as follows:

(service = authentication) → (service = smtp), (service = smtp) (0.8, 0.1, 2 sec)   (1)
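The confidence and support in rule (1) can be reproduced with a short sketch. It makes simplifying assumptions that are not in the slides: traffic is already cut into 2-second windows, each event is reduced to its service name, and support is measured per window rather than per connection. The function names and the 40 sample windows are hypothetical and chosen so that the result matches (0.8, 0.1).

```python
def contains_in_order(window, episode):
    """True if the events of `episode` occur in `window` in the same order
    (not necessarily adjacent)."""
    it = iter(window)
    return all(event in it for event in episode)

def evaluate_fer(windows, lhs, rhs):
    """Return (confidence, support) of the rule lhs -> rhs over `windows`."""
    lhs_count = sum(contains_in_order(w, lhs) for w in windows)
    joint_count = sum(contains_in_order(w, lhs + rhs) for w in windows)
    confidence = joint_count / lhs_count if lhs_count else 0.0
    support = joint_count / len(windows)
    return confidence, support

# Hypothetical 2-second windows of connection events, labelled by service.
windows = (
    [["authentication", "smtp", "smtp"]] * 4   # LHS followed by the RHS
    + [["authentication", "http"]]             # LHS without the RHS
    + [["http"]] * 35                          # unrelated traffic
)

c, s = evaluate_fer(windows, ["authentication"], ["smtp", "smtp"])
print(c, s)
```

Five windows contain the authentication event and four of them also contain the two smtp requests afterwards, so the confidence is 4/5; the joint episode appears in 4 of the 40 windows, so the support is 1/10.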

58

A cooperative anomaly and intrusion detection system (CAIDS)

An association rule is aimed at finding interesting intra-relationships inside a single connection record.

In general, an FER is specified by the following expression:

L1, L2, …, Ln → R1, …, Rm (c, s, window)   (2)

Li (1 ≤ i ≤ n) and Rj (1 ≤ j ≤ m) are ordered traffic connection events.

We call L1, L2, …, Ln the LHS episode and R1, …, Rm the RHS of the episode rule.

59

A cooperative anomaly and intrusion detection system (CAIDS)

[Figure: architecture of the CAIDS simulator, built with a 2000-signature Snort and an anomaly detection subsystem (ADS) with 60 FERs after 2 weeks of rule training over the Lincoln Lab IDS evaluation dataset]

60

Conclusion

In this report we have studied the basic concepts and some classic system models in this area, such as ADAM and MINDS.

We summarized those system models, their technologies, and their validation methods.

We hope to give an overview of current developments in this area and of how data mining is evolving in the field of network intrusion detection.

61

Reference

DARPA 1998 data set. A cleansed set in KDD Cup ’99. DARPA 1991 data set is also available: http://www.ll.mit.edu/IST/ideval/data/data_index.html

Daniel Barbara, Julia Couto, Sushil Jajodia, Leonard Popyack, Ningning Wu. “ADAM: Detecting Intrusions by Data Mining”. Proceedings of the 2001 IEEE Workshop on Information Assurance and Security, United States Military Academy, West Point, NY, 5-6 June 2001

Zhang, J. and Zulkernine, M. 2006. A Hybrid Network Intrusion Detection Technique Using Random Forests. In Proceedings of the First International Conference on Availability, Reliability and Security (April 20-22, 2006)

W. Lee et al. A data mining framework for building intrusion detection models. In Information and System Security, Vol. 3, No. 4, 2000

Ertoz, L. et al. MINDS - Minnesota Intrusion Detection System. Next Generation Data Mining, Chapter 3, 2004

Lu, C.-T., Boedihardjo, A. P., Manalwar, P. Exploiting efficient data mining techniques to enhance intrusion detection systems. Information Reuse and Integration (IRI-2005), IEEE International Conference, 15-17 Aug. 2005, pp. 512-517

Sal Stolfo, Andreas Prodromidis, Shelley Tselepis, Wenke Lee, Dave Fan, and Phil Chan. (Honorable mention (runner-up) for Best Paper Award in Applied Research Category.) In Proceedings of the Third International Conference on Knowledge Discovery and Data Mining (KDD 97), Newport Beach, CA, August 1997

62

Questions & Comments

34

Mining Data Streams for Intrusion Detection

• Maintaining profiles of normal activities: the profiles of normal activities may drift

• Identifying novel attacks: identifying clusters and outliers in traffic data streams

• Reduce the future alarm load by writing filtering rules that automatically discard well-understood false positives

35

Data Mining for Intrusion Detection

Misuse detection

• Predictive models are built from labeled data sets (instances are labeled as “normal” or “intrusive”)

• These models can be more sophisticated and precise than manually created signatures

• Recent research, e.g. JAM (Java Agents for Metalearning)

36

Misuse Detection

[Figure: activities are compared against intrusion patterns by pattern matching; a match indicates an intrusion]

Can’t detect new attacks

Example: if (src_ip == dst_ip) then “land attack”

Look for known indicators: ICMP scans, port scans, connection attempts; CPU, RAM, I/O utilization; file system activity, modification of system files, permission modifications
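The land attack rule on this slide can be written as a tiny signature matcher. This is a minimal sketch, assuming connection records are dictionaries with `src_ip` and `dst_ip` fields; the record layout, the `SIGNATURES` table, and the sample addresses are hypothetical.

```python
def land_attack(conn):
    """Signature from the slide: source and destination address are identical."""
    return conn["src_ip"] == conn["dst_ip"]

# Each signature pairs an alert name with a predicate over one connection record.
SIGNATURES = [("land attack", land_attack)]

def match_signatures(conn):
    """Return the names of all signatures that the connection matches."""
    return [name for name, predicate in SIGNATURES if predicate(conn)]

# Hypothetical connection records (addresses from documentation ranges).
print(match_signatures({"src_ip": "192.0.2.10", "dst_ip": "192.0.2.10"}))
print(match_signatures({"src_ip": "192.0.2.10", "dst_ip": "198.51.100.7"}))
```

Because the matcher only knows the predicates listed in the table, an attack with no matching entry passes through unnoticed, which is the limitation noted on this slide.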


35

Data Mining for Intrusion Detection

Misuse detectionPredictive models are built from labeled data sets (instances

are labeled as ldquonormalrdquo or ldquointrusiverdquo)These models can be more sophisticated and precise than manually

created signatures Recent research eg JAM (Java Agents for Metalearning)

36

Misuse Detection

Intrusion Patterns

activities

pattern matching

intrusion

Canrsquot detect new attacks

Example if (src_ip == dst_ip) then ldquoland attackrdquo

look for known indicators ICMP Scans port scans connection attempts CPU RAM IO Utilization File system activity modification of system files permission modifications

37

JAM (Java Agents for Metalearning)

JAM (developed at Columbia University) uses data mining techniques to discover patterns of intrusions It then applies a meta-learning classifier to learn the signature of attacks

The association rules algorithm determines relationships between fields in the audit trail records and the frequent episodes algorithm models sequential patterns of audit events Features are then extracted from both algorithms and used to compute models of intrusion behavior

The classifiers build the signature of attacks So thus data mining in JAM builds misuse detection model

Classifiers in the JAM are generated by using rule learning program on training data of system usage After training resulting classification rules is used to recognize anomalies and detect known intrusions

The system has been tested with data from Sendmail-based attacks and with network attacks using TCP dump data

38

Data Mining for Intrusion Detection

Anomaly detection Identifies anomalies as deviations from ldquonormalrdquo behavior Eg ADAM Audit Data Analysis and Mining MINDS ndash MINnesota INt

rusion Detection System

39

Anomaly Detection

activity measures

0102030405060708090

CPU ProcessSize

normal profileabnormal

probable intrusion

Relatively high false positive rate - anomalies can just be new normal activities

baseline the normal traffic and then look for things that are out of the norm

40

ADAM Audit Data Analysis and Mining

Detecting Intrusion by Data MiningCombination of Association Rule and Classification Rule Firstly ADAM collects known frequent datasetsan off-line

algorithm Secondly ADAM runs an online algorithm

Finds last frequent connection records Compare them with known mined data Discards those which seems to be normal Suspicious ones are forwarded to the classifier Trained classifier then classify the suspicious data as one

of the following Known type of attack Unknown type of attack False alarm

41

ADAM Detecting Intrusion by Data Mining

42

ADAM Audit Data Analysis and Mining

ADAM has two phases in their model

1st Phase Train the classifier Offline process Takes place only once Before the main experiment

2nd Phase Using the trained classifier Trained classifier is then used to detect anomalies Online process

43

The MINDS Project

MINDS ndash MINnesota INtrusion Detection System

Learning from Rare Class ndash Building rare class prediction models

Anomalyoutlier detection

Summarization of attacks using association pattern analysis

TID Items

1 Bread Coke Milk

2 Beer Bread

3 Beer Coke Diaper Milk

4 Beer Bread Diaper Milk

5 Coke Diaper Milk

Rules Discovered Milk --gt Coke Diaper Milk --gt Beer

Rules Discovered Milk --gt Coke Diaper Milk --gt Beer

44

MINDS - Learning from Rare Class

Problem Building models for rare network attacks (Mining needle in a haystack)

Standard data mining models are not suitable for rare classes

Models must be able to handle skewed class distributions

Learning from data streams - intrusions are sequences of events

45

MINDS - Anomaly Detection

Detect novel attacksintrusions by identifying them as deviations from ldquonormalrdquo ie anomalous behaviorIdentify normal behaviorConstruct useful set of featuresDefine similarity functionUse outlier detection algorithm

Nearest neighbor approach

Density based schemes

Unsupervised Support Vector Machines (SVM)

46

Experimental Evaluation

network

net-flow data using CISCO

routers

Data preprocessing

MINDSanomaly detection

helliphellip

Anomaly

scores Association pattern analysis

Open source signature-based network IDS wwwsnortor

g

10 minutes cycle

2 millions connections

Anomaly detection is applied

4 times a day

10 minutes time window

bull Publicly available data setDARPA 1998 Intrusion Detection Evaluation Data Set prepared and managed by MIT Lincoln Lab includes a wide variety of intrusions simulated in a military network environment

bull Real network data from University of Minnesota

47

MINDS - Framework for Mining Associations

Anomaly Detection System

attack

normal

R1 TCP DstPort=1863 Attack

hellip

hellip

hellip

hellip

R100 TCP DstPort=80 Normal

Discriminating Association

Pattern Generator

1 Build normal profile

2 Study changes in normal behavior

3 Create attack summary

4 Detect misuse behavior

5 Understand nature of the attack

update

Knowledge Base

Ranked connections

MINDS association analysis module

48

Discovered Real-life Association Patterns

At first glance Rule 1 appears to describe a Web scan Rule 2 indicates an attack on a specific machine Both rules together indicate that a scan is performed first

followed by an attack on a specific machine identified as vulnerable by the attacker

Rule 1 SrcIP=XXXX DstPort=80 Protocol=TCP Flag=SYN NoPackets 3 NoBytes120hellip180 (c1=256 c2 = 1)

Rule 2 SrcIP=XXXX DstIP=YYYY DstPort=80 Protocol=TCPFlag=SYN NoPackets 3 NoBytes 120hellip180 (c1=177 c2 = 0)

49

DstIP=ZZZZ DstPort=8888 Protocol=TCP (c1=369 c2=0)DstIP=ZZZZ DstPort=8888 Protocol=TCP Flag=SYN (c1=291 c2=0)

This pattern indicates an anomalously high number of TCP connections on port 8888 involving machine ZZZZ

Follow-up analysis of connections covered by the pattern indicates that this could be a machine running a variation of the Kazaa file-sharing protocol

Having an unauthorized application increases the vulnerability of the system

Discovered Real-life Association Patterns

50

SrcIP=XXXX DstPort=27374 Protocol=TCP Flag=SYN NoPackets=4 NoBytes=189hellip200 (c1=582 c2=2)

SrcIP=XXXX DstPort=12345 NoPackets=4 NoBytes=189hellip200 (c1=580 c2=3)

SrcIP=YYYY DstPort=27374 Protocol=TCP Flag=SYN NoPackets=3 NoBytes=144 (c1=694 c2=3)

helliphellip This pattern indicates a large number of scans on ports 27374 (which is a signature for the SubSeven worm) and 12345 (which is a signature for NetBus worm)

Further analysis showed that no fewer than five machines scanning for one or both of these ports in any time window

Discovered Real-life Association Patternshellip(ctd)

51

DstPort=6667 Protocol=TCP (c1=254 c2=1)

This pattern indicates an unusually large number of connections on port 6667 detected by the anomaly detector

Port 6667 is where IRC (Internet Relay Chat) is typically run Further analysis reveals that there are many small packets

fromto various IRC servers around the world Although IRC traffic is not unusual the fact that it is flagged

as anomalous is interesting This might indicate that the IRC server has been taken down (by a

DOS attack for example) or it is a rogue IRC server (it could be involved in some hacking activity)

Discovered Real-life Association Patternshellip(ctd)

52

DstPort=1863, Protocol=TCP, Flag=0, NoPackets=1, NoBytes<139 (c1=498, c2=6)

DstPort=1863, Protocol=TCP, Flag=0 (c1=587, c2=6)

DstPort=1863, Protocol=TCP (c1=606, c2=8)

This pattern indicates a large number of anomalous TCP connections on port 1863

Further analysis reveals that the remote IP block is owned by Hotmail

Flag=0 is unusual for TCP traffic

Discovered Real-life Association Patterns… (ctd)

53

MINDS Conclusion

Data mining based algorithms are capable of detecting intrusions that cannot be detected by state-of-the-art signature based methods

SNORT has static knowledge manually updated by human analysts

MINDS anomaly detection algorithms are adaptive in nature

MINDS anomaly detection algorithms can also be effective in detecting anomalous behavior originating from a compromised or infected machine

MINDS Research

Defining normal behavior

Feature extraction

Similarity functions

Outlier detection

Result summarization

Detection of attacks originating from multiple sites

Worm/virus detection after infection

Insider attack Policy violation

Outsider attack Network intrusion

54

IDS Using both Misuse and Anomaly Detection: RIDS-100

RIDS (Rising Intrusion Detection System) is provided by Rising Tech, a leader in antivirus and content security software and services in China

The company is a leading provider of client, gateway, and server security solutions for virus protection, firewall, and intrusion detection technologies, and of security services to enterprises and service providers around China

RIDS uses both intrusion detection techniques: misuse detection and anomaly detection

A distance-based outlier detection algorithm is used to detect deviant behavior in the collected network data

For misuse detection, it keeps a very large set of collected data patterns that are matched against scanned network data

This large set of patterns is scanned using a decision tree classification algorithm

http://www.rising-global.com
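The slide does not say which distance-based outlier algorithm RIDS uses. As a rough illustration only, the sketch below applies one common formulation, in which a record is an outlier when at least a fraction `p` of the other records lie farther than distance `d` from it. The function name, the toy records, and the values of `p` and `d` are all assumptions made for this example.

```python
import math

def is_db_outlier(point, data, p=0.9, d=2.0):
    """True if at least fraction `p` of the other points lie farther than `d`."""
    others = [q for q in data if q is not point]
    far = sum(math.dist(point, q) > d for q in others)
    return far / len(others) >= p

# Hypothetical two-feature connection records; the last one sits far from the rest.
records = [(0.0, 0.0), (0.5, 0.2), (0.3, 0.4), (0.1, 0.6), (6.0, 6.0)]
outliers = [r for r in records if is_db_outlier(r, records)]
```

In this toy data only the last record is reported, because every other record lies more than `d` away from it.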

55

A cooperative anomaly and intrusion detection system (CAIDS)

Built with a network-based intrusion detection system (NIDS) and an anomaly detection system (ADS) operating interactively through a signature generator

56

A cooperative anomaly and intrusion detection system (CAIDS)

A frequent episode rule (FER) is generated out of a collection of frequent episodes. The FER is defined over episode sequences with multiple connection events.

For example, we envision a window where we observe a 3-event sequence: E, D, and F. An FER is generated as E → D, F with confidence level freq(a ∪ b) / freq(b) = 0.8, where a represents the event E on the LHS and b corresponds to the two events D and F on the RHS of the rule.

If b occurs with a frequency of 5% and the joint event a ∪ b occurs with a frequency of 4%, there is a (0.04 / 0.05) = 80% chance that D and F will follow in the same window.
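The arithmetic above can be checked with a few lines of Python. This is a minimal sketch that computes the ratio exactly as the slide writes it, freq(a ∪ b) / freq(b); the function and parameter names are illustrative. Note that many textbooks define the confidence of a rule a → b relative to the LHS instead, as freq(a ∪ b) / freq(a).

```python
def episode_confidence(freq_joint, freq_b):
    """Ratio freq(a U b) / freq(b), following the slide's expression."""
    if freq_b <= 0:
        raise ValueError("freq_b must be positive")
    return freq_joint / freq_b

# Slide values: the joint event occurs 4% of the time, b occurs 5% of the time.
confidence = episode_confidence(freq_joint=0.04, freq_b=0.05)
```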

57

A cooperative anomaly and intrusion detection system (CAIDS)

In practice, the event E could be an authentication service characterized by two attributes (service = authentication, flag = SF). The events D, F may be two sequential smtp requests denoted by (service = smtp). Thus we can derive an FER with a confidence level of c = 80% that two smtp services will follow the authentication service within a window w = 2 sec. The three joint traffic events account for a support level s = 10% out of all the network connections being evaluated. This FER is formally stated as follows:

(service = authentication) → (service = smtp), (service = smtp) (0.8, 0.1, 2 sec) (1)

58

A cooperative anomaly and intrusion detection system (CAIDS)

An association rule is aimed at finding interesting intra-relationships inside a single connection record

In general, an FER is specified by the following expression:

L1, L2, …, Ln → R1, …, Rm (c, s, window) (2)

Li (1 ≤ i ≤ n) and Rj (1 ≤ j ≤ m) are ordered traffic connection events

We call L1, L2, …, Ln the LHS episode and R1, …, Rm the RHS of the episode rule
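One way to hold a rule of form (2) in code is a small record type. The sketch below is an illustration and not the CAIDS implementation: the class name, the field names, and the `passes` threshold check are hypothetical, and rule (1) from the previous slide is used as the example instance.

```python
from dataclasses import dataclass
from typing import Tuple

# One connection event, described by attribute/value pairs.
Event = Tuple[Tuple[str, str], ...]

@dataclass(frozen=True)
class FrequentEpisodeRule:
    lhs: Tuple[Event, ...]   # ordered events L1 ... Ln
    rhs: Tuple[Event, ...]   # ordered events R1 ... Rm
    confidence: float        # c
    support: float           # s
    window_sec: float        # window

    def passes(self, min_confidence, min_support):
        """True if the rule meets both (hypothetical) thresholds."""
        return self.confidence >= min_confidence and self.support >= min_support

auth = (("service", "authentication"),)
smtp = (("service", "smtp"),)

# Rule (1): authentication -> smtp, smtp with (0.8, 0.1, 2 sec).
rule_1 = FrequentEpisodeRule(
    lhs=(auth,), rhs=(smtp, smtp), confidence=0.8, support=0.1, window_sec=2.0
)
```

Tuples are used for both sides of the rule because the events are ordered, as stated above.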

59

A cooperative anomaly and intrusion detection system (CAIDS)

Architecture of the CAIDS simulator built with a 2000-signature Snort and an anomaly detection subsystem (ADS) with 60 FERs after 2 weeks of rule training over the Lincoln Lab IDS evaluation dataset

60

Conclusion

In this report, we have studied the basic concepts and some classic system models in this area, such as ADAM and MINDS

We have summarized those system models, their technologies, and their validation methods

We hope this gives an overview of current developments in this area and of how data mining is evolving within the field of network intrusion detection

61

Reference

DARPA 1998 data set. A cleansed set is in KDD Cup '99. DARPA 1991 data set is also available: http://www.ll.mit.edu/IST/ideval/data/data_index.html

Daniel Barbara, Julia Couto, Sushil Jajodia, Leonard Popyack, Ningning Wu. "ADAM: Detecting Intrusions by Data Mining." Proceedings of the 2001 IEEE Workshop on Information Assurance and Security, United States Military Academy, West Point, NY, 5-6 June 2001.

Zhang, J. and Zulkernine, M. 2006. A Hybrid Network Intrusion Detection Technique Using Random Forests. In Proceedings of the First International Conference on Availability, Reliability and Security (April 20-22, 2006).

W. Lee et al. A data mining framework for building intrusion detection models. In Information and System Security, Vol. 3, No. 4, 2000.

Ertoz, L. et al. MINDS - Minnesota Intrusion Detection System. Next Generation Data Mining, Chapter 3, 2004.

Lu, C.-T., Boedihardjo, A. P., Manalwar, P. Exploiting efficient data mining techniques to enhance intrusion detection systems. Information Reuse and Integration, Conf. 2005 (IRI-2005), IEEE International Conference on, 15-17 Aug. 2005, pp. 512-517.

Sal Stolfo, Andreas Prodromidis, Shelley Tselepis, Wenke Lee, Dave Fan, and Phil Chan. (Honorable mention (runner-up) for Best Paper Award in Applied Research Category.) In Proceedings of the Third International Conference on Knowledge Discovery and Data Mining (KDD 97), Newport Beach, CA, August 1997.

62

Questions & Comments

36

Misuse Detection

[Diagram: activities are pattern-matched against known intrusion patterns to identify an intrusion]

Can't detect new attacks

Example: if (src_ip == dst_ip) then "land attack"

Look for known indicators: ICMP scans, port scans, connection attempts; CPU, RAM, and I/O utilization; file system activity, modification of system files, permission modifications
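The land attack example above can be written as a tiny signature matcher. This is a minimal sketch, assuming a hypothetical `Packet` record and a hand-written signature list; a real misuse detector would carry many more signatures and richer packet fields.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Packet:
    src_ip: str
    dst_ip: str
    dst_port: int

# Each signature pairs an attack name with a predicate over one packet.
SIGNATURES = [
    ("land attack", lambda p: p.src_ip == p.dst_ip),
]

def match_signatures(packet):
    """Return the names of all known signatures that the packet matches."""
    return [name for name, predicate in SIGNATURES if predicate(packet)]
```

For example, `match_signatures(Packet("10.0.0.5", "10.0.0.5", 80))` reports the land attack, while a packet with different source and destination addresses matches nothing. An attack with no entry in `SIGNATURES` is never reported, which is the limitation noted on the slide.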

37

JAM (Java Agents for Metalearning)

JAM (developed at Columbia University) uses data mining techniques to discover patterns of intrusions. It then applies a meta-learning classifier to learn the signatures of attacks.

The association rules algorithm determines relationships between fields in the audit trail records, and the frequent episodes algorithm models sequential patterns of audit events. Features are then extracted from both algorithms and used to compute models of intrusion behavior.

The classifiers build the signatures of attacks. Thus, data mining in JAM builds a misuse detection model.

Classifiers in JAM are generated by running a rule learning program on training data of system usage. After training, the resulting classification rules are used to recognize anomalies and detect known intrusions.

The system has been tested with data from Sendmail-based attacks and with network attacks using TCP dump data

38

Data Mining for Intrusion Detection

Anomaly detection: identifies anomalies as deviations from "normal" behavior. E.g., ADAM (Audit Data Analysis and Mining), MINDS (MINnesota INtrusion Detection System)

39

Anomaly Detection

[Figure: bar chart of activity measures (such as CPU and process size) comparing the normal profile with abnormal values; a large deviation marks a probable intrusion]

Relatively high false positive rate - anomalies can just be new normal activities

baseline the normal traffic and then look for things that are out of the norm
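The baseline-then-deviate idea can be sketched with a simple statistical profile. The code below is one possible illustration, not a method taken from the slides: it summarizes normal readings of a single activity measure by mean and standard deviation, and flags values more than `threshold` deviations away. The sample CPU readings and the threshold of 3 are assumptions.

```python
from statistics import mean, stdev

def build_profile(normal_values):
    """Summarize normal activity as (mean, standard deviation)."""
    return mean(normal_values), stdev(normal_values)

def is_anomalous(value, profile, threshold=3.0):
    """Flag a measurement lying more than `threshold` deviations from the mean."""
    mu, sigma = profile
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > threshold

# Hypothetical CPU utilization readings (percent) observed during normal operation.
cpu_profile = build_profile([20, 22, 19, 21, 23, 20, 22])
```

A reading such as 85 is flagged against this profile while 22 is not. A legitimate new workload would be flagged in the same way, which is the false positive problem mentioned above.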

40

ADAM: Audit Data Analysis and Mining

Detecting Intrusion by Data Mining

Combination of association rules and classification rules

Firstly, ADAM collects known frequent datasets (an off-line algorithm)

Secondly, ADAM runs an online algorithm:

Finds the most recent frequent connection records

Compares them with the known mined data

Discards those that seem to be normal

Suspicious ones are forwarded to the classifier

The trained classifier then classifies the suspicious data as one of the following:

Known type of attack

Unknown type of attack

False alarm
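The filtering part of the online step can be sketched as a set difference between what is frequent now and what was frequent in the off-line profile. This is a simplified illustration, not ADAM's actual algorithm: each connection record is treated as a single itemset (no subset mining), the `min_support` value is an assumption, and the trained classifier that receives the suspicious itemsets is left out.

```python
from collections import Counter

def frequent_itemsets(window, min_support):
    """Itemsets whose relative frequency in the window reaches min_support."""
    counts = Counter(window)
    return {items for items, n in counts.items() if n / len(window) >= min_support}

def suspicious_itemsets(window, normal_profile, min_support=0.3):
    """Online step: keep frequent itemsets that the off-line profile lacks."""
    return frequent_itemsets(window, min_support) - normal_profile

# Hypothetical connection records, each a frozenset of attribute/value pairs.
web = frozenset({("dst_port", 80), ("flag", "SF")})
scan = frozenset({("dst_port", 27374), ("flag", "S0")})

normal_profile = {web}                  # mined off-line from attack-free data
window = [web, web, scan, scan, scan]   # most recent connections
suspicious = suspicious_itemsets(window, normal_profile)
```

Here only the `scan` itemset survives the filter and would be forwarded to the classifier.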

41

ADAM Detecting Intrusion by Data Mining

42

ADAM Audit Data Analysis and Mining

ADAM has two phases in its model

1st phase: train the classifier. This is an offline process that takes place only once, before the main experiment.

2nd phase: use the trained classifier to detect anomalies. This is an online process.

43

The MINDS Project

MINDS - MINnesota INtrusion Detection System

Learning from Rare Class - Building rare class prediction models

Anomaly/outlier detection

Summarization of attacks using association pattern analysis

TID  Items

1    Bread, Coke, Milk

2    Beer, Bread

3    Beer, Coke, Diaper, Milk

4    Beer, Bread, Diaper, Milk

5    Coke, Diaper, Milk

Rules discovered: {Milk} --> {Coke}; {Diaper, Milk} --> {Beer}
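The two rules can be checked against the five transactions by computing support and confidence directly. The sketch below uses the standard definitions (support as the fraction of transactions containing an itemset, confidence as support of the whole rule divided by support of its left-hand side); the function names are illustrative.

```python
transactions = [
    {"Bread", "Coke", "Milk"},
    {"Beer", "Bread"},
    {"Beer", "Coke", "Diaper", "Milk"},
    {"Beer", "Bread", "Diaper", "Milk"},
    {"Coke", "Diaper", "Milk"},
]

def support(itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(lhs, rhs):
    """support(lhs U rhs) / support(lhs) for the rule lhs --> rhs."""
    return support(lhs | rhs) / support(lhs)

milk_to_coke = confidence({"Milk"}, {"Coke"})
diaper_milk_to_beer = confidence({"Diaper", "Milk"}, {"Beer"})
```

Milk appears in four transactions and Coke in three of those, so the first rule has confidence 3/4. Diaper and Milk appear together in three transactions and Beer in two of those, giving 2/3 for the second rule.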

44

MINDS - Learning from Rare Class

Problem Building models for rare network attacks (Mining needle in a haystack)

Standard data mining models are not suitable for rare classes

Models must be able to handle skewed class distributions

Learning from data streams - intrusions are sequences of events

45

MINDS - Anomaly Detection

Detect novel attacks/intrusions by identifying them as deviations from "normal", i.e. anomalous, behavior

Identify normal behavior

Construct useful set of features

Define similarity function

Use outlier detection algorithm

Nearest neighbor approach

Density based schemes

Unsupervised Support Vector Machines (SVM)
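Of the outlier detection schemes listed, the nearest neighbor approach is the simplest to sketch: score each record by its distance to its k-th nearest neighbor, so that isolated records receive large scores. This is an illustration under assumptions and not the MINDS implementation; the two-feature flow vectors, Euclidean distance as the similarity function, and `k=2` are all chosen for the example.

```python
import math

def knn_outlier_score(point, others, k=2):
    """Distance from `point` to its k-th nearest neighbor among `others`."""
    distances = sorted(math.dist(point, q) for q in others)
    return distances[k - 1]

def score_all(points, k=2):
    """Score every point against the rest; larger scores are more anomalous."""
    return [
        knn_outlier_score(p, points[:i] + points[i + 1:], k)
        for i, p in enumerate(points)
    ]

# Hypothetical feature vectors for five flows; the last one is far from the cluster.
flows = [(1.0, 1.0), (1.1, 0.9), (0.9, 1.2), (1.2, 1.1), (8.0, 9.0)]
scores = score_all(flows)
```

Ranking connections by such a score, rather than applying a hard cutoff, lets an analyst start with the most anomalous ones.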

46

Experimental Evaluation

[Diagram labels: network; net-flow data using Cisco routers; data preprocessing; MINDS anomaly detection; anomaly scores; association pattern analysis]

Open source signature-based network IDS: www.snort.org

10-minute cycle, 2 million connections

Anomaly detection is applied 4 times a day, with a 10-minute time window

Publicly available data set: DARPA 1998 Intrusion Detection Evaluation Data Set, prepared and managed by MIT Lincoln Lab; it includes a wide variety of intrusions simulated in a military network environment

Real network data from University of Minnesota

47

MINDS - Framework for Mining Associations

[Diagram: MINDS association analysis module. The Anomaly Detection System outputs ranked connections labeled attack or normal:

R1: TCP, DstPort=1863, Attack

…

R100: TCP, DstPort=80, Normal

The ranked connections go to the Discriminating Association Pattern Generator, which updates the Knowledge Base]

1 Build normal profile

2 Study changes in normal behavior

3 Create attack summary

4 Detect misuse behavior

5 Understand nature of the attack

48

Discovered Real-life Association Patterns

At first glance, Rule 1 appears to describe a Web scan. Rule 2 indicates an attack on a specific machine. Both rules together indicate that a scan is performed first, followed by an attack on a specific machine identified as vulnerable by the attacker.

Rule 1: SrcIP=XXXX, DstPort=80, Protocol=TCP, Flag=SYN, NoPackets=3, NoBytes=120…180 (c1=256, c2=1)

Rule 2: SrcIP=XXXX, DstIP=YYYY, DstPort=80, Protocol=TCP, Flag=SYN, NoPackets=3, NoBytes=120…180 (c1=177, c2=0)
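The slides do not define c1 and c2. The sketch below assumes that c1 counts the connections ranked as anomalous that a pattern covers and c2 counts the normal connections it covers, so a discriminating pattern has a high c1 and a low c2. The helper names and the toy connection records are hypothetical.

```python
def matches(pattern, connection):
    """A connection matches when it carries every attribute value in the pattern."""
    return all(connection.get(key) == value for key, value in pattern.items())

def pattern_counts(pattern, anomalous, normal):
    """Return (c1, c2): matches among anomalous and among normal connections."""
    c1 = sum(matches(pattern, c) for c in anomalous)
    c2 = sum(matches(pattern, c) for c in normal)
    return c1, c2

# Toy connection records (assumed field names follow the rule attributes).
anomalous = [
    {"SrcIP": "XXXX", "DstPort": 80, "Protocol": "TCP"},
    {"SrcIP": "XXXX", "DstPort": 80, "Protocol": "TCP"},
    {"SrcIP": "YYYY", "DstPort": 22, "Protocol": "TCP"},
]
normal = [
    {"SrcIP": "XXXX", "DstPort": 80, "Protocol": "TCP"},
    {"SrcIP": "WWWW", "DstPort": 443, "Protocol": "TCP"},
]

counts = pattern_counts({"SrcIP": "XXXX", "DstPort": 80}, anomalous, normal)
```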

38

Data Mining for Intrusion Detection

Anomaly detection Identifies anomalies as deviations from ldquonormalrdquo behavior Eg ADAM Audit Data Analysis and Mining MINDS ndash MINnesota INt

rusion Detection System

39

Anomaly Detection

activity measures

0102030405060708090

CPU ProcessSize

normal profileabnormal

probable intrusion

Relatively high false positive rate - anomalies can just be new normal activities

baseline the normal traffic and then look for things that are out of the norm

40

ADAM: Audit Data Analysis and Mining

Detects intrusions by data mining, combining association rules with classification.

First, ADAM collects known frequent itemsets using an off-line algorithm.

Second, ADAM runs an online algorithm that:
Finds the most recent frequent connection records
Compares them with the known mined data
Discards those that seem to be normal
Forwards the suspicious ones to the classifier

The trained classifier then labels each suspicious item as one of the following:
Known type of attack
Unknown type of attack
False alarm
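The two steps above can be sketched as follows. This is a toy illustration under stated assumptions, not ADAM's actual implementation: the itemset miner only looks at itemsets of size one and two, the attribute strings are invented, and `classify` is a hypothetical stand-in for the trained classifier (it merges "unknown attack" and "false alarm" into one label).

```python
from collections import Counter
from itertools import combinations

def frequent_itemsets(records, min_support):
    """Return itemsets (size 1 and 2) whose support is at least min_support."""
    counts = Counter()
    for record in records:
        items = sorted(record)
        for size in (1, 2):
            for itemset in combinations(items, size):
                counts[itemset] += 1
    n = len(records)
    return {s for s, c in counts.items() if c / n >= min_support}

def suspicious_itemsets(window, normal_profile, min_support=0.5):
    """Online step: keep frequent itemsets of the window absent from the profile."""
    return frequent_itemsets(window, min_support) - normal_profile

def classify(itemset, known_attack_signatures):
    """Hypothetical stand-in for the trained classifier."""
    if itemset in known_attack_signatures:
        return "known attack"
    return "unknown attack or false alarm"

# Off-line step: mine a profile from hypothetical attack-free records.
normal_records = [
    {"dst_port=80", "flag=SF"},
    {"dst_port=80", "flag=SF"},
    {"dst_port=25", "flag=SF"},
]
profile = frequent_itemsets(normal_records, min_support=0.5)

# Online step: a recent window dominated by half-open connections to port 80.
window = [
    {"dst_port=80", "flag=S0"},
    {"dst_port=80", "flag=S0"},
    {"dst_port=80", "flag=SF"},
]
signatures = {("dst_port=80", "flag=S0")}
for itemset in sorted(suspicious_itemsets(window, profile)):
    print(itemset, "->", classify(itemset, signatures))
```

Itemsets that are frequent in both the profile and the window, such as traffic to port 80 in general, are discarded before classification, which keeps the classifier's workload small.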

41

ADAM: Detecting Intrusion by Data Mining

42

ADAM: Audit Data Analysis and Mining

The ADAM model has two phases.

1st phase: train the classifier. This is an offline process that takes place only once, before the main experiment.

2nd phase: use the trained classifier to detect anomalies. This is an online process.

43

The MINDS Project

MINDS: MINnesota INtrusion Detection System

Learning from rare class: building rare class prediction models

Anomaly/outlier detection

Summarization of attacks using association pattern analysis

TID   Items
1     Bread, Coke, Milk
2     Beer, Bread
3     Beer, Coke, Diaper, Milk
4     Beer, Bread, Diaper, Milk
5     Coke, Diaper, Milk

Rules discovered:
{Milk} --> {Coke}
{Diaper, Milk} --> {Beer}
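To make the two rules concrete, the sketch below computes their support and confidence over the five transactions in the table. The function names are illustrative; the definitions are the standard ones (support is the fraction of transactions containing every item of the rule, confidence is the fraction of transactions containing the left-hand side that also contain the right-hand side).

```python
transactions = [
    {"Bread", "Coke", "Milk"},
    {"Beer", "Bread"},
    {"Beer", "Coke", "Diaper", "Milk"},
    {"Beer", "Bread", "Diaper", "Milk"},
    {"Coke", "Diaper", "Milk"},
]

def count(itemset):
    """Number of transactions that contain every item in itemset."""
    return sum(itemset <= t for t in transactions)

def support(lhs, rhs):
    return count(lhs | rhs) / len(transactions)

def confidence(lhs, rhs):
    return count(lhs | rhs) / count(lhs)

rules = [({"Milk"}, {"Coke"}), ({"Diaper", "Milk"}, {"Beer"})]
for lhs, rhs in rules:
    print(sorted(lhs), "-->", sorted(rhs),
          f"support={support(lhs, rhs):.2f}",
          f"confidence={confidence(lhs, rhs):.2f}")
```

Milk appears in four transactions and Coke joins it in three, so the first rule has support 3/5 and confidence 3/4; the second has support 2/5 and confidence 2/3.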

44

MINDS - Learning from Rare Class

Problem: building models for rare network attacks (mining a needle in a haystack)

Standard data mining models are not suitable for rare classes

Models must be able to handle skewed class distributions

Learning from data streams - intrusions are sequences of events
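A small worked example shows why a skewed class distribution defeats a standard model evaluated by accuracy. The numbers are hypothetical: 10 attacks among 1000 connections, and a degenerate model that labels everything as normal.

```python
def accuracy(y_true, y_pred):
    """Fraction of all connections labelled correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def recall(y_true, y_pred, positive="attack"):
    """Fraction of true attacks that the model actually detects."""
    hits = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    total = sum(t == positive for t in y_true)
    return hits / total if total else 0.0

# Hypothetical traffic: 10 attacks hidden among 990 normal connections.
y_true = ["attack"] * 10 + ["normal"] * 990
always_normal = ["normal"] * 1000  # a model that never raises an alarm

print(accuracy(y_true, always_normal))  # looks excellent
print(recall(y_true, always_normal))    # yet no attack is detected
```

The model scores 99% accuracy while detecting nothing, which is why rare class models are judged on measures such as recall and precision for the attack class.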

45

MINDS - Anomaly Detection

Detect novel attacks/intrusions by identifying them as deviations from "normal", i.e. anomalous, behavior

Identify normal behavior
Construct a useful set of features
Define a similarity function
Use an outlier detection algorithm:
Nearest neighbor approach
Density based schemes
Unsupervised Support Vector Machines (SVM)
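As an illustration of the nearest neighbor approach, the sketch below scores each connection by the distance to its k-th nearest neighbor, so isolated points receive high scores. It is a simplified example and not the MINDS algorithm: the two features, the Euclidean distance used as the similarity function, and k = 2 are all assumptions, and real features would need to be normalized first.

```python
import math

def knn_outlier_scores(points, k=2):
    """Score each point by the distance to its k-th nearest neighbour."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores

# Hypothetical (packets, kilobytes) features for five connections.
connections = [(3, 1.0), (4, 1.2), (3, 1.1), (4, 0.9), (40, 25.0)]
scores = knn_outlier_scores(connections, k=2)

# The connection with the highest score is the strongest outlier candidate.
most_anomalous = max(range(len(connections)), key=scores.__getitem__)
print(most_anomalous, connections[most_anomalous])
```

Ranking connections by such a score, rather than applying a hard cut-off, lets an analyst review only the top of the list.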

46

Experimental Evaluation

[Figure: evaluation pipeline. Network --> net-flow data collected using Cisco routers --> data preprocessing --> MINDS anomaly detection --> anomaly scores --> association pattern analysis]

Snort: open source signature-based network IDS (www.snort.org)

10-minute cycle, 2 million connections

Anomaly detection is applied 4 times a day, on a 10-minute time window

• Publicly available data set: DARPA 1998 Intrusion Detection Evaluation Data Set, prepared and managed by MIT Lincoln Lab; it includes a wide variety of intrusions simulated in a military network environment

• Real network data from the University of Minnesota

47

MINDS - Framework for Mining Associations

[Figure: MINDS association analysis module. The anomaly detection system outputs ranked connections, labelled attack or normal. A discriminating association pattern generator turns them into rules, from "R1: TCP, DstPort=1863 --> Attack" through "R100: TCP, DstPort=80 --> Normal", which are used to update the knowledge base.]

1. Build normal profile

2. Study changes in normal behavior

3. Create attack summary

4. Detect misuse behavior

5. Understand nature of the attack

48

Discovered Real-life Association Patterns

Rule 1: SrcIP=XXXX, DstPort=80, Protocol=TCP, Flag=SYN, NoPackets: 3, NoBytes: 120…180 (c1=256, c2=1)

Rule 2: SrcIP=XXXX, DstIP=YYYY, DstPort=80, Protocol=TCP, Flag=SYN, NoPackets: 3, NoBytes: 120…180 (c1=177, c2=0)

At first glance, Rule 1 appears to describe a Web scan, and Rule 2 indicates an attack on a specific machine. Both rules together indicate that a scan is performed first, followed by an attack on a specific machine identified as vulnerable by the attacker.
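The slides do not define c1 and c2. The sketch below assumes they are the number of anomalous and normal connections, respectively, that match a pattern; under that reading, a pattern with large c1 and small c2 summarizes anomalous traffic well. The field names and the sample connections are hypothetical.

```python
def matches(connection, pattern):
    """True if the connection has every attribute value required by the pattern."""
    return all(connection.get(key) == value for key, value in pattern.items())

def pattern_counts(pattern, anomalous, normal):
    """Return (c1, c2): matches among anomalous and among normal connections."""
    c1 = sum(matches(c, pattern) for c in anomalous)
    c2 = sum(matches(c, pattern) for c in normal)
    return c1, c2

# Hypothetical connections ranked as anomalous by the detector.
anomalous = [
    {"SrcIP": "XXXX", "DstPort": 80, "Protocol": "TCP", "Flag": "SYN"},
    {"SrcIP": "XXXX", "DstPort": 80, "Protocol": "TCP", "Flag": "SYN"},
    {"SrcIP": "XXXX", "DstPort": 80, "Protocol": "TCP", "Flag": "SYN"},
    {"SrcIP": "WWWW", "DstPort": 25, "Protocol": "TCP", "Flag": "ACK"},
]
# Hypothetical connections ranked as normal.
normal = [
    {"SrcIP": "XXXX", "DstPort": 80, "Protocol": "TCP", "Flag": "SYN"},
    {"SrcIP": "VVVV", "DstPort": 80, "Protocol": "TCP", "Flag": "ACK"},
]

scan_pattern = {"SrcIP": "XXXX", "DstPort": 80, "Protocol": "TCP", "Flag": "SYN"}
print(pattern_counts(scan_pattern, anomalous, normal))
```

A discriminating pattern generator would keep patterns whose counts differ sharply between the two groups and drop those that match both groups about equally.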

49

Discovered Real-life Association Patterns

DstIP=ZZZZ, DstPort=8888, Protocol=TCP (c1=369, c2=0)

DstIP=ZZZZ, DstPort=8888, Protocol=TCP, Flag=SYN (c1=291, c2=0)

This pattern indicates an anomalously high number of TCP connections on port 8888 involving machine ZZZZ.

Follow-up analysis of connections covered by the pattern indicates that this could be a machine running a variation of the Kazaa file-sharing protocol.

Having an unauthorized application increases the vulnerability of the system.

50

Discovered Real-life Association Patterns… (ctd)

SrcIP=XXXX, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=4, NoBytes=189…200 (c1=582, c2=2)

SrcIP=XXXX, DstPort=12345, NoPackets=4, NoBytes=189…200 (c1=580, c2=3)

SrcIP=YYYY, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=3, NoBytes=144 (c1=694, c2=3)

……

This pattern indicates a large number of scans on ports 27374 (a signature for the SubSeven worm) and 12345 (a signature for the NetBus worm).

Further analysis showed that no fewer than five machines were scanning for one or both of these ports in any time window.

51

Discovered Real-life Association Patterns… (ctd)

DstPort=6667, Protocol=TCP (c1=254, c2=1)

This pattern indicates an unusually large number of connections on port 6667 detected by the anomaly detector.

Port 6667 is where IRC (Internet Relay Chat) is typically run. Further analysis reveals that there are many small packets from/to various IRC servers around the world.

Although IRC traffic is not unusual, the fact that it is flagged as anomalous is interesting. This might indicate that the IRC server has been taken down (by a DoS attack, for example) or that it is a rogue IRC server (it could be involved in some hacking activity).

52

Discovered Real-life Association Patterns… (ctd)

DstPort=1863, Protocol=TCP, Flag=0, NoPackets=1, NoBytes<139 (c1=498, c2=6)

DstPort=1863, Protocol=TCP, Flag=0 (c1=587, c2=6)

DstPort=1863, Protocol=TCP (c1=606, c2=8)

This pattern indicates a large number of anomalous TCP connections on port 1863.

Further analysis reveals that the remote IP block is owned by Hotmail.

Flag=0 is unusual for TCP traffic.

53

MINDS Conclusion

Data mining based algorithms are capable of detecting intrusions that cannot be detected by state-of-the-art signature based methods

SNORT has static knowledge manually updated by human analysts

MINDS anomaly detection algorithms are adaptive in nature

MINDS anomaly detection algorithms can also be effective in detecting anomalous behavior originating from a compromised or infected machine

MINDS Research

Defining normal behavior
Feature extraction
Similarity functions
Outlier detection
Result summarization
Detection of attacks originating from multiple sites

[Figure labels: worm/virus detection after infection; insider attack, policy violation; outsider attack, network intrusion]

54

IDS Using both Misuse and Anomaly Detection: RIDS-100

RIDS (Rising Intrusion Detection System) is provided by Rising Tech, a leader in antivirus and content security software and services in China.

The company is a leading provider of client, gateway and server security solutions for virus protection, firewall and intrusion detection technologies, and of security services to enterprises and service providers around China.

RIDS makes use of both intrusion detection techniques: misuse detection and anomaly detection.

A distance-based outlier detection algorithm is used to detect deviant behavior in the collected network data.

For misuse detection, it has a very large set of collected data patterns that can be matched against scanned network data.

This large set of data patterns is scanned using a decision tree classification algorithm.

http://www.rising-global.com

55

A cooperative anomaly and intrusion detection system (CAIDS)

CAIDS is built with a network-based intrusion detection system (NIDS) and an anomaly detection system (ADS), operating interactively through a signature generator.

56

A cooperative anomaly and intrusion detection system (CAIDS)

A frequent episode rule (FER) is generated out of a collection of frequent episodes. The FER is defined over episode sequences with multiple connection events.

For example, we envision a window where we observe a 3-event sequence: E, D and F. An FER is generated as E → D, F with confidence level freq(a ∪ b) / freq(a) = 0.8, where a represents the event E on the LHS and b corresponds to the two events D and F on the RHS of the rule.

If a occurs in 5% of the windows and the joint event a and b occurs in 4%, there is a (0.04 / 0.05) = 80% chance that D and F will follow E in the same window.
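The confidence of an episode rule can be checked by counting windows. The sketch below uses the usual rule-mining definition, which is an assumption here: confidence is the number of windows containing the LHS followed by the RHS, divided by the number of windows containing the LHS. The event windows are invented for illustration.

```python
def contains_in_order(window, episode):
    """True if the events of episode appear in window in the same order."""
    remaining = iter(window)
    return all(event in remaining for event in episode)

def fer_confidence(windows, lhs, rhs):
    """Share of windows containing lhs in which rhs follows."""
    lhs_count = sum(contains_in_order(w, lhs) for w in windows)
    joint_count = sum(contains_in_order(w, lhs + rhs) for w in windows)
    return joint_count / lhs_count if lhs_count else 0.0

# Hypothetical observation windows: E occurs in five of them,
# and D, F follow E in four of those five.
windows = [
    ["E", "D", "F"],
    ["E", "D", "F"],
    ["E", "D", "F"],
    ["E", "D", "F"],
    ["E", "G"],
    ["D", "F"],
]
print(fer_confidence(windows, ["E"], ["D", "F"]))
```

The last window contains D and F without E, so it affects neither count; only windows where the LHS occurs enter the denominator.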

57

A cooperative anomaly and intrusion detection system (CAIDS)

In practice, the event E could be an authentication service characterized by two attributes (service = authentication, flag = SF). The events D, F may be two sequential smtp requests, denoted by (service = smtp).

Thus we can derive an FER with a confidence level of c = 80% that two smtp services will follow the authentication service within a window w = 2 sec. The three joint traffic events account for a support level s = 10% out of all the network connections being evaluated. This FER is formally stated as follows:

(service = authentication) → (service = smtp), (service = smtp) (0.8, 0.1, 2 sec)   (1)

58

A cooperative anomaly and intrusion detection system (CAIDS)

An association rule is aimed at finding interesting intra-relationships inside a single connection record. An FER, by contrast, relates multiple connection events.

In general, an FER is specified by the following expression:

L1, L2, …, Ln → R1, …, Rm (c, s, window)   (2)

Li (1 ≤ i ≤ n) and Rj (1 ≤ j ≤ m) are ordered traffic connection events.

We call L1, L2, …, Ln the LHS episode and R1, …, Rm the RHS episode of the rule.
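One way to hold a rule of this general form in code is a small record with the LHS episode, the RHS episode, and the three parameters. The class below is a hypothetical sketch, not the CAIDS data structure; its `fires_on` method checks whether the LHS followed by the RHS occurs, in order, within one window of a timestamped event list.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass(frozen=True)
class FrequentEpisodeRule:
    lhs: Tuple[str, ...]   # L1 ... Ln, ordered
    rhs: Tuple[str, ...]   # R1 ... Rm, ordered
    confidence: float      # c
    support: float         # s
    window_sec: float      # window

    def fires_on(self, events: List[Tuple[float, str]]) -> bool:
        """events: (timestamp_sec, label) pairs sorted by time."""
        episode = self.lhs + self.rhs
        for i, (start, label) in enumerate(events):
            if label != episode[0]:
                continue
            pos = 1  # next episode event to match
            for t, lab in events[i + 1:]:
                if t - start > self.window_sec:
                    break  # left the window that began at `start`
                if pos < len(episode) and lab == episode[pos]:
                    pos += 1
            if pos == len(episode):
                return True
        return False

# The rule from expression (1), with hypothetical event streams.
rule = FrequentEpisodeRule(("authentication",), ("smtp", "smtp"), 0.8, 0.1, 2.0)
fast = [(0.0, "authentication"), (0.5, "smtp"), (1.4, "smtp")]
slow = [(0.0, "authentication"), (0.5, "smtp"), (3.0, "smtp")]
print(rule.fires_on(fast), rule.fires_on(slow))
```

Keeping c, s and the window alongside the episodes means a rule can be filtered by confidence or support without consulting the data it was mined from.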

59

A cooperative anomaly and intrusion detection system (CAIDS)

[Figure: architecture of the CAIDS simulator, built with a 2000-signature Snort and an anomaly detection subsystem (ADS) with 60 FERs, after 2 weeks of rule training over the Lincoln Lab IDS evaluation dataset]

60

Conclusion

In this report we have studied the basic concepts and some classic system models in this area, such as ADAM and MINDS.

We have summarized those system models, their technologies and their validation methods.

We hope to have given an overview of current developments in this area and of how data mining is evolving into the field of network intrusion detection.

61

Reference

DARPA 1998 data set. A cleansed set in KDD Cup'99. DARPA 1991 data set is also available: http://www.ll.mit.edu/IST/ideval/data/data_index.html

Daniel Barbara, Julia Couto, Sushil Jajodia, Leonard Popyack, Ningning Wu. "ADAM: Detecting Intrusions by Data Mining." Proceedings of the 2001 IEEE Workshop on Information Assurance and Security, United States Military Academy, West Point, NY, 5-6 June 2001.

Zhang, J. and Zulkernine, M. 2006. A Hybrid Network Intrusion Detection Technique Using Random Forests. In Proceedings of the First International Conference on Availability, Reliability and Security (April 20-22, 2006).

W. Lee et al. A data mining framework for building intrusion detection models. In Information and System Security, Vol. 3, No. 4, 2000.

Ertoz, L. et al. MINDS - Minnesota Intrusion Detection System. Next Generation Data Mining, Chapter 3, 2004.

Lu, C.-T., Boedihardjo, A.P., Manalwar, P. Exploiting efficient data mining techniques to enhance intrusion detection systems. IEEE International Conference on Information Reuse and Integration (IRI-2005), 15-17 Aug. 2005, pp. 512-517.

Sal Stolfo, Andreas Prodromidis, Shelley Tselepis, Wenke Lee, Dave Fan, and Phil Chan. (Honorable mention (runner-up) for Best Paper Award in Applied Research Category.) In Proceedings of the Third International Conference on Knowledge Discovery and Data Mining (KDD 97), Newport Beach, CA, August 1997.

62

Questions & Comments

40

ADAM Audit Data Analysis and Mining

Detecting Intrusion by Data MiningCombination of Association Rule and Classification Rule Firstly ADAM collects known frequent datasetsan off-line

algorithm Secondly ADAM runs an online algorithm

Finds last frequent connection records Compare them with known mined data Discards those which seems to be normal Suspicious ones are forwarded to the classifier Trained classifier then classify the suspicious data as one

of the following Known type of attack Unknown type of attack False alarm

41

ADAM Detecting Intrusion by Data Mining

42

ADAM Audit Data Analysis and Mining

ADAM has two phases in their model

1st Phase Train the classifier Offline process Takes place only once Before the main experiment

2nd Phase Using the trained classifier Trained classifier is then used to detect anomalies Online process

43

The MINDS Project

MINDS ndash MINnesota INtrusion Detection System

Learning from Rare Class ndash Building rare class prediction models

Anomalyoutlier detection

Summarization of attacks using association pattern analysis

TID Items

1 Bread Coke Milk

2 Beer Bread

3 Beer Coke Diaper Milk

4 Beer Bread Diaper Milk

5 Coke Diaper Milk

Rules Discovered Milk --gt Coke Diaper Milk --gt Beer

Rules Discovered Milk --gt Coke Diaper Milk --gt Beer

44

MINDS - Learning from Rare Class

Problem Building models for rare network attacks (Mining needle in a haystack)

Standard data mining models are not suitable for rare classes

Models must be able to handle skewed class distributions

Learning from data streams - intrusions are sequences of events

45

MINDS - Anomaly Detection

Detect novel attacksintrusions by identifying them as deviations from ldquonormalrdquo ie anomalous behaviorIdentify normal behaviorConstruct useful set of featuresDefine similarity functionUse outlier detection algorithm

Nearest neighbor approach

Density based schemes

Unsupervised Support Vector Machines (SVM)

46

Experimental Evaluation

network

net-flow data using CISCO

routers

Data preprocessing

MINDSanomaly detection

helliphellip

Anomaly

scores Association pattern analysis

Open source signature-based network IDS wwwsnortor

g

10 minutes cycle

2 millions connections

Anomaly detection is applied

4 times a day

10 minutes time window

bull Publicly available data setDARPA 1998 Intrusion Detection Evaluation Data Set prepared and managed by MIT Lincoln Lab includes a wide variety of intrusions simulated in a military network environment

bull Real network data from University of Minnesota

47

MINDS - Framework for Mining Associations

Anomaly Detection System

attack

normal

R1 TCP DstPort=1863 Attack

hellip

hellip

hellip

hellip

R100 TCP DstPort=80 Normal

Discriminating Association

Pattern Generator

1 Build normal profile

2 Study changes in normal behavior

3 Create attack summary

4 Detect misuse behavior

5 Understand nature of the attack

update

Knowledge Base

Ranked connections

MINDS association analysis module

48

Discovered Real-life Association Patterns

At first glance Rule 1 appears to describe a Web scan Rule 2 indicates an attack on a specific machine Both rules together indicate that a scan is performed first

followed by an attack on a specific machine identified as vulnerable by the attacker

Rule 1 SrcIP=XXXX DstPort=80 Protocol=TCP Flag=SYN NoPackets 3 NoBytes120hellip180 (c1=256 c2 = 1)

Rule 2 SrcIP=XXXX DstIP=YYYY DstPort=80 Protocol=TCPFlag=SYN NoPackets 3 NoBytes 120hellip180 (c1=177 c2 = 0)

49

DstIP=ZZZZ DstPort=8888 Protocol=TCP (c1=369 c2=0)DstIP=ZZZZ DstPort=8888 Protocol=TCP Flag=SYN (c1=291 c2=0)

This pattern indicates an anomalously high number of TCP connections on port 8888 involving machine ZZZZ

Follow-up analysis of connections covered by the pattern indicates that this could be a machine running a variation of the Kazaa file-sharing protocol

Having an unauthorized application increases the vulnerability of the system

Discovered Real-life Association Patterns

50

SrcIP=XXXX DstPort=27374 Protocol=TCP Flag=SYN NoPackets=4 NoBytes=189hellip200 (c1=582 c2=2)

SrcIP=XXXX DstPort=12345 NoPackets=4 NoBytes=189hellip200 (c1=580 c2=3)

SrcIP=YYYY DstPort=27374 Protocol=TCP Flag=SYN NoPackets=3 NoBytes=144 (c1=694 c2=3)

helliphellip This pattern indicates a large number of scans on ports 27374 (which is a signature for the SubSeven worm) and 12345 (which is a signature for NetBus worm)

Further analysis showed that no fewer than five machines scanning for one or both of these ports in any time window

Discovered Real-life Association Patternshellip(ctd)

51

DstPort=6667 Protocol=TCP (c1=254 c2=1)

This pattern indicates an unusually large number of connections on port 6667 detected by the anomaly detector

Port 6667 is where IRC (Internet Relay Chat) is typically run Further analysis reveals that there are many small packets

fromto various IRC servers around the world Although IRC traffic is not unusual the fact that it is flagged

as anomalous is interesting This might indicate that the IRC server has been taken down (by a

DOS attack for example) or it is a rogue IRC server (it could be involved in some hacking activity)

Discovered Real-life Association Patternshellip(ctd)

52

DstPort=1863 Protocol=TCP Flag=0 NoPackets=1 NoByteslt139 (c1=498 c2=6)

DstPort=1863 Protocol=TCP Flag=0 (c1=587 c2=6)

DstPort=1863 Protocol=TCP (c1=606 c2=8)

This pattern indicates a large number of anomalous TCP connections on port 1863

Further analysis reveals that the remote IP block is owned by Hotmail

Flag=0 is unusual for TCP traffic

Discovered Real-life Association Patternshellip(ctd)

53

MINDS Conclusion Data mining based algorithms are capable of detecting intrusions that cannot be

detected by state-of-the-art signature based methods

SNORT has static knowledge manually updated by human analysts

MINDS anomaly detection algorithms are adaptive in nature

MINDS anomaly detection algorithms can also be effective in detecting anomalous behavior originating from a compromised or infected machine

MINDS Research Defining normal behavior Feature extraction Similarity functions Outlier detection Result summarization Detection of attacks

originating from multiple sites

Wormvirus detectionafter infection

Insider attack Policy violation

Outsider attack Network intrusion

54

IDS Using both Misuse and Anomaly DetectionRIDS-100

RIDS( Rising Intrusion Detection System) is provided by Rising Tech It is a leader in antivirus and content security software and services in China

The company is a leading provider of client gateway and server security solutions for virus protection firewall and intrusion detection technologies and security services to enterprises and service providers around China

RIDS make the use of both intrusion detection technique misuse and anomaly detection

Distance based outlier detection algorithm is used for detection deviational behavior among collected network data

For misuse detection it has very vast set of collected data pattern which can be matched with scanned network data for misuse detection

This large amount of data pattern is scanned using data mining classification Decision Tree algorithm

httpwwwrising-globalcom

55

A cooperative anomaly and intrusiondetection system (CAIDS)

built with a network-based intrusion detection system (NIDS) and an anomaly detection system (ADS) operating interactively through a signature generator

56

A cooperative anomaly and intrusiondetection system (CAIDS)

A frequent episode rule (FER) is generated out of a collection of frequent episodes The FER is defined over episode sequences with multiple connection events

For an example we envision a window where we observe a 3-event sequence

E D and F An FER is generated as E rarr D F confidence level freq (a U b)freq (b)=08 where a represents the event E on the LHS and b corresponds t

o the two events D and F on the RHS of the rule

If the b occurs with 5 and the joint event a and b has 4 to occur there is a (004005) = 80 chance that D and F will follow in the same window

57

A cooperative anomaly and intrusiondetection system (CAIDS)

In practice the event E could be an authentication service characterized by two attributes

(service =authentication flag=SF) The events D F may be two sequential smtp requests denoted b

y (service = smtp) Thus we can derive an FER with a confidence level of c = 80 t

hat two smtp services will follow the authentication service within a window w = 2 sec The three joint traffic events accounts with a support level s = 10 out of all the network connections being evaluated This FER is formally stated as follows

(service = authentication) rarr (service = smtp) (service = smtp) (08 01 2 sec) (1)

58

A cooperative anomaly and intrusiondetection system (CAIDS)

An association rule is aimed at finding interesting intra-relationship inside a single connection record

In general an FER is specified by the following expression

L1 L2hellip Ln R1hellip Rm (c s window) (2)

Li (1 le i le n) and Rj (1 le j le m) are ordered traffic connection events

We call L1 L2hellip Ln the LHS episode and R1hellip Rm the RHS of the episode rule

59

A cooperative anomaly and intrusiondetection system (CAIDS)

Architecture of the CAIDS simulator built with a 2000-signature Snortand an anomaly detection subsystem (ADS) with 60 FERs after 2 weeksof rule training over the Lincoln Lab IDS evaluation dataset

60

Conclusion

In this report we have studied the basic concepts and some classic system models in this area, such as ADAM and MINDS.

We summarized those system models, their technologies, and their validation methods.

We hope this gives an overview of current developments in this area and of how data mining is evolving into the field of network intrusion detection.

61

Reference

DARPA 1998 data set. A cleansed set in KDD Cup ’99. DARPA 1991 data set is also available: http://www.ll.mit.edu/IST/ideval/data/data_index.html

Daniel Barbara, Julia Couto, Sushil Jajodia, Leonard Popyack, Ningning Wu. “ADAM: Detecting Intrusions by Data Mining.” Proceedings of the 2001 IEEE Workshop on Information Assurance and Security, United States Military Academy, West Point, NY, 5-6 June 2001.

Zhang, J. and Zulkernine, M. 2006. A Hybrid Network Intrusion Detection Technique Using Random Forests. In Proceedings of the First International Conference on Availability, Reliability and Security (April 20-22, 2006).

W. Lee et al. A data mining framework for building intrusion detection models. In Information and System Security, Vol. 3, No. 4, 2000.

Ertoz, L. et al. MINDS - Minnesota Intrusion Detection System. Next Generation Data Mining, Chapter 3, 2004.

Lu, C.-T., Boedihardjo, A. P., Manalwar, P. Exploiting efficient data mining techniques to enhance intrusion detection systems. Information Reuse and Integration (IRI-2005), IEEE International Conference, 15-17 Aug. 2005, pp. 512-517.

Sal Stolfo, Andreas Prodromidis, Shelley Tselepis, Wenke Lee, Dave Fan, and Phil Chan. (Honorable mention (runner-up) for Best Paper Award in Applied Research Category.) In Proceedings of the Third International Conference on Knowledge Discovery and Data Mining (KDD 97), Newport Beach, CA, August 1997.

62

Questions & Comments

41

ADAM: Detecting Intrusion by Data Mining

42

ADAM: Audit Data Analysis and Mining

ADAM has two phases in its model:

1st Phase: Train the classifier. Offline process. Takes place only once, before the main experiment.

2nd Phase: Use the trained classifier. The trained classifier is then used to detect anomalies. Online process.
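The offline/online split can be pictured with a toy detector. This is not ADAM's actual algorithm: the class `ToyTwoPhaseDetector`, its `min_count` threshold, and the (service, flag) records are all assumptions made only to show a training step that runs once and a detection step that runs per record.

```python
from collections import Counter


class ToyTwoPhaseDetector:
    def __init__(self, min_count: int = 2):
        self.min_count = min_count
        self.normal_profile = set()

    def train(self, attack_free_records):
        """Phase 1 (offline, run once): keep records seen often in normal traffic."""
        counts = Counter(attack_free_records)
        self.normal_profile = {
            record for record, n in counts.items() if n >= self.min_count
        }

    def detect(self, record) -> bool:
        """Phase 2 (online): flag any record that is not in the normal profile."""
        return record not in self.normal_profile


detector = ToyTwoPhaseDetector(min_count=2)
detector.train([
    ("http", "SF"), ("http", "SF"),
    ("smtp", "SF"), ("smtp", "SF"),
    ("ftp", "REJ"),
])
print(detector.detect(("http", "SF")))    # False: frequent in training data
print(detector.detect(("telnet", "S0")))  # True: never seen in training data
```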

43

The MINDS Project

MINDS – MINnesota INtrusion Detection System

Learning from Rare Class – Building rare class prediction models

Anomaly/outlier detection

Summarization of attacks using association pattern analysis

TID  Items
1    Bread, Coke, Milk
2    Beer, Bread
3    Beer, Coke, Diaper, Milk
4    Beer, Bread, Diaper, Milk
5    Coke, Diaper, Milk

Rules Discovered: {Milk} --> {Coke}; {Diaper, Milk} --> {Beer}
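The two rules on this slide can be scored with the standard support and confidence measures. The sketch below uses the five transactions from the table; the helper names `count`, `support`, and `confidence` are hypothetical.

```python
transactions = [
    {"Bread", "Coke", "Milk"},
    {"Beer", "Bread"},
    {"Beer", "Coke", "Diaper", "Milk"},
    {"Beer", "Bread", "Diaper", "Milk"},
    {"Coke", "Diaper", "Milk"},
]


def count(itemset):
    """Number of transactions that contain every item in itemset."""
    return sum(1 for t in transactions if itemset <= t)


def support(itemset):
    """Fraction of all transactions that contain itemset."""
    return count(itemset) / len(transactions)


def confidence(lhs, rhs):
    """Fraction of transactions containing lhs that also contain rhs."""
    return count(lhs | rhs) / count(lhs)


for lhs, rhs in [({"Milk"}, {"Coke"}), ({"Diaper", "Milk"}, {"Beer"})]:
    print(sorted(lhs), "-->", sorted(rhs),
          f"support={support(lhs | rhs):.2f}",
          f"confidence={confidence(lhs, rhs):.2f}")
```

Milk appears in four transactions and three of those also contain Coke, so the first rule has confidence 3/4. Diaper and Milk appear together in three transactions, two of which contain Beer, giving 2/3 for the second rule.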

44

MINDS - Learning from Rare Class

Problem: Building models for rare network attacks (mining a needle in a haystack)

Standard data mining models are not suitable for rare classes

Models must be able to handle skewed class distributions

Learning from data streams - intrusions are sequences of events
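A small calculation shows why skewed class distributions are a problem for standard models. The labels below are invented for illustration (10 attacks among 1000 connections); a model that always answers "normal" looks accurate yet finds no attacks.

```python
# Hypothetical labels: 1 = attack, 0 = normal.
y_true = [1] * 10 + [0] * 990
y_pred = [0] * 1000  # a trivial model that always predicts "normal"

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
true_positives = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
recall = true_positives / sum(y_true)

print(f"accuracy={accuracy:.2%} recall={recall:.0%}")
# prints accuracy=99.00% recall=0%
```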

45

MINDS - Anomaly Detection

Detect novel attacks/intrusions by identifying them as deviations from “normal”, i.e. anomalous, behavior

Identify normal behavior

Construct useful set of features

Define similarity function

Use outlier detection algorithm

Nearest neighbor approach

Density based schemes

Unsupervised Support Vector Machines (SVM)
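The nearest neighbor approach listed above can be sketched as follows: each point is scored by its distance to its k-th nearest neighbor, so isolated points get high scores. This is a generic illustration, not the MINDS implementation; the function name, the choice of k, and the 2-D feature values are assumptions.

```python
import math


def knn_outlier_scores(points, k=2):
    """Score each point by the distance to its k-th nearest neighbor."""
    scores = []
    for i, p in enumerate(points):
        distances = sorted(
            math.dist(p, q) for j, q in enumerate(points) if j != i
        )
        scores.append(distances[k - 1])
    return scores


# Hypothetical scaled features per connection; the last point is isolated.
points = [(1.0, 1.0), (1.1, 0.9), (0.9, 1.2), (1.2, 1.1), (8.0, 9.0)]
scores = knn_outlier_scores(points, k=2)
most_anomalous = max(range(len(points)), key=scores.__getitem__)
print(most_anomalous)  # prints 4
```

The Euclidean distance used here plays the role of the similarity function from the slide; a real system would define it over the constructed connection features.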

46

Experimental Evaluation

[Figure: evaluation setup. Network → net-flow data collected using Cisco routers → data preprocessing → MINDS anomaly detection → anomaly scores → association pattern analysis. Snort, an open source signature-based network IDS (www.snort.org), also appears in the diagram. Annotations: 10-minute cycle, 2 million connections; anomaly detection is applied 4 times a day with a 10-minute time window.]

• Publicly available data set: DARPA 1998 Intrusion Detection Evaluation Data Set, prepared and managed by MIT Lincoln Lab; includes a wide variety of intrusions simulated in a military network environment

• Real network data from University of Minnesota

47

MINDS - Framework for Mining Associations

[Figure: MINDS association analysis module. The anomaly detection system passes ranked connections, labeled attack or normal, to the discriminating association pattern generator. Example rules in the figure: R1: TCP, DstPort=1863 → Attack; …; R100: TCP, DstPort=80 → Normal. The rules update the knowledge base.]

1. Build normal profile

2. Study changes in normal behavior

3. Create attack summary

4. Detect misuse behavior

5. Understand nature of the attack

48

Discovered Real-life Association Patterns

At first glance, Rule 1 appears to describe a Web scan. Rule 2 indicates an attack on a specific machine. Both rules together indicate that a scan is performed first, followed by an attack on a specific machine identified as vulnerable by the attacker.

Rule 1: SrcIP=XXXX, DstPort=80, Protocol=TCP, Flag=SYN, NoPackets: 3, NoBytes: 120…180 (c1=256, c2=1)

Rule 2: SrcIP=XXXX, DstIP=YYYY, DstPort=80, Protocol=TCP, Flag=SYN, NoPackets: 3, NoBytes: 120…180 (c1=177, c2=0)
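The slides do not define c1 and c2. Assuming c1 counts the connections matching a pattern among those ranked anomalous and c2 counts the matches among those treated as normal, the two numbers could be computed as in this sketch. The helper names and the sample connections are invented for illustration.

```python
def matches(pattern, connection):
    """True if the connection has every attribute value named in the pattern."""
    return all(connection.get(key) == value for key, value in pattern.items())


def pattern_counts(pattern, anomalous, normal):
    """Return (c1, c2) under the assumption stated in the lead-in."""
    c1 = sum(matches(pattern, conn) for conn in anomalous)
    c2 = sum(matches(pattern, conn) for conn in normal)
    return c1, c2


# Hypothetical connection records.
anomalous = [
    {"DstPort": 80, "Protocol": "TCP", "Flag": "SYN"},
    {"DstPort": 80, "Protocol": "TCP", "Flag": "SYN"},
    {"DstPort": 22, "Protocol": "TCP", "Flag": "SYN"},
]
normal = [
    {"DstPort": 80, "Protocol": "TCP", "Flag": "ACK"},
    {"DstPort": 80, "Protocol": "TCP", "Flag": "SYN"},
]
pattern = {"DstPort": 80, "Protocol": "TCP", "Flag": "SYN"}
print(pattern_counts(pattern, anomalous, normal))  # prints (2, 1)
```

Under that reading, a pattern with a large c1 and a c2 near zero, like the rules on this slide, covers mostly anomalous traffic.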

49

Discovered Real-life Association Patterns

DstIP=ZZZZ, DstPort=8888, Protocol=TCP (c1=369, c2=0)

DstIP=ZZZZ, DstPort=8888, Protocol=TCP, Flag=SYN (c1=291, c2=0)

This pattern indicates an anomalously high number of TCP connections on port 8888 involving machine ZZZZ.

Follow-up analysis of connections covered by the pattern indicates that this could be a machine running a variation of the Kazaa file-sharing protocol.

Having an unauthorized application increases the vulnerability of the system.

50

Discovered Real-life Association Patterns… (ctd)

SrcIP=XXXX, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=4, NoBytes=189…200 (c1=582, c2=2)

SrcIP=XXXX, DstPort=12345, NoPackets=4, NoBytes=189…200 (c1=580, c2=3)

SrcIP=YYYY, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=3, NoBytes=144 (c1=694, c2=3)

……

This pattern indicates a large number of scans on ports 27374 (which is a signature for the SubSeven worm) and 12345 (which is a signature for the NetBus worm).

Further analysis showed that no fewer than five machines were scanning for one or both of these ports in any time window.

51

Discovered Real-life Association Patterns… (ctd)

DstPort=6667, Protocol=TCP (c1=254, c2=1)

This pattern indicates an unusually large number of connections on port 6667 detected by the anomaly detector.

Port 6667 is where IRC (Internet Relay Chat) is typically run. Further analysis reveals that there are many small packets from/to various IRC servers around the world. Although IRC traffic is not unusual, the fact that it is flagged as anomalous is interesting. This might indicate that the IRC server has been taken down (by a DoS attack, for example) or that it is a rogue IRC server (it could be involved in some hacking activity).

52

Discovered Real-life Association Patterns… (ctd)

DstPort=1863, Protocol=TCP, Flag=0, NoPackets=1, NoBytes<139 (c1=498, c2=6)

DstPort=1863, Protocol=TCP, Flag=0 (c1=587, c2=6)

DstPort=1863, Protocol=TCP (c1=606, c2=8)

This pattern indicates a large number of anomalous TCP connections on port 1863.

Further analysis reveals that the remote IP block is owned by Hotmail.

Flag=0 is unusual for TCP traffic.

53

MINDS Conclusion

Data mining based algorithms are capable of detecting intrusions that cannot be detected by state-of-the-art signature based methods

SNORT has static knowledge manually updated by human analysts

MINDS anomaly detection algorithms are adaptive in nature

MINDS anomaly detection algorithms can also be effective in detecting anomalous behavior originating from a compromised or infected machine

MINDS Research

Defining normal behavior

Feature extraction

Similarity functions

Outlier detection

Result summarization

Detection of attacks originating from multiple sites

Worm/virus detection after infection

Insider attack: Policy violation

Outsider attack: Network intrusion

54

IDS Using both Misuse and Anomaly Detection: RIDS-100

RIDS (Rising Intrusion Detection System) is provided by Rising Tech, a leader in antivirus and content security software and services in China.

The company is a leading provider of client, gateway, and server security solutions for virus protection, firewall, and intrusion detection technologies, and of security services to enterprises and service providers around China.

RIDS makes use of both intrusion detection techniques: misuse detection and anomaly detection.

A distance-based outlier detection algorithm is used to detect deviating behavior in the collected network data.

For misuse detection, it has a very large set of collected data patterns that can be matched against scanned network data.

This large set of data patterns is scanned using a data mining classification algorithm, the Decision Tree.

http://www.rising-global.com

55

A cooperative anomaly and intrusion detection system (CAIDS)

CAIDS is built with a network-based intrusion detection system (NIDS) and an anomaly detection system (ADS) operating interactively through a signature generator.

43

The MINDS Project

MINDS ndash MINnesota INtrusion Detection System

Learning from Rare Class ndash Building rare class prediction models

Anomalyoutlier detection

Summarization of attacks using association pattern analysis

TID Items

1 Bread Coke Milk

2 Beer Bread

3 Beer Coke Diaper Milk

4 Beer Bread Diaper Milk

5 Coke Diaper Milk

Rules Discovered Milk --gt Coke Diaper Milk --gt Beer

Rules Discovered Milk --gt Coke Diaper Milk --gt Beer

44

MINDS - Learning from Rare Class

Problem Building models for rare network attacks (Mining needle in a haystack)

Standard data mining models are not suitable for rare classes

Models must be able to handle skewed class distributions

Learning from data streams - intrusions are sequences of events

45

MINDS - Anomaly Detection

Detect novel attacksintrusions by identifying them as deviations from ldquonormalrdquo ie anomalous behaviorIdentify normal behaviorConstruct useful set of featuresDefine similarity functionUse outlier detection algorithm

Nearest neighbor approach

Density based schemes

Unsupervised Support Vector Machines (SVM)

46

Experimental Evaluation

network

net-flow data using CISCO

routers

Data preprocessing

MINDSanomaly detection

helliphellip

Anomaly

scores Association pattern analysis

Open source signature-based network IDS wwwsnortor

g

10 minutes cycle

2 millions connections

Anomaly detection is applied

4 times a day

10 minutes time window

bull Publicly available data setDARPA 1998 Intrusion Detection Evaluation Data Set prepared and managed by MIT Lincoln Lab includes a wide variety of intrusions simulated in a military network environment

bull Real network data from University of Minnesota

47

MINDS - Framework for Mining Associations

Anomaly Detection System

attack

normal

R1 TCP DstPort=1863 Attack

hellip

hellip

hellip

hellip

R100 TCP DstPort=80 Normal

Discriminating Association

Pattern Generator

1 Build normal profile

2 Study changes in normal behavior

3 Create attack summary

4 Detect misuse behavior

5 Understand nature of the attack

update

Knowledge Base

Ranked connections

MINDS association analysis module

48

Discovered Real-life Association Patterns

At first glance Rule 1 appears to describe a Web scan Rule 2 indicates an attack on a specific machine Both rules together indicate that a scan is performed first

followed by an attack on a specific machine identified as vulnerable by the attacker

Rule 1 SrcIP=XXXX DstPort=80 Protocol=TCP Flag=SYN NoPackets 3 NoBytes120hellip180 (c1=256 c2 = 1)

Rule 2 SrcIP=XXXX DstIP=YYYY DstPort=80 Protocol=TCPFlag=SYN NoPackets 3 NoBytes 120hellip180 (c1=177 c2 = 0)

49

DstIP=ZZZZ DstPort=8888 Protocol=TCP (c1=369 c2=0)DstIP=ZZZZ DstPort=8888 Protocol=TCP Flag=SYN (c1=291 c2=0)

This pattern indicates an anomalously high number of TCP connections on port 8888 involving machine ZZZZ

Follow-up analysis of connections covered by the pattern indicates that this could be a machine running a variation of the Kazaa file-sharing protocol

Having an unauthorized application increases the vulnerability of the system

Discovered Real-life Association Patterns

50

SrcIP=XXXX DstPort=27374 Protocol=TCP Flag=SYN NoPackets=4 NoBytes=189hellip200 (c1=582 c2=2)

SrcIP=XXXX DstPort=12345 NoPackets=4 NoBytes=189hellip200 (c1=580 c2=3)

SrcIP=YYYY DstPort=27374 Protocol=TCP Flag=SYN NoPackets=3 NoBytes=144 (c1=694 c2=3)

helliphellip This pattern indicates a large number of scans on ports 27374 (which is a signature for the SubSeven worm) and 12345 (which is a signature for NetBus worm)

Further analysis showed that no fewer than five machines scanning for one or both of these ports in any time window

Discovered Real-life Association Patternshellip(ctd)

51

DstPort=6667 Protocol=TCP (c1=254 c2=1)

This pattern indicates an unusually large number of connections on port 6667 detected by the anomaly detector

Port 6667 is where IRC (Internet Relay Chat) is typically run Further analysis reveals that there are many small packets

fromto various IRC servers around the world Although IRC traffic is not unusual the fact that it is flagged

as anomalous is interesting This might indicate that the IRC server has been taken down (by a

DOS attack for example) or it is a rogue IRC server (it could be involved in some hacking activity)

Discovered Real-life Association Patternshellip(ctd)

52

DstPort=1863 Protocol=TCP Flag=0 NoPackets=1 NoByteslt139 (c1=498 c2=6)

DstPort=1863 Protocol=TCP Flag=0 (c1=587 c2=6)

DstPort=1863 Protocol=TCP (c1=606 c2=8)

This pattern indicates a large number of anomalous TCP connections on port 1863

Further analysis reveals that the remote IP block is owned by Hotmail

Flag=0 is unusual for TCP traffic

Discovered Real-life Association Patternshellip(ctd)

53

MINDS Conclusion Data mining based algorithms are capable of detecting intrusions that cannot be

detected by state-of-the-art signature based methods

SNORT has static knowledge manually updated by human analysts

MINDS anomaly detection algorithms are adaptive in nature

MINDS anomaly detection algorithms can also be effective in detecting anomalous behavior originating from a compromised or infected machine

MINDS Research Defining normal behavior Feature extraction Similarity functions Outlier detection Result summarization Detection of attacks

originating from multiple sites

Wormvirus detectionafter infection

Insider attack Policy violation

Outsider attack Network intrusion

54

IDS Using both Misuse and Anomaly DetectionRIDS-100

RIDS( Rising Intrusion Detection System) is provided by Rising Tech It is a leader in antivirus and content security software and services in China

The company is a leading provider of client gateway and server security solutions for virus protection firewall and intrusion detection technologies and security services to enterprises and service providers around China

RIDS make the use of both intrusion detection technique misuse and anomaly detection

Distance based outlier detection algorithm is used for detection deviational behavior among collected network data

For misuse detection it has very vast set of collected data pattern which can be matched with scanned network data for misuse detection

This large amount of data pattern is scanned using data mining classification Decision Tree algorithm

httpwwwrising-globalcom

55

A cooperative anomaly and intrusiondetection system (CAIDS)

built with a network-based intrusion detection system (NIDS) and an anomaly detection system (ADS) operating interactively through a signature generator

56

A cooperative anomaly and intrusiondetection system (CAIDS)

A frequent episode rule (FER) is generated out of a collection of frequent episodes The FER is defined over episode sequences with multiple connection events

For an example we envision a window where we observe a 3-event sequence

E D and F An FER is generated as E rarr D F confidence level freq (a U b)freq (b)=08 where a represents the event E on the LHS and b corresponds t

o the two events D and F on the RHS of the rule

If the b occurs with 5 and the joint event a and b has 4 to occur there is a (004005) = 80 chance that D and F will follow in the same window

57

A cooperative anomaly and intrusiondetection system (CAIDS)

In practice the event E could be an authentication service characterized by two attributes

(service =authentication flag=SF) The events D F may be two sequential smtp requests denoted b

y (service = smtp) Thus we can derive an FER with a confidence level of c = 80 t

hat two smtp services will follow the authentication service within a window w = 2 sec The three joint traffic events accounts with a support level s = 10 out of all the network connections being evaluated This FER is formally stated as follows

(service = authentication) rarr (service = smtp) (service = smtp) (08 01 2 sec) (1)

58

A cooperative anomaly and intrusiondetection system (CAIDS)

An association rule is aimed at finding interesting intra-relationship inside a single connection record

In general an FER is specified by the following expression

L1 L2hellip Ln R1hellip Rm (c s window) (2)

Li (1 le i le n) and Rj (1 le j le m) are ordered traffic connection events

We call L1 L2hellip Ln the LHS episode and R1hellip Rm the RHS of the episode rule

59

A cooperative anomaly and intrusiondetection system (CAIDS)

Architecture of the CAIDS simulator built with a 2000-signature Snortand an anomaly detection subsystem (ADS) with 60 FERs after 2 weeksof rule training over the Lincoln Lab IDS evaluation dataset

60

Conclusion

In this report we have studied basic concept and some classic system models like ADAM MINDSin this area

To make summary of those system models their technologies and their validation methods

Hope to a overview on currently development in this area and how data mining is evolving into the field of network intrusion detection

61

Reference

DARPA 1998 data set. A cleansed set is in KDD Cup '99. DARPA 1991 data set is also available: http://www.ll.mit.edu/IST/ideval/data/data_index.html

Daniel Barbara, Julia Couto, Sushil Jajodia, Leonard Popyack, Ningning Wu. "ADAM: Detecting Intrusions by Data Mining". Proceedings of the 2001 IEEE Workshop on Information Assurance and Security, United States Military Academy, West Point, NY, 5-6 June 2001.

Zhang, J. and Zulkernine, M. 2006. A Hybrid Network Intrusion Detection Technique Using Random Forests. In Proceedings of the First International Conference on Availability, Reliability and Security (April 20-22, 2006).

W. Lee et al. A data mining framework for building intrusion detection models. In Information and System Security, Vol. 3, No. 4, 2000.

Ertoz, L. et al. MINDS - Minnesota Intrusion Detection System. Next Generation Data Mining, Chapter 3, 2004.

Lu, C.-T., Boedihardjo, A. P., Manalwar, P. Exploiting efficient data mining techniques to enhance intrusion detection systems. Information Reuse and Integration, Conf. 2005 (IRI-2005), IEEE International Conference on, 15-17 Aug. 2005, Page(s) 512-517.

Sal Stolfo, Andreas Prodromidis, Shelley Tselepis, Wenke Lee, Dave Fan, and Phil Chan. (Honorable mention (runner-up) for Best Paper Award in Applied Research Category.) In Proceedings of the Third International Conference on Knowledge Discovery and Data Mining (KDD 97), Newport Beach, CA, August 1997.

62

Questions & Comments

44

MINDS - Learning from Rare Class

Problem: Building models for rare network attacks (mining a needle in a haystack)

Standard data mining models are not suitable for rare classes

Models must be able to handle skewed class distributions

Learning from data streams - intrusions are sequences of events
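The difficulty with skewed class distributions can be illustrated with a toy calculation. The numbers below (10 attacks among 10,000 connections) are assumed for illustration and do not come from MINDS; `accuracy_and_recall` is a hypothetical helper. A model that labels every connection as normal scores very high accuracy while detecting no attacks at all.

```python
def accuracy_and_recall(y_true, y_pred, positive=1):
    """Return (overall accuracy, recall on the positive class)."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    true_pos = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    actual_pos = sum(t == positive for t in y_true)
    accuracy = correct / len(y_true)
    recall = true_pos / actual_pos if actual_pos else 0.0
    return accuracy, recall


# Assumed toy data: label 1 = attack, label 0 = normal connection.
y_true = [1] * 10 + [0] * 9990

# A trivial "model" that predicts normal for every connection.
always_normal = [0] * len(y_true)

acc, rec = accuracy_and_recall(y_true, always_normal)
print(f"accuracy = {acc:.3f}, attack recall = {rec:.3f}")
```

This is why evaluation on rare classes usually looks at recall and precision for the attack class rather than at overall accuracy.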

45

MINDS - Anomaly Detection

Detect novel attacks/intrusions by identifying them as deviations from "normal", i.e. anomalous behavior

Identify normal behavior

Construct useful set of features

Define similarity function

Use outlier detection algorithm

Nearest neighbor approach

Density based schemes

Unsupervised Support Vector Machines (SVM)
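As a rough sketch of the nearest neighbor approach listed above, each connection can be scored by the distance to its k-th nearest neighbor, so that isolated points score highest. This is a generic illustration, not the algorithm MINDS actually uses; the two-dimensional feature vectors, the Euclidean distance as the similarity function, and the name `knn_outlier_scores` are all assumptions.

```python
import math


def knn_outlier_scores(points, k=2):
    """Score each point by the Euclidean distance to its k-th nearest neighbor.

    Requires more than k points. Larger scores mean more isolated points.
    """
    if len(points) <= k:
        raise ValueError("need more than k points")
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores


# Assumed toy feature vectors: four similar connections and one far away.
points = [(1.0, 1.0), (1.1, 0.9), (0.9, 1.2), (1.2, 1.1), (8.0, 8.0)]

scores = knn_outlier_scores(points, k=2)
most_anomalous = max(range(len(points)), key=lambda i: scores[i])
print(most_anomalous, [round(s, 2) for s in scores])
```

The brute-force loop compares every pair of points, which is fine for a sketch but would need indexing or sampling at the scale of millions of connections.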

46

Experimental Evaluation

[Figure: MINDS experimental setup. Labels: network; net-flow data collected using Cisco routers; data preprocessing; MINDS anomaly detection; anomaly scores; association pattern analysis; Snort, an open source signature-based network IDS (www.snort.org)]

10-minute cycle, 2 million connections

Anomaly detection is applied 4 times a day

10-minute time window

• Publicly available data set: DARPA 1998 Intrusion Detection Evaluation Data Set, prepared and managed by MIT Lincoln Lab; includes a wide variety of intrusions simulated in a military network environment

• Real network data from University of Minnesota

47

MINDS - Framework for Mining Associations

[Figure: MINDS association analysis module. Labels: Anomaly Detection System; ranked connections (attack, normal); Discriminating Association Pattern Generator; rules R1: TCP, DstPort=1863, Attack … R100: TCP, DstPort=80, Normal; update; Knowledge Base]

1 Build normal profile

2 Study changes in normal behavior

3 Create attack summary

4 Detect misuse behavior

5 Understand nature of the attack

48

Discovered Real-life Association Patterns

At first glance, Rule 1 appears to describe a Web scan. Rule 2 indicates an attack on a specific machine. Both rules together indicate that a scan is performed first, followed by an attack on a specific machine identified as vulnerable by the attacker.

Rule 1: SrcIP=XXXX, DstPort=80, Protocol=TCP, Flag=SYN, NoPackets: 3, NoBytes: 120…180 (c1=256, c2=1)

Rule 2: SrcIP=XXXX, DstIP=YYYY, DstPort=80, Protocol=TCP, Flag=SYN, NoPackets: 3, NoBytes: 120…180 (c1=177, c2=0)
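The two counts attached to each rule can be reproduced with a simple matching pass. The slides do not define c1 and c2; the sketch below assumes they are the numbers of connections matching the pattern in the anomalous set and in the normal set, respectively. The records, addresses, and the name `count_matches` are made up for illustration.

```python
def count_matches(pattern, connections):
    """Number of connection records whose fields equal every item in the pattern."""
    return sum(
        all(conn.get(field) == value for field, value in pattern.items())
        for conn in connections
    )


# Hypothetical pattern in the spirit of Rule 1 (source address omitted).
pattern = {"DstPort": 80, "Protocol": "TCP", "Flag": "SYN"}

# Assumed toy records: connections ranked as anomalous vs. as normal.
anomalous = [
    {"SrcIP": "10.0.0.5", "DstPort": 80, "Protocol": "TCP", "Flag": "SYN"},
    {"SrcIP": "10.0.0.5", "DstPort": 80, "Protocol": "TCP", "Flag": "SYN"},
    {"SrcIP": "10.0.0.9", "DstPort": 6667, "Protocol": "TCP", "Flag": "SYN"},
]
normal = [
    {"SrcIP": "10.0.0.7", "DstPort": 80, "Protocol": "TCP", "Flag": "SF"},
    {"SrcIP": "10.0.0.8", "DstPort": 443, "Protocol": "TCP", "Flag": "SF"},
]

c1 = count_matches(pattern, anomalous)  # matches among anomalous connections
c2 = count_matches(pattern, normal)     # matches among normal connections
print(f"c1={c1}, c2={c2}")
```

Under this reading, a pattern with a large c1 and a c2 near zero discriminates well between anomalous and normal traffic, which is the shape of the rules shown above.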

49

DstIP=ZZZZ, DstPort=8888, Protocol=TCP (c1=369, c2=0)

DstIP=ZZZZ, DstPort=8888, Protocol=TCP, Flag=SYN (c1=291, c2=0)

This pattern indicates an anomalously high number of TCP connections on port 8888 involving machine ZZZZ

Follow-up analysis of connections covered by the pattern indicates that this could be a machine running a variation of the Kazaa file-sharing protocol

Having an unauthorized application increases the vulnerability of the system

Discovered Real-life Association Patterns

50

SrcIP=XXXX, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=4, NoBytes=189…200 (c1=582, c2=2)

SrcIP=XXXX, DstPort=12345, NoPackets=4, NoBytes=189…200 (c1=580, c2=3)

SrcIP=YYYY, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=3, NoBytes=144 (c1=694, c2=3)

……

This pattern indicates a large number of scans on ports 27374 (which is a signature for the SubSeven worm) and 12345 (which is a signature for the NetBus worm)

Further analysis showed that no fewer than five machines were scanning for one or both of these ports in any time window

Discovered Real-life Association Patterns… (ctd)

51

DstPort=6667, Protocol=TCP (c1=254, c2=1)

This pattern indicates an unusually large number of connections on port 6667 detected by the anomaly detector

Port 6667 is where IRC (Internet Relay Chat) is typically run. Further analysis reveals that there are many small packets from/to various IRC servers around the world. Although IRC traffic is not unusual, the fact that it is flagged as anomalous is interesting. This might indicate that the IRC server has been taken down (by a DOS attack, for example) or that it is a rogue IRC server (it could be involved in some hacking activity).

Discovered Real-life Association Patterns… (ctd)

52

DstPort=1863, Protocol=TCP, Flag=0, NoPackets=1, NoBytes<139 (c1=498, c2=6)

DstPort=1863, Protocol=TCP, Flag=0 (c1=587, c2=6)

DstPort=1863, Protocol=TCP (c1=606, c2=8)

This pattern indicates a large number of anomalous TCP connections on port 1863

Further analysis reveals that the remote IP block is owned by Hotmail

Flag=0 is unusual for TCP traffic

Discovered Real-life Association Patterns… (ctd)

53

MINDS Conclusion

Data mining based algorithms are capable of detecting intrusions that cannot be detected by state-of-the-art signature based methods

SNORT has static knowledge manually updated by human analysts

MINDS anomaly detection algorithms are adaptive in nature

MINDS anomaly detection algorithms can also be effective in detecting anomalous behavior originating from a compromised or infected machine

MINDS Research

Defining normal behavior

Feature extraction

Similarity functions

Outlier detection

Result summarization

Detection of attacks originating from multiple sites

Worm/virus detection after infection

Insider attack: policy violation

Outsider attack: network intrusion

54

IDS Using both Misuse and Anomaly Detection: RIDS-100

RIDS (Rising Intrusion Detection System) is provided by Rising Tech, a leader in antivirus and content security software and services in China

The company is a leading provider of client, gateway, and server security solutions for virus protection, firewall, and intrusion detection technologies, and of security services to enterprises and service providers around China

RIDS makes use of both intrusion detection techniques: misuse detection and anomaly detection

A distance-based outlier detection algorithm is used to detect deviating behavior among the collected network data

For misuse detection, it has a very large set of collected data patterns that can be matched against scanned network data

This large set of data patterns is scanned using a data mining classification algorithm, the Decision Tree

http://www.rising-global.com
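The slides do not say which distance-based outlier algorithm RIDS uses. As a generic illustration only, the sketch below applies one common distance-based definition: a point is an outlier when at least a given fraction of the other points lie farther than a chosen radius from it. The function name `db_outliers`, the toy data, and both thresholds are assumptions.

```python
import math


def db_outliers(data, radius, min_fraction_far):
    """Indices of points for which at least `min_fraction_far` of the
    other points lie farther away than `radius` (Euclidean distance)."""
    outliers = []
    n_others = len(data) - 1
    if n_others <= 0:
        return outliers
    for i, p in enumerate(data):
        far = sum(
            1 for j, q in enumerate(data)
            if j != i and math.dist(p, q) > radius
        )
        if far / n_others >= min_fraction_far:
            outliers.append(i)
    return outliers


# Assumed toy feature vectors: a tight cluster and one distant record.
data = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.3, 0.3), (5.0, 5.0)]

flagged = db_outliers(data, radius=1.0, min_fraction_far=0.9)
print(flagged)
```

Both thresholds have to be tuned to the data: a radius that is too small flags everything, while one that is too large flags nothing.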

55

A cooperative anomaly and intrusion detection system (CAIDS)

built with a network-based intrusion detection system (NIDS) and an anomaly detection system (ADS) operating interactively through a signature generator




47

MINDS - Framework for Mining Associations

Anomaly Detection System

attack

normal

R1 TCP DstPort=1863 Attack

hellip

hellip

hellip

hellip

R100 TCP DstPort=80 Normal

Discriminating Association

Pattern Generator

1 Build normal profile

2 Study changes in normal behavior

3 Create attack summary

4 Detect misuse behavior

5 Understand nature of the attack

update

Knowledge Base

Ranked connections

MINDS association analysis module

48

Discovered Real-life Association Patterns

At first glance Rule 1 appears to describe a Web scan Rule 2 indicates an attack on a specific machine Both rules together indicate that a scan is performed first

followed by an attack on a specific machine identified as vulnerable by the attacker

Rule 1 SrcIP=XXXX DstPort=80 Protocol=TCP Flag=SYN NoPackets 3 NoBytes120hellip180 (c1=256 c2 = 1)

Rule 2 SrcIP=XXXX DstIP=YYYY DstPort=80 Protocol=TCPFlag=SYN NoPackets 3 NoBytes 120hellip180 (c1=177 c2 = 0)

49

DstIP=ZZZZ DstPort=8888 Protocol=TCP (c1=369 c2=0)DstIP=ZZZZ DstPort=8888 Protocol=TCP Flag=SYN (c1=291 c2=0)

This pattern indicates an anomalously high number of TCP connections on port 8888 involving machine ZZZZ

Follow-up analysis of connections covered by the pattern indicates that this could be a machine running a variation of the Kazaa file-sharing protocol

Having an unauthorized application increases the vulnerability of the system

Discovered Real-life Association Patterns

50

Discovered Real-life Association Patterns (ctd)

SrcIP=XXXX, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=4, NoBytes=189…200 (c1=582, c2=2)

SrcIP=XXXX, DstPort=12345, NoPackets=4, NoBytes=189…200 (c1=580, c2=3)

SrcIP=YYYY, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=3, NoBytes=144 (c1=694, c2=3)

……

This pattern indicates a large number of scans on ports 27374 (a signature of the SubSeven worm) and 12345 (a signature of the NetBus worm).

Further analysis showed that no fewer than five machines were scanning for one or both of these ports in any time window.
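The follow-up analysis described for ports 27374 and 12345 (how many distinct machines scan these ports in each time window) can be sketched as below. The record layout `(timestamp, src_ip, dst_port)`, the 60-second window, and the sample data are assumptions made for illustration.

```python
from collections import defaultdict

WATCHED_PORTS = {27374, 12345}


def scanners_per_window(connections, window_seconds=60, ports=WATCHED_PORTS):
    """Map window index -> number of distinct sources contacting a watched port."""
    windows = defaultdict(set)
    for ts, src_ip, dst_port in connections:
        if dst_port in ports:
            windows[int(ts // window_seconds)].add(src_ip)
    return {w: len(srcs) for w, srcs in sorted(windows.items())}


connections = [
    (0, "a", 27374), (10, "b", 12345), (20, "a", 12345), (30, "c", 80),
    (70, "a", 27374), (119, "d", 27374),
]
counts = scanners_per_window(connections)
print(counts)
```

A source that hits both ports in the same window is counted once, which matches the "one or both of these ports" wording.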

51

Discovered Real-life Association Patterns (ctd)

DstPort=6667, Protocol=TCP (c1=254, c2=1)

This pattern indicates an unusually large number of connections on port 6667 detected by the anomaly detector.

Port 6667 is where IRC (Internet Relay Chat) is typically run. Further analysis reveals that there are many small packets from/to various IRC servers around the world. Although IRC traffic is not unusual, the fact that it is flagged as anomalous is interesting. This might indicate that the IRC server has been taken down (by a DoS attack, for example) or that it is a rogue IRC server (it could be involved in some hacking activity).

52

Discovered Real-life Association Patterns (ctd)

DstPort=1863, Protocol=TCP, Flag=0, NoPackets=1, NoBytes<139 (c1=498, c2=6)

DstPort=1863, Protocol=TCP, Flag=0 (c1=587, c2=6)

DstPort=1863, Protocol=TCP (c1=606, c2=8)

This pattern indicates a large number of anomalous TCP connections on port 1863.

Further analysis reveals that the remote IP block is owned by Hotmail.

Flag=0 is unusual for TCP traffic.

53

MINDS Conclusion

Data mining based algorithms are capable of detecting intrusions that cannot be detected by state-of-the-art signature-based methods.

SNORT has static knowledge that is manually updated by human analysts.

MINDS anomaly detection algorithms are adaptive in nature.

MINDS anomaly detection algorithms can also be effective in detecting anomalous behavior originating from a compromised or infected machine.

MINDS Research

Defining normal behavior

Feature extraction

Similarity functions

Outlier detection

Result summarization

Detection of attacks originating from multiple sites

Worm/virus detection after infection

Insider attack, policy violation

Outsider attack, network intrusion

54

IDS Using Both Misuse and Anomaly Detection: RIDS-100

RIDS (Rising Intrusion Detection System) is provided by Rising Tech, a leader in antivirus and content security software and services in China.

The company is a leading provider of client, gateway, and server security solutions for virus protection, firewall, and intrusion detection technologies, and of security services to enterprises and service providers around China.

RIDS makes use of both intrusion detection techniques: misuse detection and anomaly detection.

A distance-based outlier detection algorithm is used to detect deviating behavior in the collected network data.

For misuse detection, it holds a very large set of collected data patterns that are matched against scanned network data.

This large set of data patterns is scanned using a decision tree classification algorithm from data mining.
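The slide does not say which distance-based outlier algorithm RIDS uses, so the following is only a generic sketch of the technique: a record is flagged when fewer than `min_neighbors` other records lie within `radius` of it. The feature vectors (packets, bytes), the parameter values, and the function name are hypothetical.

```python
import math


def distance_outliers(points, radius, min_neighbors):
    """Return indices of points with fewer than min_neighbors others within radius."""
    outliers = []
    for i, p in enumerate(points):
        neighbors = sum(
            1 for j, q in enumerate(points)
            if i != j and math.dist(p, q) <= radius  # Euclidean distance
        )
        if neighbors < min_neighbors:
            outliers.append(i)
    return outliers


# Toy (NoPackets, NoBytes) vectors; the last flow is far from the rest.
flows = [(3, 150), (3, 160), (4, 155), (3, 148), (40, 9000)]
flagged = distance_outliers(flows, radius=20, min_neighbors=2)
print(flagged)
```

This brute-force version compares every pair of records, so it is only suitable for small samples; real systems would typically scale the features and use an index or sampling.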

http://www.rising-global.com

55

A cooperative anomaly and intrusion detection system (CAIDS)

CAIDS is built with a network-based intrusion detection system (NIDS) and an anomaly detection system (ADS) that operate interactively through a signature generator.

56

A cooperative anomaly and intrusion detection system (CAIDS)

A frequent episode rule (FER) is generated from a collection of frequent episodes. The FER is defined over episode sequences with multiple connection events.

For example, consider a window in which we observe a 3-event sequence: E, D, and F. An FER is generated as E → D, F with confidence level freq(a ∪ b) / freq(b) = 0.8, where a represents the event E on the LHS and b corresponds to the two events D and F on the RHS of the rule.

If b occurs with 5% probability and the joint event of a and b occurs with 4% probability, there is a (0.04 / 0.05) = 80% chance that D and F will follow in the same window.
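The arithmetic in this example can be checked with a short sketch. The counts are hypothetical (100 observed windows, chosen so that 5% and 4% become whole numbers). Note that the slide divides by the frequency of b, whereas the usual definition of rule confidence divides by the frequency of the LHS episode; the helper therefore takes the denominator as a parameter instead of fixing one convention.

```python
def rule_confidence(joint_count, base_count):
    """Ratio of joint occurrences to occurrences of the conditioning episode."""
    if base_count <= 0:
        raise ValueError("base_count must be positive")
    return joint_count / base_count


total_windows = 100   # hypothetical number of observed windows
b_windows = 5         # windows containing b (5%)
joint_windows = 4     # windows containing both a and b (4%)

confidence = rule_confidence(joint_windows, b_windows)
print(f"confidence = {confidence:.0%}")
```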

57

A cooperative anomaly and intrusion detection system (CAIDS)

In practice, the event E could be an authentication service characterized by two attributes (service = authentication, flag = SF). The events D and F may be two sequential smtp requests, denoted by (service = smtp). Thus we can derive an FER with a confidence level of c = 80% that two smtp services will follow the authentication service within a window of w = 2 sec. The three joint traffic events account for a support level of s = 10% of all the network connections being evaluated. This FER is formally stated as follows:

(service = authentication) → (service = smtp), (service = smtp) (0.8, 0.1, 2 sec)    (1)
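A rough sketch of how the confidence and support of rule (1) could be measured over a time-ordered connection trace is given below. It is not the CAIDS algorithm: it handles only a single-event LHS, it takes support as joint occurrences divided by the number of connection events (one possible reading of the slide), and the trace itself is made up.

```python
def follows_in_order(events, start, rhs, window):
    """True if the services in rhs occur in order after events[start] within window seconds."""
    if not rhs:
        return True
    t0 = events[start][0]
    need = 0  # index of the next RHS service still to be seen
    for ts, service in events[start + 1:]:
        if ts - t0 > window:
            break
        if service == rhs[need]:
            need += 1
            if need == len(rhs):
                return True
    return False


def evaluate_fer(events, lhs, rhs, window):
    """Return (confidence, support) of the rule lhs -> rhs over time-sorted events."""
    lhs_hits = [i for i, (_, s) in enumerate(events) if s == lhs]
    joint = sum(follows_in_order(events, i, rhs, window) for i in lhs_hits)
    confidence = joint / len(lhs_hits) if lhs_hits else 0.0
    support = joint / len(events) if events else 0.0
    return confidence, support


# Hypothetical trace of (timestamp in seconds, service).
trace = [
    (0.0, "authentication"), (0.5, "smtp"), (1.2, "smtp"),
    (5.0, "authentication"), (5.4, "smtp"), (9.0, "smtp"), (10.0, "http"),
    (12.0, "authentication"), (12.3, "smtp"), (13.9, "smtp"),
]
confidence, support = evaluate_fer(trace, "authentication", ["smtp", "smtp"], window=2.0)
print(round(confidence, 3), support)
```

In this trace two of the three authentication events are followed by two smtp requests within 2 seconds; the second one misses because its second smtp request arrives 4 seconds later.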

58

A cooperative anomaly and intrusion detection system (CAIDS)

An association rule is aimed at finding interesting intra-relationships inside a single connection record.

In general, an FER is specified by the following expression:

L1, L2, …, Ln → R1, …, Rm (c, s, window)    (2)

Li (1 ≤ i ≤ n) and Rj (1 ≤ j ≤ m) are ordered traffic connection events.

We call L1, L2, …, Ln the LHS episode and R1, …, Rm the RHS episode of the rule.
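One way to hold the general form (2) in code is a small immutable record. The class and field names below are illustrative assumptions, not part of CAIDS; the example instance reuses rule (1).

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class EpisodeRule:
    lhs: Tuple[str, ...]   # ordered events L1..Ln
    rhs: Tuple[str, ...]   # ordered events R1..Rm
    confidence: float      # c
    support: float         # s
    window_sec: float      # window length in seconds

    def __post_init__(self):
        if not self.lhs or not self.rhs:
            raise ValueError("both episodes need at least one event")
        if not (0.0 <= self.confidence <= 1.0 and 0.0 <= self.support <= 1.0):
            raise ValueError("confidence and support must lie in [0, 1]")

    def __str__(self):
        return (f"{', '.join(self.lhs)} -> {', '.join(self.rhs)} "
                f"({self.confidence}, {self.support}, {self.window_sec} sec)")


rule = EpisodeRule(
    lhs=("service=authentication",),
    rhs=("service=smtp", "service=smtp"),
    confidence=0.8, support=0.1, window_sec=2,
)
text = str(rule)
print(text)
```

Tuples keep the event order of each episode, which matters because Li and Rj are ordered.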

59

A cooperative anomaly and intrusion detection system (CAIDS)

Architecture of the CAIDS simulator, built with a 2000-signature Snort and an anomaly detection subsystem (ADS) with 60 FERs after 2 weeks of rule training over the Lincoln Lab IDS evaluation dataset

60

Conclusion

In this report we have studied the basic concepts of this area and some classic system models, such as ADAM and MINDS.

We have summarized those system models, their technologies, and their validation methods.

We hope this gives an overview of current developments in the area and of how data mining is evolving into the field of network intrusion detection.

61

Reference

DARPA 1998 data set. A cleansed set is available as KDD Cup '99; the DARPA 1999 data set is also available. http://www.ll.mit.edu/IST/ideval/data/data_index.html

Daniel Barbara, Julia Couto, Sushil Jajodia, Leonard Popyack, Ningning Wu. "ADAM: Detecting Intrusions by Data Mining." Proceedings of the 2001 IEEE Workshop on Information Assurance and Security, United States Military Academy, West Point, NY, 5-6 June 2001.

Zhang, J. and Zulkernine, M. 2006. A Hybrid Network Intrusion Detection Technique Using Random Forests. In Proceedings of the First International Conference on Availability, Reliability and Security (April 20-22, 2006).

W. Lee et al. A data mining framework for building intrusion detection models. In Information and System Security, Vol. 3, No. 4, 2000.

Ertoz, L. et al. MINDS - Minnesota Intrusion Detection System. Next Generation Data Mining, Chapter 3, 2004.

Lu, C.-T., Boedihardjo, A. P., Manalwar, P. Exploiting efficient data mining techniques to enhance intrusion detection systems. IEEE International Conference on Information Reuse and Integration (IRI-2005), 15-17 Aug. 2005, pp. 512-517.

Sal Stolfo, Andreas Prodromidis, Shelley Tselepis, Wenke Lee, Dave Fan, and Phil Chan. (Honorable mention (runner-up) for Best Paper Award in Applied Research Category.) In Proceedings of the Third International Conference on Knowledge Discovery and Data Mining (KDD 97), Newport Beach, CA, August 1997.

62

Questions & Comments




51

DstPort=6667 Protocol=TCP (c1=254 c2=1)

This pattern indicates an unusually large number of connections on port 6667 detected by the anomaly detector

Port 6667 is where IRC (Internet Relay Chat) is typically run Further analysis reveals that there are many small packets

fromto various IRC servers around the world Although IRC traffic is not unusual the fact that it is flagged

as anomalous is interesting This might indicate that the IRC server has been taken down (by a

DOS attack for example) or it is a rogue IRC server (it could be involved in some hacking activity)

Discovered Real-life Association Patternshellip(ctd)

52

DstPort=1863 Protocol=TCP Flag=0 NoPackets=1 NoByteslt139 (c1=498 c2=6)

DstPort=1863 Protocol=TCP Flag=0 (c1=587 c2=6)

DstPort=1863 Protocol=TCP (c1=606 c2=8)

This pattern indicates a large number of anomalous TCP connections on port 1863

Further analysis reveals that the remote IP block is owned by Hotmail

Flag=0 is unusual for TCP traffic

Discovered Real-life Association Patternshellip(ctd)

53

MINDS Conclusion Data mining based algorithms are capable of detecting intrusions that cannot be

detected by state-of-the-art signature based methods

SNORT has static knowledge manually updated by human analysts

MINDS anomaly detection algorithms are adaptive in nature

MINDS anomaly detection algorithms can also be effective in detecting anomalous behavior originating from a compromised or infected machine

MINDS Research Defining normal behavior Feature extraction Similarity functions Outlier detection Result summarization Detection of attacks

originating from multiple sites

Wormvirus detectionafter infection

Insider attack Policy violation

Outsider attack Network intrusion

54

IDS Using both Misuse and Anomaly Detection: RIDS-100

RIDS (Rising Intrusion Detection System) is provided by Rising Tech, a leader in antivirus and content security software and services in China

The company is a leading provider of client, gateway and server security solutions for virus protection, firewall and intrusion detection technologies, and of security services to enterprises and service providers around China

RIDS makes use of both intrusion detection techniques: misuse detection and anomaly detection

A distance-based outlier detection algorithm is used to detect deviating behavior among the collected network data

For misuse detection, it holds a very large set of collected data patterns that are matched against the scanned network data

This large set of data patterns is scanned using a data mining classification algorithm, the decision tree

http://www.rising-global.com
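The slide does not say which distance-based outlier algorithm RIDS uses, so the following is only a minimal sketch of the general idea. It assumes one common scheme: score every record by the distance to its k-th nearest neighbour and flag records whose score exceeds a threshold. The feature tuples, `k` and `threshold` are hypothetical values chosen for illustration.

```python
import math


def kth_neighbor_distance(points, k):
    """For each point, return the Euclidean distance to its k-th nearest neighbour."""
    scores = []
    for i, p in enumerate(points):
        # Distances from p to every other point, smallest first.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores


def distance_outliers(points, k, threshold):
    """Return indices of points whose k-th nearest neighbour is farther than threshold."""
    scores = kth_neighbor_distance(points, k)
    return [i for i, s in enumerate(scores) if s > threshold]


# Hypothetical flow features: (number of packets, bytes / 100).
flows = [(1, 1.0), (1, 1.2), (2, 1.1), (1, 0.9), (40, 55.0)]
outliers = distance_outliers(flows, k=2, threshold=5.0)
print(outliers)
```

The pairwise comparison is quadratic in the number of records, which is acceptable for a sketch but would need indexing or sampling on real traffic volumes.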

55

A cooperative anomaly and intrusion detection system (CAIDS)

built with a network-based intrusion detection system (NIDS) and an anomaly detection system (ADS) operating interactively through a signature generator

56

A cooperative anomaly and intrusion detection system (CAIDS)

A frequent episode rule (FER) is generated out of a collection of frequent episodes. The FER is defined over episode sequences with multiple connection events

For example, we envision a window where we observe a 3-event sequence: E, D and F. An FER is generated as E → D, F with confidence level freq(a ∪ b) / freq(a) = 0.8, where a represents the event E on the LHS and b corresponds to the two events D and F on the RHS of the rule

If a occurs with a frequency of 5% and the joint event of a and b with 4%, there is a 0.04 / 0.05 = 80% chance that D and F will follow E in the same window
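The arithmetic in this example can be checked with a few lines of Python. This is a minimal sketch that assumes the frequencies are counted over 100 observation windows: 5 contain the event E, and in 4 of those E is followed by D and F. The function name and the window counts are illustrative, not taken from the CAIDS paper.

```python
def rule_confidence(lhs_count, joint_count):
    """Confidence of a rule: windows matching LHS and RHS divided by windows matching LHS."""
    if lhs_count <= 0:
        raise ValueError("the LHS must occur at least once")
    return joint_count / lhs_count


total_windows = 100   # assumed number of observed windows
windows_with_e = 5    # E occurs in 5% of the windows
windows_with_edf = 4  # E followed by D and F occurs in 4% of the windows

confidence = rule_confidence(windows_with_e, windows_with_edf)
joint_frequency = windows_with_edf / total_windows
print(confidence, joint_frequency)
```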

57

A cooperative anomaly and intrusion detection system (CAIDS)

In practice, the event E could be an authentication service characterized by two attributes (service = authentication, flag = SF). The events D, F may be two sequential smtp requests, denoted by (service = smtp)

Thus we can derive an FER with a confidence level of c = 80% that two smtp services will follow the authentication service within a window w = 2 sec. The three joint traffic events account for a support level s = 10% out of all the network connections being evaluated. This FER is formally stated as follows:

(service = authentication) → (service = smtp), (service = smtp) (0.8, 0.1, 2 sec) (1)
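To illustrate how the confidence and support of such a rule could be estimated from a connection trace, here is a rough sketch. It is not the CAIDS mining algorithm. It assumes a time-sorted list of (timestamp, service) events, counts each LHS event as one window start, and approximates support as the share of all events that take part in a matched episode, ignoring overlaps. The trace itself is made up.

```python
def fer_stats(events, lhs, rhs, window):
    """Estimate (confidence, support) of the rule lhs -> rhs within `window` seconds.

    events: list of (timestamp_seconds, service), sorted by timestamp.
    lhs: a single service name; rhs: non-empty ordered list of service names.
    """
    lhs_count = 0
    match_count = 0
    for i, (t0, service) in enumerate(events):
        if service != lhs:
            continue
        lhs_count += 1
        matched = 0  # how many RHS events have been seen so far, in order
        for t, s in events[i + 1:]:
            if t - t0 > window:
                break  # left the window without completing the RHS
            if s == rhs[matched]:
                matched += 1
                if matched == len(rhs):
                    match_count += 1
                    break
    confidence = match_count / lhs_count if lhs_count else 0.0
    # Assumption: every match contributes 1 + len(rhs) events to the support.
    support = match_count * (1 + len(rhs)) / len(events) if events else 0.0
    return confidence, support


# Hypothetical trace: the second authentication is not followed by two
# smtp requests within 2 seconds.
trace = [
    (0.0, "authentication"), (0.5, "smtp"), (1.2, "smtp"),
    (10.0, "authentication"), (10.4, "smtp"), (13.0, "smtp"),
    (20.0, "http"),
    (30.0, "authentication"), (30.1, "smtp"), (31.9, "smtp"),
]
confidence, support = fer_stats(trace, "authentication", ["smtp", "smtp"], window=2.0)
print(confidence, support)
```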

58

A cooperative anomaly and intrusion detection system (CAIDS)

An association rule is aimed at finding interesting intra-relationships inside a single connection record

In general, an FER is specified by the following expression:

L1, L2, …, Ln → R1, …, Rm (c, s, window) (2)

Li (1 ≤ i ≤ n) and Rj (1 ≤ j ≤ m) are ordered traffic connection events

We call L1, L2, …, Ln the LHS episode and R1, …, Rm the RHS episode of the rule
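One way to hold a rule of form (2) in code is a small record type. The sketch below is an assumption about a convenient representation, not a structure taken from CAIDS; the class and field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class EpisodeRule:
    """A frequent episode rule L1, ..., Ln -> R1, ..., Rm (c, s, window)."""
    lhs: Tuple[str, ...]   # ordered LHS events L1..Ln
    rhs: Tuple[str, ...]   # ordered RHS events R1..Rm
    confidence: float      # c, in [0, 1]
    support: float         # s, in [0, 1]
    window_sec: float      # window size in seconds

    def __post_init__(self):
        if not self.lhs or not self.rhs:
            raise ValueError("both LHS and RHS episodes need at least one event")
        if not (0.0 <= self.confidence <= 1.0 and 0.0 <= self.support <= 1.0):
            raise ValueError("confidence and support must lie in [0, 1]")
        if self.window_sec <= 0:
            raise ValueError("window must be positive")

    def __str__(self):
        return (f"{', '.join(self.lhs)} -> {', '.join(self.rhs)} "
                f"({self.confidence}, {self.support}, {self.window_sec} sec)")


# Rule (1) from the previous slide, written in this representation.
rule1 = EpisodeRule(
    lhs=("service=authentication",),
    rhs=("service=smtp", "service=smtp"),
    confidence=0.8,
    support=0.1,
    window_sec=2,
)
text = str(rule1)
print(text)
```

Keeping the LHS and RHS as ordered tuples preserves the event order that distinguishes an episode rule from a plain association rule.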

59

A cooperative anomaly and intrusion detection system (CAIDS)

Architecture of the CAIDS simulator, built with a 2000-signature Snort and an anomaly detection subsystem (ADS) with 60 FERs after 2 weeks of rule training over the Lincoln Lab IDS evaluation dataset

60

Conclusion

In this report we have studied the basic concepts and some classic system models in this area, such as ADAM and MINDS

We summarized those system models, their technologies and their validation methods

We hope to have given an overview of current developments in this area and of how data mining is evolving into the field of network intrusion detection

61

Reference

DARPA 1998 data set

A cleansed set in KDD Cup '99; DARPA 1991 data set is also available: http://www.ll.mit.edu/IST/ideval/data/data_index.html

Daniel Barbara, Julia Couto, Sushil Jajodia, Leonard Popyack, Ningning Wu. "ADAM: Detecting Intrusions by Data Mining". Proceedings of the 2001 IEEE Workshop on Information Assurance and Security, United States Military Academy, West Point, NY, 5-6 June 2001.

Zhang, J. and Zulkernine, M. 2006. A Hybrid Network Intrusion Detection Technique Using Random Forests. In Proceedings of the First International Conference on Availability, Reliability and Security (April 20-22, 2006).

W. Lee et al. A data mining framework for building intrusion detection models. In Information and System Security, Vol. 3, No. 4, 2000.

Ertoz, L. et al. MINDS - Minnesota Intrusion Detection System. Next Generation Data Mining, Chapter 3, 2004.

Lu, C.-T., Boedihardjo, A. P., Manalwar, P. Exploiting efficient data mining techniques to enhance intrusion detection systems. IEEE International Conference on Information Reuse and Integration (IRI-2005), 15-17 Aug. 2005, pp. 512-517.

Sal Stolfo, Andreas Prodromidis, Shelley Tselepis, Wenke Lee, Dave Fan, and Phil Chan. (Honorable mention (runner-up) for Best Paper Award in Applied Research Category.) In Proceedings of the Third International Conference on Knowledge Discovery and Data Mining (KDD 97), Newport Beach, CA, August 1997.

62

Questions & Comments

52

DstPort=1863 Protocol=TCP Flag=0 NoPackets=1 NoByteslt139 (c1=498 c2=6)

DstPort=1863 Protocol=TCP Flag=0 (c1=587 c2=6)

DstPort=1863 Protocol=TCP (c1=606 c2=8)

This pattern indicates a large number of anomalous TCP connections on port 1863

Further analysis reveals that the remote IP block is owned by Hotmail

Flag=0 is unusual for TCP traffic

Discovered Real-life Association Patternshellip(ctd)

53

MINDS Conclusion Data mining based algorithms are capable of detecting intrusions that cannot be

detected by state-of-the-art signature based methods

SNORT has static knowledge manually updated by human analysts

MINDS anomaly detection algorithms are adaptive in nature

MINDS anomaly detection algorithms can also be effective in detecting anomalous behavior originating from a compromised or infected machine

MINDS Research Defining normal behavior Feature extraction Similarity functions Outlier detection Result summarization Detection of attacks

originating from multiple sites

Wormvirus detectionafter infection

Insider attack Policy violation

Outsider attack Network intrusion

54

IDS Using both Misuse and Anomaly DetectionRIDS-100

RIDS( Rising Intrusion Detection System) is provided by Rising Tech It is a leader in antivirus and content security software and services in China

The company is a leading provider of client gateway and server security solutions for virus protection firewall and intrusion detection technologies and security services to enterprises and service providers around China

RIDS make the use of both intrusion detection technique misuse and anomaly detection

Distance based outlier detection algorithm is used for detection deviational behavior among collected network data

For misuse detection it has very vast set of collected data pattern which can be matched with scanned network data for misuse detection

This large amount of data pattern is scanned using data mining classification Decision Tree algorithm

httpwwwrising-globalcom

55

A cooperative anomaly and intrusiondetection system (CAIDS)

built with a network-based intrusion detection system (NIDS) and an anomaly detection system (ADS) operating interactively through a signature generator

56

A cooperative anomaly and intrusiondetection system (CAIDS)

A frequent episode rule (FER) is generated out of a collection of frequent episodes The FER is defined over episode sequences with multiple connection events

For an example we envision a window where we observe a 3-event sequence

E D and F An FER is generated as E rarr D F confidence level freq (a U b)freq (b)=08 where a represents the event E on the LHS and b corresponds t

o the two events D and F on the RHS of the rule

If the b occurs with 5 and the joint event a and b has 4 to occur there is a (004005) = 80 chance that D and F will follow in the same window

57

A cooperative anomaly and intrusiondetection system (CAIDS)

In practice the event E could be an authentication service characterized by two attributes

(service =authentication flag=SF) The events D F may be two sequential smtp requests denoted b

y (service = smtp) Thus we can derive an FER with a confidence level of c = 80 t

hat two smtp services will follow the authentication service within a window w = 2 sec The three joint traffic events accounts with a support level s = 10 out of all the network connections being evaluated This FER is formally stated as follows

(service = authentication) rarr (service = smtp) (service = smtp) (08 01 2 sec) (1)

58

A cooperative anomaly and intrusiondetection system (CAIDS)

An association rule is aimed at finding interesting intra-relationship inside a single connection record

In general an FER is specified by the following expression

L1 L2hellip Ln R1hellip Rm (c s window) (2)

Li (1 le i le n) and Rj (1 le j le m) are ordered traffic connection events

We call L1 L2hellip Ln the LHS episode and R1hellip Rm the RHS of the episode rule

59

A cooperative anomaly and intrusiondetection system (CAIDS)

Architecture of the CAIDS simulator built with a 2000-signature Snortand an anomaly detection subsystem (ADS) with 60 FERs after 2 weeksof rule training over the Lincoln Lab IDS evaluation dataset

60

Conclusion

In this report we have studied basic concept and some classic system models like ADAM MINDSin this area

To make summary of those system models their technologies and their validation methods

Hope to a overview on currently development in this area and how data mining is evolving into the field of network intrusion detection

61

Reference DARPA 1998 data set

A cleansed set in KDDCuprsquo99 DARPA 1991 data set is also available httpwwwllmiteduISTidevaldatadata_indexhtml

Daniel Barbara Julia Couto Sushil Jajodia Leonard Popyack Ningning Wu ldquoADAM Detecting Intrusions by Data Miningrdquo Proceedings of the 2001 IEEE Workshop on Information Assurance and Security United States Military Academy West Point NY 5-6 June 2001

Zhang J and Zulkernine M 2006 A Hybrid Network Intrusion Detection Technique Using Random Forests In Proceedings of the First international Conference on Availability Reliability and Security (April 20 - 22 2006)

W Lee et al A data mining framework for building intrusion detection models In Information and System Security Vol 3 No 4 2000

Ertoz L et Al MINDS - Minnesota Intrusion Detection System Next Generation Data Mining Chapter 3 2004

Exploiting efficient data mining techniques to enhance intrusion detection systems Lu C-T Boedihardjo AP Manalwar P Information Reuse and Integration Conf 2005 IRI -2005 IEEE International Conference on Volume Issue 15-17 Aug 2005 Page(s) 512 - 517

Sal Stolfo Andreas Prodromidis Shelley Tselepis Wenke Lee Dave Fan and Phil Chan (Honorable mention (runner-up) for Best Paper Award in Applied Research Category) In Proceedings of the Third International Conference on Knowledge Discovery and Data Mining (KDD 97) Newport Beach CA August 1997

62

Questions amp Comments

53

MINDS Conclusion Data mining based algorithms are capable of detecting intrusions that cannot be

detected by state-of-the-art signature based methods

SNORT has static knowledge manually updated by human analysts

MINDS anomaly detection algorithms are adaptive in nature

MINDS anomaly detection algorithms can also be effective in detecting anomalous behavior originating from a compromised or infected machine

MINDS Research Defining normal behavior Feature extraction Similarity functions Outlier detection Result summarization Detection of attacks

originating from multiple sites

Wormvirus detectionafter infection

Insider attack Policy violation

Outsider attack Network intrusion

54

IDS Using both Misuse and Anomaly DetectionRIDS-100

RIDS( Rising Intrusion Detection System) is provided by Rising Tech It is a leader in antivirus and content security software and services in China

The company is a leading provider of client gateway and server security solutions for virus protection firewall and intrusion detection technologies and security services to enterprises and service providers around China

RIDS make the use of both intrusion detection technique misuse and anomaly detection

Distance based outlier detection algorithm is used for detection deviational behavior among collected network data

For misuse detection it has very vast set of collected data pattern which can be matched with scanned network data for misuse detection

This large amount of data pattern is scanned using data mining classification Decision Tree algorithm

httpwwwrising-globalcom

55

A cooperative anomaly and intrusiondetection system (CAIDS)

built with a network-based intrusion detection system (NIDS) and an anomaly detection system (ADS) operating interactively through a signature generator

56

A cooperative anomaly and intrusiondetection system (CAIDS)

A frequent episode rule (FER) is generated out of a collection of frequent episodes The FER is defined over episode sequences with multiple connection events

For an example we envision a window where we observe a 3-event sequence

E D and F An FER is generated as E rarr D F confidence level freq (a U b)freq (b)=08 where a represents the event E on the LHS and b corresponds t

o the two events D and F on the RHS of the rule

If the b occurs with 5 and the joint event a and b has 4 to occur there is a (004005) = 80 chance that D and F will follow in the same window

57

A cooperative anomaly and intrusiondetection system (CAIDS)

In practice the event E could be an authentication service characterized by two attributes

(service =authentication flag=SF) The events D F may be two sequential smtp requests denoted b

y (service = smtp) Thus we can derive an FER with a confidence level of c = 80 t

hat two smtp services will follow the authentication service within a window w = 2 sec The three joint traffic events accounts with a support level s = 10 out of all the network connections being evaluated This FER is formally stated as follows

(service = authentication) rarr (service = smtp) (service = smtp) (08 01 2 sec) (1)

58

A cooperative anomaly and intrusiondetection system (CAIDS)

An association rule is aimed at finding interesting intra-relationship inside a single connection record

In general an FER is specified by the following expression

L1 L2hellip Ln R1hellip Rm (c s window) (2)

Li (1 le i le n) and Rj (1 le j le m) are ordered traffic connection events

We call L1 L2hellip Ln the LHS episode and R1hellip Rm the RHS of the episode rule

59

A cooperative anomaly and intrusiondetection system (CAIDS)

Architecture of the CAIDS simulator built with a 2000-signature Snortand an anomaly detection subsystem (ADS) with 60 FERs after 2 weeksof rule training over the Lincoln Lab IDS evaluation dataset

60

Conclusion

In this report we have studied basic concept and some classic system models like ADAM MINDSin this area

To make summary of those system models their technologies and their validation methods

Hope to a overview on currently development in this area and how data mining is evolving into the field of network intrusion detection

61

Reference DARPA 1998 data set

A cleansed set in KDDCuprsquo99 DARPA 1991 data set is also available httpwwwllmiteduISTidevaldatadata_indexhtml

Daniel Barbara Julia Couto Sushil Jajodia Leonard Popyack Ningning Wu ldquoADAM Detecting Intrusions by Data Miningrdquo Proceedings of the 2001 IEEE Workshop on Information Assurance and Security United States Military Academy West Point NY 5-6 June 2001

Zhang J and Zulkernine M 2006 A Hybrid Network Intrusion Detection Technique Using Random Forests In Proceedings of the First international Conference on Availability Reliability and Security (April 20 - 22 2006)

W Lee et al A data mining framework for building intrusion detection models In Information and System Security Vol 3 No 4 2000

Ertoz L et Al MINDS - Minnesota Intrusion Detection System Next Generation Data Mining Chapter 3 2004

Exploiting efficient data mining techniques to enhance intrusion detection systems Lu C-T Boedihardjo AP Manalwar P Information Reuse and Integration Conf 2005 IRI -2005 IEEE International Conference on Volume Issue 15-17 Aug 2005 Page(s) 512 - 517

Sal Stolfo Andreas Prodromidis Shelley Tselepis Wenke Lee Dave Fan and Phil Chan (Honorable mention (runner-up) for Best Paper Award in Applied Research Category) In Proceedings of the Third International Conference on Knowledge Discovery and Data Mining (KDD 97) Newport Beach CA August 1997

62

Questions amp Comments

54

IDS Using both Misuse and Anomaly DetectionRIDS-100

RIDS( Rising Intrusion Detection System) is provided by Rising Tech It is a leader in antivirus and content security software and services in China

The company is a leading provider of client gateway and server security solutions for virus protection firewall and intrusion detection technologies and security services to enterprises and service providers around China

RIDS make the use of both intrusion detection technique misuse and anomaly detection

Distance based outlier detection algorithm is used for detection deviational behavior among collected network data

For misuse detection it has very vast set of collected data pattern which can be matched with scanned network data for misuse detection

This large amount of data pattern is scanned using data mining classification Decision Tree algorithm

httpwwwrising-globalcom

55

A cooperative anomaly and intrusiondetection system (CAIDS)

built with a network-based intrusion detection system (NIDS) and an anomaly detection system (ADS) operating interactively through a signature generator

56

A cooperative anomaly and intrusiondetection system (CAIDS)

A frequent episode rule (FER) is generated out of a collection of frequent episodes The FER is defined over episode sequences with multiple connection events

For an example we envision a window where we observe a 3-event sequence

E D and F An FER is generated as E rarr D F confidence level freq (a U b)freq (b)=08 where a represents the event E on the LHS and b corresponds t

o the two events D and F on the RHS of the rule

If the b occurs with 5 and the joint event a and b has 4 to occur there is a (004005) = 80 chance that D and F will follow in the same window

57

A cooperative anomaly and intrusiondetection system (CAIDS)

In practice the event E could be an authentication service characterized by two attributes

(service =authentication flag=SF) The events D F may be two sequential smtp requests denoted b

y (service = smtp) Thus we can derive an FER with a confidence level of c = 80 t

hat two smtp services will follow the authentication service within a window w = 2 sec The three joint traffic events accounts with a support level s = 10 out of all the network connections being evaluated This FER is formally stated as follows

(service = authentication) rarr (service = smtp) (service = smtp) (08 01 2 sec) (1)

58

A cooperative anomaly and intrusiondetection system (CAIDS)

An association rule is aimed at finding interesting intra-relationship inside a single connection record

In general an FER is specified by the following expression

L1 L2hellip Ln R1hellip Rm (c s window) (2)

Li (1 le i le n) and Rj (1 le j le m) are ordered traffic connection events

We call L1 L2hellip Ln the LHS episode and R1hellip Rm the RHS of the episode rule

59

A cooperative anomaly and intrusiondetection system (CAIDS)

Architecture of the CAIDS simulator built with a 2000-signature Snortand an anomaly detection subsystem (ADS) with 60 FERs after 2 weeksof rule training over the Lincoln Lab IDS evaluation dataset

60

Conclusion

In this report we have studied basic concept and some classic system models like ADAM MINDSin this area

To make summary of those system models their technologies and their validation methods

Hope to a overview on currently development in this area and how data mining is evolving into the field of network intrusion detection

61

Reference DARPA 1998 data set

A cleansed set in KDDCuprsquo99 DARPA 1991 data set is also available httpwwwllmiteduISTidevaldatadata_indexhtml

Daniel Barbara Julia Couto Sushil Jajodia Leonard Popyack Ningning Wu ldquoADAM Detecting Intrusions by Data Miningrdquo Proceedings of the 2001 IEEE Workshop on Information Assurance and Security United States Military Academy West Point NY 5-6 June 2001

Zhang J and Zulkernine M 2006 A Hybrid Network Intrusion Detection Technique Using Random Forests In Proceedings of the First international Conference on Availability Reliability and Security (April 20 - 22 2006)

W Lee et al A data mining framework for building intrusion detection models In Information and System Security Vol 3 No 4 2000

Ertoz L et Al MINDS - Minnesota Intrusion Detection System Next Generation Data Mining Chapter 3 2004

Exploiting efficient data mining techniques to enhance intrusion detection systems Lu C-T Boedihardjo AP Manalwar P Information Reuse and Integration Conf 2005 IRI -2005 IEEE International Conference on Volume Issue 15-17 Aug 2005 Page(s) 512 - 517

Sal Stolfo Andreas Prodromidis Shelley Tselepis Wenke Lee Dave Fan and Phil Chan (Honorable mention (runner-up) for Best Paper Award in Applied Research Category) In Proceedings of the Third International Conference on Knowledge Discovery and Data Mining (KDD 97) Newport Beach CA August 1997

62

Questions amp Comments

55

A cooperative anomaly and intrusiondetection system (CAIDS)

built with a network-based intrusion detection system (NIDS) and an anomaly detection system (ADS) operating interactively through a signature generator

56

A cooperative anomaly and intrusiondetection system (CAIDS)

A frequent episode rule (FER) is generated out of a collection of frequent episodes The FER is defined over episode sequences with multiple connection events

For an example we envision a window where we observe a 3-event sequence

E D and F An FER is generated as E rarr D F confidence level freq (a U b)freq (b)=08 where a represents the event E on the LHS and b corresponds t

o the two events D and F on the RHS of the rule

If the b occurs with 5 and the joint event a and b has 4 to occur there is a (004005) = 80 chance that D and F will follow in the same window

57

A cooperative anomaly and intrusiondetection system (CAIDS)

In practice the event E could be an authentication service characterized by two attributes

(service =authentication flag=SF) The events D F may be two sequential smtp requests denoted b

y (service = smtp) Thus we can derive an FER with a confidence level of c = 80 t

hat two smtp services will follow the authentication service within a window w = 2 sec The three joint traffic events accounts with a support level s = 10 out of all the network connections being evaluated This FER is formally stated as follows

(service = authentication) rarr (service = smtp) (service = smtp) (08 01 2 sec) (1)

58

A cooperative anomaly and intrusiondetection system (CAIDS)

An association rule is aimed at finding interesting intra-relationship inside a single connection record

In general an FER is specified by the following expression

L1 L2hellip Ln R1hellip Rm (c s window) (2)

Li (1 le i le n) and Rj (1 le j le m) are ordered traffic connection events

We call L1 L2hellip Ln the LHS episode and R1hellip Rm the RHS of the episode rule

59

A cooperative anomaly and intrusiondetection system (CAIDS)

Architecture of the CAIDS simulator built with a 2000-signature Snortand an anomaly detection subsystem (ADS) with 60 FERs after 2 weeksof rule training over the Lincoln Lab IDS evaluation dataset

60

Conclusion

In this report we have studied basic concept and some classic system models like ADAM MINDSin this area

To make summary of those system models their technologies and their validation methods

Hope to a overview on currently development in this area and how data mining is evolving into the field of network intrusion detection

61

Reference DARPA 1998 data set

A cleansed set in KDDCuprsquo99 DARPA 1991 data set is also available httpwwwllmiteduISTidevaldatadata_indexhtml

Daniel Barbara Julia Couto Sushil Jajodia Leonard Popyack Ningning Wu ldquoADAM Detecting Intrusions by Data Miningrdquo Proceedings of the 2001 IEEE Workshop on Information Assurance and Security United States Military Academy West Point NY 5-6 June 2001

Zhang J and Zulkernine M 2006 A Hybrid Network Intrusion Detection Technique Using Random Forests In Proceedings of the First international Conference on Availability Reliability and Security (April 20 - 22 2006)

W Lee et al A data mining framework for building intrusion detection models In Information and System Security Vol 3 No 4 2000

Ertoz L et Al MINDS - Minnesota Intrusion Detection System Next Generation Data Mining Chapter 3 2004

Exploiting efficient data mining techniques to enhance intrusion detection systems Lu C-T Boedihardjo AP Manalwar P Information Reuse and Integration Conf 2005 IRI -2005 IEEE International Conference on Volume Issue 15-17 Aug 2005 Page(s) 512 - 517

Sal Stolfo Andreas Prodromidis Shelley Tselepis Wenke Lee Dave Fan and Phil Chan (Honorable mention (runner-up) for Best Paper Award in Applied Research Category) In Proceedings of the Third International Conference on Knowledge Discovery and Data Mining (KDD 97) Newport Beach CA August 1997

62

Questions amp Comments

56

A cooperative anomaly and intrusiondetection system (CAIDS)

A frequent episode rule (FER) is generated out of a collection of frequent episodes The FER is defined over episode sequences with multiple connection events

For an example we envision a window where we observe a 3-event sequence

E D and F An FER is generated as E rarr D F confidence level freq (a U b)freq (b)=08 where a represents the event E on the LHS and b corresponds t

o the two events D and F on the RHS of the rule

If the b occurs with 5 and the joint event a and b has 4 to occur there is a (004005) = 80 chance that D and F will follow in the same window

57

A cooperative anomaly and intrusiondetection system (CAIDS)

In practice the event E could be an authentication service characterized by two attributes

(service =authentication flag=SF) The events D F may be two sequential smtp requests denoted b

y (service = smtp) Thus we can derive an FER with a confidence level of c = 80 t

hat two smtp services will follow the authentication service within a window w = 2 sec The three joint traffic events accounts with a support level s = 10 out of all the network connections being evaluated This FER is formally stated as follows

(service = authentication) rarr (service = smtp) (service = smtp) (08 01 2 sec) (1)

58

A cooperative anomaly and intrusiondetection system (CAIDS)

An association rule is aimed at finding interesting intra-relationship inside a single connection record

In general an FER is specified by the following expression

L1 L2hellip Ln R1hellip Rm (c s window) (2)

Li (1 le i le n) and Rj (1 le j le m) are ordered traffic connection events

We call L1 L2hellip Ln the LHS episode and R1hellip Rm the RHS of the episode rule

59

A cooperative anomaly and intrusiondetection system (CAIDS)

Architecture of the CAIDS simulator built with a 2000-signature Snortand an anomaly detection subsystem (ADS) with 60 FERs after 2 weeksof rule training over the Lincoln Lab IDS evaluation dataset

60

Conclusion

In this report we have studied basic concept and some classic system models like ADAM MINDSin this area

To make summary of those system models their technologies and their validation methods

Hope to a overview on currently development in this area and how data mining is evolving into the field of network intrusion detection

61

Reference DARPA 1998 data set

A cleansed set in KDDCuprsquo99 DARPA 1991 data set is also available httpwwwllmiteduISTidevaldatadata_indexhtml

Daniel Barbara Julia Couto Sushil Jajodia Leonard Popyack Ningning Wu ldquoADAM Detecting Intrusions by Data Miningrdquo Proceedings of the 2001 IEEE Workshop on Information Assurance and Security United States Military Academy West Point NY 5-6 June 2001

Zhang J and Zulkernine M 2006 A Hybrid Network Intrusion Detection Technique Using Random Forests In Proceedings of the First international Conference on Availability Reliability and Security (April 20 - 22 2006)

W Lee et al A data mining framework for building intrusion detection models In Information and System Security Vol 3 No 4 2000

Ertoz L et Al MINDS - Minnesota Intrusion Detection System Next Generation Data Mining Chapter 3 2004

Exploiting efficient data mining techniques to enhance intrusion detection systems Lu C-T Boedihardjo AP Manalwar P Information Reuse and Integration Conf 2005 IRI -2005 IEEE International Conference on Volume Issue 15-17 Aug 2005 Page(s) 512 - 517

Sal Stolfo Andreas Prodromidis Shelley Tselepis Wenke Lee Dave Fan and Phil Chan (Honorable mention (runner-up) for Best Paper Award in Applied Research Category) In Proceedings of the Third International Conference on Knowledge Discovery and Data Mining (KDD 97) Newport Beach CA August 1997

62

Questions amp Comments

57

A cooperative anomaly and intrusiondetection system (CAIDS)

In practice the event E could be an authentication service characterized by two attributes

(service =authentication flag=SF) The events D F may be two sequential smtp requests denoted b

y (service = smtp) Thus we can derive an FER with a confidence level of c = 80 t

hat two smtp services will follow the authentication service within a window w = 2 sec The three joint traffic events accounts with a support level s = 10 out of all the network connections being evaluated This FER is formally stated as follows

(service = authentication) rarr (service = smtp) (service = smtp) (08 01 2 sec) (1)

58

A cooperative anomaly and intrusiondetection system (CAIDS)

An association rule is aimed at finding interesting intra-relationship inside a single connection record

In general an FER is specified by the following expression

L1 L2hellip Ln R1hellip Rm (c s window) (2)

Li (1 le i le n) and Rj (1 le j le m) are ordered traffic connection events

We call L1 L2hellip Ln the LHS episode and R1hellip Rm the RHS of the episode rule

59

A cooperative anomaly and intrusiondetection system (CAIDS)

Architecture of the CAIDS simulator built with a 2000-signature Snortand an anomaly detection subsystem (ADS) with 60 FERs after 2 weeksof rule training over the Lincoln Lab IDS evaluation dataset

60

Conclusion

In this report we have studied basic concept and some classic system models like ADAM MINDSin this area

To make summary of those system models their technologies and their validation methods

Hope to a overview on currently development in this area and how data mining is evolving into the field of network intrusion detection

61

Reference DARPA 1998 data set

A cleansed set in KDDCuprsquo99 DARPA 1991 data set is also available httpwwwllmiteduISTidevaldatadata_indexhtml

Daniel Barbara Julia Couto Sushil Jajodia Leonard Popyack Ningning Wu ldquoADAM Detecting Intrusions by Data Miningrdquo Proceedings of the 2001 IEEE Workshop on Information Assurance and Security United States Military Academy West Point NY 5-6 June 2001

Zhang J and Zulkernine M 2006 A Hybrid Network Intrusion Detection Technique Using Random Forests In Proceedings of the First international Conference on Availability Reliability and Security (April 20 - 22 2006)

W Lee et al A data mining framework for building intrusion detection models In Information and System Security Vol 3 No 4 2000

Ertoz L et Al MINDS - Minnesota Intrusion Detection System Next Generation Data Mining Chapter 3 2004

Exploiting efficient data mining techniques to enhance intrusion detection systems Lu C-T Boedihardjo AP Manalwar P Information Reuse and Integration Conf 2005 IRI -2005 IEEE International Conference on Volume Issue 15-17 Aug 2005 Page(s) 512 - 517

Sal Stolfo Andreas Prodromidis Shelley Tselepis Wenke Lee Dave Fan and Phil Chan (Honorable mention (runner-up) for Best Paper Award in Applied Research Category) In Proceedings of the Third International Conference on Knowledge Discovery and Data Mining (KDD 97) Newport Beach CA August 1997

62

Questions amp Comments

58

A cooperative anomaly and intrusiondetection system (CAIDS)

An association rule is aimed at finding interesting intra-relationship inside a single connection record

In general, an FER is specified by the following expression:

$L_1, L_2, \ldots, L_n \rightarrow R_1, \ldots, R_m \quad (c, s, \mathit{window})$ (2)

$L_i$ ($1 \le i \le n$) and $R_j$ ($1 \le j \le m$) are ordered traffic connection events

We call $L_1, L_2, \ldots, L_n$ the LHS episode and $R_1, \ldots, R_m$ the RHS episode of the rule
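To make the three parameters in expression (2) concrete, here is a minimal Python sketch that scores one episode rule over a small event trace. It rests on assumptions the slide does not state: c and s are read as confidence and support, window is a time span in seconds, and a rule is counted once per sliding window that starts at an event. The names `contains_in_order` and `episode_rule_stats` are hypothetical helpers for illustration, not part of CAIDS.

```python
from typing import List, Sequence, Tuple

Event = Tuple[float, str]  # (timestamp in seconds, event label); assumed format


def contains_in_order(labels: Sequence[str], episode: Sequence[str]) -> bool:
    """Return True if `episode` occurs in `labels` as an ordered subsequence."""
    it = iter(labels)
    # Each `any` consumes the iterator up to the next match, which enforces order.
    return all(any(label == wanted for label in it) for wanted in episode)


def episode_rule_stats(events: List[Event], lhs: Sequence[str],
                       rhs: Sequence[str], window: float) -> Tuple[float, float]:
    """Return (confidence, support) of the rule lhs -> rhs.

    Assumed counting scheme: one window per event, covering the half-open
    interval [t, t + window). Support is the fraction of windows containing
    lhs followed by rhs; confidence is that count divided by the number of
    windows containing lhs.
    """
    events = sorted(events)
    lhs_hits = 0
    rule_hits = 0
    for i, (start, _) in enumerate(events):
        labels = [lab for t, lab in events[i:] if t < start + window]
        if contains_in_order(labels, lhs):
            lhs_hits += 1
            if contains_in_order(labels, list(lhs) + list(rhs)):
                rule_hits += 1
    total = len(events)
    support = rule_hits / total if total else 0.0
    confidence = rule_hits / lhs_hits if lhs_hits else 0.0
    return confidence, support


if __name__ == "__main__":
    trace = [(0.0, "http"), (1.0, "smtp"), (2.0, "smtp"),
             (10.0, "http"), (11.0, "ftp")]
    # Rule: http -> smtp within a 5-second window.
    print(episode_rule_stats(trace, ["http"], ["smtp"], window=5.0))  # (0.5, 0.2)
```

In the example trace, two of the five windows contain the LHS event and one of those also contains the RHS event after it, which gives a confidence of 0.5 and a support of 0.2. Other window definitions (for example fixed-width windows sliding by a constant step) would give different counts.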
