Artificial Intelligence Techniques for Misuse and Anomaly Detection
Computer Science DepartmentUniversity of WyomingLaramie, WY 82071
Project 1: Misuse Detection With Semantic Analogy

Faculty: Diana Spears, William Spears, John Hitchcock (UW, consultant)
Ph.D. Student: Adriana Alina Bilt
(Relevant disciplines: AI planning, AI machine learning, case-based reasoning, complexity theory)
Misuse Detection
Misuse: Unauthorized behavior specified by usage patterns called signatures.
In this project, we are currently working with signatures that are sequences of commands.
The most significant open problem in misuse detection: False negatives, i.e., errors of omission.
Project Objectives
1. Develop a misuse detection algorithm that dramatically reduces the number of false negatives.
2. (will be specified later)
Our Approach
Analogy, also called Case-Based Reasoning (CBR), will be used to match a current, ongoing intrusion against signatures of attacks previously stored in a database. This uses a flexible match between a new intrusion sequence and a previously stored signature. We use semantic, rather than syntactic, analogy.
Example
Man-in-the-middle attacks: ARP spoofing: Switch Sniff (SS)
[Diagram: Computer 1 and Computer 2 connected through a LAN switch, with the Attacker on the same LAN]
Old Switch Sniff (OSS):
A0: ping C
A1: ping S
A2: arpredirect -t <ip C> <ip S>
A3: fragrouter -B1
A4: linsniff
A5: ftp <ip S>; <username C>; <password C>
New Switch Sniff (NSS):
A0: ping L1
A1: ping L2
A2: arp
A3: ifconfig
A4: ./arp-sk -w -d <ip L1> -D <ip L1> -S <ip L2>
A5: echo 1 > /proc/sys/net/ipv4/ip_forward
A6: tcpdump -i eth0 > packets
A7: telnet <ip L2>; <username L1>; <password L1>
The Traditional Approach to Misuse Detection: Exact Match
New Switch Sniff (NSS) vs. Old Switch Sniff (OSS):
Extremely hard to find a match despite being very similar attacks!!
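The failure is easy to demonstrate with a minimal sketch (the traces are abbreviated from the OSS and NSS sequences above; the matcher itself is a hypothetical illustration of exact matching, not any particular IDS):

```python
# Sketch of traditional exact-match misuse detection. The two traces are
# abbreviated versions of the OSS and NSS command sequences above.
oss = ["ping C", "ping S", "arpredirect -t <ip C> <ip S>",
       "fragrouter -B1", "linsniff", "ftp <ip S>"]
nss = ["ping L1", "ping L2", "arp", "ifconfig",
       "./arp-sk -w -d <ip L1> -D <ip L1> -S <ip L2>",
       "echo 1 > /proc/sys/net/ipv4/ip_forward",
       "tcpdump -i eth0 > packets", "telnet <ip L2>"]

def exact_match(new_trace, signature):
    """Alert only if the new trace equals the signature command-for-command."""
    return new_trace == signature

print(exact_match(nss, oss))   # False: a false negative on a real attack
```

Even a one-character difference in any command defeats the match, which is why exact matching produces so many errors of omission.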
Part I of Our Approach: Plan Recognition
ping L1:
preconditions (knowledge in database): is_machine( L1 ), knows( self, name( L1 ) )
postconditions (knowledge in database): up( L1 ), is_in_arp( L1, self ), knows( self, ip_address( L1 ) )

Knowledge State 0: • is_machine( L1 ) • knows( self, name( L1 ) )
  -- ping L1 -->
Knowledge State 1: • up( L1 ) • is_in_arp( L1, self ) • knows( self, ip_address( L1 ) )
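The ping schema can be read operationally: check the preconditions against the current knowledge state, then extend the state with the postconditions. A minimal sketch (the fact strings follow the slide's notation; the data structures are an assumption):

```python
# Hypothetical sketch of an action schema for "ping L1": preconditions are
# checked against the current knowledge state, postconditions extend it.
ping_schema = {
    "pre":  {"is_machine(L1)", "knows(self, name(L1))"},
    "post": {"up(L1)", "is_in_arp(L1, self)", "knows(self, ip_address(L1))"},
}

def apply_schema(schema, state):
    """Advance a knowledge state by one annotated command."""
    if not schema["pre"] <= state:          # preconditions must hold
        raise ValueError("preconditions not satisfied")
    return state | schema["post"]           # add the postconditions

state0 = {"is_machine(L1)", "knows(self, name(L1))"}
state1 = apply_schema(ping_schema, state0)  # Knowledge State 0 -> State 1
```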
Automatically annotate each command with states of knowledge before/after (called “action schemas”):
Old Switch Sniff (OSS):
A0: ping C
A1: ping S
A2: arpredirect -t <ip C> <ip S>
A3: fragrouter -B1
A4: linsniff
A5: ftp <ip S>; <username C>; <password C>

New Switch Sniff (NSS):
A0: ping L1
A1: ping L2
A2: arp
A3: ifconfig
A4: ./arp-sk -w -d <ip L1> -D <ip L1> -S <ip L2>
A5: echo 1 > /proc/sys/net/ipv4/ip_forward
A6: tcpdump -i eth0 > packets
A7: telnet <ip L2>; <username L1>; <password L1>
Part II of Our Approach: Semantic Analogy
Annotated OSS:
pre 0:  is_machine( a ), knows( self, name( a ) )
post 0: up( a ), is_in_arp( a, self ), knows( self, ip_address( a ) )
pre 1:  is_machine( b ), knows( self, name( b ) )
…
post 4: up( b ), knows( self, password( var_any, b ) ), knows( self, username( var_any, b ) ), see( self, traffic( a, b ) ) or see( self, traffic( a, through( b ) ) )
pre 5:
post 5: has_access( self, b )

Annotated NSS:
pre 0:  is_machine( a ), knows( self, name( a ) )
post 0: up( a ), is_in_arp( a, self ), knows( self, ip_address( a ) )
pre 1:  is_machine( b ), knows( self, name( b ) )
…
post 6: up( b ), knows( self, password( var_any, b ) ), knows( self, username( var_any, b ) ), see( self, traffic( a, b ) ) or see( self, traffic( a, through( b ) ) )
pre 7:
post 7: has_access( self, b )
Easy to find a match using deeper semantics!!
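The reason the match becomes easy is visible in a small sketch: once machines are renamed to roles (a = observed machine, b = target) and traces are reduced to their knowledge effects, the two final postcondition sets coincide. The renaming helper and the Jaccard overlap below are illustrative assumptions, not the project's actual matcher:

```python
# Hypothetical sketch: semantic analogy compares the knowledge effects of
# two traces (machines renamed to roles a, b) instead of their command text.
def normalize(facts, renaming):
    """Rename concrete machine names to role variables."""
    for old, new in renaming.items():
        facts = {f.replace(old, new) for f in facts}
    return facts

oss_post = normalize({"up(S)", "knows(self, password(var_any, S))",
                      "has_access(self, S)"}, {"S": "b", "C": "a"})
nss_post = normalize({"up(L2)", "knows(self, password(var_any, L2))",
                      "has_access(self, L2)"}, {"L2": "b", "L1": "a"})

# Jaccard overlap of the final knowledge states: 1.0 = identical semantics.
similarity = len(oss_post & nss_post) / len(oss_post | nss_post)
```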
Our Important Contribution to Misuse Detection So Far
We have already substantially reduced the likelihood of errors of omission relative to prior approaches by: adding PLAN RECOGNITION to fill in the deeper semantic knowledge, and doing SEMANTIC ANALOGY.
Project Objectives
1. Develop a misuse detection algorithm that dramatically reduces the number of false negatives.
2. Develop a similarity metric (for analogy) that performs well experimentally on intrusions, but is also universal (not specific to one type of intrusion or one computer language).
Similarity Metrics That Will Be Experimentally Compared

Exact Match: City-block, Euclidean distance, Maximum, three metrics of our own
Partial Match: City-block, Euclidean distance, Maximum, three metrics of our own
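For concreteness, the three standard metrics, assuming each trace has already been reduced to a numeric feature vector (the encoding is an assumption; the slides do not specify it):

```python
# The three standard distance metrics being compared, over feature vectors.
def city_block(x, y):
    """L1 (Manhattan) distance."""
    return sum(abs(a - b) for a, b in zip(x, y))

def euclidean(x, y):
    """L2 distance."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) ** 0.5

def maximum(x, y):
    """L-infinity (Chebyshev) distance."""
    return max(abs(a - b) for a, b in zip(x, y))

x, y = [1, 0, 2], [0, 0, 4]   # toy feature vectors
```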
One Attack Metric That We Invented
attack distance: min_atk_dst(p) = the minimum number of steps to achieve any possible goal.
• Designed to be well-suited to intrusion detection.
• Can identify attacks almost completely different from any ever seen!
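One way to read min_atk_dst(p) is as a shortest-path search over action schemas: from the current knowledge state, count the fewest schema applications needed to reach any goal. The schemas and goal below are toy assumptions, not the project's actual definition:

```python
from collections import deque

# Hypothetical sketch of min_atk_dst: breadth-first search over action
# schemas from the current knowledge state to the nearest attack goal.
def min_atk_dst(state, schemas, goals):
    """Minimum number of schema applications to reach any goal fact."""
    frontier = deque([(frozenset(state), 0)])
    seen = {frozenset(state)}
    while frontier:
        facts, depth = frontier.popleft()
        if any(g in facts for g in goals):
            return depth
        for pre, post in schemas:
            if pre <= facts:                     # preconditions satisfied
                nxt = frozenset(facts | post)
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append((nxt, depth + 1))
    return float("inf")

schemas = [({"up(b)"}, {"has_access(self, b)"}),   # toy action schemas
           (set(), {"up(b)"})]
d = min_atk_dst({"is_machine(b)"}, schemas, {"has_access(self, b)"})
```

A small distance means the observed prefix is already close to achieving some attack goal, even if the command sequence itself resembles nothing in the database.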
NEXT STEPS: Experimentally select the best-performing metric, and then formalize it…
Experiments are currently in progress.
A Formalism for Universal Similarity: Kolmogorov Similarity (Li & Vitanyi)

Kolmogorov complexity = the length of the shortest description of an item:
K(x) = the length of the shortest compressed binary version from which x can be fully reproduced.
K(x|y) = the length of the shortest program that computes x given that y was already computed.
Kolmogorov similarity d(x,y) = the length of the shortest program to compute one item from the other.
Advantages of Kolmogorov similarity:
It is theoretically founded, rather than "ad hoc."
It is "universal": it "incorporates" every similarity metric in a large class C of interest, i.e., for every f(x,y) in C: d(x,y) <= f(x,y) + O(1/k), where k = max{K(x), K(y)}.
It is representation independent: it does not depend on the representation of the intrusion signatures (e.g., sequences of commands versus worms).
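K() itself is uncomputable, so in practice it is approximated with a real compressor; a standard approximation in the Li & Vitanyi line of work is the normalized compression distance. The sketch below uses zlib and invented toy traces, purely to illustrate the idea (it is not the project's metric):

```python
import zlib

# Normalized compression distance: approximate K() with a real compressor.
# NCD(x,y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y))
def C(s: bytes) -> int:
    """Compressed length as a stand-in for Kolmogorov complexity."""
    return len(zlib.compress(s, 9))

def ncd(x: bytes, y: bytes) -> float:
    cx, cy = C(x), C(y)
    return (C(x + y) - min(cx, cy)) / max(cx, cy)

a = b"ping C; ping S; arpredirect; linsniff; ftp"    # toy attack trace
b = b"ping L1; ping L2; arp-sk; tcpdump; telnet"     # toy attack trace
c = b"ls; cd /tmp; make; ./configure"                # toy benign trace
```

Similar items compress well jointly, so their NCD is small; unrelated items give an NCD near 1.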
Kolmogorov Similarity

Problem with Li & Vitanyi's metric: for our purposes it is too general. We plan to develop a more domain-specific variant of Kolmogorov similarity for intrusion detection.

Theoretical part of our research to be done:
1. Extend the Li & Vitanyi similarity metric with our domain-specific concerns (such as weights).
- Develop our own compression algorithm that includes the domain-specific concerns.
- Extend our current attack metric to also do compression on the attack scenarios.
2. Prove that universality still holds for our new metric (but most likely with respect to a different class of metrics than Li & Vitanyi used).
Experimental part of our research to be done:
1. Experimentally evaluate and compare our entire approach on a test suite of previously unseen attacks and non-attacks, to measure its competitive accuracy as a misuse detection tool. (Project member and intrusion expert Jinwook Shin is developing all data.)
Project 2: Ensembles of Anomaly NIDS for Classification

Faculty: Diana Spears, Peter Polyakov (Math Dept, Kelly's advisor, no cost to ONR)
M.S. Students: Carlos Kelly, Christer Karlson (EPSCoR grant, no cost to ONR)
(Relevant disciplines: mathematical formalization, AI machine learning)
Anomaly Network Intrusion Detection Systems (NIDS)
They learn (typically using machine learning) a model of normal network traffic and/or packets from negative examples (normal behavior) only.
Utilize a distance metric to determine the difference between a new observation and the learned model.
Classify an observation as an attack if the observation deviates sufficiently from the model.
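The three bullets above can be sketched in a few lines. This is a deliberately minimal illustration (per-feature averages and a city-block deviation threshold), not LERAD or PAYL themselves:

```python
# Minimal anomaly-NIDS sketch: learn a model from normal examples only,
# then flag observations that deviate too far from the model.
def train(normal_examples):
    """Model = per-feature mean of the normal (negative) examples."""
    n = len(normal_examples)
    dims = len(normal_examples[0])
    return [sum(e[i] for e in normal_examples) / n for i in range(dims)]

def is_attack(model, obs, threshold=3.0):
    """Classify as attack if the distance to the model exceeds a threshold."""
    dist = sum(abs(o - m) for o, m in zip(obs, model))  # city-block distance
    return dist > threshold

model = train([[10, 1], [12, 1], [11, 1]])  # toy normal traffic features
```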
[Diagram: a stream of examples feeds the classifier (learner), which produces a model by induction]
A Major Drawback of Current Anomaly NIDS

Classification as "attack" or "not attack" can be inaccurate and not informative enough, e.g., what kind of attack is it?
Would like a front-end to a misuse detection algorithm that helps narrow down the class of the attack.
Our Approach

Combine existing fast and efficient NIDS (which are really machine learning "classifiers") into an ensemble to increase the information.
We will employ a combined theoretical and experimental approach.
We are currently working with the classifiers LERAD (Mahoney & Chan) and PAYL (Wang & Stolfo), using data from the DARPA Lincoln Labs Intrusion Database.
Ensembles of Classifiers

A very popular recent trend in machine learning. Main idea: create an ensemble of existing classifiers to increase the accuracy, e.g., by doing a majority vote.
Our novel twist: create an ensemble of classifiers to increase the information, i.e., don't just classify as "positive" or "negative" – give a more specific class of attack if "positive." How? First we need to look at the "biases" of the classifiers…
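The "informative ensemble" idea can be sketched as a lookup: the pair of binary verdicts from the two NIDS indexes a cell of a classification matrix, and the cell maps to the attack classes consistent with both biases. The matrix contents below are invented placeholders, not measured results:

```python
# Hypothetical sketch of an ensemble that increases information rather than
# accuracy: a pair of binary verdicts maps to candidate attack classes.
# The cell contents are placeholders, to be filled in by experiments.
matrix = {
    (True,  True):  ["attack class #1", "attack class #2"],
    (True,  False): ["attack class #3"],
    (False, True):  ["attack class #2"],
    (False, False): [],                      # neither NIDS fired
}

def ensemble(verdict_lerad, verdict_payl):
    """Return the candidate attack classes, not just attack / not-attack."""
    return matrix[(verdict_lerad, verdict_payl)]
```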
Representational bias of a classifier
Representational bias: Any aspect of a learner that affects its choice of model/hypothesis.
For example: A TCP packet has fields, each of which contains a range of values. These are examples of basic attributes of a packet. A TCP packet also has attributes, like size, that are not explicitly contained in the packet but are potentially useful for induction and can be inferred. This is an example of a constructed attribute.
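The distinction can be shown with a simplified packet record (the field names are illustrative, not a full TCP header):

```python
# Basic vs. constructed attributes of a (simplified) packet record.
packet = {"src_port": 443, "dst_port": 51234, "flags": "ACK",
          "payload": b"\x17\x03\x03\x00\x20..."}

# Basic attributes: read directly from header fields.
basic = {k: packet[k] for k in ("src_port", "dst_port", "flags")}

# Constructed attribute: not stored in the packet, but inferable from it.
constructed = {"payload_size": len(packet["payload"])}
```

A learner whose bias admits only basic attributes cannot detect attacks that manifest solely in constructed ones, and vice versa.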
Main Idea
Research Question: How do the biases of the NIDS affect which types of attacks they are likely to detect?
[Diagram: Anomaly NIDS #1 and Anomaly NIDS #2, each with its own bias, map to different attack classes (#1, #2, #3)]
Classification Matrix

[Table: rows give Anomaly NIDS #1's classification (attack / not an attack); columns give Anomaly NIDS #2's classification (attack / not an attack); each cell lists the attack classes (#1, #2, #3) consistent with that pair of verdicts]
The ensemble will output the probabilities of attack classes.
Contributions Completed
Mathematical formalizations of LERAD and PAYL.
Explicitly formalized the biases of the two programs.
Created a testbed for attacks with LERAD and PAYL.
Research To Be Done: Run many experiments with LERAD and PAYL in order to fill
in the classification matrix.
Update and refine the mathematical formalizations of system biases as needed for the classification matrix.
Revise the standard attack taxonomy to maximize information gain in the classification matrix.
Program the ensemble to output a probability distribution over classes of attacks associated with each alarm.
Test the accuracy of our ensemble on a test suite of previously unseen attack and non-attack examples, and compare its accuracy against that of LERAD and PAYL alone.
Our approach is potentially scalable to many classifiers, but that will not be done within the scope of the current MURI.
Experiments are currently in progress.
Project 3: The Basic Building Blocks of Attacks

Faculty: Diana Spears (UW), William Spears (UW), Sampath Kannan (UPenn), Insup Lee (UPenn), Oleg Sokolsky (UPenn)
M.S. Student: Jinwook Shin (UW)
(Relevant disciplines: graphical models, compiler theory, AI machine learning)
Research Question

What is an attack? In particular, "What are the basic building blocks of attacks?" Question posed by Insup Lee. To the best of our knowledge, this specific research question has not been previously addressed.
Motivation: Attack building blocks can be used in misuse
detection systems – to look for key signatures of an attack.
"Basic building blocks" (our goal): the essential elements common to all attack programs ("exploits") in a certain class, e.g., format string attacks, worms.
Two Challenges to Address

Formal model (UW): Find a good formal model for the exploits (examples). This formalism will provide a common mathematical framework for the examples, to facilitate induction.
Induction algorithm (UW and UPenn): Develop an efficient and practical induction algorithm. Induction will find the intersection (commonalities) of the attack (positive) examples and exclude the commonalities of the non-attack (negative) examples.
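The induction step, stated over sets, is a straightforward intersection-and-subtraction. The feature names below are invented toy examples, not results from the project's case studies:

```python
# Sketch of induction over exploit examples: keep what is common to all
# attack (positive) examples, minus what is also common to the non-attack
# (negative) examples. Feature names are toy placeholders.
def building_blocks(positives, negatives):
    common_pos = set.intersection(*positives)
    common_neg = set.intersection(*negatives) if negatives else set()
    return common_pos - common_neg

pos = [{"overwrite_return_addr", "spawn_shell", "open_socket"},
       {"overwrite_return_addr", "spawn_shell", "format_string"}]
neg = [{"open_socket", "read_file"},
       {"open_socket", "write_file"}]
blocks = building_blocks(pos, neg)
```

The hard research problems are upstream of this step: abstracting the exploits into a representation where such an intersection is meaningful (Problem 1) and aligning corresponding states first (Problem 2).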
Case Studies of Exploits (found on the web)

Format string attacks, using "attack operators" (an "attack operator" is a group of meaningful instructions for a specific attack).
Ba[e]gle worms: Windows API call trace.
Remote buffer overflow attacks: using Unix/Linux system calls; using Unix/Linux C library calls.
Overview of Our Approach

[Diagram: (binary) attack programs (input) → semantic abstraction → data dependency graph (DDG) → formalization → induction → attack patterns, i.e., the building blocks of attacks (output)]

Induction from graphical examples is novel!!
Research Challenges

Problem 1 – Abstraction (being addressed by UW): Program modules with the same semantics can be written in many different ways (implementation-specific control flows, dead code, etc.). We need to automatically abstract the language in order to find commonalities. We are currently focusing on inferring common subgoals, expressed within subgraphs of DDGs.
Problem 2 – Alignment (being addressed by UPenn): Before finding the intersection between abstract DDGs by induction, we first need to align corresponding states.
Attack Programs' Goals

There can be as many types of attacks as there are program bugs, and hence as many ways to write attack programs, but…
The goal of most security attacks is to gain unauthorized access to a computer system by taking control of a vulnerable privileged program.
The attack steps can differ according to the particular attack class, but one final step is common to all attacks: the transfer of control to malevolent code…
Control Transfer
How does an attack program change a program’s control flow?
Initially, an attacker has no control over the target program, but the attacker can control input(s) to it. A vulnerability in the program allows the malicious inputs to cause unexpected changes in memory locations that are not supposed to be affected by the inputs. Once the inputs are injected, the unexpected values can propagate into other locations, generating more unexpected values.
[Diagram: the attack program sends malicious inputs to the target program]
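The propagation of unexpected values described above is essentially a taint-propagation fixpoint over data-flow edges. A minimal sketch (the edge list is an invented format-string-style example):

```python
# Sketch of taint propagation: once an attacker-controlled input reaches one
# memory location, every location computed from it becomes tainted too.
def propagate(tainted, dataflow_edges):
    """dataflow_edges: (src, dst) pairs meaning dst is computed from src."""
    changed = True
    while changed:                # iterate to a fixpoint
        changed = False
        for src, dst in dataflow_edges:
            if src in tainted and dst not in tainted:
                tainted.add(dst)
                changed = True
    return tainted

# Toy chain: input -> buffer -> format argument -> overwritten return address
edges = [("input", "buf"), ("buf", "fmt_arg"), ("fmt_arg", "ret_addr")]
reached = propagate({"input"}, edges)
```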
Attack Model We model an attack as a sequence of (memory
operation and function call) steps for generating malicious inputs to a target program.
[Diagram: the generated malicious inputs flow into the TARGET PROGRAM, which produces outputs]
Output Backtrace
To extract only the relevant memory operations and function calls, we perform an automated static analysis of each exploit by beginning with the input(s) to the target program and then backtracing (following causality chains) through the control and data flow of the exploit.
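The backtrace amounts to a backward reachability walk over the exploit's dependency graph, starting from the inputs it sends to the target. A minimal sketch (the DDG below is an invented toy, with one dead-code node):

```python
# Sketch of the output backtrace: starting from the exploit's output (the
# inputs it sends to the target), follow dependency edges backwards to keep
# only the operations on the causality chain.
def backtrace(ddg, start):
    """ddg maps each node to the nodes it depends on."""
    relevant, stack = set(), [start]
    while stack:
        node = stack.pop()
        if node not in relevant:
            relevant.add(node)
            stack.extend(ddg.get(node, []))
    return relevant

ddg = {"send(payload)": ["payload"],
       "payload": ["shellcode", "addr"],
       "log_msg": ["timestamp"]}          # dead code w.r.t. the output
slice_ = backtrace(ddg, "send(payload)")
```

Everything not reached by the walk (here, the logging code) is irrelevant to the attack and is discarded before induction.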
Research Already Accomplished
Study of attack patterns; automatic DDG generator; automatic output backtracer.
(This involved a considerable amount of work in going from binary exploits to this point. Portions of the process were analogous to those in the development of Tim Teitelbaum's Synthesizer Generator.)