SLG in MulVAL
netAccess(H2, Protocol, Port) :-
    execCode(H1, User),
    reachable(H1, H2, Protocol, Port).
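[Figure: tabled evaluation of the rule. One table holds the possible instantiations of the goal netAccess(…); a second table holds the possible instantiations of the first subgoal execCode(…); a third label reads "from input tuples".]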
SLG complexity for Datalog
• Total time is dominated by the rule with the maximum number of instantiations
  – Time to compute one table = computing the subgoals + retrieving information from input tuples + matching results in the rule bodies
  – Time to compute all tables = retrieving information from input tuples + matching results in the rule bodies
• See “On the Complexity of Tabled Datalog Programs” http://www.cs.sunysb.edu/~warren/xsbbook/node21.html
MulVAL complexity in SLG
execCode(Host, User) :-
    vulExists(Host, _, Program, remote, privilegeEscalation),
    networkService(Host, Program, Protocol, Port, User),
    netAccess(Host, Protocol, Port).
Scales with network size: O(N) different instantiations
MulVAL complexity in SLG
netAccess(H2, Protocol, Port) :-
    execCode(H1, _),
    reachable(H1, H2, Protocol, Port).
Scales with network size: O(N^2) different instantiations
Complexity of MulVAL: O(N^2), set by the rule with the most instantiations
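The quadratic count can be checked on a toy network. The sketch below is a plain Python illustration, not XSB or MulVAL code. It assumes a fully connected network with a single open port and a worst case in which execCode holds on every host; the host names, the port, and the function name are hypothetical.

```python
from itertools import permutations

def count_rule_instantiations(n_hosts):
    """Count ground instances of the netAccess rule body on a toy network.

    Assumptions (not from the slides): every host reaches every other host
    on tcp/80, and execCode holds for one user on every host.
    """
    hosts = [f"h{i}" for i in range(n_hosts)]
    reachable = [(h1, h2, "tcp", 80) for h1, h2 in permutations(hosts, 2)]
    exec_code = [(h, "user") for h in hosts]

    # Index the input tuples by source host so each lookup is constant time.
    targets_by_source = {}
    for h1, h2, protocol, port in reachable:
        targets_by_source.setdefault(h1, []).append((h2, protocol, port))

    instantiations = 0
    net_access = set()  # plays the role of the table for the goal
    for h1, _user in exec_code:
        for h2, protocol, port in targets_by_source.get(h1, []):
            instantiations += 1          # one (H1, H2) binding of the rule
            net_access.add((h2, protocol, port))
    return instantiations, len(net_access)
```

With N hosts the rule body is matched N(N-1) times even though the goal table ends up with only N entries, which is why the rule, not the table, sets the bound.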
Datalog proof generation
• In security analysis, we want to know not only what attacks could happen, but also how they can happen
  – Thus, we need more than a yes/no answer for queries.
  – We need the proofs for the true queries, which in the case of security analysis are attack paths.
  – We also want to know all possible attack paths; thus we need exhaustive proof generation.
An obvious approach
execCode(Host, PrivilegeLevel) :-
    vulExists(Host, Program, remote, privilegeEscalation),
    serviceRunning(Host, Program, Protocol, Port, PrivilegeLevel),
    networkAccess(Host, Protocol, Port).
execCode(Host, PrivilegeLevel, Pf) :-
    vulExists(Host, Program, remote, privilegeEscalation, Pf1),
    serviceRunning(Host, Program, Protocol, Port, PrivilegeLevel, Pf2),
    networkAccess(Host, Protocol, Port, Pf3),
    Pf = (execCode(Host, PrivilegeLevel), [Pf1, Pf2, Pf3]).
Adding a proof argument breaks the bounded-term property and results in non-termination for cyclic Datalog programs.
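A small experiment makes the failure concrete. The sketch below uses naive bottom-up iteration in Python rather than SLG resolution, so it is only an analogy for what happens in XSB; the two-rule cyclic program (q(X) :- p(X). p(X) :- q(X).), the seed fact p(a), and the function name are all hypothetical.

```python
def naive_fixpoint(with_proofs, max_rounds=6):
    """Iterate the cyclic program  q(X) :- p(X).  p(X) :- q(X).  from p(a).

    Returns (table size after each round, whether a fixpoint was reached).
    With proof terms attached, every round derives a new, larger term.
    """
    facts = {("p", "a", "fact" if with_proofs else None)}
    sizes = []
    for _ in range(max_rounds):
        new_facts = set(facts)
        for pred, arg, proof in facts:
            head = "q" if pred == "p" else "p"
            # The proof term nests the proof of the body inside the head's proof.
            head_proof = (head, (proof,)) if with_proofs else None
            new_facts.add((head, arg, head_proof))
        sizes.append(len(new_facts))
        if new_facts == facts:
            return sizes, True
        facts = new_facts
    return sizes, False
```

Without the proof argument the tables saturate after two rounds. With it, each pass around the cycle wraps the previous proof in a new term, so the set of distinct tuples grows on every round and never reaches a fixpoint.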
MulVAL Attack-Graph Toolkit
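[Figure: toolkit architecture. A Datalog representation of machine configuration, network configuration, and security advisories, together with Datalog rules, feeds the XSB reasoning engine; the engine emits Datalog proof steps, which the Graph Builder turns into a Datalog proof graph.]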
Ou, Boyer, and McQueen. ACM CCS 2006
Joint work with Idaho National Laboratory
Stage 1: Rule Translation
Translated rules
netAccess(H2, Protocol, Port, ProofStep) :-
    execCode(H1, User),
    reachable(H1, H2, Protocol, Port),
    ProofStep = because('multi-hop network access',
                        netAccess(H2, Protocol, Port),
                        [execCode(H1, User), reachable(H1, H2, Protocol, Port)]).
Proof step: the because term bound to ProofStep
Stage 2: Build the Exhaustive Proof
because('multi-hop network access',
        netAccess(fileServer, rpc, 100003),
        [execCode(webServer, apache), reachable(webServer, fileServer, rpc, 100003)])
[Figure: proof graph fragment with nodes numbered 0 to 3. The fact netAccess(fileServer, rpc, 100003) is derived by the rule node 'multi-hop network access', whose children are execCode(webServer, apache) and reachable(webServer, fileServer, rpc, 100003).]
Complexity of Proof Building
• O(N^2) to complete Datalog evaluation
  – With proof steps generated
• O(N^2) to build a proof graph from proof steps
  – Need to build O(N^2) graph components
  – Building of one component
    • Find the predecessor: table lookup
    • Find the successors: table lookup
Total time: O(N^2), if table lookup is constant time
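The component-by-component construction can be sketched as follows. This is an illustrative Python version, not the toolkit's Graph Builder; it assumes each proof step arrives as a (rule name, head fact, list of body facts) triple with facts represented as strings, and it uses a hash table for the constant-time lookups.

```python
def build_proof_graph(proof_steps):
    """Build a proof graph from because-style proof steps.

    Each step contributes one rule node, an edge from its head fact
    (the predecessor) and edges to its body facts (the successors).
    """
    fact_ids = {}   # fact -> node id, the lookup table
    nodes = []      # node id -> (kind, label)
    edges = []      # (from node id, to node id)

    def fact_node(fact):
        # Constant-time expected lookup; create the node on first sight.
        if fact not in fact_ids:
            fact_ids[fact] = len(nodes)
            nodes.append(("fact", fact))
        return fact_ids[fact]

    for rule_name, head, body in proof_steps:
        head_id = fact_node(head)
        rule_id = len(nodes)
        nodes.append(("rule", rule_name))
        edges.append((head_id, rule_id))
        for subgoal in body:
            edges.append((rule_id, fact_node(subgoal)))
    return nodes, edges


steps = [(
    "multi-hop network access",
    "netAccess(fileServer, rpc, 100003)",
    ["execCode(webServer, apache)",
     "reachable(webServer, fileServer, rpc, 100003)"],
)]
nodes, edges = build_proof_graph(steps)
```

Each proof step is touched once and each fact is resolved through the dictionary, so the work is linear in the number of proof steps, which is O(N^2) for MulVAL.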
Logical Attack Graphs
[Figure: logical attack graph with numbered nodes. Legend: OR node, AND node, ground fact.
Fact labels:
  execCode(attacker, workStation, root)
  accessFile(attacker, workStation, write, /usr/local/share)
  accessFile(attacker, fileServer, write, /export)
  execCode(attacker, webServer, apache)
  netAccess(attacker, webServer, tcp, 80)
  networkService(webServer, httpd, tcp, 80, apache)
  vulExists(webServer, CAN-2002-0392, httpd, remoteExploit, privEscalation)
Rule labels: Trojan horse installation; NFS semantics; NFS shell; Remote exploit]
Performance and Scalability
[Figure: CPU time (sec), 0.01 to 10000, plotted against number of hosts, 1 to 1000, on logarithmic axes for four topologies: fully connected, partitioned, ring, and star.]
Related Work
• Sheyner’s attack graph tool (CMU)
  – Based on model-checking
• Cauldron attack graph tool (GMU)
  – Based on graph-search algorithms
• NetSPA attack graph tool (MIT LL)
  – Graph-search based on a simple attack model
Advantages of the Logic-programming Approach
• Publishing and incorporation of knowledge/information through well-understood logical semantics
• Efficient and sound analysis by leveraging the reasoning power of well-developed logic-deduction systems
SAT-Solving Approaches to Context-Aware Enterprise Network Security Management. John Homer, Xinming Ou. In IEEE Journal on Selected Areas in Communications (JSAC).
SAT-based Security Hardening
• MulVAL proof graph provides information on potential consequences of vulnerabilities.
• How do we use this information to improve security?
  – Datalog proof turned into a Boolean formula
  – SAT solver searches for an optimal solution
Benefit of SAT
• Impossible for a human to understand all configuration options and their ramifications.
  – Computers can do it better
• Balance security and usability
  – Essentially a constraint-solving process
• Provides automated, reliable approach to reason about conflicting requirements
Vision for Network Security Management
[Figure: workflow diagram with the labels Problematic Configuration, MulVAL, MulVAL Proof Graph, Graph to Boolean formula (Φ), Usability Requirement, SAT Solver, Training, Guidance, Suggested Configuration Changes, and Desirable Configuration.]
SAT-Solving Techniques
• MinCostSAT
  – Use user-provided discrete cost values to find the mitigation solution that minimizes cost
• UNSAT Core Elimination
  – Reduce complexity in reconfiguration to simple choices between conflicting requirements
  – Use a partial-ordering lattice to further reduce the scope of choices, based on past decisions
Benefits
• Human user only addresses “problem areas” in network configuration
• Reduces complex problem to more manageable proportions
Example
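[Figure: example network with webServer and fileServer and attack steps labeled Remote exploit, buffer overrun, and NFS shell. Beside it, the MulVAL proof graph with exploit nodes e1 to e3, privilege nodes p1 and p2, and configuration-setting nodes c1 to c7.]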
Circuit to CNF Conversion
• Tseitin Transformation
• Can ‘e’ ever become true?
[Figure: circuit with inputs a, b, c; OR gate d = a + b; AND gate e = c · d]
Consistency conditions for circuit variables:
d ≡ (a + b): (a + b + d’)(a’ + d)(b’ + d)
e ≡ (c · d): (c’ + d’ + e)(d + e’)(c + e’)
Is (e)(a + b + d’)(a’ + d)(b’ + d)(c’ + d’ + e)(d + e’)(c + e’) satisfiable?
From Sharad Malik’s slides
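The consistency clauses can be checked exhaustively. The Python sketch below is only a brute-force sanity check for this five-variable circuit, not how a SAT solver works; the function names are hypothetical.

```python
from itertools import product

def circuit(a, b, c):
    """Evaluate the two gates: d = a OR b, e = c AND d."""
    d = a or b
    e = c and d
    return d, e

def cnf(a, b, c, d, e):
    """Tseitin clauses for d = (a + b) and e = (c . d)."""
    clauses = [
        a or b or not d, (not a) or d, (not b) or d,      # d = a + b
        (not c) or (not d) or e, d or not e, c or not e,  # e = c . d
    ]
    return all(clauses)

def cnf_models():
    """All assignments to (a, b, c, d, e) that satisfy the CNF."""
    return [bits for bits in product([False, True], repeat=5) if cnf(*bits)]

def e_can_be_true():
    """Add the unit clause (e) and ask whether any model remains."""
    return any(bits[4] for bits in cnf_models())
```

The CNF has exactly one model per input combination, the one in which d and e take the values the gates compute, so asking whether e can be true reduces to a satisfiability query with the extra unit clause (e).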
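Boolean Transformation
[Figure: MulVAL proof graph with exploit nodes e1 to e3, privilege nodes p1 and p2, and configuration-setting nodes c1 to c7]
e1 : c1 ⋀ c2 ⋀ c3 ⇒ p1, as a clause: ¬c1 ⋁ ¬c2 ⋁ ¬c3 ⋁ p1
e2 : c4 ⋀ c5 ⋀ p1 ⇒ p2, as a clause: ¬c4 ⋁ ¬c5 ⋁ ¬p1 ⋁ p2
e3 : c6 ⋀ c7 ⋀ p1 ⇒ p2, as a clause: ¬c6 ⋁ ¬c7 ⋁ ¬p1 ⋁ p2
Φ = e1 ⋀ e2 ⋀ e3 (attack possibility constraints)
ψ = Φ ⋀ c3 ⋀ ¬p2 (c3 ⋀ ¬p2 is the policy requirement)
ψ is given to the zChaff SAT solver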
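The translation from proof-graph edges to clauses is mechanical. The sketch below builds ψ for the example in Python and checks it by exhaustive enumeration instead of calling zChaff; the data layout (literals as (variable, polarity) pairs) and the function names are assumptions made for illustration.

```python
from itertools import product

VARS = ["c1", "c2", "c3", "c4", "c5", "c6", "c7", "p1", "p2"]
# Exploit nodes from the proof graph: body conditions imply a privilege.
RULES = {
    "e1": (["c1", "c2", "c3"], "p1"),
    "e2": (["c4", "c5", "p1"], "p2"),
    "e3": (["c6", "c7", "p1"], "p2"),
}

def rule_to_clause(body, head):
    """b1 AND ... AND bk => h  becomes  (NOT b1) OR ... OR (NOT bk) OR h."""
    return [(v, False) for v in body] + [(head, True)]

def build_psi():
    phi = [rule_to_clause(body, head) for body, head in RULES.values()]
    policy = [[("c3", True)], [("p2", False)]]   # c3 required, p2 forbidden
    return phi + policy

def satisfies(clauses, assignment):
    return all(any(assignment[v] == polarity for v, polarity in clause)
               for clause in clauses)

def models(clauses):
    """Enumerate every satisfying assignment (fine for 9 variables)."""
    for bits in product([False, True], repeat=len(VARS)):
        assignment = dict(zip(VARS, bits))
        if satisfies(clauses, assignment):
            yield assignment
```

Every model of ψ disables at least one configuration setting: with c1 to c7 all true, e1 forces p1 and e2 then forces p2, which contradicts the policy requirement.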
MinCostSAT
Given ψ with n variables x1, x2, ..., xn, each with cost ci ≥ 0,
find an assignment X ∈ {0, 1}^n that satisfies ψ and minimizes
C = ∑ ci xi (sum over i = 1, ..., n)
MinCostSAT in network reconfiguration:
• Privilege variables incur cost when assigned true
• Configuration variables incur cost when assigned false
• Allow variables to be forced true or false
MinCostSAT
[Figure: example network with webServer and fileServer and attack steps labeled Remote exploit, buffer overrun, and NFS shell]

Privilege                        Variable   Cost
Execute code (file server)       p2         1000
Execute code (web server)        p1         50

Configuration setting            Variable   Cost
Access to web server             c1         100
Active service (web server)      c2         100
Active service (file server)     c4         50
Vulnerability (file server)      c5         20
File access on file server       c6         50
NFS table (file server)          c7         10

Minimal Cost Solution [total cost = 80]
Allow privileges on web server (p1)        50
Patch vulnerability on file server (c5)    20
Change NFS table settings (c7)             10
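The minimal-cost solution in the table can be reproduced by exhaustive search. The Python sketch below is a brute-force stand-in for a MinCostSAT solver and only works at this toy scale. It assumes the constraints e1 to e3 and the policy c3 ⋀ ¬p2 from the Boolean transformation, and it assigns c3 a cost of 0 because the slide lists none and c3 is forced true anyway.

```python
from itertools import product

PRIVILEGE_COST = {"p1": 50, "p2": 1000}            # charged when True
CONFIG_COST = {"c1": 100, "c2": 100, "c3": 0,      # charged when False
               "c4": 50, "c5": 20, "c6": 50, "c7": 10}  # c3 cost is assumed
RULES = [(("c1", "c2", "c3"), "p1"),
         (("c4", "c5", "p1"), "p2"),
         (("c6", "c7", "p1"), "p2")]
FORCED = {"c3": True, "p2": False}                 # policy requirement

def satisfies_psi(a):
    if any(a[v] != value for v, value in FORCED.items()):
        return False
    # Each rule: if every body variable is true, the head must be true.
    return all(a[head] or not all(a[v] for v in body) for body, head in RULES)

def cost(a):
    return (sum(c for v, c in PRIVILEGE_COST.items() if a[v])
            + sum(c for v, c in CONFIG_COST.items() if not a[v]))

def min_cost_solution():
    names = list(CONFIG_COST) + list(PRIVILEGE_COST)
    best = None
    for bits in product([False, True], repeat=len(names)):
        a = dict(zip(names, bits))
        if satisfies_psi(a) and (best is None or cost(a) < cost(best)):
            best = a
    return best, cost(best)
```

The search returns the assignment with p1 true and c5 and c7 false, at total cost 80. The next-best alternative, denying access to the web server (c1 or c2 false), costs 100.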
Scalability Testing
Size                      Cost func.   Num. of variables   Num. of clauses   Run time (sec)
100 hosts (10 subnets)    A            11,853              12,053            0.11
100 hosts (10 subnets)    B            11,853              12,053            0.21
250 hosts (25 subnets)    A            70,803              72,553            3.03
250 hosts (25 subnets)    B            70,803              72,553            6.49
Iterative UNSAT Core Elimination
• UNSAT core: a subset of the original CNF clauses that is unsatisfiable by itself
• For an unsatisfiable formula ψ and an UNSAT core μ = {μ1, μ2, ..., μn} ⊆ ψ, ψ remains unsatisfiable as long as μ remains unchanged
• To resolve, a user needs to decide relative values of only a few network components
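One way to see what a core looks like is deletion-based extraction on the running example. The Python sketch below is a generic textbook technique with a brute-force satisfiability check; it is not claimed to be the procedure used in the paper. It assumes the usability requirement asks for every configuration setting c1 to c7 to stay enabled, which is an assumption made here to get an unsatisfiable formula.

```python
from itertools import product

VARS = ["c1", "c2", "c3", "c4", "c5", "c6", "c7", "p1", "p2"]
# Clauses are lists of (variable, required value) literals.
CLAUSES = {
    "e1": [("c1", False), ("c2", False), ("c3", False), ("p1", True)],
    "e2": [("c4", False), ("c5", False), ("p1", False), ("p2", True)],
    "e3": [("c6", False), ("c7", False), ("p1", False), ("p2", True)],
    "not_p2": [("p2", False)],                     # security requirement
}
for i in range(1, 8):                              # assumed usability units
    CLAUSES[f"c{i}"] = [(f"c{i}", True)]

def is_satisfiable(clauses):
    clauses = list(clauses)
    for bits in product([False, True], repeat=len(VARS)):
        a = dict(zip(VARS, bits))
        if all(any(a[v] == value for v, value in cl) for cl in clauses):
            return True
    return False

def unsat_core(named_clauses):
    """Drop each clause in turn; keep it only if dropping restores SAT."""
    core = dict(named_clauses)
    assert not is_satisfiable(core.values())
    for name in list(named_clauses):
        trial = {k: v for k, v in core.items() if k != name}
        if not is_satisfiable(trial.values()):
            core = trial                           # clause was not needed
    return core
```

For this clause order the core is e1, e3, c1, c2, c3, c6, c7 and ¬p2, so the user is asked to trade off only those requirements; c4 and c5 never come up in this round.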
Iterative UNSAT Core Elimination
• Requires no up-front cost assignments; relies on human decisions as needed
• Further reduces user decisions by keeping a partial-ordering lattice that stores the relative priorities established by prior decisions
• When two variables with a known ordering appear in an UNSAT core, only the lower-priority variable is presented to the user
Open Problems
• How to come up with the numbers?
  – Monetary units?
  – How to estimate the costs?
• How to capture the difficulty level of attacks?
  – Do more difficult exploits reduce the risk?
  – Can this be done inline?
  – How about zero-day vulnerabilities?
• Scalability in production systems.
That’s it.
Questions?