
Reprinted from Proceedings of the International Lisp Conference, March 2009

Model-based Intrusion Assessment in Common Lisp

Robert P. Goldman

SIFT, [email protected]

Steven A. Harp

Adventium [email protected]

Categories and Subject Descriptors K [6]: 5

Keywords computer security, intrusion detection, report fusion, IDS fusion, IDS correlation, lisp

1. Introduction

We describe the Scyllarus system, which performs Intrusion Detection System (IDS) fusion, using Bayes nets and qualitative probability.1 IDSes are systems that sense intrusions in computer networks and hosts. IDS fusion is the problem of fusing reports from multiple IDSes scattered around a computer network we wish to defend, into a coherent overall picture of network status. Scyllarus treats the problem of IDS fusion as an abduction problem, formalized using Bayes nets and Knowledge-based Model Construction (KBMC). Because of the coarseness of the data available, Scyllarus uses a qualitative framework, based on System-Z+. Qualitative Bayes nets allow Scyllarus to exploit the strengths of probabilistic reasoning, without excessive knowledge acquisition and without committing to a misleading level of accuracy in its conclusions. The Scyllarus system gave excellent results on a medium-sized corporate network, where it was in continuous use for approximately four years, and was validated in a DARPA-funded assessment. Under US Federal government funding, we are now working to adapt Scyllarus to analyze detection reports from sensors monitoring very high speed (10 - 100 Gb/second) networks in a project called “SMITE.”

1 The Scyllarus system is named after the Mantis shrimp, an animal that detects its prey with one of the world’s most complex retinas. It is also one of the most formidable animals, for its weight, smashing its prey with heavily calcified clubs. “Mantis shrimp can break through aquarium glass with a single strike from this weapon.” (Wikipedia)

Common Lisp (CL) has provided significant benefits to the development and deployment of Scyllarus. Most basically, it enabled the assembly of an ambitiously complex system, which uses multiple inference techniques at different stages of its processing: clustering that is based partly on information encoded in its ontology and partly on CLOS methods, and data-dependency-based logical reasoning combined with cost-based search for explanations (using qualitative probability) in order to weigh explanations against each other. The Lisp garbage collector allows the Scyllarus analyzer to run online indefinitely, despite its continuous construction of complex graphs of interrelated events and entities. Lisp also provided a good framework for integrating the ontology we developed using the Protege ontology editor. Finally, the ability to debug and hot-patch our algorithms while in operation has proven invaluable. Nevertheless, all is not for the best in the best of all possible worlds; we also report some rough spots in our use of CL in what has been a relatively long-lived research software project (approximately 9 years).

1.1 Intrusion detection

The function of Scyllarus is to take reports from multiple intrusion detection algorithms and fuse them into a coherent picture of the state of the defended network (together with some information about the environment in which that network operates). To perform this task, Scyllarus uses Bayesian (probabilistic) reasoning, primarily to answer two (interrelated) questions:

1. Is the notification (are the notifications) that Scyllarus has received from the algorithms likely to reflect a false positive?

Goldman, Harp, ILC 2009 1 2009/5/13

2. Is there a benign explanation that can explain away the notification or notifications that Scyllarus has received? For example, a flood of SMTP messages with duplicated content from a particular host might be a sign that that host has been compromised and turned into a spam bot. However, it’s also possible that the host is a bona fide mailing list server, and it’s just sending out the day’s digest messages.

Because the domain does not afford us access to good statistics, we do not use conventional Bayesian reasoning. Instead, we use System-Z+ (Goldszmidt and Pearl 1992a,b, 1996), a qualitative abstraction of probabilistic reasoning that is very similar to the big-O scheme familiar to computer scientists.

Existing IDSes are not designed to work together, as part of a suite of sensors. Instead, each program generates a separate, and often voluminous, stream of reports, and fusing them into a coherent view of the current situation is left as an exercise for the user. Scyllarus overcomes the limitations of both individual IDSes, and unstructured groups of IDSes. Instead of simply joining together multiple alert streams, Scyllarus provides a unified intrusion situation assessment. Critical to this unification is Scyllarus’s Intrusion Reference Model (IRM), which contains information about the configuration of the site to be protected (including the IDSes), the site’s security policies and objectives, and the phenomena of interest (intrusion events).

Data reduction is a primary goal of Scyllarus. IDS owners, unable to absorb the massive stream of reports, regularly either ignore their IDSes or partially disable them. To get a sense of the gravity of this problem, see Figure 1, which shows how Scyllarus was able to winnow the flow of reports in a small corporate network.

Often the most damning weakness of an IDS is a high false positive rate. In general, with any sensor, one must pay in false positives for whatever is gained in sensitivity. One way to overcome this limitation is to assemble a suite of sensors. This can be a very efficient way to overcome the problem of false positives, as long as we can find sensors that fail relatively independently.

1.2 SMITE project

The Scyllarus project was begun at Honeywell in 1999 and has been intermittently active since then. Current development is being done in the context of the SMITE system, a BBN project funded by DARPA’s2 Scalable Network Monitoring (SNM) program. The SNM program’s goal is to develop new approaches to network-based monitoring that deliver performance capabilities orders of magnitude better than conventional approaches, regardless of the network’s size and computational burden. BBN’s approach deploys pipelined systems as data collectors on networks with multi-gigabit speeds. Special-purpose algorithms are being developed that are able to detect intrusion-relevant events while keeping up with the network flow. The events are aggregated and fused by Scyllarus. See Figure 2 for an overview of the SMITE system’s architecture. Current work on Scyllarus aims at optimizing it to be able to keep up with the flow of events from the hardware-based SMITE sensors, expected to cover 2 to 3 orders of magnitude more traffic.

2 DARPA is the U.S. Defense Advanced Research Projects Agency.

2. Scenario of Use

The malign explanation is not the only possible one, however. Figure 3 shows that there is an alternative, benign explanation. It is possible that what has really happened is that a new service has been installed on this host (“Legit Svc Added”); that would account for new ports being opened. However, if a new service was legitimately added, we would also expect to see a change in the system’s (overt) configuration, but we are not seeing that. On balance, the “compromised host” explanation is considered possible, but not especially likely.

Figure 4 shows how the situation might evolve with the arrival of more evidence for intrusion. Here we see that not only is the host in question accepting connections to a new port, but we have also seen that it is initiating a lot of connections outward, “Initiates Conns,” which we infer from reports from two sensors. Typically, we would not expect a server to be initiating outward connections.3 Intuitively, the pattern of inference is as follows: the new legitimate service explanation would account for the newly opened ports, but would not account for the connection initiation. However, a compromised host (perhaps a host that has been added to a botnet) would explain both symptoms.

3 With some exceptions such as DNS queries, SMTP transfers if a mail server, etc.

3. Scyllarus Architecture

The architecture of Scyllarus, divided into four modules, is depicted in Figure 5. The first is the input module, made up of the Sensor converters (or “verters”) and the Report concentrator. The verters take reports from IDSes and other sensors, translate them into Scyllarus-specific data structures, and hand them off to the report concentrator for eventual storage and analysis. The second is the Cluster Preprocessor (CP), which assembles sets of reports that could correspond to a single underlying event or process. The CP collects reports that could tend to either reinforce or disconfirm particular hypotheses. The CP builds structures that are similar to belief networks (Pearl 1988). The third component, and the last of the active components, the Event Assessor (EA), applies the logic of System-Z+ to evaluate competing explanations (e.g., mailserver versus spam bot) for the reports. The final core component of Scyllarus is the Intrusion Reference Model (IRM), a knowledge base describing the environment in which Scyllarus operates, and which supports the processing done by the CP and EA.

[Figure 1 combines a flow diagram, in which raw reports from IDS-1, IDS-2, and IDS-3 are clustered into events and then filtered by evidence analysis into uninteresting, interesting, and believable interesting events, with a log-scale plot over the days of November 2001 of the counts of IDS reports, events, all plausible events, events of medium/high plausibility and medium/high severity, and events of high plausibility and high severity.]

Figure 1. Scyllarus workload reduction.

4. Scyllarus input processing

The Scyllarus architecture was developed to flexibly accommodate reports from a diverse and changing set of sensors (primarily IDSes). One may plug arbitrary sets of translators into Scyllarus. These “verters” are small translator programs that translate IDS reports, which do not come in standardized formats,4 into a standard Scyllarus input report. The verters must be written anew for each IDS, but the effort is not too substantial. For the SMITE project, we have the advantage of a standard report format (implemented as a reporting library to be compiled into each of the sensors) negotiated between the sensor and correlation teams, so that we need only a single SMITE verter.

4 Even when we found sensors that complied with some standard, such as IDMEF, the standard wasn’t helpful, because it did not specify semantics sufficiently to allow us to simply accept the reports.

Figure 2. The SMITE system architecture, including Scyllarus as “Event Correlation Analyzer.”

After the reports have been translated into Scyllarus format, they pass from the verters to the Report Concentrator. The Report Concentrator receives incoming reports from IDSes and buffers the reports, ensuring that the system remains responsive while not losing data. The Report Concentrator provides a real-time feed of reports to subscribers, the most important of which are the event database and the Cluster Preprocessor.

5. Cluster Preprocessor

The Cluster Preprocessor reads raw reports posted by the IDSs and, using background information provided by the IRM (a model of the protected network and a key for interpreting IDS messages), produces clusters of IDS reports to be evaluated as events explaining the reports. A single IDS report may give rise to one or more such clusters.

The Cluster Preprocessor follows a simple processing loop:

1. Read the next IDS report from a socket stream connected to the Scyllarus Report Concentrator.

2. Match the IDS-provided report type to one or more interpretations known to Scyllarus. Each provides a hypothetical event purporting to explain the report. Scyllarus has models for various common network and host-based IDSs.

3. Search through already hypothesized events for ones that would explain each interpretation of the new report. Criteria for consistency vary from one type of event to another, and are specified in the IRM as a set of “event test” objects to be satisfied. Some basic criteria include:

• occurrence within an acceptable temporal window

• directed at the same target host and/or port

• apparently originating from the same source

• sharing a common user or login session

4. Propose new events as needed when existing ones are inconsistent with the new report.

5. Assemble new (or recently modified) events into other larger-scale events. This allows Scyllarus to consider multi-step attacks. Further model-based tests for consistency are applied to this clustering.

6. Submit new or modified events and their supporting reports for evaluation.

[Figure 3 is a network diagram whose nodes include Compromised Host, Legit Svc Added, Rootkit, Sending SPAM, Exfiltrating Data, Scanning, Initiates Conns, Server Adds Ports, System Config Change, High Entropy External, High Flow to Outside, Server to Client, Passive FTP, Too Many Peers, Incoming Malware, and Infected, with annotations such as Possible, OBSERVED, and Unobserved.]

Figure 3. Alternative explanations and the observations they might cause.
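Steps 2 through 4 of the loop can be sketched in a few lines. The following Python is only an illustration, not the Scyllarus implementation (which is written in Common Lisp and driven by the IRM): the `Report` and `Event` classes, the `INTERPRETATIONS` table, the `WINDOW_SECONDS` value, and the two consistency criteria used (same source, temporal window) are all assumptions made for the example.

```python
from dataclasses import dataclass, field

@dataclass
class Report:
    report_type: str
    time: float      # seconds since some epoch
    source: str      # apparent originating host

@dataclass
class Event:
    event_type: str
    reports: list = field(default_factory=list)

# Hypothetical stand-in for the IRM: report type -> candidate event types.
INTERPRETATIONS = {"smtp-flood": ["spam-bot", "mailing-list-server"]}
WINDOW_SECONDS = 300.0  # hypothetical temporal window

def consistent(event, report):
    """Two of the basic criteria: same apparent source, acceptable time window."""
    last = event.reports[-1]
    return (report.source == last.source
            and abs(report.time - last.time) <= WINDOW_SECONDS)

def cluster(report, events):
    """Attach the report to a consistent event per interpretation, or propose one."""
    touched = []
    for event_type in INTERPRETATIONS.get(report.report_type, []):
        match = next((e for e in events
                      if e.event_type == event_type and consistent(e, report)), None)
        if match is None:              # step 4: propose a new event
            match = Event(event_type)
            events.append(match)
        match.reports.append(report)
        touched.append(match)          # step 6 would submit these for evaluation
    return touched
```

Under these assumptions, two floods from one host within the window join the same pair of hypothesized events, while a flood from a different host proposes a new pair.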

The report stream (as well as other communications with Scyllarus) is SSL-encrypted on Lisp platforms that support it. This preserves the confidentiality and integrity of the system, and it also avoids incidentally triggering spurious secondary alerts from network IDSs that may see the Scyllarus reporting traffic on the wire.

Identifying Independent Subsets of Events. The Cluster Preprocessor is driven entirely by incoming reports. It spools events that need likelihood evaluation to the EA, but does not halt clustering to wait for assessment to complete, since this evaluation time may be relatively long: extracting the most likely interpretations from a very large ATMS network may take seconds.

Instead, the EA runs in a separate thread and evaluates independent clusters of events as its processing budget allows. The fundamental independence criterion is that the set of events implicitly defines a directed acyclic graph. Events are linked to other events and to reports according to the following relationships:

Supporters E → R. This is a relationship between an event and an IDS report that provides direct evidence for it. For example, a certain network IDS rule that is triggered by a sequence of bytes commonly found in propagation of the Peacomm trojan could support an event hypothesizing the malware infection of the target host with Peacomm.

Components E(whole) → E(component). This is a relationship between events and other events that might be component parts of them. For example, one component of a DNS cache poisoning attack is the sending of a flood of DNS queries.

Manifestations E(underlying) → E(manifestation). This is a relationship between an event and other events that might occur because the first event is occurring. For example, a worm’s propagation might manifest as repeated content transmission from the attacking host.

Specializes E(specific) → E(more general). Different IDS algorithms operate at different levels of resolution. This link is a relationship between one proposed event and a more specific proposed event (e.g., induced by a more precise type of IDS) that could be identical.

[Figure 4 repeats the network diagram of Figure 3, with additional nodes marked OBSERVED and one node marked “Likely!”.]

Figure 4. More evidence of intrusion arrives.

The graph is defined by the closure under the above four links (and their inverses) of the set of events given to the assessor. Typically, this graph will have many connected components that are not connected with each other. The connected components are the independent subsets that the EA operates on separately.

Scyllarus finds connected components using a standard depth-first graph search using the above link types. Additional link types could be added, if that were desirable, as long as the graph remained acyclic. For incremental assessment, the search for connected components is slightly different, as discussed in the section devoted to incremental assessment.
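As a minimal sketch of this step, the Python below finds connected components with an iterative depth-first search. It assumes the four link types have already been flattened into plain `(node, node)` pairs and treats every link as traversable in both directions, matching the closure under links and their inverses; the function name and the data layout are illustrative, not taken from Scyllarus.

```python
from collections import defaultdict

def independent_subsets(events, links):
    """Group events (and the reports they link to) into connected components.

    events: iterable of node identifiers given to the assessor.
    links:  iterable of (a, b) pairs; direction is ignored.
    """
    neighbors = defaultdict(set)
    for a, b in links:
        neighbors[a].add(b)
        neighbors[b].add(a)

    seen = set()
    components = []
    for start in events:
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:                       # iterative depth-first search
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            component.add(node)
            stack.extend(neighbors[node] - seen)
        components.append(component)
    return components
```

Each returned set could then be handed to the assessor separately, since no link crosses between sets.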


[Figure 5 is a block diagram. Sensor Converters feed intrusion reports to the Report Concentrator, which provides a real-time report feed to the Cluster Preprocessor and to the Report/Event Archive (archive and replay). The Event Assessor passes events and updates to the Event Distributor, which provides a real-time event feed to the RT Java Console and IDMEF correlation reports to other correlators or analyzers. The Intrusion Reference Model answers queries and holds static information, events, and reports.]

Figure 5. Scyllarus architecture

6. Event Assessor

The Event Assessor uses qualitative probabilistic/Bayesian reasoning to assess the likelihood of various event hypotheses. In the current Scyllarus architecture, the EA is invoked by the Scyllarus Cluster Preprocessor. The EA accepts as input clustered event hypotheses, together with their supporting reports. The EA builds qualitative probabilistic inference networks corresponding to the clustered reports and events. It uses these networks to compute posterior surprise levels (qualitative likelihoods) for the event hypotheses. These surprise levels are recorded in the event structures, and may be written into the IRM database for persistent storage.

The EA must perform four primary computational tasks:

1. Identify independent sub-graphs in the network defined by the event and report structures and the links between them. This is done with depth-first search.

2. Build Bayes networks and identify evidence interpretations in these networks. The Bayes nets are implemented as ATMS dependency networks. The ATMS computes interpretations corresponding to the Bayes networks using its labeling algorithms.

3. Extract the set of most likely interpretations from an ATMS network. This is done using search algorithms. We search for interpretations of minimal cost. The solution used is primarily one of depth-first iterative deepening, although some special cases are handled differently.

4. Extract surprise levels from the most likely interpretations. Currently, we simply differentiate between three classes of events: plausible events, which appear in some of the most likely interpretations; unlikely or implausible events, which do not appear in any of the most likely interpretations; and likely events, which are plausible events whose negation, additionally, never appears in a likely interpretation. That is, for a plausible event, E, it is also possible that not(E) is plausible. An event E is likely if E is plausible and not(E) is implausible. Extracting surprise levels may simply be done by examining the interpretations generated in step 3.
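The three-way classification in step 4 can be illustrated with a short Python sketch. The representation is an assumption made for the example: each most likely interpretation is a set of signed literals `(event, True)` or `(event, False)`, and the function name `classify_events` is hypothetical.

```python
def classify_events(events, interpretations):
    """Label each event 'likely', 'plausible', or 'unlikely'.

    interpretations: the most likely interpretations, each a set of
    (event, truth_value) literals. This layout is assumed for illustration.
    """
    labels = {}
    for event in events:
        appears = any((event, True) in interp for interp in interpretations)
        negation_appears = any((event, False) in interp for interp in interpretations)
        if not appears:
            labels[event] = "unlikely"     # in none of the best interpretations
        elif negation_appears:
            labels[event] = "plausible"    # both E and not(E) are plausible
        else:
            labels[event] = "likely"       # E plausible, not(E) implausible
    return labels
```

For instance, if two best interpretations both assert a spam bot but disagree about a mail server, the spam bot is classed as likely and the mail server as merely plausible.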


6.1 Underlying Theory: System-Z+ Qualitative Probability

We have taken an approach, based on qualitative probabilities, that shares the basic structure of normal probability theory but abstracts the actual probabilities used. We did this primarily to simplify knowledge acquisition and make it as simple as possible to incorporate new IDSes into the Scyllarus architecture. This approach may also permit cheaper computations than the normal probability calculus, but that remains to be seen.

Our approach is based on System-Z+, developed by Moises Goldszmidt and Judea Pearl (1996). In System-Z+, events are given a natural number rank, κ, that corresponds to their degree of surprise (e.g., a rank of one is more surprising than zero). The semantics of this scheme comes from a set of probability distributions in which the probabilities are polynomials in some infinitesimal ε. In this scheme, the κ rank corresponds to the exponent of the leading term of the polynomial. The scheme is similar to the “big-O” notation used for evaluating computational complexity in computer science.

In practical terms, the effect of this semantics is to give System-Z+ a qualitative flavor by providing a “ladder” of events of qualitatively different orders of likelihood. Of course, we sacrifice exactness in doing so; we lose the ability to talk about events being slightly more or less likely. However, this sacrifice of exactness is not an issue in the Scyllarus intrusion detection application.

The EA must combine the judgments of a wide variety of intrusion detection systems (and potentially other relevant information sources) that use widely varying sources of information and algorithms. Further, in general we will not have access to the internals of these sensors. In such an environment, it is not realistic to expect good models of the response of these sensors; in particular, exact measures of P(sensor response | event) are not available. There have been some attempts to investigate sensor response (e.g., the studies conducted by Lincoln Labs (Lippmann et al. 2000)), but the results seem heavily dependent on the context in which the sensors are deployed.

The issue of prior probabilities also militates against the use of exact probabilities. In order to use an exact Bayesian method, we would need not only the detection probability, P(sensor response | event), and the false alarm probability, P(sensor response | ¬event), but also P(event), a measure of the prior probabilities of the events that interest us, in this case the attacks and the benign events that can cause false positives. Even in the most constrained environments, the probabilities of the various attacks are unlikely to be available to us, and the Scyllarus system is designed for application across a wide variety of enterprises. Further, the probability distributions for benign events are likely to be of odd forms (e.g., one’s own network-mapping software runs at particular times of the day). So our solution must tolerate vague measures of likelihood.

Finally, in this domain, as with most practical applications of probabilistic updating, the effect of the evidence will usually overwhelm the effect of the prior likelihoods (e.g., Pradhan et al. 1996). So inexactitude in the quantities specified will not matter to our final conclusions.

As far as computation is concerned, we may apply the normal operations of probability theory: conditionalization, Bayes’ law, etc. However, the arithmetic operations we use must change. Rather than multiplying probabilities, we add degrees of surprise. Rather than adding probabilities, we use min. Goldszmidt and Pearl (1996, p. 59) provide the following substitutions in their paper:

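\[
\begin{array}{ll}
P(\omega) = \sum_{\phi \in \omega} P(\phi) & \kappa(\omega) = \min_{\phi \in \omega} \kappa(\phi) \\
P(\omega) + P(\neg\omega) = 1 & \kappa(\omega) = 0 \;\lor\; \kappa(\neg\omega) = 0 \\
P(\omega \mid \phi) = P(\omega \land \phi) / P(\phi) & \kappa(\omega \mid \phi) = \kappa(\omega \land \phi) - \kappa(\phi)
\end{array}
\]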

Instead of the probability of an event being the sum of the probabilities of the primitive outcomes that make up that event, the degree of surprise of an event is the minimum of the degrees of surprise of the primitive outcomes that make it up. Instead of having the probabilities of mutually exclusive and exhaustive events sum to one, at least one of a set of mutually exclusive and exhaustive events must be unsurprising. Finally, we have an analog of Bayes’ law in which the normalizing operation consists of subtraction rather than division.
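A small worked example may make the substitutions concrete. The Python below applies them to the SMTP-flood scenario from Section 1.1; the three mutually exclusive hypotheses and every κ value are invented for illustration and are not taken from the Scyllarus knowledge base.

```python
def kappa_event(outcome_kappas):
    """Surprise of an event: the minimum over its outcomes (replaces summation)."""
    return min(outcome_kappas)

def kappa_conditional(kappa_joint, kappa_given):
    """Analog of Bayes' law: normalize by subtraction instead of division."""
    return kappa_joint - kappa_given

# Hypothetical prior ranks; one exhaustive alternative must have rank 0.
k_bot, k_server, k_neither = 2, 1, 0
# Hypothetical ranks for observing an SMTP flood under each hypothesis.
k_flood_given_bot, k_flood_given_server, k_flood_given_neither = 0, 0, 3

# Multiplying probabilities becomes adding ranks.
k_flood_and_bot = k_bot + k_flood_given_bot
k_flood_and_server = k_server + k_flood_given_server
k_flood_and_neither = k_neither + k_flood_given_neither

# Summing probabilities becomes taking the minimum.
k_flood = kappa_event([k_flood_and_bot, k_flood_and_server, k_flood_and_neither])

k_bot_given_flood = kappa_conditional(k_flood_and_bot, k_flood)
k_server_given_flood = kappa_conditional(k_flood_and_server, k_flood)
```

With these made-up ranks the flood itself has rank 1, the mailing-list server becomes unsurprising (rank 0) once the flood is seen, and the spam bot remains one step more surprising (rank 1).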

We used Bayesian networks to help us in modeling and solving the correlation problem. Bayesian networks are ways of graphically capturing probabilistic reasoning. They are useful in expert systems because they simplify knowledge acquisition and, by capturing (conditional) independences, simplify computation (Pearl, 1988). In particular, in the domain of intrusion detection, Bayes nets help us capture several important patterns of probabilistic reasoning:


• Reasoning based on evidence merging;

• “Explaining away” reports by alternative explanations. E.g., if a benign event accounts for a number of reports, those reports will be explained away, and no longer provide support for more alarming hypotheses.

• Abstraction reasoning that employs the subclass/superclass relationships in the event dictionary.

• Part/whole reasoning, to recognize complex composite events.

• Distinguishing between judgments that are based on different sensor bases and those that use the same sensor. This helps us distinguish between cases when two sensors provide support for each other and when we simply have redundant reports (e.g., two network intrusion detection systems using exactly the same algorithm that see the same traffic, at two different points).

A Bayesian network is a directed, acyclic graph (DAG) depicting a set of random variables. Edges between nodes in the DAG represent causal influences. Using a Bayesian network, we can capture a joint distribution factorized into unconditional probabilities for root nodes and conditional probability tables for non-root nodes. The conditional probability tables contain probability distributions for the child nodes, conditioned on all the values of their parents. For a thorough, but readable, introduction to Bayesian networks, we recommend Charniak’s (1992) “Bayesian Networks Without Tears.”

There are a number of efficient algorithms for finding the posterior distributions of Bayesian networks, conditional on observations of some of the random variables. These algorithms may readily be adapted to provide posterior κ rankings instead of probabilities.
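To illustrate posterior κ rankings on a network, here is a brute-force sketch in Python. It is not one of the efficient algorithms mentioned above: it simply enumerates every complete assignment of a four-node network loosely modeled on the scenario of Figures 3 and 4. The variable names, the network shape, and all κ values are assumptions chosen for the example.

```python
from itertools import product

VARIABLES = ("compromised", "legit_svc", "adds_ports", "initiates_conns")
WORLDS = [dict(zip(VARIABLES, values))
          for values in product((True, False), repeat=len(VARIABLES))]

def world_kappa(w):
    """Rank of one complete assignment: the sum of the local ranks."""
    k = 2 if w["compromised"] else 0          # prior: compromise is surprising
    k += 1 if w["legit_svc"] else 0           # prior: a new service is mildly surprising
    if w["compromised"] or w["legit_svc"]:    # either cause accounts for new ports
        k += 0 if w["adds_ports"] else 1
    else:
        k += 3 if w["adds_ports"] else 0
    if w["compromised"]:                      # only compromise accounts for outbound conns
        k += 0 if w["initiates_conns"] else 1
    else:
        k += 2 if w["initiates_conns"] else 0
    return k

def kappa(assignment):
    """Rank of a partial assignment: the minimum over the worlds that agree with it."""
    return min(world_kappa(w) for w in WORLDS
               if all(w[name] == value for name, value in assignment.items()))

def posterior_kappa(query, evidence):
    """kappa(query | evidence) = kappa(query and evidence) - kappa(evidence)."""
    return kappa({**query, **evidence}) - kappa(evidence)
```

With only the new ports observed, `posterior_kappa({"compromised": True}, {"adds_ports": True})` evaluates to 1 under these ranks, while the legitimate-service hypothesis has rank 0. Adding the outbound connections as evidence reverses the two: compromise drops to rank 0 and the legitimate service rises to rank 1.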

6.2 System-Z+ and the ATMS

The Scyllarus Event Assessor (EA) does System-Z+ Bayes net inference by representing the Bayes nets in an Assumption-based Truth Maintenance System (ATMS) with weighted assumptions, and finding minimum cost environments for the ATMS networks. We adopted the ATMS approach simply because the ATMS code was readily available, and we expected later to replace the ATMS with a special-purpose System-Z+ Bayes net evaluator. However, with the exception of some pathological cases, which we handle specially, System-Z+ inference has never been a bottleneck in Scyllarus.

An ATMS (deKleer 1986) is a propositional logic database with data dependencies, or justifications, that record the derivation of the literals from distinguished assumptions. An ATMS network is a directed hypergraph whose vertices are a set of literals, L. Among the literals are a distinguished contradictory node, ⊥, and a subset of assumptions, A ⊆ L. The justifications are a set of hyperedges, J ⊆ 2^L × L, whose tails are sets of literals, the justifiers, and whose head is a literal, the justificand. Each justification is a boolean constraint indicating that the justifiers entail the justificand. Using the justifications, the ATMS uses a boolean constraint propagation algorithm to compute a labeling for the set of literals. Each literal is labeled with a set of environments. Each environment, E_i ⊆ A, is a minimal set of assumptions that, taken together, entails the literal. Justifications whose justificand is the ⊥ node are used to identify inconsistent environments. We have used the ATMS code supplied in Forbus and deKleer’s textbook (Forbus and deKleer 1993).
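The labeling idea can be shown with a naive fixpoint computation. The Python below is a teaching sketch only: it is not the incremental label-update algorithm of the Forbus and deKleer ATMS, and the function names, the `"FALSE"` marker for ⊥, and the tuple layout of justifications are all assumptions made for the example.

```python
from itertools import product

def minimal(envs):
    """Keep only environments that are not proper supersets of another."""
    envs = set(envs)
    return {e for e in envs if not any(other < e for other in envs)}

def compute_labels(assumptions, justifications, contradiction="FALSE"):
    """Return (labels, nogoods) for a small justification network.

    justifications: list of (justifiers, justificand) pairs, where
    justifiers is a tuple of node names.
    """
    labels = {a: {frozenset([a])} for a in assumptions}
    nogoods = set()
    changed = True
    while changed:                      # iterate to a fixpoint
        changed = False
        for justifiers, justificand in justifications:
            parts = [labels.get(node, set()) for node in justifiers]
            # One candidate environment per way of picking an
            # environment for every justifier.
            candidates = {frozenset().union(*combo) for combo in product(*parts)}
            if justificand == contradiction:
                updated = minimal(nogoods | candidates)
                if updated != nogoods:
                    nogoods, changed = updated, True
                continue
            old = labels.get(justificand, set())
            new = {e for e in minimal(old | candidates)
                   if not any(ng <= e for ng in nogoods)}
            if new != old:
                labels[justificand], changed = new, True
    # Remove environments that were later found to be inconsistent.
    for node in labels:
        labels[node] = {e for e in labels[node]
                        if not any(ng <= e for ng in nogoods)}
    return labels, nogoods
```

Given assumptions for a root node and its negation plus a justification from both to the contradiction marker, the pair is recorded as a nogood and never appears inside any label.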

ATMSes can be used to encode Bayes networks (Charniak and Goldman 1988; Provan 1989). Each value assignment to a random variable in the Bayes net is represented by a literal. Each conditional or unconditional probability in the Bayes net is represented by an assumption. For example, in a Bayes net with the (boolean) nodes A, B, C and edges A → C, B → C, there will be literals $l_A, l_B, l_C, l_{\bar A}, l_{\bar B}, l_{\bar C}$ and assumptions:

\[
\begin{array}{l}
a_A,\ a_{\bar A},\ a_B,\ a_{\bar B}, \\
a_{C|AB},\ a_{C|A\bar B},\ a_{C|\bar A B},\ a_{C|\bar A \bar B}, \\
a_{\bar C|AB},\ a_{\bar C|A\bar B},\ a_{\bar C|\bar A B},\ a_{\bar C|\bar A \bar B}
\end{array}
\]

There will also be justifications representing the probabilistic entailments. For example, $\langle a_A \rightarrow l_A \rangle$ and $\langle a_{\bar A} \rightarrow l_{\bar A} \rangle$ illustrate the representation of root nodes and unconditional priors in the Bayes net. The conditional probabilities of internal nodes are represented using justifications like these:

\[
\langle l_A, l_B, a_{C|AB} \rightarrow l_C \rangle,\ \langle l_A, l_{\bar B}, a_{C|A\bar B} \rightarrow l_C \rangle,\ \ldots
\]

We also have justifications to ensure consistency, e.g.: $\langle a_A, a_{\bar A} \rightarrow \bot \rangle$.

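With the above encoding, the ATMS algorithm will compute labelings that represent the prior probabilities in the Bayes network. In the above example, the literals will be labeled as follows:

\[
\begin{array}{ll}
l_A & \{\{a_A\}\} \\
l_{\bar A} & \{\{a_{\bar A}\}\} \\
l_B & \{\{a_B\}\} \\
l_{\bar B} & \{\{a_{\bar B}\}\} \\
l_C & \{\{a_A, a_B, a_{C|AB}\},\ \{a_A, a_{\bar B}, a_{C|A\bar B}\},\ \{a_{\bar A}, a_B, a_{C|\bar A B}\},\ \{a_{\bar A}, a_{\bar B}, a_{C|\bar A \bar B}\}\} \\
l_{\bar C} & \ldots
\end{array}
\]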

From the above labelings we can recover the prior probabilities by multiplying together the conditional and unconditional probabilities associated with the assumptions in each environment, and summing over the environments in a label. Posterior probabilities can be computed by collecting the set of environments consistent with the observation set and normalizing the prior probabilities accordingly.

Computing System-Z+ κ values may be done in a similar way, with some differences to account for the differences in the calculi. If we wish to find posterior κs for a set of observations ω, we must find the set of minimum-cost environments consistent with ω. Unfortunately, this cannot simply be done by combining the labels of the literals in ω. For one thing, the environments in the labels for the different literals are not guaranteed to be independent.

We may find the environments we want using a search algorithm, whose search states are pairs of environments (sets of assumptions) and sets of reports.

1. let R be the set of reports; let O be the open list

2. O := list(〈∅, R〉)

3. choose s = 〈E, R〉 from O

4. if R = ∅ then s is a solution

5. choose a report, r ∈ R

6. for each environment, E′ ∈ label(r):

7. unless nogood(E′ ∪ E) add 〈E′ ∪ E, R − {r}〉 to O.
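The search above can be sketched as runnable Python. This version uses a simple best-first (uniform-cost) strategy with a priority queue, chosen for brevity; it is neither the A∗ nor the iterative-deepening variant. The function name, the dictionary layout for labels and κ weights, and the assumption that an environment’s cost is the sum of the κ values of its assumptions are all illustrative.

```python
import heapq
from itertools import count

def min_cost_environments(labels, kappa, nogoods=()):
    """Find every minimum-cost environment that accounts for all reports.

    labels:  dict mapping each report to a list of environments (frozensets).
    kappa:   dict mapping each assumption to its degree of surprise.
    nogoods: iterable of frozensets known to be inconsistent.
    Returns (cost, set_of_environments), or (None, set()) if none exists.
    """
    def cost(env):
        return sum(kappa[a] for a in env)

    def is_nogood(env):
        return any(ng <= env for ng in nogoods)

    tie = count()  # tie-breaker so the heap never has to compare frozensets
    open_list = [(0, next(tie), frozenset(), frozenset(labels))]
    best, solutions = None, set()
    while open_list:
        c, _, env, remaining = heapq.heappop(open_list)
        if best is not None and c > best:
            break                        # every remaining state costs more
        if not remaining:                # all reports explained: a solution
            best = c
            solutions.add(env)
            continue
        report = min(remaining)          # deterministic choice of a report
        for label_env in labels[report]:
            merged = env | label_env
            if not is_nogood(merged):
                heapq.heappush(
                    open_list,
                    (cost(merged), next(tie), merged, remaining - {report}))
    return best, solutions
```

Because adding assumptions never lowers the cost, the first solution popped has minimum cost, and the loop keeps going only long enough to collect the ties.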

The set of environments produced by this search algorithm is sufficient to partition the set of events into likely, plausible, and unlikely subsets: An event is likely if it is entailed by all the minimum-κ environments. An event is plausible if both it and its negation appear in some minimum-κ environments. An event is unlikely if it appears in none of the minimum-κ environments.

Some notes are worth making: First, we must find all the minimum-cost (minimum κ) solution environments. We are free to choose whatever search method we wish to execute the above search. Initially we used a naive A∗ algorithm; later we switched to depth-first iterative deepening in order to avoid excessive memory requirements. For greater efficiency, we perform inference on individual closed subgraphs of the ATMS network, rather than the entire network.

The CP builds networks of events and reports with the following structures:

Event → report links. Associated with each report is a set of events that it supports (i.e., provides evidence for).

Event → event part-of links. There are a limited number of exploits that have distinguished parts that can be detected independently.

Event → event specialization links. Because the different IDSes have different classifications of events, it is possible that one IDS will report an event E, while another will report an event E′ where E′ is a more specific event class than E.

To summarize, we build an ATMS network that, for each record, considers the possibility that the record corresponds to a true detection or a false positive. We also represent the causal and logical relations among the different events in the set of events. Note that this algorithm can be used to create an ATMS network either for the full set of events in the database or for any closed subset of the set of events, for incremental reasoning.

We found two special cases that were challenging for EA inference. One was the simple case of an ATMS network with only a single event. While this is a trivial case of inference, it caused problems when there were many reports (in some cases, thousands of reports). We wrote a special-purpose System-Z+ solver for such networks, significantly improving throughput. Another optimization we made was to filter symmetrical environments out of the search; again this provided significant inference speedups.

7. Intrusion Reference Model

The process of knowledge-based model construction is driven largely by extensive models of existing IDSes.

Goldman, Harp, ILC 2009 10 2009/5/13

Figure 6. Protege screenshot, showing the Scyllarus IRM. (Panels: Classes; Instances; Attributes of an instance.)

The task of putting the various sorts of IDS reports on a common semantic footing has proved more challenging than expected. There is little consistency in terminology between (or even within) IDSes and often quite different principles of detection are employed, making nominally similar messages less than fully comparable.

An extensive ontology for expressing the IRM has evolved over several versions of Scyllarus to become the foundation of our approach to this problem. All of the IRM concepts are expressed in this modular ontology, maintained in the Protege (Noy et al. 2001) tool. Part of this ontology is used to model the protected computers and network, while other parts are devoted to the characteristics of the defenses. In particular, an IDS ontology module exists in the IRM for each sort of IDS supported. These models pertain both to hypothesis formation and evaluation, so we will discuss them briefly here.

Each type of IDS is capable of emitting a range of different reports describing some aspect of a possible intrusion it has observed. Commonly these different messages will be derived from different discrete elements such as rules or detection modules within the IDS. Some have a repertoire of just a few messages (e.g. firewalls) while others have thousands (e.g. Snort). Scyllarus maintains an explicit model of each message-generating element.

An IDS report will usually mention three sorts of details. First, it will almost always contain an identifying string that describes the generic exploit or vulnerability implicated. An example is the report generated by the Snort rule with SID 1635, “POP3 APOP overflow attempt”, which flags an attempt to exploit a vulnerability in an XMail POP server, allowing the attacker to execute arbitrary commands by overflowing an internal buffer.

This sometimes comes with a standardized citation of a cataloged vulnerability, such as an identifier from the Common Vulnerabilities and Exposures (CVE) list. The aforementioned vulnerability has such a number, CVE-2000-0841. However, many IDS reports do not have such an association, or the association is not unique, and we have had to rely on the vendor-specific identifiers. In cases where this identifier has not been unique, we have modified it to make it so, since we are interested in the performance characteristics of individual rules or detection modules. Cases of differential performance within a single type of IDS have arisen: for example, multiple signatures that diagnose the presence of a particular computer virus, wherein some rules are highly specific while others are prone to be triggered by normal traffic.

Two other details that nearly all IDS reports provide are the identifiers of the initiator and intended victim of the attack. This may include a particular host identifier, a network address, a user name, process identifier, etc., depending on the type of IDS and nature of the observation. In network IDSes the designation of source and destination in the observed network packets usually, but not always, corresponds to the attacker and victim. Sometimes the easiest way to observe an attack is by detecting the response of the victim, thus reversing the usual network order. Scyllarus uses the IDS models to normalize these designations.
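The normalization step can be pictured with the following sketch. It assumes, purely for illustration, that the IDS model marks each rule that fires on the victim's response; the structure, the rule identifier, and the function name are hypothetical and do not come from Scyllarus.

```lisp
;; Hypothetical sketch: structure, rule ids, and names are assumptions.
(defstruct raw-report rule-id source destination)

(defparameter *victim-response-rules* '(:hypothetical-reply-rule)
  "Rules that detect the victim's response rather than the attack itself.")

(defun normalize-roles (report)
  "Return two values, attacker and victim, swapping the packet's source
and destination when the rule observes the victim's response."
  (if (member (raw-report-rule-id report) *victim-response-rules*)
      (values (raw-report-destination report) (raw-report-source report))
      (values (raw-report-source report) (raw-report-destination report))))
```

A report from an ordinary rule keeps source as attacker and destination as victim, while a report from a response-observing rule has the two swapped before fusion.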

Each message-generating element in a supported type of IDS has one or more models in the IDS-specific ontology module. These models, which we call report signatures, are causal interpretations of the report in terms of the ontology. A signature is a template that describes a possible attack or other event that may give rise to the given sort of report. It uses several extensive hierarchies of IRM concepts to do so. The first is a taxonomy of operations. Operations are elementary actions on the protected system that may be undertaken for good or ill, such as reading a file, starting a process, or executing a step in a protocol. Scyllarus has an a-kind-of hierarchy listing hundreds of operations. A different IRM taxonomy models the possible intentions of a causal agent. The intent may be specified in a report signature to cast a benign or malevolent interpretation of the operation. The intent classification also provides a rough measure of the seriousness of the event. Modeled intentions range from specific sorts of denial of service and the seizing of privileges to administrative (wholesome) intentions such as “achieve file-server archival backup to tape.”

Many signatures also cite specific victim software or systems, vulnerabilities (Scyllarus incorporates CVE and other classification schemes), or certain “malware” (malicious programs) that are implicated. An important part of the signature model is the qualitative false positive rate of the generating element, and the presumed rarity of the interpretation. Various other details of the representation are omitted here for brevity.
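A report signature of the kind described above might be sketched as follows. Everything in this block other than the Snort SID and the CVE number quoted earlier is invented for illustration: the slot names, the operation and intent symbols, the tiny taxonomy, and the κ values are assumptions, and the real IRM lives in a Protege ontology rather than in Lisp literals.

```lisp
;; Hypothetical sketch: slot names, taxonomy entries, and kappa values
;; are illustrative assumptions, not contents of the Scyllarus IRM.
(defparameter *operation-parents*
  '((pop3-apop-command  . pop3-protocol-step)
    (pop3-protocol-step . protocol-step)
    (protocol-step      . operation))
  "A tiny a-kind-of hierarchy: each entry maps a concept to its parent.")

(defun a-kind-of-p (child ancestor)
  "True when CHILD is ANCESTOR or lies below it in the hierarchy."
  (loop for node = child then (cdr (assoc node *operation-parents*))
        while node
        thereis (eq node ancestor)))

(defstruct report-signature
  generator              ; the message-generating element
  operation              ; concept from the operations taxonomy
  intent                 ; concept from the intentions taxonomy
  vulnerability          ; e.g. a CVE identifier, when one exists
  false-positive-kappa   ; qualitative false positive rate of the generator
  rarity-kappa)          ; presumed rarity of this interpretation

(defparameter *apop-signature*
  (make-report-signature
   :generator '(:snort 1635)
   :operation 'pop3-apop-command
   :intent 'seize-privileges
   :vulnerability "CVE-2000-0841"
   :false-positive-kappa 2
   :rarity-kappa 3))
```

A query such as `(a-kind-of-p (report-signature-operation *apop-signature*) 'protocol-step)` then lets a report about a specific operation match hypotheses stated in terms of a more general one.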

8. Experience

Scyllarus was used to monitor an operational network of over 500 workstations and servers, using three different types of network intrusion detector and two different types of host intrusion detector located at various points in the network, over a period of 4 years, after which the sensors were relocated to a small test network with limited access to network traffic.5 Over the period of its use, Scyllarus proved itself to be a substantial advance in the state of the art for IDS fusion.

Scyllarus routinely handled quiet day traffic of 10,000–20,000 IDS reports per day. On more “exciting” days, the traffic was considerably heavier; e.g., on the day of the release of the Code Red worm, Scyllarus received more than 1,000,000 reports.

We tested Scyllarus with controlled exploits on our network, and the system responded appropriately. We were also able to detect an episode of penetration testing conducted without warning by an independent security team.

In 2003, the ability of Scyllarus to detect attacks was demonstrated in an evaluation conducted as part of the DARPA Cyber Panel program (Haines et al. 2003). In this evaluation, a number of network attacks were launched by a dedicated Red Team in a simulated warfare planning environment.

Scyllarus addresses the information overload faced by IDS users. On quiet days Scyllarus is able to winnow the flood of reports down to a handful of events that are worthy of investigation. See Figure 1 for representative data on Scyllarus’s report filtering.

5 Walt Heimerdinger, personal communication

9. Related Work

SecurityFocus has developed the Attack Registry and Intelligence Service (ARIS) (ARIS). The ARIS extractor collects IDS reports from four different IDSes, formats them in XML, and presents them in an incident console. However, it makes no attempts to fuse the reports or weigh the evidence for and against them.

MetaSTAT is a fusion system that is built on a set of STAT-based IDSes (Vigna et al. 2001). STAT is a signature-based IDS that detects events by matching against extended finite-state event models. MetaSTAT uses finite-state models of across-sensor events to consume at a higher level the events generated by lower-level sensors. MetaSTAT does not attempt to judge the plausibility of different events.

EMERALD/eBayes (Valdes and Skinner 2001) fusion is the most similar to Scyllarus. The eBayes sensors are Bayes net-based, and the correlation approach allows “upstream” sensors to adjust the priors on “downstream” sensors. eBayes fusion is limited to clustering together alerts that meet a similarity criterion; they do not have models of high-level events as in the Scyllarus IRM.

Prelude Correlator (Vandoorselaere 2008) is part of the open source Prelude IDS information system, and allows users to analyze reports sent to Prelude from compatible IDSes. Users provide rules written in Lua (Ierusalimschy et al. 2006), a scripting language inspired by Scheme and Icon. Its function is closest to the Scyllarus clustering preprocessor, but knowledge resides in stateful rules instead of an ontology of attacks.

A commercial product, ArcSight Enterprise Security Manager (ArcSight 2008), also ties correlated IDS reports to an installation’s security goals and vulnerability information.

10. Lisp Lessons Learned

The Scyllarus project has involved long term development, maintenance and modification of a large and complex Common Lisp code system. We have seen clear benefits from many aspects of Common Lisp, most notably the facilities for interactive development, hot-patching, etc. We have also discussed how well CL and Protege work together, providing major assistance in building and maintaining Scyllarus’s IRM. However, our lessons learned include some ways in which CL was not helpful. One of these was less a problem with CL than an occasion where we missed an opportunity to profit from CL. We also found challenges in CL code maintenance.

One possible advantage of CL is providing better support for unit testing. Complex networks of interrelated objects are very challenging for unit testing regimes because setting up unit test situations often requires an inordinate amount of effort to recreate such networks outside of a fully-functioning system. The Java community, for example, has devoted a great deal of effort to this issue, along the way spawning a thick jargon of “mocks,” testing frameworks, etc. The Ruby community has been quick to claim that its “duck typing” provides solutions to some of the problems posed by Java’s more rigid type scheme.

Many of CL’s features offer opportunities to ease the process of unit test development. CLOS provides all of the flexibility of Ruby’s “duck typing” together with method combination and eql methods to ease the need for full-fledged mock object frameworks. Another advantage is not so much to do with CL itself as with the lisp mindset: it is our observation that lispers tend to assume that data should almost always have a readable and writeable representation. Having a readable printrep substantially simplifies the process of scripting tests, as anyone who has seen Java code with literally hundreds of lines to initiate relatively simple objects will attest. Finally, :around methods and dynamic scoping are also powerful tools for setting up (and automatically tearing down) unit tests.
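The features named above can be combined in a few lines. The sketch below is illustrative only: the generic functions, the special variable, and the fixture data are hypothetical and are not taken from the Scyllarus code base. An eql-specialized method stands in for a real report source, the fixture is built from a readable printed representation, and an :around method uses a dynamic binding for setup that is undone automatically on exit.

```lisp
;; Hypothetical sketch: all names and data here are illustrative.
(defvar *alert-sink* '()
  "Where alerts accumulate in the running system; rebound in tests.")

(defgeneric fetch-reports (source)
  (:documentation "Return a list of report plists from SOURCE."))

;; An EQL-specialized method acts as a stub, with no mock framework:
;; the fixture is simply read from its printed representation.
(defmethod fetch-reports ((source (eql :test-fixture)))
  (read-from-string
   "((:sid 1635 :src \"10.0.0.5\") (:sid 1635 :src \"10.0.0.9\"))"))

(defgeneric run-check (source)
  (:documentation "Record an alert per report; return the alert count."))

(defmethod run-check (source)
  (dolist (report (fetch-reports source))
    (push (getf report :sid) *alert-sink*))
  (length *alert-sink*))

;; Setup and teardown: the dynamic binding isolates the test run and
;; disappears when the method returns, leaving the global value intact.
(defmethod run-check :around ((source (eql :test-fixture)))
  (let ((*alert-sink* '()))
    (call-next-method)))
```

Evaluating `(run-check :test-fixture)` exercises the same primary method the live system would use, yet the global `*alert-sink*` is untouched afterwards because the :around method rebinds it for the duration of the call.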

We say all this in some humility, because most of Scyllarus was written before unit testing became standard practice. As converts to unit testing, we have found that it has been enormously difficult to take the existing code, with its complex interrelationships, and decompose it so as to make it more unit testable. Doing so requires substantial refactoring to enable us to make use of the facilities we refer to above.

In one way, however, CL unit testing can present difficulties beyond those of more conventional languages, and that is that we often wish to conduct unit tests in the context of a running system. If one is writing a large Java system, for example, it is typically acceptable to compile the system, start it up, run the unit tests, and then close the system down. However, in developing a large lisp system, we often would like to be able to unit test our system without damaging its function; we do not want to shut it down after testing. This presents substantial additional challenges to test design.

We have found two substantial challenges in maintaining such a long-lived CL code base. The first is a lack of strong data hiding mechanisms, notably limitations on access to object internals. By and large, information hiding is managed by convention rather than language support; in our opinion the package mechanism is too heavyweight to simply hide access to some class slots. We are wary of the reliance on coding conventions, since they have not been at all stable over the lifetime of at least this code base, which was created at a very low time in CL’s fortunes. We are also not finding a “lisp culture” strong enough to orally transmit the culture of lisp to enough new programmers, making reliance on convention difficult (although we hope that the current renaissance will ease this problem). We are aware that many have argued that the absence of strong information-hiding is an advantage of CL, favoring expressiveness and flexibility, but we have seen development of Scyllarus move in fits and starts, and anything that would clarify interfaces in an “in your face” way would be helpful.

A second challenge has been the lack of linguistic support for APIs. We have found that when, for example, adding a new subclass to an existing class, it is very difficult for a developer to know what new methods need to be coded. Sonya Keene (Keene 1989) and the CLIM standard (McKay and York 1993) both suggest using protocol specifications to overcome this problem. In our opinion this flies in the face of experience, which indicates that documentation divorced from code (especially code developed in the informal, interactive style of CL) tends to stray from the implementation rapidly and lose its value. We favor some mechanism that “lives in” the code, evolves as the code does and, preferably, provides the programmer with warnings when he has failed to adequately implement a protocol. In related work (Goldman and Maraist 2008), for a limited case, we have developed macros for generic functions that warn programmers when they fail to fully implement an evolving interface. However, that work is not sufficiently general.
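One way such an in-code protocol check could look is sketched below. This is not the macro set from Goldman and Maraist (2008), whose design is not described here; the names `define-protocol`, `missing-protocol-methods`, and `check-protocol` are hypothetical. The sketch assumes every protocol function takes a single required argument and that the class can be instantiated without initargs, and it treats an inherited or default method as an implementation.

```lisp
;; Hypothetical sketch: macro and function names are assumptions.
(defvar *protocols* (make-hash-table :test #'eq)
  "Maps a protocol name to the generic functions that make it up.")

(defmacro define-protocol (name &rest generic-function-names)
  "Record, in the code itself, which generic functions form a protocol."
  `(setf (gethash ',name *protocols*) ',generic-function-names))

(defun missing-protocol-methods (protocol class-name)
  "Names of protocol functions with no applicable method for CLASS-NAME."
  (let ((probe (make-instance class-name)))
    (remove-if (lambda (gf-name)
                 (compute-applicable-methods (fdefinition gf-name)
                                             (list probe)))
               (gethash protocol *protocols*))))

(defun check-protocol (protocol class-name)
  "Warn about unimplemented protocol functions; return their names."
  (let ((missing (missing-protocol-methods protocol class-name)))
    (when missing
      (warn "~S does not implement ~S: missing ~{~S~^, ~}"
            class-name protocol missing))
    missing))

;; Invented example: a sensor protocol with one method left unwritten.
(defgeneric sensor-name (sensor))
(defgeneric parse-report (sensor))
(define-protocol ids-sensor sensor-name parse-report)

(defclass snort-sensor () ())
(defmethod sensor-name ((sensor snort-sensor)) "snort")
```

Calling `(check-protocol 'ids-sensor 'snort-sensor)` signals a warning naming `parse-report`. Because the protocol definition is ordinary code, it can be edited alongside the generic functions it lists, which is the property the text asks for.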

There were also some minor nuisances that had to do with the age of the ANSI CL spec, which we encountered when we were making our code more portable. Scyllarus has run on Solaris, Linux, Mac OS X and Windows. It was originally developed in Allegro Common Lisp (ACL), and employed a number of proprietary ACL libraries, notably for sockets and multi-threading. We have developed our own portable CL libraries for these functions, and have successfully tested Scyllarus on Steel Bank Common Lisp (SBCL), as well. We expect that it would be easy to modify it to run on other CLs as well, although our socket and threading libraries would need to be extended. Finally, we doubt that it will come as an enormous surprise to serious common lispers that logical pathnames, and even conventional pathnames, were a nuisance to work with, and presented us with several portability challenges.

Acknowledgments

This material is based on work funded by the DARPA Scalable Network Monitoring program under contract to SPAWAR Systems Center. Approved for Public Release, Distribution Unlimited.

References

ArcSight. Arcsight enterprise security manager. http://www.arcsight.com/product_info_esm.htm, 2008.

ARIS. Attack registry & intelligence service. http://aris.securityfocus.com/AboutAris.asp, 2003. ARIS analyzer Data Sheet.

Eugene Charniak and Robert P. Goldman. A logic for semantic interpretation. In Proceedings of the Annual Meeting of the Association for Computational Linguistics, pages 87–94, 1988.

Johan deKleer. An assumption-based TMS. Artificial Intelligence, 28:127–162, 1986.

Kenneth D. Forbus and Johan deKleer. Building Problem Solvers. MIT Press, Cambridge, Massachusetts, 1993.

Robert P. Goldman and John S. Maraist. Shopper: Interpreter for a high-level web services language. Unpublished MSS submitted to ILC 2009, 2008.

Moises Goldszmidt and Judea Pearl. Stratified rankings for causal modeling and reasoning about actions. In Proceedings of the Fourth International Workshop on Nonmonotonic Reasoning, pages 99–110, Plymouth, VT, May 1992a.

Moises Goldszmidt and Judea Pearl. Rank-based systems: A simple approach to belief revision, belief update and reasoning about evidence and actions. In Proceedings of the Third International Conference on Principles of Knowledge Representation and Reasoning, Cambridge, MA, October 1992b.

Moises Goldszmidt and Judea Pearl. Qualitative probabilities for default reasoning, belief revision and causal modeling. Artificial Intelligence, 84(1–2):57–112, 1996.

Joshua Haines, Dorene Kewley Ryder, Laura Tinnel, and Stephen Taylor. Validation of sensor alert correlators. IEEE Security and Privacy, 1(1):46–56, 2003. ISSN 1540-7993. doi: http://doi.ieeecomputersociety.org/10.1109/MSECP.2003.1176995.

Roberto Ierusalimschy, Luiz Henrique de Figueiredo, and Waldemar Celes. Lua 5.1 Reference Manual. Lua.org, 2006.

Sonya E. Keene. Object-oriented Programming in Common Lisp. Addison-Wesley, 1989.

W. Lee, L. Me, and A. Wespi, editors. Recent Advances in Intrusion Detection (RAID 2001), number 2212 in LNCS, October 2001. Springer-Verlag.

John Leydon. IDS users swamped with false alerts. The Register, 2001. http://www.theregister.co.uk/content/55/23420.html.

Richard Lippmann, Joshua W. Haines, David J. Fried, Jonathan Korba, and Kumar Das. Analysis and results of the 1999 DARPA off-line intrusion detection evaluation. In RAID ’00: Proceedings of the Third International Workshop on Recent Advances in Intrusion Detection, pages 162–182, London, UK, 2000. Springer-Verlag. ISBN 3-540-41085-6.

Scott McKay and William York. Common lisp interface manager release 2.0 specification, October 1993. Available on the web at http://www.labri.fr/perso/strandh/Teaching/MTP/Common/CLIM-spec/cover.html.

N. F. Noy, M. Sintek, S. Decker, M. Crubezy, R. W. Fergerson, and M. A. Musen. Creating semantic web contents with protege-2000. IEEE Intelligent Systems, 16(2):60–71, 2001.

Judea Pearl. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann Publishers, Inc., Los Altos, CA, 1988.

Malcolm Pradhan, Max Henrion, Gregory Provan, Brendan Del Favero, and Kurt Huang. The sensitivity of belief networks to imprecise probabilities: an experimental investigation. Artificial Intelligence, 85(1–2):363–397, August 1996.

Gregory Provan. An Analysis of ATMS-based Techniques for Computing Dempster-Shafer Belief Functions. In Proceedings of the 11th International Joint Conference on Artificial Intelligence, pages 1115–1120. Morgan Kaufmann Publishers, Inc., 1989.

Alfonso Valdes and Keith Skinner. Probabilistic alert correlation. In Lee et al. (2001). URL http://www.sdl.sri.com/papers/raid2001-pac/.

Yoann Vandoorselaere. Prelude correlator. https://trac.prelude-ids.org/wiki/PreludeCorrelator, March 2008. Prelude Correlator online documentation.

Giovanni Vigna, Richard A. Kemmerer, and P. Blix. Designing a Web of Highly-Configurable Intrusion Detection Sensors. In Lee et al. (2001), pages 69–84.

Wikipedia. Mantis shrimp. Wikipedia entry, 2008. URL http://en.wikipedia.org/wiki/Mantis_shrimp.
