
OpenSAFE: Employing a Programmable Network Fabric for Measurement and Monitoring

Jeffrey R. Ballard, Aaron Gember, Brian Kroth and Aditya Akella
University of Wisconsin–Madison

{ballard,agember,bpkroth,akella}@cs.wisc.edu

ABSTRACT

Administrators of today's networks are highly interested in monitoring traffic for purposes of collecting statistics, detecting intrusions, and providing forensic evidence. Unfortunately, network size and complexity can make this a daunting task. Aside from the problems in analyzing network traffic for this information—an extremely difficult task itself—a more fundamental problem exists: how to direct the traffic for network analysis and measurement in a robust, high-performance manner that does not impact production network traffic.

Current solutions fail to address these problems in a manner that allows high performance and easy management. In this paper, we propose OpenSAFE, a system for enabling the arbitrary direction of traffic for security monitoring applications at line rates. Additionally, we describe ALARMS, a flow specification language that greatly simplifies management of network monitoring appliances. Finally, we demonstrate our OpenSAFE implementation using both live network traffic and controlled traffic. Analysis shows that our OpenSAFE implementation handles higher traffic volumes than our existing monitoring infrastructure.

1. INTRODUCTION

Networks are traditionally monitored for many purposes including performance optimization, usage tracking, security and intrusion detection, auditing, compliance checking, and forensic analysis. Each of these application domains places different constraints on monitoring techniques. Unfortunately, the constraints can work against each other, making the monitoring functions ill-suited to work on the same underlying platform and/or hardware. In addition, growing link speeds and network fan-out are making effective network monitoring even more challenging.

There are two common approaches to monitoring and measurement today. The first is to deploy on-path middleboxes. By their very nature, middleboxes directly affect network traffic. This has both positive and negative effects. For instance, to avoid being overrun, the middlebox has the capability to slow traffic down to correctly process it. However, this affects measurements of how this traffic would naturally behave. Furthermore, general-purpose middleboxes that support a variety of monitoring functions are expensive, typically lack flexibility since changing them would interrupt connectivity, and can present a management challenge.

The second, generally preferred approach to monitoring traffic is to employ a mirror (also called a copy or span) of network traffic at interesting points of the network. This has the benefit of not altering the production network traffic. Unfortunately, processing mirrored traffic is not without key issues either. First, hardware limitations often prevent the ability to provide multiple traffic mirrors, which limits the number of monitoring devices that can be used. Second, the arriving network traffic can quickly overwhelm the monitoring computer, rendering it useless. Thus, network administrators have to employ expensive special-purpose hardware and undertake painstaking "tuning" of software and hardware functions on the monitoring equipment to avoid being overrun by the incoming traffic. Third, the introduction of new monitoring mechanisms often requires difficult physical rewiring of the network connections and may encounter other hardware limitations such as limited mirror ports.

The focus of our paper is on improving mirroring-based approaches. We introduce a new measurement and monitoring framework called OpenSAFE to address the drawbacks of existing approaches. OpenSAFE uses a commodity programmable network fabric to direct mirrored traffic in arbitrary ways. Several inexpensive, unoptimized monitoring devices can be "plugged into" OpenSAFE to process sub-streams of data. We also introduce a high-level language to enable network administrators to express rich policies to control how traffic is acted on by various monitoring devices. OpenSAFE also allows monitoring devices to dynamically signal interest in, and receive, particular traffic.

We describe the design, implementation, and evaluation of OpenSAFE in the rest of this paper. We first provide the background and motivation for our proposed system in Section 2. We then describe the key abstractions in OpenSAFE—namely, those for mirror ports, monitoring devices of different kinds, and various mechanisms for controlling traffic. These form the building blocks for deploying diverse and flexible monitoring configurations (Section 3). We then describe ALARMS, a high-level language that represents OpenSAFE's abstractions in a simple policy language syntax. ALARMS employs simple intuitive constructs to facilitate arbitrary selection of traffic sub-streams to monitor,

Figure 1: The network diagram of a large college. [Figure: seven buildings connect through Routers 1 and 2 to the campus backbone routers; dashed links are 10 Gbps and solid links are 1 Gbps. Connections to the college: 2 x 10 Gbps links and 22 x 1 Gbps links.]

arbitrary interposition of monitoring devices, flexible load-balancing mechanisms, and arbitrary signaling of interest in traffic (Section 4). Finally, we describe our implementation of OpenSAFE and ALARMS using the NOX/OpenFlow platform [11, 14] (Section 5). On the whole, OpenSAFE offers several practical advantages: it allows fast and scalable monitoring by supporting several monitoring devices running in parallel, it is easy to (re)configure and extend with arbitrary monitoring functions, and its reliance on commodity hardware and unoptimized devices makes it inexpensive.

We conduct a thorough evaluation of the system we created using both production traffic as well as synthetic traces (Section 6). OpenSAFE is shown to outperform an existing, highly-optimized monitoring device in a multiple-day head-to-head test. We examine 54% more packets and generate 30% more security alerts during our four-day test using an unoptimized team of machines and OpenSAFE. Furthermore, we show that OpenSAFE scales with increasing bandwidth. Some limitations of OpenSAFE, mainly caused by deficiencies in the underlying switch fabric, are described in Section 7.

Related work is discussed in Section 8 and we present conclusions in Section 9.

2. BACKGROUND

2.1 Challenges in Monitoring

High fanout and high-speed network links make even simple monitoring tasks difficult in modern enterprise networks. Consider the real-world example shown in Figure 1. This shows the network of a large college at a research university. The network consists of 7 buildings with 22 × 1 Gbps connections and 2 × 10 Gbps connections. Two key challenges arise with regard to monitoring in this setting: where to place measurement devices and how to actually go about monitoring the traffic.

Device placement is challenging due to the diverse traffic bisection and distributed nature of the network. In the example above, a sizable amount of traffic goes to the two primary datacenters in buildings 1 and 2 as well as various network hot-spots (such as computational clusters and labs) scattered amongst all seven buildings. Thus, determining where to place monitoring and measurement devices is a non-trivial challenge. Two possibilities arise in the above example: monitoring at the router or within each building. The trade-off is that either many monitors capable of 1 Gbps or 10 Gbps of traffic would be required, or finely tuned, high-bandwidth monitors are needed at the router.

After deciding on where to place devices, the next challenge is how to actually monitor the network. There are generally two methods: middleboxes or mirror ports.

2.1.1 Middleboxes

On-path middleboxes are commonly used; however, in contemporary networks they require rewiring. The rewiring can be either physical or logical, but the problems are generally the same. Additions, deletions, modifications, or failures of middleboxes lead to outages and reconfiguration of network gear. This results in network interruptions and performance loss.

2.1.2 Mirrors

Instead of middleboxes, another commonly used method is to take a mirror (or tap) of the traffic at a border point and examine it off-path. On the positive side, the monitoring device does not impact production network traffic. This results in fewer problems perceived by end-users and a generally more stable network. An example of this is shown in Figure 2.

However, hardware limitations often prevent the ability to provide multiple traffic mirrors. This limits the number of monitoring devices that can effectively participate. For example, the Cisco Catalyst 6000 series is limited to two mirror ports per device. Worse, other network functions often consume available mirror ports. For example, when multicast is enabled on a Cisco FireWall Services Module (FWSM), one of the mirror ports is consumed, leaving only one for monitoring.

Another challenge is that, since the monitoring device is not actively participating in the network conversation, it can become overwhelmed with traffic and lose large amounts of data by randomly dropping packets. Often the drop function is not optimized and packets will be lost from random flows. This affects the ability of the monitoring device both to fully examine all the traffic and to accurately reassemble network flows. Commonly, network operators connect this mirror port to an expensive computer that has been heavily optimized to move traffic very fast (for example, by using PF_RING [7]), leaving little room for error. The heavy tuning often results in brittle software configurations. At times even slightly different revisions of software make a huge impact in these monitoring computers.1

Figure 2: A logical configuration for network monitoring today. [Figure: traffic between Network A and Network B passes through the Network B firewall; an IDS receives mirrored traffic.]

Figure 3: The desired dynamic layer. [Figure: a programmable network layer sits between Network A and the Network B firewall, feeding filtering device(s) and monitoring devices 1 through n.]

2.2 Our Insight: Employing Programmable Network Fabrics

The key problem in monitoring using mirror ports is not being able to exercise fine-grained control over the mirrored network traffic. We observe that by inserting a programmable network fabric at this point we can dramatically increase the utility of the mirrored traffic while at the same time dramatically reducing the effort needed to engineer and manage the monitoring functionality. An example of this is shown in Figure 3. The framework, which we call OpenSAFE, demultiplexes high-bandwidth packet streams into several lower-bandwidth flows that are directed to different monitoring devices. We briefly outline the key advantages of OpenSAFE.

(1) Our approach allows for flexible, fast, and scalable monitoring. This has at least two different aspects to it.

1 For example, in our initial tests we found that a minor revision of our IDS software dropped almost 50% more packets on our production monitoring system.

First, as mentioned above, mirror ports are very limited on networking devices. Our approach allows for flexible sharing of a single mirror port across multiple devices. Second, current mirroring-based solutions typically require rewiring to change the network configuration. With a programmable network fabric, network flows are directed on demand, and new monitoring functions can easily be added or reconfigured. Ideally this will leverage the support of an intuitive declarative language to control traffic. Related to this is the issue of scaling the throughput of monitoring functions, particularly those that are very intensive. In such cases, OpenSAFE allows multiple devices to work on disjoint subsets of traffic in parallel, thereby improving the throughput significantly.

(2) Another benefit of using a programmable network fabric is that it works per-flow. Since decisions are made per-flow in hardware, unlike other packet distribution techniques [16, 17], working per-flow facilitates the scale-up of network monitoring in a more manageable way. Fundamentally, software performing intrusion detection or deep packet inspection must reassemble out-of-order packets to correctly process TCP flows. This process is dramatically streamlined, however, by having per-flow operations in the programmable switch fabric. In addition, when seeing all the packets from a flow as opposed to a random subset of packets (e.g., due to some round-robin policy), the monitoring software is generally able to more accurately collect useful data.

(3) Our approach is cost-effective. Programmable network fabrics have become available for commodity prices today; for example, OpenFlow [14], an approach to programming a switch's flow table, is supported on commodity networking hardware today. In addition, the ability to demultiplex traffic into multiple low-bandwidth flows has the immediate advantage of allowing the use of commonly available, inexpensive, and easy-to-manage 1 Gbps NICs on the monitoring devices rather than 10 Gbps NICs.

3. OPENSAFE

We propose OpenSAFE (Open Security Auditing and Flow Examination), a unified system for network monitoring and measurement. Leveraging a programmable network fabric, our system can direct mirrored network traffic in arbitrary ways. OpenSAFE consists of three components: a set of design abstractions for codifying the flow of network traffic; ALARMS, a policy language for easily specifying and managing paths (Section 4); and a controller that implements the policy using OpenFlow (Section 5).

3.1 Paths

To make the direction of network flows for network monitoring both flexible and easy, OpenSAFE is designed around several simple primitives. We use the notion of a path as the basic abstraction for describing the selection of traffic flows and the direction these flows should take. Fundamentally, we wish to support the construction of paths that allow desired

Figure 4: A basic monitoring path. [Figure: a mirror port, with a port 80 select, feeds a counter and then a TCP dump.]

Figure 5: Abstractions used to describe monitoring paths. [Figure: input → select → filters → sinks.]

traffic to enter the system on a particular network port and be directed to one or more network monitoring systems, regardless of physical configuration. A basic example of this is shown in Figure 4, where mirrored HTTP traffic is sent first through a counter appliance and then to a TCP dump appliance.

In OpenSAFE, the articulation of paths occurs incrementally along the desired route, port by port. Paths are composed of several components: inputs, selects, filters, and sinks. At a high level, each path begins with an input, applies an optional select criterion to select a desired subset of network flows, directs matching traffic through zero or more filters, and ends in one or more sinks (Figure 5). Inputs can only produce traffic, sinks can only receive traffic, and filters must do both.

If we take Figure 4 and view it with these abstractions, it becomes Figure 6. This shows traffic entering on a mirror port (input) that matches our select criteria of port 80 (select), directed through a counter (filter), and finally sent to a TCP dump (sink). A more complicated example involving multiple filters is shown in Figure 7, demonstrating how paths can be extended. A typical OpenSAFE configuration consists of multiple paths treated in aggregate.

3.2 Parallel Filters and Sinks

To monitor large networks at line rates it is possible (and quite likely) that a single filter or sink will not be able to cope with all the network traffic. To address this problem, we allow traffic to be sent to multiple filters or sinks operating in parallel within a path. Figure 9 shows such a path, with HTTP traffic sent to multiple IDS appliances.

The division of traffic between multiple filters or sinks is handled using distribution rules. Rules are applied on a per-flow basis. OpenSAFE supports five methods of distribution amongst a set of parallel components:

• ALL—send a flow to every component in the set

• RR—alternate flows between components of the set using round robin

• ANY—randomly select a component from the set

• HASH—apply a hash function on the first packet of a flow to select a component

• PROB—apply a probability function to load balance flows amongst components
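As an illustration of these five rules, the sketch below implements each as a per-flow selector in Python. The function names and shapes are our own assumptions, not OpenSAFE's API:

```python
# Hedged sketch of the five distribution rules; decisions are made once
# per flow, matching the per-flow semantics described above.
import itertools
import random
import zlib

def make_distributor(rule, components, func=None):
    """Return f(flow) -> list of components the flow is sent to."""
    rr = itertools.cycle(components)
    if rule == "ALL":    # copy the flow to every component in the set
        return lambda flow: list(components)
    if rule == "RR":     # alternate new flows round-robin
        return lambda flow: [next(rr)]
    if rule == "ANY":    # uniform random choice
        return lambda flow: [random.choice(components)]
    if rule == "HASH":   # deterministic: hash the flow key
        return lambda flow: [components[func(flow) % len(components)]]
    if rule == "PROB":   # func maps flow -> index, e.g. based on load
        return lambda flow: [components[func(flow)]]
    raise ValueError(rule)

# HASH example: the same flow always maps to the same component.
flow_hash = lambda flow: zlib.crc32(repr(flow).encode())
dist = make_distributor("HASH", ["ids1", "ids2"], flow_hash)
flow = ("10.0.0.1", 4321, "10.0.0.2", 80, "tcp")
assert dist(flow) == dist(flow)  # deterministic per flow
```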

Figure 6: A basic logical monitoring path (Figure 4) with coded abstractions. [Figure: mirror → port 80 select → counter → TCP dump.]

Figure 7: A logical monitoring path with multiple filters. [Figure: mirror → port 443 select → decryption → counter → TCP dump.]

Distribution rules (except for ALL rules) are considered dynamic—the path a particular flow follows is determined at runtime when the first packet of the flow traverses the distribution rule. Hooks, described below, are also dynamic. In contrast, the other portions of a path are considered static—the path taken by all flows is constant. The difference between static and dynamic rules has implications for how paths are implemented in the programmable network fabric, described in Section 5.

3.3 Hooks

One issue that can arise when splitting monitoring traffic among multiple devices is that flows from a particular host (or a potential adversary) can be directed to separate machines. While information about the flows can be aggregated after the fact, it may be useful for monitoring software to examine all future traffic from a host after suspicious activity is detected. This requires the capability to add new paths at runtime. In OpenSAFE, hooks provide this functionality.

Monitoring devices can make hook requests at runtime to have new paths added to the current OpenSAFE configuration. A hook request effectively duplicates the path containing the hook and appends the path specified by the monitoring device. For example, Figure 10 shows a path with a hook, and Figure 4 shows a potential resulting path based on a hook request to send HTTP traffic to counter followed by TCP dump.
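A hedged sketch of this expansion step (the apply_hook helper is hypothetical, not OpenSAFE's API): the path containing the hook is duplicated, with the requested continuation substituted for the hook.

```python
# Illustrative model of a hook request: duplicate every path containing
# the hook and replace the hook with the requested continuation.
def apply_hook(paths, hook, continuation):
    added = [p[:p.index(hook)] + continuation
             for p in paths if hook in p]
    return paths + added

paths = [["mirror[http]", "hook1"]]
expanded = apply_hook(paths, "hook1", ["counter", "tcpdump"])
# The original path is kept and a new path is added:
# mirror[http] -> counter -> tcpdump
```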

3.4 Overall Design

The overall design of OpenSAFE is shown in Figure 8. The input is a connection from the incoming mirrored traffic at the chosen network aggregation point to a port on our programmable switch. Some number of filters are in use, attached to various network ports. Finally, output is directed into some number of sinks. Optionally, multiple switches can be used, assuming they are directly connected; paths can be defined between ports on any of the switches.

4. ALARMS: A LANGUAGE FOR ARBITRARY REDIRECTION FOR MEASURING AND SECURITY

To enable network administrators to easily manage and update their monitoring infrastructure, we introduce ALARMS,

Figure 8: The overall design of OpenSAFE, using our abstractions. [Figure: an input, filters 1 through m, and sinks 1 through n attach to ports of an OpenFlow switch, which is managed by an OpenFlow controller.]

Figure 9: A monitoring path with parallel sinks. [Figure: mirror → port 80 select → TCP dump 1 and TCP dump 2.]

a language to enable the arbitrary redirection of network flows for measuring and security purposes. ALARMS represents the abstractions mentioned in Section 3 in a simple policy language syntax. Each component is defined with a name and parameters, and paths are defined between the named components. In this section we present the syntax of ALARMS; details about the implementation of policies are described in Section 5.

4.1 Component Declarations

In ALARMS, all components of a path are given unique types and names. Specifically, the policy file declares the following components: Switches, Inputs, Sinks, Filters, Selects, Hooks, and Waypoints. We describe the language specification and parameters for each of these components below.

4.1.1 Switches

Each switch is declared with a unique name and the identifier of the programmable switch fabric:

switch sw = 0x00000021;

4.1.2 Inputs and Sinks

As shown in Figure 5, paths begin with inputs and end with sinks. Both are simply named switch ports (as in Figure 8), declared like so:

input mirror = sw:0;
sink tcpdump = sw:1;

Since inputs can only transmit traffic and sinks can only receive traffic, each named input or sink is restricted to a single port. Traffic, however, can be directed to multiple sinks using distribution rules (Section 4.2.2). ALARMS includes a special default sink named discard that drops all traffic sent to it.

Figure 10: An example of a hook (top line). The middle line is a hook request made by a device and the bottom line is the resulting path implemented by OpenSAFE. [Figure: Original: mirror, with a port 80 select, ends in hook1. Request: hook1 continues to TCP dump. Result: mirror, with the port 80 select, directed to TCP dump.]

4.1.3 Filters

Filters are middleboxes within an OpenSAFE network. They are shown as the third item in Figure 5. A filter is a combination of a sink plus corresponding inputs. As such, filters are declared similarly to inputs and sinks—as named switch ports—but with more flexibility, as they are able to both transmit and receive traffic. Each filter must define a single tofrom switch port (to both receive and transmit on the same port) or both a to and a from port (to delegate receiving and transmitting, respectively, to separate ports). Multiple to, from, and tofrom ports for a single name are not permitted. Filters are specified in the policy language as follows:

filter to counter = sw:2;
filter from counter = sw:3;

4.1.4 Selects

Selects are named criteria used to limit traffic flows based on fields in packet headers. Currently, ALARMS supports selecting on any of nine different header fields: Ethernet source and destination addresses, EtherType, VLAN identifier, network source and destination addresses, transport protocol, and transport source and destination ports. Additionally, an arbitrary number of bits can be declared as wildcards for network source and destination addresses to provide for CIDR-like address ranges. Limited boolean logic (AND, OR) can be used in a select definition to specify criteria on multiple header fields. Any header fields not specified in the select are treated as wildcards. The select example below produces only HTTP traffic whose source or destination port is 80:

select http = tp_src: 80 || tp_dst: 80;
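To make the wildcard-and-OR semantics concrete, the following Python sketch (hypothetical code, not part of ALARMS) models a select as a list of alternative field criteria; a packet matches if any alternative matches, and unspecified fields act as wildcards:

```python
# Illustrative select semantics: OR across alternatives, AND within one,
# and unspecified header fields are wildcards. Field names follow the
# paper's examples (tp_src, tp_dst); the code itself is hypothetical.
def matches(select, pkt):
    return any(all(pkt.get(f) == v for f, v in alt.items())
               for alt in select)

# select http = tp_src: 80 || tp_dst: 80;
http = [{"tp_src": 80}, {"tp_dst": 80}]

assert matches(http, {"tp_src": 80, "tp_dst": 4321})
assert matches(http, {"tp_src": 4321, "tp_dst": 80})
assert not matches(http, {"tp_src": 22, "tp_dst": 443})
```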

4.1.5 Hooks

Hooks are expanded to paths when requests are made at runtime. In ALARMS they are declared with only a name:

hook hook1;

Figure 11: A logical monitoring path with a waypoint. [Figure: mirror, with a port 80 select, and mirror, with a port 443 select followed by decryption, both feed the web waypoint, which leads to a counter and then a TCP dump.]

4.1.6 Waypoints

The final component type in ALARMS is an abstraction added as a convenience to ease the creation and management of multiple, semi-redundant paths. In a system of a reasonable size, it is possible—even probable—to have multiple paths configured with common attributes. For instance, suppose that an administrator wants to perform some degree of processing on one of two sets of traffic, then send the results of both to the same filter and sink, as shown in Figure 6 and Figure 7. This quickly becomes a maintenance problem, as modifying the common end-component of the paths may involve editing many different paths.

Waypoints serve as "virtual destinations" and "virtual sources," allowing administrators to aggregate paths and reduce repetition. A path using a waypoint is displayed in Figure 11, where HTTP and HTTPS traffic is sent to a web waypoint before being passed to a counter filter and TCP dump sink. Waypoints are not physical destinations or sources; they only exist within the ALARMS language. Declaring waypoints requires only a name:

waypoint web;

4.2 Paths

Now that all named components have been declared, we can connect these components to form paths. Paths must conform to the following specification:

1. Paths can begin with an input, waypoint, or filter.

2. Paths can end with a sink, waypoint, filter, hook, or rule.

3. Selects can be applied to any connection between components.

The simplest form of a path connects an input directly to a sink:

mirror -> tcpdump;

This can be modified to include, for example, a filter and a select as shown in Figure 6:

mirror[http] -> counter -> tcpdump;

4.2.1 Paths with Selects

A select limits the traffic seen by all components in the path downstream from the select. In the path above, the filter (counter) will only see HTTP traffic coming from the input (mirror) and the sink (tcpdump) will only see HTTP traffic leaving the filter. Each connection in the path is limited to having one select.

If a path has multiple connections with selects, the selects downstream further restrict upstream selects, with the downstream select taking precedence in the case where both specify criteria for the same header field(s). For example, a revised path with an additional select on the filter:

select webserver = nw_src: 10.0.0.1 || nw_dst: 10.0.0.1;

mirror[http] -> counter[webserver] -> tcpdump;

will result in the filter still seeing all HTTP traffic from the mirror port, while the sink now sees only HTTP traffic for a particular webserver.
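The per-field precedence described above can be illustrated with a small Python sketch. The compose helper is hypothetical and, for brevity, ignores ||-alternatives, treating each select as a single set of field constraints:

```python
# Illustrative sketch of select composition: criteria are combined per
# header field, with the downstream value winning when both constrain
# the same field.
def compose(upstream, downstream):
    merged = dict(upstream)
    merged.update(downstream)   # downstream takes precedence
    return merged

# mirror[http] -> counter[webserver]: the sink sees both constraints.
http = {"tp_dst": 80}
webserver = {"nw_dst": "10.0.0.1"}
print(compose(http, webserver))  # {'tp_dst': 80, 'nw_dst': '10.0.0.1'}
```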

4.2.2 Distribution Rules

The distribution of traffic between multiple components (excluding inputs) is handled by distribution rules, applied on a per-flow basis.

The first three distribution rule types, ALL, RR, and ANY, each take a list of components to act on. For example, the following rule will round-robin distribute new HTTP flows between two counter filters before sending them along to a tcpdump sink.

mirror[http] -> {RR, counter1, counter2} -> tcpdump;

The HASH and PROB rules take an additional argument—the name of the hash or probability function—and rely on the output of this function to determine the destination. Probability rules are designed to allow OpenSAFE to distribute traffic based on the current load of distribution components, so the user must also provide a way for the function to receive load information from components. For example, the following policy instructs OpenSAFE to use a user-defined hash function myhash to distribute new flows between two counter filters before they proceed to a tcpdump sink. This could be more desirable than an RR rule since it can be deterministic.

mirror[http] -> {HASH(myhash), counter1, counter2} -> tcpdump;

4.2.3 Expanding Waypoints and Filters

Paths ending in waypoints or filters cause ALARMS to expand the component into any paths beginning with the same waypoint or filter. The implementation of Figure 11 involving a waypoint is:

mirror[http] -> web;
mirror[https] -> decrypt -> web;
web -> counter -> tcpdump;


ALARMS expands the waypoint at the end of the first two paths, appending the path that begins with the waypoint. The effective policy becomes:

mirror[http] -> counter -> tcpdump;
mirror[https] -> decrypt -> counter -> tcpdump;

If no path begins with a particular waypoint, uses of the waypoint are expanded to have a destination of the special discard sink.
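The expansion rules in this subsection can be sketched as follows. The code is an illustrative single-level model, not the ALARMS implementation:

```python
# Hypothetical sketch of waypoint expansion: every path ending in a
# waypoint is spliced with every path that begins with that waypoint;
# a waypoint with no continuation falls through to the discard sink.
def expand(paths, waypoints):
    out = []
    for p in paths:
        if p[-1] in waypoints:
            tails = [q for q in paths if q[0] == p[-1]]
            if not tails:
                out.append(p[:-1] + ["discard"])
            else:
                out.extend(p[:-1] + q[1:] for q in tails)
        elif p[0] not in waypoints:
            out.append(p)   # ordinary path, kept as-is
    return out

paths = [
    ["mirror[http]", "web"],
    ["mirror[https]", "decrypt", "web"],
    ["web", "counter", "tcpdump"],
]
for p in expand(paths, {"web"}):
    print(" -> ".join(p))
# mirror[http] -> counter -> tcpdump
# mirror[https] -> decrypt -> counter -> tcpdump
```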

4.2.4 Overlapping Paths

Multiple paths can begin with the same component, but usually have different selects applied. In order to deal with these overlaps, all paths in a policy file are considered as a single union. If more than one path begins with the same component and has the same select applied (or lack thereof), the two paths will be internally combined. For example, the set of paths:

mirror -> tcpdump1;
mirror -> tcpdump2;

are combined to form the new path:

mirror -> {ALL, tcpdump1, tcpdump2};

5. PROGRAMMING THE FABRIC

Direction of traffic is realized by programming the network fabric based on an ALARMS policy file. Programming the fabric consists of three tasks:

1. Parse the policy file written in ALARMS.

2. Install static flows when a new switch connects to the controller.

3. Install dynamic flows when a packet is received by thecontroller, or upon hook request.

Our network fabric consists of an OpenFlow [14] switch and NOX controller [11]. An OpenFlow switch forwards packets in the data plane based on a programmable flow table. The flow table consists of entries that contain values for up to ten different packet header fields (known as the OpenFlow 10-tuple): incoming port, Ethernet source and destination addresses, EtherType, VLAN identifier, network source and destination addresses, transport protocol, and transport source and destination port. Any item in the 10-tuple for which a value is not specified is treated as a wildcard. Each entry also contains an action that should be applied to packets matching that entry: drop packets, output to one or more ports, or send to the controller. When a packet arrives at an OpenFlow switch, it is matched against the entries in the flow table. The actions of the highest priority matching entry are applied to the packet. If the packet does not match any entry in the flow table, it is forwarded to the controller for


Figure 12: What OpenSAFE could look like. An incoming mirror port is connected to an OpenFlow switch.

a decision to be made. While other all-ASIC options exist (such as ones from GigaMon [9]), OpenFlow is readily available today on commodity hardware. Figure 12 is an example OpenSAFE setup.

5.1 Policy Parsing

Internally, the policy file is represented as a series of objects, organized into lists of switches, selects, components, and paths. Paths are checked to verify they meet the three criteria outlined in Section 4.2. Overlapping paths are detected by comparing the first component and (optional) select in every path; a set of overlapping paths is combined by applying the following transformation rule: given a set of paths α → β_1, ..., α → β_n, where n ≥ 2, remove the existing paths and add new paths waypoint_α1 → β_1, ..., waypoint_αn → β_n, α → {ALL, waypoint_α1, ..., waypoint_αn}. The final step in policy file parsing builds three dictionaries of paths, one for each possible starting component type: input, filter, and waypoint. These dictionaries are used later when expanding paths ending in filters and waypoints.
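The transformation rule can be sketched as follows; the path representation (a (component, select) head plus a tail list) is a simplification of OpenSAFE's internal objects, used here only for illustration:

```python
from collections import defaultdict

def combine_overlapping(paths):
    """Replace paths a -> b1, ..., a -> bn (n >= 2, identical head component
    and select) with synthetic waypoints and one ALL distribution rule."""
    groups = defaultdict(list)
    for head, tail in paths:
        groups[head].append(tail)
    result = []
    for head, tails in groups.items():
        if len(tails) == 1:
            result.append((head, tails[0]))
            continue
        waypoints = []
        for i, tail in enumerate(tails, 1):
            wp = "waypoint_{}_{}".format(head[0], i)
            waypoints.append(wp)
            result.append(((wp, None), tail))          # waypoint_ai -> bi
        result.append((head, [("ALL",) + tuple(waypoints)]))  # a -> {ALL, ...}
    return result

paths = [(("mirror", None), ["tcpdump1"]),
         (("mirror", None), ["tcpdump2"])]
combined = combine_overlapping(paths)
# two synthetic waypoint paths plus one ALL rule from mirror
```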

5.2 Static Flow Installation

The process to program the network fabric based on an ALARMS policy begins with a fundamental observation: hardware is faster than software. In OpenFlow, forwarding a packet which matches an existing flow table entry is faster than sending a packet to the controller. To preserve high performance, we pre-compute as many routes as possible and install them in the flow table of the OpenFlow switch on startup. This avoids the need to contact the controller for every new flow and prevents the controller from being overloaded with traffic, a distinct possibility when operating at high line rates. We call these pre-computed flow table entries static flows since they remain in the switch's flow table the entire time OpenSAFE is running. Static flow table entries are installed for static path components (see Section 3.2) and to send traffic for dynamic components to the controller.

5.2.1 Default Drop

By default, an OpenFlow switch automatically sends to the controller any traffic for which there is no matching flow table entry. In contrast, ALARMS specifies paths for only certain traffic, assuming all other traffic is discarded. To reconcile the differences between ALARMS and OpenFlow, we install low-priority wildcard rules to drop all traffic entering the switch from inputs or filters. These drop rules avoid the overhead of sending unwanted packets to the controller. All paths defined in the ALARMS policy file are installed with higher priority so desired traffic is not dropped.
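The interaction between the low-priority drop rules and the higher-priority path entries can be modeled with a toy flow table; the port numbers and priority values below are illustrative, not taken from OpenSAFE:

```python
def matches(entry, packet):
    """A flow entry matches when every specified field equals the packet's;
    fields absent from the match are wildcards."""
    return all(packet.get(k) == v for k, v in entry["match"].items())

def lookup(table, packet):
    """Return the action of the highest-priority matching entry, if any."""
    live = [e for e in table if matches(e, packet)]
    return max(live, key=lambda e: e["priority"])["action"] if live else None

table = [
    # low-priority wildcard: drop everything arriving from the mirror input
    {"priority": 0, "match": {"in_port": 0}, "action": "drop"},
    # higher-priority path entry: HTTP from the mirror goes toward a sink
    {"priority": 10, "match": {"in_port": 0, "tp_dst": 80}, "action": "output:2"},
]

lookup(table, {"in_port": 0, "tp_dst": 80})  # desired traffic is forwarded
lookup(table, {"in_port": 0, "tp_dst": 22})  # everything else hits the drop rule
```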

5.2.2 Input Paths

Static flow installation processes each path beginning with an input. At the beginning of a path, it is assumed that all flows (i.e., a 10-tuple of all wildcards) will traverse the path. The current set of flows becomes more specific as each component and selection in the path is processed. Input or filter components cause the input port in the current set of flows to be updated. A selection adds new tuple items or overrides existing values in the current set of flows.

New flow entries are typically installed at the switch for each transition (i.e., arrow, ->) in a path. For example, the path shown in Figure 4, written in ALARMS as

mirror[http] -> counter -> tcpdump;

results in flow entries

{tp_src=80, in_port=0} → output:2
{tp_dst=80, in_port=0} → output:2

installed for the first transition, and entries

{tp_src=80, in_port=3} → output:1
{tp_dst=80, in_port=3} → output:1

installed for the second transition. Filters, waypoints, and distribution rules within paths all require special attention.
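The per-transition installation can be sketched as below; only the tp_src=80 half of the http select is shown, and the port map is an assumption mirroring the Figure 4 example:

```python
# Switch port map for the Figure 4 example (illustrative): traffic leaves
# toward a component on its "to" port and re-enters on its "from" port.
ports = {"mirror":  {"from": 0},
         "counter": {"to": 2, "from": 3},
         "tcpdump": {"to": 1}}

def static_entries(path, ports, select):
    """Emit one flow entry per transition: match = select fields plus the
    previous component's re-entry port; action = output toward the next."""
    entries = []
    for src, dst in zip(path, path[1:]):
        match = dict(select, in_port=ports[src]["from"])
        entries.append((match, "output:{}".format(ports[dst]["to"])))
    return entries

entries = static_entries(["mirror", "counter", "tcpdump"],
                         ports, {"tp_src": 80})
# reproduces the two tp_src=80 entries shown above
```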

5.2.3 Filters and Waypoints

Paths ending in filters require "expanding" the path to also process all paths proceeding from the filter. The set of additional paths to process is obtained from the dictionary of filter paths built during policy file parsing. For example, in the policy

mirror[http] -> counter;
counter -> tcpdump;

the first path is processed, followed by processing of the path beginning with the counter filter. When processing the filter paths, the set of flows that existed at the end of the original path is used as the starting set of flows for each filter path. If we did not use the flow set from the end of the original path, our flow table entry for the filter path would have been

{in_port=3} → output:1

This entry is incorrect; the only traffic that should leave counter is the HTTP traffic that came in. Therefore, we start processing the second path with a current flow set of HTTP traffic.

Paths ending in waypoints are treated similarly to paths ending in filters. The waypoint is "expanded" and processing continues along each path beginning with that waypoint. The difference is that waypoints are merely conceptual and do not correspond to any physical ports on the OpenFlow switch. Flow entries are installed originating from the component preceding the waypoint in the original path, to the component(s) following the waypoint in the waypoint path(s). For example, the paths

mirror[http] -> web;
web[webserver] -> tcpdump;

result in only one set of flow entries

{tp_src=80, nw_src=10.0.0.1, in_port=0} → output:2
{tp_src=80, nw_dst=10.0.0.1, in_port=0} → output:2
{tp_dst=80, nw_src=10.0.0.1, in_port=0} → output:2
{tp_dst=80, nw_dst=10.0.0.1, in_port=0} → output:2

It is important to note that the current set of flows is first limited by the http selection when the input mirror is processed, and the set of flows is further limited by the webserver selection when the web waypoint is "expanded."
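This progressive narrowing can be sketched as a cross product of select clauses, where each clause adds or overrides 10-tuple fields; the list-of-dicts representation is a simplification for illustration:

```python
def refine(flow_set, select):
    """Narrow the current set of flows: cross each existing flow pattern
    with each clause of the select, later values overriding earlier ones."""
    combined = []
    for flow in flow_set:
        for clause in select:
            merged = dict(flow)
            merged.update(clause)
            combined.append(merged)
    return combined

# the http select: traffic to or from port 80
flows = refine([{}], [{"tp_src": 80}, {"tp_dst": 80}])
# the webserver select, applied when the web waypoint is expanded
flows = refine(flows, [{"nw_src": "10.0.0.1"}, {"nw_dst": "10.0.0.1"}])
# four flow patterns remain, matching the four entries shown above
```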

5.2.4 Distribution Rules

The static flow table entries installed for distribution rules vary depending on the type of distribution rule. An ALL rule can be treated statically: no packets need to be sent to the controller and all flow table entries can be installed a priori. For example, the path

mirror[http] -> {ALL, tcpdump1, tcpdump2};

results in the flow entries

{tp_src=80, in_port=0} → output:1, output:4
{tp_dst=80, in_port=0} → output:1, output:4

If more components exist in the path following the rule, path processing continues along the path as normal.

The other types of distribution rules (RR, ANY, HASH, and PROB) require packets to be sent to the controller for the appropriate dynamic flow entries to be installed. For these types of rules, static flow entries are installed with the action of sending to the controller. For example, the path

mirror[http] -> {RR, tcpdump1, tcpdump2};

results in the flow entries

{tp_src=80, in_port=0} → controller
{tp_dst=80, in_port=0} → controller

When packets matching these entries are sent to the controller, it is necessary to know which rule should be applied. Therefore, we store the set of flows and the associated distribution rule in a dictionary for later reference.
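A minimal sketch of that dictionary, assuming rules are keyed by their wildcard match fields; the representation is hypothetical, not the controller's actual data structure:

```python
rules = {}

def remember(match, rule):
    """Key the distribution rule by its wildcard match fields so it can be
    found again when a matching packet reaches the controller."""
    rules[frozenset(match.items())] = rule

def rule_for(packet):
    """Return the stored rule whose wildcard match covers this packet."""
    for match, rule in rules.items():
        if all(packet.get(f) == v for f, v in match):
            return rule
    return None

remember({"tp_dst": 80, "in_port": 0}, ("RR", "tcpdump1", "tcpdump2"))
rule_for({"in_port": 0, "tp_dst": 80, "nw_src": "10.0.0.7"})
```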

5.3 Dynamic Flow Installation

Dynamic flow entries are installed for distribution rules (excluding ALL rules) and upon receipt of a hook request. These flow entries cannot be pre-computed at OpenSAFE startup because the destination switch ports are unknown until flows arrive or requests are received.

5.3.1 Distribution Rules

Flow entries for dynamic rules are installed when a new flow matches an entry whose action is "send to controller."


The controller receives the first packet in the flow and uses its dictionary of distribution rules to determine which rule should be applied to the flow. Only the matching rule needs to be processed; the rest of the path containing the rule is already processed during static flow installation. For HASH or PROB rules, the controller calls user specified code to select one or more destination components. A destination component for an ANY rule is selected by generating a random number. A RR rule selects the next component in the list of possible destinations based on prior state. Dynamic flow entries for distribution rules contain a full 10-tuple with all values specified based on the packet headers. Entries are also installed for flows going in the reverse direction to ensure that both halves of a flow traverse the same path. OpenSAFE uses a default timeout of 30 seconds for dynamic flow entries.
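The RR selection and the reverse-direction entry can be sketched as follows; this is a simplification (the real controller also fills in the remaining 10-tuple fields from the packet headers):

```python
import itertools

def reverse(flow):
    """Mirror a flow's addressing so the reverse half of the connection
    is pinned to the same destination component."""
    return dict(flow,
                nw_src=flow["nw_dst"], nw_dst=flow["nw_src"],
                tp_src=flow["tp_dst"], tp_dst=flow["tp_src"])

class RoundRobin:
    """RR rule state: cycle through the listed destination components."""
    def __init__(self, destinations):
        self._cycle = itertools.cycle(destinations)
    def pick(self):
        return next(self._cycle)

rr = RoundRobin(["tcpdump1", "tcpdump2"])
flow = {"nw_src": "10.0.0.5", "nw_dst": "10.0.1.9",
        "tp_src": 51234, "tp_dst": 80}
dest = rr.pick()
# install both flow -> dest and reverse(flow) -> dest, with a 30 s timeout
```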

5.3.2 Hooks

The controller listens on a network socket for hook requests. Monitoring devices send an XML fragment which contains the name of the hook, the name of the component to which traffic should be sent, values for one or more fields in the 10-tuple, and the duration the hook entry should last. The controller installs a high-priority flow entry with the appropriate 10-tuple, timeout, and output action. The in_port value in the flow table entry is determined based on the component that precedes the hook component in the hook path.
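A monitoring device's side of this exchange might look like the following sketch. The element and attribute names are hypothetical; the paper specifies only what information a request carries, not its exact schema:

```python
import xml.etree.ElementTree as ET

def hook_request(hook, component, duration, **fields):
    """Build a hook-request fragment naming the hook, the destination
    component, a duration, and one or more 10-tuple field values.
    Element/attribute names here are assumptions, not OpenSAFE's schema."""
    root = ET.Element("hook", name=hook, to=component, duration=str(duration))
    for field, value in fields.items():
        ET.SubElement(root, "match", field=field, value=str(value))
    return ET.tostring(root).decode()

# e.g. ask hook1 to steer HTTP traffic to tcpdump2 for 60 seconds
request = hook_request("hook1", "tcpdump2", 60, tp_dst=80)
```

A device would then write this fragment over a TCP connection to the controller's listening socket.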

Dynamic flow entries installed for hooks are not combined with the rest of the paths specified in an ALARMS policy. If a hook request overlaps with an existing path, the hook request takes precedence. For example, assume the following set of paths:

mirror[http] -> tcpdump1;
mirror -> hook1;

A request for hook1 to send all HTTP traffic to tcpdump2 will result in all HTTP traffic going to tcpdump2 instead of tcpdump1 for the duration of the request. After the hook request times out, the flow entry for the first path will again take effect.

6. EVALUATION

OpenSAFE needs to handle traffic volumes at high line rates to be able to serve as a feasible network monitoring system. We verify OpenSAFE meets this requirement by measuring its performance using both real-world and synthetic traffic. First, we compare our implementation against an existing monitoring infrastructure and show that OpenSAFE loses less traffic (Section 6.1). Second, we run our implementation with varying rule sets using a constant set of synthetic traffic (Section 6.2). We demonstrate that OpenSAFE handles sustained amounts of high traffic volume.

Our implementation of OpenSAFE uses an OpenFlow 0.8.9 enabled NEC IP8800 10 gigabit switch. The controller is written as a Python module for NOX 0.6.0.

Figure 13: Packets per second received by the optimized production IDS system compared with the OpenSAFE IDS team.

Figure 14: Packets per second examined by the production IDS system software versus the OpenSAFE IDS team.

6.1 Comparison to Existing Infrastructure

In this section we describe the comparative tests we performed against an existing monitoring system. We begin by describing the existing setup at the University of Wisconsin–Madison, followed by our OpenSAFE setup, and conclude with a discussion of the test results.

6.1.1 Test Setup

The existing production monitoring setup has been highly optimized by the College of Engineering's local security administrator with technologies such as PF_RING [7] and TNAPI [8]. PF_RING is a network socket type which can avoid excessive kernel memory copying operations that can cause packets to be lost during high bandwidth captures. TNAPI is a threaded network device polling method that is designed


Figure 15: Percentage of packets received that were actually examined by the IDS software (note: the IDS does not examine all types of packets).

to make handling interrupts on multi-core machines more efficient. Both technologies require special kernel and application modifications that can make the system quite brittle. However, they have both been shown to improve standard packet capture techniques by as much as 100% [6, 7, 8].

The production system runs three pieces of monitoring software: Suricata [18], Barnyard2 [3], and nProbe [15], each compiled with PF_RING support. Suricata is a multi-threaded content matching IDS, like Snort, that uses the same rules and logging. Barnyard2 is a program that reads IDS alert logs and consolidates them in a remote location such as a BASE [4] database. nProbe is a tool for collecting flow data in a distributed sensor fashion and reporting the data back to a collector. The production system's hardware is comprised of a single Dell PowerEdge 2950 with a 2.0 GHz Dual Core Xeon 5130 CPU and a 10-Gigabit Intel 82598EB XF LR server fiber adapter.

Our OpenSAFE monitoring setup was composed of six spare desktop machines attached to the NEC OpenFlow switch. The six machines' hardware specs included two HP xw4300s (Pentium 4 3.4 GHz), one Dell GX620 (Pentium 4 3.4 GHz), two Dell Optiplex 755s (3.0 GHz Core2 E8400), and one HP dc5800 (2.6 GHz Core2 Q9400). We refer to this group of machines as the "OpenSAFE team." Each machine was set up with the same set of monitoring software, configurations, and rule sets as the production system, with the exception that we did not use PF_RING or TNAPI on any of these machines.

All traffic from the border router was sent to both the production machine and OpenSAFE by using a 50/50 optical splitter on the single mirror port available on the router. Figure 16 is a picture of the head-to-head testing platform in the MDF in one of our buildings. While the comparative data we present in this paper comes solely from Suricata, we included the other monitoring software to maintain a fair test

Figure 16: The head-to-head test platform.

and to illustrate the diversity in monitoring techniques.

From the OpenFlow switch we further split the traffic amongst our OpenSAFE IDS machines by statically partitioning the college's local subnets between the machines. Since neither our machines nor subnets are of equal capacity, we used the traffic counts at configuration time as well as the load average of the individual IDS machines to attempt to manually balance the traffic. Given that traffic fluctuates over time, this configuration was almost certainly suboptimal. A portion of our ALARMS policy file is included below.

### define switches
switch switch1 = 0x12f2c720cc;

### define input ports
input mirror = of1:0;

### define sink ports
sink ids1 = of1:1;
...
sink ids6 = of1:6;

### define selects
select vlan1 = nw_dst: 10.0.1.0 && nw_dst_n_wild: 8
            || nw_src: 10.0.1.0 && nw_dst_n_wild: 8;
...
select vlan36 = nw_dst: 10.0.36.0 && nw_dst_n_wild: 8
            || nw_src: 10.0.36.0 && nw_dst_n_wild: 8;

### define rules
mirror[vlan1] -> ids1;
mirror[vlan2] -> ids1;
...
mirror[vlan31] -> ids6;
mirror[vlan36] -> ids6;

We initially hoped to use a performance reporting device we wrote for use with a PROB distribution rule to dynamically load-balance the traffic. However, the NEC switch's flow table is limited to around 3000 entries. The college border sees an average of 330 new flows/second, so we rapidly overflow the NEC's flow table in a matter of seconds. As noted above, we avoided this issue by using a limited number of solely static rules. An alternative solution is to decrease the timeout for dynamic flow entries. State is reclaimed faster, allowing the flow table to keep pace with the frequency of new flows. However, if entries are removed too quickly, packets will frequently need to be directed to the controller, increasing latency and resulting in poor performance. This remains an open issue in OpenSAFE.
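The back-of-envelope arithmetic behind this limit can be worked out directly from the numbers above; with the default 30-second timeout, dynamic entries would accumulate well past the table's capacity:

```python
table_capacity = 3000   # approximate NEC flow table size (entries)
new_flows_per_s = 330   # average new flows seen at the college border
timeout_s = 30          # OpenSAFE's default dynamic-entry timeout

# entries alive at any moment, assuming each lives one full timeout
steady_state = new_flows_per_s * timeout_s

# how quickly an empty table fills at that arrival rate
seconds_to_fill = table_capacity / new_flows_per_s

# steady_state is 9900 entries, more than 3x the capacity, and the table
# fills in roughly 9 seconds, hence the static-only configuration above.
```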

According to Antonatos et al. [1, 2], drop count (i.e., the number of packets received versus the number of packets examined) is the most useful comparison metric for content matching IDS software. We compare Suricata's reports of the number of packets it processed to the number of packets the device or PF_RING saw. We combine the counts from all of the OpenSAFE team members and summarize the data into 30 minute averages.

6.1.2 Results

We ran our test over four days, including one weekend. Figure 13 shows the average number of packets/second received by the existing system compared with the OpenSAFE team. Here we see a typical diurnal traffic pattern. The OpenSAFE team received almost the same amount of traffic as the existing system (96% in total). The major exception occurs on Sunday afternoon when one of the older desktops (an HP xw4300) missed a large subset of traffic. On Monday evening we examined the data, determined that the machine was experiencing hardware problems, and replaced it with an HP dc7900 (3.0 GHz Core2 E8400) on Tuesday evening. During our replacement we simply updated the ALARMS policy file to direct those subnets to the new machine. We include this event in part to illustrate the flexibility of our system in action.

Figure 14 displays the average number of packets/second examined by the IDS software. Unlike the OpenSAFE team, which is able to closely follow the number of packets received, there is a clear threshold past which the production system cannot keep up. Here the OpenSAFE team examined 54% more packets in total. In addition, we saw that after only the first day the OpenSAFE team had registered 4,926 more alerts, 30% more than the production system registered during the same time. This demonstrates that the additional parallelism obtained through multiple lower bandwidth IDS machines allows OpenSAFE to scale better.

Figure 15 shows the number of packets examined as a percentage of all those received. Here we note that although the OpenSAFE team outperforms the existing system, it still does not reach 100%. We believe that this is caused by the IDS software only considering certain packets for examination (e.g., only TCP, UDP, and ICMP).

Figure 17: Synthetic traffic generation and measurement setup.

Figure 18: Gigabits per second of cross traffic versus average individual packet latency in microseconds.

6.2 Synthetic Loads

In order to verify that OpenSAFE does not introduce excessive latency in the monitoring system, we replayed multiple real-world traces from our previous comparison in varying configurations. We were able to show that increased network load has almost no effect on individual packet latency.

We modified the payload of our trace data so that each packet stored a unique identifier, allowing us to match packets up in order to calculate the amount of time required to send a packet through OpenSAFE.

We used six Dell PowerEdge R210 servers to generate and measure our synthetic traffic loads. The servers are equipped with 2.4 GHz Quad-core Xeon CPUs and two 1 Gbps NICs. Each server served as both an input and a sink (Figure 17). Each machine replayed traffic across the fabric according to the paths defined by OpenSAFE. In this case it sent traffic through one of its interfaces across the switch, through a patch cable to another switch port, and back to the same interface on the server.2 This was represented using the following ALARMS policy:

2 We did this because OpenFlow does not allow us to send back to the same port.


input poweredge2out = of:2;
sink poweredge2in = of:2;
filter to patch2 = of:0;
filter from patch2 = of:1;
poweredge2out -> patch2 -> poweredge2in;

In this way the patch cable is effectively represented as a null filter. This allowed us to capture both the send and receive sides of packets on the same interface, which increases our timing accuracy since we avoid clock skew between machines. We ran tcpdump on both of the server's NICs to capture the time a packet is sent and the time the same packet is received on the other interface. Using tcpdump allows us to measure the overall system latency from kernel to kernel without any userspace queuing effects.
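The pairing of the two tcpdump captures can be sketched as follows, assuming each capture is first reduced to (packet_id, timestamp) pairs using the unique identifiers embedded in the payloads (the sample timestamps are illustrative):

```python
def latencies(sent, received):
    """Match send and receive timestamps by packet identifier and return
    per-packet latency in seconds; unmatched (lost) packets are dropped."""
    recv_time = dict(received)
    return [recv_time[pkt_id] - t_tx
            for pkt_id, t_tx in sent
            if pkt_id in recv_time]

sent = [(1, 0.000000), (2, 0.000100), (3, 0.000200)]
received = [(1, 0.000450), (2, 0.000540)]   # packet 3 was lost
per_packet = latencies(sent, received)      # one latency per matched id
```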

We ran three iterations of each replay test, each consisting of 18.6M packets. For each test we increased the amount of cross traffic by having another one of the R210 servers replay traffic.

As can be seen from Figure 18, there is almost no change in the average latency or jitter. From this we see that when used with static rules the OpenSAFE fabric does not impose any additional performance overhead.

7. DISCUSSION

OpenSAFE is designed for both flexibility and high performance. As our results show, it outperforms an existing monitoring infrastructure and scales with increasing bandwidth. Our major concerns are state exhaustion, matching ability, and flow insertion latency. Further performance improvements and extensibility depend on new capabilities in the programmable switch and resolution of some unique implementation quirks.

7.1 Dynamic Rule Latency

Latency to send packets to the controller is an important concern for dynamic rules. The OpenFlow version 0.8.9 specification does not have explicit hashing functions, requiring OpenSAFE to utilize the controller for emulating certain ALARMS rules. Packets destined for an ANY, RR, HASH, or PROB rule need to be sent to the controller for the appropriate function to be applied. After computing the destination, the controller needs to send messages to the switch to install flow entries based on the outcome of the function. This results in a relatively long round-trip time to the controller for each dynamic rule along the path that a flow will take. In addition, the controller has the potential of being overwhelmed if large numbers of packets are sent for dynamic flow installation.

As we have stated, we avoid these problems by carefully constructing OpenFlow entries that minimize the number of flows that are sent to the controller. Additional study should be done in the area of pre-computing more dynamic distribution rules and aggregation of similar rules. It is possible that a particular hash function could be covered by a specific

Figure 19: For a particular flow, packets may arrive out of order.

set of static OpenFlow rules; this is obviously not general to all hash functions, but it could be used to improve performance in some cases. Additionally, simple dynamic distribution rules that do not require any state, like ANY, could be added to the OpenFlow specification to reduce activity on the controller.

7.2 The Packet Ordering Problem

One surprising finding is how OpenFlow handles off-path traffic. Traditional OpenFlow routing is on-path. This means that as a network connection is being established, the traffic itself is delayed by the time that it takes the controller to make a decision and install OpenFlow rules into the switch. We will call this time RTT_controller to denote the round trip time from when the first packet of a flow comes in to when the controller's rule has been inserted into the switch.

In OpenSAFE, however, OpenFlow is handling a copy of network traffic. Unlike the on-path situation above, traffic continues while the controller is making a decision. If it is necessary to have the controller process this flow, packets must be sent to the controller. However, many packets could arrive at the mirror port before the controller can respond.

Until a matching flow has been installed into the flow table on the OpenFlow switch, packets from that flow will be sent to the controller. As is shown in Figure 19, during the first RTT_controller all packets are sent to the controller. During the second RTT_controller, packets from the first round trip time return from the controller while new incoming packets are routed per the flow table in the switch. Beginning with the third RTT_controller, packets will be ordered correctly.

This issue is minimized in two ways. The easiest way is to minimize RTT_controller, thereby drastically reducing the number of packets that arrive out of order.

The other minimizing factor is that monitoring software already gracefully handles reordering of packets unassisted. In particular, to correctly decode a TCP stream, reassembly is required unless the traffic is being handled by a proxy.

If desired, a filter could be added that provides the reassembly. This could be facilitated either by out-of-band signaling from the controller or by inspecting the packets for proper order.

It is important to note this problem does not occur with precomputed flows, as RTT_controller is effectively zero. Packets begin flowing in the steady state shown in Figure 19 immediately.

8. RELATED WORK

Casado et al. describe using Ethane [5] switches to enforce middlebox policies and propose a language, Pol-Eth, for describing these policies. However, their work on Pol-Eth is primarily designed around reachability and the idea that middleboxes would still be on the logical path of a flow (even if not explicitly on the physical path). Our work differs in that OpenSAFE implements a policy language, ALARMS, which handles a copy of the network traffic instead of operating on a middlebox inserted into the network. As such, OpenSAFE does not handle end-to-end connectivity but rather unidirectional flows.

Joseph et al. propose a similar architecture to Ethane in their work on policy-aware switching [13]. However, they do away with OpenFlow's concept of a centralized controller, instead relying on each switch to individually determine the next hop and forward packets immediately. This improves throughput, especially with large quantities of brief flows (where the overhead of contacting the controller is significant), but makes some aspects of network management more difficult, as no single entity has a complete view of the network. Additionally, the policy specification language described in their work is still centered around deciding appropriate paths for a flow, rather than a higher-level concept of what network monitoring needs to be applied.

A Flow-Based Security Language (FSL) [12] for expressing network policy has been suggested by Hinrichs et al. FSL, a variant of Datalog, allows specification of policies such as access controls, isolation, and communication paths. This specification is flexible and fast, capable of performing lookup and enforcement at high line rates. Again, however, the language is generally focused on end-to-end reachability and path selection, without specific thought to network monitoring.

Flowstream, proposed by Greenhalgh et al. [10], considers using OpenFlow as the connecting fabric on a middlebox. Conceptually, Flowstream has similar ideas to OpenSAFE; however, it operates on-path and does not describe a high-level policy language.

9. CONCLUSION

Network security monitoring in today's large-scale networks is a difficult task. Rather than attempting to solve all parts of the problem, including how to analyze network traffic, we focused on how to route traffic to monitoring appliances. Current solutions for routing monitored traffic are expensive, difficult to manage, and have problems scaling to high line rates.

OpenSAFE is a cost-effective approach which allows for flexible, fast, and scalable monitoring. It uses a widely available OpenFlow-enabled switch to direct copies of traffic through the monitoring infrastructure and scales to line rates. Management is facilitated by ALARMS, a language to enable the arbitrary redirection of network flows for measurement and security purposes. We showed that OpenSAFE outperforms an existing, finely-tuned monitoring setup, examining 54% more packets. We also showed that OpenSAFE scales to meet the demands of increasing bandwidth.

OpenSAFE makes monitoring large-scale networks easier than before. It can be combined with other security monitoring improvements to efficiently and effectively monitor high traffic volumes. Most importantly, it lays the groundwork for monitoring infrastructures to meet the changing and growing demands of enterprise networks.

Acknowledgments

We are extremely grateful to David De Coster and the College of Engineering at the University of Wisconsin–Madison for their assistance in this project. Ian Rae contributed to the initial version of this work. This work is supported in part by a grant from the GENI Project Office ("Campus Trials of E-GENI").

10. REFERENCES

[1] S. Antonatos, K. Anagnostakis, and E. Markatos. Generating realistic workloads for network intrusion detection systems. ACM SIGSOFT Software Engineering Notes, 29(1):207–215, 2004.

[2] S. Antonatos, K. Anagnostakis, E. Markatos, and M. Polychronakis. Performance analysis of content matching intrusion detection systems. In Proceedings of the 4th IEEE/IPSJ Symposium on Applications and the Internet (SAINT 2004), 2004.

[3] Barnyard2: Snort Output Spool Reader. http://www.securixlive.com/barnyard2/index.php.

[4] BASE: Basic Analysis and Security Engine. http://base.secureideas.net/.

[5] M. Casado, M. J. Freedman, J. Pettit, J. Luo, N. McKeown, and S. Shenker. Ethane: Taking Control of the Enterprise. In J. Murai and K. Cho, editors, SIGCOMM, pages 1–12. ACM, 2007.

[6] G. Cascallana and E. Lizarrondo. Collecting Packet Traces at High Speed. In IEEE Workshop on Monitoring, Attack Detection and Mitigation (MonAM), 2006.

[7] L. Deri et al. Improving passive packet capture: Beyond device polling. In Proceedings of SANE, 2004.

[8] L. Deri and F. Fusco. Exploiting Commodity Multicore Systems for Network Traffic Analysis, 2009.


[9] GigaMon GigaVue Switch. http://www.gigamon.com/gigavue-2404.php.

[10] A. Greenhalgh, F. Huici, M. Hoerdt, P. Papadimitriou, M. Handley, and L. Mathy. Flow processing and the rise of commodity network hardware. ACM SIGCOMM Computer Communication Review, 39(2):20–26, 2009.

[11] N. Gude, T. Koponen, J. Pettit, B. Pfaff, M. Casado, N. McKeown, and S. Shenker. NOX: Towards an operating system for networks. SIGCOMM Comput. Commun. Rev., 38(3):105–110, 2008.

[12] T. Hinrichs, N. Gude, M. Casado, J. Mitchell, and S. Shenker. Expressing and Enforcing Flow-Based Network Security Policies. Technical report, University of Chicago, 2008.

[13] D. A. Joseph, A. Tavakoli, and I. Stoica. A Policy-aware Switching Layer for Data Centers. In V. Bahl, D. Wetherall, S. Savage, and I. Stoica, editors, SIGCOMM, pages 51–62. ACM, 2008.

[14] N. McKeown, T. Anderson, H. Balakrishnan, G. Parulkar, L. Peterson, J. Rexford, S. Shenker, and J. Turner. OpenFlow: Enabling Innovation in Campus Networks. ACM SIGCOMM Computer Communication Review, 38(2):69–74, 2008.

[15] nProbe: An Extensible NetFlow/IPFIX Network Probe. http://www.ntop.org/nProbe.html.

[16] S. Snapp, J. Brentano, G. Dias, T. Goan, L. Heberlein, C. Ho, K. N. Levitt, B. Mukherjee, S. Smaha, T. Grance, et al. DIDS (Distributed Intrusion Detection System): Motivation, architecture, and an early prototype. In Proceedings of the 14th National Computer Security Conference, pages 167–176, 1991.

[17] T. Sproull and J. Lockwood. Distributed Intrusion Prevention in Active and Extensible Networks. Lecture Notes in Computer Science, 3912:54, 2007.

[18] Suricata Open Source Intrusion Detection and Prevention Engine. http://www.openinfosecfoundation.org/.


