Master’s Thesis (30 credits) By: Morten Lindeberg Supervisors: Vera Goebel and Jarle Søberg Design, Implementation, and Evaluation of Network Monitoring Tasks for the Borealis Stream Processing Engine
Master’s Thesis (30 credits)

By: Morten Lindeberg

Supervisors: Vera Goebel and Jarle Søberg

Design, Implementation, and Evaluation of Network Monitoring Tasks for the Borealis Stream Processing Engine

Slide no. 2

Outline

• Problem description
• Application domains
• Data stream management system (DSMS)
• Borealis
• Design
• Experiment Setup
• Implementation
• Evaluation
• Conclusion
• Future Work

Network monitoring tasks

Slide no. 3

Problem Description

• Design, Implementation, and Evaluation of Network Monitoring Tasks for the Borealis Stream Processing Engine

• Network Monitoring Tasks:
– Task-1: Verify Borealis load shedding mechanisms.
– Task-2: Measure the average load of packets and network load per second over a one minute interval.
– Task-3: How many packets have been sent to certain ports during the last five minutes?
– Task-4: How many bytes have been exchanged on each connection during the last ten seconds?
– Task-5: Identify possible SYN flood attacks.

Slide no. 4

Application Domains

• Network monitoring (Controlling and measuring the Internet or parts of it)

– Challenges
  • Traffic volumes
  • Get relevant data
  • Privacy
– On-line network measurements
  • Passive: Our network tasks
  • Active: E.g. Traceroute and Ping
– Off-line network measurements
  • Passive: E.g. InTraBase (Siekkinen, 2006)
  • Active: Pandora FMS (Pandora, 2007)

[Figure labels: N.M, Private network, DB. Annotations: "Looks at all passing packets", "Push-based".]

Slide no. 5

Cont. Application Domains

• Sensor networks
– TinyDB
• Financial tickers
– Traderbot

Pull-based

Push-based

Slide no. 6

DSMS

• Stream Data Model
– Definition: A data stream is a real-time, continuous, ordered sequence of items (Golab, 2003)

Slide no. 7

Cont. DSMS

• Requirements
– Continuous query language
– Data reduction techniques
  • Sampling
  • Load shedding
  • Aggregations with window techniques
    Without windows, an aggregation would be a blocking operator, since one will never see the whole stream at once.
– Adaptive
– Integration with a traditional database
– Low latency and high throughput
  Streaming tuples should only be kept in main memory, never written to disk (too slow).

Window techniques:
• Hopping windows
• Tumbling windows
• Overlapping windows

Windows are either time-based or tuple-based.
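
The three window techniques named on this slide can be told apart by how far a window advances relative to its size. The sketch below assumes one common reading that the slides do not spell out: advance equal to size is tumbling, advance smaller than size is overlapping, and advance larger than size is hopping (leaving gaps). `WindowSpec`, `classify` and `windowsContaining` are hypothetical names, not part of Borealis.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical description of a time-based window: `size` seconds wide,
// with a new window opening every `advance` seconds.
struct WindowSpec {
    std::int64_t size;
    std::int64_t advance;
};

// Classification under the assumed definitions stated in the lead-in.
std::string classify(const WindowSpec& w) {
    assert(w.size > 0 && w.advance > 0);
    if (w.advance == w.size) return "tumbling";
    if (w.advance < w.size) return "overlapping";
    return "hopping";
}

// Start times of every window that contains timestamp `t` (t >= 0).
// Window k covers the half-open interval [k * advance, k * advance + size).
// Results are listed from the latest window start to the earliest.
std::vector<std::int64_t> windowsContaining(const WindowSpec& w, std::int64_t t) {
    assert(w.size > 0 && w.advance > 0 && t >= 0);
    std::vector<std::int64_t> starts;
    for (std::int64_t k = t / w.advance; k >= 0; --k) {
        const std::int64_t start = k * w.advance;
        if (t >= start + w.size) break;  // earlier windows end even sooner
        starts.push_back(start);
    }
    return starts;
}
```

With a 60-second window advancing every 20 seconds, a tuple with timestamp 125 falls into three overlapping windows. With a 10-second window advancing every 60 seconds, a tuple with timestamp 135 falls into none, which is the gap a hopping window leaves.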

Slide no. 8

Cont. DSMS

• Existing systems:

Name                           Language
TelegraphCQ (Berkeley Uni.)    SQL-like
STREAM (Stanford Uni.)         SQL-like
Aurora (Brown, M.I.T++)        Boxes and arrows
Medusa (Brown, M.I.T++)        Boxes and arrows
Borealis (Brown, M.I.T++)      Boxes and arrows
Gigascope ($ AT&T)             SQL-like

Slide no. 9

Borealis

• Stream processing engine (SPE)
– Academic research / Public domain
– Distributed queries
– General purpose
  • Multi-player first person shooter game
  • Network monitoring
• Continuous query language
– Operator boxes and stream arrows
– XML + GUI
– E.g., operators: Map, Aggregate, Join, Filter, Random Drop and operators for integration with statically stored tables
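
Of the operators listed, Random Drop is the one used for load shedding: it discards a share of the incoming tuples so that downstream boxes see less load. A minimal sketch of that idea follows, assuming each tuple is dropped independently with a fixed probability. `randomDrop`, `dropRate` and `seed` are hypothetical names for illustration and do not reflect the actual Borealis box parameters.

```cpp
#include <cassert>
#include <cstdint>
#include <random>
#include <vector>

// Returns the tuples that survive shedding. Each input tuple is dropped
// independently with probability `dropRate` (0.0 keeps all, 1.0 drops all).
// The fixed default seed only makes runs repeatable.
template <typename Tuple>
std::vector<Tuple> randomDrop(const std::vector<Tuple>& in,
                              double dropRate,
                              std::uint32_t seed = 42) {
    assert(dropRate >= 0.0 && dropRate <= 1.0);
    std::mt19937 rng(seed);
    std::bernoulli_distribution drop(dropRate);
    std::vector<Tuple> out;
    for (const auto& tuple : in) {
        if (!drop(rng)) out.push_back(tuple);
    }
    return out;
}
```

Dropping tuples at random keeps the operator cheap, but every aggregate computed downstream then describes a sample rather than the full stream.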

[Figure: a distributed query over nodes n1 to n6. Labels: "Data stream", "Result tuples", "High Availability".]

Slide no. 10

Design

Task 1 - Version 1
– Mapping

Task 2 - Version 1
– Average load and packet count
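
The diagram for Task 2 did not survive in this transcript, so here is a small sketch of the computation the task describes: the average number of packets and the average network load per second over a one minute interval. It assumes the load is reported in bits per second and that the packet lengths of one interval are already collected. `LoadAverages` and `averageLoad` are hypothetical names; this is not the Borealis query from the thesis.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct LoadAverages {
    double packetsPerSecond;
    double bitsPerSecond;
};

// `packetLengths` holds the size in bytes of every packet seen in one
// interval of `intervalSeconds` seconds (one minute by default).
LoadAverages averageLoad(const std::vector<std::uint32_t>& packetLengths,
                         double intervalSeconds = 60.0) {
    assert(intervalSeconds > 0.0);
    std::uint64_t totalBytes = 0;
    for (std::uint32_t length : packetLengths) totalBytes += length;
    return LoadAverages{
        static_cast<double>(packetLengths.size()) / intervalSeconds,
        static_cast<double>(totalBytes) * 8.0 / intervalSeconds};
}
```

For three packets of 1500, 500 and 1000 bytes in one minute, this gives 0.05 packets per second and 400 bit/s.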

Slide no. 11

Cont. Design

Task 3 - Version 2
– Port destination count

Task 4 - Version 2
– Exchanged bytes
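
As a rough illustration of Task 4 (bytes exchanged on each connection during the last ten seconds), the sketch below sums packet lengths per connection over a time window. It rests on several assumptions: a connection is identified by its two address and port endpoints, both directions count toward the same connection, and timestamps are in seconds. `PacketHeader`, `connectionKey` and `bytesPerConnection` are hypothetical names, and the code stands in for, rather than reproduces, the thesis's query.

```cpp
#include <cstdint>
#include <map>
#include <tuple>
#include <utility>
#include <vector>

// Hypothetical packet-header tuple; the field names are illustrative only.
struct PacketHeader {
    std::int64_t timestamp;  // seconds
    std::uint32_t srcIp;
    std::uint32_t dstIp;
    std::uint16_t srcPort;
    std::uint16_t dstPort;
    std::uint32_t length;    // bytes
};

using ConnectionKey =
    std::tuple<std::uint32_t, std::uint16_t, std::uint32_t, std::uint16_t>;

// Direction-independent key: the smaller endpoint always comes first, so
// packets in both directions map to the same connection.
ConnectionKey connectionKey(const PacketHeader& p) {
    auto a = std::make_pair(p.srcIp, p.srcPort);
    auto b = std::make_pair(p.dstIp, p.dstPort);
    if (b < a) std::swap(a, b);
    return std::make_tuple(a.first, a.second, b.first, b.second);
}

// Sums the bytes per connection for packets with a timestamp in the
// half-open interval (now - windowSeconds, now].
std::map<ConnectionKey, std::uint64_t> bytesPerConnection(
    const std::vector<PacketHeader>& packets,
    std::int64_t now,
    std::int64_t windowSeconds = 10) {
    std::map<ConnectionKey, std::uint64_t> totals;
    for (const auto& p : packets) {
        if (p.timestamp > now - windowSeconds && p.timestamp <= now) {
            totals[connectionKey(p)] += p.length;
        }
    }
    return totals;
}
```

A streaming engine would keep these sums incrementally per window instead of rescanning a vector; the batch form is used here only to keep the example short.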

Slide no. 12

Cont. Design

Task 5 - Version 1
– SYN Flood attack (Several hosts initiate half-open connections to a server so that it has to deny service to others)
– Identifies the relation between the count of SYN packets and normal packets (Non-SYN). Joins aggregated tuples if SYN count is twice or more the normal packet count.

Slide no. 13

Cont. Design

<!-- Split the packet stream: SYN packets go to "Syn", all others to "Normal". -->
<box name="synfilter" type="filter">
  <in stream="Packet" />
  <out stream="Syn" />
  <out stream="Normal" />
  <parameter name="expression.0" value="syn == 1" />
  <parameter name="pass-on-false-port" value="1" />
</box>

<!-- Count the non-SYN packets in each window. -->
<box name="Normalcount" type="aggregate">
  <in stream="Normal" />
  <out stream="AggregateNormal" />
  <parameter name="aggregate-function.0" value="count()" />
  <parameter name="aggregate-function-output-name.0" value="count" />
  <parameter name="window-size-by" value="VALUES" />
  <parameter name="window-size" value="1" />
  <parameter name="advance" value="1" />
  <parameter name="order-by" value="FIELD" />
  <parameter name="order-on-field" value="timestamp" />
</box>

<!-- Count the SYN packets in each window. -->
<box name="Syncount" type="aggregate">
  <in stream="Syn" />
  <out stream="AggregateSyn" />
  <parameter name="aggregate-function.0" value="count()" />
  <parameter name="aggregate-function-output-name.0" value="count" />
  <parameter name="window-size-by" value="VALUES" />
  <parameter name="window-size" value="1" />
  <parameter name="advance" value="1" />
  <parameter name="order-by" value="FIELD" />
  <parameter name="order-on-field" value="timestamp" />
</box>

<!-- Emit a result when the SYN count is more than twice the normal count. -->
<box name="SynfloodJoin" type="join">
  <in stream="AggregateNormal" />
  <in stream="AggregateSyn" />
  <out stream="Result" />
  <parameter name="predicate" value="left.count * 2 &lt; right.count and left.count &gt; 0" />
  <parameter name="left-buffer-size" value="1" />
  <parameter name="left-order-by" value="VALUES" />
  <parameter name="left-order-on-field" value="timestamp" />
  <parameter name="right-buffer-size" value="1" />
  <parameter name="right-order-by" value="VALUES" />
  <parameter name="right-order-on-field" value="timestamp" />
  <parameter name="out-field-name.0" value="timestamp" />
  <parameter name="out-field.0" value="left.timestamp" />
  <parameter name="out-field-name.1" value="ratio" />
  <parameter name="out-field.1" value="right.count / left.count" />
  <parameter name="out-field-name.2" value="syn" />
  <parameter name="out-field.2" value="right.count" />
  <parameter name="out-field-name.3" value="normal" />
  <parameter name="out-field.3" value="left.count" />
</box>
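
The join predicate in the query above can be read in isolation as a small decision rule. The sketch below restates it in C++ to make the boundary cases visible. As written, the predicate fires only when the SYN count is strictly more than twice the normal count and at least one normal packet was seen, which is slightly narrower than "twice or more". `isPossibleSynFlood` and `synRatio` are hypothetical helpers, and computing the ratio in floating point is an assumption about how `right.count / left.count` is evaluated.

```cpp
#include <cstdint>

// Mirrors the join predicate: left.count * 2 < right.count and left.count > 0,
// where left is the normal (non-SYN) count and right is the SYN count.
bool isPossibleSynFlood(std::uint64_t normalCount, std::uint64_t synCount) {
    return normalCount > 0 && normalCount * 2 < synCount;
}

// Mirrors the "ratio" output field. Intended to be called only when
// isPossibleSynFlood() is true, which guarantees normalCount > 0.
double synRatio(std::uint64_t normalCount, std::uint64_t synCount) {
    return static_cast<double>(synCount) / static_cast<double>(normalCount);
}
```

Note the two edge cases: a window with 10 normal and exactly 20 SYN packets is not reported, and neither is a window that contains only SYN packets, because the predicate requires `left.count > 0`.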

Slide no. 14

Experiment Setup

• Scripts execute the different stages of each experiment
• TG: Generates traffic
• fyaf: Filters packet headers from the NIC. Counts the number of packets retrieved by the C.A
• C.A (client application): Transforms the packet headers into tuples. I/O to the Q.P
• Q.P (query processor): Performs the query on the tuples retrieved from the C.A

System resource consumption is logged by the execution scripts.

fyaf calculates the number of lost packets.

TG controls the amount of generated traffic per second.

Slide no. 15

Borealis

Implementation

• Client application main-method:

int main( int argc, const char *argv[] )
{
    ...
    sock = get_connection();
    NOTICE << "Socket opened: " << sock;
    status = marshal.open();

    if ( status )
    {
        WARN << "Could not deploy the network.";
    }
    else
    {
        // Start the timer.
        timer = Time::now();

        // Send the first batch of tuples. Queue up the next round with a delay.
        marshal.sentPacket();

        // Run the client event loop. Return only on an exception.
        marshal.runClient();
    }
    ...
}

[Figure labels: fyaf, Client application, Data stream, Query processor, <xml-query>, Results.]

Slide no. 16

Evaluation

Results for Task 1 (the map task)

CPU Maximums

Drop box can lead to increased CPU utilization

Slide no. 17

Cont. Evaluation

Results for Task 2 (the simple task)

(Lost packets at different network loads)

40 Mbit/s

Slide no. 18

Cont. Evaluation

Results for Task 2 (the simple task)

(Task result - Measured Load)

Ac 98%

Ac 93%

Ac 96%

Slide no. 19

Cont. Evaluation

Results for Task 3 - Memory Consumption

Low memory consumption (31 Mbyte). No changes when increasing load.

Static tables cause increased memory consumption, but not much.

Slide no. 20

Cont. Evaluation

Task      Network Load      Memory Consumption
Task 1    30, 40 Mbit/s     31 Mbyte
Task 2    40 Mbit/s         31 Mbyte
Task 3    10, 30 Mbit/s     31, 33 Mbyte
Task 4    20 Mbit/s         31 Mbyte
Task 5    20 Mbit/s         30, 50+ Mbyte

Slide no. 21

Conclusion

• Supports complex network monitoring queries
• Borealis can handle network loads:
– 40 Mbit/s for simple tasks
– 20 - 30 Mbit/s for complex tasks
– 10 Mbit/s when comparing input packets with several thousand statically stored tuples
• Load Shedding
– Not fully working, does not identify overload situations
– random_drop box does not significantly increase supported network load
• Low memory consumption
– System code parameters might affect performance

Slide no. 22

Future Work

• Distribution of queries
• Expand client application (fyaf and load shedding)
• Optimization of source code system parameters
• New version of Borealis (Winter 2007)
• Comparison with results from TelegraphCQ (Søberg, 2006) and STREAM (Hernes, 2006)

Slide no. 23

Bibliography

• (Søberg, 2006) - Design, implementation, and evaluation of network monitoring tasks with the TelegraphCQ data stream management system, Master’s Thesis 2006.

• (Hernes, 2006) - Design, implementation, and evaluation of network monitoring tasks with the STREAM data stream management system, Master’s Thesis 2006.

• (Siekkinen, 2006) - Root Cause Analysis of TCP Throughput: Methodology, Techniques, and Applications, Dr. Scient. Thesis 2006.

• (Golab, 2003) - Issues in Data Stream Management, Lukasz Golab and M. Tamer Özsu, 2003.

• (Pandora, 2007) - http://pandora.sourceforge.net

