
Waterfall: Rapid identification of IP flows using cascade classification

Date post: 21-Jul-2015
Upload: pawel-foremski
Waterfall: Rapid identification of IP flows using cascade classification. Paweł Foremski, MSc. Eng., The Institute of Theoretical and Applied Informatics of the Polish Academy of Sciences, Gliwice, [email protected]. Brunów, 24th June 2014, CN 2014 Conference.
Transcript
Page 1: Waterfall: Rapid identification of IP flows using cascade classification

Waterfall: Rapid identification of IP flows using cascade classification

Paweł Foremski, MSc. Eng.

The Institute of Theoretical and Applied Informatics of the Polish Academy of Sciences, Gliwice

[email protected]

Brunów, 24th June 2014

CN 2014 Conference

Page 2

Identification of IP flows?

“traffic classification” or “traffic identification”

Page 3

TC: input - output

[Diagram: Input (network traffic) → Traffic Classifier → Output (application names)]

Page 4

TC input

• TC input is the object of classification:

o Single IP packet

o IP flow

o Endpoint

o Host
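The four granularities above differ mainly in how packets are grouped before classification. The following is a minimal sketch, assuming a hypothetical, simplified `Packet` record and the common conventions that a flow is identified by its 5-tuple, an endpoint by (protocol, IP address, port), and a host by its IP address alone. These definitions are illustrative assumptions, not taken from the slides.

```python
from collections import defaultdict
from typing import NamedTuple

class Packet(NamedTuple):
    # Hypothetical, simplified packet header record (illustration only).
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    proto: str   # e.g. "TCP" or "UDP"
    size: int    # packet size in bytes

def flow_key(p: Packet):
    """Flow: packets sharing the same 5-tuple (one direction)."""
    return (p.proto, p.src_ip, p.src_port, p.dst_ip, p.dst_port)

def endpoint_key(p: Packet):
    """Endpoint: destination (protocol, IP address, port) triple."""
    return (p.proto, p.dst_ip, p.dst_port)

def host_key(p: Packet):
    """Host: destination IP address only."""
    return p.dst_ip

def group_by(packets, key):
    """Group packets into classification objects using the given key function."""
    groups = defaultdict(list)
    for p in packets:
        groups[key(p)].append(p)
    return dict(groups)

# Made-up example traffic: four packets towards one server.
packets = [
    Packet("10.0.0.1", 40000, "93.184.216.34", 80, "TCP", 60),
    Packet("10.0.0.1", 40000, "93.184.216.34", 80, "TCP", 1500),
    Packet("10.0.0.1", 40001, "93.184.216.34", 80, "TCP", 60),
    Packet("10.0.0.2", 50000, "93.184.216.34", 443, "TCP", 60),
]
flows = group_by(packets, flow_key)
endpoints = group_by(packets, endpoint_key)
hosts = group_by(packets, host_key)
```

In this made-up example the same four packets form three flows, two endpoints, and one host, which shows how the choice of object changes the number of classification decisions.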

Page 5

TC output

• TC output is the result of classification:

o Application name – e.g. Skype, TeamViewer

o Network protocol – e.g. HTTP, SMTP

o Category – e.g. chat, streaming

o Traffic profile – e.g. bulk, interactive

o Content type – e.g. text, image

o Web application – e.g. Google Docs, Facebook

Page 6

TC: the problem

• How to identify network traffic?

• How to cope with practical constraints?

o With limited resources (on high-speed routers)

o With limited details (only packet headers)

o ...

• How to measure the performance?

o Result accuracy

o Reaction time

o Temporal stability

o Spatial stability

o ...

Page 7

TC: applications

[Diagram: traffic classes (HTTP, Skype, BitTorrent, FTP) and uses of TC: Queuing, Quality of Service, Firewall, Access Policy, Monitoring, Routing, ...]

Page 8

TC: applications

Alessandro Finamore, Marco Mellia, Michela Meo, Maurizio M. Munafò, Dario Rossi, "Experiences of Internet Traffic Monitoring with Tstat", IEEE Network, Vol. 25, No. 3, pp. 8-14, March/April 2011, ISSN: 0890-8044

Page 9

TC: applications

FTTH4 Mbps

ADSL24 Mbps

VoIP, DNS, Games, ...

BitTorrent, eMule, YouTube, ...

5-10 ms

50-100 ms

Page 10

TC: existing solutions

• Port numbers

• Deep Packet Inspection (DPI) - e.g. [2,3]

• Machine Learning - e.g. [5,9]

• Behavioral analysis - e.g. [4,7,8]

• Classifier fusion - e.g. [6]

Page 11

Waterfall: motivation

Each TC algorithm has advantages and disadvantages.

The problem: Could we integrate these approaches into one system so that we move forward in TC?

How would solving this problem affect classification performance?

Page 12

Waterfall: the idea

1. Use existing classifiers as modules

2. Implement the rejection option

3. Minimize false positives

4. Connect in a cascade structure

[Diagram: cascade of modules labelled 1, 2, 3]
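The four steps of the idea can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the published implementation: a flow is represented as a plain dict, each module returns a label or `None` (the rejection option), and the two modules `port_module` and `size_module` with their rules and thresholds are hypothetical toys.

```python
from typing import Callable, Optional, Sequence, Tuple

# A module maps a flow (here: a plain dict of features) to a label,
# or returns None to reject the flow, i.e. "I don't know".
Module = Callable[[dict], Optional[str]]

def port_module(flow: dict) -> Optional[str]:
    # Hypothetical toy rule set; a real module would be a trained classifier.
    known_ports = {25: "SMTP", 53: "DNS"}
    return known_ports.get(flow.get("dst_port"))

def size_module(flow: dict) -> Optional[str]:
    # Hypothetical rule: large average packet size suggests a bulk transfer.
    if flow.get("avg_pkt_size", 0) > 1200:
        return "bulk"
    return None

def cascade(flow: dict, modules: Sequence[Tuple[str, Module]]):
    """Query modules in order; the first module that does not reject decides."""
    for name, module in modules:
        label = module(flow)
        if label is not None:
            return label, name
    return "unknown", None  # every module rejected the flow

modules = [("port", port_module), ("size", size_module)]

print(cascade({"dst_port": 53}, modules))
print(cascade({"dst_port": 8080, "avg_pkt_size": 1400}, modules))
print(cascade({"dst_port": 8080, "avg_pkt_size": 100}, modules))
```

Because a module that answers ends the search, a wrong answer early in the cascade cannot be corrected later, which is one way to read the emphasis on minimizing false positives.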

Page 13

An old (yet new) idea

• Classifier selection

o Mixture of experts

o Cascade classification

• Classifier fusion

o Majority vote

o Weighted vote

o Naive Bayes Combination

o Behavior Knowledge Space

o ...

Kuncheva L., "Combining pattern classifiers: methods and algorithms", John Wiley & Sons, 2004
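For contrast with cascade classification, two of the fusion rules named above can be sketched as follows. This is a minimal illustration, assuming each classifier outputs exactly one label per flow; the weights in the example are hypothetical. Unlike a cascade, fusion needs the output of every classifier for every flow.

```python
from collections import Counter

def majority_vote(labels):
    """Return the label chosen by a strict majority of classifiers, else None."""
    if not labels:
        return None
    label, count = Counter(labels).most_common(1)[0]
    return label if count > len(labels) / 2 else None

def weighted_vote(labels, weights):
    """Return the label with the largest total weight (labels must be non-empty)."""
    totals = Counter()
    for label, weight in zip(labels, weights):
        totals[label] += weight
    return max(totals, key=totals.get)

# Three classifiers vote on one flow.
print(majority_vote(["HTTP", "HTTP", "Skype"]))
# Hypothetical weights, e.g. derived from each classifier's validation accuracy.
print(weighted_vote(["HTTP", "Skype", "Skype"], [0.9, 0.3, 0.4]))
```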

Page 14

Waterfall: the idea

Page 15

Waterfall: practical system

[Diagram: cascade modules dstip, dnsclass, portsize, npkts, port]

(Python source code available at mutrics.iitis.pl)

Flow features limited to first 10 seconds
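The 10-second limit can be read as a cut-off applied during feature extraction. The sketch below is an assumption about how such a limit might be applied, not the code published at mutrics.iitis.pl: packets are hypothetical (timestamp, size) pairs, and the two features `packet_count` and `byte_count` are chosen only for illustration.

```python
def flow_features(packets, window=10.0):
    """Compute features from the packets seen in the first `window` seconds.

    packets: list of (timestamp_seconds, size_bytes) tuples, sorted by time.
    """
    if not packets:
        return {"packet_count": 0, "byte_count": 0}
    start = packets[0][0]  # the flow starts at its first packet
    early = [(t, size) for t, size in packets if t - start <= window]
    return {
        "packet_count": len(early),
        "byte_count": sum(size for _, size in early),
    }

# Made-up flow: the last packet arrives 11 s after the first and is ignored.
example = [(100.0, 60), (101.5, 1500), (109.9, 40), (111.0, 1500)]
print(flow_features(example))
```

Bounding the observation window this way means a decision can be made at most 10 seconds after a flow starts, regardless of how long the flow lasts.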

Page 16

Waterfall: validation

• Over 3.5 TB of data in total

• Validation of spatial and temporal stability

Foremski P., Callegari C., Pagano M., "Waterfall: Rapid identification of IP flows using cascade classification", Proceedings of the 21st International Conference on Computer Networks, CN 2014, CCIS 431, pp. 14-23, Springer, 2014

Page 17

Validation: dataset 1

Page 18

Validation: dataset 2

Temporal stability (8 months)

Page 19

Validation: datasets 3 and 4

Spatial stability

No payloads

Page 20

Experiment 1: >50% is easy

>50%

>50%

Page 21

Experiment 2: more is faster

Adding specialized modules

Page 22

Discussion

• Waterfall is a new architecture for TC

• We propose an idea and an open source implementation

• A 5-element system yielded very good results

• Findings

o More than 50% of Internet traffic is easy to identify

o Adding more modules to the cascade can increase the speed

• Open questions

o Quantitative comparison: Waterfall vs. BKS (Behavior Knowledge Space)

o How to train the system in an optimal way?

o How to put the modules in a proper order?
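The finding that adding modules can increase speed follows from simple arithmetic on the cascade: a flow only pays for the modules it actually reaches. The sketch below is a worked example with made-up costs and hit rates (they are assumptions, not measurements from the paper); `expected_cost` is a hypothetical helper name.

```python
def expected_cost(stages):
    """Average processing cost per flow in a cascade.

    stages: list of (cost_per_flow, hit_rate) pairs in cascade order, where
    hit_rate is the fraction of flows reaching the stage that it classifies.
    A flow stops at the first stage that classifies it.
    """
    total, reach = 0.0, 1.0   # reach = fraction of flows arriving at the stage
    for cost, hit_rate in stages:
        total += reach * cost
        reach *= 1.0 - hit_rate
    return total

# Hypothetical numbers: one expensive general-purpose classifier ...
slow_only = [(100.0, 1.0)]
# ... versus the same classifier behind a cheap, specialized module
# that handles half of the flows.
with_fast = [(1.0, 0.5), (100.0, 1.0)]

print(expected_cost(slow_only))   # 100.0
print(expected_cost(with_fast))   # 51.0
```

With these illustrative numbers the extra module costs 1 unit per flow but saves 100 units on half of the flows, so the average drops from 100 to 51. The same calculation also shows why module order matters: cheap modules with high hit rates pay off most at the front.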

Page 23

References

1. Foremski P., On different ways to classify Internet traffic: a short review of selected publications, Theoretical and Applied Informatics, 2013; 25(2).

2. B.-C. Park, Y. J. Won, M.-S. Kim, and J. W. Hong, Towards automated application signature generation for traffic identification, in IEEE Network Operations and Management Symposium (NOMS 2008), pp. 160–167, IEEE, 2008.

3. S. H. Yeganeh, M. Eftekhar, Y. Ganjali, R. Keralapura, and A. Nucci, CUTE: Traffic Classification Using TErms, in 21st International Conference on Computer Communications and Networks (ICCCN), pp. 1–9, IEEE, 2012.

4. T. Karagiannis, K. Papagiannaki, and M. Faloutsos, BLINC: Multilevel traffic classification in the dark, in ACM SIGCOMM Computer Communication Review, vol. 35, pp. 229–240, ACM, 2005.

5. A. Finamore, M. Mellia, M. Meo, and D. Rossi, KISS: Stochastic packet inspection classifier for UDP traffic, IEEE/ACM Transactions on Networking, vol. 18, no. 5, pp. 1505–1515, 2010.

6. A. Dainotti, A. Pescapé, and C. Sansone, Early classification of network traffic through multi-classification, Traffic Monitoring and Analysis, pp. 122–135, 2011.

7. Foremski P., Callegari C., Pagano M., DNS-Class: Immediate classification of IP flows using DNS, International Journal of Network Management, John Wiley & Sons, 2014, DOI: 10.1002/nem.1864

8. P. Bermolen, M. Mellia, M. Meo, D. Rossi, and S. Valenti, Abacus: Accurate behavioral classification of P2P-TV traffic, Computer Networks, vol. 55, no. 6, pp. 1394–1411, 2011.

9. G. Münz, H. Dai, L. Braun, and G. Carle, TCP traffic classification using Markov models, Traffic Monitoring and Analysis, pp. 127–140, 2010.

Page 24

Thank you!

Paweł Foremski, [email protected]

Website: http://mutrics.iitis.pl/

Page 25

TC: definition

Internet traffic classification (or identification) is the act of matching IP packets to the applications that generated them. [1]

Page 26

TC: the problem

• How to identify network traffic?

• How to do it well?

o With limited resources (on high-speed routers)

o With limited details (only packet headers)

o With good accuracy (no errors)

o In limited time (in real-time)

o For current and future protocols (flexibility and stability)

o For the whole Internet (backbone routers and gateways)

• How to measure the performance?

o Result accuracy

o Reaction time

o Temporal stability

o Spatial stability

o Processing time

o Unknown detection

Page 27

Example: dnsclass

Foremski P., Callegari C., Pagano M., "DNS-Class: Immediate classification of IP flows using DNS", International Journal of Network Management, John Wiley & Sons, 2014

Page 28

dnsclass: details

Page 29

dnsclass: details

Page 30

dnsclass: motivation

