Part 2: Deep Packet Inspection Tutorial
Hendrik Schulze, ipoque
Wednesday, December 19, 12
Tutorial Scope
• What is DPI?
  – Definition
  – Applications
  – Technical Motivation
• How does DPI work?
  – Basic operations
  – Open source examples
• Skype and DPI
  – Application
  – Network behavior
  – Implications for DPI and analysis
What is DPI?
“Deep Packet Inspection (DPI) … is a form of computer network packet filtering that examines the data part (and possibly also the header) of a packet as it passes an inspection point, searching for protocol non-compliance, viruses, spam, intrusions or predefined criteria to decide if the packet can pass or if it needs to be routed to a different destination, or for the purpose of collecting statistical information. …”
[Wikipedia 02/2012]
• Examining the data part and header of a packet is what a DPI engine does … but searching for non-compliance, viruses, spam or intrusions, routing decisions and statistics are applications!
What is DPI?
Modified Definition:
• “Deep packet inspection (DPI) analyses all data of a packet (headers and payload) as it passes an inspection point in order to determine the protocol and/or application transported.”
• DPI provides meta-data of network traffic
  – Meta-data is the foundation of DPI applications
• DPI is different from decoding
  – it does not retrieve the data transferred
  – but: the term “DPI” is often also used for decoding
Three Use-cases of DPI
• Protocol/Application Classification
• Meta-data Extraction (Predetermined Payload area Analysis – PPA)
• Protocol Decoding (Full Payload area Analysis – FPA)
What are DPI applications I?
Applications that use Protocol Classification
• Network Operators
  – Billing
  – Tiered Services
  – Bandwidth Optimization
  – Policy Enforcement
• Security
  – NG Firewalls: Allow/Block Applications and Protocols
  – Virus scan only in sensitive traffic
• Network Probing
  – Statistics
  – Traffic Interception
  – Test and Measurement
What are DPI applications II?
Applications that use Meta-data Extraction
• Network Operators
  – Billing
  – Tiered Services
  – Policy Enforcement
• Security
  – NG Firewalls: prevent security-relevant actions
• Network Probing
  – Statistics
  – Traffic Interception
  – Quality Measurement
What are DPI applications III?
Applications that use Protocol Decoding
• Network Probing
  – Statistics
  – Traffic Interception/Investigation
  – Test and Measurement
Protocol/Application Classification
What problems does DPI solve?
• Network convergence creates a technical challenge
  – Different applications have different requirements
  – The Internet is practically a single-service network
• The Internet does not have a reliable means of application identification
  – Differentiated application/protocol handling requires reliable identification
A single service network?
[Figure: protocol stack of two communicating hosts (L7, TCP/UDP, IP, Link, PHY). The L7 message consists of the L7 header and application data; the TCP/UDP message adds a TCP/UDP header, the IP datagram adds an IP header, and the L2 frame adds an L2 header.]
Application/Protocol Classification?
• By IANA-assigned port numbers (“well known ports”)
  – http://www.iana.org/assignments/port-numbers
  – e.g. 80 – HTTP, 20 – FTP data, 21 – FTP control, 22 – SSH, 25 – SMTP, 1214 – Kazaa, 4662 – eMule, 6881-6999 – BitTorrent, 9001 – Tor, pretty much any – Skype
• Used in most firewall systems
  + Easy and fast look-up in real-time
  − Many protocols and applications do not adhere to the standard anymore
    – some protocols, such as P2P, deliberately avoid fixed ports
    – there is no guarantee that the well known ports are used
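A port-based classifier of the kind described above can be sketched in a few lines of C. This is only an illustration: the table holds the example ports from the slide, and the names `port_rule`, `PORT_RULES` and `classify_by_port` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* One entry of the port table. All names here are illustrative. */
struct port_rule {
    uint16_t lo;          /* first port of the range (inclusive) */
    uint16_t hi;          /* last port of the range (inclusive)  */
    const char *proto;    /* label reported for a match          */
};

/* Example ports taken from the slide. */
static const struct port_rule PORT_RULES[] = {
    {20, 20, "FTP data"}, {21, 21, "FTP control"}, {22, 22, "SSH"},
    {25, 25, "SMTP"}, {80, 80, "HTTP"}, {1214, 1214, "Kazaa"},
    {4662, 4662, "eMule"}, {6881, 6999, "BitTorrent"}, {9001, 9001, "Tor"},
};

/* Returns a protocol label for the port, or NULL when the port is unknown.
 * The label is only a guess: nothing forces an application to use its port. */
const char *classify_by_port(uint16_t port)
{
    for (size_t i = 0; i < sizeof PORT_RULES / sizeof PORT_RULES[0]; i++) {
        if (port >= PORT_RULES[i].lo && port <= PORT_RULES[i].hi)
            return PORT_RULES[i].proto;
    }
    return NULL;
}
```

The look-up is cheap, which is the "+" on the slide. The "−" is visible too: Skype has no row at all, and any application that picks port 80 would be reported as HTTP.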
The Failure of Port-Based Classification
The only way to understand L7 is to look at L7!
[Figure: protocol stack encapsulation (L7 message inside TCP/UDP message inside IP datagram inside L2 frame). The L7 header and application data sit innermost, behind the TCP/UDP, IP and L2 headers.]
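Looking at L7 means first stepping over the lower-layer headers. The following is a minimal sketch, assuming a raw IPv4 packet without the L2 header and ignoring fragmentation and IPv6; the names `l7_view` and `find_l7_payload` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Result of stripping the IP and transport headers from one packet. */
struct l7_view {
    const uint8_t *data;  /* first byte of the L7 message */
    size_t len;           /* number of L7 bytes available */
};

/* Finds the L7 payload inside a raw IPv4 packet carrying TCP (6) or UDP (17).
 * Returns 0 on success and -1 if the packet is truncated or unsupported.
 * Simplification: IP fragments and IPv6 are not handled. */
int find_l7_payload(const uint8_t *pkt, size_t len, struct l7_view *out)
{
    if (len < 20 || (pkt[0] >> 4) != 4)
        return -1;                                  /* no complete IPv4 header */
    size_t ip_hlen = (size_t)(pkt[0] & 0x0f) * 4;   /* IHL counts 32-bit words */
    size_t total = ((size_t)pkt[2] << 8) | pkt[3];  /* IP total length         */
    if (ip_hlen < 20 || total < ip_hlen || total > len)
        return -1;

    size_t l4_hlen;
    if (pkt[9] == 17) {                             /* UDP: fixed 8-byte header */
        l4_hlen = 8;
    } else if (pkt[9] == 6) {                       /* TCP: data offset in words */
        if (total < ip_hlen + 20)
            return -1;
        l4_hlen = (size_t)(pkt[ip_hlen + 12] >> 4) * 4;
        if (l4_hlen < 20)
            return -1;
    } else {
        return -1;                                  /* other L4 protocols skipped */
    }
    if (total < ip_hlen + l4_hlen)
        return -1;

    out->data = pkt + ip_hlen + l4_hlen;
    out->len = total - ip_hlen - l4_hlen;
    return 0;
}
```

Everything a classifier inspects afterwards (patterns, sizes) refers to the bytes in `out->data`.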
How does DPI work?
• What’s behind the buzzword?
• It is very simple: use all the information the packet holds to classify it.
  – it is not really that deep ;-) usually <1500 bytes
  – look for protocol-specific patterns in the packet payload
  – track state across several packets of a flow
Flow Tracking
• First basic DPI operation
• Common concept in network security equipment
  – e.g. firewalls
• Determines which packets belong to a communication between two computers (“flow”)
• Based on the 5-tuple flow identifier (SRC IP, DST IP, SRC port, DST port, L4 protocol)
  – required to determine a flow’s protocol/application
  + speed-up for system performance
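One way to turn the 5-tuple into a flow-table key is sketched below. It is an assumption-laden illustration, not how any particular product does it: IPv4 only, both directions of a flow are mapped to the same key by ordering the endpoints, and FNV-1a is used as an arbitrary hash. All names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* 5-tuple flow identifier (IPv4 only in this sketch). */
struct flow_key {
    uint32_t ip_a, ip_b;      /* endpoint addresses, "lower" endpoint first */
    uint16_t port_a, port_b;  /* ports belonging to ip_a and ip_b           */
    uint8_t  l4_proto;        /* 6 = TCP, 17 = UDP                          */
};

/* Builds a direction-independent key, so that request and response
 * packets of the same flow map to the same table entry. */
struct flow_key make_flow_key(uint32_t src_ip, uint16_t src_port,
                              uint32_t dst_ip, uint16_t dst_port,
                              uint8_t l4_proto)
{
    bool swap = src_ip > dst_ip || (src_ip == dst_ip && src_port > dst_port);
    struct flow_key k;
    k.ip_a = swap ? dst_ip : src_ip;
    k.ip_b = swap ? src_ip : dst_ip;
    k.port_a = swap ? dst_port : src_port;
    k.port_b = swap ? src_port : dst_port;
    k.l4_proto = l4_proto;
    return k;
}

bool flow_key_equal(const struct flow_key *x, const struct flow_key *y)
{
    return x->ip_a == y->ip_a && x->ip_b == y->ip_b &&
           x->port_a == y->port_a && x->port_b == y->port_b &&
           x->l4_proto == y->l4_proto;
}

/* Feeds the lowest nbytes bytes of value into an FNV-1a hash state. */
static uint32_t fnv1a_mix(uint32_t h, uint32_t value, int nbytes)
{
    for (int i = 0; i < nbytes; i++) {
        h ^= (value >> (8 * i)) & 0xffu;
        h *= 16777619u;
    }
    return h;
}

/* Hash of the key fields, usable to pick a bucket in a flow table. */
uint32_t flow_hash(const struct flow_key *k)
{
    uint32_t h = 2166136261u;
    h = fnv1a_mix(h, k->ip_a, 4);
    h = fnv1a_mix(h, k->ip_b, 4);
    h = fnv1a_mix(h, k->port_a, 2);
    h = fnv1a_mix(h, k->port_b, 2);
    h = fnv1a_mix(h, k->l4_proto, 1);
    return h;
}
```

Once a flow has been classified, later packets only need this look-up, which is where the speed-up mentioned on the slide comes from.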
Pattern matching
[Figure: simple pattern matching finds a pattern (XXX) within a single packet and needs only basic flow tracking; pattern matching over multiple packets (XXX, YYY, ZZZ) makes flow tracking mandatory.]
Behavioral Analysis
[Figure: pattern matching over multiple packets applied to packet sizes. In the sequence short, short, short, long, short the pattern is “three short packets”.]
Pattern Matching
• Second basic DPI operation
• Search for strings, numbers at certain positions
  – usually several patterns for each protocol
• Can be done with specialized hardware
  – e.g. Cavium, RMI
  + speed-up for pattern and regular expression matching
  − bus connection of accelerator cards is the bottleneck
Example I: RegEx Patterns
• http://l7-filter.sourceforge.net
• Based on Netfilter
  – http://netfilter.org
  – Linux packet filtering framework (i.e. firewall)
  – configured with iptables
• Uses regular expressions to match patterns
  + user-extensible
  − inefficient ⇒ slow
• Example: HTTP & BitTorrent
  – http/(0\.9|1\.0|1\.1) [1-5][0-9][0-9] [\x09-\x0d -~]*(connection:|content-type:|content-length:|date:)|post [\x09-\x0d -~]* http/[01]\.[019]
  – ^(\x13bittorrent protocol|azver\x01$|get /scrape\?info_hash=)|d1:ad2:id20:|\x08'7P\)[RP]
How L7-Filter Works
int match(packet, protocol)
{
    if (regular expression for the protocol is not compiled yet)
        compile it and put it in a list of compiled regexps;
    else
        fetch the compiled pattern from the list;

    if (already classified this connection)
        if (classification matches one we're looking for)
            return true;
        else
            return false;

    if (seen too many packets with no match)
        return false;

    Append application layer data to data buffer;

    if (data buffer matches regexp for the protocol we're looking for)
        Mark the connection as identified;
        return true;
    else
        return false;
}
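The pseudocode above can be approximated in user space with POSIX `regex.h`. This is a simplified sketch, not L7-filter's actual code: it handles one already compiled pattern, keeps the connection state in a hypothetical `struct conn`, and the limits `MAX_PACKETS` and `BUF_SIZE` are assumed values. NUL bytes are dropped because `regexec()` works on C strings.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_PACKETS 10      /* give up after this many packets (assumed)  */
#define BUF_SIZE    2048    /* per-connection buffer size (assumed)       */

/* Per-connection state, kept by the flow tracker in a real system. */
struct conn {
    char   buf[BUF_SIZE];   /* accumulated L7 data, always NUL-terminated */
    size_t used;            /* bytes currently stored in buf              */
    int    packets;         /* packets inspected so far                   */
    bool   identified;      /* set once the pattern has matched           */
};

/* Appends one packet's payload to the connection buffer and tests the
 * buffer against the compiled pattern. Returns true once identified. */
bool match_packet(struct conn *c, const regex_t *re,
                  const unsigned char *payload, size_t len)
{
    if (c->identified)
        return true;                    /* already classified            */
    if (c->packets >= MAX_PACKETS)
        return false;                   /* seen too many packets         */
    c->packets++;

    for (size_t i = 0; i < len && c->used < BUF_SIZE - 1; i++) {
        if (payload[i] != '\0')         /* regexec() needs a C string    */
            c->buf[c->used++] = (char)payload[i];
    }
    c->buf[c->used] = '\0';

    if (regexec(re, c->buf, 0, NULL, 0) == 0)
        c->identified = true;           /* mark the connection           */
    return c->identified;
}
```

A caller would compile a pattern once, for example a fragment of the BitTorrent expression with `regcomp(&re, "^\x13" "bittorrent protocol", REG_EXTENDED | REG_ICASE | REG_NOSUB)`, zero-initialize one `struct conn` per flow, and call `match_packet` for each packet. Because the buffer accumulates data, a pattern split across two packets still matches.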
Example II: Hard-Coded Patterns
• Example from ipoque’s PACE (Protocol and Application Classification Engine)
• Uses hard-coded patterns instead of regular expressions
  ➡ faster, but less flexible
• Example: BitTorrent

    /* Assumes the caller has checked that payload is long enough. */
    /* test for match 0x13 + "BitTorrent protocol" */
    if (payload[0] == 0x13) {
        if (memcmp(payload + 1, "BitTorrent protocol", 19) == 0)
            return (IPP2P_BIT * 100);
    }
    /* tracker commands: all start with "GET /",
     * followed by "scrape" or "announce",
     * and then "?info_hash=" */
    if (memcmp(payload, "GET /", 5) == 0) {
        /* message scrape */
        if (memcmp(payload + 5, "scrape?info_hash=", 17) == 0)
            return (IPP2P_BIT * 100 + 1);
        /* message announce */
        if (memcmp(payload + 5, "announce?info_hash=", 19) == 0)
            return (IPP2P_BIT * 100 + 2);
    }
Behavioral Analysis
• Third basic DPI operation
• Pattern matching is impossible for encrypted traffic
• Instead, look at unencrypted “patterns”:
  – Packet sizes
  – Packet size sequences
  – Data rates
  – Packet rates
  – Number of concurrent flows
  – Flow arrival rate
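Several of the features listed above can be collected with a few counters per flow. The sketch below is illustrative only: the 100-byte threshold for a "short" packet and all names are assumptions, and the "three short packets" signature is a toy rule taken from the earlier figure, not a real detection rule.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SHORT_PACKET_MAX 100   /* bytes; assumed threshold for "short" */

/* Behavioral counters kept per flow; payload content is never read. */
struct flow_stats {
    uint64_t packets;        /* packets seen so far                  */
    uint64_t bytes;          /* payload bytes seen so far            */
    unsigned short_run;      /* current run of short packets         */
    unsigned max_short_run;  /* longest run of short packets so far  */
};

/* Updates the counters with the payload size of one packet. */
void flow_stats_update(struct flow_stats *s, size_t payload_len)
{
    s->packets++;
    s->bytes += payload_len;
    if (payload_len <= SHORT_PACKET_MAX) {
        if (++s->short_run > s->max_short_run)
            s->max_short_run = s->short_run;
    } else {
        s->short_run = 0;
    }
}

/* Average payload rate in bytes per second over the observation time. */
double flow_data_rate(const struct flow_stats *s, double seconds)
{
    return seconds > 0.0 ? (double)s->bytes / seconds : 0.0;
}

/* Toy signature: the flow contained three short packets in a row. */
bool has_three_short_packets(const struct flow_stats *s)
{
    return s->max_short_run >= 3;
}
```

None of these checks needs the payload in clear text, which is why they still work on encrypted traffic.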
Combining weak patterns
• If A matches --> 80% hit
• If B matches --> 90% hit
• Question: what is the probability of a hit when both A and B match?
• Answer (treating A and B as independent):
  100% − (100% − 80%) × (100% − 90%) = 100% − 20% × 10% = 98%
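The same calculation generalizes to any number of weak patterns. The helper below is a small sketch under the independence assumption used on the slide; the function name is hypothetical.

```c
#include <stddef.h>

/* Combined hit probability when all n patterns match, assuming that the
 * patterns err independently of each other. hit[i] is in the range 0..1. */
double combine_independent(const double *hit, size_t n)
{
    double miss = 1.0;                 /* probability that all are wrong */
    for (size_t i = 0; i < n; i++)
        miss *= 1.0 - hit[i];
    return 1.0 - miss;
}
```

For the two patterns on the slide, `combine_independent` is called with `{0.8, 0.9}` and yields about 0.98. If the patterns are correlated, for example because both key on the same packet size, the real gain is smaller.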
False positives vs. false negatives
[Figure: probability of misclassification plotted against pattern strictness, from “too loose” to “too strict”, with one curve for false positives and one for false negatives.]
Example: Skype
• Skype is optimized to work under many network conditions
• Difficult to detect
  ➡ Requires behavioral analysis
• Very difficult to block
• Literature:
  – An Experimental Study of the Skype Peer-to-Peer VoIP System, Saikat Guha (Cornell University), Neil Daswani, Ravi Jain (Google), 2/2006
  – Silver Needle in the Skype, Philippe Biondi, Fabrice Desclaux, EADS, Black Hat Europe 2006, 3/2006
  – Vanilla Skype (part 1+2), Fabrice Desclaux, Kostya Kortchinsky, EADS, RECON2006, 6/2006
  – An Analysis of the Skype Peer-to-Peer Internet Telephony Protocol, Salman A. Baset and Henning Schulzrinne, Columbia University, 1/2006
Popularity
• >500 million registered users by end of 2011
• ~30 million users simultaneously online
• First half of 2010
  – 88.4 billion Skype-to-Skype call minutes
  – 6.4 billion minutes of calls to landlines and mobiles
  – 40% video calls
• High diurnal usage variations
  – 40-50% more users during working hours
  – 25% more users on weekdays
• Longer call duration compared to PSTN
  – ∅ PSTN: 3 minutes
  – ∅ Skype: 27 minutes
  – most likely because Skype calls are free
Technical Basics
• Peer-to-peer (P2P) network architecture
  – supernode architecture similar to the KaZaa FastTrack protocol
• Very easy to use
• Works in almost any network environment on almost any operating system
• Advanced obfuscation techniques
  – both in the code and the network traffic
• Generates traffic even when idle
Skype in the Network
• P2P architecture
• Uses UDP and TCP, both for signaling and communication
• No fixed ports
  – a UDP port is randomly selected at installation time and used for all UDP data
  – HTTP and HTTPS ports (80 & 443) can be used
• Works behind firewalls and NAT gateways
Skype & Firewalls
• Penetrates most firewall systems
  – there is nothing firewalls can evaluate (such as port numbers, payload patterns)
• Works behind NAT gateways
  – uses NAT hole punching techniques similar to STUN and TURN
  – only requires a single connection to a supernode, initiated by the client, to be fully operational
Supernodes
• Supernodes (SN)
  – implement the Global Index, the Skype user directory
  – essential for proper operation
• Relay nodes (RN)
  – call forwarding for clients behind NAT gateways
• Differentiation between SN and RN not clear
  – “invisible” network infrastructure
• Every client with a public IP and sufficient resources can become an SN or RN
  – this can only be disabled for the latest Windows client by tweaking the Registry
  – easier to become an RN
DPI Technical Challenges
• No well-known ports or server IP addresses
  – Bit patterns insufficient for detection
    • nearly everything is encrypted or encoded
    • known bit patterns only cover some flows, not all
    • for instance, if Skype uses port 443, it mimics a valid HTTPS connection setup
    • patterns change from version to version
• If one flow is blocked, Skype tries something else (e.g. different port, different transport protocol)
  – successful blocking requires more than just initial detection
Detection
• Behavioral analysis requires tracking of
  – as many TCP and UDP flows of a single node as possible
• Different flow patterns for TCP and UDP
• Flow “patterns”, or signatures
  – absolute and relative packet sizes
  – flow count and arrival rate
• Different connection mechanisms require different signatures
• Regular signature updates necessary
  – signatures change often
  – major changes in Skype 3.0
  – Skype closely monitors and counters detection efforts
Skype Versions Behave Differently
Skype v2.0.0.43, v2.0.0.63, v2.0.0.69, v2.0.0.73, v2.0.0.76, v2.0.0.79, v2.0.0.81, v2.0.0.90, v2.0.0.97, v2.0.0.103, v2.0.0.105, v2.0.0.107
Skype v2.5.0.72, v2.5.0.82, v2.5.0.91, v2.5.0.113, v2.5.0.122, v2.5.0.126, v2.5.0.130, v2.5.0.137, v2.5.0.141, v2.5.0.146, v2.5.0.151, v2.5.0.154
Skype v2.6.0.67, v2.6.0.74, v2.6.0.81, v2.6.0.97, v2.6.0.103, v2.6.0.105
Skype v3.0.0.106, v3.0.0.123, v3.0.0.137, v3.0.0.154, v3.0.0.190, v3.0.0.198, v3.0.0.205, v3.0.0.209, v3.0.0.214, v3.0.0.216, v3.0.0.217, v3.0.0.218
Skype v3.1.0.112, v3.1.0.144, v3.1.0.150, v3.1.0.152
Skype v3.2.0.53, v3.2.0.63, v3.2.0.82, v3.2.0.115, v3.2.0.145, v3.2.0.148, v3.2.0.152, v3.2.0.158, v3.2.0.163, v3.2.0.175
Skype v3.5.0.107, v3.5.0.158, v3.5.0.178, v3.5.0.202, v3.5.0.214, v3.5.0.229, v3.5.0.234, v3.5.0.239
Skype v3.6.0.127, v3.6.0.159
Skype Detection – Limitations of Behavioral Analysis
• The connection setup is detected
  – existing connections cannot be detected; a call may use an existing connection and succeed
• It takes some packets to detect a new connection
  – a call may appear to succeed, but no conversation will be possible
  – the contact list may become available periodically
• A high percentage of the flows/packets of the Skype node needs to be visible
Possible Infrastructure Obstacles
• The detection engine must see all inbound and outbound packets of a Skype client
  – no asymmetric routing
  – no other device blocking Skype packets
  – location of interception point in network topology
• Unusual network conditions
  – many IPs behind a single NAT
  – unusual NAT behavior
What is the challenge in application classification?
• The Internet is constantly changing
• Finding patterns is (normally) easy
  – even for encrypted traffic
• Maintaining patterns is the painful part
  – check for side effects (false positives)
  – check for updates (false negatives)
Meta-data Extraction
[Figure: a packet payload in which the application signature (XXX) is followed by protocol-specific meta-data (123) and application meta-data (abc).]
• Application meta-data is normally located at predetermined payload areas
• The location is defined by the network protocol
[Figure: client applications send a request to a web server and receive a response.]
Request:
GET /en/home/index.html HTTP/1.1
Host: www.ipoque.com
User-Agent: Mozilla/5.0 ...
[...]
Response:
HTTP/1.1 200 OK
Date: Sun, 12 Feb 2012 10:37:02 GMT
Server: Apache/2.2.14 (Ubuntu)
Last-Modified: Sun, 12 Feb 2012 10:37:02 +0000
Content-Language: en
Content-Type: text/html; charset=utf-8
[...]
<html>
[ html web site description ]
</html>
Highlighted on the slides: GET, HTTP/, 1.1 200 OK, Content-Type: text/html; charset=utf-8
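Because HTTP places its meta-data at positions defined by the protocol, extracting a field is a matter of finding the header name and copying the value. The sketch below pulls the Host header out of a GET request; it assumes the request is available as NUL-terminated text in one buffer and matches the header name case-sensitively, which a real implementation would not rely on. The function name is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Copies the value of the Host header of a GET request into out.
 * Returns 0 on success and -1 if this is not a GET request, the header
 * is missing, or out is too small.
 * Simplifications: request is NUL-terminated text held in one buffer,
 * and the header name is matched case-sensitively. */
int http_extract_host(const char *request, char *out, size_t out_size)
{
    if (strncmp(request, "GET ", 4) != 0)
        return -1;                        /* signature check: request method */

    const char *p = strstr(request, "\r\nHost:");
    if (p == NULL)
        return -1;
    p += 7;                               /* skip "\r\nHost:"                */
    while (*p == ' ' || *p == '\t')
        p++;                              /* skip optional whitespace        */

    size_t n = strcspn(p, "\r\n");        /* value ends at the line break    */
    if (n == 0 || n >= out_size)
        return -1;
    memcpy(out, p, n);
    out[n] = '\0';
    return 0;
}
```

Applied to the request shown above, the extracted value is `www.ipoque.com`. The same approach works for fields such as User-Agent or Content-Type.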
Flow Correlation
Control and Data Channel
[Figure: a control channel and a separate data channel between the communicating endpoints.]
VoIP via SIP/RTP
[Figure: SIP is the control channel, RTP the data channel.]
SIP Invite
INVITE sip:[email protected] SIP/2.0
[..]
From: Alice <sip:[email protected]>;tag=9fxced76sl
To: Bob <sip:[email protected]>
Call-ID: [email protected]
CSeq: 1 INVITE
Contact: <sip:[email protected];transport=tcp>
Content-Type: application/sdp
Content-Length: 151

v=0
o=alice 2890844526 2890844526 IN IP4 client.atlanta.example.com
s=-
c=IN IP4 192.0.2.101
t=0 0
m=audio 49172 RTP/AVP 0
a=rtpmap:0 PCMU/8000
Highlighted on the slide: c=IN IP4 192.0.2.101 and m=audio 49172
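The two highlighted SDP lines are what allows the data flow to be predicted from the control flow. The sketch below reads the connection address and audio port from an SDP body so that a later RTP flow to that address and port can be attributed to the SIP call. It assumes a single IPv4 `c=` line and a single `m=audio` line; the names are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* RTP data flow announced by the SDP body of a SIP INVITE. */
struct rtp_expectation {
    char ip[46];        /* textual IPv4 address from the c= line */
    unsigned port;      /* audio port from the m= line           */
};

/* Scans the SDP body line by line for "c=IN IP4 <addr>" and
 * "m=audio <port>". Returns 0 if both were found, -1 otherwise. */
int sdp_expected_rtp(const char *sdp, struct rtp_expectation *out)
{
    int have_ip = 0, have_port = 0;
    const char *line = sdp;

    while (*line != '\0') {
        if (!have_ip && sscanf(line, "c=IN IP4 %45[0-9.]", out->ip) == 1)
            have_ip = 1;
        else if (!have_port && sscanf(line, "m=audio %u", &out->port) == 1)
            have_port = 1;

        line += strcspn(line, "\n");     /* move to the end of this line */
        if (*line == '\n')
            line++;                      /* step over the line break     */
    }
    return (have_ip && have_port) ? 0 : -1;
}
```

For the INVITE above this gives 192.0.2.101 and port 49172. A flow tracker can store that pair as an expected flow and label the matching UDP flow as RTP belonging to this call, even though RTP itself carries no application name.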
[Figure: VoIP via SIP/RTP. DPI flow correlation links the SIP control flow to the RTP data flow.]