Tracking Internet Hosts Using Unreliable IDs Yinglian Xie, Fang Yu, and Martín Abadi

Microsoft Research, Silicon Valley

(Figure: a host first uses 100.0.0.1, then 100.0.0.2, while the blacklist lists only 100.0.0.1.)

Open and anonymous, hence weak accountability

IP addresses are not unique, fixed identifiers (because of dynamic IP addresses, proxies, and NATs).

It is hard to identify who is responsible for traffic.

Accountability is weak on the Internet

We should block attack traffic by host rather than by IP address

Track hosts more reliably in spite of dynamic IP addresses

Explore applications that require identifying hosts over time

E.g., security and data-mining

Our approach: Use application IDs and events to track hosts

Goal: Inferring host-IP bindings

(Figure: timeline t1 to t6 showing the IDs Alice and Bob appearing at IP1 through IP6.)

Alice's host: IP_1: [t1, t2], IP_2: [t3, t4], IP_5: [t5, t6]

Bob's host: IP_4: [t3, t4], IP_3: [t5, t6]

Challenges:

No 1-1 mappings between hosts and IDs

Dynamic IP addresses, proxies, and NATs

Malicious IDs

Example application: host-based blacklisting

Problem formulation

Input events (timeline t1 to t6)

e1: <u1, IP1, t1>

e2: <u2, IP1, t2>

e3: <u3, IP4, t3>

… …

en: <u4, IP6, t5>

Identity mapping

A: u1, u2

… …

Host-tracking graph

(Figure: timeline for IPi over t1, t2, t3, with user ID u1 bound to it at each marked point.)
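The problem formulation above can be sketched in a few lines: events of the form <user ID, IP, time> go in, and host-IP binding windows come out. This is a minimal illustration, not the authors' implementation. The names `Event`, `Binding`, and `build_bindings` are hypothetical, the identity mapping is assumed to be given, and each (host, IP) pair is collapsed into a single window, which is a simplification.

```python
from typing import NamedTuple


class Event(NamedTuple):
    user_id: str  # application ID, e.g. an email account name
    ip: str
    time: int     # timestamp, integer ticks for simplicity


class Binding(NamedTuple):
    host: str
    ip: str
    start: int
    end: int


def build_bindings(events, id_to_host):
    """Turn input events into host-IP binding windows.

    Assumption: one window per (host, IP) pair, spanning the first to the
    last event from any ID mapped to that host. IDs without a mapping are
    treated as untracked and skipped.
    """
    windows = {}
    for ev in events:
        host = id_to_host.get(ev.user_id)
        if host is None:
            continue  # untracked ID
        key = (host, ev.ip)
        lo, hi = windows.get(key, (ev.time, ev.time))
        windows[key] = (min(lo, ev.time), max(hi, ev.time))
    return sorted(Binding(h, ip, lo, hi) for (h, ip), (lo, hi) in windows.items())


events = [
    Event("u1", "IP1", 1),
    Event("u2", "IP1", 2),
    Event("u3", "IP4", 3),
    Event("u1", "IP2", 4),
    Event("u9", "IP6", 5),  # no identity mapping, so this event is untracked
]
identity_mapping = {"u1": "A", "u2": "A", "u3": "B"}
bindings = build_bindings(events, identity_mapping)
```

Here host A ends up bound to IP1 over [1, 2] and to IP2 at time 4, host B to IP4 at time 3, and the u9 event stays untracked.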

(Figure: methodology flowchart.)

Steps: group user IDs, construct tracking graph, resolve inconsistency, update ID groups.

Data between steps: input events, initial ID-groups, tracking graph with inconsistent bindings, pruned inconsistent bindings, updated ID-groups.

Host tracking

Initial estimation: u1 : h1, u2 : h2

Methodology overview

Our goal: maximize tracked events and tracked IDs

Grouping IPs

Consider the probability of two random IDs appearing together
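One way to read the co-occurrence idea is as a comparison against chance: if two IDs show up back to back on the same IP far more often than independent, random IDs would, they probably belong to one host. The sketch below is an assumption-laden illustration, not the slide's actual model. The null model (each adjacent pair of events contains IDs a and b with probability 2·p_a·p_b), the thresholds `min_count` and `min_ratio`, and all function names are hypothetical.

```python
from collections import Counter, defaultdict


def adjacent_pair_counts(events):
    """Count how often two distinct IDs appear back to back on the same IP.

    events: iterable of (user_id, ip, time) tuples.
    Returns (pair_counts, total_adjacent_pairs).
    """
    by_ip = defaultdict(list)
    for user_id, ip, time in events:
        by_ip[ip].append((time, user_id))
    pair_counts = Counter()
    total_pairs = 0
    for seq in by_ip.values():
        seq.sort()  # order each IP's events by time
        for (_, a), (_, b) in zip(seq, seq[1:]):
            total_pairs += 1
            if a != b:
                pair_counts[frozenset((a, b))] += 1
    return pair_counts, total_pairs


def group_ids(events, min_count=2, min_ratio=2.0):
    """Group IDs that co-occur more often than the null model predicts."""
    events = list(events)
    id_counts = Counter(user_id for user_id, _, _ in events)
    n_events = len(events)
    pair_counts, total_pairs = adjacent_pair_counts(events)

    parent = {u: u for u in id_counts}  # union-find over IDs

    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u

    for pair, observed in pair_counts.items():
        a, b = tuple(pair)
        # Expected adjacent co-occurrences if a and b were independent.
        expected = 2 * (id_counts[a] / n_events) * (id_counts[b] / n_events) * total_pairs
        if observed >= min_count and observed >= min_ratio * expected:
            parent[find(a)] = find(b)

    groups = defaultdict(set)
    for u in id_counts:
        groups[find(u)].add(u)
    return sorted(sorted(g) for g in groups.values())


sample_events = [
    ("u1", "IP1", 1), ("u2", "IP1", 2), ("u1", "IP1", 3), ("u2", "IP1", 4),
    ("u3", "IP2", 1), ("u3", "IP2", 2), ("u3", "IP2", 3),
    ("u4", "IP3", 1), ("u4", "IP3", 2), ("u4", "IP3", 3),
]
id_groups = group_ids(sample_events)
```

In this toy input, u1 and u2 alternate on IP1 three times against an expected count of about 0.56, so they are merged, while u3 and u4 stay in singleton groups.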

Resolve inconsistencies

(Figure: timelines for IPi and IPj illustrating the two kinds of inconsistent bindings.)

Conflict bindings

Concurrent bindings

Guest removal

Proxy identification

Group splitting
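Detecting the two kinds of inconsistent bindings can be sketched as an overlap check over binding windows. The slide does not spell out the definitions, so this sketch assumes that a conflict binding means two different ID groups holding the same IP at overlapping times, and a concurrent binding means one ID group holding two different IPs at overlapping times. The names `Binding`, `overlaps`, and `find_inconsistencies` are hypothetical. Guest removal, proxy identification, and group splitting are not attempted here.

```python
from itertools import combinations
from typing import NamedTuple


class Binding(NamedTuple):
    group: str  # ID group believed to correspond to one host
    ip: str
    start: int
    end: int


def overlaps(a, b):
    """True if the two closed time windows share at least one instant."""
    return a.start <= b.end and b.start <= a.end


def find_inconsistencies(bindings):
    """Return (conflicts, concurrents) as lists of overlapping binding pairs.

    Assumed definitions (not stated on the slide):
      conflict   = same IP, different groups, overlapping windows
      concurrent = same group, different IPs, overlapping windows
    """
    conflicts, concurrents = [], []
    for a, b in combinations(bindings, 2):
        if not overlaps(a, b):
            continue
        if a.ip == b.ip and a.group != b.group:
            conflicts.append((a, b))
        elif a.group == b.group and a.ip != b.ip:
            concurrents.append((a, b))
    return conflicts, concurrents


tracked = [
    Binding("A", "IP1", 1, 4),
    Binding("B", "IP1", 3, 6),  # overlaps A on the same IP
    Binding("A", "IP2", 2, 3),  # A is on two IPs at once
    Binding("B", "IP3", 7, 9),
]
conflicts, concurrents = find_inconsistencies(tracked)
```

The pairwise scan is quadratic in the number of bindings, which is fine for illustration. A real pipeline over login logs would need to sort or index bindings per IP and per group.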

Host-tracking graph

Update estimation iteratively

Calibrate cookie churns

Estimate host population more accurately

Build normal-user profiles

Host-aware blacklists – block by host rather than IP

Post-mortem forensics

Real-time blacklists (Tracklist)

Coverage and Accuracy

Applications

Accuracy, evaluated using Windows Update data: 92% to 96%

Coverage (tracked events / total events): 75% to 80%

Implementation with Hotmail data

(Figure: a LINQ query is turned into a query plan and executed on Dryad.)

Automatic query plan generation

Distributed query execution by Dryad

var logentries =
    from line in logs
    where !line.StartsWith("#")
    select new LogEntry(line);

On Dryad and DryadLINQ

Method / blocking duration           # of blocked users    False positives
IP blacklist / infinitely            44.70 million         52.8%
IP blacklist / one hour              27.94 million         34.1%
Tracklist / one hour                 16.01 million         4.9%
Tracklist with profile / one hour    14.27 million         0.1%

Seed data: 5.6 million bot-accounts detected by BotGraph in one month
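The gap between the IP blacklist and Tracklist rows can be illustrated with a toy comparison: an IP blacklist keeps blocking an address after the bad host has left it, while a host-aware list follows the host. This is only a sketch under stated assumptions. The slides do not describe how Tracklist is implemented, so the classes `IPBlacklist` and `Tracklist`, the helper `host_at`, and the assumption that host-IP bindings are already known are all hypothetical.

```python
HOUR = 3600  # blocking duration in seconds


def host_at(bindings, ip, time):
    """Return the host bound to `ip` at `time`, or None if untracked.

    bindings: list of (host, ip, start, end) tuples, assumed already inferred.
    """
    for host, b_ip, start, end in bindings:
        if b_ip == ip and start <= time <= end:
            return host
    return None


class IPBlacklist:
    """Blocks an IP address for `duration` after a malicious event."""

    def __init__(self, duration=HOUR):
        self.duration = duration
        self.expiry = {}

    def report(self, ip, time):
        self.expiry[ip] = time + self.duration

    def is_blocked(self, ip, time):
        return time < self.expiry.get(ip, float("-inf"))


class Tracklist:
    """Blocks the host behind a malicious event, wherever it appears next."""

    def __init__(self, bindings, duration=HOUR):
        self.bindings = bindings
        self.duration = duration
        self.expiry = {}

    def report(self, ip, time):
        host = host_at(self.bindings, ip, time)
        if host is not None:
            self.expiry[host] = time + self.duration

    def is_blocked(self, ip, time):
        host = host_at(self.bindings, ip, time)
        return host is not None and time < self.expiry.get(host, float("-inf"))


# A bot holds IP1, then moves to IP2 while a legitimate host takes over IP1.
known_bindings = [
    ("bot", "IP1", 0, 1000),
    ("alice", "IP1", 1001, 3000),
    ("bot", "IP2", 1001, 3000),
]
ip_list = IPBlacklist()
track_list = Tracklist(known_bindings)
ip_list.report("IP1", 500)
track_list.report("IP1", 500)
```

At time 1500 the IP blacklist still blocks IP1, now held by the legitimate host, and misses the bot at IP2, whereas the host-aware list does the opposite. Passing `duration=float("inf")` to `IPBlacklist` mimics the "infinitely" row.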

Network security with IP intelligence

Our work

IP property inference engine

Static or dynamic

Proxy, NAT

Residential vs. enterprise

DSL, wireless, dialup

User population

Spam history

Known attack history

… …

Applications

Spammer detection

DDoS prevention

Click fraud detection

Targeted ads

Phishing site detection

Windows update

IP properties IP history

UDmap: identify dynamic IPs
•190M dynamic IPs
•40-50% spam
Hotmail login data

AutoRE: derive spam signatures
•340K botnet IPs
•16-18% spam not detected today
Sampled emails

BotGraph: detect bot-accounts
•26M bot-accounts
•4.5M botnet IPs
Hotmail login data

HostTracker: track host-IP bindings

Collaboration with WLSP (Jason Atlas, Geoff Hulten, Ivan Osipkov), Hotmail (Hersh Dangayach, Eliot Gillum, Krish Vitaldevara), Bing (Fritz Behr, David Soukal, Roger Yu, Zijian Zheng), and Messenger (Steve Miale)

Routing tables

Windows update log

Web search log

Spam emails and history

User login records

Messenger log

SBotMiner: detect search bots
•3% of overall search traffic
Search data

Spammer detection

Search user tracking

application?…
