Page 1

BotMiner: Clustering Analysis of Network Traffic for Protocol- and Structure-Independent Botnet Detection

Guofei Gu (1,2), Roberto Perdisci (3), Junjie Zhang (1), and Wenke Lee (1)

(1) Georgia Tech, (2) Texas A&M University, (3) Damballa, Inc.

2008-7-31, Guofei Gu

Page 2

Roadmap

• Introduction
  – Botnet problem
  – Challenges for botnet detection
  – Related work
• BotMiner
  – Motivation
  – Design
  – Evaluation
• Conclusion

Page 3

What Is a Bot/Botnet?

• Bot
  – A malware instance that runs autonomously and automatically on a compromised computer (zombie) without the owner’s consent
  – Profit-driven, professionally written, widely propagated
• Botnet (bot army): a network of bots controlled by criminals
  – Definition: “A coordinated group of malware instances that are controlled by a botmaster via some C&C channel”
  – Architecture: centralized (e.g., IRC, HTTP) or distributed (e.g., P2P)
  – “25% of Internet PCs are part of a botnet!” (Vint Cerf)

[Figure: botmaster, C&C channel, and bots]

Page 4

Botnets are used for …

• All DDoS attacks
• Spam
• Click fraud
• Information theft
• Phishing attacks
• Distributing other malware, e.g., spyware

Page 5

Challenges for Botnet Detection

• Bots are stealthy on the infected machines
  – We focus on a network-based solution
• Bot infection is usually a multi-faceted and multi-phased process
  – Looking at only one specific aspect is likely to fail
• Bots are dynamically evolving
  – Static and signature-based approaches may not be effective
• Botnets can have very flexible C&C channel designs
  – A solution very specific to one botnet instance is not desirable

Page 6

Why Are Existing Techniques Not Enough?

• Traditional AV tools
  – Bots use packers, rootkits, and frequent updating to easily defeat AV tools
• Traditional IDS/IPS
  – Look at only one specific aspect
  – Do not have a big picture
• Honeypot
  – Not a good botnet detection tool

Page 7

Existing Botnet Detection Work

• [Binkley, Singh 2006]: IRC-based bot detection that combines IRC statistics and TCP work weight
• Rishi [Goebel, Holz 2007]: signature-based IRC bot nickname detection
• [Livadas et al. 2006, Karasaridis et al. 2007]: (BBN, AT&T) network-flow-level detection of IRC botnets
• BotHunter [Gu et al. Security’07]: dialog correlation to detect bots based on an infection dialog model
• BotSniffer [Gu et al. NDSS’08]: spatial-temporal correlation to detect centralized botnet C&C
• TAMD [Yen, Reiter 2008]: traffic aggregation to detect botnets that use a centralized C&C structure

Page 8

Why BotMiner?

• Botnets can change their C&C content (encryption, etc.), protocols (IRC, HTTP, etc.), structures (P2P, etc.), C&C servers, infection models …

[Figure: two botnet topologies, (a) bots around a C&C node and (b) bots only. Example: Nugache, Storm, …]

Page 9

BotMiner: Protocol- and Structure-Independent Detection

Horizontal correlation
  – Bots are for long-term use
  – Botnet: communication and activities are coordinated/similar

[Figure: enterprise-like network connected to the Internet]

Page 10

Revisit the Definition of a Botnet

• “A coordinated group of malware instances that are controlled by a botmaster via some C&C channel”
• We need to monitor two planes
  – C-plane (C&C communication plane): “who is talking to whom”
  – A-plane (malicious activity plane): “who is doing what”

Page 11

BotMiner Architecture

[Figure: BotMiner architecture. Network traffic feeds a C-plane monitor, which produces a flow log for C-plane clustering, and an A-plane monitor (scan, spam, binary downloading, exploit, ...), which produces an activity log for A-plane clustering. Cross-plane correlation combines the two clusterings and produces reports.]

Page 12

BotMiner C-plane Clustering

• What characterizes a communication flow (C-flow) between a local host and a remote service?
  – <protocol, srcIP, dstIP, dstPort>
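As a minimal sketch of the C-flow key above, the following Python groups individual flow records by <protocol, srcIP, dstIP, dstPort>. The `FlowRecord` fields, the function name, and the sample addresses are hypothetical; the slides do not specify a record format.

```python
from collections import defaultdict
from typing import NamedTuple


class FlowRecord(NamedTuple):
    """One observed flow; the field names are illustrative assumptions."""
    protocol: str
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    start_time: float  # seconds since the start of the trace
    duration: float    # seconds
    packets: int
    byte_count: int


def group_into_cflows(flows):
    """Group flows that share <protocol, srcIP, dstIP, dstPort> into one C-flow."""
    cflows = defaultdict(list)
    for f in flows:
        # The source port is deliberately left out of the key.
        key = (f.protocol, f.src_ip, f.dst_ip, f.dst_port)
        cflows[key].append(f)
    return dict(cflows)


flows = [
    FlowRecord("tcp", "10.0.0.5", 40001, "203.0.113.9", 6667, 0.0, 12.0, 20, 1800),
    FlowRecord("tcp", "10.0.0.5", 40002, "203.0.113.9", 6667, 600.0, 9.0, 18, 1500),
    FlowRecord("tcp", "10.0.0.7", 51000, "198.51.100.4", 80, 30.0, 2.0, 10, 4000),
]
cflows = group_into_cflows(flows)
```

The first two records differ only in source port, so they fall into the same C-flow; the third forms its own.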

Page 13

How to Capture “Talking in What Kind of Patterns”?

• Temporal related statistical distribution information in
  – BPS (bytes per second)
  – FPH (flows per hour)
• Spatial related statistical distribution information in
  – BPP (bytes per packet)
  – PPF (packets per flow)
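A hedged sketch of how samples of these four quantities might be collected for a single C-flow. Each flow is a hypothetical (start_time, duration, packets, bytes) tuple; the slides do not say how very short flows are handled, so the `min_duration` floor is an assumption, as is counting FPH only for hours in which at least one flow was seen.

```python
from collections import Counter


def cflow_samples(flows, min_duration=1.0):
    """Return per-flow BPS, BPP, PPF samples and per-hour FPH samples.

    flows: list of (start_time_s, duration_s, packets, byte_count) tuples,
    all belonging to one C-flow. min_duration is an assumed floor that
    avoids dividing by zero for very short flows.
    """
    bps = [b / max(d, min_duration) for (_, d, _, b) in flows]  # bytes per second
    bpp = [b / p for (_, _, p, b) in flows if p > 0]            # bytes per packet
    ppf = [p for (_, _, p, _) in flows]                         # packets per flow
    # Bucket flow start times into hours; hours with no flows are not listed.
    per_hour = Counter(int(t // 3600) for (t, _, _, _) in flows)
    fph = [per_hour[h] for h in sorted(per_hour)]               # flows per hour
    return {"BPS": bps, "BPP": bpp, "PPF": ppf, "FPH": fph}


samples = cflow_samples([
    (0.0, 10.0, 20, 2000),
    (1800.0, 5.0, 10, 500),
    (4000.0, 4.0, 8, 800),
])
```

Here the first two flows start in hour 0 and the third in hour 1, so `samples["FPH"]` is `[2, 1]`.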

Page 14

Two-step Clustering of C-flows

• Why multi-step?
• How?
  – Coarse-grained clustering
    • Using reduced feature space: mean and variance of the distribution of FPH, PPF, BPP, BPS for each C-flow (2*4=8)
    • Efficient clustering algorithm: X-means
  – Fine-grained clustering
    • Using full feature space (13*4=52)
• What’s left?
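The two feature spaces can be sketched as follows. The 8-dimensional vector follows the slide directly (mean and variance of four variables). The slide gives only the dimensionality of the full space (13*4=52), so the 13-bin normalized histogram, the cut points, and the function names below are assumptions made for illustration. The clustering step itself (X-means) is not shown.

```python
from bisect import bisect_left
from statistics import fmean, pvariance

VARIABLES = ("FPH", "PPF", "BPP", "BPS")


def reduced_features(samples):
    """8-dimensional vector: mean and variance of each of the four variables."""
    vec = []
    for name in VARIABLES:
        values = samples[name]  # assumed non-empty
        vec.extend([fmean(values), pvariance(values)])
    return vec


def full_features(samples, edges):
    """52-dimensional vector: a 13-bin normalized histogram per variable.

    edges[name] holds 12 ascending cut points, giving 13 intervals
    (-inf, e1], (e1, e2], ..., (e12, inf). This binning scheme is an
    assumption; the slide only states the 13*4=52 dimensionality.
    """
    vec = []
    for name in VARIABLES:
        values = samples[name]  # assumed non-empty
        counts = [0] * 13
        for v in values:
            counts[bisect_left(edges[name], v)] += 1
        vec.extend(c / len(values) for c in counts)
    return vec


samples = {
    "FPH": [2, 1, 3],
    "PPF": [20, 10, 8],
    "BPP": [100.0, 50.0, 100.0],
    "BPS": [200.0, 100.0, 200.0],
}
# Hypothetical cut points shared by all variables, for illustration only.
edges = {name: [2 ** k for k in range(12)] for name in VARIABLES}
coarse = reduced_features(samples)
fine = full_features(samples, edges)
```

The short vector keeps the first pass cheap over all C-flows; the longer vector is only needed inside each coarse cluster.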

Page 15

A-plane Clustering

• Capture “activities in what kind of patterns”

Page 16

Cross-plane Correlation

• Botnet score s(h) for every host h
• Similarity score between hosts h_i and h_j
• Hierarchical clustering

[Equations not recovered; surviving symbols: A_i, A_j]

Two hosts in the same A-clusters and in at least one common C-cluster are clustered together
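The closing rule on this slide can be sketched directly: two hosts are linked when they share an A-cluster and at least one C-cluster. The formulas for s(h) and the similarity score are not reproduced in this transcript, so the sketch does not attempt them. Function names, cluster contents, and host names are hypothetical.

```python
from itertools import combinations


def membership(clusters):
    """Map each host to the set of cluster indices it belongs to."""
    index = {}
    for i, hosts in enumerate(clusters):
        for h in hosts:
            index.setdefault(h, set()).add(i)
    return index


def linked_pairs(a_clusters, c_clusters):
    """Return host pairs sharing at least one A-cluster and one C-cluster."""
    in_a = membership(a_clusters)
    in_c = membership(c_clusters)
    # Only hosts that appear in both planes can be linked.
    hosts = sorted(set(in_a) & set(in_c))
    return [
        (h1, h2)
        for h1, h2 in combinations(hosts, 2)
        if in_a[h1] & in_a[h2] and in_c[h1] & in_c[h2]
    ]


a_clusters = [{"h1", "h2", "h3"}, {"h4"}]          # hosts with similar activities
c_clusters = [{"h1", "h2"}, {"h3", "h5"}, {"h4"}]  # hosts with similar C-flows
pairs = linked_pairs(a_clusters, c_clusters)
```

In this example h3 shares an A-cluster with h1 and h2 but no C-cluster, so only h1 and h2 are linked.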

Page 17

Evaluation Traces

Page 18

Evaluation Results: False Positives

Page 19

Evaluation Results: Detection Rate

Page 20

Summary and Future Work

• BotMiner
  – New botnet detection system based on horizontal correlation
  – Independent of botnet C&C protocol and structure
  – Real-world evaluation shows promising results
• Future work
  – More efficient clustering, more robust features
  – New faster detection system using active techniques
    • BotMiner: offline correlation, and requires a relatively long time for detection
    • BotProbe: fast detection by observing at most one round of C&C
  – New real-time solution for very high speed and very large networks

Page 21

Correlation-based Botnet Detection Framework

[Figure: correlation-based botnet detection framework. Labels: Internet, enterprise-like network, horizontal correlation, vertical correlation, BotHunter (Security’07), BotSniffer (NDSS’08), BotMiner (Security’08), cause-effect correlation, BotProbe, time.]

Page 22

Limitation and Discussion

• Evading C-plane monitoring and clustering
  – Misuse whitelist
  – Manipulate communication patterns
• Evading A-plane monitoring and clustering
  – Very stealthy activity
  – Individualize bots’ communication/activity
• Evading cross-plane analysis
  – Extremely delayed task

Appendix

Page 23

High-Speed Packet Sampling

• Traffic arrives at high rates
  – High volume
  – Some analysis scales with the size of the input
• Possible approaches
  – Random packet sampling
  – Targeted packet sampling

Page 24

Approach

• Idea: Bias sampling of traffic towards subpopulations based on conditions of traffic
• Two modules
  – Counting: Count statistics of each traffic flow
  – Sampling: Sample packets based on (1) overall target sampling rate (2) input conditions

[Figure: counting and sampling modules applied to the traffic stream. Labels: input conditions, overall sampling rate, instantaneous sampling probability, traffic subpopulations.]

Page 25

Challenges

• How to specify subpopulations?
  – Solution: multi-dimensional array specification
• How to maintain counts for each subpopulation?
  – Solution: rotating array of counting Bloom filters
• How to derive instantaneous sampling probabilities from overall constraints?
  – Solution: multi-dimensional counter array, and scaling based on target rates

Page 26

Specifying Subpopulations

• Idea: Use concatenation of header fields (“tuples”) as a “key” for a subpopulation
  – These keys specify a group of packets that will be counted together

# base sampling rate
sampling_rate = 0.01
# number of tuples
tuples = 2
# number of conditions
conditions = 1
# tuple definitions
tuple_1 := srcip.dstip
tuple_2 := srcip.srcport.dstport
# condition : sampling budget
tuple_1 in (30, inf] AND
tuple_2 in (0, 5]: 0.5

• tuple_1: Count groups of packets with the same source and destination IP address
• tuple_2: Count groups of packets with the same source IP, source port, and destination port

Page 27

Sampling Rates for Subpopulations

• Operator specifies
  – Overall sampling rate
  – Conditional rate within each class
• FlexSample computes instantaneous sampling probabilities based on this

# base sampling rate
sampling_rate = 0.01
# number of tuples
tuples = 2
# number of conditions
conditions = 1
# tuple definitions
tuple_1 := srcip.dstip
tuple_2 := srcip.srcport.dstport
# condition : sampling budget
tuple_1 in (30, inf] AND
tuple_2 in (0, 5]: 0.5

• sampling_rate = 0.01: Sample one in 100 packets on average
• Budget 0.5: Within the 1/100 “budget”, half of sampled packets should come from groups satisfying this condition

Page 28

Examining the Condition

• Biases sampling towards packets from (source IP, destination IP) pairs which
  – Have sent at least 30 packets
  – Have sent packets to at least 5 distinct ports
• Application: Portscan

# base sampling rate
sampling_rate = 0.01
# number of tuples
tuples = 2
# number of conditions
conditions = 1
# tuple definitions
tuple_1 := srcip.dstip
tuple_2 := srcip.srcport.dstport
# condition : sampling budget
tuple_1 in (30, inf] AND
tuple_2 in (0, 5]: 0.5
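A minimal sketch of how the condition in this configuration might be evaluated per packet. Exact dictionaries stand in for the counting Bloom filters to keep the example short, and the packet field names, function names, and addresses are hypothetical.

```python
import math
from collections import Counter

# Exact counters stand in for the counting Bloom filters, for clarity only.
tuple_1_counts = Counter()  # key: (srcip, dstip)
tuple_2_counts = Counter()  # key: (srcip, srcport, dstport)


def in_range(value, low, high):
    """Half-open test matching the (low, high] notation in the config."""
    return low < value <= high


def observe(pkt):
    """Count one packet, then report whether it satisfies the condition
    tuple_1 in (30, inf] AND tuple_2 in (0, 5]."""
    k1 = (pkt["srcip"], pkt["dstip"])
    k2 = (pkt["srcip"], pkt["srcport"], pkt["dstport"])
    tuple_1_counts[k1] += 1
    tuple_2_counts[k2] += 1
    return (in_range(tuple_1_counts[k1], 30, math.inf)
            and in_range(tuple_2_counts[k2], 0, 5))


# A hypothetical scanner: one source probing 40 ports on one destination.
results = [
    observe({"srcip": "10.0.0.5", "dstip": "10.0.0.9",
             "srcport": 40000, "dstport": port})
    for port in range(1, 41)
]
```

Each probe hits a new destination port, so the tuple_2 count stays at 1 while the tuple_1 count grows; the condition becomes true from the 31st packet onward.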

Page 29

Sampling Lookup Table

• Problem: Conditions may not be completely specified
• Solution: Sampling budget lookup table
  – Lookup table for allocating sampling “budget” to each class

# tuple definitions
tuple_1 := srcip.dstip
tuple_2 := srcip.srcport.dstport
# condition : sampling budget
tuple_1 in (30, inf] AND
tuple_2 in (0, 5]: 0.5

[Table: sampling budget lookup table, including deduced values]

Next problem: Determining which condition each packet satisfies

Page 30

Counting Subpopulations

• Each packet belongs to a particular range in n-dimensional space
• Counts for each condition
  – Maintain counter (counting Bloom filter) for each tuple in every subcondition
  – Rotate counters to expunge “stale” values

Details:
1. Number of counters
2. How often to rotate
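A small sketch of a rotating array of counting Bloom filters. The class names, filter size, number of hash functions, and number of filters are all assumptions chosen for illustration; the slide leaves the number of counters and the rotation interval as open details.

```python
import hashlib


class CountingBloomFilter:
    """Approximate per-key counter; it can overcount but never undercounts."""

    def __init__(self, size=1024, num_hashes=3):
        self.size = size
        self.num_hashes = num_hashes
        self.cells = [0] * size

    def _positions(self, key):
        # Derive cell indices by salting one hash function; duplicates are
        # collapsed so a key never increments the same cell twice.
        positions = set()
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            positions.add(int.from_bytes(digest[:8], "big") % self.size)
        return positions

    def add(self, key):
        for pos in self._positions(key):
            self.cells[pos] += 1

    def count(self, key):
        # The minimum over the key's cells is the least inflated estimate.
        return min(self.cells[pos] for pos in self._positions(key))


class RotatingCounter:
    """Array of filters; the oldest one is cleared on each rotation."""

    def __init__(self, num_filters=4, **filter_args):
        self.filters = [CountingBloomFilter(**filter_args) for _ in range(num_filters)]
        self.current = 0

    def add(self, key):
        self.filters[self.current].add(key)

    def count(self, key):
        return sum(f.count(key) for f in self.filters)

    def rotate(self):
        # Advance to the oldest filter and reset it, expunging stale counts.
        self.current = (self.current + 1) % len(self.filters)
        f = self.filters[self.current]
        f.cells = [0] * f.size


counter = RotatingCounter(num_filters=2)
for _ in range(3):
    counter.add("10.0.0.5|10.0.0.9")
before = counter.count("10.0.0.5|10.0.0.9")
counter.rotate()  # the three packets now sit in the older filter
counter.rotate()  # that filter is cleared and reused
after = counter.count("10.0.0.5|10.0.0.9")
```

With two filters, a count survives one rotation and disappears on the second, which is how stale values age out without per-key timers.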

Page 31

Deriving Instantaneous Sampling Rates

• Problem: Traffic rates are dynamic
  – Relative fractions of packets in each class may change
• Solution: Count packets in each sampling class, and adjust probabilities to rebalance according to the lookup table
  – Instantaneous rate = overall rate * (target rate) / (actual rate)
  – Keep track of actual rate using Bloom filter array and EWMA
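The formula above can be worked through in a few lines. This sketch reads "target rate" as the share of the sampling budget assigned to a class and "actual rate" as the share of all traffic that currently falls in that class; that reading, the smoothing weight `alpha`, the clamp to 1.0, and the handling of an unseen class are assumptions.

```python
def ewma(previous, observation, alpha=0.2):
    """Exponentially weighted moving average; alpha is an assumed smoothing weight."""
    return alpha * observation + (1 - alpha) * previous


def instantaneous_probability(overall_rate, target_fraction, actual_fraction):
    """overall rate * (target rate) / (actual rate), clamped to a valid probability."""
    if actual_fraction <= 0:
        return 1.0  # assumed policy: class not seen yet, sample whatever arrives
    return min(1.0, overall_rate * target_fraction / actual_fraction)


overall_rate = 0.01    # sample one in 100 packets on average
target_fraction = 0.5  # half of the sampled packets should come from this class

# Hypothetical per-interval share of all traffic that falls in the class.
actual = 0.02
for observed in (0.02, 0.04, 0.04):
    actual = ewma(actual, observed)

p = instantaneous_probability(overall_rate, target_fraction, actual)
```

As the class grows from 2% toward 4% of the traffic, the smoothed estimate rises and the per-packet probability for that class falls, so the class keeps receiving roughly half of the overall budget.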

Page 32

Example Evaluation: Portscan

Setup
• Parameters as above
• Nmap scan injected into full one-hour trace from department network

Results
• FlexSample can capture 10x more of the portscan packets if all sampling budget is allocated to portscan class
• Bias can be configured

Page 33

Other Applications

• Recovering unique “conversations” in sampled traffic
• Identifying DDoS attacks
• Identifying heavy hitters, high-degree nodes, etc.

Page 34

Open Challenges

• Specifying ranges and classes for specific applications
• Scaling the counter array as the number of tuples and ranges increases
• Simultaneously satisfying multiple objectives

Page 35

Next Steps: BotMiner Integration

• Determine
  – The traffic rates that BotMiner can support for online analysis
  – The subpopulations that will yield the highest detection rates
• Evaluation on traffic traces that contain botnets of interest

