Page 1: Detecting and Characterizing Social Spam Campaigns

Detecting and Characterizing Social Spam Campaigns

Yan Chen

Lab for Internet and Security Technology (LIST) Northwestern Univ.

Page 2: Detecting and Characterizing Social Spam Campaigns


Detecting and Characterizing Social Spam Campaigns: Roadmap

• Motivation & Goal

• Detection System Design

• Experimental Validation

• Malicious Activity Analysis

• Conclusions


Page 4: Detecting and Characterizing Social Spam Campaigns


Motivation

• Online social networks (OSNs) are exceptionally useful collaboration and communication tools for millions of Internet users.

– 400M active users for Facebook alone

– Facebook surpassed Google as the most visited website

Page 5: Detecting and Characterizing Social Spam Campaigns


Motivation

• Unfortunately, the trusted communities in OSNs could become highly effective mechanisms for spreading miscreant activities.

– Popular OSNs have recently become the target of phishing attacks

– Account credentials are already being sold online in underground forums

Page 6: Detecting and Characterizing Social Spam Campaigns


Goal

• In this study, our goal is to:

– Design a systematic approach that can effectively detect miscreant activities in the wild in popular OSNs.

– Quantitatively analyze and characterize the verified detection results to provide further understanding of these attacks.

Page 7: Detecting and Characterizing Social Spam Campaigns


Detecting and Characterizing Social Spam Campaigns: Roadmap

• Motivation & Goal

• Detection System Design

• Experimental Validation

• Malicious Activity Analysis

• Conclusions

Page 8: Detecting and Characterizing Social Spam Campaigns


Detection System Design

The system design, starting from raw data collection and ending with the accurate classification of malicious wall posts and their corresponding users.

Page 9: Detecting and Characterizing Social Spam Campaigns


Data Collection

• Based on “wall” messages crawled from Facebook (crawling period: Apr. 09 ~ Jun. 09 and Sept. 09).

• Leveraging unauthenticated regional networks, we recorded the crawled users’ profiles, friend lists, and interaction records going back to January 1, 2008.

• 187M wall posts with 3.5M recipients are used in this study.

Page 10: Detecting and Characterizing Social Spam Campaigns


Filter posts without URLs

• Assumption: All spam posts should contain some form of URL, since the attacker wants the recipient to go to some destination on the web.

• Example (without URL):

Kevin! Lol u look so good tonight!!!

Filter out
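This filtering step can be sketched as follows. This is a minimal illustration, not the system's actual code; the regex and the `has_url` helper are assumptions of this sketch (note it deliberately matches bare domains, since spammers often omit the `http://` prefix):

```python
import re

# Matches explicit URLs as well as bare dotted domain names
# (hypothetical pattern; the paper's actual filter is not shown).
URL_PATTERN = re.compile(r"(?:https?://\S+|\b[\w-]+(?:\.[\w-]+)+\b)", re.IGNORECASE)

def has_url(wall_post: str) -> bool:
    """Return True if the wall post contains something URL-like."""
    return URL_PATTERN.search(wall_post) is not None

posts = [
    "Kevin! Lol u look so good tonight!!!",
    "Um maybe also this: http://community.livejournal.com/lemonadepoem/54654.html",
]
# Keep only posts that carry a URL; the rest are filtered out.
kept = [p for p in posts if has_url(p)]
```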

Page 11: Detecting and Characterizing Social Spam Campaigns


Filter posts without URLs

• Assumption: All spam posts should contain some form of URL, since the attacker wants the recipient to go to some destination on the web.

• Example (with URL):

Um maybe also this: http://community.livejournal.com/lemonadepoem/54654.html

Guess who your secret admirer is?? Go here nevasubevd . blogs pot . co m (take out spaces)

Further process

Page 12: Detecting and Characterizing Social Spam Campaigns


Build Post Similarity Graph

• After filtering wall posts without URLs, we build the post similarity graph on the remaining ones.

– A node: a remaining wall post

– An edge: the two wall posts are “similar” and are thus likely to be generated by the same spam campaign

Page 13: Detecting and Characterizing Social Spam Campaigns


Wall Post Similarity Metric

• Two wall posts are “similar” if:
– They share similar descriptions, or
– They share the same URL.

• Example (similar descriptions):

Guess who your secret admirer is?? Go here nevasubevd . blogs pot . co m (take out spaces)

Guess who your secret admirer is?? Visit: yes-crush . com (remove spaces)

Establish an edge!

Page 14: Detecting and Characterizing Social Spam Campaigns


Wall Post Similarity Metric

• Two wall posts are “similar” if:
– They share similar descriptions, or
– They share the same URL.

• Example (same URL):

secret admirer revealed. goto yourlovecalc . com (remove the spaces)

hey see your love compatibility! go here yourlovecalc . com (remove spaces)

Establish an edge!
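The two edge rules can be sketched as a single similarity predicate. The slides do not specify how "similar descriptions" are measured, so the word-shingle Jaccard score and its 0.5 threshold below are illustrative assumptions, not the paper's metric:

```python
import re

# Crude URL extractor for the sketch (matches http links and bare domains).
URL_RE = re.compile(r"https?://\S+|\b[\w-]+\.[a-z]{2,}\b", re.IGNORECASE)

def urls_of(post):
    return {m.lower() for m in URL_RE.findall(post)}

def shingles(post, k=3):
    """k-word shingles of the post text, used as a description fingerprint."""
    words = post.lower().split()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}

def similar(p1, p2, threshold=0.5):
    """Edge rule from the slides: same URL, or similar description text."""
    if urls_of(p1) & urls_of(p2):
        return True
    s1, s2 = shingles(p1), shingles(p2)
    if not s1 or not s2:
        return False
    jaccard = len(s1 & s2) / len(s1 | s2)
    return jaccard >= threshold
```

With this predicate, the two "yourlovecalc" posts above get an edge via the shared URL, and the two "secret admirer" posts get one via near-identical wording.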

Page 15: Detecting and Characterizing Social Spam Campaigns


Extract Wall Post Clusters

• Intuition:
– If A and B are generated from the same spam campaign, and B and C are generated from the same spam campaign, then A, B, and C are all generated from the same spam campaign.

• We thus reduce the problem of extracting wall post clusters to identifying connected subgraphs inside the post similarity graph.
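Under this reduction, cluster extraction is plain connected-component search over the similarity graph. A minimal sketch, where the quadratic edge construction and the `similar` predicate are placeholders rather than the paper's implementation:

```python
from collections import defaultdict

def extract_clusters(posts, similar):
    """Group wall posts into clusters = connected components of the
    post similarity graph. `similar(a, b)` is the edge predicate."""
    n = len(posts)
    adj = defaultdict(list)
    # O(n^2) edge construction, for illustration only.
    for i in range(n):
        for j in range(i + 1, n):
            if similar(posts[i], posts[j]):
                adj[i].append(j)
                adj[j].append(i)
    seen, clusters = set(), []
    for start in range(n):
        if start in seen:
            continue
        # Traverse one connected component.
        stack, component = [start], []
        seen.add(start)
        while stack:
            u = stack.pop()
            component.append(posts[u])
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        clusters.append(component)
    return clusters
```

Note how transitivity falls out for free: posts A and C land in one cluster through B even if A and C share no edge themselves.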

Page 16: Detecting and Characterizing Social Spam Campaigns


Extract Wall Post Clusters

A sample wall post similarity graph and the corresponding clustering process (for illustrative purposes only)

Page 17: Detecting and Characterizing Social Spam Campaigns


Identify Malicious Clusters

• The following heuristics are used to distinguish malicious clusters (spam campaigns) from benign ones:

– Distributed property: the cluster is posted by at least n distinct users.

– Bursty property: the median interval between two consecutive wall posts is less than t.
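The two heuristics combine into a single predicate, sketched below. Representing each post as a `(sender, timestamp)` pair is an assumption of this sketch; the default `(n, t) = (6, 3 hr)` comes from the sensitivity test later in the deck:

```python
from statistics import median

def is_malicious(cluster, n=6, t_seconds=3 * 3600):
    """Flag a cluster as a spam campaign when it is both distributed
    (>= n distinct senders) and bursty (median gap between consecutive
    posts below t). Each post is a (sender_id, unix_timestamp) pair."""
    senders = {sender for sender, _ in cluster}
    if len(senders) < n:          # distributed property
        return False
    times = sorted(ts for _, ts in cluster)
    if len(times) < 2:
        return False
    gaps = [b - a for a, b in zip(times, times[1:])]
    return median(gaps) < t_seconds   # bursty property
```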

Page 18: Detecting and Characterizing Social Spam Campaigns


Identify Malicious Clusters

A sample process of distinguishing malicious clusters from benign ones (for illustrative purposes only): each cluster is tested against the predicate from_user >= n && interval <= t; clusters that satisfy it are labeled malicious, and the rest benign.

Page 19: Detecting and Characterizing Social Spam Campaigns


Identify Malicious Clusters

• (6, 3 hr) is found to be a good (n, t) value by testing TP:FP rates on the borderline cases.

• Slightly modifying the values has only a minor impact on the detection result.

• Sensitivity test: varying the threshold from (6, 3 hr) to (4, 6 hr) results in only a 4% increase in classified malicious clusters.

Page 20: Detecting and Characterizing Social Spam Campaigns


Detecting and Characterizing Social Spam Campaigns: Roadmap

• Motivation & Goal

• Detection System Design

• Experimental Validation

• Malicious Activity Analysis

• Conclusions

Page 21: Detecting and Characterizing Social Spam Campaigns


Experimental Validation

• The validation is focused on detected URLs.

• A rigorous set of approaches is adopted to confirm the malice of the detection results.

• Any URL that cannot be confirmed by any approach is assumed to be “benign” (i.e., counted as a false positive).

Page 22: Detecting and Characterizing Social Spam Campaigns


Experimental Validation

• Step 1: Obfuscated URLs
– URLs embedded with obfuscation are malicious, since there is no incentive for benign users to obfuscate.

– Detecting obfuscated URLs, e.g.:
• Replacing ‘.’ with “dot”, e.g., 1lovecrush dot com
• Inserting white spaces, e.g., abbykywyty . blogs pot . co m, etc.
• A complete list of such tricks is available from anti-spam research
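The two obfuscation tricks above can be matched with simple patterns. These regexes are an illustrative sketch covering only the two listed tricks; the actual detector draws on a fuller rule list from anti-spam research:

```python
import re

# Trick 1: '.' spelled out as the word "dot", e.g. "1lovecrush dot com".
DOT_SPELLED_OUT = re.compile(r"\b[\w-]+\s+dot\s+[a-z]{2,4}\b", re.IGNORECASE)

# Trick 2: whitespace injected into the domain, e.g. "blogs pot . co m".
# Requires at least one space around/inside ".com" so normal URLs don't match.
SPACED_TLD = re.compile(r"\.\s*c\s+o\s+m\b|\.\s*co\s+m\b|\.\s+com\b", re.IGNORECASE)

def is_obfuscated_url(post: str) -> bool:
    """True if the post hides a URL via 'dot' spelling or inserted spaces."""
    return bool(DOT_SPELLED_OUT.search(post) or SPACED_TLD.search(post))
```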

Page 23: Detecting and Characterizing Social Spam Campaigns


Experimental Validation

• Step 2: Third-party tools
– Multiple tools are used, including:
• McAfee SiteAdvisor
• Google’s Safe Browsing API
• URL blacklists (SURBL, URIBL, Spamhaus, SquidGuard)
• Wepawet, for drive-by-download checking

– Any URL that is classified as “malicious” by at least one of these tools is confirmed as malicious

Page 24: Detecting and Characterizing Social Spam Campaigns


Experimental Validation

• Step 3: Redirection analysis
– Any URL that redirects to a confirmed malicious URL is considered “malicious”, too.

• Step 4: Wall post keyword search
– If the wall post contains a typical spam keyword, such as “viagra”, “enlarger pill”, or “legal bud”, the contained URL is considered “malicious”.

– Human assistance is involved in acquiring such keywords

Page 25: Detecting and Characterizing Social Spam Campaigns


Experimental Validation

• Step 5: URL grouping
– Groups of URLs exhibit highly uniform features. Some have been confirmed as “malicious” previously; the rest are then also considered “malicious”.

– Human assistance is involved in identifying such groups.

• Step 6: Manual analysis
– We leverage the Google search engine to confirm the malice of URLs that appear many times in our trace.

Page 26: Detecting and Characterizing Social Spam Campaigns


Experimental Validation

The validation results. Each row gives the number of confirmed URLs and wall posts in a given step. The total number of wall posts after filtering is ~2M out of 187M.

Page 27: Detecting and Characterizing Social Spam Campaigns


Detecting and Characterizing Social Spam Campaigns: Roadmap

• Motivation & Goal

• Detection System Design

• Experimental Validation

• Malicious Activity Analysis

• Conclusions

Page 28: Detecting and Characterizing Social Spam Campaigns


Usage summary of 3 URL Formats

• 3 different URL formats (with examples):
– Link: <a href=“...”>http://2url.org/?67592</a>
– Plain text: mynewcrsh.com
– Obfuscated: nevasubevu . blogs pot . co m

Page 29: Detecting and Characterizing Social Spam Campaigns


Usage summary of 4 Domain Types

• 4 different domain types (with examples):
– Content sharing service: imageshack.us
– URL shortening service: tinyurl.org
– Blog service: blogspot.com
– Other: yes-crush.com

Page 30: Detecting and Characterizing Social Spam Campaigns


Spam Campaign Identification

Page 31: Detecting and Characterizing Social Spam Campaigns


Spam Campaign Temporal Correlation

Page 32: Detecting and Characterizing Social Spam Campaigns


Attack Categorization

• The attacks are categorized by purpose.

• Narcotics, pharma, and luxury stand for the sale of the corresponding products.

Page 33: Detecting and Characterizing Social Spam Campaigns


User Interaction Degree

• Malicious accounts exhibit a higher interaction degree than benign ones.

Page 34: Detecting and Characterizing Social Spam Campaigns


User Active Time

• Active time is measured as the time between the first and last observed wall post made by the user.

• Malicious accounts exhibit much shorter active times compared to benign ones.

Page 35: Detecting and Characterizing Social Spam Campaigns


Wall Post Hourly Distribution

• The hourly distribution of benign posts is consistent with the human diurnal pattern, while that of malicious posts is not.

Page 36: Detecting and Characterizing Social Spam Campaigns


Detecting and Characterizing Social Spam Campaigns: Roadmap

• Motivation & Goal

• Detection System Design

• Experimental Validation

• Malicious Activity Analysis

• Conclusions

Page 37: Detecting and Characterizing Social Spam Campaigns


Conclusions

• We design automated techniques to detect coordinated spam campaigns on Facebook.

• Based on the detection results, we conduct an in-depth analysis of the malicious activities and make interesting discoveries, including:
– Over 70% of attacks are phishing attacks.
– Malicious posts do not exhibit human diurnal patterns.
– etc.

Page 38: Detecting and Characterizing Social Spam Campaigns


Thank you!

Page 39: Detecting and Characterizing Social Spam Campaigns


Extract Wall Post Clusters

The algorithm for wall post clustering. The details of breadth-first search (BFS) are omitted.
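The omitted BFS subroutine might look like the following generic sketch, where `adj` maps each post to its similar-post neighbors and `seen` is shared across calls so each component is extracted once:

```python
from collections import deque

def bfs_component(adj, start, seen):
    """Collect every node reachable from `start` in the post similarity
    graph `adj` (dict: node -> list of similar-post neighbors)."""
    component = []
    queue = deque([start])
    seen.add(start)
    while queue:
        u = queue.popleft()
        component.append(u)
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return component
```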

