PoisonAmplifier: A Guided Approach of Discovering Compromised Websites through Reversing Search Poisoning Attacks Jialong Zhang, Chao Yang, Zhaoyan Xu, Guofei Gu SUCCESS Lab, Texas A&M University Published in RAID 2012
Transcript
Page 1: Jialong Zhang, Chao Yang, Zhaoyan Xu, Guofei Gu SUCCESS Lab, Texas A&M University

PoisonAmplifier: A Guided Approach of Discovering Compromised Websites through Reversing Search Poisoning Attacks

Jialong Zhang, Chao Yang, Zhaoyan Xu, Guofei Gu

SUCCESS Lab, Texas A&M University

Published in RAID 2012

Page 2: Outline

A.C. Chen @ ADL 2

Introduction
– SEO
– Search Poisoning Attacks

System Design Evaluation Result Conclusion

Page 3: Search Engine Optimization (SEO)

White hat SEO
– a legitimate means of making websites appear on top of search results pages

Black hat SEO
– the malicious way of using SEO
– widely used by attackers to make their spam/malicious websites come up in top search results of popular search engines

Page 4: Search Poisoning Attacks

Mislead victims to malicious websites by taking advantage of users' trust in search results
– If the requests are referred from specific search engines, the malicious content will show
– If the requests come directly from users, the compromised websites will return normal content
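The two cases above amount to a single check on where the request came from. The toy model below is only a sketch of that decision, for example as a test fixture for a detector. The host list and the function name `served_content` are assumptions made for illustration, not details from the slides.

```python
from typing import Optional
from urllib.parse import urlparse

# Hypothetical list of search-engine hosts; the slides only say
# "specific search engines" without naming them.
SEARCH_ENGINE_HOSTS = {"www.google.com", "www.bing.com"}


def served_content(referer: Optional[str]) -> str:
    """Return which content a cloaking site would serve for a given Referer."""
    host = urlparse(referer).hostname if referer else None
    # Requests referred from a listed search engine see the malicious content;
    # direct visits (no Referer) and other referrers see the normal content.
    return "malicious" if host in SEARCH_ENGINE_HOSTS else "normal"
```

A direct visit corresponds to `served_content(None)`, and a click from a results page to a call with the search engine's URL as the Referer.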

Page 5: Search Poisoning Attacks Workflow

Page 6: Malicious Contents

[Figure panels: User View, Searcher View]

Page 7: In this Paper…

Given a small seed set, identify (amplify) more websites compromised by search poisoning attacks
– Unlike most existing studies, which try to understand or detect search poisoning attacks
– Ex: SURF: Detecting and Measuring Search Poisoning (CCS 2011)

Page 8: Main Idea

Attackers tend to use a similar set of keywords in multiple compromised websites

Attackers tend to insert links in Bot View to promote other compromised websites

Attackers tend to compromise multiple websites by exploiting similar vulnerabilities

Page 9: System Architecture of PoisonAmplifier

[Figure: system architecture. Labels: Initial Terms; Seed Collector; Seed Compromised Websites; Promoted Content Extractor; Promoted Content; Term Amplifier; Link Amplifier; Vulnerability Amplifier]

Figure annotations:
– compares the User View and Searcher View
– extracts and analyzes the content in Bot View but not in User View to find more compromised websites

Page 10: Seed Collector

Send HTTP requests with customized values in the HTTP header
– Searcher View: pretend to visit from a search engine by using a customized HTTP Referer
– User View: customized User-Agent
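As a rough illustration of the two request types, the sketch below builds (but does not send) one request per view with Python's standard `urllib.request`. The Referer value and the function name `build_view_request` are assumptions. The User-Agent is the Chrome string quoted on the Promoted Content Extractor slide.

```python
from urllib.request import Request

# Chrome User-Agent string quoted on the Promoted Content Extractor slide.
BROWSER_UA = (
    "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.11 "
    "(KHTML, like Gecko) Chrome/23.0.1271.64 Safari/537.11"
)
# Hypothetical Referer value: the slides do not give the exact URL used.
SEARCH_REFERER = "https://www.google.com/search?q=seed+term"


def build_view_request(url: str, view: str) -> Request:
    """Build the request for the 'user' or 'searcher' view of a URL."""
    if view not in ("user", "searcher"):
        raise ValueError(f"unknown view: {view!r}")
    headers = {"User-Agent": BROWSER_UA}
    if view == "searcher":
        # Pretend the visit was referred by a search results page.
        headers["Referer"] = SEARCH_REFERER
    return Request(url, headers=headers)
```

Each request can then be passed to `urllib.request.urlopen` to fetch the corresponding view.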

Page 11: Promoted Content Extractor

Extracts HTML content that appears in the Bot View but not in the User View
– filters out web content that is used only for display, such as HTML tags and CSS code

GoogleBot User-Agent: Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)

Chrome User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.64 Safari/537.11
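One simple way to approximate this extraction is to reduce both views to their visible text chunks and keep the chunks that occur only in the Bot View. The sketch below does this with the standard `html.parser` module. The chunk-level set difference and the names `visible_text` and `promoted_content` are assumptions, not the authors' implementation, and link targets (`href` values) are not captured here.

```python
from html.parser import HTMLParser
from typing import List


class _TextExtractor(HTMLParser):
    """Collect visible text, skipping <style> and <script> bodies."""

    SKIP = {"style", "script"}

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self.chunks: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = " ".join(data.split())  # normalize whitespace
        if text and not self._skip_depth:
            self.chunks.append(text)


def visible_text(html: str) -> List[str]:
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.chunks


def promoted_content(bot_html: str, user_html: str) -> List[str]:
    """Text chunks present in the Bot View but absent from the User View."""
    user_chunks = set(visible_text(user_html))
    return [c for c in visible_text(bot_html) if c not in user_chunks]
```

Here `bot_html` would be the page fetched with the GoogleBot User-Agent and `user_html` the page fetched with the Chrome User-Agent.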

Page 12: Term Amplifier

Extract effective query terms
– so that we can obtain as many compromised websites as possible via searching those terms

Term Amplifier workflow:
– tokenizes the promoted content into phrases {Pi | i = 1, 2, …, N}
– searches each phrase Pi on the search engine
– checks whether the number of search results is lower than a threshold TD = 1,000,000
– compares the Searcher Views and User Views of the search results
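The tokenizing and threshold steps can be sketched as below. The slide does not say how the promoted content is tokenized or what happens to phrases on either side of the threshold, so the splitting rule is an assumption and the code only partitions the phrases. `result_count` is a hypothetical callable standing in for a search-engine lookup.

```python
import re
from typing import Callable, Iterable, List, Tuple

TD = 1_000_000  # threshold on the number of search results, from the slide


def tokenize(promoted: str) -> List[str]:
    """Split promoted content into phrases.

    Assumption: a phrase ends at any character that is not a letter,
    a digit, or a space.
    """
    parts = re.split(r"[^A-Za-z0-9 ]+", promoted)
    return [" ".join(p.split()) for p in parts if p.strip()]


def split_by_threshold(
    phrases: Iterable[str],
    result_count: Callable[[str], int],
    threshold: int = TD,
) -> Tuple[List[str], List[str]]:
    """Partition phrases into (below threshold, at or above threshold)."""
    below: List[str] = []
    at_or_above: List[str] = []
    for phrase in phrases:
        if result_count(phrase) < threshold:
            below.append(phrase)
        else:
            at_or_above.append(phrase)
    return below, at_or_above
```

Passing the counting function as a parameter keeps the sketch independent of any particular search API.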

Page 13: Link Amplifier (1/3)

Extracts inner-links and outer-links
– inner-links refer to those links/URLs in the promoted web content
– outer-links refer to those links/URLs in the web content of 3rd-party websites
– Ref.

Page 14: Link Amplifier (2/3)

For each inner-link and outer-link, Link Amplifier considers the linking website a compromised website if its Searcher View and User View are different
– Utilize Google dork to locate the outer-links
– Ex: intext:seed.com intext:seedTerm
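A minimal sketch of the two link sources follows: inner-links are pulled from the `href` attributes of promoted HTML, and outer-links are located with a query in the `intext:` form shown in the example. The function names are hypothetical, and quoting multi-word terms is an assumption about how such a term would be passed to the search engine.

```python
from html.parser import HTMLParser
from typing import List


class _LinkCollector(HTMLParser):
    """Collect href values of <a> tags in document order."""

    def __init__(self) -> None:
        super().__init__()
        self.links: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_inner_links(promoted_html: str) -> List[str]:
    collector = _LinkCollector()
    collector.feed(promoted_html)
    collector.close()
    return collector.links


def outer_link_dork(seed_domain: str, seed_term: str) -> str:
    """Build a query of the form 'intext:seed.com intext:seedTerm'."""
    # Assumption: terms containing spaces are wrapped in double quotes.
    term = f'"{seed_term}"' if " " in seed_term else seed_term
    return f"intext:{seed_domain} intext:{term}"
```

Each URL obtained either way would still need the Searcher View versus User View comparison before its website is counted as compromised.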

Page 15: Link Amplifier (3/3)

We can find more categories of compromised websites through analyzing those outer-links

Page 16: Vulnerability Amplifier

Analyze possible system/software vulnerabilities of those compromised websites (collected from Term/Link Amplifier)

In our preliminary work, we only focus on analyzing the vulnerabilities of WordPress
– still requires some manual work to extract search signatures
– Vulnerability Amplifier examines whether a website is compromised or not by comparing its Searcher View and User View

Page 17: Evaluation of PoisonAmplifier

Stage I (1 week)
– dataset: seed compromised websites
– evaluate effectiveness, efficiency, and diversity

Stage II (1 month)
– dataset: the amplified terms and compromised websites from Stage I
– evaluate constancy

Page 18: Stage I: Dataset --- Seed Terms

Google Trends
– 103 unique Google Trends topics in total

Twitter Trends
– 64 unique Twitter Trends topics in total

Customized keywords
– specific to pharmacy scam words
– 165 unique pharmacy words in total, from existing work, manual selection, and the Google Suggest API

332 unique seed terms in total

Page 19: Stage I: Dataset --- Seed Compromised Websites

Google each unique seed term and collect the top 200 search results

Examine the Searcher View and User View of each search result

In total, 252 unique seed compromised websites were found
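The collection loop described above might look like the sketch below. Both `search` and `fetch_view` are hypothetical callables supplied by the caller, so no real search API is assumed. Treating any textual difference between the two views as a sign of compromise is a simplification, since real pages also differ because of ads or timestamps.

```python
from typing import Callable, Iterable, List, Set


def collect_seeds(
    seed_terms: Iterable[str],
    search: Callable[[str], List[str]],
    fetch_view: Callable[[str, str], str],
    top_n: int = 200,
) -> List[str]:
    """Return unique result URLs whose Searcher View and User View differ."""
    seeds: List[str] = []
    seen: Set[str] = set()
    for term in seed_terms:
        for url in search(term)[:top_n]:  # keep only the top results per term
            if url in seen:
                continue
            seen.add(url)
            # Simplification: plain inequality stands in for "views differ".
            if fetch_view(url, "searcher") != fetch_view(url, "user"):
                seeds.append(url)
    return seeds
```

Injecting the two callables also makes the loop easy to exercise offline with canned search results and page bodies.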

Page 20: Effectiveness

How many new compromised websites can be found

Amplifying Rate (AR) of compromised sites
– # of newly found compromised websites / # of seed compromised websites

Starting from only 252 seed compromised websites, around 75,000 unique compromised websites were discovered in total
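As a worked example of the ratio, the snippet below plugs in the two figures from this slide. It assumes that the roughly 75,000 discovered websites all count as newly found. Because that number is rounded on the slide, the result is only indicative.

```python
def amplifying_rate(newly_found: int, seeds: int) -> float:
    """AR = number of newly found compromised sites / number of seed sites."""
    if seeds <= 0:
        raise ValueError("seeds must be positive")
    return newly_found / seeds


# Figures from the slide: about 75,000 discovered sites, 252 seeds.
print(round(amplifying_rate(75_000, 252), 1))  # 297.6
```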

Page 21: Efficiency

Whether the websites visited by PoisonAmplifier are more likely to be compromised websites

Hit Rate (HR) of the PoisonAmplifier
– # of newly found compromised websites / total # of websites it visited
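The hit rate can be computed from a visit log, as in the short sketch below. The log format (a mapping from visited URL to whether it turned out to be compromised) is an assumption, and the sample URLs are made up.

```python
from typing import Dict


def hit_rate(visited: Dict[str, bool]) -> float:
    """HR = compromised sites found / total sites visited."""
    if not visited:
        raise ValueError("no websites visited")
    return sum(visited.values()) / len(visited)


# Made-up log: four sites visited, one of them compromised.
log = {
    "http://a.example/": True,
    "http://b.example/": False,
    "http://c.example/": False,
    "http://d.example/": False,
}
print(hit_rate(log))  # 0.25
```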

Page 22: Diversity

How many compromised websites of each component are exclusive, i.e. cannot be found by other components

Exclusive Ratio (ER)
– # of compromised websites that are only found by this component / total # of compromised websites found by this component

Component                 ER
Term Amplifier            99.56%
Inner-link Amplifier      96.11%
Outer-link Amplifier      89.09%
Vulnerability Amplifier   88.77%
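The definition of ER maps directly onto set operations. The sketch below assumes each component's findings are available as a set of website identifiers. The function name and the toy data are illustrative only and do not reproduce the percentages in the table.

```python
from typing import Dict, Set


def exclusive_ratios(found: Dict[str, Set[str]]) -> Dict[str, float]:
    """ER per component: sites found only by it / all sites found by it."""
    ratios: Dict[str, float] = {}
    for name, sites in found.items():
        # Union of everything found by the other components.
        others = set().union(*(s for n, s in found.items() if n != name))
        ratios[name] = len(sites - others) / len(sites) if sites else 0.0
    return ratios


# Toy data: the two components share exactly one site, "d".
toy = {
    "Term Amplifier": {"a", "b", "c", "d"},
    "Link Amplifier": {"d", "e"},
}
print(exclusive_ratios(toy))
# {'Term Amplifier': 0.75, 'Link Amplifier': 0.5}
```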

Page 23: Constancy

Whether PoisonAmplifier can continue to find new compromised websites over time

Figure annotations:
– Link Amplifier and Vulnerability Amplifier can keep finding new terms and compromised websites
– The daily newly found compromised websites decrease quickly due to the exhaustion of terms

Page 24: Distribution based on TLD

Page 25: Conclusion

Starting from a small seed set of known compromised websites, PoisonAmplifier can recursively find more compromised websites by analyzing poisoned webpages' special terms and links and by exploring compromised websites' vulnerabilities

