Date posted: 15-Dec-2014
Unifying the Global Response to Cybercrime
bit.ly/malicious: Deep Dive into Short URL based e-Crime Detection
Neha Gupta, Anupama Aggarwal, Ponnurangam Kumaraguru
IIIT-Delhi, India
Presentation Outline
- Problem
- Contribution
- Dataset
- Results
- Conclusions & Future Work
What are URL shortening services?
[Figure: Long URL -> URL shortening service (…, Others) -> Short URL]
- Shortens ~80 million links/day
- 2-3 million suspicious/week
Abuse
One-level obfuscation:
Long malicious URL -> URL shortening service -> Short malicious URL
Multi-level obfuscation:
Long malicious URL -> Not so popular URL shortening service -> Short malicious URL -> Popular URL shortening service -> …
[Services shown in the figure: is.gd, bit.ly]
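The multi-level obfuscation on this slide can be unwound by following each shortener hop until no further redirect exists. The sketch below is a minimal illustration, not the authors' tooling: `REDIRECTS`, `SHORTENER_DOMAINS`, `expand_chain`, `obfuscation_levels`, and all URLs are hypothetical, and the dictionary lookup stands in for real HTTP redirects.

```python
from urllib.parse import urlparse

# Hypothetical redirect table standing in for real HTTP redirects.
REDIRECTS = {
    "http://bit.ly/abc123": "http://is.gd/xyz789",
    "http://is.gd/xyz789": "http://malicious.example/landing",
}

# Assumption: an illustrative subset of shortener domains.
SHORTENER_DOMAINS = {"bit.ly", "is.gd"}


def expand_chain(url, max_hops=10):
    """Follow redirects and return every URL visited, in order."""
    chain = [url]
    while url in REDIRECTS and len(chain) <= max_hops:
        url = REDIRECTS[url]
        if url in chain:  # stop on redirect loops
            break
        chain.append(url)
    return chain


def obfuscation_levels(chain):
    """Count how many URLs in the chain belong to a known shortener."""
    return sum(1 for u in chain if urlparse(u).netloc in SHORTENER_DOMAINS)


chain = expand_chain("http://bit.ly/abc123")
levels = obfuscation_levels(chain)
print(chain[-1], levels)
```

A chain with two or more shortener hops corresponds to the multi-level case; a single hop is one-level obfuscation.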
Major attacks
[Figure: reported attacks from 2012, 2013, and 2014 (two examples from 2014)]
Bitly's Spam Detection Policies
[Figure: detection services combined with "+ More filters.."; the quoted policy text did not survive extraction]
Research Contribution
- Impact analysis of malicious Bitly links on online social media (OSM)
- Identification of issues in Bitly’s spam detection
- Machine learning classification to detect malicious Bitly URLs
Dataset
Phase 1: Link Dataset (763,160)
Phase 2: Link Metric Dataset (413,119) (54.13%)
Phase 3: Encoder/User Metric Dataset (12,344) (100%)
Fields shown in the figure: Bitly Global Hash, Long URL, #Warnings
Metric labels shown in the figure: link_info, link_expand, link_clicks, link_referring_domains, link_encoders, link_encoder_info, link_encoder_link_history
Domains
- 83.06% of suspicious domains were non-existent after 5 months
- Click requests (October 2013): 9,937,250
- Domains are created for spamming and die after achieving significant hits
Network
[Figure: connected OSM network of all encoders; 5,375 users, split 63.54% / 17.69% / 18.77% (segment labels not recoverable)]
Why more Twitter than Facebook?
- Bitly doesn't allow users to connect Facebook brand / fan pages for free
Multiple connections
- 507 malicious users connected multiple Twitter accounts
- 28 malicious users connected at least 10 Twitter accounts
Network
[Flow diagram; parenthesized items are labels shown in the diagram]
1. Bitly profiles (link history)
2. Bitly warning check
3. Twitter profile Jaccard similarity (connected Twitter accounts, <=200 tweets)
4. Bitly profile Jaccard similarity (Bitly user name)
5. Manual annotation based on similarity scores
6. 3 malicious communities detected
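The Jaccard similarity used to compare profiles is the size of the intersection of two sets divided by the size of their union. The sketch below is a minimal illustration: the whitespace tokenizer (`tokens`), the sample profile strings, and the convention of returning 0.0 for two empty profiles are assumptions, since the slides do not say how profiles were tokenized.

```python
def jaccard(a, b):
    """Jaccard similarity |A & B| / |A | B| of two iterables of tokens."""
    set_a, set_b = set(a), set(b)
    if not set_a and not set_b:
        return 0.0  # convention chosen here for two empty profiles
    return len(set_a & set_b) / len(set_a | set_b)


def tokens(text):
    """Assumed tokenizer: lowercase and split on whitespace."""
    return text.lower().split()


# Made-up profile texts for illustration only.
profile_1 = tokens("free pills cheap pills click here")
profile_2 = tokens("cheap pills click now")
score = jaccard(profile_1, profile_2)
print(score)
```

Pairs of profiles with high scores would then go to manual annotation, as the slide describes.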
Network
- 2 Bitly users with 9 Twitter accounts each
- Similar explicit pornographic content
- Dormant on Bitly, active on Twitter
Efficiency
(a) Malicious link detection
- APWG: 86% undetected
- VirusTotal: 71.53% undetected
- SURBL: 36.66% undetected (Bitly claims to use SURBL)
(b) Malicious user profile detection
Efficiency
- 2,018 of 12,344 encoders (16.35%) had a Suspicion Factor of 1, i.e. they shortened only suspicious links
[Figure: 12,344 encoders in total; the other 10,326 did not have a Suspicion Factor of 1]
Promptness Analysis
Highly suspicious profiles: user has shortened at least 100 links and Suspicion Factor is 1 (80 profiles)
[Figure: example user bamsesang, month lag: 24]
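The slides do not define Suspicion Factor. The sketch below assumes it is the fraction of a user's shortened links that are suspicious, which is consistent with "Suspicion Factor = 1" meaning "shortened only suspicious links". The `Encoder` record, the function names, and the sample users are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Encoder:
    name: str
    total_links: int
    suspicious_links: int


def suspicion_factor(enc):
    """Assumed definition: suspicious links / total links shortened."""
    if enc.total_links == 0:
        return 0.0
    return enc.suspicious_links / enc.total_links


def highly_suspicious(encoders, min_links=100):
    """Profiles with at least `min_links` links and Suspicion Factor 1."""
    return [
        e for e in encoders
        if e.total_links >= min_links and suspicion_factor(e) == 1.0
    ]


# Made-up users for illustration only.
encoders = [
    Encoder("user_a", 150, 150),  # many links, all suspicious
    Encoder("user_b", 20, 20),    # all suspicious, but too few links
    Encoder("user_c", 300, 120),  # many links, only some suspicious
]
flagged = [e.name for e in highly_suspicious(encoders)]
print(flagged)
```

Only `user_a` meets both conditions of the slide's "highly suspicious" rule in this made-up sample.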
Bitly’s response
Malicious Bitly Link Detection
Data Collection and Labeling
Data collection:
- Collect tweets from Twitter's REST API (412,139)
- Extract and expand bitly URLs (34,802)
Data labeling:
- Blacklist + Bitly warning check, using:
  1. Google Safe Browsing
  2. SURBL
  3. PhishTank
  4. VirusTotal
- Output: labeled-dataset (Malicious, Benign) and unlabeled-dataset
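The labeling step can be sketched as a lookup against several blacklists plus the Bitly warning check. This is a minimal illustration under stated assumptions: the in-memory sets stand in for the real services, the rule "malicious if any source flags the URL" is assumed rather than taken from the slides, and `flagged_by`, `label_url`, and all URLs are hypothetical.

```python
# Hypothetical stand-ins for the four blacklist services named on the slide.
BLACKLISTS = {
    "google_safe_browsing": {"http://bad.example/a"},
    "surbl": {"http://bad.example/b"},
    "phishtank": set(),
    "virustotal": {"http://bad.example/a"},
}

# Hypothetical set of long URLs for which Bitly shows a warning page.
BITLY_WARNED = {"http://bad.example/c"}


def flagged_by(url):
    """Names of the blacklists that list this URL, sorted alphabetically."""
    return sorted(name for name, listed in BLACKLISTS.items() if url in listed)


def label_url(url):
    """Assumed rule: malicious if any blacklist or the Bitly warning fires."""
    if flagged_by(url) or url in BITLY_WARNED:
        return "malicious"
    return "benign"


urls = [
    "http://bad.example/a",
    "http://bad.example/b",
    "http://bad.example/c",
    "http://ok.example/",
]
labels = {u: label_url(u) for u in urls}
print(labels)
```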
Feature Selection
No. | Feature Name | Feature Description
1 | Domain age | Difference between domain creation / update date and expiration date
2 | Link creation domain creation difference | Difference between domain creation date and bitly link creation date
3 | Link creation hour | Bitly link creation hour
4 | Number of encoders | Number of bitly users who encoded a particular link
5 | Anonymous and API encoder ratio | Ratio of encoders that are "anonymous" or from a Twitter-based application (Twitterfeed, TweetDeck, Tweetbot) to the total number of encoders
6 | Link creation first click difference | Difference in days between bitly link creation date and date of first click received
7 | Referring domains - direct by total | Ratio of referring domains from a direct source to the total number of referring domains
Feature groups marked beside the table: WHOIS specific, Bitly specific; Non-click based, Click based
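The seven features in the table can be computed from a handful of timestamps and counts. The sketch below is a minimal illustration, not the authors' extraction code: the function name, argument names, and dictionary keys are hypothetical, and measuring features 1 and 2 in days is an assumption (the table only states days for feature 6).

```python
from datetime import datetime

# Encoder names treated as anonymous or Twitter-application based (feature 5).
ANON_OR_API = {"anonymous", "twitterfeed", "tweetdeck", "tweetbot"}


def link_features(domain_created, domain_expires, link_created,
                  first_click, encoders, direct_refs, total_refs):
    """Build the feature dictionary for one bitly link."""
    anon = sum(1 for e in encoders if e.lower() in ANON_OR_API)
    return {
        # 1. Domain age (assumed unit: days)
        "domain_age_days": (domain_expires - domain_created).days,
        # 2. Link creation vs. domain creation (assumed unit: days)
        "link_domain_diff_days": (link_created - domain_created).days,
        # 3. Hour of day at which the bitly link was created
        "link_creation_hour": link_created.hour,
        # 4. Number of bitly users who encoded the link
        "num_encoders": len(encoders),
        # 5. Share of anonymous / Twitter-application encoders
        "anon_api_ratio": anon / len(encoders) if encoders else 0.0,
        # 6. Whole days between link creation and first click
        "first_click_lag_days": (first_click - link_created).days,
        # 7. Direct referring domains over all referring domains
        "direct_ref_ratio": direct_refs / total_refs if total_refs else 0.0,
    }


# Made-up values for illustration only.
features = link_features(
    domain_created=datetime(2013, 1, 1),
    domain_expires=datetime(2014, 1, 1),
    link_created=datetime(2013, 1, 11, 14, 30),
    first_click=datetime(2013, 1, 13, 9, 0),
    encoders=["anonymous", "alice", "tweetdeck", "bob"],
    direct_refs=3,
    total_refs=4,
)
print(features)
```

Features 6 and 7 need click data, so a link that has never been clicked can only supply features 1 to 5.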
Evaluation Results
Experiment 1
- Mix dataset: click and non-click
- All features
- Precision (random forest): 81.20%
Experiment 2
- Only non-click data
- WHOIS + non-click based features
- Precision (random forest): 89.60%
[Figures: one confusion matrix (TP, FP, FN, TN) per experiment; cell values not recoverable]
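Precision is read off a confusion matrix as TP / (TP + FP). The sketch below shows that calculation, with recall alongside for comparison; the counts are made up for illustration, since the matrix values on the slide are not recoverable and the slide reports precision only.

```python
def precision(tp, fp):
    """Fraction of links predicted malicious that really are malicious."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    """Fraction of truly malicious links that the classifier caught."""
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical confusion-matrix counts, not the paper's results.
tp, fp, fn, tn = 80, 20, 10, 90
p = precision(tp, fp)
r = recall(tp, fn)
print(f"precision={p:.2%} recall={r:.2%}")
```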
Feature Ranks
Experiment 1
Rank | Feature
1 | Type of referring domains
2 | Link creation domain creation difference
3 | Domain age
4 | Link creation hour
5 | Type of encoders
6 | Link creation-click lag
7 | Number of encoders
Experiment 2
Rank | Feature
1 | Link creation hour
2 | Link creation domain creation difference
3 | Domain age
4 | Type of encoders
5 | Number of encoders
Conclusion & Future Work
- Restricted FB/Twitter connections per profile
- Credibility score per profile
- Bitly specific features in addition to blacklists
- Temporal pattern
- Broaden / generalize features for other URL shorteners
- Browser extension
Questions?
Thanks to Bitly
- Brian David Eoff
- Mark Josephson
Thank You! [email protected]