Pattern Recognition and Applications Lab, Department of Electrical and Electronic Engineering, University of Cagliari, Italy
Evasion attacks against machine learning at test time
Battista Biggio (1), Igino Corona (1), Davide Maiorca (1), Blaine Nelson (3), Nedim Šrndić (2), Pavel Laskov (2), Giorgio Giacinto (1), and Fabio Roli (1)
(1) University of Cagliari (IT); (2) University of Tuebingen (GE); (3) University of Potsdam (GE)
http://pralab.diee.unica.it
Machine learning in adversarial settings
• Machine learning in computer security – spam filtering, intrusion detection, malware detection
• Adversaries manipulate samples at test time to evade detection
[Figure: two-dimensional feature space (x1, x2) with decision boundary f(x) separating legitimate from malicious samples; a malicious sample is manipulated at test time to cross into the legitimate region]
Original spam: "Trading alert! We see a run starting to happen. It's just beginning of 1 week promotion …"
Obfuscated spam: "Tr@ding al3rt! We see a run starting to happen. It's just beginning of 1 week pr0m0ti0n …"
Our work
Problem: can machine learning be secure? (1)
• Framework for proactive security evaluation of ML algorithms (2)
• Adversary model:
  – Goal of the attack
  – Knowledge of the attacked system
  – Capability of manipulating data
  – Attack strategy as an optimization problem
Bounded adversary!
(1) M. Barreno, B. Nelson, R. Sears, A. D. Joseph, and J. D. Tygar. Can machine learning be secure? ASIACCS 2006
(2) B. Biggio, G. Fumera, F. Roli. Security evaluation of pattern classifiers under attack. IEEE Trans. on Knowl. and Data Engineering, 2013
In this work we exploit our framework for security evaluation against evasion attacks!
Bounding the adversary’s capability
• Cost of manipulations – Spam: message readability
• Encoded by a distance function in feature space (L1-norm) – e.g., number of words that are modified in spam emails
$d(x, x') \leq d_{\max}$ — bounded by a maximum value
[Figure: feasible domain for the manipulated sample $x'$ around $x$ in the feature space $(x_1, x_2)$, with the decision boundary $f(x)$]
We will evaluate classifier performance vs. increasing $d_{\max}$
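As a small illustration (not from the paper; the function names and the toy feature vectors are assumptions), the $L_1$ budget on binary word features simply counts the modified words:

```python
import numpy as np

def l1_distance(x, x_prime):
    """L1 distance between the original sample x and its manipulated
    version x'; for binary word features this counts the modified words."""
    return np.abs(x - x_prime).sum()

def is_feasible(x, x_prime, d_max):
    """Check that the manipulation stays within the adversary's budget."""
    return l1_distance(x, x_prime) <= d_max

# toy spam example: 6 binary word-presence features, 2 words changed
x = np.array([1, 0, 1, 1, 0, 0], dtype=float)
x_prime = np.array([1, 1, 1, 0, 0, 0], dtype=float)
print(is_feasible(x, x_prime, d_max=3))  # True: d(x, x') = 2 <= 3
```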
Gradient-descent evasion attacks
• Goal: maximum-confidence evasion
• Knowledge: perfect
• Attack strategy: non-linear, constrained optimization
  – Gradient descent: approximate solution for smooth functions
• Gradients of g(x) can be analytically computed in many cases
– SVMs, Neural networks
$f(x) = \operatorname{sign}(g(x)) = \begin{cases} +1, & \text{malicious} \\ -1, & \text{legitimate} \end{cases}$

Attack strategy: $\min_{x'}\; g(x') \quad \text{s.t.} \quad d(x, x') \leq d_{\max}$

[Figure: contour plot of $g(x)$ with the gradient-descent path followed by the attack sample $x'$]
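Below is a minimal sketch of this gradient-descent attack under perfect knowledge. It is illustrative only, not the authors' implementation: the names `evade`, `g`, `grad_g`, the step size, and the simple rescaling used to stay within the $L_1$ budget are all assumptions.

```python
import numpy as np

def evade(x0, g, grad_g, d_max, step=0.1, n_iter=200):
    """Maximum-confidence evasion sketch: minimise g(x') subject to
    ||x' - x0||_1 <= d_max by plain gradient descent.
    g / grad_g are the classifier's (differentiable) discriminant
    function and its gradient, assumed known (perfect knowledge)."""
    x = x0.copy()
    for _ in range(n_iter):
        x = x - step * grad_g(x)           # descend on g
        delta = x - x0
        l1 = np.abs(delta).sum()
        if l1 > d_max:                     # crude feasibility step:
            x = x0 + delta * (d_max / l1)  # rescale back into the L1 ball
    return x
```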
Computing descent directions
Support vector machines:
$g(x) = \sum_i \alpha_i y_i\, k(x, x_i) + b, \qquad \nabla g(x) = \sum_i \alpha_i y_i\, \nabla k(x, x_i)$
RBF kernel gradient: $\nabla k(x, x_i) = -2\gamma \exp\{-\gamma \lVert x - x_i \rVert^2\}\,(x - x_i)$

Neural networks:
$g(x) = \left(1 + \exp\left(-\sum_{k=1}^{m} w_k\, \delta_k(x)\right)\right)^{-1}$
$\dfrac{\partial g(x)}{\partial x_f} = g(x)\,\bigl(1 - g(x)\bigr)\sum_{k=1}^{m} w_k\, \delta_k(x)\,\bigl(1 - \delta_k(x)\bigr)\, v_{kf}$

[Figure: diagram of the neural network, with inputs $x_1, \dots, x_d$, hidden units $\delta_1, \dots, \delta_m$, input weights $v_{kf}$, output weights $w_k$, and output $g(x)$]
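As an illustration only (not the authors' code; names and the use of NumPy are assumptions), the SVM gradient above can be computed directly from the dual solution:

```python
import numpy as np

def svm_rbf_g_and_grad(x, sv, alpha_y, b, gamma):
    """g(x) = sum_i alpha_i y_i k(x, x_i) + b and its gradient for an SVM
    with RBF kernel k(x, x_i) = exp(-gamma * ||x - x_i||^2).
    sv: support vectors (n_sv, d); alpha_y: the products alpha_i * y_i."""
    diffs = x - sv                                   # (n_sv, d)
    k = np.exp(-gamma * np.sum(diffs ** 2, axis=1))  # kernel values k(x, x_i)
    g = alpha_y @ k + b
    # grad g(x) = sum_i alpha_i y_i * (-2 gamma k(x, x_i)) * (x - x_i)
    grad = (-2.0 * gamma) * ((alpha_y * k) @ diffs)
    return g, grad
```

With scikit-learn, for example, `alpha_y`, `sv`, and `b` would roughly correspond to `dual_coef_`, `support_vectors_`, and `intercept_` of a fitted `SVC`, though that mapping is an assumption about the reader's setup.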
Density-augmented gradient descent
• Problem: greedily minimizing g(x) may not lead to classifier evasion!
• Solution: add a mimicry component that attracts the attack samples towards samples classified as legitimate

Attack strategy with the mimicry component (kernel density estimator):
$\min_{x'}\; g(x') - \lambda\, p(x' \mid y^c = -1) \quad \text{s.t.} \quad d(x, x') \leq d_{\max}$

[Figure: contours of $g(x) - \lambda\, p(x \mid y^c = -1)$; with $\lambda = 0$ some attack samples may not evade the classifier, while with $\lambda = 20$ all the attack samples evade the classifier]
KDE gradient (RBF kernel): $\nabla p(x \mid y^c = -1) = -\dfrac{2}{nh} \sum_{i \mid y_i^c = -1} \exp\left\{-\dfrac{\lVert x - x_i \rVert^2}{h}\right\} (x - x_i)$
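A small sketch (illustrative names, not the paper's code) of the KDE mimicry term and the resulting density-augmented descent direction, following the two formulas above; the kernel's own normalisation constant is dropped, as in the gradient formula, since it only rescales $\lambda$.

```python
import numpy as np

def kde_and_grad(x, legit, h):
    """p(x | y^c = -1) estimated with an RBF kernel of bandwidth h over the
    n legitimate samples `legit`, and its gradient w.r.t. x."""
    diffs = x - legit                              # (n, d)
    w = np.exp(-np.sum(diffs ** 2, axis=1) / h)    # kernel weights
    n = legit.shape[0]
    p = w.sum() / n
    grad_p = -(2.0 / (n * h)) * (w @ diffs)
    return p, grad_p

def augmented_direction(x, grad_g, legit, h, lam):
    """Descent direction of the mimicry-augmented objective
    g(x) - lam * p(x | y^c = -1)."""
    _, grad_p = kde_and_grad(x, legit, h)
    return grad_g(x) - lam * grad_p
```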
An example on MNIST handwritten digits
• Linear SVM, 3 vs 7. Features: pixel values.
[Figure: MNIST digits before the attack (3 vs. 7), after the attack at $g(x) = 0$, and after the attack at the last iteration, together with plots of $g(x)$ vs. the number of iterations ($d_{\max} = 5000$); top row: without mimicry ($\lambda = 0$), bottom row: with mimicry ($\lambda = 10$)]
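For the linear SVM used in this example, $g(x) = w \cdot x + b$, so the descent direction is simply the constant weight vector $w$. A minimal sketch under that assumption (pixel features in $[0, 1]$; names and the rescaling step are illustrative, not the authors' code):

```python
import numpy as np

def evade_linear_svm(x0, w, b, d_max, step=0.05, n_iter=500):
    """Gradient-descent evasion against a linear SVM: grad g(x) = w.
    Pixel values are kept in [0, 1]; the total change is limited to d_max."""
    x = x0.copy()
    for _ in range(n_iter):
        x = np.clip(x - step * w, 0.0, 1.0)   # move each pixel along -w
        delta = x - x0
        l1 = np.abs(delta).sum()
        if l1 > d_max:
            x = x0 + delta * (d_max / l1)     # rescale into the L1 budget
    return x
```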
Bounding the adversary's knowledge: limited-knowledge attacks
• Only the feature representation and the learning algorithm are known
• Surrogate data sampled from the same distribution as the classifier's training data
• Classifier's feedback to label the surrogate data
[Diagram: surrogate training data sampled from $p_D(X, Y)$; the attacker sends queries to the targeted classifier $f(x)$, gets labels, and learns a surrogate classifier $f'(x)$]
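A minimal sketch of the limited-knowledge procedure in the diagram, assuming scikit-learn is available and that the surrogate uses the same learning algorithm as the target; all names are illustrative, not the authors' code.

```python
from sklearn.svm import SVC

def learn_surrogate(surrogate_X, target_classifier):
    """Query the targeted classifier f(x) to label the surrogate data,
    then train a surrogate f'(x) with the same feature representation
    and learning algorithm (here an RBF SVM, as an assumption)."""
    y_surrogate = target_classifier.predict(surrogate_X)  # send queries, get labels
    surrogate = SVC(kernel="rbf", C=1.0)
    surrogate.fit(surrogate_X, y_surrogate)
    return surrogate  # evasion points are then optimised against this surrogate
```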
Experiments on PDF malware detection
• PDF: hierarchy of interconnected objects (keyword/value pairs)
• Adversary's capability
  – adding up to $d_{\max}$ objects to the PDF
  – removing objects may compromise the PDF file (and embedded malware code)!
Features: keyword counts (e.g., /Type: 2, /Page: 1, /Encoding: 1, …), extracted from the PDF's objects, e.g.:
13 0 obj << /Kids [ 1 0 R 11 0 R ] /Type /Page ... >> endobj
17 0 obj << /Type /Encoding /Differences [ 0 /C0032 ] >> endobj
$\min_{x'}\; g(x') - \lambda\, p(x' \mid y = -1)$
s.t. $d(x, x') \leq d_{\max}, \quad x \leq x'$
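The extra constraint $x \leq x'$ (only object additions) can be enforced after each gradient step. The projection below is a simple illustrative choice under stated assumptions, not necessarily the one used in the paper; the function name and the greedy budget enforcement are hypothetical.

```python
import numpy as np

def pdf_feasible_step(x, x0, grad, step, d_max):
    """One descent step on the keyword-count features: counts may only
    grow (x0 <= x'), stay integer, and at most d_max objects are added."""
    x_new = np.maximum(np.round(x - step * grad), x0)  # only additions, integer counts
    added = x_new - x0
    excess = added.sum() - d_max
    if excess > 0:
        # crude budget enforcement: undo the smallest additions first
        order = np.argsort(added)          # indices from smallest to largest addition
        for i in order:
            take = min(added[i], excess)
            added[i] -= take
            excess -= take
            if excess <= 0:
                break
        x_new = x0 + added
    return x_new
```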
Experiments on PDF malware detection: linear SVM
• Dataset: 500 malware samples (Contagio), 500 benign samples (Internet)
  – 5-fold cross-validation
  – Targeted (surrogate) classifier trained on 500 (100) samples
• Evasion rate (FN) at FP=1% vs. maximum number of added keywords $d_{\max}$
  – Perfect knowledge (PK); Limited knowledge (LK)
[Figure: FN vs. $d_{\max}$ (0–50) for the linear SVM (C=1), without mimicry ($\lambda = 0$) and with mimicry ($\lambda = 500$), for PK and LK attacks]
Experiments on PDF malware detection: SVM with RBF kernel, neural network
[Figure: FN vs. $d_{\max}$ (0–50) for the SVM with RBF kernel (C=1) and the neural network (m=5), without mimicry ($\lambda = 0$) and with mimicry ($\lambda = 500$), for PK and LK attacks]
Conclusions and future work
• Related work. Near-optimal evasion of linear and convex-inducing classifiers (1,2)
• Our work. Linear and non-linear classifiers can be highly vulnerable to well-crafted evasion attacks
– … even under limited attacker’s knowledge
• Future work
  – Evasion of non-differentiable decision functions (e.g., decision trees)
  – Surrogate data: how to query the targeted classifier more efficiently?
  – Practical evasion: feature representation partially known or difficult to reverse-engineer
  – Securing learning: game theory to model the classifier vs. the adversary
(1) D. Lowd and C. Meek. Adversarial learning. ACM SIGKDD, 2005.
(2) B. Nelson, B. I. Rubinstein, L. Huang, A. D. Joseph, S. J. Lee, S. Rao, and J. D. Tygar. Query strategies for evading convex-inducing classifiers. JMLR, 2012.
Any questions? Thanks for your attention!