Outline
• Deep learning review
• Adversarial examples
• Introduction
• Houdini: generalize to any machine learning algorithm
Cisse, Adi, Neverova, & Keshet (2017)
• Speaker verification attack
Kreuk, Adi, Cisse, & Keshet (2017)
• Adversarial malware
Kreuk, Barak, Aviv-Reuven, Baruch, Pinkas, & Keshet (2018)
• Watermarking machine learning models
Adi, Baum, Cisse, Pinkas, & Keshet (2018)
• Defenses and detection of adversarial attacks
Shalev, Adi, & Keshet (2018)
Measuring performance
– "0-1" loss in binary classification
– Intersection-over-union in object segmentation
– Word Error Rate in speech recognition:
  reference:  "It is easy to recognize speech"
  hypothesis: "It is easy to wreck a nice beach"
  (1 substitution, 3 insertions)
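The Word Error Rate is the word-level edit distance between the reference and the hypothesis, normalized by the reference length. A minimal sketch (not from the slides, the function name is illustrative):

```python
# Word Error Rate via word-level edit distance (Levenshtein).
def wer(reference, hypothesis):
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i                      # i deletions
    for j in range(len(hyp) + 1):
        dp[0][j] = j                      # j insertions
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = dp[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            dp[i][j] = min(sub, dp[i - 1][j] + 1, dp[i][j - 1] + 1)
    return dp[len(ref)][len(hyp)] / len(ref)

print(wer("it is easy to recognize speech",
          "it is easy to wreck a nice beach"))
```

For the slide's example, the minimum-cost alignment has 4 edits over 6 reference words, matching the "1 substitution, 3 insertions" count.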
Surrogate loss functions
- Negative log-likelihood:
- Hinge loss function (similar to SVM):
- And there are more...
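The formulas on this slide did not survive extraction. As a sketch, assuming a score function $g_\theta(x, y)$ over labels (the notation is an assumption, not from the slides), the two losses typically take the form:

```latex
% Negative log-likelihood (softmax cross-entropy)
\ell_{\mathrm{NLL}}(\theta; x, y)
  = -\log \frac{e^{g_\theta(x, y)}}{\sum_{y'} e^{g_\theta(x, y')}}

% Hinge loss (multiclass, SVM-style)
\ell_{\mathrm{hinge}}(\theta; x, y)
  = \max\Bigl(0,\; 1 - g_\theta(x, y) + \max_{y' \neq y} g_\theta(x, y')\Bigr)
```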
Optimization for training
The optimum is found using the (stochastic) gradient descent method:
for t = 1, 2, ...:
    pick an example uniformly at random
    compute the gradient of the loss on that example
    update the parameters in the negative gradient direction
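The loop above can be sketched on a toy least-squares problem (the data and function names are illustrative, not from the slides):

```python
import random

# Minimal stochastic gradient descent on a scalar least-squares problem.
def sgd(examples, lr=0.1, steps=2000, seed=0):
    rng = random.Random(seed)
    w = 0.0                              # single scalar parameter
    for _ in range(steps):
        x, y = rng.choice(examples)      # pick example uniformly at random
        grad = 2 * (w * x - y) * x       # gradient of the loss (w*x - y)^2
        w -= lr * grad                   # step against the gradient
    return w

data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # noiseless data with y = 2x
print(sgd(data))
```

On this data every per-example update contracts toward w = 2, so the iterates converge to the optimum.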
[Figure: an impersonator wearing an adversarial perturbation (printed eyeglass frames) is recognized as the real Milla Jovovich]
Sharif, Bhagavatula, Bauer, & Reiter (2016)
Attacking Remotely Hosted Black-Box Models
All remote classifiers were trained on MNIST data set (60,000 training examples and 10 classes)
Papernot, McDaniel, Goodfellow, Jha, Celik, & Swami (2017)
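Such black-box attacks rest on gradient-based perturbations computed on a local substitute model, which then transfer to the remote classifier. A sketch of the fast gradient sign method on a toy logistic model (weights and values here are made up for illustration):

```python
import math

# Fast gradient sign method (FGSM) on a logistic model:
# perturb each input coordinate by eps in the direction that
# increases the loss of the (substitute) model.
def fgsm(w, b, x, y, eps):
    # model: p(y=1 | x) = sigmoid(w . x + b)
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    p = 1 / (1 + math.exp(-z))
    # gradient of the log-loss with respect to the input x
    grad = [(p - y) * wi for wi in w]
    return [xi + eps * (1 if g > 0 else -1) for xi, g in zip(x, grad)]

w, b = [2.0, -1.0], 0.0
x, y = [0.5, 0.2], 1                 # correctly classified: w.x + b = 0.8 > 0
x_adv = fgsm(w, b, x, y, eps=0.5)
z_adv = sum(wi * xi for wi, xi in zip(w, x_adv)) + b
print(z_adv)                         # the perturbation flips the score's sign
```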
Main result: we showed that almost any machine-learning-based model can be attacked, including models solving complex and structured tasks.
Cisse, Adi, Neverova, & Keshet (2017)
Deep networks for structured tasks
Example: machine translation
"Machine translation is awesome!" → "機械翻訳は素晴らしいです!"
(the set of possible labels is exponentially large)
Deep networks for structured tasks
Example: speech recognition, e.g. [audio] → "音声認識" ("speech recognition")
(the set of possible labels is exponentially large)
A new loss function called Houdini
Compare to the negative log-likelihood:
Cisse, Adi, Neverova, & Keshet (2017)
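The formula on this slide did not survive extraction. As a sketch of the loss from the Houdini paper (notation hedged to match the surrogate losses above):

```latex
\ell_{\mathrm{H}}(\theta; x, y)
  = \mathbb{P}_{\gamma \sim \mathcal{N}(0,1)}
      \bigl[\, g_\theta(x, y) - g_\theta(x, \hat{y}) < \gamma \,\bigr]
    \cdot \ell(\hat{y}, y)
```

where $\hat{y}$ is the model's predicted structure and $\ell$ is the actual task loss (e.g. word error rate or intersection-over-union), so the surrogate is scaled by the true evaluation metric.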
Houdini properties I
Strong consistency: the model found using Houdini yields the infimum loss
achievable by any predictor
Cisse, Adi, Neverova, & Keshet (2017)
Houdini properties IV
The Houdini loss function
is very similar to the generalized Probit loss function:
Cisse, Adi, Neverova, & Keshet (2017)
Image segmentation: example 1
[Figure: (a) source image, (b) initial prediction, (c) adversarial prediction, (d) perturbed image, (e) noise; the attack steers the prediction from the source toward the target]
Cisse, Adi, Neverova, & Keshet (2017)
Image segmentation: example 2
[Figure: (a) source image, (b) initial prediction, (c) adversarial prediction, (d) perturbed image, (e) noise]
Cisse, Adi, Neverova, & Keshet (2017)
Image segmentation: example 3
[Figure: original semantic segmentation framework vs. the compromised framework after an adversarial attack]
(Cisse, Adi, Neverova, & Keshet, 2017)
Pose estimation (Xbox Kinect): example 2
[Figure: four attack runs]
  PCKh0.5   SSIM     distortion   iterations
  87.5      0.9802   0.2112       1000
  87.5      0.9824   0.211        1000
  100.0     0.9922   0.145        435
  100.0     0.9906   0.016        574
  perceptibility: 0.145, 0.016, 0.211, 0.210
Cisse, Adi, Neverova, & Keshet (2017)
Speech recognition (Google Voice)
Original: "if she could only see Phronsie for just one moment"
Adversarial: "if she ou down take shee throwns purhdress luon ellwon"
Cisse, Adi, Neverova, & Keshet (2017)
Original: speaker 148 Adversarial: speaker 23
Speaker verification (YOHO)
Kreuk, Adi, Cisse, & Keshet, (2017)
Watermarking Deep Neural Networks by Backdooring
Adi, Baum, Cisse, Pinkas, and Keshet (2018)
[Figure: watermark embedding in two settings: training a model "from scratch" on the training set plus the trigger set, vs. adapting a "pre-trained" model using the trigger set]
Watermarking Deep Neural Networks by Backdooring
Functionality-preserving: a model with a watermark is as accurate as a model without it.
Non-trivial ownership: an adversary cannot claim ownership of the model, even if they know the watermarking algorithm.
Unforgeability: an adversary, even when possessing several trigger-set examples and their targets, will not be able to convince a third party of ownership.
Unremovability: an adversary cannot remove the watermark, even if they know of its existence.
Adi, Baum, Cisse, Pinkas, and Keshet (2018)
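The backdoor-based scheme trains on the union of the real training set and a small trigger set with arbitrary labels; ownership is then demonstrated by the model's accuracy on the trigger set. A minimal sketch (all names and thresholds here are illustrative, not the paper's API):

```python
import random

# Trigger-set watermarking sketch: random inputs with arbitrary labels
# act as a backdoor; a watermarked model memorizes them.
def make_trigger_set(n, n_classes, dim, seed=0):
    rng = random.Random(seed)
    xs = [[rng.random() for _ in range(dim)] for _ in range(n)]
    ys = [rng.randrange(n_classes) for _ in range(n)]  # arbitrary labels
    return list(zip(xs, ys))

def verify_ownership(model, trigger_set, threshold=0.9):
    # Ownership claim succeeds if trigger-set accuracy exceeds the threshold.
    hits = sum(model(x) == y for x, y in trigger_set)
    return hits / len(trigger_set) >= threshold

trigger = make_trigger_set(n=20, n_classes=10, dim=4)
lookup = {tuple(x): y for x, y in trigger}
watermarked = lambda x: lookup.get(tuple(x), 0)  # stands in for a trained net
print(verify_ownership(watermarked, trigger))    # True
```

An unrelated model would answer the trigger queries essentially at random, so its trigger-set accuracy stays far below the threshold.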
[Figure: different vector representations of words, e.g. "computer keyboard"]
Keshet, Aviv-Reuven, and NEC (under submission)
Speech, Language and Deep Learning Lab
Moustapha Cisse
Carsten Baum
Benny Pinkas
Morna Baruch
Natalia Neverova
Assi Barak
Gabi Shalev