Stealing Machine Learning Models via Prediction APIs
Florian Tramèr, Fan Zhang, Ari Juels, Michael K. Reiter, Thomas Ristenpart
Usenix Security Symposium, Austin, Texas, USA, August 11, 2016
Transcript
Page 1: Stealing Machine Learning Models via Prediction APIs

Stealing Machine Learning Models via Prediction APIs

Florian Tramèr, Fan Zhang, Ari Juels, Michael K. Reiter, Thomas Ristenpart

Usenix Security Symposium, Austin, Texas, USA, August 11, 2016

Page 2: Stealing Machine Learning Models via Prediction APIs


Machine Learning (ML) Systems


(1) Gather labeled data

x(1), y(1)   x(2), y(2)   …

Dependent variable y; n-dimensional feature vector x

Data: Bob, Tim, Jake


(2) Train ML model f from data: f(x) = y

(3) Use f in some application or publish it for others to use

[Diagram: Training data (x, y = Bob / Tim / Jake) → Model f → Prediction with Confidence → Application]
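To make the three steps concrete, here is a minimal sketch in Python (my own illustration with scikit-learn and toy random data; the slide itself does not prescribe any library or dataset):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# (1) Gather labeled data: n-dimensional feature vectors x with labels y.
#     Random vectors stand in for face images labeled "Bob", "Tim", "Jake".
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 20))
y = rng.choice(["Bob", "Tim", "Jake"], size=300)

# (2) Train ML model f from the data so that f(x) = y (approximately).
f = LogisticRegression(max_iter=1000).fit(X, y)

# (3) Use f in an application or publish it for others to use: each query
#     returns a predicted label plus per-class confidence scores.
x = rng.normal(size=(1, 20))
print(f.predict(x)[0], f.predict_proba(x)[0])
```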

Page 3: Stealing Machine Learning Models via Prediction APIs


Machine Learning as a Service (MLaaS)

[Diagram: Data → Training API → Model f; input → Prediction API → black-box classification, $$$ per query]

Goal 1: Rich Prediction APIs
• Highly Available
• High-Precision Results

Goal 2: Model Confidentiality
• Model/Data Monetization
• Sensitive Data

Page 4: Stealing Machine Learning Models via Prediction APIs


Machine Learning as a Service (MLaaS)

Service        Model types
Amazon         Logistic regressions
Google         ??? (announced: logistic regressions, decision trees, neural networks, SVMs)
Microsoft      Logistic regressions, decision trees, neural networks, SVMs
PredictionIO   Logistic regressions, decision trees, SVMs (white-box)
BigML          Logistic regressions, decision trees

Sell Datasets, Models, and Prediction Queries to other users ($$$)

Page 5: Stealing Machine Learning Models via Prediction APIs


Model Extraction Attacks

Goal: Adversarial client learns a close approximation of f using as few queries as possible

Applications:

1) Undermine pay-for-prediction pricing model

2) Facilitate privacy attacks

3) Stepping stone to model evasion [Lowd, Meek 2005] [Srndic, Laskov 2014]

[Diagram: Attack queries x to Model f (trained on Data), observes f(x), and produces f’]

Target: f(x) = f’(x) on ≥ 99.9% of inputs
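As a concrete reading of this target, the following small sketch (my own, with hypothetical classifiers f and f’) estimates the agreement between a victim model and an extracted model on random test inputs:

```python
import numpy as np

def agreement(f, f_prime, X_test):
    """Fraction of test inputs on which the extracted model matches the target."""
    return float(np.mean([f(x) == f_prime(x) for x in X_test]))

# Hypothetical example: two linear classifiers whose thresholds differ slightly.
f       = lambda x: int(x.sum() > 0.0)
f_prime = lambda x: int(x.sum() > 0.01)
X_test = np.random.default_rng(0).normal(size=(100_000, 5))
print(agreement(f, f_prime, X_test))  # the attacks in this talk aim for >= 0.999
```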

Page 6: Stealing Machine Learning Models via Prediction APIs


Model Extraction Attacks (Prior Work)

Goal: Adversarial client learns a close approximation of f using as few queries as possible

If f(x) is just a class label: learning with membership queries
- Boolean decision trees [Kushilevitz, Mansour 1993]
- Linear models (e.g., binary regression) [Lowd, Meek 2005]

[Diagram: Attack queries x to Model f (trained on Data), observes f(x), and produces f’]

Isn’t this “just Machine Learning”? No! Prediction APIs return more information than assumed in prior work and “traditional” ML

Page 7: Stealing Machine Learning Models via Prediction APIs


Main Results

[Diagram: Attack queries x to Model f (trained on Data), observes f(x), and produces f’; an Inversion Attack x → f’(x) then runs on the extracted model]

• Logistic Regressions, Neural Networks, Decision Trees, SVMs

• Reverse-engineer model type & features

f’(x) = f(x) on 100% of inputs; 100s to 1,000s of online queries

Improved Model-Inversion Attacks [Fredrikson et al. 2015]

Page 8: Stealing Machine Learning Models via Prediction APIs


Model Extraction Example: Logistic Regression

Task: Facial Recognition of two people (binary classification)


[Diagram: Data (face images of Bob and Alice) → Model f]

Feature vectors are pixel data, e.g., n = 92 * 112 = 10,304

f(x) = 1 / (1 + e^-(w*x + b))

f maps features to the predicted probability of being “Alice”: ≤ 0.5 classify as “Bob”, > 0.5 classify as “Alice”

n+1 parameters w, b chosen using the training set to minimize the expected error

Generalize to c > 2 classes with multinomial logistic regression: f(x) = [p1, p2, …, pc]; predict the label as argmax_i p_i
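A minimal sketch of this classifier in Python/NumPy (my own illustration, not the authors' code; the parameter values are made up):

```python
import numpy as np

def f(x, w, b):
    """Binary logistic regression: predicted probability of "Alice"."""
    return 1.0 / (1.0 + np.exp(-(w @ x + b)))

def classify(x, w, b):
    # Threshold at 0.5, as on the slide: <= 0.5 -> "Bob", > 0.5 -> "Alice".
    return "Alice" if f(x, w, b) > 0.5 else "Bob"

n = 92 * 112                            # 10,304 pixel features
rng = np.random.default_rng(0)
w, b = rng.normal(size=n) * 1e-3, 0.0   # hypothetical trained parameters (n+1 of them)
x = rng.uniform(0, 1, size=n)           # hypothetical pixel vector
print(classify(x, w, b), f(x, w, b))
```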

Page 9: Stealing Machine Learning Models via Prediction APIs


Model Extraction Example: Logistic Regression

Goal: Adversarial client learns a close approximation of f using as few queries as possible

Attack: since f(x) = 1 / (1 + e^-(w*x + b)), each query gives

ln( f(x) / (1 - f(x)) ) = w*x + b

a linear equation in the n+1 unknowns w, b


[Slide diagram: training data (Alice, Bob, ...) is used to train Model f at the service; the attacker sends queries x, observes f(x), and builds a local model f'.]

f(x) = f'(x) on 100% of inputs

Query n+1 random points ⇒ solve a linear system of n+1 equations (sketch below)
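As a concrete illustration of this idea, here is a minimal NumPy sketch in which a synthetic linear model stands in for the prediction API (the feature count, the weights, and the query function are assumptions for illustration): n+1 random queries give n+1 linear equations, which recover every parameter exactly.

```python
import numpy as np

# Synthetic stand-in for the target: a linear model f(x) = w.x + b whose
# parameters the attacker does not know (illustrative assumption).
rng = np.random.default_rng(0)
n = 5                                      # number of input features
w_true, b_true = rng.normal(size=n), rng.normal()
query = lambda x: x @ w_true + b_true      # plays the role of the prediction API

# Query n+1 random points; each response gives one linear equation
# [x_i, 1] . [w; b] = f(x_i) in the n+1 unknown parameters.
X = rng.normal(size=(n + 1, n))
y = np.array([query(x) for x in X])
A = np.hstack([X, np.ones((n + 1, 1))])
params = np.linalg.solve(A, y)
w_hat, b_hat = params[:-1], params[-1]

# The extracted f' matches f everywhere, not just on the queried points.
print(np.allclose(w_hat, w_true), np.isclose(b_hat, b_true))
```

For a logistic regression that returns confidence values, the same reduction applies after inverting the sigmoid on each returned confidence, which again yields one linear equation per query.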


Generic Equation-Solving Attacks

[Slide diagram: the MLaaS service hosts model f with k parameters W; the attacker sends random inputs X and receives as outputs Y the confidence values [f_1(x), f_2(x), ..., f_c(x)] ∈ [0, 1]^c, from which it builds f'.]

• Solve a non-linear equation system in the weights W (sketch below)
  - Optimization problem + gradient descent
  - "Noiseless Machine Learning"
• Multinomial Regressions & Deep Neural Networks:
  - >99.9% agreement between f and f'
  - ≈1 query per model parameter of f
  - 100s to 1,000s of queries / seconds to minutes
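When the confidences are a non-linear function of the weights, the "noiseless machine learning" view amounts to fitting f' so that it reproduces the observed confidence values exactly. Below is a minimal sketch for a single logistic output, using a generic optimizer on the (convex) logistic loss; the synthetic target, its dimensions, and the optimizer choice are assumptions for illustration, and multinomial regressions or deep networks would substitute their own loss and gradient-descent loop.

```python
import numpy as np
from scipy.optimize import minimize

# Synthetic target standing in for the MLaaS model (illustrative assumption):
# the API returns the confidence value sigma(w.x + b) for each query x.
rng = np.random.default_rng(1)
n = 10
w_true, b_true = rng.normal(size=n), rng.normal()
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
api = lambda X: sigmoid(X @ w_true + b_true)

X = rng.normal(size=(n + 1, n))        # roughly one query per model parameter
p = api(X)                             # observed confidence values

def loss(theta):
    # Logistic loss with the observed confidences as soft targets; it is
    # minimized exactly when f' reproduces every returned confidence.
    q = sigmoid(X @ theta[:n] + theta[n])
    return -np.sum(p * np.log(q) + (1 - p) * np.log(1 - q))

theta = minimize(loss, np.zeros(n + 1), method="BFGS").x
w_hat, b_hat = theta[:n], theta[n]
print(np.max(np.abs(api(X) - sigmoid(X @ w_hat + b_hat))))  # ~0: f' agrees with f
```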


MLaaS: A Closer Look

[Slide diagram: Data → Training API → Model f; Prediction API: input x → f(x).]

Prediction API (example below):
- Class labels and confidence scores
- Support for partial inputs

Training API:
- ML model type selection: logistic or linear regression
- Feature extraction (automated and partially documented)
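For concreteness, here is a minimal mock-up of the kind of request/response pair such a prediction API exposes; the field names, feature values, and classes are purely hypothetical and do not reflect any specific provider's schema.

```python
# Hypothetical prediction-API exchange (all names and values are illustrative).
request = {
    # Partial input: only two of the model's features are supplied.
    "record": {"age": 29, "education": "Bachelors"},
}

response = {
    "label": "<=50K",                              # predicted class label
    "confidences": {"<=50K": 0.83, ">50K": 0.17},  # per-class confidence scores
}
```

The extraction attacks in this talk only need this interface: choose a (possibly partial) record, read back the label and the confidence scores, and repeat.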


Online Attack: AWS Machine Learning

[Pipeline: input → Feature Extraction: Quantile Binning + One-Hot Encoding → Model Choice: Logistic Regression → prediction]

Feature extraction and model choice reverse-engineered with partial queries and confidence scores ("extract-and-test", sketch below)

Model                 Online Queries   Time (s)   Price ($)
Handwritten Digits    650              70         0.07
Adult Census          1,485            149        0.15

Extracted model f' agrees with f on 100% of tested inputs
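To show the kind of feature extraction being reverse-engineered, here is a small sketch of quantile binning followed by one-hot encoding; the bin count, the sample values, and the probing remark at the end are illustrative assumptions rather than AWS's documented behavior.

```python
import numpy as np

def quantile_bin_edges(train_values, n_bins=4):
    # Bin boundaries placed at empirical quantiles of the training data.
    qs = np.linspace(0, 1, n_bins + 1)[1:-1]
    return np.quantile(train_values, qs)

def one_hot_bin(x, edges):
    # Map a raw numeric value to the one-hot vector of its quantile bin.
    vec = np.zeros(len(edges) + 1)
    vec[np.searchsorted(edges, x)] = 1.0
    return vec

train_ages = np.array([18, 22, 25, 31, 38, 45, 52, 67])
edges = quantile_bin_edges(train_ages)      # three internal boundaries
print(edges, one_hot_bin(29, edges))        # raw value -> binned one-hot input

# "Extract-and-test": by sweeping one feature in partial queries and watching
# where the returned confidence score jumps, the attacker can locate these
# bin boundaries and recover the encoding before solving for the weights.
```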


Application: Model-Inversion Attacks
Infer training data from trained models [Fredrikson et al. – 2015]

[Slide diagram: training samples of 40 individuals → Data → Multinomial LR Model f. Extraction attack: query x, observe f(x), and build f' with f(x) = f'(x) for >99.9% of inputs; the extraction is performed once (×1). Inversion attack: run white-box against the local f' (x → f'(x)) for each of the 40 individuals (×40); the attack recovers an image of one individual per run.]

Strategy                                   Attack against 1 individual      Attack against all 40 individuals
                                           Online Queries   Attack Time     Online Queries   Attack Time
Black-Box Inversion [Fredrikson et al.]    20,600           24 min          800,000          16 hours
Extract-and-Invert (our work)              41,000           10 hours        41,000           10 hours
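A minimal sketch of the white-box inversion step run locally against the extracted multinomial LR model f': gradient ascent on the input to maximize the target class's confidence, in the spirit of Fredrikson et al. The weight shapes, step size, and iteration count are illustrative assumptions.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def invert(W, b, target, steps=5000, lr=0.1):
    """Gradient ascent on the input pixels to maximize f'_target(x)."""
    x = np.zeros(W.shape[1])                    # start from a blank image
    for _ in range(steps):
        p = softmax(W @ x + b)
        grad = W[target] - p @ W                # d/dx of log softmax_target(W x + b)
        x = np.clip(x + lr * grad, 0.0, 1.0)    # keep pixel values in [0, 1]
    return x

# Run once per class against the local f'; no further API queries are needed,
# which is why extract-and-invert costs the same for 1 or for all 40 people.
```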


Extracting a Decision Tree

Kushilevitz-Mansour (1992)
• Poly-time algorithm with membership queries only
• Only for Boolean trees, impractical complexity

(Ab)using Confidence Values
• Assumption: all tree leaves have unique confidence values (the confidence value is derived from the class distribution in the training set)
• Reconstruct tree decisions with "differential testing" (sketch below): query inputs x and x' that differ in a single feature (value v vs. v'); different leaves are reached ⇔ the tree "splits" on this feature
• Online attacks on BigML
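A sketch of the differential-testing step under the slide's assumption that every leaf returns a distinct confidence value; the `query` oracle and the candidate values to try are placeholders, not a specific BigML call.

```python
def splits_on_feature(query, x, feature, candidate_values):
    """Check whether the tree splits on `feature` along the path reached by x.

    `query(record)` is assumed to return (class_label, confidence); the
    confidence value acts as a pseudo-identifier for the leaf reached.
    """
    _, leaf_id = query(x)
    for v in candidate_values:
        x_prime = dict(x)            # x and x' differ in a single feature
        x_prime[feature] = v
        _, other_leaf = query(x_prime)
        if other_leaf != leaf_id:    # different leaves reached <=> tree splits here
            return True
    return False
```

Repeating this test feature by feature, and recursing on the values that change the leaf reached, reconstructs the tree's decisions top-down.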


Countermeasures

How to prevent extraction? API Minimization
• Prediction = class label only (return the prediction f(x) = y, but no confidence value)
• Learning with Membership Queries

Attack on Linear Classifiers [Lowd, Meek – 2005]
f(x) = sign(w*x + b): classify as "+" if w*x + b > 0 and "-" otherwise (n+1 parameters w, b)
1. Find points on the decision boundary (w*x + b = 0)
   - Find a "+" and a "-"
   - Line search between the two points (sketch below)
2. Reconstruct w and b (up to a scaling factor)
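A minimal sketch of step 1, the line search for a boundary point using class labels only; the label oracle and the tolerance are illustrative assumptions.

```python
import numpy as np

def boundary_point(label, x_pos, x_neg, tol=1e-9):
    """Binary search on the segment between a '+' and a '-' point for a point
    lying (up to tol) on the decision boundary w*x + b = 0, using only the
    class labels returned by `label`."""
    lo, hi = 0.0, 1.0                  # x_neg at t=0 is '-', x_pos at t=1 is '+'
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if label(x_neg + mid * (x_pos - x_neg)) == "+":
            hi = mid                   # boundary lies in the lower half
        else:
            lo = mid                   # boundary lies in the upper half
    return x_neg + hi * (x_pos - x_neg)
```

Collecting n+1 such boundary points x_0, ..., x_n and solving w*(x_i - x_0) = 0 for i = 1..n recovers w up to a scaling factor, after which b = -w*x_0 (step 2).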


Generic Model Retraining Attacks

• Extend the Lowd-Meek approach to non-linear models
• Active Learning (sketch below):
  - Query more points close to the "decision boundary"
  - Update f' to fit these points
• Multinomial Regressions, Neural Networks, SVMs:
  - >99% agreement between f and f'
  - ≈100 queries per model parameter of f
  ⇒ ≈100× less efficient than equation-solving
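A compact sketch of the retraining loop with uncertainty sampling, using scikit-learn and a synthetic label-only oracle in place of the real API; the query budget per round, the candidate pool, and the model class are illustrative choices.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)
n = 20
w_true = rng.normal(size=n)
oracle = lambda X: (X @ w_true > 0).astype(int)      # label-only target f

X = rng.normal(size=(50, n))                         # initial random queries
y = oracle(X)
f_prime = LogisticRegression(max_iter=1000).fit(X, y)

for _ in range(10):                                  # active-learning rounds
    cand = rng.normal(size=(2000, n))
    margin = np.abs(cand @ f_prime.coef_.ravel() + f_prime.intercept_[0])
    picks = cand[np.argsort(margin)[:50]]            # closest to f' boundary
    X = np.vstack([X, picks])
    y = np.concatenate([y, oracle(picks)])
    f_prime = LogisticRegression(max_iter=1000).fit(X, y)

test = rng.normal(size=(10000, n))
print((f_prime.predict(test) == oracle(test)).mean())  # agreement of f' with f
```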


Conclusion

Rich prediction APIs vs. model & data confidentiality

Efficient Model-Extraction Attacks
• Logistic Regressions, Neural Networks, Decision Trees, SVMs
• Reverse-engineering of model type, feature extractors
• Active learning attacks in membership-query setting

Applications
• Sidestep model monetization
• Boost other attacks: privacy breaches, model evasion

Thanks! Find out more: https://github.com/ftramer/Steal-ML

