arXiv:1710.01727v3 [cs.CV] 11 Oct 2017
Privacy-Preserving Deep Inference for Rich User
Data on The Cloud
Seyed Ali Osia ♯, Ali Shahin Shamsabadi ♯, Ali Taheri ♯, Kleomenis Katevas ⋆,
Hamid R. Rabiee ♯, Nicholas D. Lane †, Hamed Haddadi ⋆
♯ Sharif University of Technology
⋆ Queen Mary University of London
† Nokia Bell Labs & University of Oxford
Abstract: Deep neural networks are increasingly being used in a variety of machine learning applications applied to rich user data on the cloud. However, this approach introduces a number of privacy and efficiency challenges, as the cloud operator can perform secondary inferences on the available data. Recently, advances in edge processing have paved the way for more efficient, and private, data processing at the source for simple tasks and lighter models, though they remain a challenge for larger and more complicated models. In this paper, we present a hybrid approach for breaking down large, complex deep models for cooperative, privacy-preserving analytics. We do this by breaking down popular deep architectures and fine-tuning them in a particular way. We then evaluate the privacy benefits of this approach based on the information exposed to the cloud service. We also assess the local inference cost of different layers on a modern handset for mobile applications. Our evaluations show that by using certain kinds of fine-tuning and embedding techniques, and at a small processing cost, we can greatly reduce the level of information available to unintended tasks applied to the data features on the cloud, and hence achieve the desired tradeoff between privacy and performance.
I. INTRODUCTION
The increasing availability of connected devices such as
smartphones and cameras has made them an essential and
inseparable part of our daily lives. The majority of these
devices collect various forms of data and transfer it to the cloud
in order to benefit from cloud-based data mining services
like recommendation systems, targeted advertising, security
surveillance, health monitoring and urban planning. Many of
these applications are free, relying on information harvesting
from their users’ personal data. This practice raises a number of
privacy concerns and resource impacts for the users [1], [2].
Preserving individuals’ privacy and enabling detailed data
analytics pull in opposite directions in this space. Cloud-based
machine learning algorithms can provide beneficial or interesting
services (e.g., video editing tools or health apps); however, their
reliance on excessive data collection from the users can have
consequences which are unknown to the user (e.g., face recognition
for targeted social advertising).
While complete data offloading to a cloud provider can
have immediate or future potential privacy risks [3], [4], tech-
niques relying on performing complete analytics at the user
end (on-premise solution), or encryption-based methods, also
come with their own resource limitations and user experience
penalties (see Section VII for detailed discussions). Apart
from the resource considerations, an analytics service or an
app provider might not be keen on sharing their valuable
and highly tuned models. Hence, it is not always possible
to assume local processing (e.g., a deep learning model on
a smartphone) is a viable solution even if the task duration,
memory and processing requirements are not important for the
user, or tasks can be performed when the user is not actively
using their device (e.g., while the device is being charged
overnight).
In this paper, we focus on achieving a compromise between
resource-hungry local analytics and privacy-invasive cloud-based
services. We design and evaluate a hybrid architecture
where the local device and the cloud system collaborate on
completing the inference¹ task. In this way, we can augment
the local device to benefit from the cloud processing efficiency
while addressing the privacy concerns. We concentrate on data
mining applications where in order to get certain services from
a provider, sending the data to the cloud is inevitable. As a
specific exemplar of this general class of services, we consider
image processing applications using deep learning. We address
the challenge of performing certain approved image analytics
in the cloud, without disclosing important information which
could lead to other inferences such as identity leak via face
recognition.
As an exemplar use case for this paper, we consider a
case where we wish to enable specific inference tasks such
as gender classification or emotion detection on face images,
while protecting against a privacy-invasive task such as face
recognition by a cloud operator having access to rich training
data and pre-trained models (e.g., Google and Facebook).
Convolutional Neural Networks (CNNs) are one of the most
powerful instances of deep neural networks for image
analysis [5], [6], [7], and we use them to build accurate gender
and emotion predictor models. We fine-tune these models
with our suggested architecture, which provides identity
privacy while keeping them accurate (as shown previously
in [8]). We perform our evaluations on smartphones,
but the approach can also be extended to other devices with
limited memory and processing capabilities, e.g., Raspberry Pi and
edge devices. A number of works [9], [10], [11] address
the problem of using deep models on smartphones. However,
using complex and accurate models on smartphones requires
significant processing and memory resources, and our solution
could substantially improve their efficiency.
¹ In this paper, by inference we mean applying a pre-trained deep model on an input to obtain the output, which is different from statistical inference.
Our approach relies on optimizing the layer separation of
pre-trained deep models. Primary layers are held on the user
device and the secondary ones on the cloud. In this way, the
inference task starts by applying the primary layers as the
feature extractor on the user device, continues by sending
the resulting features to the cloud, and ends by applying the
secondary analyzing layers in the cloud. We demonstrate that our
proposed solution does not have the overhead of executing
the whole deep model on the user device, while it will be
favored by a cloud provider, as the user does not have access
to their complete model and part of the inference must be
done on the cloud. We introduce a method to manipulate the
extracted features (from the primary layers) in a way that
irrelevant extra information cannot leak, hence addressing the
privacy challenges of cloud solutions. To do this, we alter the
training phase by applying a Siamese network [12] in a specific
manner, and by employing a dimensionality reduction and
noise addition mechanism for increased privacy.
In Section IV, we use three methods to quantify the privacy
guarantees of our approach. The first is to use transfer learning
[13], which shows that face recognition is impractical even
when using state-of-the-art models. The second is to
use deep visualization techniques, which try to reconstruct
the input image using only the feature extracted at the
intermediate layer [14]. Finally, we introduce a new metric
for privacy measurement, which is an extension of the optimal
Bayes error. We also implement our model on a smartphone and
compare the fully on-premise solution with the hybrid solution
presented in this paper.²
Our main contributions in this paper include:
• Proposing a learning framework for privacy-preserving
analytics on a cloud system and embedding deep net-
works on it;
• Developing a new technique for training deep models
based on the Siamese architecture, which enables privacy
at the point of offloading to the cloud;
• Performing evaluation of this framework across two
common deep models: VGG-16 and VGG-S and two
applications: gender classification and emotion detection
in a way that we preserve the privacy relating to face
recognition.
² Our code is available at https://github.com/aliosia/DeepPrivInf2017
II. HYBRID FRAMEWORK
In this section, we present a hybrid framework for
privacy-preserving analytics. Suppose we want to utilize a cloud
service to infer some primary information of interest (e.g., gender,
age or emotion in video footage or images), and at the same time
we want to prevent the exposure of sensitive information (e.g.,
identity) to the cloud provider. Hence, the data shared with
the cloud service should possess two important properties: (i)
inferring the primary information is possible; and (ii) deducing
the sensitive information is not possible. The only way to
build such data is to process the raw data on the client side and
extract a rich feature with these properties. We can then transfer
this feature to the cloud for further processing, without the initial
privacy concerns. Hence, we can consider a hybrid framework
in which the user and the service provider cooperate with each
other. Figure 1 presents an overview of this framework.
Fig. 1: Hybrid privacy-preserving framework.
We can break down the analytics process into feature extraction
and analysis:
• Feature Extractor: This module takes the raw input data,
processes it, and outputs a rich feature vector which needs
to keep the primary information while protecting
the sensitive information. Usually, these two objectives
are contradictory, i.e., decreasing the sensitive information
causes a decrease in the primary information too.
Additionally, due to the limitations of client-side processing,
the feature extraction task needs to impose a minimal burden;
consequently, designing the feature extractor is the
most challenging task.
• Analyzer: This module takes the intermediate features
generated by the feature extractor as its input and
analyzes them. In practice, this module can be any ordinary
classifier, since the privacy of the intermediate features is
ensured by the first module.
We also need a protocol between the service provider
and the user to establish this framework. Suppose the service
provider knows which user information is primary (e.g. gender)
and which is sensitive (e.g. identity). The feature
extractor can then be designed by the service provider and handed
over to the client. This feature extractor is meant to take the
user’s primary and sensitive information into account simultaneously.
Demonstrating that the primary information is kept in the
features can be done by showing the accuracy of the Analyzer.
The service provider should also define a verification method
for the privacy preservation; different methods for doing this
are discussed in Section IV.
Our framework is generic and can be used for any privacy-
preserving learning problem. In Section III, we explain how
to embed feedforward neural networks in this framework.
(a) Training simple embedding.
(b) Using simple embedding. Intermediate feature is passed through communication channel.
Fig. 2: Simple embedding of a deep network.
III. DEEP-PRIV EMBEDDING
Due to the increasing popularity of deep models in analytics
applications, in this section we address how to embed
an existing deep model that infers primary information (e.g.
predicting gender or emotion) in the proposed framework.
Complex deep networks consist of many layers, which can be
embedded in this framework using a layer separation mechanism.
First, we choose an intermediate layer of the deep
network; then we store the layers before it on the
client device as the feature extractor, and the layers after it
in the cloud server as the classifier (see Figure 1). Choosing
the intermediate layer from the higher layers of the network
has intrinsic privacy implications. In [15], the
authors reconstruct an original image from each layer, and the
accuracy of the reconstruction decreases for higher layers.
As we go up through the deep network layers, the features get
more specific to the primary information [13] and irrelevant
information (including sensitive information) is gradually
lost. Hence, by using the layer separation mechanism, we
achieve two important objectives simultaneously: (i) we obtain
the feature extractor easily, and (ii) we benefit from the
intrinsic characteristics of deep models. This approach satisfies
the initial criteria we set for our proposed framework. In this
paper, we refer to this embedding as the simple embedding.
The training and test phases of this embedding can be seen in
Figure 2.
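The layer separation behind the simple embedding can be sketched in a few lines. The following is only an illustrative toy, assuming a small fully connected network with random weights in place of a real CNN; the helper names `run_layers` and `split_model` are hypothetical and not taken from the authors' code. It shows that running the first i layers on the client and the rest in the cloud gives the same output as running the whole model in one place.

```python
import numpy as np

def run_layers(x, layers):
    """Apply a list of dense (W, b) layers, each followed by ReLU."""
    for W, b in layers:
        x = np.maximum(x @ W + b, 0.0)
    return x

def split_model(layers, i):
    """Keep the first i layers on the client and the rest in the cloud."""
    return layers[:i], layers[i:]

# Toy stand-in for a pre-trained model (random weights, assumption for the demo).
rng = np.random.default_rng(0)
sizes = [16, 12, 8, 4]
layers = [(rng.normal(size=(m, n)), rng.normal(size=n))
          for m, n in zip(sizes[:-1], sizes[1:])]

client_layers, cloud_layers = split_model(layers, 2)

x = rng.normal(size=(5, 16))               # raw input, never leaves the device
feature = run_layers(x, client_layers)     # intermediate feature, sent to the cloud
output = run_layers(feature, cloud_layers) # computed by the service provider
full = run_layers(x, layers)               # reference: whole model in one place
```

Only `feature` crosses the communication channel, so the choice of the split index controls both the client-side cost and what the cloud can observe.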
Moreover, experiments show that the accuracy of the primary
classification does not decrease when we reduce the dimension
of the intermediate feature with Principal Component
Analysis (PCA). This can improve privacy due to the intrinsic
characteristics of dimensionality reduction. It can also greatly
reduce the communication overhead between the client and
the server. We refer to this embedding (with PCA applied) as the
reduced simple embedding.
(a) Training advanced embedding with Siamese architecture. Weights connected by dashed lines are equal.
(b) Using advanced embedding (with PCA projection and noise addition in client side and reconstruction and analyzing in server side).
Fig. 3: Advanced embedding of a deep network
An important challenge with deep models in privacy
applications is that they learn invariant general features which
are not specific to the target task [16]. This characteristic of
deep networks adversely affects their privacy. The solution
is to manipulate the intermediate feature so that it is specialized
for the primary variable and the sensitive variable becomes
unpredictable. One way to do this is to have a many-to-one
mapping for the sensitive variable. This is the main idea
behind k-anonymity [17], assuming that identity is the sensitive
variable. As an example, suppose k different male images are
mapped to one point in the feature space. Given this feature,
an attacker cannot distinguish between the k possible identities.
We use the Siamese architecture [12] to accomplish this task
as far as possible. To the best of our knowledge, this is the
first time that the Siamese architecture is used as a privacy
preservation technique. Fine-tuning with the Siamese architecture
results in a feature space where objects with the same primary
class cluster together. Due to this transformation, the borders
between the sensitive variable classes fade; consequently,
classifying the sensitive variable becomes harder or even impossible,
while the primary information is not affected. We refer to this
embedding, where Siamese fine-tuning is applied, as the Siamese
embedding. In addition, we can reduce the dimension
of the intermediate feature without a loss in accuracy; we refer to
this embedding method as the reduced Siamese embedding.
(a) Traditional Siamese arch. (b) Siamese arch. for privacy
Fig. 4: Siamese architecture usage
Another method which increases client privacy and the
inference uncertainty of unauthorized tasks is noise addition. A
service provider can determine a noise addition strategy for its
clients in order to increase the uncertainty of other undesired
tasks. We use the term noisy embedding whenever noise
addition is used within the feature extractor. We also refer to the
noisy reduced Siamese embedding as the advanced embedding.
The advanced embedding, which combines Siamese fine-tuning,
dimensionality reduction and noise addition, is shown in
Figure 3. In the feature extractor module of the advanced
embedding, the following steps should be taken:
• Applying primary layers.
• Reducing the dimensionality.
• Adding noise.
The analyzer module should also do these steps:
• Reconstructing the feature vector.
• Applying remaining layers.
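The two step lists above can be read as a pair of functions, one per module. The sketch below is a hedged illustration rather than the authors' implementation: `primary` and `remaining` are hypothetical callables standing in for the two halves of a fine-tuned network, `components` is assumed to be a (k, d) matrix with orthonormal rows (for example, PCA directions), and the noise is assumed to be symmetric Gaussian.

```python
import numpy as np

def feature_extractor(x, primary, components, mean, sigma, rng):
    """Client side: primary layers, dimensionality reduction, noise addition."""
    f = primary(x)                                    # apply primary layers
    r = (f - mean) @ components.T                     # project to k dimensions
    return r + rng.normal(scale=sigma, size=r.shape)  # add Gaussian noise

def analyzer(z, remaining, components, mean):
    """Server side: reconstruct the feature vector, apply remaining layers."""
    f_hat = z @ components + mean                     # back to d dimensions
    return remaining(f_hat)

# Toy stand-ins (assumptions for the demo, not a trained model).
rng = np.random.default_rng(0)
d, k = 8, 3
W1 = rng.normal(size=(10, d))
W2 = rng.normal(size=(d, 2))
primary = lambda x: np.maximum(x @ W1, 0.0)
remaining = lambda f: f @ W2
components = np.linalg.qr(rng.normal(size=(d, k)))[0].T  # (k, d), orthonormal rows
mean = np.zeros(d)

x = rng.normal(size=(4, 10))
z = feature_extractor(x, primary, components, mean, sigma=0.5, rng=rng)
scores = analyzer(z, remaining, components, mean)
```

Only the k-dimensional noisy vector `z` is transmitted, which is where both the privacy gain and the communication saving come from.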
In what follows, we discuss our Siamese fine-tuning,
dimensionality reduction and noise addition strategy in detail.
A. Siamese Fine-tuning
The Siamese architecture has previously been used in
verification applications [12]. It provides us with a feature space
where similarity between data points is defined by their
Euclidean distance. The main idea of training with the Siamese
architecture is to force the representations of two similar points
to be near each other, and the representations of two
dissimilar points to be far apart. In order to do this, our training
dataset should consist of pairs of similar and dissimilar points.
For a pair of points, one function is applied to both of them and
the distance between their outputs is computed. A contrastive loss
function should be defined so that this distance is maximized
for two dissimilar points and minimized for two similar
points. One such loss function is defined in [18]
and we use it in our application:
$$L(f_1, f_2) = \begin{cases} \|f_1 - f_2\|_2^2 & \text{similar} \\ \max(0, \text{margin} - \|f_1 - f_2\|_2)^2 & \text{dissimilar} \end{cases} \tag{1}$$
where f1 and f2 are the mappings of the two data points. The
traditional Siamese architecture is presented in Figure 4a. As
an example, consider the face verification application. We
want to determine whether two images belong to the same
person or not. We should prepare a dataset consisting of pairs of
face images, some of which are similar (belonging to the same
person) and some of which are not. Then, by using a convolutional
neural network as a feature extractor and imposing a contrastive loss
function, we can train a similarity metric between face images.
Fig. 5: The effect of noise addition on the two dimensional
feature space. Blue and green points show the objects of the
first and second primary class, respectively. z1 and z2 have
been obtained by adding noise to x1 and x2, respectively.
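As a small illustration of the contrastive loss in Eq. (1), the function below evaluates it for a single pair of feature vectors. It is a minimal sketch: the default margin of 1.0 is an arbitrary assumption, and a real training setup would compute this over batches inside a deep learning framework.

```python
import numpy as np

def contrastive_loss(f1, f2, similar, margin=1.0):
    """Eq. (1): squared distance for similar pairs,
    squared hinge on (margin - distance) for dissimilar pairs."""
    d = np.linalg.norm(np.asarray(f1, dtype=float) - np.asarray(f2, dtype=float))
    if similar:
        return d ** 2
    return max(0.0, margin - d) ** 2

a = np.array([0.0, 0.0])
b = np.array([3.0, 4.0])                            # Euclidean distance 5 from a

loss_sim = contrastive_loss(a, b, similar=True)     # penalizes the distance
loss_dis = contrastive_loss(a, b, similar=False)    # already beyond the margin
loss_close = contrastive_loss(a, b, similar=False, margin=6.0)
```

Similar pairs are always pulled together, while dissimilar pairs only contribute to the loss when they are closer than the margin.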
How can we use this architecture in a privacy-preserving
analytics application? As we said before, we have a pre-trained
deep network which predicts the primary variable. How
can we make one of its intermediate layers private with respect to
the sensitive information? Our proposed solution is to define a
contrastive loss on the intermediate layer and build a multi-objective
optimization problem, which tries to increase both the privacy of
the sensitive variable and the accuracy of the primary variable
prediction. We just need to properly define similarity between
pairs of input data.
Assuming the sensitive variable is identity and the primary
variable is gender (or emotion), we can define two face images
with the same gender (or emotion) and different identities as
similar. In this way, we try to map different identities with
the same gender to the same point, which gives us privacy.
The architecture of this network is presented in Figure 4b.
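The similarity definition above can be turned into a pair-labelling routine. The sketch below is one possible reading, not the authors' procedure: the text only defines which pairs are similar, so treating pairs with different primary classes as dissimilar, and skipping pairs that share both identity and primary class, are assumptions made here for illustration. The function name `label_pairs` is hypothetical.

```python
from itertools import combinations

def label_pairs(samples):
    """samples: list of (identity, primary_class) tuples.
    Returns (index_a, index_b, similar) triples for Siamese fine-tuning."""
    pairs = []
    for (i, (id_a, prim_a)), (j, (id_b, prim_b)) in combinations(enumerate(samples), 2):
        if prim_a == prim_b and id_a != id_b:
            pairs.append((i, j, True))    # same primary class, different identity
        elif prim_a != prim_b:
            pairs.append((i, j, False))   # assumption: different primary class
        # assumption: same identity and same primary class is left out
    return pairs

samples = [("alice", "F"), ("bob", "M"), ("carol", "F"), ("alice", "F")]
pairs = label_pairs(samples)
```

With this labelling, the contrastive loss pulls different people of the same gender together, which is the many-to-one mapping the privacy architecture aims for.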
B. Dimensionality Reduction
In order to increase privacy and decrease communication
cost, the service provider can reduce the dimensionality of
the intermediate feature by applying PCA or an auto-encoder.
In this way, the last layer of the feature extractor and the
first layer of the analyzer are a dense reduction matrix and a
dense reconstruction matrix, respectively. As we will show in
Section VI, this procedure does not significantly affect the
primary task accuracy.
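For the PCA case, the reduction and reconstruction matrices can be obtained from a singular value decomposition of the centered features. This is a hedged sketch with hypothetical helper names (`fit_pca`, `reduce_dim`, `reconstruct`) and random data in place of real intermediate features; it is not taken from the paper's code.

```python
import numpy as np

def fit_pca(features, k):
    """Return the feature mean and the top-k principal directions, shape (k, d)."""
    mean = features.mean(axis=0)
    _, _, vt = np.linalg.svd(features - mean, full_matrices=False)
    return mean, vt[:k]

def reduce_dim(f, mean, components):
    """Dense reduction: last layer of the feature extractor."""
    return (f - mean) @ components.T

def reconstruct(r, mean, components):
    """Dense reconstruction: first layer of the analyzer."""
    return r @ components + mean

rng = np.random.default_rng(0)
features = rng.normal(size=(500, 64))      # stand-in for intermediate features
mean, components = fit_pca(features, k=8)

reduced = reduce_dim(features, mean, components)         # what the client sends
approx = reconstruct(reduced, mean, components)          # what the server rebuilds
reconstruction_error = float(np.mean((features - approx) ** 2))
```

The client transmits 8 numbers per sample instead of 64 in this toy setting; how much information the discarded directions carried depends on the actual feature distribution.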
C. Noise Addition
After the dimensionality reduction, we can add
multidimensional noise to the feature vector to further increase
privacy. Siamese fine-tuning tries to map objects
with different sensitive classes (e.g. identity) to the same point,
while in practice these points may still have small distances from
each other. We can greatly increase the uncertainty about the
sensitive variable by adding random noise to the feature (see x1 and
z1 in Fig. 5). However, this may decrease the accuracy
of the primary variable prediction (see x2 and z2 in Fig. 5).
Thus, we face a trade-off between privacy and accuracy as we
increase the amount of noise. Siamese fine-tuning makes this
trade-off significantly better than in the noisy reduced simple
embedding, which lacks this fine-tuning. The
reason is that Siamese fine-tuning decreases the intra-class
variance and increases the inter-class variance of
the primary variable. Experiments in Section VI
confirm this conclusion by testing different variances for a
multi-dimensional symmetric Gaussian noise and observing
the trade-off.
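The accuracy side of this trade-off can be illustrated on a toy two-class feature space like the one in Fig. 5. The sketch below assumes symmetric Gaussian noise and uses a nearest-centroid rule as a stand-in for the analyzer; the centroids, sample count and noise levels are arbitrary choices for the demonstration, not values from the paper.

```python
import numpy as np

def add_noise(features, sigma, rng):
    """Add symmetric (isotropic) Gaussian noise with standard deviation sigma."""
    return features + rng.normal(0.0, sigma, size=features.shape)

def centroid_accuracy(z, labels, centroids):
    """Primary-task accuracy of a nearest-centroid classifier on noisy points."""
    dist = np.linalg.norm(z[:, None, :] - centroids[None, :, :], axis=2)
    return float(np.mean(dist.argmin(axis=1) == labels))

rng = np.random.default_rng(0)
centroids = np.array([[-1.0, 0.0], [1.0, 0.0]])   # two primary classes
labels = rng.integers(0, 2, size=2000)
x = centroids[labels]                             # idealized tight clusters

# Sweep the noise level and record primary-task accuracy at each setting.
sweep = {sigma: centroid_accuracy(add_noise(x, sigma, rng), labels, centroids)
         for sigma in (0.1, 0.5, 1.0, 2.0, 5.0)}
```

Tighter and better separated primary-class clusters tolerate more noise before accuracy drops, which is the intuition behind combining noise addition with Siamese fine-tuning.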
IV. PRIVACY MEASUREMENT
In this section, we introduce three different ways to evaluate
the privacy of the feature extractor:
1) The transfer learning approach [13] can be used to determine
the degree of generality and specificity of the extracted
features.
2) Deep visualization [14] evaluates the capability of
reconstructing the input image.
3) Probabilistic modelling of the sensitive variable can
also be helpful for defining a metric for privacy.
In the following, we discuss each of these methods.
A. Transfer Learning
We can measure the amount of specificity of the extracted
feature to the primary task by using transfer learning [13].
Suppose we have a trained network N1 for primary classifica-
tion (Figure 6a). We build and train network N2 for sensitive
variable inference (Figure 6b) with the following procedure:
• Copy weights from the first i layers of N1 to the first i
layers of N2;
• Initialize the remaining layers of N2 randomly
(Figure 6c);
• Freeze the first i layers of N2 (do not update their
weights);
• Train N2 for sensitive variable inference (Figure 6d).
After the training procedure, the accuracy obtained for
sensitive variable prediction is directly related to the degree
of specificity or generality of the feature extracted from the i’th
layer. The lower the accuracy obtained for sensitive variable
prediction, the more specific the feature is to the primary task.
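The four-step procedure above can be mimicked on a toy network. The following sketch makes several simplifying assumptions that are not in the paper: the networks are small bias-free dense layers with ReLU, the randomly initialized "remaining layers" of N2 are collapsed into a single softmax layer, and training is plain gradient descent on random data. All function names are hypothetical.

```python
import numpy as np

def build_n2(n1_weights, i, n_sensitive, rng):
    """Copy the first i layers of N1 and attach a randomly initialized head."""
    frozen = [w.copy() for w in n1_weights[:i]]
    head = rng.normal(scale=0.01, size=(frozen[-1].shape[1], n_sensitive))
    return frozen, head

def features(x, frozen):
    """Forward pass through the frozen layers (ReLU activations)."""
    for w in frozen:
        x = np.maximum(x @ w, 0.0)
    return x

def train_head(x, y, frozen, head, lr=0.1, epochs=200):
    """Train only the head with softmax cross-entropy; frozen layers stay fixed."""
    f = features(x, frozen)
    onehot = np.eye(head.shape[1])[y]
    for _ in range(epochs):
        logits = f @ head
        logits -= logits.max(axis=1, keepdims=True)   # numerical stability
        p = np.exp(logits)
        p /= p.sum(axis=1, keepdims=True)
        head = head - lr * f.T @ (p - onehot) / len(x)
    return head

def accuracy(x, y, frozen, head):
    return float(np.mean((features(x, frozen) @ head).argmax(axis=1) == y))

rng = np.random.default_rng(0)
n1 = [rng.normal(size=(6, 5)), rng.normal(size=(5, 4)), rng.normal(size=(4, 2))]
frozen, head = build_n2(n1, i=2, n_sensitive=3, rng=rng)
x = rng.normal(size=(60, 6))
y = rng.integers(0, 3, size=60)           # toy sensitive labels
head = train_head(x, y, frozen, head)
sensitive_accuracy = accuracy(x, y, frozen, head)
```

In the paper's setting, a low value of the sensitive-task accuracy obtained this way is read as evidence that the features from layer i are specific to the primary task.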
B. Deep Visualization
Visualization is a method for understanding deep networks.
In this paper, we use an auto-encoder objective
visualization technique [14] in order to measure the amount
of sensitive information in the intermediate feature of the
network, which is trained for primary variable inference. In
[14], a decoder is designed on the data representation of
each layer, in order to reconstruct the original input image
based on the learned representation. We can therefore analyze the
sensitive information preserved in each layer by comparing
the reconstructed images with the original input image.
(a) Trained network for primary classification (N1)
(b) Network for sensitive variable inference (N2)
(c) Primary weights are copied from N1 and frozen. The other layers have random weights.
(d) Trained network on sensitive variable inference with transfer learning
Fig. 6: Transfer Learning procedure.
C. Privacy Metric
Suppose we have an estimate for the posterior distribution
of the sensitive variable (e.g. identity), given the extracted
feature vector. It can be obtained by using a simple
instance-based model like kernel density estimation or a complex
neural network. How can we measure the amount of information
contained in this distribution? Conditional entropy and the
classification Bayes error are two options for information
measurement; here, however, we introduce a more
intuitive method to measure privacy, which is an extension
of the Bayes error. In order to get more accurate results, we
assume here that we use both dimensionality reduction and noise
addition.
Suppose we have a dataset and we want to measure the
privacy level of the feature extractor. We can get all the
intermediate features and apply noise to them. Having all these
features ({fi}) and a fixed noisy data point z, we can
calculate the conditional likelihood of each sensitive class.
To do this, we can estimate P(z|ci) as follows:
$$P(z \mid c_i) = \int_f P(z, f \mid c_i)\, df = \int_f P(z \mid f, c_i)\, P(f \mid c_i)\, df \tag{2}$$
Conditioned on f, ci is independent of z, so we have:
$$P(z \mid c_i) = \int_f P(z \mid f)\, P(f \mid c_i)\, df = \mathbb{E}_{f \sim P(f \mid c_i)}\left[P(z \mid f)\right] \tag{3}$$
Assuming Fi = {f1, f2, ..., fNi} is the set of points from
sensitive class ci in our dataset, we can estimate the above
expected value with the sample mean; so we can estimate P(z|ci) with:
$$P(z \mid c_i) = \frac{1}{N_i} \sum_{f_j \in F_i} P(z \mid f_j) \tag{4}$$
In this way, we can compute the relative likelihood of each
class given a noisy data point. As we know the correct class
of that point, we can determine the number of classes with a
higher probability than the correct class. Hence, we can define
the rank of the likelihood of the right class, as the privacy of
that noisy point. We want this measure to have a normalized
value between 0 and 1, so we divide it by T , the number of
sensitive classes:
$$\text{Privacy}(z) = \frac{\text{Rank}(\text{class}(z))}{T}$$
Now, having intermediate features of N samples (with N
noisy points generated by them), we can estimate the privacy
of the transmitted data by:
$$\text{Privacy}_{\text{total}} = \frac{1}{N} \sum_{i=1}^{N} \text{Privacy}(z_i)$$
We can define this as a measure for quantifying privacy. In
the next sections, we simply refer to this metric as Privacy.
With this measure, we can calculate how much privacy is
preserved and also validate the privacy of the transmitted data.
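To make the metric concrete, the sketch below evaluates Eq. (4) and the rank-based privacy score with NumPy. It rests on assumptions that the text leaves open: P(z|f) is taken to be an isotropic Gaussian with standard deviation sigma (its normalizing constant is the same for every class and is dropped), the rank is counted as the number of classes strictly more likely than the correct one, and every class is assumed to have at least one sample. The function names are hypothetical.

```python
import numpy as np

def class_likelihoods(z, feats, labels, sigma, n_classes):
    """Eq. (4): average of P(z | f_j) over the features f_j of each class."""
    sq_dist = np.sum((feats - z) ** 2, axis=1)
    p = np.exp(-sq_dist / (2.0 * sigma ** 2))   # Gaussian kernel, constant dropped
    return np.array([p[labels == c].mean() for c in range(n_classes)])

def privacy(z, true_class, feats, labels, sigma, n_classes):
    """Rank of the correct class among all class likelihoods, divided by T."""
    lik = class_likelihoods(z, feats, labels, sigma, n_classes)
    rank = int(np.sum(lik > lik[true_class]))   # classes beating the correct one
    return rank / n_classes

def total_privacy(feats, labels, sigma, n_classes, rng):
    """Average privacy over one noisy point generated from each feature."""
    noisy = feats + rng.normal(scale=sigma, size=feats.shape)
    return float(np.mean([privacy(z, c, feats, labels, sigma, n_classes)
                          for z, c in zip(noisy, labels)]))

rng = np.random.default_rng(0)
n_classes = 5
labels = np.repeat(np.arange(n_classes), 20)    # 20 samples per identity
feats = rng.normal(size=(100, 4))               # toy features with no identity signal
score = total_privacy(feats, labels, sigma=1.0, n_classes=n_classes, rng=rng)
```

Under this counting convention, a score near 0 means the correct identity is usually the most likely one, while larger values mean more identities are ranked above it.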
V. APPLICATIONS
In this section, we introduce gender classification and
emotion detection as two exemplar primary tasks and consider
face identity as the sensitive information; we therefore use the
VGG-16 face recognition model [19] as the adversary, trying to infer
the sensitive variable. We evaluate the transfer learning approach
using the IMDB dataset from [19], which contains nearly
2 million images of 2,622 highly-ranked celebrities on the
IMDB website. We randomly select 100 celebrities and divide
their images into training and test sets to evaluate our face
recognition model.
A. Gender Classification
In the problem of gender classification, the goal is to classify
an individual’s image as male or female. This has various
applications in different systems such as human-computer
interaction, surveillance and targeted advertising systems [21].
Some techniques use face image as the input to the classifier,
while others use the whole body image or a silhouette. In this
paper, we use cropped face images for the gender classification
task. Recently, deep convolutional neural networks have been
used for this problem [22], [23], [24]. In this work we use the
model proposed in [23] with 94% accuracy, based on the VGG-16
architecture, the popular 16-layer deep model for image
classification [20] (see Figure 7).
Fig. 7: 16 layer VGG-16 architecture [20]. Layers shown: Conv1-1, Relu1-1, Conv1-2, Relu1-2, Pool1; Conv2-1, Relu2-1, Conv2-2, Relu2-2, Pool2; Conv3-1, Relu3-1, Conv3-2, Relu3-2, Conv3-3, Relu3-3, Pool3; Conv4-1, Relu4-1, Conv4-2, Relu4-2, Conv4-3, Relu4-3, Pool4; Conv5-1, Relu5-1, Conv5-2, Relu5-2, Conv5-3, Relu5-3, Pool5; FC6, Relu6; FC7, Relu7; FC8; Prob.
Fig. 8: 8 layer VGG-S architecture [26]. Layers shown: Conv1, Relu1, Pool1; Conv2, Relu2, Pool2; Conv3, Relu3; Conv4, Relu4; Conv5, Relu5, Pool5; FC6, Relu6; FC7, Relu7; FC8; Prob.
Rothe et al. [23] prepared a large dataset, named IMDB-Wiki,
which is useful for age and gender estimation. We use
the Wiki part of this dataset, which contains 62,359 images, to
fine-tune our models. We use 45,000 images as training data
and the rest as test data. We evaluate our privacy measurement
technique on this dataset. We also use the Labeled Faces in the Wild
(LFW) dataset [25] to compare our gender classification model
with others. This is an unconstrained face database containing
13,233 images of 5,749 individuals, which is very popular for
evaluating face verification and gender classification models.
B. Emotion Detection
Emotion detection from facial expression is becoming ex-
ceedingly important for social media analysis tasks. In this
problem, emotions are classified based on the individuals’
facial expressions on images. Recently, deep learning has been
demonstrated to be effective in solving this problem [27],
[28]. Different deep models are proposed and compared in
[27]. We choose the VGG-S RGB model which is based on
VGG-S architecture [26] (see Figure 8). The accuracy of
doing emotion detection by using this model is 39.5% on
SFEW-2 dataset. Static Facial Expression in the Wild (SFEW)
is an emotion detection benchmark [29]. We use the latest
version [30] which consists of face images in seven emotional
classes. This dataset contains 891 and 431 images for training
and validation respectively.
VI. EXPERIMENTS
In this section we evaluate and analyze the accuracy and
privacy of different embeddings with different intermediate
layers, by using our proposed privacy measurement tools:
transfer learning, visualization and the privacy metric. Although
all of these embeddings preserve privacy, applying Siamese
fine-tuning is more effective in that it increases privacy
considerably without decreasing the accuracy of the
primary task. In addition, we show how dimensionality
reduction has positive effects on privacy. Finally, we evaluate our
hybrid framework on a mobile phone and discuss its advantages
over other solutions.
[Plot: face recognition accuracy (%) on intermediate layers Conv5-1, Conv5-2 and Conv5-3 for the simple, reduced simple, Siamese and reduced Siamese embeddings.]
Fig. 9: Gender Classification. Comparison of simple, reduced
simple, Siamese and reduced Siamese embedding on different
intermediate layers, while doing transfer learning.
A. Privacy of Gender Classification
In this part, we apply transfer learning, the privacy metric
and the visualization technique on different intermediate layers
of gender classification and face recognition models, in order
to show the privacy of our framework. We use the VGG-16
model proposed in [23] in the simple embedding and fine-tune
it with the proposed privacy architecture (Figure 4b) to use it in
the Siamese embedding. To create the reduced simple and reduced
Siamese embeddings, we apply PCA on the intermediate features of
the simple and Siamese embeddings, respectively. We choose 4,
6 and 8 as the PCA dimension for Conv5-3, Conv5-2 and
Conv5-1, respectively.
1) Transfer learning: The results of transfer learning for
different embeddings on different intermediate layers are
presented in Figure 9. Overall, applying (reduced) simple
or Siamese embedding results in a considerable decrease in
the accuracy of face recognition from Conv5-1 to Conv5-3.
The reason for this trend is that as we go up through the
layers, the features of each layer become more specific to
gender classification (the primary task). That is to say, the features
of each layer carry less information related to identity
(the sensitive information) than those of the previous layer.
In addition, for all of the layers, the face recognition accuracy
of the Siamese embedding is far lower than that of the
simple embedding. This result is rooted in training the Siamese
embedding with the privacy architecture, which causes a dramatic
drop in face recognition accuracy. As shown in Figure 9, when Conv5-3
is chosen as the intermediate layer in the Siamese embedding, the
accuracy of face recognition is 2.3%, just above random
accuracy. Another interesting point in this figure is the effect of
dimensionality reduction on the accuracy of face recognition.
The reduced simple and reduced Siamese embeddings have lower face
recognition accuracy than the simple and Siamese embeddings,
respectively.
TABLE I: Accuracy of Gender Classification (accuracy on LFW).
Embedding         Conv5-1   Conv5-2   Conv5-3
simple            94%       94%       94%
reduced simple    89.7%     87%       94%
Siamese           92.7%     92.7%     93.5%
reduced Siamese   91.3%     92.9%     93.3%
[Plots: gender classification accuracy (%) versus face recognition privacy (%).]
(a) Comparison of presence or absence of Siamese fine-tuning on layer Conv5-3 (noisy reduced simple vs. advanced).
(b) Comparison of different layers (pool5, conv5-3, conv5-2, conv5-1). Higher layers achieve a better trade-off.
Fig. 10: Accuracy-Privacy trade-off for gender classification
using VGG-16 architecture.
In order to assess how these changes affect the
accuracy of the desired task, which is gender classification, we
report the accuracies of the different embeddings in Table I. The
results of Table I convey two important messages. First, as the gender
classification accuracies of the Siamese and simple embeddings are
approximately the same, applying the Siamese idea does not
decrease the accuracy of the desired task. The other important result is
that the Siamese embedding is more robust to PCA than the simple
embedding. In other words, the gender classification accuracy of the
reduced Siamese embedding is close to that of the Siamese embedding,
whereas dimensionality reduction damages the accuracy of the
simple embedding. Figure 9 and Table I show that applying
the Siamese network and dimensionality reduction
preserves privacy while gender classification accuracy does
not decrease dramatically.
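As an illustration of the dimensionality reduction step discussed above, the following is a minimal sketch of PCA applied to intermediate features. It is not the authors' implementation: the function names (`fit_pca`, `reduce_feature`), the toy 3-D features and the use of power iteration instead of a library eigensolver are all assumptions made to keep the example self-contained.

```python
import math
import random


def fit_pca(rows, k, iters=200, seed=0):
    """Return (mean, components): the top-k principal directions of `rows`.

    Uses power iteration with deflation on the sample covariance matrix,
    a simple stand-in for a library eigensolver.
    """
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    rng = random.Random(seed)
    components = []
    for _ in range(k):
        v = [rng.gauss(0.0, 1.0) for _ in range(d)]
        for _ in range(iters):
            w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:  # no variance left to explain
                break
            v = [x / norm for x in w]
        norm_v = math.sqrt(sum(x * x for x in v))
        v = [x / norm_v for x in v]
        # Rayleigh quotient gives the eigenvalue of the direction just found.
        eigenvalue = sum(v[i] * cov[i][j] * v[j]
                         for i in range(d) for j in range(d))
        # Deflate so the next pass converges to the next direction.
        for i in range(d):
            for j in range(d):
                cov[i][j] -= eigenvalue * v[i] * v[j]
        components.append(v)
    return mean, components


def reduce_feature(feature, mean, components):
    """Project one feature vector onto the learned principal directions."""
    return [sum((f - m) * c for f, m, c in zip(feature, mean, comp))
            for comp in components]


# Toy "intermediate features" (hypothetical): 3-D points varying along (1, 1, 0).
features = [[float(t), float(t), 0.0] for t in range(10)]
mean, components = fit_pca(features, k=1)
reduced = [reduce_feature(f, mean, components) for f in features]
print(len(features[0]), "->", len(reduced[0]))
```

In the hybrid setting, the mean and components would be fitted once offline, and only the low-dimensional projection would leave the device, which is also what lowers the communication cost.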
2) Privacy metric: In order to validate the feature extractor,
we use the rank measure proposed in Section IV-C. By increasing
the variance of the symmetric Gaussian noise, we get more
privacy and less accuracy. In fact, privacy and accuracy can be
considered two competing constraints: increasing
the identity privacy causes a decrease in the accuracy of gender
classification. We show this trade-off in Figure 10a, where
we can see the superiority of the advanced embedding (noisy
reduced Siamese) over the noisy reduced simple embedding.
Figure 10a is evidence that, as privacy increases, gender
classification accuracy decreases more slowly in the advanced
embedding than in the other embeddings. This makes the advanced
embedding the ideal choice, as it gives better privacy at a
fixed accuracy level. Another interesting experiment shows
that choosing higher intermediate layers gives us
better privacy for a fixed accuracy. This trend is shown in
Figure 10b, where the accuracy-privacy curve of
higher layers lies above that of lower ones, so for a fixed accuracy
a higher layer gives us more privacy. This agrees with our
transfer learning results, in which choosing intermediate layers
closer to the end of the network results in
lower face recognition accuracy.
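The noise-versus-privacy sweep described above can be sketched as follows. Section IV-C is not reproduced here, so the rank definition used below is an assumption: for each noisy feature, all clean features are ordered by Euclidean distance and the normalized position of the true source is averaged. The names `add_gaussian_noise` and `rank_privacy` and the random toy features are hypothetical.

```python
import math
import random


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def add_gaussian_noise(feature, sigma, rng):
    """Add zero-mean Gaussian noise with the same std-dev in every dimension."""
    return [x + rng.gauss(0.0, sigma) for x in feature]


def rank_privacy(features, sigma, seed=0):
    """Mean normalized rank of the true source among all candidates.

    0.0 means the noisy feature is always closest to its own source
    (no privacy); larger values mean the source is harder to single out.
    This definition is an assumption, not the paper's exact measure.
    """
    rng = random.Random(seed)
    n = len(features)
    total_rank = 0
    for clean in features:
        noisy = add_gaussian_noise(clean, sigma, rng)
        own = euclidean(noisy, clean)
        # Rank = number of candidates strictly closer than the true source.
        total_rank += sum(1 for other in features
                          if euclidean(noisy, other) < own)
    return total_rank / (n * n)


# Hypothetical toy features: 20 random 5-D vectors.
data_rng = random.Random(1)
features = [[data_rng.uniform(-1.0, 1.0) for _ in range(5)] for _ in range(20)]
for sigma in (0.0, 0.5, 5.0):
    print(sigma, rank_privacy(features, sigma))
```

Plotting such a privacy value against the primary-task accuracy for each noise level would give a curve of the kind shown in Figure 10.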
3) Visualization: Deep visualization can give us a good
intuition about the identity preservation of each layer. We fed
the intermediate layers of the gender classification model as
the input to the Alexnet decoder [14] to reconstruct the original
inputs. The reconstructed images let us visually assess
the amount of identity information in the intermediate feature
of the gender classification model. These images are illustrated in
Figure 11 for different methods. It can be observed that the
genders of all images in the simple and Siamese embeddings
remain the same as the original ones. This is also the case for
the advanced embedding, although it is harder to distinguish it
from the reconstructed images. The original images are almost
restored in the simple embedding. Therefore, just separating
the layers of a deep network cannot assure acceptable privacy
preservation performance. The Siamese embedding performs better
than the simple embedding by distorting the identity due to
intrinsic characteristics of the face. Finally, the advanced
embedding provides the best results, because the decoder
was not trainable and nothing can be deduced from the images,
including the person’s identity.
B. Privacy of Emotion Detection
We also evaluate our framework on the emotion detection task.
We use the VGG-S RGB pre-trained network of [27] as the
simple embedding. We fine-tune their model with the privacy
architecture (Figure 4b) on the training part of the SFEW-2 dataset
to get the Siamese embedding. As VGG-S has a smaller
structure than VGG-16 (8 layers vs. 16 layers),
we evaluate our embedding on only one intermediate layer,
the fifth convolutional layer (Conv5). We choose 10
as the PCA dimension to get the reduced simple and reduced
Siamese embeddings.
1) Transfer learning: We test the different embeddings with
transfer learning; the results are shown in Figure 12a.
The accuracy of the face recognition model is decreased for all
embeddings. Similar to the gender classification application,
the Siamese embedding works better than the simple embedding,
and dimensionality reduction helps with privacy protection.
The effects of the different embeddings on emotion detection are
reported in Table II. It is evident that the Siamese embedding
does not decrease emotion detection accuracy significantly,
while dimensionality reduction has a major impact on this task.
2) Privacy metric: The results of the feature extractor
validation are shown in Figure 12b, where the advanced
embedding curve is above the noisy reduced simple curve.
At a fixed accuracy level, we can have higher privacy
with the advanced embedding.
The results of both applications show that our framework
is application- and model-independent. The Siamese structure
improves privacy, while reducing the dimensionality does not
hurt the CT1 accuracy and lowers the communication cost. We
can use the validation method to quantify the privacy level
without access to the cloud-based face recognition model.
TABLE II: Comparison of Different Emotion Detection Models. Intermediate Layer is Conv5.
                  Accuracy on SFEW-2
simple [27]       40%
Siamese           38%
reduced simple    31%
reduced Siamese   32%
TABLE III: Device Specification
Google (Huawei) Nexus 6P
Memory    3 GB LPDDR4 RAM
Storage   32 GB
CPU       Octa-core Snapdragon 810 v2.1
GPU       Adreno 430
OS        Android 7.1.2
C. Mobile Evaluation
In the previous sections we presented different solutions for
learning inferences. Cloud-based solutions are robust, but do
not respect the users’ privacy. On the other hand, on-premise
solutions offer an increased level of privacy but are not power
efficient, decreasing the battery life of each mobile device. In
this section we evaluate a new, hybrid approach that is based
on the methods explained in the previous sections. By reducing
the complexity of the deep neural network, we managed to also
reduce the loading time, inference time and memory usage,
while at the same time hiding the user’s sensitive information.
We evaluated the proposed implementation on a modern
handset device, shown in Table III. In order to have a better
comparison, we focus on the gender classification VGG-16
architecture. We evaluated each solution separately (simple,
reduced) for each of the three intermediate layers (Conv5 1,
Conv5 2, Conv5 3), and compared them with the on-premise
solution (full model). We used Caffe Mobile v1.0 [31] for
Android to load each model and measured the inference
time (Figure 13), model loading time (Figure 14) and model
memory usage (Figure 15) of each of the seven configurations.
We configured the model to use only one core of the CPU,
as the aim of this experiment was a comparison between the
different techniques on the specific device.
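The measurement procedure above can be sketched as a small timing harness. This is not the Caffe Mobile API and not the authors' benchmark code: `benchmark`, `load_dummy_model` and the toy model are hypothetical stand-ins, and the default of 60 runs is taken from the caption of Figure 13. Memory measurement is platform-specific and is left out.

```python
import statistics
import time


def benchmark(load_model, sample, runs=60):
    """Time one model load and `runs` single-sample inferences (milliseconds)."""
    start = time.perf_counter()
    model = load_model()
    load_ms = (time.perf_counter() - start) * 1000.0

    inference_ms = []
    for _ in range(runs):
        start = time.perf_counter()
        model(sample)
        inference_ms.append((time.perf_counter() - start) * 1000.0)

    return {
        "load_ms": load_ms,
        "mean_ms": statistics.mean(inference_ms),
        "stdev_ms": statistics.stdev(inference_ms),
        "runs": len(inference_ms),
    }


def load_dummy_model():
    """Hypothetical stand-in for loading a network; returns a callable."""
    weights = [0.5] * 1000
    return lambda x: sum(w * x for w in weights)


report = benchmark(load_dummy_model, sample=2.0)
print(report)
```

Running the same harness once per configuration (each truncated model and the full model) would yield comparable loading and inference times of the kind reported in Figures 13 and 14.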
Most of the variations of trained model architectures under
the proposed embedding approach report the same loading
time and memory usage performance. There is a large increase
in both memory use (217.66%) and loading time (534.49%)
when loading the on-premise solution, proving the efficiency
of our solution. Inference time also increases per configuration
due to the increased size of the model.
[Figure 11: image rows, top to bottom: Original Image, Simple Embedding, Siamese Embedding, Advanced Embedding.]
Fig. 11: The first row shows the original images and the others show the reconstructed ones from intermediate representations.
In all reconstructed images, the gender of the individuals is recognized to be the same as the originals. In addition, from simple
to advanced embedding, the identity of the individuals is increasingly removed, illustrating that the advanced embedding has
the best privacy preservation performance.
[Figure 12: (a) Comparison of transfer learning results for different models; bar chart of Face Rec. accuracy (%) for simp., reduced simp., Siam. and reduced Siam. (bar labels as extracted, in that order: 48, 4.3, 36, 2.6). (b) Comparison of presence or absence of Siamese fine-tuning; Emotion Det. Accuracy (%) against Face Rec. Privacy (%), curves: noisy reduced simple, advanced.]
Fig. 12: Transfer learning and Accuracy-Privacy trade-off on
emotion detection, using VGG-S architecture and Conv5 as
the intermediate layer.
[Figure 13: Inference time (ms) for simple(Conv5_1), reduced(Conv5_1), simple(Conv5_2), reduced(Conv5_2), simple(Conv5_3), reduced(Conv5_3) and the on-premise solution.]
Fig. 13: Inference time of different deep embeddings on
mobile (60 inferences per configuration).
We conclude that our approach is feasible to implement
on a modern smartphone. By choosing a privacy-complexity
trade-off and using different intermediate layers,
we were able to significantly reduce the cost of running the
model on the mobile device, while at the same time preventing
important user information from being uploaded to the cloud.
[Figure 14: Loading time (ms) for simple(Conv5_1), reduced(Conv5_1), simple(Conv5_2), reduced(Conv5_2), simple(Conv5_3), reduced(Conv5_3) and the on-premise solution.]
Fig. 14: Loading time comparison of different deep embeddings
on mobile.
[Figure 15: Memory (MB) for simple(Conv5_1), reduced(Conv5_1), simple(Conv5_2), reduced(Conv5_2), simple(Conv5_3), reduced(Conv5_3) and the on-premise solution; series: After loading model, After first inference.]
Fig. 15: Memory comparison of different deep embeddings on
mobile.
VII. RELATED WORK
In this section, we describe prior work on privacy-preserving
learning systems and their intrinsic differences. We
also review works that use deep learning on mobile phones.
A. Learning with privacy
Prior works have approached the problem of privacy in machine
learning from different points of view. Some approaches
attempt to remove the irrelevant information by increasing the
amount of uncertainty, while others try to hide information using
cryptographic operations. Early works in this space mainly
focus on publishing datasets for learning tasks [32], [33],
[17], [34]. They are usually concerned with publishing a dataset
consisting of high-level features for data mining tasks (e.g., a
medical database consisting of patients’ details), while preserving
the individuals’ privacy. Solutions such as randomized
noise addition [32], [33] and k-anonymity by generalization
and suppression [35], [36], [37] are proposed and surveyed in
[38]. These methods have some major problems. They are
only appropriate for low-dimensional data due to the curse
of dimensionality [39], hence they do not fit most
multimedia data. Also, a variety of attacks make many of these
methods unreliable [38]. We can categorize these models as
dataset publishing models. In dataset publishing, the training
applicability of generalized data is important, while in this
paper we deal with cases where model training has already
been done by a cloud service (e.g., Facebook or Google
using their image data).
Differential privacy [40] is another method, which provides an
exact way to publish statistics of a database while keeping
all individual records of the database private. A learning
model trained on some dataset can be considered a high-level
statistic of that dataset. Protecting the privacy of the training
data while publishing a learning model is therefore another
important problem, which we call model publishing. Recently,
[41] raised the concern of privacy for deep learning and [42]
provided a differentially private deep learning model. In model
publishing, the main concern is the privacy of the users participating
in the training data, while in our scenario the user’s data may not
exist in the training data and we focus on the inference phase of a
learning model.
Hence, neither publishing a learning dataset nor publishing a
learning model is directly relevant to our problem. We can name our
problem secure inference, where the user cannot access
the learning model at inference time and should use it in
a secure manner. A popular approach to this problem is
to rely on cryptographic methods. In [43], the authors provide
a secure protocol for machine learning. In [44], the neural
network is held in the cloud. They encrypt the input of the neural
network in a way that inference becomes applicable on the encrypted
message. This approach has important, yet highly complex
operations, making it infeasible. Mainly, the throughput is the
same for inference on a single image or a batch. In addition,
the neural network should be changed in a complex way to enable
homomorphic encryption, taking 250 seconds on a PC, which
makes it impractical in terms of usability on mobile phones
or simple PCs. Recently, [45], [46] tried to improve this work
by employing a more advanced encryption setting, while they
are still using simple deep models in experiments.
Instead of encryption-based methods, we recommend a new
approach to this problem, which is a kind of feature extraction
applied in a hybrid framework. We address this issue in
an adversarial setting. We optimize a cost function which
consists of data privacy and model accuracy terms. We then
use the Siamese architecture to solve this optimization and
get the private feature, which is non-informative about sensitive
information and can be shared with the cloud service.
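To make the Siamese idea more concrete, the following is a minimal sketch of the standard contrastive loss of Hadsell et al. [18], on which Siamese training is commonly based. The paper's own cost function is not reproduced in this section, so this is an illustration rather than the authors' exact objective. The assumption here is that a pair is labelled `same` when both inputs share the primary-task label (for example, the same gender), regardless of identity; the function name and margin value are hypothetical.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(f1, f2, same, margin=1.0):
    """Contrastive loss for one pair of feature vectors.

    same=True  : pull the two features together (penalize distance).
    same=False : push them apart until they are at least `margin` apart.
    """
    d = euclidean(f1, f2)
    if same:
        return 0.5 * d * d
    return 0.5 * max(0.0, margin - d) ** 2


# A pair with the same primary-task label is penalized for being far apart.
print(contrastive_loss([0.0, 0.0], [3.0, 4.0], same=True))
# A pair with different labels is penalized only when closer than the margin.
print(contrastive_loss([0.0, 0.0], [0.0, 0.0], same=False, margin=2.0))
```

Pulling together features of different people who share the primary-task label is what would make the feature less informative about identity, while a separate accuracy term keeps it useful for the primary task.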
B. Privacy in image analytics
Privacy preservation has also been addressed in the machine
vision community. A good survey of methods that attempt to
provide visual privacy can be found in [47], which classifies
different methods into five categories: intervention, blind vision,
secure processing, redaction and data hiding. Our work is
similar in spirit to de-identification works, a subcategory of
redaction methods. The goal of these methods is to perturb
the individuals’ faces in images in such a way that they cannot
be recognized by a face recognition system. A fundamental
work in this category is presented in [48], which targets
the privacy issue in video surveillance data. The aim of this work
is to publish a transformed dataset where individuals are not
identifiable. They show that using simple image filtering
cannot guarantee privacy and suggest the K-same algorithm, based
on k-anonymity, which creates average face images and
replaces the original ones with them. A shortcoming of this
work is the lack of protection against future analyses on
the dataset. Many works followed this idea and tried to
improve it, mainly with the goal of publishing a dataset, which
is different from ours. Their goal is not to protect the privacy of a
new face image, which is our concern. Follow-up works aim
to transform a face image in a way that makes it unrecognizable,
while other analytics such as gender classification remain possible.
Most of the works in this area use visual filters or morphing
to make the image unrecognizable [49], [22]. One of the main
issues with prior privacy preservation methods is the lack of
a privacy guarantee against new models, due to engineering
features against specific learning tasks. In most cases the
learning task is not explicitly defined. Moreover, many works
ignore the accuracy constraints of the learning task in their
privacy preservation method. In this paper we build on our
previous work [8], introduce and develop a privacy measure,
and evaluate the framework on smartphones.
C. Deep learning on mobile phone
The last two years have seen a dramatic increase in the
implementation and inference ability of deep neural networks
on smartphones. Using pre-trained deep learning models can
increase the accuracy of different sensors; e.g., in [9], Lane et al.
use a 3-layer network which does not overburden the hardware.
Complex networks with more layers need more processing
power. More complex architectures, such as the 16-layer model
(VGG-16) proposed in [20] and the 8-layer model (VGG-S) proposed
in [26], are implemented on mobile in [11], and the resource
usage, such as time, CPU and energy overhead, is reported.
As most state-of-the-art models are large in scale, fully
evaluating all the layers on mobile results in serious drawbacks
in processing time and memory requirements. Some methods have
been proposed to approximate these complex functions with simpler
ones to reduce the cost of inference. Kim et al. [11] aim to compress
deep models, and in [50] the authors use sparsification and
kernel separation. However, the increase in efficiency of these
methods comes with a decrease in the accuracy of the model. In
order to get more efficient results, we can also implement
models on GPU. The GPU implementation in [11]
burdens the battery, hence it is not a feasible solution for
practical applications that users either use frequently
or require continuously for long periods [51]. On the
other hand, recent devices have DSP modules, though their
capacity for programming and storage can be limited. To tackle
these problems, Lane et al. [51] have implemented a software
accelerator called DeepX for large-scale deep neural networks,
which reduces the resources used while the mobile is doing
inference by using different kinds of mobile processors simultaneously.
VIII. DISCUSSIONS AND NEXT STEPS
In this paper, we presented a new hybrid framework for
efficient privacy-preserving analytics, which consists of a
feature extractor and an analyzer, where the former is placed
on the client side and the latter on the server side. We
embed deep neural networks, especially convolutional neural
networks, in this framework to benefit from their accuracy
and layered architecture. In order to protect data privacy
against unauthorized tasks, we used the Siamese architecture,
creating a feature which is specific to the desired task. This is
in contrast to today’s ordinary deep networks, in which the
created features are generic and can be used for different
tasks. Removing the undesired sensitive information from the
extracted feature results in achieving privacy for the user.
Evaluating our framework by splitting the layers between the
mobile and the cloud and by targeted noise addition, we
achieved high accuracy on the primary tasks, while heavily
decreasing any inference potential for other tasks. Also, by
implementing the framework on a mobile phone, we show that
we can greatly decrease the computational complexity on the
user side, as well as the communication cost.
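The client/server split summarized above can be sketched with a toy two-stage model. Everything in this example is hypothetical: the weight matrices, the single linear-plus-ReLU "layer" standing in for the first layers of a deep network, and the function names `client_extract` and `server_analyze`. It only illustrates where the feature extractor, the noise addition and the analyzer sit.

```python
import random


def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def relu(vector):
    return [max(0.0, x) for x in vector]


def client_extract(x, weights, sigma, rng):
    """Client side: run the first layer(s) locally, then add Gaussian noise."""
    feature = relu(matvec(weights, x))
    return [f + rng.gauss(0.0, sigma) for f in feature]


def server_analyze(feature, weights):
    """Server side: finish inference from the received feature only."""
    scores = matvec(weights, feature)
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical toy weights: 3-D input -> 2-D feature -> 2 classes.
CLIENT_W = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
SERVER_W = [[1.0, -1.0], [-1.0, 1.0]]

rng = random.Random(0)
x = [2.0, 0.5, 9.0]                      # raw input never leaves the device
feature = client_extract(x, CLIENT_W, sigma=0.0, rng=rng)
label = server_analyze(feature, SERVER_W)
print(feature, label)
```

Only `feature` crosses the network, so its dimensionality determines the communication cost, and the noise level `sigma` is the knob that trades primary-task accuracy for privacy.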
Our framework is currently designed for learning inferences
in the test phase. In ongoing work we are extending our
method by designing a framework for Learning as a Service,
where users can share their data, in a privacy-preserving
manner, to train a new learning model. Another potential
extension to our framework is support for other
kinds of neural networks, such as recurrent neural networks, and
for other applications such as speech or video processing.
ACKNOWLEDGMENT
We would like to thank Sina Sajadmanesh for his valuable
comments and feedback.
REFERENCES
[1] N. V. Rodriguez, J. Shah, A. Finamore, Y. Grunenberger, K. Papagiannaki, H. Haddadi, and J. Crowcroft, “Breaking for commercials: characterizing mobile advertising,” in Proceedings of ACM Internet Measurement Conference, Nov. 2012, pp. 343–356.
[2] I. Leontiadis, C. Efstratiou, M. Picone, and C. Mascolo, “Don’t kill my ads!: balancing privacy in an ad-supported mobile application market,” in Proceedings of ACM HotMobile, 2012. [Online]. Available: http://doi.acm.org/10.1145/2162081.2162084
[3] L. Pournajaf, D. A. Garcia-Ulloa, L. Xiong, and V. Sunderam, “Participant privacy in mobile crowd sensing task management: A survey of methods and challenges,” ACM SIGMOD Record, vol. 44, no. 4, pp. 23–34, 2016.
[4] M. Haris, H. Haddadi, and P. Hui, “Privacy leakage in mobile computing: Tools, methods, and characteristics,” arXiv preprint arXiv:1410.4978, 2014.
[5] J. Rich, H. Haddadi, and T. M. Hospedales, “Towards bottom-up analysis of social food,” in Proceedings of the 6th International Conference on Digital Health Conference, ser. DH ’16. New York, NY, USA: ACM, 2016, pp. 111–120. [Online]. Available: http://doi.acm.org/10.1145/2896338.2897734
[6] P. N. Druzhkov and V. D. Kustikova, “A survey of deep learning methods and software tools for image classification and object detection,” Pattern Recognition and Image Analysis, vol. 26, no. 1, pp. 9–15, 2016. [Online]. Available: http://dx.doi.org/10.1134/S1054661816010065
[7] J. Wan, D. Wang, S. C. H. Hoi, P. Wu, J. Zhu, Y. Zhang, and J. Li, “Deep learning for content-based image retrieval: A comprehensive study,” in Proceedings of the 22nd ACM international conference on Multimedia. ACM, 2014, pp. 157–166.
[8] S. A. Osia, A. S. Shamsabadi, A. Taheri, H. R. Rabiee, N. Lane, and H. Haddadi, “A hybrid deep learning architecture for privacy-preserving mobile analytics,” CoRR, vol. abs/1703.02952, 2017. [Online]. Available: http://arxiv.org/abs/1703.02952
[9] N. D. Lane and P. Georgiev, “Can deep learning revolutionize mobile sensing?” in Proceedings of the 16th International Workshop on Mobile Computing Systems and Applications. ACM, 2015, pp. 117–122.
[10] N. D. Lane, P. Georgiev, C. Mascolo, and Y. Gao, “Zoe: A cloud-less dialog-enabled continuous sensing wearable exploiting heterogeneous computation,” in Proceedings of the 13th Annual International Conference on Mobile Systems, Applications, and Services. ACM, 2015, pp. 273–286.
[11] Y.-D. Kim, E. Park, S. Yoo, T. Choi, L. Yang, and D. Shin, “Compression of deep convolutional neural networks for fast and low power mobile applications,” arXiv preprint arXiv:1511.06530, 2015.
[12] S. Chopra, R. Hadsell, and Y. LeCun, “Learning a similarity metric discriminatively, with application to face verification,” in 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’05), vol. 1. IEEE, 2005, pp. 539–546.
[13] J. Yosinski, J. Clune, Y. Bengio, and H. Lipson, “How transferable are features in deep neural networks?” in Advances in neural information processing systems, 2014, pp. 3320–3328.
[14] A. Dosovitskiy and T. Brox, “Inverting visual representations with convolutional networks,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 4829–4837.
[15] A. Mahendran and A. Vedaldi, “Understanding deep image representations by inverting them,” in 2015 IEEE conference on computer vision and pattern recognition (CVPR). IEEE, 2015, pp. 5188–5196.
[16] Y. Bengio et al., “Deep learning of representations for unsupervised and transfer learning.” ICML Unsupervised and Transfer Learning, vol. 27, pp. 17–36, 2012.
[17] L. Sweeney, “k-anonymity: A model for protecting privacy,” International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, vol. 10, no. 05, pp. 557–570, 2002.
[18] R. Hadsell, S. Chopra, and Y. LeCun, “Dimensionality reduction by learning an invariant mapping,” in 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’06), vol. 2. IEEE, 2006, pp. 1735–1742.
[19] O. M. Parkhi, A. Vedaldi, and A. Zisserman, “Deep face recognition,” in British Machine Vision Conference, 2015.
[20] K. Simonyan and A. Zisserman, “Very deep convolutional networks for large-scale image recognition,” CoRR, vol. abs/1409.1556, 2014.
[21] C. B. Ng, Y. H. Tay, and B. M. Goi, “Vision-based human gender recognition: A survey,” arXiv preprint arXiv:1204.1611, 2012.
[22] N. Rachaud, G. Antipov, P. Korshunov, J.-L. Dugelay, T. Ebrahimi, and S.-A. Berrani, “The impact of privacy protection filters on gender recognition,” in SPIE Optical Engineering+ Applications. International Society for Optics and Photonics, 2015, pp. 959 906–959 906.
[23] R. Rothe, R. Timofte, and L. Van Gool, “Dex: Deep expectation of apparent age from a single image,” in Proceedings of the IEEE International Conference on Computer Vision Workshops, 2015, pp. 10–15.
[24] G. Levi and T. Hassner, “Age and gender classification using convolutional neural networks,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2015, pp. 34–42.
[25] G. B. Huang, M. Ramesh, T. Berg, and E. Learned-Miller, “Labeled faces in the wild: A database for studying face recognition in unconstrained environments,” University of Massachusetts, Amherst, Tech. Rep. 07-49, October 2007.
[26] K. Chatfield, K. Simonyan, A. Vedaldi, and A. Zisserman, “Return of the devil in the details: Delving deep into convolutional nets,” in British Machine Vision Conference, 2014.
[27] G. Levi and T. Hassner, “Emotion recognition in the wild via convolutional neural networks and mapped binary patterns,” in Proceedings of the 2015 ACM on International Conference on Multimodal Interaction. ACM, 2015, pp. 503–510.
[28] A. Mollahosseini, D. Chan, and M. H. Mahoor, “Going deeper in facial expression recognition using deep neural networks,” in 2016 IEEE Winter Conference on Applications of Computer Vision (WACV). IEEE, 2016, pp. 1–10.
[29] A. Dhall, R. Goecke, S. Lucey, and T. Gedeon, “Static facial expression analysis in tough conditions: Data, evaluation protocol and benchmark,” in Computer Vision Workshops (ICCV Workshops), 2011 IEEE International Conference on. IEEE, 2011, pp. 2106–2112.
[30] A. Dhall, O. Ramana Murthy, R. Goecke, J. Joshi, and T. Gedeon, “Video and image based emotion recognition challenges in the wild: Emotiw 2015,” in Proceedings of the 2015 ACM on International Conference on Multimodal Interaction. ACM, 2015, pp. 423–426.
[31] Y. Jia, E. Shelhamer, J. Donahue, S. Karayev, J. Long, R. Girshick, S. Guadarrama, and T. Darrell, “Caffe: Convolutional architecture for fast feature embedding,” arXiv preprint arXiv:1408.5093, 2014.
[32] R. Agrawal and R. Srikant, “Privacy-preserving data mining,” in ACM Sigmod Record, vol. 29, no. 2. ACM, 2000, pp. 439–450.
[33] D. Agrawal and C. C. Aggarwal, “On the design and quantification of privacy preserving data mining algorithms,” in Proceedings of the twentieth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems. ACM, 2001, pp. 247–255.
[34] V. S. Iyengar, “Transforming data to satisfy privacy constraints,” in Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 2002, pp. 279–288.
[35] K. LeFevre, D. J. DeWitt, and R. Ramakrishnan, “Incognito: Efficient full-domain k-anonymity,” in Proceedings of the 2005 ACM SIGMOD international conference on Management of data. ACM, 2005, pp. 49–60.
[36] A. Machanavajjhala, D. Kifer, J. Gehrke, and M. Venkitasubramaniam, “l-diversity: Privacy beyond k-anonymity,” ACM Transactions on Knowledge Discovery from Data (TKDD), vol. 1, no. 1, p. 3, 2007.
[37] N. Li, T. Li, and S. Venkatasubramanian, “t-closeness: Privacy beyond k-anonymity and l-diversity,” in 2007 IEEE 23rd International Conference on Data Engineering. IEEE, 2007, pp. 106–115.
[38] C. C. Aggarwal and S. Y. Philip, “A general survey of privacy-preserving data mining models and algorithms,” in Privacy-preserving data mining. Springer, 2008, pp. 11–52.
[39] C. C. Aggarwal, “On k-anonymity and the curse of dimensionality,” in Proceedings of the 31st international conference on Very large data bases. VLDB Endowment, 2005, pp. 901–909.
[40] C. Dwork, “Differential privacy: A survey of results,” in International Conference on Theory and Applications of Models of Computation. Springer, 2008, pp. 1–19.
[41] R. Shokri and V. Shmatikov, “Privacy-preserving deep learning,” in Proceedings of the 22Nd ACM SIGSAC Conference on Computer and Communications Security, ser. CCS ’15. New York, NY, USA: ACM, 2015, pp. 1310–1321. [Online]. Available: http://doi.acm.org/10.1145/2810103.2813687
[42] M. Abadi, A. Chu, I. Goodfellow, H. Brendan McMahan, I. Mironov, K. Talwar, and L. Zhang, “Deep Learning with Differential Privacy,” ArXiv e-prints, Jul. 2016.
[43] S. Avidan and M. Butman, “Blind vision,” in European Conference on Computer Vision. Springer, 2006, pp. 1–13.
[44] R. Gilad-Bachrach, N. Dowlin, K. Laine, K. Lauter, M. Naehrig, and J. Wernsing, “Cryptonets: Applying neural networks to encrypted data with high throughput and accuracy,” in Proceedings of The 33rd International Conference on Machine Learning, 2016, pp. 201–210.
[45] B. D. Rouhani, M. S. Riazi, and F. Koushanfar, “Deepsecure: Scalable provably-secure deep learning,” arXiv preprint arXiv:1705.08963, 2017.
[46] P. Mohassel and Y. Zhang, “Secureml: A system for scalable privacy-preserving machine learning.” IACR Cryptology ePrint Archive, vol. 2017, p. 396, 2017.
[47] J. R. Padilla-Lopez, A. A. Chaaraoui, and F. Florez-Revuelta, “Visual privacy protection methods: A survey,” Expert Systems with Applications, vol. 42, no. 9, pp. 4177–4195, 2015.
[48] E. M. Newton, L. Sweeney, and B. Malin, “Preserving privacy by de-identifying face images,” IEEE transactions on Knowledge and Data Engineering, vol. 17, no. 2, pp. 232–243, 2005.
[49] P. Korshunov and T. Ebrahimi, “Using face morphing to protect privacy,” in Advanced Video and Signal Based Surveillance (AVSS), 2013 10th IEEE International Conference on. IEEE, 2013, pp. 208–213.
[50] S. Bhattacharya and N. D. Lane, “Sparsification and separation of deep learning layers for constrained resource inference on wearables,” in Proceedings of the 14th ACM Conference on Embedded Network Sensor Systems CD-ROM. ACM, 2016, pp. 176–189.
[51] N. D. Lane, S. Bhattacharya, P. Georgiev, C. Forlivesi, L. Jiao, L. Qendro, and F. Kawsar, “Deepx: A software accelerator for low-power deep learning inference on mobile devices,” in 2016 15th ACM/IEEE International Conference on Information Processing in Sensor Networks (IPSN). IEEE, 2016, pp. 1–12.