Xiaowei Ying, Xintao Wu
Univ. of North Carolina at Charlotte
PAKDD-09 April 28, Bangkok, Thailand
On Link Privacy in Randomizing Social Networks
Motivation
Privacy Preserving Social Network Publishing
  node-anonymization cannot guarantee identity/link privacy due to subgraph queries. Backstrom et al. WWW07, Hay et al. UMass TR07
  edge randomization: Random Add/Del, Random Switch
  K-anonymity: Hay et al. VLDB08, Liu&Terzi SIGMOD08, Zhou&Pei ICDE08
Utility preserving randomization
  Spectral feature preserving: Ying&Wu SDM08
  Real space feature preserving: Ying&Wu SDM09
Problem Formalization
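Original graph: $G(n, m)$, with adjacency matrix $A = (a_{ij})_{n \times n}$
Randomized graph (add k then del k edges): $\tilde{G}(n, m)$, with adjacency matrix $\tilde{A} = (\tilde{a}_{ij})_{n \times n}$
Prior belief vs. posterior belief: $P(a_{ij} = 1)$ vs. $P(a_{ij} = 1 \mid \tilde{G})$ (Ying&Wu SDM08)
This paper: $P(a_{ij} = 1 \mid \tilde{a}_{ij}, \tilde{m}_{ij} = x)$?
  $\tilde{m}_{ij}$: similarity measure value between node i and j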
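The "add k then del k edges" step can be sketched as follows. This is a minimal illustration, assuming an undirected simple graph, that the k added edges are drawn uniformly from the non-edges of the original graph, and that the k deleted edges are drawn uniformly from the original edges. The function name `rand_add_del` and its parameters are hypothetical and do not come from the slides.

```python
import random
from itertools import combinations

def rand_add_del(n, edges, k, seed=None):
    """Add k false edges, then delete k true edges (assumed uniform sampling).

    n     -- number of nodes, labeled 0..n-1
    edges -- iterable of (i, j) pairs: the original edge set
    k     -- number of edges to add and number of edges to delete
    Returns the randomized edge set as a set of sorted (i, j) tuples.
    """
    rng = random.Random(seed)
    # Normalize every edge to (small, large) so (i, j) and (j, i) coincide.
    true_edges = {tuple(sorted(e)) for e in edges}
    non_edges = [p for p in combinations(range(n), 2) if p not in true_edges]
    if k > len(true_edges) or k > len(non_edges):
        raise ValueError("k is too large for this graph")
    added = set(rng.sample(non_edges, k))             # k false edges
    deleted = set(rng.sample(sorted(true_edges), k))  # k true edges
    return (true_edges - deleted) | added
```

Under these assumptions the randomized graph keeps the same number of edges m as the original, and exactly m - k of its edges are true edges.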
Network of US political books
(105 nodes, 441 edges, r=8%)
Books about US politics sold by Amazon.com. Edges represent frequent co-purchasing of books by the same buyers. Nodes have been given colors of blue, white, or red to indicate whether they are "liberal", "neutral", or "conservative".
http://www-personal.umich.edu/~mejn/netdata/
Polbooks network
Proportion of true edges vs. similarity
After randomly adding/deleting 200 edges (441 edges in total)
Similarity measures vs. Link prediction
Similarity measures
  The number of common neighbors
  Adamic/Adar, the weighted number of common neighbors
  Katz, a weighted sum of the number of paths connecting two nodes
  Commute time, the expected steps of random walks from node i to j and back to i.
Similarity measures have been exploited in the classic link prediction problem. Liben-Nowell&Kleinberg CIKM03
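Three of the measures listed above can be sketched in plain Python. This is an illustrative sketch only: the helper names are hypothetical, Adamic/Adar is written in its usual form (each common neighbor weighted by 1/log of its degree), and Katz is truncated at a maximum length and counts walks rather than simple paths, which is an assumption made here for brevity. Commute time is omitted.

```python
import math

def build_adjacency(n, edges):
    """Return {node: set of neighbors} for an undirected graph on nodes 0..n-1."""
    adj = {v: set() for v in range(n)}
    for i, j in edges:
        adj[i].add(j)
        adj[j].add(i)
    return adj

def common_neighbors(adj, i, j):
    """Number of nodes adjacent to both i and j."""
    return len(adj[i] & adj[j])

def adamic_adar(adj, i, j):
    """Common neighbors weighted by 1 / log(degree).

    A common neighbor of two distinct nodes has degree >= 2, so log(degree) > 0.
    """
    return sum(1.0 / math.log(len(adj[z])) for z in adj[i] & adj[j])

def katz_truncated(adj, i, j, beta=0.05, max_len=4):
    """Sum over lengths 1..max_len of beta**length * (number of walks from i to j)."""
    walks = {v: 0 for v in adj}  # walks[v]: walks of the current length from i to v
    walks[i] = 1
    score = 0.0
    for length in range(1, max_len + 1):
        nxt = {v: 0 for v in adj}
        for u, count in walks.items():
            if count:
                for v in adj[u]:
                    nxt[v] += count
        walks = nxt
        score += (beta ** length) * walks[j]
    return score
```

In the attack setting of these slides, the attacker would compute such values on the randomized graph, since the original graph is not available to him.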
Proportion of true edges vs. similarity
After randomly adding/deleting 200 edges (441 edges in total)
Calculating Posterior belief
The attacker does not know this value. What can he do?
$p_1 = k/m$
$p_2 = k \big/ \left[\binom{n}{2} - m\right]$
Applying Bayes theorem
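The Bayes step can be illustrated with a short sketch. The formula itself did not survive in this transcript, so the code below rests on stated assumptions: p1 = k/m is the probability that a true edge is deleted, p2 = k / [C(n,2) - m] is the probability that a non-edge is added, rho_x is the probability that a node pair with similarity value x is a true link, and, given the true state of a pair, its observed state is independent of its similarity value. The function names are hypothetical.

```python
from math import comb

def flip_probabilities(n, m, k):
    """p1: chance a true edge is deleted; p2: chance a non-edge is added."""
    p1 = k / m
    p2 = k / (comb(n, 2) - m)
    return p1, p2

def posterior_link(rho_x, observed_edge, n, m, k):
    """P(a_ij = 1 | observed state, similarity = x) by Bayes' theorem.

    rho_x         -- assumed P(a_ij = 1 | similarity = x), used as the prior
    observed_edge -- True if (i, j) is an edge in the randomized graph
    """
    p1, p2 = flip_probabilities(n, m, k)
    if observed_edge:
        num = (1 - p1) * rho_x            # true edge that survived
        den = num + p2 * (1 - rho_x)      # or a false edge that was added
    else:
        num = p1 * rho_x                  # true edge that was deleted
        den = num + (1 - p2) * (1 - rho_x)  # or a non-edge left alone
    return num / den if den > 0 else 0.0
```

For the polbooks setting mentioned in the slides (n = 105, m = 441, k = 200), `flip_probabilities` gives p1 = 200/441 and p2 = 200/5019.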
MLE estimation
Estimate based on randomized graph
Posterior belief can be calculated by attackers
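The estimator is not shown in this transcript, so the following is only one way such an estimate could be obtained, under a stated assumption. Suppose each node pair with similarity value x is a true link with probability rho_x, a true edge is deleted with probability p1, and a non-edge is added with probability p2. Then a pair with similarity x appears as an edge in the randomized graph with probability q = (1 - p1) rho_x + p2 (1 - rho_x). The sample proportion of observed edges is the MLE of q, and solving for rho_x gives the estimate below. The function name is hypothetical.

```python
def estimate_rho(observed_fraction, p1, p2):
    """Estimate rho_x from the randomized graph alone.

    observed_fraction -- fraction of node pairs with similarity x that are
                         edges in the randomized graph
    p1, p2            -- deletion and addition probabilities
    Inverts q = (1 - p1) * rho + p2 * (1 - rho) and clips the result to [0, 1].
    """
    denom = 1.0 - p1 - p2
    if denom <= 0:
        raise ValueError("p1 + p2 must be below 1 for the inversion to be valid")
    rho = (observed_fraction - p2) / denom
    return min(1.0, max(0.0, rho))
```

Everything on the right-hand side is available to an attacker who knows n, m, and k, which is consistent with the slide's point that the posterior belief can be calculated by attackers.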
Comparison
Empirical Evaluation
Attacker’s Prediction Strategy
  Calculate posterior probability of all node pairs
  Choose top t node pairs (with highest posterior probability) as predicted candidate links
For each t, the precision of predictions ($k = 0.5m$)
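The prediction strategy can be sketched as a ranking followed by a precision count. This is a minimal sketch: `scores` is a hypothetical dictionary mapping node pairs to posterior probabilities, and precision is taken here as the fraction of the top t pairs that are true edges of the original graph.

```python
def precision_at_t(scores, true_edges, t):
    """Precision of the top-t node pairs ranked by posterior probability.

    scores     -- dict {(i, j): posterior probability} over candidate pairs
    true_edges -- iterable of (i, j) pairs from the original graph
    t          -- number of top-ranked pairs to predict as links
    """
    if t <= 0 or t > len(scores):
        raise ValueError("t must be between 1 and the number of scored pairs")
    # Pairs with the highest posterior come first.
    ranked = sorted(scores, key=scores.get, reverse=True)[:t]
    truth = {tuple(sorted(e)) for e in true_edges}
    hits = sum(1 for pair in ranked if tuple(sorted(pair)) in truth)
    return hits / t
```

Sweeping t over a range of values yields a precision curve of the kind the evaluation slides refer to.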
Empirical Evaluation
$k = 0.5m$
The posterior beliefs that exploit similarity measures achieve higher precision than those that do not.
A measure that is best for one data set is not necessarily best for another.
Determining k to guarantee privacy
Data Owner
Conclusion & Future Work
We have shown that node proximity measures can be exploited by attackers to breach link privacy in edge add/del randomized networks
How about other topological properties? $P(a_{ij} = 1 \mid \tilde{G})$? vs. $P(a_{ij} = 1 \mid \tilde{a}_{ij}, \tilde{m}_{ij} = x)$
How about other randomization strategies?
Privacy vs. utility tradeoff
Questions?
Acknowledgments
This work was supported in part by U.S. National Science Foundation IIS-0546027 and CNS-0831204.
Thank You!
Utility preserving randomization
Graph space: {G: with the given degree seq. & $S(G) \in R$}
Examining the proportion of sample graphs in which a link exists between node i and j (Ying&Wu, SDM09)
N samples: $G_1, G_2, \ldots, G_N \in \text{Space}$
Attacker’s confidence on link (i,j): $P(a_{ij} = 1 \mid \text{Space}) \approx \frac{1}{N} \sum_{k=1}^{N} G_k(i,j)$
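The sampling estimate of the attacker's confidence can be sketched as a simple proportion. This sketch assumes the N sample graphs have already been drawn from the graph space by some sampler that is not shown here, and that each sample is represented as a set of `frozenset({i, j})` edges. The function name is hypothetical.

```python
def link_confidence(samples, i, j):
    """Fraction of sample graphs that contain the link (i, j).

    samples -- non-empty list of graphs, each a set of frozenset({u, v}) edges
    """
    if not samples:
        raise ValueError("at least one sample graph is required")
    pair = frozenset((i, j))  # unordered, so (i, j) and (j, i) match
    hits = sum(1 for graph in samples if pair in graph)
    return hits / len(samples)
```

The quality of this estimate depends on how well the N samples cover the graph space, which the slides do not describe.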