New Approach to Quantification of Privacy on Social Network Sites
Tran Hong Ngoc
Isao Echizen
Kamiyama Komei
Hiroshi Yoshiura
VNU, Vietnam; NII, Japan; UEC, Japan; UEC, Japan
IEEE AINA 2010
Presenter: Yu-Song Syu
Social Network Sites
Growth of SNSs
Leads to an explosion in online information-sharing
With SNSs
People share information with friends
Information includes sensitive data
Location, age, career, …
Intruders in SNSs
By compiling statistics, intruders may obtain personal information, which can be used for:
Commercial purposes
Identity theft
Physical harm
…
How to get such information?
http://www.iis.sinica.edu.tw
Usually, people do not know how much private information they reveal about themselves and others
Privacy Metric
Based on probability and entropy
Helps users know how much private information may leak from their blog sentences
Defines the Leaked Privacy Value, Δ, as the amount of knowledge that intruders can learn about a “problem of interest”
Proposed System Model
Information retrieval techniques based on NLP methods
Quantification of Privacy
System Model
Find the information about someone
Prefecture, age, city, university, …
Blog sentences that users post
Event & Blog Set
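Event: $X^{(k)} = \{\, x \mid x \in U,\ 0 \le p(x) \le 1 \,\}$
Blog Set: $\tilde{X}_i^{(k)} = \{\, x \mid 0 \le p(x) \le 1,\ x \in \{x_1^{k}, x_2^{k}, \ldots, x_n^{k}\} \,\}$
Intersection: $\tilde{X}_i^{(k)} \cap \tilde{X}_j^{(k)} = \{\, x \mid x \in \tilde{X}_i^{(k)},\ x \in \tilde{X}_j^{(k)},\ 0 \le p(x) \le 1 \,\}$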
[Figure: Venn diagram of the Event with Blog Set i and Blog Set j]
Blog Set / Joint Blog Set
Assumed to never be empty
Example: Prefecture
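The relationship between an event, its blog sets, and the joint blog set can be sketched with plain sets. This is a minimal sketch, assuming a toy universe of five prefectures and two made-up blog sets; the names `event`, `blog_set_i`, `blog_set_j`, and `joint_blog_set` are hypothetical and do not come from the slides.

```python
# Hypothetical toy example: the event "Prefecture" with a reduced universe.
event = {"Tokyo", "Kanagawa", "Chiba", "Saitama", "Osaka"}

# Assumed blog sets: values still consistent with each posted sentence.
blog_set_i = {"Tokyo", "Kanagawa", "Chiba"}
blog_set_j = {"Tokyo", "Chiba", "Osaka"}


def joint_blog_set(event, *blog_sets):
    """Intersect the event with every blog set (the joint blog set)."""
    joint = set(event)
    for blog_set in blog_sets:
        joint &= blog_set
    # The slides assume this set is never empty, so treat emptiness as an error.
    if not joint:
        raise ValueError("joint blog set is empty")
    return joint


joint = joint_blog_set(event, blog_set_i, blog_set_j)
print(sorted(joint))  # ['Chiba', 'Tokyo']
```

Each additional blog sentence can only shrink the joint set, which is why more posts tend to narrow down the candidate values.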
Math Background
Entropy (Uncertainty)
Conditional Entropy
Joint Entropy
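For reference, the standard information-theoretic definitions behind these three terms are as follows. The notation (discrete random variables $X$ and $Y$ with probability mass function $p$, logarithm base 2) is assumed here rather than taken from the slides.

```latex
\begin{align}
H(X) &= -\sum_{x} p(x)\,\log_2 p(x) \\
H(Y \mid X) &= -\sum_{x}\sum_{y} p(x,y)\,\log_2 p(y \mid x) \\
H(X,Y) &= -\sum_{x}\sum_{y} p(x,y)\,\log_2 p(x,y) = H(X) + H(Y \mid X)
\end{align}
```

When $X$ and $Y$ are independent, $H(Y \mid X) = H(Y)$, so the joint entropy reduces to $H(X) + H(Y)$.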
Before Proposed Metric…
Event Possible Value
Why Use Entropy?
Idea: Difference of Uncertainty
Leaked Privacy
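The "difference of uncertainty" idea can be illustrated numerically. This is a minimal sketch, assuming uniform distributions and made-up candidate counts (8 before, 2 after); the function name `entropy` is illustrative.

```python
import math


def entropy(probs):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


# Eight equally likely candidates: maximal uncertainty for 8 outcomes.
before = entropy([1 / 8] * 8)
# Only two equally likely candidates remain: far less uncertainty.
after = entropy([1 / 2] * 2)

print(before, after, before - after)  # 3.0 1.0 2.0
```

The drop from 3 bits to 1 bit is the amount of knowledge gained by whoever narrowed the candidates down, which is the quantity the metric treats as leaked.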
Privacy Leakage Metric
Leaked Privacy Value: the privacy value before the sentences are posted minus the privacy value after they are posted
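$\Delta = H(\{X^{(k)}\}) - H(\{\tilde{X}^{(k)}\})$
Before: $H(\{X^{(k)}\}) = H(X^{(1)}, X^{(2)}, \ldots, X^{(m)})$
After: $H(\{\tilde{X}^{(k)}\}) = H(\tilde{X}^{(1)}, \tilde{X}^{(2)}, \ldots, \tilde{X}^{(m)})$
$m$: number of events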
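The before-minus-after computation can be sketched in a few lines. This is a sketch under stated assumptions, not the authors' implementation: events are treated as independent so that the joint entropy is a sum of per-event entropies, the "after" distribution is obtained by restricting the prior to the blog set and renormalizing, and the priors are made up. The names `restrict` and `leaked_privacy` are hypothetical.

```python
import math


def entropy(dist):
    """Shannon entropy in bits of a {value: probability} mapping."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


def restrict(dist, blog_set):
    """Keep only values in the blog set and renormalize (an assumption)."""
    kept = {x: p for x, p in dist.items() if x in blog_set}
    total = sum(kept.values())
    if total == 0:
        raise ValueError("blog set excludes every possible value")
    return {x: p / total for x, p in kept.items()}


def leaked_privacy(events, blog_sets):
    """Delta = H(before) - H(after), summing per-event entropies.

    Summing is valid only if the events are independent (an assumption).
    """
    before = sum(entropy(dist) for dist in events.values())
    after = sum(
        entropy(restrict(dist, blog_sets.get(name, dist.keys())))
        for name, dist in events.items()
    )
    return before - after


# Made-up priors, not the census figures used in the experiments.
events = {
    "prefecture": {"Tokyo": 0.25, "Kanagawa": 0.25, "Chiba": 0.25, "Osaka": 0.25},
    "age": {"20s": 0.5, "30s": 0.5},
}
blog_sets = {"prefecture": {"Tokyo", "Chiba"}}  # age is not mentioned

delta = leaked_privacy(events, blog_sets)
print(delta)  # 1.0
```

Here the posts halve the prefecture candidates and say nothing about age, so exactly one bit is leaked. Correlated events would need a true joint distribution instead of the per-event sum.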
Experiments
Dataset: Statistical Survey Department, Statistics Bureau, Ministry of Internal Affairs and Communications
Problem of Interest: gaining information about a victim of an accident that happened in Japan’s subway and was discussed by SNS users
Experiments - Prefecture
Experiments - Age
[Figure axis labels: Age, Prefecture]
Experiments - Total Leaked Privacy
Total Leaked Privacy Before & After Blogging
Conclusions
Proposed a new metric to quantify how much private information is leaked from blogs on SNSs
SNS users can see whether a post carelessly exposes private information
Based on probability and entropy, the proposal is simpler than others but effective, as shown in the experiments