unFriendly: Multi-Party Privacy Risks in Social Networks
Kurt Thomas, Chris Grier, David M. Nicol
2
Problem
• Social networks propelled by personal content
  – Upload stories, photos; disclose relationships
  – Access control limited to owners
• Content can reference multiple parties
  – Distinct privacy requirements for each party
  – Currently, only one policy enforced
• Friends, family inadvertently leak sensitive information
3
Consequences
• One photo or message leaked may be harmless...
  – Aggregate stories, friends, photos form a composite
• Can infer personal data from these public references
  – Weighted by perceived importance of relationships
• In practice, can predict personal attributes with up to 83% accuracy
  – Directly tied to amount, richness of exposed data
  – Independent of existing privacy controls
4
Solution
• Adapt privacy controls:
  – Grant users control over all personal references, regardless of where they appear
  – Includes tags, mentions, links
  – Allow users to specify global privacy settings
• Prototype solution as a Facebook application
  – Satisfies privacy requirements of all users referenced
  – Determines mutually acceptable audience; denies access to everyone else
5
Overview
• Existing privacy controls
• Sources of conflicting requirements
• Inferring personal details from leaks
• Inference performance
• Devising a solution
• Conclusion
6
Existing Controls
[Figure: existing privacy settings. Audience options: Everyone, Friends of Friends, Only Friends. Content types: Friend List, Wall Posts, Personal Details, Photos and Videos.]
7
Privacy Conflict
• Social networks recognize only one owner
  – But data can pertain to multiple users
  – Each user has potentially distinct privacy requirements
• Privacy Conflict:
  – When two or more users disagree on data’s audience
  – Results in data exposed against a user’s will
8
Privacy Conflict – Friendships
• Privacy Requirement: Hide sensitive relationships
• Privacy Conflict: Alice reveals her friends
• Link between Alice-Bob revealed by Alice
9
Privacy Conflict – Wall Posts
• Privacy Requirement: Control audience of post
• Privacy Conflict: Anything posted to Alice’s wall is public
• Content written by Bob exposed by Alice
Bob > Alice: Just broke up with Carol..
10
Privacy Conflict – Tagging
• Privacy Requirement: Hide sensitive posts
• Privacy Conflict: Alice shares her posts
• Details about Bob exposed by Alice
Alice: Skipping work with @Bob!
11
Aggregating Leaked Data
• Threat model:
  – Adversary crawls entire social network
  – Collects all public references to a user: messages, friendships, tagged content
  – Feasible for search engines, marketers, political groups
• Exposure Set
  – All public information in conflict with a user’s privacy requirement
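The exposure-set definition above can be illustrated with a minimal Python sketch. The `Reference` record, its field names, and the `wants_private` parameter are hypothetical choices made for illustration; the slides do not specify a data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reference:
    """One piece of content that refers to other users (hypothetical model)."""
    owner: str            # user whose page hosts the content
    kind: str             # "friendship", "wall_post", or "tag"
    mentions: frozenset   # users the content refers to
    public: bool          # True if the owner's policy exposes it to everyone


def exposure_set(user, references, wants_private):
    """Return public references to `user` that conflict with the user's wishes.

    `wants_private` is the set of content kinds the user wants hidden.
    Content the user owns is excluded, since the user's own policy governs it.
    """
    return [
        ref for ref in references
        if ref.public
        and user in ref.mentions
        and ref.owner != user
        and ref.kind in wants_private
    ]
```

For example, if Bob wants his friendships and tags hidden, a public friend link and a public tagged post on Alice's page both land in Bob's exposure set, while a post on a private wall does not.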
12
Inferring Personal Details
• Given exposure set, analyze whether leaks create an accurate composite of user
• Attempt to predict 8 values from exposure set:
  – Personal: Gender, religion, political view, relationship status
  – Media: Favorite books, TV shows, movies, music
• Compare predictions to scenario where no privacy conflict exists
13
Inference Approaches
• Friendships:
  – Base predictions on attributes of friends
  – Users with liberal, Catholic friends who like Twilight tend to be…
  – Weight relationships on perceived importance; distinguish strong friends from acquaintances
    • Frequency of communication
    • Mutual friends; community
  – Feed vector of attributes, weights into multinomial logistic regression
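A rough sketch of how such a weighted attribute vector might be built is shown below. The `tie_strength` formula is an assumption made purely for illustration (the slides name the signals but not how they are combined), and the classifier itself is not shown.

```python
from collections import defaultdict


def tie_strength(messages, mutual_friends):
    """Hypothetical weight: grows with communication and shared friends."""
    return 1.0 + messages + mutual_friends


def friend_feature_vector(friends, vocabulary):
    """Build a normalized, weighted histogram of friends' attribute values.

    friends: list of dicts with keys "attributes" (set of str),
             "messages" (int), and "mutual_friends" (int).
    vocabulary: ordered list of attribute values fixing the vector positions.
    """
    totals = defaultdict(float)
    weight_sum = 0.0
    for friend in friends:
        weight = tie_strength(friend["messages"], friend["mutual_friends"])
        weight_sum += weight
        for attribute in friend["attributes"]:
            totals[attribute] += weight
    if weight_sum == 0:
        # No exposed friends: nothing to base a prediction on.
        return [0.0] * len(vocabulary)
    return [totals[value] / weight_sum for value in vocabulary]
```

A close friend who exchanges many messages with the user then contributes more to the vector than an acquaintance, which is the effect the weighting bullet describes.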
14
Inference Approaches
• Wall Content:
  – Base prediction on content written by private user, posted to public walls
  – A user who talks about sports, girlfriends, and cars tends to be …
  – Treat content as bag of words, weight terms based on TF-IDF
  – Feed vector of words into multinomial logistic regression
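The bag-of-words step can be sketched in plain Python as follows. This uses one common TF-IDF variant (term frequency normalized by document length, times log(N / document frequency)); the slides do not say which variant or tokenizer was used, so both are assumptions here.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def tfidf_vectors(documents):
    """Turn each user's exposed wall text into a TF-IDF vector.

    documents: list of str, one concatenated string per user.
    Returns (vocabulary, vectors) with vectors aligned to the vocabulary.
    """
    token_lists = [tokenize(doc) for doc in documents]
    doc_freq = Counter()
    for tokens in token_lists:
        doc_freq.update(set(tokens))   # count each term once per document
    vocabulary = sorted(doc_freq)
    n_docs = len(documents)
    vectors = []
    for tokens in token_lists:
        counts = Counter(tokens)
        total = len(tokens) or 1       # avoid division by zero on empty text
        vectors.append([
            (counts[term] / total) * math.log(n_docs / doc_freq[term])
            for term in vocabulary
        ])
    return vocabulary, vectors
```

Terms that appear in every user's posts receive a weight of zero under this variant, so only distinguishing vocabulary contributes to the vector.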
15
Experiment Setup
• Analyze inference accuracy on 80,000 Facebook profiles
  – Roughly 40,000 profiles from each of 2 distinct networks
  – Collect all references to a user appearing in public profiles, walls, friend lists
• Simulate private profiles
  – Use values reported in public profile as ground truth
  – Compare prediction against ground truth
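The comparison against ground truth amounts to an accuracy computation, sketched below. The `majority_baseline` helper is a hypothetical stand-in for a guess made without any leaked data; the slides do not define how their baseline was computed.

```python
from collections import Counter


def accuracy(predictions, ground_truth):
    """Fraction of profiles whose predicted value matches the reported value."""
    if not ground_truth:
        raise ValueError("ground_truth must not be empty")
    correct = sum(
        1 for user, truth in ground_truth.items()
        if predictions.get(user) == truth
    )
    return correct / len(ground_truth)


def majority_baseline(ground_truth):
    """Hypothetical baseline: guess the most common value for every user."""
    most_common_value, _ = Counter(ground_truth.values()).most_common(1)[0]
    return {user: most_common_value for user in ground_truth}
```

Comparing `accuracy(model_predictions, truth)` with `accuracy(majority_baseline(truth), truth)` gives a rough sense of how much the exposed data adds over a blind guess.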
16
Frequency Data is Exposed
Statistic                                            Network A   Network B
Profiles in data set                                 42,796      40,544
Fraction of profiles public                          44%         35%
Avg. # relationships per profile in exposure set     42          23
Avg. # wall posts per profile in exposure set        53          43
17
Prediction Accuracy
[Figure: bar chart of prediction accuracy (axis from 0 to 100) for Gender, Political View, Religion, Relationship, Music, Movies, TV Shows, and Books. Series: Baseline, Friends, Wall Content.]
18
More Conflicts, Better Accuracy
19
Improving Privacy
• Privacy must extend beyond single-owner model
  – Tags, links, mentions can reference multiple users
  – Rely on these existing features to distinguish who is at risk
• Allow each user to specify global privacy policy
• Enforce policy on all personal content, regardless of the page it appears on
20
Enforcing Multi-Party Privacy
Alice: Looks like @Bob and @Carol are done for!
[Figure: individual policies of Alice, Bob, and Carol over viewers U1 to U6, combined into a mutual policy.]
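One plausible reading of the mutual policy is the intersection of the viewers each referenced user allows. The sketch below assumes each individual policy is simply a set of permitted viewer IDs; the function name and the example values are made up for illustration, not taken from the prototype.

```python
def mutual_audience(individual_policies):
    """Intersect the allowed-viewer sets of every referenced user.

    individual_policies: dict mapping each referenced user to the set of
    viewers that user permits. A viewer sees the content only if every
    referenced user permits it.
    """
    policies = list(individual_policies.values())
    if not policies:
        return set()
    allowed = set(policies[0])
    for policy in policies[1:]:
        allowed &= set(policy)
    return allowed


# Illustrative values only; the original figure's contents are not recoverable.
example = {
    "alice": {"U1", "U2", "U3", "U4"},
    "bob": {"U2", "U3", "U5"},
    "carol": {"U2", "U3", "U4", "U6"},
}
print(sorted(mutual_audience(example)))
```

When the referenced users share no permitted viewers, the intersection is empty and nobody can see the content.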
21
Limitations
• In absence of mutual friends, safe set of viewers tends towards empty set
• Assume friends will consent to not sharing with wider audience
• Content must be tagged; no other way to distinguish privacy-affected parties
• Censorship; prevents negative speech
22
Conclusion
• Privacy goes beyond one person’s expectations
  – All parties affected must have a say
  – Existing model lacks multi-party support
• References to other users are common
  – Outside their control
• Aggregate exposed data contains sensitive features
  – Predictions will only get better
• By adopting multi-party privacy, can return control to users
23
Questions?
24
Correlated Features Among Friends
25
Importance of Mutual Friends
26
Importance of Frequent Communication