
Personalized Social Recommendations – Accurate or Private?

A. Machanavajjhala (Yahoo!),

with A. Korolova (Stanford), A. Das Sarma (Google)

1

Social Advertising

• Armani
• Gucci
• Prada

Recommend ads based on private shopping histories of "friends" in the social network.

2

Alice Betty

• Nikon
• HP
• Nike

3

Social Advertising … in the real world

A product that is followed by your friends …

Items (products/people) liked by Alice’s friends are better recommendations for Alice

Social Advertising … privacy problem

4

The fact that "Betty" liked "VistaPrint" is leaked to "Alice".

Alice  Betty

Only the items (products/people) liked by Alice's friends are recommended to Alice.

Social Advertising … privacy problem

5

Alice  Betty

Recommending irrelevant items sometimes improves privacy, but reduces accuracy.

6

Social Advertising Privacy problem

Alice Betty

Alice is recommended ‘X’

Can we provide accurate recommendations to Alice based on the social network, while ensuring that Alice cannot deduce that Betty likes 'X'?

Outline of this talk

• Formal social recommendations problem
  – Privacy for social recommendations
  – Accuracy of social recommendations
  – Example private algorithm and its accuracy

• Privacy-Accuracy trade-off
  – Properties satisfied by a general algorithm
  – Theoretical bound

7

Social Recommendations

• A set of agents
  – Yahoo/Facebook users, medical patients

• A set of recommended items
  – Other users (friends), advertisements, products (drugs)

• A network of edges connecting the agents, items
  – Social network, patient-doctor and patient-drug history

• Problem:
  – Recommend a new item i to agent a based on the network

8

Social Recommendations (this talk)

• A set of agents
  – Yahoo/Facebook users, medical patients

• A set of recommended items
  – Other users (friends), advertisements, products (drugs)

• A network of edges connecting the agents, items
  – Social network, patient-doctor and patient-drug history

• Problem:
  – Recommend a new friend i to target user a based on the social network

9

Social Recommendations

10

[Figure: target node a and candidate recommendations i1, i2, i3 with utilities u(a, i1), u(a, i2), u(a, i3)]

Utility Function – u(a, i): utility of recommending candidate i to target a

Examples [Liben-Nowell et al. 2003]:
• # of Common Neighbors
• # of Weighted Paths
• Personalized Page Rank

Non-Private Recommendation Algorithm

11

[Figure: target node a and candidates i1, i2, i3 with utilities u(a, i1), u(a, i2), u(a, i3)]

Utility Function – u(a, i): utility of recommending candidate i to target a

Algorithm

For each target node a
    For each candidate i
        Compute p(a, i) that maximizes Σ u(a, i)·p(a, i)
    endfor
    Randomly pick one of the candidates with probability p(a, i)
endfor
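As a concrete illustration of this selection step, here is a minimal Python sketch (illustrative only; the function names and the way the utility function is passed in are assumptions, not from the talk). Since the expected utility Σ u(a, i)·p(a, i) is maximized by placing all probability mass on the best candidate, the non-private recommender reduces to an argmax.

def recommend_non_private(target, candidates, utility):
    # Non-private selection: the distribution p(a, i) that maximizes expected
    # utility is a point mass on the highest-utility candidate.
    return max(candidates, key=lambda i: utility(target, i))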

Example: Common Neighbors Utility

12

Utility Function – u(a, i): utility of recommending candidate i to target a

Common Neighbors Utility: "Alice and Bob are likely to be friends if they have many common neighbors"

[Figure: target node a and candidate nodes i1, i2, i3]

u(a, i1) = f(2), u(a, i2) = f(3), u(a, i3) = f(1)

Non-Private Algorithm
• Return the candidate with max u(a, i)
• Randomly pick a candidate with probability proportional to u(a, i)
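A minimal sketch of the common-neighbors utility, assuming the graph is stored as a dictionary mapping each node to its set of neighbors (this representation, and taking f to be the identity, are illustrative assumptions):

def common_neighbors_utility(graph, a, i):
    # u(a, i) = f(number of common neighbors of a and i); f taken as identity here.
    return len(graph[a] & graph[i])

Plugging this utility into the non-private recommender sketched above reproduces the example: the candidate sharing the most neighbors with a (here i2) is returned.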

Outline of this talk

• Formal social recommendations problem
  – Privacy for social recommendations
  – Accuracy of social recommendations
  – Example private algorithm and its accuracy

• Privacy-Accuracy trade-off
  – Properties satisfied by a general algorithm
  – Theoretical bound

13

Differential Privacy

For every pair of inputs D1, D2 that differ in one value, and for every output O, an adversary should not be able to distinguish between D1 and D2 based on O:

log ( Pr[D1 → O] / Pr[D2 → O] ) < ε

[Dwork 2006]

Privacy for Social Recommendations

• Sensitive information: a recommendation should not disclose the existence of an edge between two nodes.

15

log ( Pr[ recommending (i, a) | G1 ] / Pr[ recommending (i, a) | G2 ] ) < ε

[Figure: graphs G1 and G2 that differ only in the single edge (a, i)]

Outline of this talk

• Formal social recommendations problem
  – Privacy for social recommendations
  – Accuracy of social recommendations
  – Example private algorithm and its accuracy

• Privacy-Accuracy trade-off
  – Properties satisfied by a general algorithm
  – Theoretical bound

16

Measuring loss in utility due to privacy

• Suppose algorithm A recommends node i of utility u_i with probability p_i.

• Accuracy of A is defined as a comparison with the utility of the non-private algorithm.
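The accuracy formula itself is not spelled out above; one plausible formalization of this comparison (an assumption for concreteness, not a quote from the slide) is the expected utility achieved by A, normalized by the utility of the non-private max-utility choice:

accuracy(A) = ( Σ_i p_i · u_i ) / ( max_i u_i )

Under this reading, accuracy 1 means A matches the non-private algorithm, and accuracy near 0 means the recommendations are essentially useless.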

17

Outline of this talk

• Formal social recommendations problem
  – Privacy for social recommendations
  – Accuracy of social recommendations
  – Example private algorithm and its accuracy

• Privacy-Accuracy trade-off
  – Properties satisfied by a general algorithm
  – Theoretical bound

18

Algorithms for Differential Privacy

Theorem: No deterministic algorithm guarantees differential privacy.

• Exponential Mechanism
  – Sample the output space based on a distance metric.

• Laplace Mechanism
  – Add noise from a Laplace distribution to query answers.
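For the second bullet, a minimal Laplace-mechanism sketch in Python (illustrative only; uses NumPy's Laplace sampler, and the function and parameter names are assumptions):

import numpy as np

def laplace_mechanism(true_answer, sensitivity, epsilon):
    # Add Laplace(sensitivity / epsilon) noise to a numeric query answer,
    # where sensitivity is the maximum change in the answer caused by
    # changing a single edge.
    return true_answer + np.random.laplace(loc=0.0, scale=sensitivity / epsilon)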

19

Privacy Preserving Recommendations

Must pick a node with non-zero probability even if u = 0

20

Exponential Mechanism [McSherry and Talwar 2007]

Randomly pick a candidate with probability proportional to exp( ε∙u(a,i) / Δ )

(Δ is the maximum change in utilities from changing one edge)

[Figure: target node a and candidates i1, i2, i3 with utilities u(a, i1), u(a, i2), u(a, i3)]

Satisfies ε-differential privacy
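A minimal Python sketch of this mechanism for the recommendation setting (illustrative only; assumes the utilities u(a, i) have already been computed into a dictionary, and the names are not from the talk):

import math
import random

def recommend_exponential_mechanism(candidates, utilities, epsilon, delta):
    # Pick candidate i with probability proportional to exp(epsilon * u(a, i) / delta),
    # where delta is the maximum change in any utility from changing one edge.
    u_max = max(utilities[i] for i in candidates)
    # Shifting by u_max only rescales all weights by the same factor, so the
    # sampling distribution is unchanged; it just avoids overflow in exp().
    weights = [math.exp(epsilon * (utilities[i] - u_max) / delta) for i in candidates]
    return random.choices(candidates, weights=weights, k=1)[0]

Note that even candidates with u = 0 receive non-zero probability, as required on the previous slide.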

Accuracy of Exponential Mechanism + Common Neighbors Utility

21

[Figure: per-node accuracy distribution on the WikiVote Network (ε = 0.5); x-axis: accuracy (0.0 to 1.0); y-axis: % of nodes receiving recommendations of that accuracy]

60% of users have accuracy < 10%

Accuracy of Exponential Mechanism + Common Neighbors Utility

22

[Figure: per-node accuracy distribution on a Twitter sample (ε = 1); x-axis: accuracy (0.0 to 1.0); y-axis: % of nodes receiving recommendations of that accuracy]

98% of users have accuracy < 5%

Can we do better?

• Maybe common-neighbors utility is an especially non-private utility …
  – Consider general utility functions that follow intuitive axioms

• Maybe the Exponential Mechanism algorithm does not guarantee sufficient accuracy …
  – Consider any algorithm that satisfies differential privacy

23

Outline of this talk

• Formal social recommendations problem
  – Privacy for social recommendations
  – Accuracy of social recommendations
  – Example private algorithm and its accuracy

• Privacy-Accuracy trade-off
  – Properties satisfied by a general algorithm
  – Theoretical bound

24

Axioms on Utility Functions

25

[Figure: target node a and candidates i1, i2, i3, i4, where i3 and i4 are identical with respect to a]

Exchangeability: i3 and i4 are identical with respect to 'a'; hence, u(a, i3) = u(a, i4).

Axioms on Utility Functions

26

Concentration: "Most of the utility of recommendation to a target is concentrated on a small number of candidates."

Outline of this talk

• Formal social recommendations problem
  – Privacy for social recommendations
  – Accuracy of social recommendations
  – Example private algorithm and its accuracy

• Privacy-Accuracy trade-off
  – Properties satisfied by a general algorithm
  – Theoretical bound

27

Accuracy-Privacy Tradeoff

28

Common Neighbors & Weighted Paths Utility*: To achieve constant accuracy for target node a,

ε > Ω(log n / degree(a))

* under some mild assumptions on the weighted paths utility …
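To get a rough feel for what this bound implies (illustrative arithmetic only, ignoring the hidden constant and using natural logarithms): in a network of n = 10^6 nodes, a target node of degree 10 would need roughly

ε > log n / degree(a) = ln(10^6) / 10 ≈ 13.8 / 10 ≈ 1.4

to achieve constant accuracy, which is already a weak privacy guarantee; only well-connected targets can hope for both accuracy and a small ε.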

Implications of Accuracy-Privacy Tradeoff

29

[Figure: per-node accuracy distribution on the WikiVote Network (ε = 0.5), Exponential Mechanism vs. theoretical bound; x-axis: accuracy (0.0 to 1.0); y-axis: % of nodes receiving recommendations of that accuracy]

60% of users have accuracy < 55%

Implications of Accuracy-Privacy Tradeoff

30

[Figure: per-node accuracy distribution on a Twitter sample (ε = 1), Exponential Mechanism vs. theoretical bound; x-axis: accuracy (0.0 to 1.0); y-axis: % of nodes receiving recommendations of that accuracy]

95% of users have accuracy < 5%

Takeaway …

• "For the majority of the nodes in the network, recommendations must either be inaccurate or violate differential privacy!"

  – Maybe this is a "bad idea"

  – Or, maybe differential privacy is too strong a privacy definition to shoot for.

31

Intuition behind main result

32

Intuition behind main result

33

[Figure: graphs G1 and G2 over target a and candidates i, j; G1 and G2 differ only in the single edge (a, i). Labels: u1(a, i), p1(a, i) and u1(a, j), p1(a, j) in G1; u2(a, i), p2(a, i) and u2(a, j), p2(a, j) in G2.]

p1(a,i) / p2(a,i) < e^ε

Intuition behind main result

34

[Figure: graphs G1, G2, and G3 over target a and candidates i, j; G2 and G3 each differ from G1 in a single edge]

p1(a,i) / p2(a,i) < e^ε

p3(a,j) / p1(a,j) < e^ε

Using Exchangeability

35

[Figure: graphs G2 and G3 over target a and candidates i, j]

p1(a,i) / p2(a,i) < e^ε

p3(a,j) / p1(a,j) < e^ε

G3 is isomorphic to G2 (exchanging i and j).

u2(a,i) = u3(a,j) implies p2(a,i) = p3(a,j)

Using Exchangeability

36

p1(a,i) / p1(a,j) < e^(2ε)

G3 is isomorphic to G2, so u2(a,i) = u3(a,j), which implies p2(a,i) = p3(a,j).

Using Exchangeability

• In general, suppose any node i can be "transformed" into node j in t edge changes.
• Then:

37

p1(a,i) / p1(a,j) < e^(tε)

i.e., the probability of recommending the highest-utility node is at most e^(tε) times the probability of recommending the worst-utility node.

Final Act: Using Concentration

• Few nodes have high utility for target a
  – 10s of nodes share a common neighbor with a

• Many nodes have low utility for target a
  – Millions of nodes don't share a common neighbor with a

• Thus, there exist i and j such that

38

Ω(n) = p1(a,i) / p1(a,j) < e^(tε)
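Putting the pieces together (a brief reconstruction of the argument; for the common-neighbors utility, t, the number of edge changes needed to exchange i and j, is roughly bounded by the degree of a):

Ω(n) = p1(a,i) / p1(a,j) < e^(tε)   ⇒   tε > Ω(log n)   ⇒   ε > Ω(log n / t) ≈ Ω(log n / degree(a))

which is exactly the accuracy-privacy tradeoff stated earlier.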

Summary of Social Recommendations

• Question: "Can social recommendations be made while guaranteeing strong privacy conditions?"
  – General utility functions satisfying natural axioms
  – Any algorithm satisfying differential privacy

• Answer: "For the majority of nodes in the network, recommendations must either be inaccurate or violate differential privacy!"
  – Maybe this is a "bad idea"
  – Or, maybe differential privacy is too strong a privacy definition to shoot for.

39

Summary of Social Recommendations

• Answer: "For the majority of nodes in the network, recommendations must either be inaccurate or violate differential privacy!"
  – Maybe this is a "bad idea"
  – Or, maybe differential privacy is too strong a privacy definition to shoot for.

• Open Question: "What is the minimum amount of personal information that a user must be willing to disclose in order to get personalized recommendations?"

40

Thank you

41