CS208: Applied Privacy for Data Science Membership & Other Attacks (cont.) And Introduction to Differential Privacy James Honaker & Salil Vadhan School of Engineering & Applied Sciences Harvard University February 15, 2019
Page 1: CS208: Applied Privacy for Data Science Membership & Other ...people.seas.harvard.edu/~salil/cs208/spring19/DP-foundations1-lecture.pdf · [slide based on one from Adam Smith] Reconstruction

CS208: Applied Privacy for Data Science Membership & Other Attacks (cont.)

And Introduction to Differential Privacy

James Honaker & Salil Vadhan School of Engineering & Applied Sciences

Harvard University

February 15, 2019

Page 2:

Recap: Membership Attacks

[Figure: a data set X of $n$ people, shown as rows of bits, is passed to a mechanism (stats, ML model, …). Alice’s data is either one of the rows of X (“In”) or a draw from the population (“Out”). The attacker sees the mechanism’s output, Alice’s data, and auxiliary information (aux) about the population, and outputs “In” or “Out”.]

Attacker gets:
• Access to mechanism outputs
• Alice’s data
• (Possibly) auxiliary info about population

Then decides: if Alice is in the dataset X

[slide based on one from Adam Smith]

Page 3:

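Attacks on Aggregate Stats
• What error $\alpha$ makes sense?
  – Estimation error due to sampling $\approx 1/\sqrt{n}$
  – Reconstruction attacks require $\alpha \lesssim 1/\sqrt{n}$, $d \ge n$
  – Membership attacks: $\alpha \lesssim \sqrt{d}/n$
• Lessons
  – “Too many, too accurate” statistics reveal individual data
  – “Aggregate” is hard to pin down

[Figure: an axis of distortion $\alpha$ with marks at $1/\sqrt{n}$ and $\sqrt{d}/n$. Reconstruction attacks apply up to $1/\sqrt{n}$, which is also the scale of the sampling error; membership attacks apply up to $\sqrt{d}/n$.]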

[slide based on one from Adam Smith]
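To get a feel for the thresholds on this slide, the sketch below evaluates the three error scales for one choice of $n$ and $d$. The function name and the example values are illustrative assumptions, and the constants hidden by $\approx$ and $\lesssim$ are ignored.

```python
import math

def error_scales(n: int, d: int) -> dict:
    """Error scales from the slide, ignoring hidden constants.

    n: number of people in the dataset; d: number of released statistics.
    """
    return {
        # Estimation error due to sampling is about 1/sqrt(n).
        "sampling_error": 1 / math.sqrt(n),
        # Reconstruction attacks need error below about 1/sqrt(n) (and d >= n).
        "reconstruction_threshold": 1 / math.sqrt(n),
        # Membership attacks need error below about sqrt(d)/n.
        "membership_threshold": math.sqrt(d) / n,
    }

# Hypothetical example values, not taken from the lecture.
scales = error_scales(n=10_000, d=1_000_000)
for name, value in scales.items():
    print(f"{name}: {value:.4f}")
```

Whenever $d > n$, the membership threshold $\sqrt{d}/n$ exceeds $1/\sqrt{n}$, so membership attacks can tolerate distortion larger than the sampling error.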

Page 4:

Reconstruction vs. Membership
• Reconstruction Attack ⇒ Membership Attack
  – Take sensitive bit = 1 iff in dataset.
  – Use form of reconstruction attack that only requires knowing identifier for person being attacked (PS1 bonus).
  – Reconstruction failure probability bounds false positive and false negative probabilities.
• Membership Attack ⇒ Reconstruction Attack
  – Test membership in sub-datasets where sensitive bit is 0, and where sensitive bit is 1.
  – Pr[reconstruct correctly] ≈ true positive prob.
  – Pr[reconstruct incorrectly] ≈ false positive prob.
  – Reconstruction fails (⊥) if both tests say “OUT”.
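The second reduction can be sketched as a small wrapper around two membership tests. Everything named here is hypothetical: `says_in_bit0` and `says_in_bit1` stand in for membership attacks run against the sub-datasets with sensitive bit 0 and 1, and returning ⊥ when both tests say “IN” is an assumption, since the slide only specifies the case where both say “OUT”.

```python
from typing import Callable, Optional, Sequence

Row = Sequence[int]

def reconstruct_sensitive_bit(
    target_known: Row,
    says_in_bit0: Callable[[Row], bool],
    says_in_bit1: Callable[[Row], bool],
) -> Optional[int]:
    """Guess the target's sensitive bit from two membership tests.

    says_in_bit0 / says_in_bit1: hypothetical membership attacks against the
    sub-dataset of rows whose sensitive bit is 0 / 1.
    Returns 0 or 1, or None (standing for the failure symbol) if undecided.
    """
    in0 = says_in_bit0(target_known)
    in1 = says_in_bit1(target_known)
    if in1 and not in0:
        return 1
    if in0 and not in1:
        return 0
    # Both "OUT": failure, as on the slide. Both "IN": treated as failure (assumption).
    return None

# Stub membership tests, for illustration only.
always_in = lambda row: True
always_out = lambda row: False
print(reconstruct_sensitive_bit([0, 1, 1], always_out, always_in))
print(reconstruct_sensitive_bit([0, 1, 1], always_out, always_out))
```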

Page 5:

Membership Attacks on ML as a Service

[Shokri et al. 2017] Switch to slides from Reza Shokri’s talk

Page 6:

Another Attack on ML? [Fredrikson et al. ’14, cf. McSherry ’16]

[Figure: a data set X of $n$ people, shown as rows of bits, is passed to a mechanism (stats, ML model, …). The attacker sees the mechanism’s output, information about the population, and Alice’s (known) data, in which one bit is left unknown.]

Difference from reconstruction attacks:
• Above attack works even if Alice not in dataset. Based on correlation between known & sensitive attributes.
• Reconstruction attacks work even when sensitive bit uncorrelated.

Page 7:

“Five Views” Responses to Membership Attacks on GWAS

Some points raised:
• Limiting access to credentialed researchers
• Informed consent
• Privacy vs. utility
• Individual vs. group privacy
• Making reidentification illegal
• Maintaining trust and participation

Page 8:

Goals of Differential Privacy
• Utility: enable “statistical analysis” of datasets
  – e.g. inference about population, ML training, useful descriptive statistics
• Privacy: protect individual-level data
  – against “all” attack strategies, auxiliary info.

Q: Can it help with privacy in microtargeted advertising? [Korolova attacks]
  – inference from impressions?
  – inference from clicks?
  – displaying intrusive ads?

Page 9:

Differential privacy

[Dinur-Nissim ’03+Dwork, Dwork-Nissim ’04, Blum-Dwork-McSherry-Nissim ’05, Dwork-McSherry-Nissim-Smith ’06]

[Figure: the dataset below sits behind a mechanism (boxes labeled C and M); data analysts send queries q1, q2, q3 and receive answers a1, a2, a3.]

Sex  Blood  ⋯  HIV?
F    B      ⋯  Y
M    A      ⋯  N
M    O      ⋯  N
M    O      ⋯  Y
F    A      ⋯  N
M    B      ⋯  Y

Requirement: effect of each individual should be “hidden”

Page 10:

Differential privacy [Dinur-Nissim ’03+Dwork, Dwork-Nissim ’04, Blum-Dwork-McSherry-Nissim ’05, Dwork-McSherry-Nissim-Smith ’06]

[Figure: the same dataset table (Sex, Blood, ⋯, HIV?) and mechanism (boxes C and M) as on the previous slide, answering queries q1, q2, q3 with a1, a2, a3; the data analysts are replaced by an adversary.]

Page 11:

Differential privacy [Dinur-Nissim ’03+Dwork, Dwork-Nissim ’04, Blum-Dwork-McSherry-Nissim ’05, Dwork-McSherry-Nissim-Smith ’06]

[Figure: the same dataset table (Sex, Blood, ⋯, HIV?), mechanism (boxes C and M), and adversary as on the previous slide, with queries q1, q2, q3 and answers a1, a2, a3.]

Requirement: an adversary shouldn’t be able to tell if any one person’s data were changed arbitrarily

Page 12:

Differential privacy [Dinur-Nissim ’03+Dwork, Dwork-Nissim ’04, Blum-Dwork-McSherry-Nissim ’05, Dwork-McSherry-Nissim-Smith ’06]

[Figure: the same mechanism (boxes C and M) and adversary, with queries q1, q2, q3 and answers a1, a2, a3; the second row of the dataset table (M, A, ⋯, N) has been removed, leaving five rows.]

Requirement: an adversary shouldn’t be able to tell if any one person’s data were changed arbitrarily

Page 13:

Differential privacy [Dinur-Nissim ’03+Dwork, Dwork-Nissim ’04, Blum-Dwork-McSherry-Nissim ’05, Dwork-McSherry-Nissim-Smith ’06]

[Figure: the same mechanism (boxes C and M) and adversary, with queries q1, q2, q3 and answers a1, a2, a3; the second row of the dataset table now reads (F, A, ⋯, Y) in place of (M, A, ⋯, N).]

Requirement: an adversary shouldn’t be able to tell if any one person’s data were changed arbitrarily

Page 14:

Simple approach: random noise

[Figure: a dataset table (Sex, Blood, ⋯, HIV?) with $n$ rows sits behind boxes C and M. Query: “What fraction of people are type B and HIV positive?” Response: Answer + Noise($O(1/n)$).]

Error $\to 0$ as $n \to \infty$

• Very little noise needed to hide each person as $n \to \infty$.
• Note: this is just for one query
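The reason noise of order $1/n$ is enough for this query is that changing one person's row moves the answer by at most $1/n$. A minimal sketch using only the six rows visible on the slide (the columns hidden behind ⋯ are omitted, and the modified row is a hypothetical choice made for illustration):

```python
# Visible rows of the slide's table as (sex, blood type, HIV status).
rows = [
    ("F", "B", "Y"),
    ("M", "A", "N"),
    ("M", "O", "N"),
    ("M", "O", "Y"),
    ("F", "A", "N"),
    ("M", "B", "Y"),
]

def fraction_b_and_positive(data):
    """Fraction of rows with blood type B and HIV status Y."""
    hits = sum(1 for _, blood, hiv in data if blood == "B" and hiv == "Y")
    return hits / len(data)

# Hypothetical neighbor: change only the second row so that it matches the query.
neighbor = list(rows)
neighbor[1] = ("M", "B", "Y")

n = len(rows)
change = abs(fraction_b_and_positive(rows) - fraction_b_and_positive(neighbor))
print(f"n = {n}, change in answer = {change:.4f}, 1/n = {1 / n:.4f}")
```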

Page 15:

DP for one query/release [Dinur-Nissim ’03+Dwork, Dwork-Nissim ’04, Blum-Dwork-McSherry-Nissim ’05, Dwork-McSherry-Nissim-Smith ’06]

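[Figure: an adversary sends a single query q to a randomized mechanism (boxes C and M) and receives an answer a; in the dataset table (Sex, Blood, ⋯, HIV?), the second row reads (F, A, ⋯, Y).]

Requirement: for all D, D’ differing on one row, and all q

Distribution of $M(D,q)$ $\approx_{\varepsilon}$ Distribution of $M(D',q)$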

Page 16:

DP for one query/release [Dinur-Nissim ’03+Dwork, Dwork-Nissim ’04, Blum-Dwork-McSherry-Nissim ’05, Dwork-McSherry-Nissim-Smith ’06]

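[Figure: an adversary sends a single query q to a randomized mechanism (boxes C and M) and receives an answer a; in the dataset table (Sex, Blood, ⋯, HIV?), the second row reads (F, A, ⋯, Y).]

Requirement: for all D, D’ differing on one row, and all q,
$\forall$ sets $T$: $\Pr[M(D,q) \in T] \lesssim (1+\varepsilon) \cdot \Pr[M(D',q) \in T]$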

Page 17:

DP for one query/release [Dwork-McSherry-Nissim-Smith ’06]

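[Figure: an adversary sends a single query q to a randomized mechanism (boxes C and M) and receives an answer a; in the dataset table (Sex, Blood, ⋯, HIV?), the second row reads (F, A, ⋯, Y).]

Def: M is $\varepsilon$-DP if for all D, D’ differing on one row, and all q,
$\forall$ sets $T$: $\Pr[M(D,q) \in T] \le e^{\varepsilon} \cdot \Pr[M(D',q) \in T]$

(Probabilities are (only) over the randomness of M.)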
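For a mechanism with finitely many possible outputs, the inequality in the definition can be checked directly: it holds for every set T exactly when it holds for every single outcome, because the per-outcome inequalities add up over T. The sketch below checks one pair of neighboring datasets in both directions. The function name and the two-outcome distributions are hypothetical, and a full check would have to cover every neighboring pair and every query.

```python
import math
from typing import Dict, Hashable

def satisfies_eps_dp(
    p: Dict[Hashable, float],
    p_prime: Dict[Hashable, float],
    eps: float,
) -> bool:
    """Check the eps-DP inequality for one pair of output distributions.

    p, p_prime: output distributions of the mechanism on neighboring datasets
    D and D', given as {outcome: probability}. Both directions are checked,
    since the definition applies to (D, D') and to (D', D).
    """
    bound = math.exp(eps)
    for outcome in set(p) | set(p_prime):
        a = p.get(outcome, 0.0)
        b = p_prime.get(outcome, 0.0)
        if a > bound * b or b > bound * a:
            return False
    return True

# Hypothetical output distributions on two neighboring datasets.
dist_d = {"yes": 0.75, "no": 0.25}
dist_d_prime = {"yes": 0.25, "no": 0.75}
print(satisfies_eps_dp(dist_d, dist_d_prime, eps=1.1))
print(satisfies_eps_dp(dist_d, dist_d_prime, eps=1.0))
```

The probability ratio here is 3, so the pair passes for any $\varepsilon \ge \ln 3 \approx 1.0986$ and fails below that.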

Page 18:

The Laplace Mechanism

[Figure: a dataset table (Sex, Blood, ⋯, HIV?) with $n$ rows sits behind boxes C and M. Query: “What fraction of people are type B and HIV positive?” Response: Answer + Laplace($1/(\varepsilon n)$).]

Density at $y \propto \exp(-\varepsilon n \cdot |y|)$

• Very little noise needed to hide each person as $n \to \infty$.

[Dwork-McSherry-Nissim-Smith ’06]
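The claim that little noise is needed can be checked numerically. A Laplace distribution with scale $s$ has standard deviation $\sqrt{2} \cdot s$, so the noise added here has standard deviation $\sqrt{2}/(\varepsilon n)$, which can be compared with a sampling error of order $1/\sqrt{n}$. The helper names and the value of $\varepsilon$ below are illustrative assumptions.

```python
import math

def laplace_noise_std(eps: float, n: int) -> float:
    """Standard deviation of Laplace noise with scale 1/(eps * n)."""
    return math.sqrt(2) / (eps * n)

def sampling_error_scale(n: int) -> float:
    """Order of magnitude of the sampling error, ignoring constants."""
    return 1 / math.sqrt(n)

eps = 0.1  # hypothetical privacy parameter
for n in (100, 10_000, 1_000_000):
    noise = laplace_noise_std(eps, n)
    sampling = sampling_error_scale(n)
    print(f"n = {n:>9}: noise std = {noise:.6f}, sampling error ~ {sampling:.6f}")
```

With these formulas the noise drops below the sampling error scale once $n > 2/\varepsilon^2$, which is $n > 200$ for $\varepsilon = 0.1$.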

Page 19:

The Laplace Mechanism

[Figure: a dataset table (Sex, Blood, ⋯, HIV?) with $n$ rows sits behind boxes C and M. Query: $q$. Response: $q(x) + \mathrm{Laplace}(\mathrm{GS}_q/\varepsilon)$.]

Density at $y \propto \exp(-\varepsilon \cdot |y| / \mathrm{GS}_q)$

• Very little noise needed to hide each person as $n \to \infty$.

[Dwork-McSherry-Nissim-Smith ’06]

Page 20:

The Laplace Mechanism
• Let $\mathcal{X}$ be a data universe, and $\mathcal{X}^n$ a space of datasets. (For now, we are treating $n$ as known and public.)
• For $x, x' \in \mathcal{X}^n$, write $x \sim x'$ if $x$ and $x'$ differ on at most one row.
• For a query $q : \mathcal{X}^n \to \mathbb{R}$, the global sensitivity is
  $\mathrm{GS}_q = \max_{x \sim x'} |q(x) - q(x')|$.
• The Laplace distribution with scale $s$, $\mathrm{Lap}(s)$:
  – Has density function $f(y) = e^{-|y|/s}/(2s)$.
  – Mean 0, standard deviation $\sqrt{2} \cdot s$.

Theorem: $M(x,q) = q(x) + \mathrm{Lap}(\mathrm{GS}_q/\varepsilon)$ is $\varepsilon$-DP.

[Dwork-McSherry-Nissim-Smith ’06]
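A minimal sketch of the mechanism in the theorem, using only the Python standard library. The function names, the example data, and the choice of $\varepsilon$ are assumptions made for illustration; the caller is responsible for supplying a correct global sensitivity. Laplace noise is sampled as the difference of two independent exponential draws with mean equal to the scale, which has the $\mathrm{Lap}(s)$ distribution.

```python
import random
from typing import Callable, Sequence

def sample_laplace(scale: float, rng: random.Random) -> float:
    """Draw from Lap(scale) as the difference of two exponentials with mean `scale`."""
    rate = 1 / scale  # random.expovariate takes the rate, i.e. 1 / mean
    return rng.expovariate(rate) - rng.expovariate(rate)

def laplace_mechanism(
    x: Sequence[float],
    q: Callable[[Sequence[float]], float],
    global_sensitivity: float,
    eps: float,
    rng: random.Random,
) -> float:
    """Return q(x) + Lap(GS_q / eps). GS_q must be supplied by the caller."""
    return q(x) + sample_laplace(global_sensitivity / eps, rng)

def fraction_of_ones(x: Sequence[float]) -> float:
    return sum(x) / len(x)

rng = random.Random(0)                  # fixed seed so the sketch is repeatable
x = [1, 0, 0, 1, 0, 1, 1, 0, 0, 0]      # hypothetical 0/1 column
n = len(x)
# Changing one row moves the fraction by at most 1/n, so GS_q = 1/n here.
noisy = laplace_mechanism(x, fraction_of_ones, global_sensitivity=1 / n, eps=1.0, rng=rng)
print(f"true answer = {fraction_of_ones(x):.3f}, noisy answer = {noisy:.3f}")
```

With $\mathrm{GS}_q = 1/n$ the noise scale is $1/(\varepsilon n)$, matching the fraction query from the earlier Laplace Mechanism slide.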

Page 21:

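Calculating Global Sensitivity
1. $\mathcal{X} = \{0,1\}$, $q(x) = \sum_{i=1}^{n} x_i$, $\mathrm{GS}_q =$
2. $\mathcal{X} = \mathbb{R}$, $q(x) = \sum_{i=1}^{n} x_i$, $\mathrm{GS}_q =$
3. $\mathcal{X} = [0,1]$, $q(x) = \mathrm{mean}(x_1, x_2, \ldots, x_n)$, $\mathrm{GS}_q =$
4. $\mathcal{X} = [0,1]$, $q(x) = \mathrm{median}(x_1, x_2, \ldots, x_n)$, $\mathrm{GS}_q =$
5. $\mathcal{X} = [0,1]$, $q(x) = \mathrm{variance}(x_1, x_2, \ldots, x_n)$, $\mathrm{GS}_q =$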

Q: for which of these queries is the Laplace Mechanism “useful”?
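One way to explore these exercises is to brute-force the definition on a tiny instance: enumerate every dataset over a small finite grid, change one row at a time, and record the largest change in the query. This is a sketch under stated assumptions, not a general method. The grid, the choice $n = 3$, and the use of the population variance are illustrative, and for the continuous universe $[0,1]$ a grid search only gives a lower bound on the true sensitivity.

```python
import itertools
import statistics
from typing import Callable, Sequence

def brute_force_gs(q: Callable, universe: Sequence[float], n: int) -> float:
    """Largest |q(x) - q(x')| over all x, x' in universe^n differing in one row."""
    best = 0.0
    for x in itertools.product(universe, repeat=n):
        for i in range(n):
            for v in universe:
                if v == x[i]:
                    continue
                x_prime = x[:i] + (v,) + x[i + 1:]
                best = max(best, abs(q(x) - q(x_prime)))
    return best

n = 3
grid = [0.0, 0.25, 0.5, 0.75, 1.0]  # coarse stand-in for the interval [0, 1]

gs_sum = brute_force_gs(sum, [0, 1], n)
gs_mean = brute_force_gs(statistics.fmean, grid, n)
gs_median = brute_force_gs(statistics.median, grid, n)
gs_variance = brute_force_gs(statistics.pvariance, grid, n)

print("sum over {0,1}:", gs_sum)
print("mean on grid:", gs_mean)
print("median on grid:", gs_median)
print("variance on grid:", gs_variance)
```

Item 2 cannot be enumerated this way because its universe is unbounded: on a grid spanning $[-B, B]$ the same search returns $2B$ for the sum, which grows with $B$.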

Page 22:

Proof that the Laplace Mechanism is Differentially Private
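The transcript stops at the slide title, so the proof as presented in lecture is not reproduced here. The standard density-ratio argument can be sketched as follows, in the notation of the theorem; $f_x$ denotes the density of $M(x,q)$, a symbol introduced here, and $\mathrm{GS}_q$ is assumed to be finite and positive.

```latex
\begin{align*}
f_x(y) &= \frac{\varepsilon}{2\,\mathrm{GS}_q}
          \exp\!\left(-\frac{\varepsilon\,|y - q(x)|}{\mathrm{GS}_q}\right)
          && \text{(density of } q(x) + \mathrm{Lap}(\mathrm{GS}_q/\varepsilon)\text{)} \\[4pt]
\frac{f_x(y)}{f_{x'}(y)}
       &= \exp\!\left(\frac{\varepsilon\,\bigl(|y - q(x')| - |y - q(x)|\bigr)}{\mathrm{GS}_q}\right) \\
       &\le \exp\!\left(\frac{\varepsilon\,|q(x) - q(x')|}{\mathrm{GS}_q}\right)
          && \text{(triangle inequality)} \\
       &\le e^{\varepsilon}
          && \text{(since } |q(x) - q(x')| \le \mathrm{GS}_q \text{ for } x \sim x'\text{)} \\[4pt]
\Pr[M(x,q) \in T]
       &= \int_T f_x(y)\,dy
        \le e^{\varepsilon} \int_T f_{x'}(y)\,dy
        = e^{\varepsilon} \cdot \Pr[M(x',q) \in T].
\end{align*}
```

The last line holds for every measurable set $T$ and every pair $x \sim x'$, which is the definition of $\varepsilon$-DP.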

