Page 1:

Simulatability
"The enemy knows the system" (Claude Shannon)

CompSci 590.03
Instructor: Ashwin Machanavajjhala
Lecture 6 : 590.03 Fall 12

Page 2:

Announcements
• Please meet with me at least 2 times before you finalize your project (deadline Sep 28).

Page 3:

Recap – L-Diversity
• The link between identity and attribute value is the sensitive information.
  "Does Bob have Cancer? Heart disease? Flu?"
  "Does Umeko have Cancer? Heart disease? Flu?"
• Adversary knows ≤ L-2 negation statements.
  "Umeko does not have Heart Disease."
  – The data publisher may not know the exact adversarial knowledge.
• Privacy is breached when identity can be linked to an attribute value with high probability:
  Pr[ "Bob has Cancer" | published table, adv. knowledge ] > t

Page 4:

Recap – 3-Diverse Table

Zip    Age   Nat.  Disease
1306*  <=40  *     Heart
1306*  <=40  *     Flu
1306*  <=40  *     Cancer
1306*  <=40  *     Cancer
1485*  >40   *     Cancer
1485*  >40   *     Heart
1485*  >40   *     Flu
1485*  >40   *     Flu
1305*  <=40  *     Heart
1305*  <=40  *     Flu
1305*  <=40  *     Cancer
1305*  <=40  *     Cancer

L-Diversity Principle: Every group of tuples with the same Q-ID values has ≥ L distinct sensitive values of roughly equal proportions.
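The principle above can be checked mechanically. The sketch below (the helper name `is_l_diverse` is illustrative, and it uses the simple distinct-values form of the principle, ignoring the "roughly equal proportions" refinement) verifies the 3-diverse table from this slide:

```python
# The 3-diverse table from the slide as (Zip, Age, Disease); Nat. is fully suppressed.
table = [
    ("1306*", "<=40", "Heart"), ("1306*", "<=40", "Flu"),
    ("1306*", "<=40", "Cancer"), ("1306*", "<=40", "Cancer"),
    ("1485*", ">40", "Cancer"), ("1485*", ">40", "Heart"),
    ("1485*", ">40", "Flu"), ("1485*", ">40", "Flu"),
    ("1305*", "<=40", "Heart"), ("1305*", "<=40", "Flu"),
    ("1305*", "<=40", "Cancer"), ("1305*", "<=40", "Cancer"),
]

def is_l_diverse(rows, l):
    """Every Q-ID group must contain at least l distinct sensitive values."""
    groups = {}
    for (*qid, sensitive) in rows:
        groups.setdefault(tuple(qid), []).append(sensitive)
    return all(len(set(vals)) >= l for vals in groups.values())

print(is_l_diverse(table, 3))  # True: each group has Heart, Flu and Cancer
print(is_l_diverse(table, 4))  # False
```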

Page 5:

Outline
• Simulatable Auditing
• Minimality Attack in anonymization
• Simulatable algorithms for anonymization

Page 6:

Query Auditing

The database holds numeric values (say, salaries of employees). It either truthfully answers a query or denies answering.

Queries: MIN, MAX, SUM over subsets of the database.

Question: When to allow/deny queries?

[Figure: a Researcher sends a Query to the Database; the auditor asks "Safe to publish?" and decides Yes or No.]

Page 7:

Why should we deny queries?
• Q1: Ben's sensitive value?
  – DENY
• Q2: Max sensitive value of males?
  – ANSWER: 2
• Q3: Max sensitive value of 1st year PhD students?
  – ANSWER: 3
• But Q3 + Q2 => Xi = 3

Name  1st year PhD  Gender  Sensitive value
Ben   Y             M       1
Bha   N             M       1
Ios   Y             M       1
Jan   N             M       2
Jian  Y             M       2
Jie   N             M       1
Joe   N             M       2
Moh   N             M       1
Son   N             F       1
Xi    Y             F       3
Yao   N             M       2

Page 8:

Value-Based Auditing
• Let a1, a2, …, ak be the answers to previous queries Q1, Q2, …, Qk.
• Let ak+1 be the answer to Qk+1.

  ai = f(ci1·x1, ci2·x2, …, cin·xn),  i = 1 … k+1
  where cim = 1 if Qi depends on xm (and 0 otherwise)

• Check whether any xj has a unique solution; if so, deny the query.
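For MAX queries, the uniqueness check can be sketched with bound propagation: each answered MAX query upper-bounds every variable it covers, and must be attained by at least one of them. The helper names below are illustrative, not from the paper:

```python
def max_upper_bounds(queries):
    """queries: list of (index_set, answer) for answered MAX queries.
    Each answer upper-bounds every variable the query covers."""
    bounds = {}
    for idxs, ans in queries:
        for j in idxs:
            bounds[j] = min(bounds.get(j, float("inf")), ans)
    return bounds

def fully_determined(queries):
    """Indices whose value is forced: some query's MAX can only be attained
    by a single index (every other index is bounded strictly below it)."""
    bounds = max_upper_bounds(queries)
    pinned = set()
    for idxs, ans in queries:
        attainers = [j for j in idxs if bounds[j] >= ans]
        if len(attainers) == 1:
            pinned.add(attainers[0])
    return pinned

# The running example from the next slides: max(x1..x5) = 10, then max(x1..x4) = 8.
q1 = ({1, 2, 3, 4, 5}, 10)
q2 = ({1, 2, 3, 4}, 8)
print(fully_determined([q1]))      # set() -> q1 alone is safe to answer
print(fully_determined([q1, q2]))  # {5}   -> x5 = 10 would be breached
```

Note that a value-based auditor runs this check only after computing the true answer; the following slides show that denying on that basis itself leaks information.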

Page 9:

Value-based Auditing
• Data Values: {x1, x2, x3, x4, x5}. Queries: MAX.
• Allow a query if no xi's value can be inferred.

[Figure: the five values x1 … x5 on a number line.]

Page 10:

Value-based Auditing
• Data Values: {x1, x2, x3, x4, x5}. Queries: MAX.
• Allow a query if no xi's value can be inferred.

max(x1, x2, x3, x4, x5)    Ans: 10

=> -∞ ≤ x1 … x5 ≤ 10

Page 11:

Value-based Auditing
• Data Values: {x1, x2, x3, x4, x5}. Queries: MAX.
• Allow a query if no xi's value can be inferred.

max(x1, x2, x3, x4, x5)    Ans: 10
max(x1, x2, x3, x4)        Ans: 8 => DENY

-∞ ≤ x1 … x4 ≤ 8 => x5 = 10

Page 12:

Value-based Auditing
• Data Values: {x1, x2, x3, x4, x5}. Queries: MAX.
• Allow a query if no xi's value can be inferred.

max(x1, x2, x3, x4, x5)    Ans: 10
max(x1, x2, x3, x4)        Ans: 8 => DENY

Denial means some value can be compromised!

Page 13:

Value-based Auditing
• Data Values: {x1, x2, x3, x4, x5}. Queries: MAX.
• Allow a query if no xi's value can be inferred.

max(x1, x2, x3, x4, x5)    Ans: 10
max(x1, x2, x3, x4)        Ans: 8 => DENY

What could max(x1, x2, x3, x4) be?

Page 14:

Value-based Auditing
• Data Values: {x1, x2, x3, x4, x5}. Queries: MAX.
• Allow a query if no xi's value can be inferred.

max(x1, x2, x3, x4, x5)    Ans: 10
max(x1, x2, x3, x4)        Ans: 8 => DENY

From the first answer, max(x1, x2, x3, x4) ≤ 10.

Page 15:

Value-based Auditing
• Data Values: {x1, x2, x3, x4, x5}. Queries: MAX.
• Allow a query if no xi's value can be inferred.

max(x1, x2, x3, x4, x5)    Ans: 10
max(x1, x2, x3, x4)        Ans: 8 => DENY

If max(x1, x2, x3, x4) = 10, then there is no privacy breach.

Page 16:

Value-based Auditing
• Data Values: {x1, x2, x3, x4, x5}. Queries: MAX.
• Allow a query if no xi's value can be inferred.

max(x1, x2, x3, x4, x5)    Ans: 10
max(x1, x2, x3, x4)        Ans: 8 => DENY

Hence, max(x1, x2, x3, x4) < 10 => x5 = 10!

Page 17:

Value-based Auditing
• Data Values: {x1, x2, x3, x4, x5}. Queries: MAX.
• Allow a query if no xi's value can be inferred.

max(x1, x2, x3, x4, x5)    Ans: 10
max(x1, x2, x3, x4)        Ans: 8 => DENY

Hence, max(x1, x2, x3, x4) < 10 => x5 = 10!

Denials leak information. The attack occurred because the privacy analysis did not assume that the attacker knows the algorithm.

Page 18:

Simulatable Auditing [Kenthapadi et al., PODS '05]

• An auditor is simulatable if the decision to deny a query Qk is made based only on information already available to the attacker.
  – Can use the queries Q1, Q2, …, Qk and the answers a1, a2, …, ak-1.
  – Cannot use ak or the actual data to make the decision.

• Denials provably do not leak information.
  – The attacker could equivalently determine whether the query would be denied.
  – The attacker can mimic, or simulate, the auditor.

Page 19:

Simulatable Auditing Algorithm
• Data Values: {x1, x2, x3, x4, x5}. Queries: MAX.
• Allow a query if no xi's value can be inferred.

max(x1, x2, x3, x4, x5)    Ans: 10

max(x1, x2, x3, x4)    Audit BEFORE computing the answer:
  Ans > 10 => not possible
  Ans = 10 => -∞ ≤ x1 … x4 ≤ 10   SAFE
  Ans < 10 => x5 = 10             UNSAFE
=> DENY
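The case analysis on this slide can be turned into a sketch of a simulatable auditor: enumerate hypothetical answers to the new query (the finite set of representative candidates is an assumption of this sketch), keep only the feasible ones, and deny if any feasible answer would pin down a value. Crucially, the decision never touches the actual data:

```python
def bounds_of(queries):
    """Upper bounds implied by answered MAX queries: each query's answer
    bounds every variable it covers."""
    b = {}
    for idxs, ans in queries:
        for j in idxs:
            b[j] = min(b.get(j, float("inf")), ans)
    return b

def feasible(queries):
    """An answer sequence is consistent iff every MAX query's answer can be
    attained by at least one of its variables."""
    b = bounds_of(queries)
    return all(any(b[j] >= ans for j in idxs) for idxs, ans in queries)

def breaches(queries):
    """True if some query's answer can only be attained by one variable,
    which pins that variable's value."""
    b = bounds_of(queries)
    return any(sum(1 for j in idxs if b[j] >= ans) == 1 for idxs, ans in queries)

def simulatable_deny(answered, new_idxs, candidates):
    """Deny iff some feasible hypothetical answer to the new query would
    cause a breach. Uses only past queries and answers, never the data."""
    return any(
        feasible(answered + [(new_idxs, a)]) and breaches(answered + [(new_idxs, a)])
        for a in candidates
    )

# Slide example: after answering max(x1..x5) = 10, audit max(x1..x4) before
# computing its answer. The candidates mirror the slide's three cases.
answered = [({1, 2, 3, 4, 5}, 10)]
print(simulatable_deny(answered, {1, 2, 3, 4}, candidates=[11, 10, 9]))  # True -> DENY
```

Since the attacker can evaluate `simulatable_deny` too, the denial itself tells them nothing they did not already know.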

Page 20:

Summary of Simulatable Auditing

• The decision to deny must be based only on past queries and their answers; in some (many!) cases this forces a denial.

• Denials can leak information if the adversary does not know all the information that is used to decide whether to deny the query.

Page 21:

Outline
• Simulatable Auditing
• Minimality Attack in anonymization
• Simulatable algorithms for anonymization

Page 22:

Minimality attack on Generalization algorithms

• Algorithms for K-anonymity, L-diversity, T-closeness, etc. try to maximize utility.
  – Find a minimally generalized table in the lattice that satisfies privacy and maximizes utility.

• But … the attacker also knows this algorithm!

Page 23:

Example Minimality attack [Wong et al., VLDB '07]

• Dataset with one quasi-identifier and 2 values q1, q2.
• q1 and q2 generalize to Q.
• Sensitive attribute: Cancer – yes/no.
• We want to ensure P[Cancer = yes] < ½.
  – It is OK to learn that an individual does not have Cancer.

• Published Table:

QID  Cancer
Q    Yes
Q    Yes
Q    No
Q    No
q2   No
q2   No

Page 24:

Which input datasets could have led to the published table?

Output dataset ({q1, q2} → Q, "2-diverse"):

QID  Cancer
Q    Yes
Q    Yes
Q    No
Q    No
q2   No
q2   No

Possible input datasets with 3 occurrences of q1:

QID  Cancer      QID  Cancer
q1   Yes         q1   Yes
q1   Yes         q1   No
q1   No          q1   No
q2   No          q2   Yes
q2   No          q2   No
q2   No          q2   No

Page 25:

Which input datasets could have led to the published table?

Output dataset ({q1, q2} → Q, "2-diverse"):

QID  Cancer
Q    Yes
Q    Yes
Q    No
Q    No
q2   No
q2   No

For a possible input dataset with 3 occurrences of q1:

QID  Cancer
q1   Yes
Q    No
Q    No
q2   Yes
q2   No
q2   No

This is a better generalization!

Page 26:

Which input datasets could have led to the published table?

Output dataset ({q1, q2} → Q, "2-diverse"):

QID  Cancer
Q    Yes
Q    Yes
Q    No
Q    No
q2   No
q2   No

Possible input datasets with 1 occurrence of q1:

QID  Cancer      QID  Cancer
q2   Yes         q2   Yes
q1   Yes         q2   Yes
q2   No          q1   No
q2   No          q2   No
q2   No          q2   No
q2   No          q2   No

Page 27:

Which input datasets could have led to the published table?

Output dataset ({q1, q2} → Q, "2-diverse"):

QID  Cancer
Q    Yes
Q    Yes
Q    No
Q    No
q2   No
q2   No

For a possible input dataset with 1 occurrence of q1:

QID  Cancer
q2   Yes
Q    No
Q    No
q2   Yes
q2   No
q2   No

This is a better generalization!

Page 28:

Which input datasets could have led to the published table?

Output dataset ({q1, q2} → Q, "2-diverse"):

QID  Cancer
Q    Yes
Q    Yes
Q    No
Q    No
q2   No
q2   No

Since inputs with 3 or with 1 occurrence of q1 admit better generalizations, there must be exactly two tuples with q1.

Page 29:

Which input datasets could have led to the published table?

Output dataset ({q1, q2} → Q, "2-diverse"):

QID  Cancer
Q    Yes
Q    Yes
Q    No
Q    No
q2   No
q2   No

Possible input datasets with 2 occurrences of q1:

QID  Cancer      QID  Cancer      QID  Cancer
q1   Yes         q2   Yes         q1   Yes
q1   Yes         q2   Yes         q2   Yes
q2   No          q1   No          q1   No
q2   No          q1   No          q2   No
q2   No          q2   No          q2   No
q2   No          q2   No          q2   No

The third table already satisfies privacy.

Page 30:

Which input datasets could have led to the published table?

Output dataset ({q1, q2} → Q, "2-diverse"):

QID  Cancer
Q    Yes
Q    Yes
Q    No
Q    No
q2   No
q2   No

Possible input datasets with 2 occurrences of q1:

QID  Cancer      QID  Cancer
q1   Yes         q2   Yes
q1   Yes         q2   Yes
q2   No          q1   No
q2   No          q1   No
q2   No          q2   No
q2   No          q2   No

In the second table, learning Cancer = No is OK; hence it is private.

Page 31:

Which input datasets could have led to the published table?

Output dataset ({q1, q2} → Q, "2-diverse"):

QID  Cancer
Q    Yes
Q    Yes
Q    No
Q    No
q2   No
q2   No

Possible input dataset with 2 occurrences of q1:

QID  Cancer
q1   Yes
q1   Yes
q2   No
q2   No
q2   No
q2   No

This is the ONLY input that results in the output!

P[Cancer = yes | q1] = 1
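The whole attack can be reproduced by brute force: enumerate candidate inputs, run the anonymizer on each, and keep those that produce the published table. The anonymizer below is a toy model assumed for illustration, not the exact algorithm of Wong et al.: it publishes the input unchanged when every QID group is safe, and otherwise generalizes the q1 tuples plus two q2 'No' tuples; the threshold is modeled as P[yes] ≤ ½ to match the slide's arithmetic.

```python
from itertools import combinations

# Six tuples; the sensitive values are fixed by the published table's multiset.
SENSITIVE = ["Yes", "Yes", "No", "No", "No", "No"]
THRESHOLD = 0.5

def safe(group):
    return sum(1 for s in group if s == "Yes") / len(group) <= THRESHOLD

def anonymize(qids):
    """Toy 'minimal' anonymizer (an assumed model of the algorithm in the
    slides): publish unchanged if every QID group is safe; otherwise
    generalize the q1 tuples together with two q2 'No' tuples into Q."""
    groups = {}
    for q, s in zip(qids, SENSITIVE):
        groups.setdefault(q, []).append(s)
    if all(safe(g) for g in groups.values()):
        return sorted(zip(qids, SENSITIVE))
    out, borrowed = [], 2  # two q2 'No' tuples join the generalized group
    for q, s in zip(qids, SENSITIVE):
        if q == "q1":
            out.append(("Q", s))
        elif s == "No" and borrowed > 0:
            out.append(("Q", s))
            borrowed -= 1
        else:
            out.append((q, s))
    return sorted(out)

PUBLISHED = sorted([("Q", "Yes"), ("Q", "Yes"), ("Q", "No"), ("Q", "No"),
                    ("q2", "No"), ("q2", "No")])

# Enumerate every input with exactly two q1 tuples.
for pos in combinations(range(6), 2):
    qids = ["q1" if i in pos else "q2" for i in range(6)]
    if anonymize(qids) == PUBLISHED:
        print(pos, [SENSITIVE[i] for i in pos])  # only both-'Yes' inputs match
```

Under this model, the only surviving input assigns 'Yes' to both q1 tuples, which is exactly the slide's conclusion P[Cancer = yes | q1] = 1.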

Page 32:

Outline
• Simulatable Auditing
• Minimality Attack in anonymization
• Transparent Anonymization: Simulatable algorithms for anonymization

Page 33:

Transparent Anonymization
• Assume that the adversary knows the algorithm that is being used.

  O: Output table
  I(O, A): Input tables that result in O under algorithm A
  I: All possible input tables

Page 34:

Transparent Anonymization
• Privacy must be guaranteed with respect to I(O, A).
  – The breach probability must be computed assuming I(O, A) is the actual set of all possible input tables.

• What is an efficient algorithm for Transparent Anonymization?
  – For L-diversity?

Page 35:

Ace Algorithm [Xiao et al., TODS '10]

Step 1: Assign
Based only on the sensitive values, construct (in a randomized fashion) an intermediate L-diverse generalization.

Step 2: Split
Based only on the quasi-identifier values (without looking at the sensitive values), deterministically refine the intermediate solution to maximize utility.

Page 36:

Step 1: Assign
• Input Table

Page 37:

Step 1: Assign
• St is the set of all tuples (grouped by sensitive value).
• Iteratively:
  – Remove α tuples each from the β (≥ L) most frequent sensitive values.

Page 38:

Step 1: Assign
• St is the set of all tuples (grouped by sensitive value).
• Iteratively:
  – Remove α tuples each from the β (≥ L) most frequent sensitive values.
  – 1st iteration: β = 2, α = 2

Page 39:

Step 1: Assign
• St is the set of all tuples (grouped by sensitive value).
• Iteratively:
  – Remove α tuples each from the β (≥ L) most frequent sensitive values.
  – 2nd iteration: β = 2, α = 1

Page 40:

Step 1: Assign
• St is the set of all tuples (grouped by sensitive value).
• Iteratively:
  – Remove α tuples each from the β (≥ L) most frequent sensitive values.
  – 3rd iteration: β = 2, α = 1
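The iteration above can be sketched as follows. This is a simplified model of the Assign step: α is fixed to 1 per iteration (whereas the slides batch α = 2 in the first round), and the choice of tuple within each sensitive value is randomized:

```python
import random

def assign(records, L, rng=random.Random(0)):
    """Ace 'Assign' step, simplified sketch: repeatedly take one randomly
    chosen tuple from each of the L most frequent remaining sensitive
    values and group them into an L-diverse bucket.
    records: list of (name, sensitive_value) pairs."""
    pool = {}
    for name, s in records:
        pool.setdefault(s, []).append(name)
    buckets = []
    while pool:
        # The beta = L most frequent remaining sensitive values.
        top = sorted(pool, key=lambda s: len(pool[s]), reverse=True)[:L]
        if len(top) < L:
            raise ValueError("fewer than L distinct sensitive values remain")
        bucket = []
        for s in top:
            name = pool[s].pop(rng.randrange(len(pool[s])))
            bucket.append((name, s))
        buckets.append(bucket)
        pool = {s: names for s, names in pool.items() if names}
    return buckets

# The input table from the slides.
records = [("Ann", "Dyspepsia"), ("Bob", "Dyspepsia"), ("Gill", "Flu"),
           ("Ed", "Flu"), ("Don", "Bronchitis"), ("Fred", "Gastritis"),
           ("Hera", "Diabetes"), ("Cate", "Gastritis")]
for bucket in assign(records, L=2):
    print([s for _, s in bucket])  # every bucket holds 2 distinct diseases
```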

Page 41:

Intermediate Generalization

Name  Age  Zip    Disease
Ann   21   10000  Dyspepsia
Bob   27   18000  Dyspepsia
Gill  60   63000  Flu
Ed    54   60000  Flu
Don   32   35000  Bronchitis
Fred  60   63000  Gastritis
Hera  60   63000  Diabetes
Cate  32   35000  Gastritis

Page 42:

Step 2: Split
• If a bucket contains α > 1 tuples of each sensitive value, split it into two buckets Ba and Bb such that:
  – Pick 1 ≤ αa < α tuples from each sensitive value in bucket B and put them in bucket Ba; the remaining tuples go to Bb.
  – The division (Ba, Bb) is optimal in terms of utility.

Name  Age  Zip
Ann   21   10000
Bob   27   18000
Gill  60   63000
Ed    54   60000
Don   32   35000
Fred  60   63000
Hera  60   63000
Cate  32   35000
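A minimal sketch of one Split move, under simplifying assumptions: αa = 1 tuple per sensitive value is moved to Ba, and "utility" is approximated by sorting on a single numeric QID attribute rather than searching for the paper's optimal division:

```python
def split(bucket, key):
    """Ace 'Split' step, one move (sketch). bucket: list of (qid, sensitive)
    pairs with alpha copies of each sensitive value. Move one tuple per
    sensitive value (the one with smallest key(qid)) into a new bucket Ba;
    the rest form Bb. Splitting never looks at which individual carries
    which disease, only at the QIDs."""
    by_value = {}
    for qid, s in bucket:
        by_value.setdefault(s, []).append(qid)
    if any(len(qids) < 2 for qids in by_value.values()):
        return [bucket]  # cannot split without losing L-diversity
    ba, bb = [], []
    for s, qids in by_value.items():
        qids = sorted(qids, key=key)
        ba.append((qids[0], s))
        bb.extend((q, s) for q in qids[1:])
    return [ba, bb]

# The 4-tuple bucket from the intermediate generalization (QID = Age).
bucket = [(21, "Dyspepsia"), (27, "Dyspepsia"), (60, "Flu"), (54, "Flu")]
print(split(bucket, key=lambda age: age))
```

Each resulting sub-bucket still contains one tuple per sensitive value, so L-diversity is preserved while the buckets shrink.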

Page 43:

Why does the Ace algorithm satisfy Transparent L-Diversity?

• Privacy must be guaranteed with respect to I(O, A).
  – The breach probability must be computed assuming I(O, A) is the actual set of all possible input tables.

  O: Output table
  I(O, A): Input tables that result in O under algorithm A
  I: All possible input tables

Page 44:

Ace algorithm analysis
Lemma 1: The Assign step satisfies transparent L-diversity.

Proof (sketch):
• Consider an intermediate output Int.
• Suppose there is some input table T such that Assign(T) = Int.
• Any other table T' in which the sensitive values of 2 individuals in the same group are swapped also leads to the same intermediate output Int.

Page 45:

Ace algorithm analysis

Both tables result in the same intermediate output.

[Figure: two input tables, identical except that the diseases of two individuals in the same group are swapped.]

Page 46:

Ace algorithm analysis
Lemma 1: The Assign step satisfies transparent L-diversity.

Proof (sketch):
• Consider an intermediate output Int.
• Suppose there is some input table T such that Assign(T) = Int.
• Any other table T', in which the sensitive values of 2 individuals in the same group are swapped, also leads to the same intermediate output.
• The set of input tables I(Int, A) contains all possible assignments of diseases to individuals within each group of Int.

Page 47:

Ace algorithm analysis
Lemma 1: The Assign step satisfies transparent L-diversity.

Proof (sketch):
• The set of input tables I(Int, A) contains all possible assignments of diseases to individuals within each group of Int.
• P[Ann has dyspepsia | I(Int, A) and Int] = 1/2

Name  Age  Zip    Disease
Ann   21   10000  Dyspepsia
Bob   27   18000  Dyspepsia
Gill  60   63000  Flu
Ed    54   60000  Flu
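The 1/2 on this slide can be checked by enumeration: within the group {Ann, Bob, Gill, Ed}, every distinct assignment of the disease multiset {Dyspepsia ×2, Flu ×2} to the four people is an equally likely input in I(Int, A):

```python
from itertools import permutations
from fractions import Fraction

# Positions correspond to Ann, Bob, Gill, Ed.
diseases = ["Dyspepsia", "Dyspepsia", "Flu", "Flu"]

# Within a group of Int, every distinct assignment of the disease multiset
# to the individuals is an equally likely possible input.
assignments = set(permutations(diseases))
ann_has_dyspepsia = sum(1 for a in assignments if a[0] == "Dyspepsia")
print(Fraction(ann_has_dyspepsia, len(assignments)))  # 1/2
```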

Page 48:

Ace algorithm analysis
Lemma 2: The Split phase also satisfies transparent L-diversity.

Proof (sketch):
• I(Int, Assign) contains all tables in which each individual is assigned an arbitrary sensitive value within its group in Int.
• Suppose some input table T ∈ I(Int, Assign) results in the final output O after Split.

Page 49:

Ace algorithm analysis
• Split does not depend on the sensitive values.

[Figure: two tables that differ only by swapping diseases within the bucket {Ann, Bob, Gill, Ed}; Split produces the same division into sub-buckets, each containing one dyspepsia and one flu tuple.]

Page 50:

Ace algorithm analysis

If T ∈ I(Int, Assign) and it results in O after Split, then T' ∈ I(Int, Assign) and it also results in O after Split.

[Figure: Table T and Table T' side by side.]

Page 51:

Ace algorithm analysis
• Lemma 2: The Split phase also satisfies transparent L-diversity.

Proof (sketch):
• Let T' be generated by "swapping diseases" within some bucket.
• If T ∈ I(Int, Assign) and it results in O after Split, then T' ∈ I(Int, Assign) and it also results in O after Split.
• For any individual, it is equally likely that the sensitive value is any one of ≥ L choices.
• Therefore, P[individual has disease | I(O, Ace)] ≤ 1/L.

Page 52:

Summary
• Many systems claim privacy/security by assuming the adversary does not know the algorithm.
  – This is bad practice ("the enemy knows the system").

• Simulatable algorithms avoid this problem.
  – Ideally, the choices made by the algorithm should be simulatable by the adversary.

• Anonymization algorithms are also susceptible to adversaries who know the algorithm or its objective function.

• Transparent anonymization limits the inference an attacker who knows the algorithm can make about sensitive values.

Page 53:

Next Class
• Composition of privacy
• Differential Privacy

Page 54:

References

A. Machanavajjhala, J. Gehrke, D. Kifer, M. Venkitasubramaniam, "L-Diversity: Privacy beyond k-anonymity", ICDE 2006.
K. Kenthapadi, N. Mishra, K. Nissim, "Simulatable Auditing", PODS 2005.
R. Wong, A. Fu, K. Wang, J. Pei, "Minimality attack in privacy preserving data publishing", VLDB 2007.
X. Xiao, Y. Tao, N. Koudas, "Transparent Anonymization: Thwarting adversaries who know the algorithm", TODS 2010.

