
Date post: 10-Mar-2020

14th ACM CCS

Fengjun Li

Automaton Segmentation: A New Approach to Preserve Privacy in XML Information Brokering

Fengjun Li, Bo Luo, Peng Liu, Dongwon Lee and Chao-Hsien Chu

College of Information Sciences and Technology
The Pennsylvania State University


Outline

• Motivation
• Solution
  – The broker-coordinator overlay approach
  – Automaton segmentation scheme
  – Query encryption scheme
• Evaluation
• Conclusion


Information Brokering Scenario

[Figure: information brokering scenario; labels: Data Owner, User]


Naïve approach

[Figure labels: Alice, Bob, Query, California Hospital at Los Angeles, Mt. Sinai Hospital at NYC]

Bob knows little:
• Name: Alice
• Observed symptoms


Privacy Threats

• Privacy threats outside the proxy
  – Curious insider at the hospital
  – Link eavesdropper
• Privacy threats from the proxy
  – Compromised proxy


Privacy Threat: Curious Insider

[Figure labels: Bob, California Hospital at Los Angeles, Mt. Sinai Hospital at NYC]

Probing query: Data location!

Q: /provider/…/patient [name()=‘Alice’]//*


Privacy Threat: Curious insider & eavesdropping

[Figure labels: Bob, California Hospital at Los Angeles, Mt. Sinai Hospital at NYC]

Query? Encrypted!


Privacy Threat: Curious insider & eavesdropping

[Figure labels: Bob, California Hospital at Los Angeles, Mt. Sinai Hospital at NYC]

Location? From California Hospital, LA, to Mt. Sinai Hospital, NYC


Privacy Threat: Curious insider & eavesdropping

[Figure labels: Bob, California Hospital at Los Angeles, Mt. Sinai Hospital at NYC]

Probing query: It was Alice! Blood problem?

Q: /provider/…/patient [name()=‘Tom’]//*
Q: /provider/…/patient [name()=‘Steve’]//*
Q: /provider/…/patient [name()=‘Alice’]//*
Q: /provider/…/patient [name()=‘Tom’]//*


Major Privacy Concerns: Summary

[Figure labels: Bob, Q, Q]

• Query content
  Q: /provider/…/patient [name()=‘Alice’]/symptom [cancer()=‘blood’]//*
• Data location
  Q: /provider/…/patient [name()=‘Alice’]//*
• Patient location


Our Solution: Overview

• To block probing queries
  – In-network access control
• To protect data location privacy, patient location privacy, and metadata privacy
  – Automaton segmentation
• Defeat all the aforementioned privacy threats.
• Achieve superior privacy protection.


Preliminary: How does the proxy work?

• Routing rules: $R_{index} = \{object,\ destination(s)\}$
  – Object is an XPath expression.
  – Destination is an IP address.
  – Routing rules represent the physical distribution of data objects.


Example Routing Rules


R1: {/site/people, 192.168.0.2}

R2: {//africa/item, 192.168.0.15}

R3: {//asia/item, 192.168.0.16}


An Example Routing Automaton

• Query routing: one-to-many XPath matching.
• Routing automaton: a Non-deterministic Finite Automaton (NFA) that captures the routing rules.

R1: {/site/people, 192.168.0.2}
R2: {//africa/item, 192.168.0.15}
R3: {//asia/item, 192.168.0.16}

[Figure: routing automaton for R1 to R3; states 0-7; transitions labeled site, people, ε, *, Africa, Asia, item; states 2, 5 and 7 annotated with 192.168.0.2, 192.168.0.15 and 192.168.0.16]
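The one-to-many matching behind rules R1 to R3 can be illustrated with a small Python sketch. It is not the authors' implementation: the function names (`parse_steps`, `rule_matches`, `route`) are hypothetical, query predicates are simply dropped, the query is treated as a plain child-axis path, and a rule is assumed to match as soon as some prefix of the query reaches its accept state (so a query that descends below a routed object is still sent to that destination).

```python
import re

# Routing rules from the slide: object (XPath) -> destination (IP address).
RULES = {
    "R1": ("/site/people", "192.168.0.2"),
    "R2": ("//africa/item", "192.168.0.15"),
    "R3": ("//asia/item", "192.168.0.16"),
}


def parse_steps(xpath):
    """Split an XPath into (axis, name) steps; predicates like [a='b'] are dropped."""
    xpath = re.sub(r"\[[^\]]*\]", "", xpath)
    return [("descendant" if sep == "//" else "child", name)
            for sep, name in re.findall(r"(//|/)([^/]+)", xpath)]


def rule_matches(rule_steps, query_names):
    """Simulate one rule as an NFA over the element names of the query.

    A state is the number of rule steps matched so far. A '//' step may skip
    any number of elements (self-loop). Assumption: once the accept state is
    reached, it stays reached for deeper query steps.
    """
    accept = len(rule_steps)
    states = {0}
    for name in query_names:
        next_states = set()
        for s in states:
            if s == accept:
                next_states.add(accept)
                continue
            axis, wanted = rule_steps[s]
            if name == "*" or wanted == "*" or name == wanted:
                next_states.add(s + 1)
            if axis == "descendant":
                next_states.add(s)  # skip this element and keep waiting
        states = next_states
        if not states:
            return False
    return accept in states


def route(query):
    """Return the sorted list of destinations whose rule matches the query."""
    names = [name for _, name in parse_steps(query)]
    return sorted({dest for obj, dest in RULES.values()
                   if rule_matches(parse_steps(obj), names)})


print(route("/site/regions/asia/item[name='Abacavir']/location"))
print(route("/site/regions/*/item"))
```

Under these assumptions the first query is routed only to 192.168.0.16, while the wildcard query `/site/regions/*/item` matches both R2 and R3, which is the one-to-many case.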


How to block probing queries?

[Figure labels: Routing Rules, Global Routing Automaton, Access Control Rules (ACR), Access Control Automaton, Integrated Rule, Integrated Global Automaton]


The Integrated Automaton

QUERY: “/*/regions/asia/item[name=’Abacavir’]/location”

Q’: “/site/regions/asia/item[name=’Abacavir’]/location”

[Figure: integrated automaton; states 0-18; transitions labeled site, regions, categories, people, person, item, name, description, quantity, location, address, emailaddress, * and ε; accept states annotated with 192.168.0.2, 192.168.0.15 and 192.168.0.16]


System Architecture


Brokers and Coordinators

• Brokers
  – Connect users
  – Forward queries to the root-coordinator
  – Do pre-protection before forwarding
• Coordinators
  – Root-coordinator, coordinator, and leaf-coordinator
  – They form a coordinator tree
  – The leaf-coordinator does not hold any automaton piece, but the other two do.
• The Super Node
  – Initiation and offline maintenance


Automaton Segmentation

[Figure: broker-coordinator network with User, Super Node and two Data Servers; node labels 1-7 and 1-10. Alongside, an automaton with states 0-11 and transitions labeled site, categories, regions, *, ε, item, location, quantity, description, name]


Automaton Segmentation

[Figure: the same broker-coordinator network and automaton (states 0-11), with node positions changed]

• Automata segmentation granularity
• Segment deployment and replication


Automaton Segmentation

[Figure: broker-coordinator network with User, Super Node and two Data Servers; node labels 1-7 and 1-10; callouts: Broker 3, root-coordinator]

Q = “/*/regions/asia/item[name=’Abacavir’]/location”
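To make the idea of splitting the automaton across coordinators concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `Coordinator` class, its `process` method, and the hand-built tree are hypothetical and much simpler than the automaton in the slides (no ε or `//` transitions, no access control, no encryption). It only shows the property that each coordinator holds a single XPath step and that destination addresses appear only on accepting segments.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Coordinator:
    """Hypothetical coordinator holding one automaton segment (one XPath step)."""
    step: str
    children: List["Coordinator"] = field(default_factory=list)
    destinations: List[str] = field(default_factory=list)  # accepting segments only

    def process(self, names):
        """Check the first query step against this segment, then forward the rest."""
        if not names:
            return []
        head, rest = names[0], names[1:]
        if head != "*" and head != self.step:
            return []  # this segment rejects the query
        if self.destinations:
            # Accepting segment: routing is decided here (assumed behaviour).
            return list(self.destinations)
        results = []
        for child in self.children:
            results.extend(child.process(rest))
        return results


# A simplified, hand-built coordinator tree (illustrative only).
item_africa = Coordinator("item", destinations=["192.168.0.15"])
item_asia = Coordinator("item", destinations=["192.168.0.16"])
africa = Coordinator("africa", children=[item_africa])
asia = Coordinator("asia", children=[item_asia])
regions = Coordinator("regions", children=[africa, asia])
people = Coordinator("people", destinations=["192.168.0.2"])
root = Coordinator("site", children=[regions, people])

# Query steps of /*/regions/asia/item/location with the predicate dropped.
print(root.process(["*", "regions", "asia", "item", "location"]))
```

In this sketch no single object knows a whole rule: `regions` knows only its own step and its children, and only the accepting `item` segments know a data server address.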


Query Segment Encryption

Assume $Q = s_1 s_2 \ldots s_n$, where $s_i$ is a segment.

[Figure: the segment sequence s1 s2 … sn shown repeatedly, at steps 1, 2, 3, …, n]


Query Segment Encryption

[Figure: broker-coordinator network with User, Super Node and two Data Servers; node labels 1-7 and 1-10; the following messages are shown along the path]

Bob, “/*/regions/asia/item[name=’Abacavir’]/location”
Bob, “xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx”
Broker 3, “/*/regions/asia/item[name=’Abacavir’]/location”
Broker 3, “xxxx/regions/asia/item[name=’Abacavir’]/location”
Broker 3, “xxxxxxxxxx/asia/item[name=’Abacavir’]/location”
Broker 3, “xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx”
Broker 3, “xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx”


Privacy Analysis

• Answers are returned through the leaf-coordinator and the broker.
• Unauthorized queries are rejected by the coordinators: probing queries are blocked.
• Leaf-coordinators know data server addresses, but nothing about the data.
  – Leaf-coordinators cannot see queries.
  – Leaf-coordinators only hold accept states.
• Other coordinators have partial routing rules, but NO location information.

→ Data Location Privacy


Privacy Analysis

• Query content is encrypted in the communication.
• Of all proxy components, ONLY the root-coordinators can see the content of the query.
→ Query Content Privacy (Partially)

• NO coordinator knows the user location; ONLY the broker does.
• But the broker does not know the query content.
→ User Location Privacy

• The automaton is split into pieces; NO proxy knows an entire (access control/routing) rule.
→ Metadata Privacy


Privacy Analysis

What is exposed in each case:

• If eavesdropping → communication between the leaf-coordinator and the broker
• If a broker is compromised → user location
• If a root-coordinator is compromised → query content
• If a coordinator is compromised → one segment of the automaton
• If a leaf-coordinator is compromised → IP addresses of data servers
• If collusive, …


Performance Evaluation

• Settings
  – Coordinators
    • Java (JDK 5.0)
    • Windows desktop
  – Data
    • XMark DTD and XML documents
    • Synthetic rules
    • Synthetic queries
• Network assumption
  – It would be unfair to use our Gigabit intranet to measure network latency
  – Internet latency is used instead


End-to-End Query Processing Time

• Average query brokering time ($T_C$)
  – AC enforcement, routing to the next coordinator, and encryption
  – Average value from experiment: 1.9 ms
• Average network transmission time ($T_N$)
  – Estimated using Internet latency; average latency between two coordinators: 100 ms
• Average number of hops ($N_{HOP}$)
  – Estimated from experiment: 5.7

$T_{forward} = T_C \times N_{HOP} + T_N \times (N_{HOP} + 1) = 681$ ms

• Without any privacy protection, $T_{forward} = 211$ ms
• Average query evaluation time at data server ($T_E$)
• Average backward data transmission time ($T_{back}$)
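The forward-latency figure can be checked by plugging the reported averages into the formula. The short Python sketch below does only that arithmetic; the function name `forward_latency_ms` is made up for this example, and the three inputs are the values quoted on the slide.

```python
# Values quoted on the slide.
T_C = 1.9      # ms, average query brokering time per coordinator
T_N = 100.0    # ms, average network latency between two coordinators
N_HOP = 5.7    # average number of hops


def forward_latency_ms(t_c, t_n, n_hop):
    """T_forward = T_C * N_HOP + T_N * (N_HOP + 1)."""
    return t_c * n_hop + t_n * (n_hop + 1)


t_forward = forward_latency_ms(T_C, T_N, N_HOP)
print(f"T_forward is about {round(t_forward)} ms")
```

The brokering term contributes 1.9 × 5.7 = 10.83 ms and the network term 100 × 6.7 = 670 ms, so the total of roughly 680.83 ms rounds to the 681 ms on the slide and is dominated by network latency.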


System Scalability

• Total computation from all the coordinators
  – Measured by NSeg: the total number of query segments in the PPIB system.
• Parameters:
  – Query frequency
  – Size of rules


System Scalability

• NSeg vs. NQuery
• NSeg vs. NRules

[Figure: two plots. Recovered axis ticks: 0 to 60000 against 0 to 1000, with series for 20, 40, 60, 80 and 100 rules; and 0 to 300 against 0 to 200]


Conclusion

• Design the first privacy-preserving information brokering system (PPIB).
  – Integrate query routing with in-broker access control
• Design an automaton segmentation scheme to preserve query content privacy.
  – Integrate automaton segmentation with query routing and access control
• Provide the most comprehensive privacy protection for information brokering systems (IBSs), with insignificant performance degradation, while remaining scalable.


Questions?

• Thank you!


Privacy Analysis

• If the root coordinator is compromised:
  – PPIB vs. centralized proxy
    • We still protect query location and data location privacy
  – Full query segment encryption
    • Also encrypt un-processed XPath steps
  – Relaxed: encrypt all predicates
    • Coordinators need to be authenticated to decrypt an XPath step
    • Extra overhead: very complicated authentication process and key management scheme


Privacy Analysis

• If one coordinator is compromised,
  – What he obtains:
    • A segment of the integrated automaton → an XPath step
    • Public resource: XML schema
  – What he can infer:
    • Pre-path from root to itself
      – Multiple paths available → k-anonymity
    • Post-path from itself to the accepted states
      – Multiple accept states available → l-diversity
    • Together → t-concealment


A Real Example

• Assume 5 hospitals are sharing data; each hospital has 10,000 patients.
  – Assume each hospital has 10 roles.
• How large is the total amount of index data?
  – Assume we only index on patient name: 10,000 × 5 = 50K
  – Could be greatly reduced at the router.
• How large is the data server for each hospital?
  – TB level: 100M × 10K = 1T
• How many rules are there for each hospital?
  – 10-30 rules per role, 100-300 per hospital, 500-1500 in total
  – Rules may be identical or similar
• How large is the global access control automaton?
  – A fair guess (MRQ [SUTC 06]): 50 distinct paths
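The back-of-the-envelope numbers in this example can be reproduced with a few lines of Python. The variable names are chosen for this sketch only, and reading "100M" as 100 MB of data per patient is an assumption; the slide does not state the unit.

```python
HOSPITALS = 5
PATIENTS_PER_HOSPITAL = 10_000
ROLES_PER_HOSPITAL = 10
RULES_PER_ROLE = (10, 30)             # low and high estimate from the slide
DATA_PER_PATIENT_BYTES = 100 * 10**6  # "100M", assumed to mean 100 MB per patient

# One index entry per patient name across all hospitals.
index_entries = HOSPITALS * PATIENTS_PER_HOSPITAL

# Data volume held by one hospital's data server.
data_per_hospital_bytes = DATA_PER_PATIENT_BYTES * PATIENTS_PER_HOSPITAL

# Access control rules: per hospital, then across all hospitals.
rules_per_hospital = tuple(r * ROLES_PER_HOSPITAL for r in RULES_PER_ROLE)
rules_total = tuple(r * HOSPITALS for r in rules_per_hospital)

print(index_entries)            # 50000
print(data_per_hospital_bytes)  # 1000000000000, i.e. 1 TB
print(rules_per_hospital)       # (100, 300)
print(rules_total)              # (500, 1500)
```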


A Real Example (continued)

• How many coordinators are needed in each category?
  – A fair guess: 50 leaf coordinators, 10 intermediate coordinators
  – Replicas needed
• What is the size of each automaton segment?
  – One XPath step per segment
  – Memory consumption of the Java implementation: 10 KB level
• What is the average size of a query?
  – Consider the size of a health care schema (e.g. HL7)
  – A fair guess: 4-8 paths


Differences from other anonymizing services

• In other anonymizing services, target destinations are known beforehand
• Here, users don’t know where to send the query
  – Query routing is necessary and unavoidable
  – A proxy with the routing automaton knows too much

