
International Technology Alliance in Network & Information Sciences

Knowledge Inference for Securing and Optimizing Secure Computation

Piotr (Peter) Mardziel, Michael Hicks, Aseem Rastogi, Matthew Hammer, Jonathan Katz (UMD)

Mudhakar Srivatsa (IBM TJ Watson)

With Towsley et al. (UMass), Kasturi Rangan (UCLA)

Annual Meeting of the ITA, October 2013

Sharing between coalition domains is critical for mission success

[Figure: coalition scenario. Participants: Scout (Coalition A); Supporting force (Coalition B); Unmanned Air Vehicle (UAV) (Coalition A); Back-office Data Analyst (Coalition A); Satellite Communications – backhaul (Coalition A); Mixed force (Coalition A, C). Information items in the diagram are labeled X, X’, Y, and Z.]

ITA Technologies facilitate sharing

ITA has developed many excellent technologies for sharing information
– Gaian DB
– Information fabric
– Controlled English Store

All harness information and make it available to coalition partners
– Provide a query or pub/sub interface

But: there may be risk in sharing all information
– Might like to allow some queries but not others
  • If the query would reveal too much information about the raw data
  • If a sequence of queries would do so, even if one would not

Our research: Knowledge inference

Key idea: use program analysis (of the query)
– to understand what the answer reveals about sensitive information to a (rational) recipient

We call this analysis knowledge inference

We have used knowledge inference in a variety of applications

Summary of Results (outline)

Knowledge-based security [CSF’11, NIPSPP’12, JCS’13, HOTNETS’13]
– Enforce a security policy based on adversary’s (accumulated) knowledge
– Implementation and experimental evaluation
– Proof of soundness: will never underestimate adversary knowledge

Knowledge-based security for SMC [PLAS’12]
– Adapt knowledge inference to consider multiple parties’ secrets
– Proof of soundness

Optimizing SMC [PLAS’13]
– Identify inferrable values by knowledge inference
  • Do not bother to compute these using SMC
– Leads to 30x speedup
– Proof of correctness of technique

Papers on ITACS

[JCS’13] Piotr Mardziel, Stephen Magill, Mike Hicks and Mudhakar Srivatsa, Dynamic Enforcement of Knowledge-based Security Policies, Journal of Comp. Security, Feb’12.
– https://www.usukitacs.com/node/1900

[NIPSPP’12] Piotr Mardziel and Kasturi Rangan, Probabilistic Computation for Information Security, NIPS Probabilistic Programming Workshop, Dec’12.
– https://www.usukitacs.com/node/2234

[PLAS’12] P. Mardziel, M. Hicks, J. Katz and M. Srivatsa, Knowledge-Oriented Secure Multiparty Computation, Programming Languages and Analyses for Security, June’12.
– https://www.usukitacs.com/node/2003

[PLAS’13] Aseem Rastogi, Piotr Mardziel, Michael Hicks and Matthew Hammer, Knowledge Inference for Optimizing Secure Multi-party Computation, Programming Languages and Analyses for Security, June’13.
– https://www.usukitacs.com/node/2310

[HOTNETS’13] Z. Shafiq, F. Le, M. Srivatsa and D. Towsley, Cross-Path Inference Attacks on Multipath TCP, ACM HotNets, July’13.
– https://www.usukitacs.com/node/2491

Knowledge about the world

Learning about the world from observations.

[Figure: the prior belief (0.5 : Today = not-raining; 0.5 : Today = raining) is the input to the weather model, which produces Outlook. Observing Outlook = sunny, inference yields the revised belief (0.82 : Today = not-raining; 0.18 : Today = raining).]

Knowledge about secrets

Characterize adversary knowledge.

[Figure: Secret → system → Public Output. Observing Public Output = “login failed”, inference yields the adversary's belief: … 0.01 : Secret = 41; 0.90 : Secret = 42; 0.01 : Secret = 43 …]

Levels of knowledge?

Characterize system as safe vs. unsafe.

[Figure: levels of knowledge on a scale labeled safe / unsafe. Labels: 1.00 : Secret = 42; inference; approx. inference.]

Soundness of knowledge

Soundly approximate level of knowledge.

[Figure: levels of knowledge on a scale labeled safe / unsafe. Labels: 1.00 : Secret = 42; actual; inference; sound approx. inference.]

Technology: probabilistic programming

Programs
– whose inputs and outputs may be distributions rather than values
– which may contain uses of probabilistic choice

Effectively represent algorithmic description of a probabilistic model
– conditional probability distribution relating inputs and outputs

Pr [ Outlook = sunny | Today = not-raining ] = 0.9

  weather(today) {
    if (today == "not-raining") {
      if (flip 0.9) return "sunny" else return "overcast"
    } else if (today == "raining") {
      if (flip 0.8) return "overcast" else return "sunny"
    }
  }
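The weather program can be read as a conditional distribution and inverted with Bayes' rule. The following is a minimal sketch in Python rather than in the slide's notation; the names `weather`, `posterior`, and `belief` and the dictionary representation of distributions are assumptions made for illustration. With the uniform prior from the earlier slide, it reproduces the 0.82 / 0.18 belief after observing Outlook = sunny.

```python
from fractions import Fraction


def weather(today):
    """Distribution over Outlook given Today, mirroring the slide's program."""
    if today == "not-raining":
        return {"sunny": Fraction(9, 10), "overcast": Fraction(1, 10)}
    if today == "raining":
        return {"overcast": Fraction(8, 10), "sunny": Fraction(2, 10)}
    raise ValueError(f"unknown value for today: {today!r}")


def posterior(prior, model, observed):
    """Revise a belief over inputs after observing one output of the model."""
    joint = {x: p * model(x).get(observed, Fraction(0)) for x, p in prior.items()}
    total = sum(joint.values())
    if total == 0:
        raise ValueError("observation has probability zero under the prior")
    return {x: p / total for x, p in joint.items()}


prior = {"not-raining": Fraction(1, 2), "raining": Fraction(1, 2)}
belief = posterior(prior, weather, "sunny")

for today, p in belief.items():
    print(f"{float(p):.2f} : Today = {today}")
```

Exact fractions are used so that the revised belief (9/11 and 2/11) carries no floating-point error before it is rounded for display.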

Knowledge-based security

• Maintain a representation of each querier’s belief about secret’s possible values
• Each query result revises the belief; reject if actual secret becomes too likely
  • Cannot let rejection defeat our protection.

[Figure: timeline of queries Q1, Q2, Q3, …; each query is answered (OK) or rejected. Annotations: Belief ≜ probability distribution; Bayesian reasoning to revise belief.]

Policy = knowledge threshold

Answer a query if, for querier’s revised belief, Pr[my secret] < t
– Call t the knowledge threshold

Choice of t depends on the risk of revelation
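A threshold policy of this kind can be sketched as a small monitor. This is an illustrative sketch, not the αProb implementation: beliefs are exact finite distributions, queries are deterministic Python functions, and the names `revise`, `safe_to_answer`, and `Monitor` are hypothetical. It also assumes one conservative way of keeping a rejection from leaking information: the accept/reject decision checks every output the querier considers possible and never reads the actual secret.

```python
from fractions import Fraction


def revise(belief, query, output):
    """Bayesian revision for a deterministic query: condition on query(secret) == output."""
    kept = {s: p for s, p in belief.items() if query(s) == output}
    total = sum(kept.values())
    return {s: p / total for s, p in kept.items()}


def safe_to_answer(belief, query, t):
    """True if, for every output the querier considers possible, no secret value
    reaches probability t in the revised belief. The actual secret is never read."""
    outputs = {query(s) for s in belief}
    return all(max(revise(belief, query, o).values()) < t for o in outputs)


class Monitor:
    """Tracks one querier's belief and answers or rejects each query."""

    def __init__(self, secret, belief, t):
        self.secret, self.belief, self.t = secret, dict(belief), t

    def ask(self, query):
        if not safe_to_answer(self.belief, query, self.t):
            return None  # reject; belief is unchanged because the decision ignored the secret
        output = query(self.secret)
        self.belief = revise(self.belief, query, output)
        return output


# Assumed example: secret in 0..9, uniform prior, threshold t = 1/2.
prior = {s: Fraction(1, 10) for s in range(10)}
monitor = Monitor(secret=7, belief=prior, t=Fraction(1, 2))
first = monitor.ask(lambda s: s >= 5)   # answered: each remaining value has probability 1/5
second = monitor.ask(lambda s: s >= 8)  # rejected: a True answer would leave only {8, 9}
```

The second query is harmless on its own but is rejected because of what the first one already revealed, which is the "sequence of queries" risk described earlier in the talk.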

αProb: Implementation (CSF’11, JCS’13)

Queries are simple imperative programs

Approach: abstract interpretation for implementing probabilistic operations. Building blocks:
– lattice point enumeration
– integer programming

Key idea: abstract interpretation is sound
– Never underestimate the knowledge
– But may overestimate it
  • Improves audit time
  • May reject some legal queries

Application to sensor networks, location [NIPSPP’12]
– Gave demo earlier in the week

Application to MPTCP [HOTNETS’13]

Current activity: Modeling time/change

Secrets can change over time.

In progress: formal model, theorems about knowledge of both the stream of secrets and the delta function

Pr [ Secret1 = 42 ] = 1.0

  delta(secret1) {
    if (flip 0.9) return secret1
    else return (uniform 0,255)
  }

Pr [ Secret2 = 42 | Secret1 = 42 ] = 0.900392
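The probability that the secret is unchanged after one step of delta can be worked out exactly. The sketch below is an assumption-laden illustration: the slide does not say how many values `uniform 0,255` ranges over, so the number of values is a parameter (`n_values`, a hypothetical name). Reading it as 255 equally likely values reproduces the slide's 0.900392; reading it as 256 values gives 0.900391 at six decimal places.

```python
from fractions import Fraction


def delta_dist(secret1, n_values, keep=Fraction(9, 10)):
    """Distribution of Secret2 given Secret1: with probability `keep` the secret
    stays the same, otherwise it is redrawn uniformly from range(n_values)."""
    dist = {v: (1 - keep) / n_values for v in range(n_values)}
    dist[secret1] += keep  # the redraw can also land on the old value
    return dist


p255 = delta_dist(42, 255)[42]  # 9/10 + (1/10) * (1/255)
p256 = delta_dist(42, 256)[42]  # 9/10 + (1/10) * (1/256)

print(f"{float(p255):.6f}")
print(f"{float(p256):.6f}")
```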

Other activities

Expand expressiveness, improve performance
– Model continuous distributions, not just discrete ones
– Employ other forms of approximation

More applications
– Multipath TCP flows
– Sensor networks
– Mobility

Joint computations over secrets

Rather than asymmetric queries, may want to compute joint results
– Coalitions each have sensor networks; use them to answer queries while hiding details
– Coalitions perform joint mission planning; staff mission without knowing total resources

[Figure: two parties with secrets x and y jointly evaluate Q(x, y), where Q = some function; example result: “attack at dawn”.]

Secure multiparty computation

Multiple parties have secrets to protect. Want to compute some function over their secrets without revealing them.

Q = if x ≥ y then out := True else out := False

[Figure: parties holding x and y evaluate Q(x,y); the result is True / False.]


Use trusted third party.

[Figure: the parties send x and y to a trusted third party T, which evaluates Q(x,y) and returns True.]

SMC lets the participants compute this without a trusted third party.

Q = if x ≥ y then out := True else out := False

[Figure: the parties holding x and y evaluate Q(x,y) between themselves, without T; the result is True.]


Nothing is learned beyond what is implied* by the query output.

[Figure: parties holding x and y evaluate Q(x,y); the result is True / False.]




– * what is implied can be a lot

[Figure sequence: party A holds x, party B holds y, and Q(x,y) returns False, which means x < y.]
• y = 2, Q(x,2) = False: x = ? becomes x = 1
• y = 3, Q(x,3) = False: x = ? becomes x ∈ {1,2}
• y = ∞, Q(x,∞) = False: x = ? becomes x ≥ 1
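The inferences in this sequence can be reproduced by enumerating which values of x are consistent with B's own input and the output. This is a small sketch under an assumption the slides only imply: x is a positive integer, here restricted to 1..10 so the set can be listed. The name `consistent_xs` is hypothetical.

```python
def Q(x, y):
    """The query from the slides: out := True if x >= y, else False."""
    return x >= y


def consistent_xs(y, output, domain):
    """Values of x that B cannot rule out after seeing Q(x, y) == output."""
    return [x for x in domain if Q(x, y) == output]


domain = range(1, 11)  # assumption: x is an integer between 1 and 10

print(consistent_xs(2, False, domain))             # a single candidate remains
print(consistent_xs(3, False, domain))             # two candidates remain
print(consistent_xs(float("inf"), False, domain))  # nothing is ruled out
```

B's choice of y controls how much the single output bit reveals, which is why the security guarantee of SMC alone ("nothing beyond what the output implies") may not be enough.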

Knowledge-based security for SMC (PLAS’12)

Results (details in paper):
– Adapt knowledge inference to SMC setting
– Enforce threshold-based policies
  • Two techniques: Belief sets and SMC-based belief tracking
– Proof that our methods are sound (never underapproximate adversary knowledge)

Implementation not sufficiently performant for use on-line

Goal: Make SMC more performant (PLAS’13)

SMC is an appealing technology, but it is very slow

– Implementation based on “garbled circuits”

– Several orders of magnitude slower than normal computation

Recent work has developed general methods to improve SMC performance

– Circuit-level optimizations

– Pipelining circuit generation and execution (increases parallelism and decreases memory)

– But: ultimately SMC is always going to be much slower than normal computation

Idea: use knowledge inference to find opportunities to replace SMC with normal computation in particular programs, with no loss to security


Example – Joint Median Computation

{ A1, A2 }, { B1, B2 }

Assume: A1 < A2 and B1 < B2 and Distinct(A1, A2, B1, B2)

  a = A1 ≤ B1;
  b = a ? A2 : A1;
  c = a ? B1 : B2;
  d = b ≤ c;
  output = d ? b : c;
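The median program transcribes directly into ordinary code, which makes it easy to check what it computes. The sketch below is a plain (non-secure) Python transcription; `joint_median` and `check` are names chosen here. Under the stated assumptions (A1 < A2, B1 < B2, all four distinct), a brute-force check over a small range confirms that the output is the second-smallest of the four values.

```python
from itertools import permutations


def joint_median(A1, A2, B1, B2):
    """Direct transcription of the slide's program (no secure computation)."""
    a = A1 <= B1
    b = A2 if a else A1
    c = B1 if a else B2
    d = b <= c
    return b if d else c


def check(values=range(1, 7)):
    """Compare against sorting for every input that satisfies the assumptions."""
    for A1, A2, B1, B2 in permutations(values, 4):
        if A1 < A2 and B1 < B2:
            assert joint_median(A1, A2, B1, B2) == sorted([A1, A2, B1, B2])[1]
    return True


check()
print(joint_median(1, 4, 2, 3))
```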

Knowledge leads to optimized protocol

Can show that Alice and Bob can infer a and d

[Figure: the median program below, with two regions labeled “Secure Computation”.]

  a = A1 ≤ B1;
  b = a ? A2 : A1;
  c = a ? B1 : B2;
  d = b ≤ c;
  output = d ? b : c;
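One way to read this slide is as a split of the program into two small secure comparisons plus local steps. The sketch below is an assumed decomposition for illustration, not the protocol from the paper: `SecureCompare` is a hypothetical stand-in that only counts how often a secure comparison would be invoked, and the assignment of local steps to Alice and Bob is inferred from which inputs each statement reads. Because both parties can infer a and d from the final output anyway, revealing them is assumed to cost nothing in security.

```python
from itertools import permutations


def joint_median(A1, A2, B1, B2):
    """Reference: the whole program treated as one (notionally secure) computation."""
    a = A1 <= B1
    b = A2 if a else A1
    c = B1 if a else B2
    d = b <= c
    return b if d else c


class SecureCompare:
    """Hypothetical stand-in for a two-party secure comparison; it only counts calls."""

    def __init__(self):
        self.calls = 0

    def leq(self, x, y):
        self.calls += 1
        return x <= y


def mixed_mode_median(A1, A2, B1, B2, smc):
    a = smc.leq(A1, B1)   # secure step; result revealed to both parties
    b = A2 if a else A1   # Alice, locally (uses only her inputs and a)
    c = B1 if a else B2   # Bob, locally (uses only his inputs and a)
    d = smc.leq(b, c)     # secure step; result revealed to both parties
    return b if d else c  # whoever holds the selected value announces it


smc = SecureCompare()
inputs = [v for v in permutations(range(1, 7), 4) if v[0] < v[1] and v[2] < v[3]]
agree = all(mixed_mode_median(*v, smc) == joint_median(*v) for v in inputs)
print(agree, smc.calls)
```

In this sketch only the two comparisons touch both parties' data; the selections of b, c, and the output run as normal computation.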

Median Example – Analysis from Bob’s Perspective

  a = A1 ≤ B1;
  b = a ? A2 : A1;
  c = a ? B1 : B2;
  d = b ≤ c;
  output = d ? b : c;

Cases:
• A1 ≤ B1 ∧ A2 ≤ B1
• A1 ≤ B1 ∧ A2 > B1
• A1 > B1 ∧ A2 ≤ B1
• A1 > B1 ∧ A2 > B1

a = (output ≤ B1)

Recall: B1 < B2
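Bob's formula for a can be checked exhaustively on a small domain. The sketch below is an illustration with names chosen here (`run`, `witness_holds`); it enumerates every input over 1..6 that satisfies the assumptions and compares the real value of a with the value Bob computes from his own input B1 and the output. A finite check like this is evidence for the small domain only, not a proof.

```python
from itertools import permutations


def run(A1, A2, B1, B2):
    """Execute the median program and return (a, output)."""
    a = A1 <= B1
    b = A2 if a else A1
    c = B1 if a else B2
    d = b <= c
    output = b if d else c
    return a, output


def witness_holds(values=range(1, 7)):
    """True if a == (output <= B1) on every input that satisfies the assumptions."""
    for A1, A2, B1, B2 in permutations(values, 4):
        if A1 < A2 and B1 < B2:
            a, output = run(A1, A2, B1, B2)
            if a != (output <= B1):
                return False
    return True


print(witness_holds())
```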

Formalization of Knowledge

Party p knows x if:

x can be uniquely determined by p’s inputs I and outputs O

Two program executions that agree on I and O, also agree on x

Knowledge in Median Example

  a = A1 ≤ B1;
  b = a ? A2 : A1;
  c = a ? B1 : B2;
  d = b ≤ c;
  output = d ? b : c;

Bob knows a, if for all final states σ1 and σ2 s.t.
• σ1[B1] = σ2[B1],
• σ1[B2] = σ2[B2], and
• σ1[output] = σ2[output],
we have,
• σ1[a] = σ2[a]
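The definition can be tested directly on a finite domain by grouping final states by a party's view and checking that the variable takes a single value within each group. The sketch below is a brute-force stand-in for the solver-based check, with names (`run`, `final_states`, `knows`, `BOB`, `ALICE`) chosen here and inputs restricted to 1..6 as an assumption; it does not replace a proof over all inputs.

```python
from collections import defaultdict
from itertools import permutations


def run(A1, A2, B1, B2):
    """Execute the median program and return its final state."""
    a = A1 <= B1
    b = A2 if a else A1
    c = B1 if a else B2
    d = b <= c
    output = b if d else c
    return {"A1": A1, "A2": A2, "B1": B1, "B2": B2,
            "a": a, "b": b, "c": c, "d": d, "output": output}


def final_states(values=range(1, 7)):
    """Final states for every input that satisfies the slide's assumptions."""
    return [run(*v) for v in permutations(values, 4) if v[0] < v[1] and v[2] < v[3]]


def knows(view, var, states):
    """True if any two states that agree on `view` also agree on `var`."""
    seen = defaultdict(set)
    for s in states:
        seen[tuple(s[k] for k in view)].add(s[var])
    return all(len(vals) == 1 for vals in seen.values())


BOB = ("B1", "B2", "output")
ALICE = ("A1", "A2", "output")
states = final_states()

for party, view in (("Alice", ALICE), ("Bob", BOB)):
    for var in ("a", "b", "c", "d"):
        print(party, var, knows(view, var, states))
```

On this domain both parties know a and d, while Bob does not know b: two runs with the same B1, B2, and output can differ in Alice's A2.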

Results (details in paper)

Make the previous definition into an algorithm by using an idea called self-composition
– Allows us to create a formula whose satisfiability tells us whether a variable is known
– Can give this formula to an SMT solver
– Result: implementation and proof of correctness (sound and relatively complete)

We have also developed an algorithm that is constructive
– Computes formula that witnesses knowledge of the variable

Ongoing work

Building SMC compiler
– Novel programming language for expressing mixed mode multiparty computation (M3PC)
  • Combination of joint and local computations
– Will employ knowledge-inference optimization to transform SMC programs to M3PC programs
– Developing novel back end based on garbled circuits (standard mechanism) and oblivious RAM

Summary

Research agenda based on knowledge inference
– Determining what a party can learn about a secret given a run of a program
– Can use this for enforcing security, and optimizing computation

Ongoing work continues this agenda
– Time-varying secrets
– New applications (greater expressiveness)
– New computational platform

BACKUP

Expressibility

Prior work [CSF’11, JCS’13] supported limited language features.
– distributions: piecewise bounds over discrete domains
– possible but inconvenient to express other distributions

[Figure: discrete distributions with upper bounds and lower bounds.]

Expressibility: continuous distributions

Continuous distributions for modeling real world processes.

Polynomial approximation

Improve precision by polynomial bounds (as opposed to constant).

Scales better than enumeration

[Figure: two state spaces, each with all states equally likely: 0 ≤ bday ≤ 364, 1956 ≤ byear ≤ 1992; and 0 ≤ bday ≤ 364, 1910 ≤ byear ≤ 2010. Plot labels: 1 pp; > 1 pp.]

Performance/precision tradeoff

[Figure: panels labeled bday1 small; bday1 large; Birthday query 1+2+special.]

Intervals very fast generally

Query              Intervals   Octagons   Polyhedra
Bday1 (small)      0.01        1.87       2.81
Bday1+2 (small)    0.01        2.9        5.25
Bday1+2+spec       0.47        17.8       23.0
Bday1 (large)      0.01        2.1        2.48
Bday1+2 (large)    0.02        3.02       4.52
Bday1+2+spec       0.58        33.6       46.5
Pizza              0.26        92.7       125.5
Photo              0.02        5.47       7.98
Travel             0.48        126.9      154.5

Times in seconds

All achieve maximum precision when given unlimited polyhedra

LattE is the performance bottleneck

Merging order matters for precision

[Figure caption: Each point represents a different merging order for the given bound. Median precision point depicted as a box. Semi-interquartile range given in gray. Best precision possible is at the very bottom (about 3.8 × 10^-4).]