International Technology Alliance in
Network & Information Sciences
Knowledge Inference for Securing and
Optimizing Secure Computation
Piotr (Peter) Mardziel, Michael Hicks, Aseem Rastogi, Matthew Hammer, Jonathan Katz (UMD)
Mudhakar Srivatsa (IBM TJ Watson)
With Towsley et al. (UMass), Kasturi Rangan (UCLA)
Annual Meeting of the ITA, October 2013
Sharing between coalition domains is critical for mission success
[Figure: coalition scenario with shared items labeled X, X’, Y and Z. Participants: Scout (Coalition A); Supporting force (Coalition B); Unmanned Air Vehicle (UAV) (Coalition A); Back-office Data Analyst (Coalition A); Satellite Communications backhaul (Coalition A); Mixed force (Coalition A, C).]
ITA Technologies facilitate sharing
ITA has developed many excellent technologies for sharing information
– Gaian DB
– Information fabric
– Controlled English Store
All harness information and make it available to coalition partners
– Provide a query or pub/sub interface
But: there may be risk in sharing all information
– Might like to allow some queries but not others
  • If the query would reveal too much information about the raw data
  • If a sequence of queries would do so, even if one would not
Our research: Knowledge inference
Key idea: use program analysis (of the query)
– to understand what the answer reveals about sensitive information to a (rational) recipient
We call this analysis knowledge inference
We have used knowledge inference in a variety of applications
Summary of Results (outline)
Knowledge-based security [CSF’11, NIPSPP’12, JCS’13, HOTNETS’13]
– Enforce a security policy based on adversary’s (accumulated) knowledge
– Implementation and experimental evaluation
– Proof of soundness: will never underestimate adversary knowledge
Knowledge-based security for SMC [PLAS’12]
– Adapt knowledge inference to consider multiple parties’ secrets
– Proof of soundness
Optimizing SMC [PLAS’13]
– Identify inferrable values by knowledge inference
  • Do not bother to compute these using SMC
– Leads to 30x speedup
– Proof of correctness of technique
Papers on ITACS
[JCS’13] Piotr Mardziel, Stephen Magill, Mike Hicks and Mudhakar Srivatsa, Dynamic Enforcement of Knowledge-based Security Policies, Journal of Comp. Security, Feb’12.
– https://www.usukitacs.com/node/1900
[NIPSPP’12] Piotr Mardziel and Kasturi Rangan, Probabilistic Computation for Information Security, NIPS Probabilistic Programming Workshop, Dec’12.
– https://www.usukitacs.com/node/2234
[PLAS’12] P. Mardziel, M. Hicks, J. Katz and M. Srivatsa, Knowledge-Oriented Secure Multiparty Computation, Programming Languages and Analyses for Security, June’12.
– https://www.usukitacs.com/node/2003
[PLAS’13] Aseem Rastogi, Piotr Mardziel, Michael Hicks and Matthew Hammer, Knowledge Inference for Optimizing Secure Multi-party Computation, Programming Languages and Analyses for Security, June’13.
– https://www.usukitacs.com/node/2310
[HOTNETS’13] Z. Shafiq, F. Le, M. Srivatsa and D. Towsley, Cross-Path Inference Attacks on Multipath TCP, ACM HotNets, July’13.
– https://www.usukitacs.com/node/2491
Knowledge about the world
Learning about the world from observations.
[Figure: prior belief (0.5 : Today = not-raining; 0.5 : Today = raining) → weather → Outlook. Inference from Outlook = sunny gives 0.82 : Today = not-raining; 0.18 : Today = raining.]
Knowledge about secrets
Characterize adversary knowledge.
[Figure: Secret → system → Public Output. Inference from Public Output = “login failed” gives … 0.01 : Secret = 41; 0.90 : Secret = 42; 0.01 : Secret = 43 …]
Levels of knowledge?
Characterize system as safe vs. unsafe.
[Figure: four beliefs about the secret, classified as safe or unsafe and reached by inference or approx. inference:
  … 0.05 : Secret = 41; 0.05 : Secret = 42; 0.05 : Secret = 43 …
  … 0.02 : Secret = 41; 0.40 : Secret = 42; 0.02 : Secret = 43 …
  … 0.01 : Secret = 41; 0.90 : Secret = 42; 0.01 : Secret = 43 …
  1.00 : Secret = 42]
Soundness of knowledge
Soundly approximate level of knowledge.
[Figure: the same four beliefs, classified as safe or unsafe, with labels "actual", "inference" and "sound approx. inference":
  … 0.05 : Secret = 41; 0.05 : Secret = 42; 0.05 : Secret = 43 …
  … 0.02 : Secret = 41; 0.40 : Secret = 42; 0.02 : Secret = 43 …
  … 0.01 : Secret = 41; 0.90 : Secret = 42; 0.01 : Secret = 43 …
  1.00 : Secret = 42]
Technology: probabilistic programming
Programs
– whose inputs and outputs may be distributions rather than values
– which may contain uses of probabilistic choice
Effectively represent algorithmic description of a probabilistic model
– conditional probability distribution relating inputs and outputs
Pr [ Outlook = sunny | Today = not-raining ] = 0.9
weather(today) {
  if (today == "not-raining") {
    if (flip 0.9) return "sunny" else return "overcast"
  } else if (today == "raining") {
    if (flip 0.8) return "overcast" else return "sunny"
  }
}
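As a rough illustration of how the weather program acts as a conditional distribution, the following Python sketch inverts it with Bayes' rule. The function names (weather_likelihood, posterior) are hypothetical and not part of any tool from the talk; the probabilities are the ones in the weather program and the 0.5 / 0.5 prior from the earlier figure.

```python
def weather_likelihood(today):
    """Pr[Outlook | Today], read off the weather program's flip probabilities."""
    if today == "not-raining":
        return {"sunny": 0.9, "overcast": 0.1}
    if today == "raining":
        return {"overcast": 0.8, "sunny": 0.2}
    raise ValueError(f"unknown value for today: {today}")


def posterior(prior, observed_outlook):
    """Bayes' rule: weight each prior value by the likelihood, then renormalize."""
    joint = {
        today: p * weather_likelihood(today)[observed_outlook]
        for today, p in prior.items()
    }
    total = sum(joint.values())
    return {today: p / total for today, p in joint.items()}


prior = {"not-raining": 0.5, "raining": 0.5}
belief = posterior(prior, "sunny")
print({today: round(p, 2) for today, p in belief.items()})
```

Rounded to two places, the revised belief matches the 0.82 / 0.18 split shown for Outlook = sunny.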
Knowledge-based security
• Maintain a representation of each querier’s belief about the secret’s possible values
  – Belief ≜ probability distribution
• Each query result revises the belief (by Bayesian reasoning); reject the query if the actual secret becomes too likely
• The decision to reject must not itself defeat the protection (a rejection can also reveal something about the secret).
[Figure: timeline of queries Q1, Q2, Q3, …; two are marked OK (answer) and one is marked Reject.]
Policy = knowledge threshold
Answer a query if, for querier’s revised belief, Pr[my secret] < t
– Call t the knowledge threshold
Choice of t depends on the risk of revelation
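A minimal Python sketch of a threshold policy over an explicitly enumerated belief follows. It is an illustration under assumptions, not the αProb implementation: the names (revise, safe_to_answer, at_least_5, is_odd) are hypothetical, queries are assumed deterministic, and the check is made for every answer the querier considers possible, so that the accept/reject decision does not depend on the actual secret.

```python
from fractions import Fraction


def revise(belief, query, answer):
    """Bayesian revision for a deterministic query: keep consistent secrets, renormalize."""
    kept = {s: p for s, p in belief.items() if query(s) == answer}
    total = sum(kept.values())
    return {s: p / total for s, p in kept.items()}


def safe_to_answer(belief, query, t):
    """True if, for every possible answer, no secret reaches probability >= t.

    Checking all possible answers (an assumption of this sketch) keeps the
    decision independent of the actual secret, so a rejection leaks nothing.
    """
    for answer in {query(s) for s in belief}:
        if max(revise(belief, query, answer).values()) >= t:
            return False
    return True


def at_least_5(s):
    return s >= 5


def is_odd(s):
    return s % 2 == 1


# Hypothetical setup: secret uniform over 0..9, knowledge threshold t = 0.3.
prior = {s: Fraction(1, 10) for s in range(10)}
t = Fraction(3, 10)

print(safe_to_answer(prior, at_least_5, t))   # each query is fine on its own
print(safe_to_answer(prior, is_odd, t))
after_q1 = revise(prior, at_least_5, at_least_5(7))  # actual secret is 7
print(safe_to_answer(after_q1, is_odd, t))    # but not in sequence
```

In this setup each query is acceptable alone, while the second is rejected after the first has been answered, which is the "sequence of queries" risk described earlier.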
αProb: Implementation (CSF’11, JCS’13)
Queries are simple imperative programs
Approach: abstract interpretation for implementing probabilistic operations. Building blocks:
– lattice point enumeration
– integer programming
Key idea: abstract interpretation is sound
– Never underestimate the knowledge
– But may overestimate it
  • Improves audit time
  • May reject some legal queries
Application to sensor networks, location [NIPSPP’12]
– Gave demo earlier in the week
Application to MPTCP [HOTNETS’13]
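The sound-but-imprecise tradeoff can be illustrated with a much simpler abstraction than the one αProb uses. The Python sketch below is an assumption-laden stand-in: a one-dimensional interval of secrets with lower and upper bounds on each point's (unnormalized) probability. The class IntervalBelief and its methods are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass


@dataclass
class IntervalBelief:
    lo: int        # smallest secret still considered possible
    hi: int        # largest secret still considered possible
    p_min: float   # lower bound on each point's unnormalized probability
    p_max: float   # upper bound on each point's unnormalized probability

    def size(self):
        return self.hi - self.lo + 1

    def condition_at_least(self, c):
        """Revise on learning that secret >= c (shrinks the interval)."""
        return IntervalBelief(max(self.lo, c), self.hi, self.p_min, self.p_max)

    def max_probability_bound(self):
        """Upper bound on the normalized probability of any single secret.

        Total mass is at least size * p_min, and no point exceeds p_max.
        """
        return min(1.0, self.p_max / (self.size() * self.p_min))


def exact_max_probability(weights):
    """Exact normalized probability of the most likely secret."""
    return max(weights.values()) / sum(weights.values())


# Hypothetical belief over 0..99: most points weigh 0.005, a few weigh 0.02.
abstract = IntervalBelief(0, 99, 0.005, 0.02).condition_at_least(90)
concrete = {s: (0.005 if s < 95 else 0.02) for s in range(90, 100)}

bound = abstract.max_probability_bound()
exact = exact_max_probability(concrete)
print(bound, exact)
```

Here the bound is about 0.4 while the exact value is about 0.16: the abstraction never underestimates, but with a threshold such as t = 0.3 it would reject a query that exact tracking would allow.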
Current activity: Modeling time/change
Secrets can change over time.
In progress: formal model, theorems about knowledge of both the stream of secrets and the delta function
Pr [ Secret1 = 42 ] = 1.0
delta(secret1) {
  if (flip 0.9) return secret1
  else return (uniform 0,255)
}
Pr [ Secret2 = 42 | Secret1 = 42 ] = 0.900392
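The arithmetic behind the delta example can be reproduced with a short Python sketch. The function name delta_distribution and its parameters are hypothetical. The slide does not say how many values `uniform 0,255` ranges over, so the count is left as a parameter: 255 equally likely values reproduce the figure 0.900392, while 256 values would give 0.900390625.

```python
from fractions import Fraction


def delta_distribution(secret1, stay=Fraction(9, 10), n_values=256):
    """Distribution of delta(secret1).

    With probability `stay` the secret is unchanged; otherwise it is redrawn
    uniformly from n_values equally likely values (assumed to include secret1).
    """
    jump = (1 - stay) / n_values
    dist = {v: jump for v in range(n_values)}
    dist[secret1] += stay
    return dist


dist_255 = delta_distribution(42, n_values=255)
dist_256 = delta_distribution(42, n_values=256)
print(round(float(dist_255[42]), 6))
print(float(dist_256[42]))
```

The secret stays at 42 either by not changing or by being redrawn as 42, which is why the probability is slightly above 0.9.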
Other activities
Expand expressiveness, improve performance
– Model continuous distributions, not just discrete ones
– Employ other forms of approximation
More applications
– Multipath TCP flows
– Sensor networks
– Mobility
Joint computations over secrets
Rather than asymmetric queries, may want to compute joint results
– Coalitions each have sensor networks; use them to answer queries while hiding details
– Coalitions perform joint mission planning; staff mission without knowing total resources
[Figure: two parties with inputs x and y compute Q(x,y), where Q = some function; example output: “attack at dawn”.]
Secure multiparty computation
Multiple parties have secrets to protect. Want to compute some function over their secrets without revealing them.
[Figure: two parties hold x and y and compute Q(x,y), whose output is True / False.]
Q = if x ≥ y then out := True else out := False
Secure multiparty computation
Use trusted third party.
[Figure: the parties give x and y to a trusted third party T, which computes Q(x,y); the output is True.]
Q = if x ≥ y then out := True else out := False
Secure multiparty computation
SMC lets the participants compute this without a trusted third party.
[Figure: the parties compute Q(x,y) from x and y without the trusted third party T; the output is True.]
Q = if x ≥ y then out := True else out := False
Secure multiparty computation
Nothing is learned beyond what is implied* by the query output.
[Figure: two parties hold x and y and compute Q(x,y), whose output is True / False.]
Q = if x ≥ y then out := True else out := False
Secure multiparty computation
Nothing is learned beyond what is implied* by the query output.
– * what is implied can be a lot
[Figure: A holds x, B holds y = 2; Q(x,2) outputs False. The annotation on x changes from "x = ?" to "x = 1".]
Q = if x ≥ y then out := True else out := False
Secure multiparty computation
Nothing is learned beyond what is implied* by the query output.
– * what is implied can be a lot
[Figure: A holds x, B holds y = 3; Q(x,3) outputs False. The annotation on x changes from "x = ?" to "x ∈ {1,2}".]
Q = if x ≥ y then out := True else out := False
Secure multiparty computation
Nothing is learned beyond what is implied* by the query output.
– * what is implied can be a lot
[Figure: A holds x, B holds y = ∞; Q(x,∞) outputs False. The annotation on x changes from "x = ?" to "x ≥ 1".]
Q = if x ≥ y then out := True else out := False
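The three cases above can be reproduced by enumerating which values of x are consistent with B's own input and the public output. This Python sketch assumes x is a positive integer and uses 1..10 as a stand-in for the full domain; implied_values and DOMAIN are hypothetical names.

```python
import math


def Q(x, y):
    """The query from the slides: out := True if x >= y, else False."""
    return x >= y


def implied_values(y, out, domain):
    """All x in the domain consistent with B's input y and the public output."""
    return [x for x in domain if Q(x, y) == out]


# Assumption: x is a positive integer; 1..10 stands in for the full domain.
DOMAIN = range(1, 11)

print(implied_values(2, False, DOMAIN))         # only x = 1 remains
print(implied_values(3, False, DOMAIN))         # x is 1 or 2
print(implied_values(math.inf, False, DOMAIN))  # every x remains: nothing learned
```

How much the output implies depends on B's own input: a small y pins x down exactly, while y = ∞ leaves B knowing only that x ≥ 1.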
Knowledge-based security for SMC (PLAS’12)
Results (details in paper):
– Adapt knowledge inference to SMC setting
– Enforce threshold-based policies
  • Two techniques: Belief sets and SMC-based belief tracking
– Proof that our methods are sound (never underapproximate adversary knowledge)
Implementation not sufficiently performant for use on-line
Goal: Make SMC more performant (PLAS’13)
SMC is an appealing technology, but it is very slow
– Implementation based on “garbled circuits”
– Several orders of magnitude slower than normal computation
Recent work has developed general methods to improve SMC performance
– Circuit-level optimizations
– Pipelining circuit generation and execution (increases parallelism and decreases memory)
– But: ultimately SMC is always going to be much slower than normal computation
Idea: use knowledge inference to find opportunities to replace SMC with normal computation in particular programs, with no loss to security
Example – Joint Median Computation
{ A1, A2 }, { B1, B2 }
Assume: A1 < A2 and B1 < B2 and Distinct(A1, A2, B1, B2)
a = A1 ≤ B1;
b = a ? A2 : A1;
c = a ? B1 : B2;
d = b ≤ c;
output = d ? b : c;
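For readers who want to run the example, here is a direct Python transcription of the five-line program, with a brute-force check over a small assumed domain (integers 1..7). The check confirms that, under the stated assumptions on the inputs, the output is the second smallest of the four values. The function names and the domain are hypothetical.

```python
from itertools import permutations


def joint_median(A1, A2, B1, B2):
    """Transcription of the slide's program; returns every variable."""
    a = A1 <= B1
    b = A2 if a else A1
    c = B1 if a else B2
    d = b <= c
    output = b if d else c
    return {"a": a, "b": b, "c": c, "d": d, "output": output}


def check_median(domain=range(1, 8)):
    """Check every input satisfying A1 < A2, B1 < B2, all four distinct."""
    for A1, A2, B1, B2 in permutations(domain, 4):  # permutations are distinct
        if A1 < A2 and B1 < B2:
            expected = sorted([A1, A2, B1, B2])[1]  # second smallest value
            assert joint_median(A1, A2, B1, B2)["output"] == expected
    return True


print(check_median())
print(joint_median(1, 4, 2, 3))
```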
Knowledge leads to optimized protocol
Can show that Alice and Bob can infer a and d
a = A1 ≤ B1;
b = a ? A2 : A1;
c = a ? B1 : B2;
d = b ≤ c;
output = d ? b : c;
[Figure annotations: two parts of the program are labeled “Secure Computation”; the label “d” appears next to the output line.]
Median Example – Analysis from Bob’s Perspective
a = A1 ≤ B1;
b = a ? A2 : A1;
c = a ? B1 : B2;
d = b ≤ c;
output = d ? b : c;
A1 ≤ B1 ∧ A2 ≤ B1
A1 ≤ B1 ∧ A2 > B1
A1 > B1 ∧ A2 ≤ B1
A1 > B1 ∧ A2 > B1
a = (output ≤ B1)
Recall: B1 < B2
Formalization of Knowledge
Party p knows x if:
x can be uniquely determined by p’s inputs I and outputs O
Two program executions that agree on I and O also agree on x
Knowledge in Median Example
a = A1 ≤ B1;
b = a ? A2 : A1;
c = a ? B1 : B2;
d = b ≤ c;
output = d ? b : c;
Bob knows a, if for all final states σ1 and σ2 s.t.
• σ1[B1] = σ2[B1],
• σ1[B2] = σ2[B2], and
• σ1[output] = σ2[output],
we have
• σ1[a] = σ2[a]
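The definition can be tested by brute force on a small domain: enumerate all final states, group them by what a party observes, and check that the variable in question never takes two values within a group. The Python sketch below does this for the median program. It is an enumeration over an assumed domain (integers 1..6), not the solver-based analysis from the paper, and the names (run, final_states, knows) are hypothetical.

```python
from itertools import permutations


def run(A1, A2, B1, B2):
    """Final state of the median program for one choice of inputs."""
    a = A1 <= B1
    b = A2 if a else A1
    c = B1 if a else B2
    d = b <= c
    output = b if d else c
    return {"A1": A1, "A2": A2, "B1": B1, "B2": B2,
            "a": a, "b": b, "c": c, "d": d, "output": output}


def final_states(domain):
    """All final states whose inputs satisfy the slide's assumptions."""
    return [run(A1, A2, B1, B2)
            for A1, A2, B1, B2 in permutations(domain, 4)
            if A1 < A2 and B1 < B2]


def knows(states, observed, x):
    """True if any two states agreeing on `observed` also agree on x."""
    seen = {}
    for s in states:
        key = tuple(s[v] for v in observed)
        if seen.setdefault(key, s[x]) != s[x]:
            return False
    return True


STATES = final_states(range(1, 7))   # assumed small domain
BOB = ("B1", "B2", "output")
ALICE = ("A1", "A2", "output")

print(knows(STATES, BOB, "a"), knows(STATES, BOB, "d"))
print(knows(STATES, ALICE, "a"), knows(STATES, ALICE, "d"))
print(knows(STATES, BOB, "b"))
```

On this domain both parties know a and d, while Bob does not know b in general. Enumeration only covers the chosen domain, so it illustrates the definition rather than proving the general claim.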
Results (details in paper)
Make the previous definition into an algorithm by using an idea called self-composition
– Allows us to create a formula whose satisfiability says whether a variable is known
– Can give this formula to an SMT solver
– Result: implementation and proof of correctness (sound and relatively complete)
We have also developed an algorithm that is constructive
– Computes formula that witnesses knowledge of the variable
Ongoing work
Building SMC compiler
– Novel programming language for expressing mixed mode multiparty computation (M3PC)
  • Combination of joint and local computations
– Will employ knowledge-inference optimization to transform SMC programs to M3PC programs
– Developing novel back end based on garbled circuits (standard mechanism) and oblivious RAM
Summary
Research agenda based on knowledge inference
– Determining what a party can learn about a secret given a run of a program
– Can use this for enforcing security, and optimizing computation
Ongoing work continues this agenda
– Time-varying secrets
– New applications (greater expressiveness)
– New computational platform
BACKUP
Expressibility
Prior work [CSF’11, JCS’13] supported limited language features.
– distributions: piecewise bounds over discrete domains
– possible but inconvenient to express other distributions
[Figure: discrete distributions with upper bounds and lower bounds.]
Expressibility: continuous distributions
Continuous distributions for modeling real world processes.
Polynomial approximation
Improve precision by polynomial bounds (as opposed to constant).
Scales better than enumeration
[Figure: two settings, each with every point equally likely. bday1 small: 0 ≤ bday ≤ 364, 1956 ≤ byear ≤ 1992. bday1 large: 0 ≤ bday ≤ 364, 1910 ≤ byear ≤ 2010. Series: 1 pp and > 1 pp.]
Performance/precision tradeoff
[Figure: Birthday query 1+2+special.]
Intervals very fast generally
Times in seconds
Query              Intervals   Octagons   Polyhedra
Bday1 (small)      0.01        1.87       2.81
Bday1+2 (small)    0.01        2.9        5.25
Bday1+2+spec       0.47        17.8       23.0
Bday1 (large)      0.01        2.1        2.48
Bday1+2 (large)    0.02        3.02       4.52
Bday1+2+spec       0.58        33.6       46.5
Pizza              0.26        92.7       125.5
Photo              0.02        5.47       7.98
Travel             0.48        126.9      154.5
All achieve maximum precision when given unlimited polyhedra
LattE is the performance bottleneck
Merging order matters for precision
Each point represents a different merging order for the given bound
Median precision point depicted as a box
Semi-interquartile range given in gray
Best precision possible is at the very bottom (about 3.8 × 10^-4)