Vitaly Shmatikov
SRI International
Constraint-Based Methods: Adding Algebraic Properties
to Symbolic Models
One-Slide Summary
“Constraint solving” is a symbolic analysis method for cryptographic protocols
  Decidable without finite bounds on the attacker
    Big win over finite-state checking (FDR, Murφ, etc.)
  Only need to specify behavior of honest participants
  Can be extended with algebraic theories for XOR, modular multiplication, Diffie-Hellman
  Push-button procedure for finding both Dolev-Yao and algebraic attacks (e.g., Pereira-Quisquater)
Works only for a finite number of sessions
  “Attack template” must be expressed as a symbolic execution trace
Protocol Analysis Techniques
[Figure: taxonomy of protocol analysis techniques. Formal models (no probabilities): modal logics, process calculi, inductive proofs, …, and decidable methods (finite-state; infinite message space with finite sessions), over a free attacker algebra or an attacker algebra with equational theory. Computational models: probabilistic poly-time, random oracle, …]
Protocol Analysis Meets Algebra
Dolev-Yao model uses “black-box” cryptography
Many crypto primitives are not black boxes
  XOR: $a \oplus b = b \oplus a$; $a \oplus a = 0$
  Modular exponentiation: $\alpha^{xy} = \alpha^{yx}$; $(\alpha^{xy})^{x^{-1}} = \alpha^{y}$
Attacker can and will exploit algebraic properties
  Ryan-Schneider attack on Bull’s recursive authentication protocol
  Pereira-Quisquater attack on A-GDH.2 protocol
Goal: fully automated analysis of protocols with relevant algebraic theories
  GDOI, group key management protocols, …
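The two kinds of identities named on this slide can be checked numerically. The following is only a sketch: the group parameters (`p = 23`, `q = 11`, `alpha = 2`) and the function names are assumptions chosen for illustration and do not come from the slides. It needs Python 3.8 or later for the modular inverse via `pow(x, -1, q)`.

```python
# Toy parameters (assumed for illustration): alpha = 2 has order q = 11 in Z_23^*.
p, q, alpha = 23, 11, 2

def xor_identities(a: int, b: int) -> bool:
    """Check a XOR b == b XOR a and a XOR a == 0."""
    return (a ^ b) == (b ^ a) and (a ^ a) == 0

def exp_identities(x: int, y: int) -> bool:
    """Check (alpha^x)^y == (alpha^y)^x and (alpha^(x*y))^(x^-1) == alpha^y."""
    commute = pow(pow(alpha, x, p), y, p) == pow(pow(alpha, y, p), x, p)
    x_inv = pow(x, -1, q)  # inverse of the exponent modulo the group order
    cancel = pow(pow(alpha, x * y, p), x_inv, p) == pow(alpha, y, p)
    return commute and cancel
```

A black-box (free algebra) model treats both sides of each identity as different terms, which is why attacks that rely on them are invisible to it.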
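A-GDH.2 Protocol [Ateniese, Steiner, Tsudik ’00]
Parties start with pairwise keys $K_{az}$, $K_{bz}$, $K_{cz}$
The goal is to establish common session key $\alpha^{r_a r_b r_c r_z}$
$p$ is prime, $q$ is a prime divisor of $p-1$, $\alpha$ is a generator of the cyclic subgroup of $\mathbb{Z}_p^*$ of order $q$
Message flow:
  A → B: $\alpha, \alpha^{r_a}$
  B → C: $\alpha^{r_a}, \alpha^{r_b}, \alpha^{r_a r_b}$
  C → Z: $\alpha^{r_b r_c}, \alpha^{r_a r_c}, \alpha^{r_a r_b}, \alpha^{r_a r_b r_c}$
  Z → A, B, C: $\alpha^{r_b r_c r_z K_{az}}, \alpha^{r_a r_c r_z K_{bz}}, \alpha^{r_a r_b r_z K_{cz}}$
B computes session key $\alpha^{r_a r_b r_c r_z}$ as $(\alpha^{r_a r_c r_z K_{bz}})^{K_{bz}^{-1} r_b}$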
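To make the key computation concrete, here is a minimal sketch of the final broadcast round of an honest run. The toy group (`p = 23`, `q = 11`, `alpha = 2`), the function name `agdh2_honest_run` and the sample exponents are assumptions for illustration only; the earlier upflow messages are not simulated.

```python
# Toy group (assumed): alpha = 2 has prime order q = 11 in Z_23^*.
p, q, alpha = 23, 11, 2

def inv(x: int) -> int:
    """Inverse of an exponent modulo the group order q (Python 3.8+)."""
    return pow(x, -1, q)

def agdh2_honest_run(ra, rb, rc, rz, kaz, kbz, kcz):
    """Return the session keys computed by A, B, C and Z in an honest run."""
    # Z's broadcast: alpha^(rb rc rz Kaz), alpha^(ra rc rz Kbz), alpha^(ra rb rz Kcz)
    to_a = pow(alpha, rb * rc * rz * kaz, p)
    to_b = pow(alpha, ra * rc * rz * kbz, p)
    to_c = pow(alpha, ra * rb * rz * kcz, p)
    # Each party strips its pairwise key and adds its own contribution.
    key_a = pow(to_a, inv(kaz) * ra, p)
    key_b = pow(to_b, inv(kbz) * rb, p)
    key_c = pow(to_c, inv(kcz) * rc, p)
    key_z = pow(alpha, ra * rb * rc * rz, p)
    return key_a, key_b, key_c, key_z

keys = agdh2_honest_run(ra=2, rb=3, rc=4, rz=5, kaz=6, kbz=7, kcz=8)
```

All four values in `keys` coincide, because each exponent reduces to `ra*rb*rc*rz` modulo the group order.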
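Is This Protocol Secure?
Suppose two sessions are run concurrently, and malicious C wants to learn the session key of the session from which he is excluded
1st session (A, B, C, Z):
  A → B: $\alpha, \alpha^{r_a}$
  B → C: $\alpha^{r_a}, \alpha^{r_b}, \alpha^{r_a r_b}$
  C → Z: $\alpha^{r_b r_c}, \alpha^{r_a r_c}, \alpha^{r_a r_b}, \alpha^{r_a r_b r_c}$
  Z → A, B, C: $\alpha^{r_b r_c r_z K_{az}}, \alpha^{r_a r_c r_z K_{bz}}, \alpha^{r_a r_b r_z K_{cz}}$
2nd session (A, B, Z; C is excluded):
  A → B: $\alpha, \alpha^{q_a}$
  B → Z: $\alpha^{q_a}, \alpha^{q_b}, \alpha^{q_a q_b}$
  Z → A, B: $\alpha^{q_b q_z K_{az}}, \alpha^{q_a q_z K_{bz}}$
Can the attacker who controls the network and participates in the 1st session learn the session key of the 2nd session?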
Model Checking Approach
Two sources of infinite behavior
  Multiple protocol sessions, multiple participant roles
  Message space or data space may be infinite
Finite approximation
  Assume finite number of participants
    Example: 2 clients, 2 servers
    This restriction is necessary (or the problem is undecidable)
  Assume finite message space
    Represent random numbers by r1, r2, r3, …
    Do not allow encrypt(encrypt(encrypt(…)))
    This restriction is not necessary for fully automated analysis!
Infinite-State Protocol Model
Finite number of processes
  Each process models a protocol role
Messages modeled as terms with variables
  Variables represent data under attacker’s control
Attacker capabilities modeled by a term algebra
  No artificial bounds on attacker computations
  Generates an infinite space of possible attacker messages
Protocol analysis problem reduces to a decidable symbolic constraint solving problem
  Easy-to-use, practical software for protocol analysis
[Amadio and Lugiez ’00] [Rusinowitch and Turuani ’01] [Boreale ’01] [Millen and Shmatikov ’01]
Roles in A-GDH.2 Protocol
[Figure: A-GDH.2 message flow among A, B, C and Z]
B role (← receives, → sends):
  B ← $\alpha, X_1$
  B → $X_1, \alpha^{r_b}, X_1^{r_b}$
  B ← $Y_1, Y_2^{K_{bz}}, Y_3$
Z role:
  Z ← $Z_1, Z_2, Z_3, Z_4$
  Z → $Z_1^{r_z K_{az}}, Z_2^{r_z K_{bz}}, Z_3^{r_z K_{cz}}$
Variables represent terms unknown to the party who plays the role
Attacker can instantiate a variable with any value, but instantiation must be consistent in all terms where it occurs
Symbolic Execution Trace
Suppose two sessions are run concurrently, and malicious C wants to learn the session key of the session from which he is excluded
[Figure: message flows of the 1st session (A, B, C, Z) and the 2nd session (A, B, Z), with C as the attacker]
Partial execution trace (there are finitely many)
B and Z from 1st session:
  B ← $\alpha, X_1$
  B → $X_1, \alpha^{r_b}, X_1^{r_b}$
  Z ← $Z_1, Z_2, Z_3, Z_4$
  Z → $Z_1^{r_z K_{az}}, Z_2^{r_z K_{bz}}, Z_3^{r_z K_{cz}}$
B from 2nd session:
  B ← $\alpha, V_1$
  B → $V_1, \alpha^{q_b}, V_1^{q_b}$
  B ← $W_1, W_2^{K_{bz}}, W_3$
Is There A Feasible Attack?
  B ← $\alpha, X_1$
  B → $X_1, \alpha^{r_b}, X_1^{r_b}$
  Z ← $Z_1, Z_2, Z_3, Z_4$
  Z → $Z_1^{r_z K_{az}}, Z_2^{r_z K_{bz}}, Z_3^{r_z K_{cz}}$
  B ← $\alpha, V_1$
  B → $V_1, \alpha^{q_b}, V_1^{q_b}$
  B ← $W_1, W_2^{K_{bz}}, W_3$
  $W_2^{q_b}$
B will use $W_2^{q_b}$ as session key. If attacker can learn (and announce) it, the protocol is broken.
This attack is feasible if and only if the attacker can consistently instantiate all variables in the trace so that he can produce every message received by B and Z
Symbolic Attack Traces
Attack is modeled as a symbolic execution trace
  A trace is a sequence of message send and receive events
  Attack trace ends in a violation (e.g., attacker learns the secret)
  Messages contain variables, modeling data controlled by attacker
Adequate for trace-based security properties
  Secrecy, authentication, some forms of fairness…
A symbolic trace may or may not have a feasible concrete instantiation
  Finding whether such an instantiation exists is the main goal of symbolic (infinite-state) protocol analysis
From Attack Traces to Constraints
For each message sent by the attacker in the attack trace, create a symbolic constraint $m_i$ from $t_1, \ldots, t_n$
  $m_i$ is the message attacker needs to send
  $t_1, \ldots, t_n$ are the messages observed by attacker up to this point
Attack is feasible if and only if all constraints are satisfiable simultaneously
  There exists an instantiation $\sigma$ such that, for every $i$, $m_i\sigma$ can be derived from $t_1\sigma, \ldots, t_n\sigma$ in attacker’s term algebra
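The construction of constraints from a trace can be sketched in a few lines. This is an illustrative sketch, not the tool described in the talk: the event encoding, the function name `generate_constraints`, and the use of plain strings for terms are all assumptions.

```python
from typing import List, Tuple

Event = Tuple[str, str, tuple]    # (party, "recv" | "send", message terms)
Constraint = Tuple[tuple, tuple]  # (m_i, (t_1, ..., t_n))

def generate_constraints(initial_knowledge: tuple, trace: List[Event]) -> List[Constraint]:
    """Emit one constraint per message the attacker must deliver to an honest party."""
    knowledge = list(initial_knowledge)
    constraints = []
    for _party, direction, message in trace:
        if direction == "recv":
            # The attacker must build this message from everything seen so far.
            constraints.append((message, tuple(knowledge)))
        elif direction == "send":
            # Everything an honest party sends is observed by the attacker.
            knowledge.extend(message)
        else:
            raise ValueError(f"unknown direction: {direction}")
    return constraints

# Hypothetical encoding of the first three events of the A-GDH.2 attack trace.
trace = [
    ("B", "recv", ("alpha", "X1")),
    ("B", "send", ("X1", "alpha^rb", "X1^rb")),
    ("Z", "recv", ("Z1", "Z2", "Z3", "Z4")),
]
constraints = generate_constraints(("alpha", "kcz"), trace)
```

Because knowledge only grows along the trace, the term sets of successive constraints are monotonically increasing.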
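Constraint Generation for A-GDH.2
Attack trace:
  B ← $\alpha, X_1$
  B → $X_1, \alpha^{r_b}, X_1^{r_b}$
  Z ← $Z_1, Z_2, Z_3, Z_4$
  Z → $Z_1^{r_z K_{az}}, Z_2^{r_z K_{bz}}, Z_3^{r_z K_{cz}}$
  B ← $\alpha, V_1$
  B → $V_1, \alpha^{q_b}, V_1^{q_b}$
  B ← $W_1, W_2^{K_{bz}}, W_3$
  $W_2^{q_b}$
Constraints:
  $\alpha, X_1$ from $\alpha, k_{cz}$ (attacker’s initial knowledge)
  $Z_1, Z_2, Z_3, Z_4$ from $\alpha, k_{cz}, X_1, \alpha^{r_b}, X_1^{r_b}$
  $\alpha, V_1$ from $\alpha, k_{cz}, X_1, \alpha^{r_b}, X_1^{r_b}, Z_1^{r_z K_{az}}, Z_2^{r_z K_{bz}}, Z_3^{r_z K_{cz}}$
  $W_1, W_2^{K_{bz}}, W_3$ from $\alpha, k_{cz}, X_1, \alpha^{r_b}, X_1^{r_b}, Z_1^{r_z K_{az}}, Z_2^{r_z K_{bz}}, Z_3^{r_z K_{cz}}, V_1, \alpha^{q_b}, V_1^{q_b}$
  $W_2^{q_b}$ from $\alpha, k_{cz}, X_1, \alpha^{r_b}, X_1^{r_b}, Z_1^{r_z K_{az}}, Z_2^{r_z K_{bz}}, Z_3^{r_z K_{cz}}, V_1, \alpha^{q_b}, V_1^{q_b}$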
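Dolev-Yao Term Algebra
Attacker’s term algebra is a set of derivation rules (premises ⇒ conclusion)
  $T \vdash u,\ T \vdash v \;\Rightarrow\; T \vdash [u,v]$
  $T \vdash u,\ T \vdash v \;\Rightarrow\; T \vdash \mathrm{crypt}_u[v]$
  $v \in T \;\Rightarrow\; T \vdash u$, if $u = v\sigma$ for some $\sigma$
  $T \vdash [u,v] \;\Rightarrow\; T \vdash u$
  $T \vdash [u,v] \;\Rightarrow\; T \vdash v$
  $T \vdash \mathrm{crypt}_u[v],\ T \vdash u \;\Rightarrow\; T \vdash v$
Symbolic constraint $m$ from $t_1, \ldots, t_n$ is satisfiable if and only if there is a substitution $\sigma$ such that $t_1\sigma, \ldots, t_n\sigma \vdash m\sigma$ is derivable using these rules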
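For ground terms (no variables, hence no substitution to find), derivability under these rules can be decided by closing the attacker's knowledge under decomposition and then checking whether the target can be composed. The sketch below covers only that ground case and is not the symbolic procedure of the talk; the tuple encoding `("pair", u, v)` / `("crypt", key, body)` and the function names are assumptions.

```python
def synth(m, known: set) -> bool:
    """Can m be built from `known` using only pairing and encryption?"""
    if m in known:
        return True
    if isinstance(m, tuple) and m[0] in ("pair", "crypt"):
        return synth(m[1], known) and synth(m[2], known)
    return False

def analyze(terms) -> set:
    """Close `terms` under projection and decryption with derivable keys."""
    known = set(terms)
    changed = True
    while changed:
        changed = False
        for t in list(known):
            if not isinstance(t, tuple):
                continue  # atoms cannot be decomposed
            if t[0] == "pair":
                parts = [t[1], t[2]]
            elif t[0] == "crypt" and synth(t[1], known):
                parts = [t[2]]  # the key (possibly non-atomic) is derivable
            else:
                parts = []
            for part in parts:
                if part not in known:
                    known.add(part)
                    changed = True
    return known

def derivable(m, terms) -> bool:
    """Decide T |- m for ground terms."""
    return synth(m, analyze(terms))

T = [("crypt", "k", ("pair", "s", "n")), "k"]
```

With `T` as above, `derivable("s", T)` holds because the attacker knows the key `k`; without `k` in the term set the secret stays hidden.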
Properties of Term Algebra
No restriction on structural size of terms
  The closure of any term set under derivation rules is infinite
  There is no a priori bound on attacker computations
Untyped
  Attacker doesn’t have to comply with the protocol specification
  Attacker may substitute a ciphertext for a random number, a key for an output of a hash function, etc.
Symmetric encryption with non-atomic keys
Can add an equational theory to model algebraic properties of cryptographic functions
  XOR, modular exponentiation, blinded signatures, …
Solving Symbolic Constraints
Constraint reduction rules
  Replace each $m_i$ from $T_i$ with one or more simpler constraints
  Preserve essential properties of the constraint sequence
Nondeterministic reduction procedure
  Structure-driven, but several rules may apply in any state
  Exponential in the worst case (the problem is NP-complete)
The procedure is terminating and complete
  If $T\sigma \vdash m\sigma$ is derivable in attacker’s term algebra, then
  1. There exists reduction rule $r = r(\sigma)$ which is applicable to $m$ from $T$ and produces some $m'$ from $T'$ such that
  2. $T'\sigma \vdash m'\sigma$ is derivable in attacker’s term algebra
[Millen and Shmatikov CCS ’01]
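Reduction Rules
(constraint above the line ⇒ constraint(s) below the line)
  (elim)  $m$ from $T, v$  ⇒  $m$ from $T$
  (un)    $m$ from $t, T$  ⇒  constraint eliminated; add $\mathrm{mgu}(t,m)$ to $\sigma$
  (split) $m$ from $[u,v], T$  ⇒  $m$ from $u, v, T$
  (dec)   $m$ from $\mathrm{crypt}_u[v], T$  ⇒  $u$ from $\mathrm{crypt}_u\,v, T$  and  $m$ from $\mathrm{crypt}_u[v], v, T$
  (pair)  $[m_1,m_2]$ from $T$  ⇒  $m_1$ from $T$  and  $m_2$ from $T$
  (enc)   $\mathrm{crypt}_k[m]$ from $T$  ⇒  $m$ from $T$  and  $k$ from $T$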
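Reduction Procedure
[Figure: reduction tree rooted at the initial constraint sequence]
At each node, apply every possible reduction rule to the first $m$ from $T$ where $m$ is not a variable
Each leaf is either a sequence to which no rule is applicable, or a sequence $v_1$ from $T_1$, …, $v_N$ from $T_N$
If reduction tree has at least one sequence of the form $v_1$ from $T_1$, …, $v_N$ from $T_N$ as a leaf, there is a solution, and attack trace is feasible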
Symbolic Analysis Summary
Specified by the analyst:
  Formal specification of protocol roles (attacker is implicit! variables model attacker’s input)
  Attack (violating execution trace) (may not have a feasible instantiation)
Fully automated:
  Sequence of symbolic constraints (satisfiable if and only if there exists a feasible instantiation of attack trace)
  Decidable constraint solving procedure
Let’s Add Algebraic Properties
Verification of trace-based security properties
  … is decidable for protocols with XOR
    Comon-Lundh and Shmatikov (LICS ’03)
    Chevalier, Küsters, Rusinowitch, Turuani (LICS ’03)
  … reduces to a system of quadratic Diophantine equations for protocols with Abelian groups
    Millen and Shmatikov (CSFW ’03)
  … is decidable for a restricted class of protocols with modular exponentiation
    Chevalier, Küsters, Rusinowitch, Turuani (FST/TCS ’03)
  … is decidable for any well-defined protocol with products and modular exponentiation
    Shmatikov (ESOP ’04)
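Attacker Term Algebra
Dolev-Yao rules (premises ⇒ conclusion):
  $T \vdash u,\ T \vdash v \;\Rightarrow\; T \vdash [u,v]$
  $T \vdash u,\ T \vdash v \;\Rightarrow\; T \vdash \mathrm{crypt}_u[v]$
  $v \in T \;\Rightarrow\; T \vdash u$, if $u = v\sigma$ for some $\sigma$
  $T \vdash [u,v] \;\Rightarrow\; T \vdash u$
  $T \vdash [u,v] \;\Rightarrow\; T \vdash v$
  $T \vdash \mathrm{crypt}_u[v],\ T \vdash u \;\Rightarrow\; T \vdash v$
Rules for products, inverses and exponentiation:
  $T \vdash u,\ T \vdash v \;\Rightarrow\; T \vdash u \cdot v$
  $T \vdash u \;\Rightarrow\; T \vdash u^{-1}$
  $T \vdash u,\ T \vdash v \;\Rightarrow\; T \vdash u^{v}$
Equational theory:
  Associative: $(x \cdot y) \cdot z = x \cdot (y \cdot z)$
  Commutative: $x \cdot y = y \cdot x$
  Normalization rules: $x \cdot x^{-1} \to 1$; $x \cdot 1 \to x$; $(x^{-1})^{-1} \to x$; $(x \cdot y)^{-1} \to y^{-1} \cdot x^{-1}$; $x^{1} \to x$; $(x^{y})^{z} \to x^{y \cdot z}$
Attacker can’t take discrete logs or solve Diffie-Hellman problem, so these are NOT rules:
  $T \vdash u^{v},\ T \vdash u^{w} \;\Rightarrow\; T \vdash u^{v \cdot w}$
  $T \vdash u^{v} \;\Rightarrow\; T \vdash v$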
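Modulo associativity and commutativity, a product of atoms is determined by the integer exponent of each atom, which makes the normalization rules for products and inverses easy to illustrate. The sketch below is an assumption-laden illustration (the dictionary representation and the names `normalize`, `multiply`, `inverse` are not from the slides); the empty dictionary stands for the unit 1, and exponentiation is not modeled.

```python
from collections import Counter

def normalize(factors) -> dict:
    """Normalize a product given as (atom, exponent) pairs into {atom: exponent}."""
    total = Counter()
    for atom, exp in factors:
        total[atom] += exp
    # Dropping zero exponents implements x * x^-1 -> 1 and x * 1 -> x.
    return {a: e for a, e in total.items() if e != 0}

def multiply(u: dict, v: dict) -> dict:
    """Product of two normalized terms."""
    return normalize(list(u.items()) + list(v.items()))

def inverse(u: dict) -> dict:
    """(x * y)^-1 -> y^-1 * x^-1, and (x^-1)^-1 -> x."""
    return {a: -e for a, e in u.items()}

x, y = {"x": 1}, {"y": 1}
```

Two products are equal in the Abelian group exactly when their normalized dictionaries are equal, which is what lets constraints over products turn into equations over integer exponents.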
Key Insights For Decidability
In a well-defined protocol, honest participants don’t need to guess values of attacker inputs
  Leads to a syntactic condition on usage of variables
If attacker can derive $u$ from $T$, then there is a derivation which uses only subterms of $T$ and $u$
If constraints are satisfiable, then there is an attack in which every variable is instantiated by a product of subterms drawn from a finite set
Origination Stability
Variable origination condition
  If $C$ is a constraint sequence generated from an execution trace, then there exists a linear ordering $<$ on $\mathrm{Vars}(C)$ such that if $x$ appears for the first time in ($m_i$ from $T_i$) $\in C$, then $x \in \mathrm{Vars}(m_i)$ and $\forall y \in \mathrm{Vars}(T_i)$: $y < x$
This condition must be satisfied by $C$ after any partial substitution
  Rules out only ill-defined protocols
  Example: A → B: $X \cdot Y$; B → A: $X$ requires B to split a product of two unknown values
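Normal Derivations
Lemma: if $T \vdash u$ is derivable, then there is a normal derivation
[Figure: normal derivation tree with leaves $t_1 \in T \Rightarrow T \vdash t_1$, $t_2 \in T \Rightarrow T \vdash t_2$, …, $t_n \in T \Rightarrow T \vdash t_n$, intermediate nodes $T \vdash v_1$, …, $T \vdash v_k$ and $T \vdash v$, and root $T \vdash u$]
ANALYSIS stage: all intermediate terms are products of subterms of $T$
SYNTHESIS stage: only pairing, encryption, multiplication, inverse & exponentiation used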
Conservative Solutions
Conservative solution only uses subterms from the original, uninstantiated constraint sequence
  For every variable $x$: $\mathrm{Subterms}(x\sigma) \subseteq \mathrm{Subterms}(C)\sigma$ closed under $\cdot$, inverse and exponentiation
  All subterms used in the conservative solution are drawn from a finite set which is known before any variables are instantiated
Lemma: if $C$ has a solution, then $C$ has a conservative solution
This lemma yields a bound on the size of the attack
Symbolic Decision Procedure
Input: symbolic constraints generated from protocol, { $u_1$ from $T_1$, …, $u_n$ from $T_n$ }
  Monotonic: $T_1 \subseteq \ldots \subseteq T_n$
  Satisfy the variable stability condition
1. Guess all equalities between subterms
   Finite number of possible unifiers modulo AG
2. Guess the order in which subterms are derived
3. Replace exponentiation by $\cdot$ and inverse
4. Reduce to a decidable system of quadratic Diophantine equations
   Solvable iff a linear subsystem is solvable
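Back to A-GDH.2
Constraints (-”- stands for the term set of the preceding constraint):
  $X_1$ from $\alpha, k_{cz}$
  $Z_1$ from -”-, $X_1, \alpha^{r_b}, X_1^{r_b}$
  $Z_2$ from -”-
  $Z_3$ from -”-
  $Z_4$ from -”-
  $V_1$ from -”-, $Z_1^{r_z K_{az}}, Z_2^{r_z K_{bz}}, Z_3^{r_z K_{cz}}$
  $W_2^{K_{bz}}$ from -”-, $V_1, \alpha^{q_b}, V_1^{q_b}$
  $W_2^{q_b}$ from -”-
Key insight: under the Diffie-Hellman assumption, attacker can produce $\alpha^{x}$ from $\alpha^{y}$ if and only if he can produce $y^{-1}x$ (since $\alpha^{x} = (\alpha^{y})^{y^{-1}x}$)
Resulting constraints:
  $X_1$ from $k_{cz}$
  $r_b^{-1} Z_1$ from $k_{cz}$
  $r_b^{-1} Z_2$ from $k_{cz}$
  $r_b^{-1} Z_3$ from $k_{cz}$
  $r_b^{-1} Z_4$ from $k_{cz}$
  $Z_3^{-1} r_z^{-1} k_{cz}^{-1} V_1$ from $k_{cz}$
  $Z_2^{-1} r_z^{-1} W_2$ from $k_{cz}$
  $V_1^{-1} W_2$ from $k_{cz}$
Only $\cdot$ and inverse used in derivation. Reduces to system of Diophantine equations.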
Decidable Quadratic Equations
Only $\cdot$ and inverse used in each derivation:
  $u_1 \cdot X_{11} \cdots X_{1k_1}$ from $t_{11}, \ldots, t_{1m_1}$
  $u_2 \cdot X_{21} \cdots X_{2k_2}$ from $t_{21}, \ldots, t_{2m_2}$
  …
  $u_n \cdot X_{n1} \cdots X_{nk_n}$ from $t_{n1}, \ldots, t_{nm_n}$
Convert each constraint into a Diophantine equation
  $u_i \cdot X_{i1} \cdots X_{ik}$ from $t_{i1}, \ldots, t_{im}$ becomes $u_i \cdot X_{i1} \cdots X_{ik} = t_{i1}^{z_1} \cdots t_{im}^{z_m}$ for integer $z_j$
  If some $t_{ij}$ is a variable, equation becomes quadratic, for example $a^{2} \cdot X = (a \cdot b)^{z_1}$, $a^{6} = (a \cdot b)^{z_2} (b \cdot X)^{z_3}$
Equations associated with execution traces have special structure
  If a variable occurs on the right, it must previously occur on the left
  All terms used to construct the variable where it first occurred are available in every subsequent constraint
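Intuition Behind Decidability
  $a^{2} \cdot X = (a \cdot b)^{z_1}$
  $a^{6} = (a \cdot b)^{z_2} (b \cdot X)^{z_3}$
Substitute $X$:
  $a^{6} = (a \cdot b)^{z_2} (b \cdot a^{-2} \cdot (a \cdot b)^{z_1})^{z_3}$
Group $(a \cdot b)$ terms together:
  $a^{6} = (a \cdot b)^{z'} (b \cdot a^{-2})^{z_3}$, where $z' = z_2 + z_1 z_3$
Quadratic part always has a solution because $z_2$ is unconstrained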
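The example can be worked through mechanically. Comparing exponents of $a$ and $b$ in $a^{6} = (a \cdot b)^{z'}(b \cdot a^{-2})^{z_3}$ gives the linear subsystem $6 = z' - 2z_3$ and $0 = z' + z_3$, so $z' = 2$ and $z_3 = -2$; any choice of $z_1$ then fixes $z_2 = z' - z_1 z_3$. The sketch below checks this, representing a monomial $a^{i} b^{j}$ as the pair `(i, j)`; the function names are assumptions for illustration.

```python
def solve_example(z1: int):
    """Solve a^2*X = (ab)^z1 and a^6 = (ab)^z2 * (bX)^z3 for a chosen z1."""
    # Linear subsystem: 6 = z' - 2*z3 (exponent of a), 0 = z' + z3 (exponent of b).
    z_prime, z3 = 2, -2
    # z2 is otherwise unconstrained, so it absorbs the quadratic term z1*z3.
    z2 = z_prime - z1 * z3
    # X = a^-2 * (ab)^z1, stored as (exponent of a, exponent of b).
    x = (z1 - 2, z1)
    return z2, z3, x

def check(z1: int) -> bool:
    """Verify both original equations by comparing exponent vectors."""
    z2, z3, (xa, xb) = solve_example(z1)
    first = (2 + xa, xb) == (z1, z1)                        # a^2 * X == (ab)^z1
    second = (z2 + z3 * xa, z2 + z3 * (1 + xb)) == (6, 0)   # (ab)^z2 (bX)^z3 == a^6
    return first and second
```

For instance, `z1 = 0` gives `X = a^-2`, `z2 = 2`, `z3 = -2`, and indeed $(ab)^{2}(b a^{-2})^{-2} = a^{6}$.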
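Is There A Feasible Attack? Yes!
  B ← $\alpha, X_1$
  B → $X_1, \alpha^{r_b}, X_1^{r_b}$
  Z ← $Z_1, Z_2, Z_3, Z_4$
  Z → $Z_1^{r_z K_{az}}, Z_2^{r_z K_{bz}}, Z_3^{r_z K_{cz}}$
  B ← $\alpha, V_1$
  B → $V_1, \alpha^{q_b}, V_1^{q_b}$
  B ← $W_1, W_2^{K_{bz}}, W_3$
  $W_2^{q_b}$
Attacker can learn the value $W_2^{q_b}$ by clever variable instantiation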
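Attack on A-GDH.2
Suppose two sessions are run concurrently, and malicious C wants to learn the session key of the session from which he is excluded
[Figure: message flows of the 1st session (A, B, C, Z) and the 2nd session (A, B, Z), with the messages replaced by the attacker marked]
1. Replace the value received by B in the 1st session ($X_1$) with 1
2. Replace the message received by Z ($Z_1, Z_2, Z_3, Z_4$) with $\alpha^{r_b}, \alpha^{r_b}, \alpha^{r_b}, \alpha^{r_b}$
3. Replace the value received by B in the 2nd session ($V_1$) with $\alpha^{r_b r_z k_{cz}}$
4. Replace the broadcast value for B in the 2nd session ($W_2^{K_{bz}}$) with $\alpha^{r_b r_z k_{bz}}$
Attack: B will use $\alpha^{r_b r_z q_b}$ as session key, which attacker can compute as $(\alpha^{r_b r_z k_{cz} q_b})^{k_{cz}^{-1}}$
Attacks of this type can be found automatically from protocol specification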
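The attack can be replayed with concrete numbers. This is a sketch under assumptions: the toy group (`p = 23`, `q = 11`, `alpha = 2`), the function name `attack` and the sample exponents are illustrative, and only the messages that matter for the key are simulated (the value fed to B in the 1st session does not affect the outcome).

```python
# Toy group (assumed): alpha = 2 has prime order q = 11 in Z_23^*.
p, q, alpha = 23, 11, 2

def inv(x: int) -> int:
    """Inverse of an exponent modulo the group order q (Python 3.8+)."""
    return pow(x, -1, q)

def attack(rb, rz, qb, kbz, kcz):
    """Return (session key B accepts in the 2nd session, key computed by attacker C)."""
    # 1st session: B emits alpha^rb; the attacker puts it in every slot sent to Z.
    z_in = pow(alpha, rb, p)
    to_b = pow(z_in, rz * kbz, p)  # Z's output alpha^(rb rz Kbz)
    to_c = pow(z_in, rz * kcz, p)  # Z's output alpha^(rb rz Kcz)
    # 2nd session: attacker feeds V1 = alpha^(rb rz Kcz) to B, who replies with V1^qb.
    v1_qb = pow(to_c, qb, p)
    # Attacker replays alpha^(rb rz Kbz) as the broadcast value W2^Kbz.
    key_b = pow(to_b, inv(kbz) * qb, p)     # B's key W2^qb = alpha^(rb rz qb)
    # C knows his own pairwise key Kcz, so he can strip it from V1^qb.
    key_attacker = pow(v1_qb, inv(kcz), p)
    return key_b, key_attacker

key_b, key_attacker = attack(rb=3, rz=5, qb=7, kbz=8, kcz=9)
```

Both returned values equal $\alpha^{r_b r_z q_b}$, so C learns the key of a session from which he was excluded without ever learning $K_{bz}$.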
Decision Procedures
Free (“black-box”) algebra: decidable
  Implemented as an easy-to-use analysis tool
XOR: decidable
  All integer variables are equal to 0 or 1
(Group) Diffie-Hellman: decidable
  System of quadratic Diophantine equations, which is solvable if and only if a linear subsystem is solvable
  Some restrictions (no products in exponentiation base)
Blind signatures, super-exponentiation, ...: current research
  Axiomatic models of various cryptographic primitives