Problem Warping and Computational Dynamics in the Solution of NP-hard Problems
John A Clark
Dept. of Computer Science, University of York
[email protected]
26.07.2001
Overview

- Overview of hill-climbing and simulated annealing
- Breaking the Permuted Perceptron Problem:
  - previous work
  - problem warping
  - timing analysis
  - solution-family based attacks
  - quantum computing
- Speculation
Local Optimisation - Hill Climbing

[Figure: objective f(x) plotted over points x0, x1, x2, x3 and the global optimum x_opt.]

The neighbourhood of a point x might be N(x) = {x+1, x-1}. A hill-climb goes x0, x1, x2, since f(x0) < f(x1) < f(x2) > f(x3), and gets stuck at x2 (a local optimum). Really we want to obtain x_opt.
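The climb-and-stick behaviour above can be sketched in a few lines of Python. The objective z and the start points are hypothetical, chosen so that one start sticks at a local optimum while another reaches the global one:

```python
# Hill-climbing on the integer line with the slide's neighbourhood
# N(x) = {x+1, x-1}. The objective z is made up: it has a local
# optimum at x=2 and the global optimum at x=8.

def z(x):
    return -(x - 2) ** 2 if x < 5 else 5 - (x - 8) ** 2

def hill_climb(z, x):
    """Move to a strictly better neighbour until none exists."""
    while True:
        better = [n for n in (x + 1, x - 1) if z(n) > z(x)]
        if not better:
            return x  # stuck: a local (not necessarily global) optimum
        x = max(better, key=z)
```

Starting at x = 0 the climb sticks at the local optimum x = 2; starting at x = 6 it reaches the global optimum x = 8.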
Simulated Annealing

[Figure: objective f(x) over a trajectory x0, x1, ..., x13.]

Simulated annealing allows non-improving moves, so that it is possible to go down in order to rise again and reach the global optimum. In practice the neighbourhood may be very large, and a trial neighbour is chosen randomly; it is possible to accept a worsening move even when improving ones exist.
Simulated Annealing

- Improving moves are always accepted.
- Non-improving moves may be accepted probabilistically, in a manner depending on the temperature parameter T. Loosely:
  - the worse the move, the less likely it is to be accepted;
  - a worsening move is less likely to be accepted the cooler the temperature.
- The temperature T starts high and is gradually cooled as the search progresses.
- Initially virtually anything is accepted; at the end only improving moves are allowed (and the search effectively reduces to hill-climbing).
Simulated Annealing

Current candidate x; minimisation formulation.

    x = x0
    Temp = T0
    Until frozen do
        Do 400 times
            y = generateNeighbour(x)
            delta = f(y) - f(x)
            if delta < 0 then x = y                              (accept)
            else if U(0,1) < exp(-delta / Temp) then x = y       (accept)
            else reject
        Temp = 0.95 * Temp
    Solution is best x so far

- At each temperature, consider 400 moves.
- Always accept improving moves.
- Accept worsening moves probabilistically: it gets harder to do this the worse the move, and harder as Temp decreases.
Simulated Annealing: Temperature Cycle

    T = 100
    Repeat
        Do 400 trial moves
        T = 0.95 * T
    Until T < 0.00001
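The temperature cycle above translates directly into code. A minimal sketch in Python, assuming the slide's parameters (T0 = 100, 400 trial moves per temperature, cooling factor 0.95, frozen below 0.00001); the toy objective and neighbour function at the end are hypothetical:

```python
import math
import random

def anneal(f, neighbour, x0, t0=100.0, alpha=0.95, moves=400, t_min=0.00001):
    """Simulated annealing, minimisation form: at each temperature do
    400 trial moves, then cool T <- 0.95 * T until frozen."""
    x = best = x0
    t = t0
    while t > t_min:  # until frozen
        for _ in range(moves):
            y = neighbour(x)
            delta = f(y) - f(x)
            # Improving moves are always accepted; worsening moves are
            # accepted with probability exp(-delta/T), rarer as T cools.
            if delta < 0 or random.random() < math.exp(-delta / t):
                x = y
            if f(x) < f(best):
                best = x
        t *= alpha
    return best

# Hypothetical toy problem: minimise (x - 3)^2 over the integers.
random.seed(0)
result = anneal(lambda x: (x - 3) ** 2,
                lambda x: x + random.choice((-1, 1)), 50)
```

At high T the walk wanders freely; as T falls it settles at the minimiser, here x = 3.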
Identification Problems

- The notion of zero-knowledge was introduced by Goldwasser, Micali and Rackoff (1985): indicate that you have a secret without revealing it.
- An early scheme by Shamir.
- Several schemes of late are based on NP-complete problems:
  - Permuted Kernel Problem (Shamir)
  - Syndrome Decoding (Stern)
  - Constrained Linear Equations (Stern)
  - Permuted Perceptron Problem (Pointcheval)
Pointcheval's Perceptron Schemes

Interactive identification protocols based on an NP-complete problem.

Perceptron Problem (PP). Given an m x n matrix A with entries a_ij in {-1,+1}, find a vector S = (s_1, ..., s_n) with each s_j in {-1,+1} such that every component of the image W = AS (an m-vector w_1, ..., w_m) satisfies w_i >= 0.
Pointcheval's Perceptron Schemes

Permuted Perceptron Problem (PPP). Make the problem harder by imposing an extra constraint: given A as before, find S in {-1,+1}^n such that W = AS has all components positive AND W has a particular histogram H of positive values over 1, 3, 5, ...
Example: Pointcheval's Scheme

PP and PPP example: every PPP solution is a PP solution.

For a 4 x 5 matrix A of +-1 entries and a +-1 secret S, the image AS is, say, (5, 1, 1, 3), which has the histogram of positive values H = (h(1), h(3), h(5)) = (2, 1, 1).
Generating Instances

Suggested method of generation:
- Generate a random +-1 matrix A.
- Generate a random +-1 secret S.
- Calculate AS.
- If any (AS)_i < 0 then negate the ith row of A.

There is significant structure in this problem: a high correlation between the majority values of the matrix columns and the corresponding secret bits.
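The generation recipe above is easy to express directly. A sketch in Python; the sizes and the seed are illustrative:

```python
import random

def generate_instance(m, n, seed=1):
    """PP/PPP instance per the slide: random +-1 matrix A, random +-1
    secret S; negate row i of A wherever (AS)_i < 0."""
    rng = random.Random(seed)
    S = [rng.choice((-1, 1)) for _ in range(n)]
    A = [[rng.choice((-1, 1)) for _ in range(n)] for _ in range(m)]
    for i, row in enumerate(A):
        if sum(a * s for a, s in zip(row, S)) < 0:
            A[i] = [-a for a in row]
    return A, S

A, S = generate_instance(101, 117)
image = [sum(a * s for a, s in zip(row, S)) for row in A]
```

Since n = 117 is odd, each dot product of +-1 vectors is odd and never zero, so after negation every image component is a positive odd number — exactly the values the histogram H counts.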
Instance Properties

Each matrix-row/secret dot product is the sum of n Bernoulli (+1/-1) variables. The initial image histogram therefore has a Binomial shape and is symmetric about 0; after row negation it simply folds over to be positive (..., -7, -5, -3, -1, 1, 3, 5, 7, ... becomes 1, 3, 5, 7, ...). Image elements tend to be small.
PP Using Search: Pointcheval

Pointcheval couched the Perceptron Problem as a search problem over candidate vectors Y in {-1,+1}^n. The neighbourhood of the current solution Y is defined by single-bit flips: flipping the kth entry of Y gives the neighbour Y_k.

The cost function punishes any negative image components. For example, an image AY whose negative components are -1 and -3 gives costNeg(Y) = |-1| + |-3| = 4.
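Pointcheval's costNeg is just the total magnitude of the negative image components. A minimal sketch; the tiny matrices in the checks are made up:

```python
def cost_neg(A, y):
    """Sum of |w| over the negative components w of the image Ay."""
    image = (sum(a * b for a, b in zip(row, y)) for row in A)
    return sum(-w for w in image if w < 0)
```

A PP solution is exactly a vector y with cost_neg(A, y) == 0.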
Using Annealing: Pointcheval

- A PPP solution is also a PP solution. Pointcheval based his estimates of the cost of cracking PPP on the ratio of PP solutions to PPP solutions.
- He calculated the matrix sizes for which this should be most difficult, giving rise to (m,n) = (m, m+16), and recommended (m,n) = (101,117), (131,147), (151,167).
- He gave estimates for the number of years needed to solve PPP using annealing as the PP-solution means; PP instances with matrices of size 200 'could usually be solved within a day'.
- But no PPP problem instance greater than 71 was ever solved this way, 'despite months of computation'.
Perceptron Problem (PP)

Knudsen and Meier's approach (loosely):
- Carry out sets of runs.
- Note where the results obtained all agree.
- Fix those elements where there is complete agreement, carry out a new set of runs, and so on.

If repeated runs give the same values for particular bits, the assumption is that those bits are actually set correctly. They used this sort of approach to solve instances of the PP problem up to 180 times faster than Pointcheval for the (151,167) problem, but gave no upper bound on the sizes achievable.
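The fixing step can be sketched as follows; run results are lists of +-1 values, and the example runs in the check are hypothetical:

```python
def unanimous_bits(runs):
    """Return {position: value} for every position where all runs agree;
    these are the bits a Knudsen-Meier style attack would fix before
    launching the next set of runs."""
    n = len(runs[0])
    return {j: runs[0][j] for j in range(n)
            if all(run[j] == runs[0][j] for run in runs)}
```
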
Profiling Annealing

The approach is not without its problems: not all bits that have complete agreement are correct.

[Table: the actual secret against runs 1-6, marking the positions where all runs agree, including one position where all runs agree wrongly (a -1 recovered as 1).]
Knudsen and Meier

Have used this method to attack PPP problem size (101,117). It needs a hefty enumeration stage (to search for the wrong bits), allowed up to 2^64 search complexity. They used a new cost function, with weights w1 = 30 and w2 = 1, adding histogram punishment:

    cost(y) = w1*costNeg(y) + w2*costHist(y)

For example, if hist(Ay) = (3, 0, 0) while hist(As) = (2, 1, 1), then costHist(y) = |3-2| + |0-1| + |0-1| = 3.
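The histogram punishment compares counts of image values against the published histogram H. A sketch, representing histograms as dicts mapping value to count; the check mirrors the hist(Ay) = (3,0,0) versus hist(As) = (2,1,1) example:

```python
from collections import Counter

def cost_hist(image, target):
    """Sum of absolute differences between the image's value counts
    and the target histogram (a dict value -> count)."""
    counts = Counter(image)
    return sum(abs(counts.get(v, 0) - target.get(v, 0))
               for v in set(counts) | set(target))

def cost_km(A, y, target, w1=30, w2=1):
    """Knudsen-Meier style combined cost: w1*costNeg + w2*costHist."""
    image = [sum(a * b for a, b in zip(row, y)) for row in A]
    neg = sum(-w for w in image if w < 0)
    return w1 * neg + w2 * cost_hist(image, target)
```
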
PP Move Effects

A move changes a single element of the current solution. We want current negative image values to go positive, but changing a bit to make negative values go positive will often cause small positive values to go negative: flipping the jth bit changes every image component by +-2, i.e. W'_i = (AY')_i = W_i - 2*a_ij*y_j.
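Because a single-bit flip changes each image component by exactly +-2, the image can be maintained incrementally rather than recomputed from scratch each move. A sketch; the 1 x 2 example in the check is made up:

```python
def flip(A, y, W, j):
    """Flip y[j] in place and update the image W = Ay incrementally:
    each W[i] changes by -2 * A[i][j] * y[j] (with y[j] taken pre-flip)."""
    for i, row in enumerate(A):
        W[i] -= 2 * row[j] * y[j]
    y[j] = -y[j]
```

After any flip, the updated W equals a fresh computation of Ay, but at cost O(m) per move instead of O(mn).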
Problem Warping

Can significantly improve results by punishing at a positive value K: for example, punish any image value less than K = 4 during the search. This drags the elements away from the boundary during the search. Also use the square of differences |W_i - K|^2 rather than the simple deviation: an image component W_i = -1 with K = 4 then costs |4 - (-1)|^2 = 25.
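The warped cost is essentially a one-line change to costNeg: penalise everything below K, with squared deviations. A sketch:

```python
def cost_warped(A, y, K=4):
    """Punish any image component below K (not merely below 0),
    using |w - K|^2 to drag elements away from the boundary."""
    image = (sum(a * b for a, b in zip(row, y)) for row in A)
    return sum((K - w) ** 2 for w in image if w < K)
```

With K = 4 an image component of -1 contributes |4 - (-1)|^2 = 25, matching the example above. Note that a genuine PP solution can still have positive warped cost, so success is still checked against the original costNeg while the search optimises the warped function.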
Problem Warping PP

The tables give the number of successes in 30 runs of annealing followed by a 0-, 1-, 2- or 3-bit hill-climb (columns), for each of 10 problems Pr 0..Pr 9 (rows) at each size. (The (401,417) table has only three columns as extracted.)

(201,217)   0  1  2  3             0  1  2  3
  Pr 0:     0  0  0  3    Pr 5:    0  4  6  5
  Pr 1:     3  6  2 11    Pr 6:    3  6 12  5
  Pr 2:     1 11  6  8    Pr 7:    4  7 14  2
  Pr 3:     8 12  6  3    Pr 8:    3 14  2  9
  Pr 4:     0  4  5  4    Pr 9:    1  1  5  4

(401,417)   0  1  2                0  1  2
  Pr 0:     0  0  1       Pr 5:    0  1  0
  Pr 1:     0  0  2       Pr 6:    1  2  6
  Pr 2:     1 11  1       Pr 7:    0 11  6
  Pr 3:     8 12 14       Pr 8:    0  2  9
  Pr 4:     0  4  6       Pr 9:    3 12 11

(501,517)   0  1  2  3             0  1  2  3
  Pr 0:     0  0  0  0    Pr 5:    0  0  0  2
  Pr 1:     0  0  1  1    Pr 6:    0  0  0  1
  Pr 2:     0  2  2  4    Pr 7:    0  0  0  1
  Pr 3:     0  1  1  3    Pr 8:    0  0  0  0
  Pr 4:     0  0  0  0    Pr 9:    0  1  3  4

(601,617)   0  1  2  3             0  1  2  3
  Pr 0:     0  0  0  1    Pr 5:    0  0  2  2
  Pr 1:     0  0  0  1    Pr 6:    0  2  1  1
  Pr 2:     0  0  0  0    Pr 7:    0  0  0  0
  Pr 3:     0  0  0  2    Pr 8:    0  0  0  0
  Pr 4:     0  0  0  0    Pr 9:    0  0  0  0
Problem Warping

Comparative results:
- Generally allows solution within a few runs of annealing for size (201,217).
- The number of bits correct is generally worst when K = 0.
- The best value for K varies between sizes (but profiling can be used to test what it is).
- It has proved possible to solve for size (601,617) and higher.
- An enormous increase in power for what is essentially a change to one line of the program: using squared deviations rather than just the modulus, plus use of the K factor.

Morals:
- Small changes may make a big difference. The real issue is how the cost function and the search technique interact.
- The cost function need not be the most 'natural' direct expression of the problem to be solved. Cost functions are a means to an end.
- This is a form of fault injection, or problem warping, applied to the problem.
PPP (101,117)

[Chart: for each of 30 problems, the maximum bits correct over all runs (y-axis 0-120), showing 'Bits Correct in Final Solution' and 'Initial N Bits Stuck Correct'.]
PPP (131,147)

[Chart: for each of 30 problems, the maximum bits correct over all runs (y-axis 0-160), showing 'Bits Correct in Final Solution' and 'Initial N Bits Stuck Correct'.]
PPP (151,167)

[Chart: for each of 30 problems, the maximum bits correct over all runs (y-axis 0-200), showing 'Max Bits Correct in Final Solution' and 'Initial N Bits Stuck Correct'.]
Some Tricks

Won't go into detail, but there are some further problem-specific tricks that can be used to reduce the remaining search. For example, you can generally tell easily whether you have an odd or even number of bits wrong:
- Sum the image elements taking values ..., -7, -3, 1, 5, 9, 13, ... (S1).
- Sum the image elements taking values ..., -5, -1, 3, 7, 11, ... (S2).
- Find the corresponding sums T1, T2 in the provided histogram.
- If T1 = S1 and T2 = S2 then there are an even number of bits wrong.
- If T1 = S2 and T2 = S1 then there are an odd number wrong.
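The even/odd test above translates directly into code. A sketch that implements the slide's test literally, assuming the histogram is given as a dict mapping value to count; conveniently, in Python -7 % 4 == 1 and -5 % 4 == 3, so a single modulus classifies both sequences:

```python
def parity_of_wrong_bits(image, hist):
    """The slide's parity test: S1/S2 sum the image elements whose values
    lie in {..., -7, -3, 1, 5, 9, ...} and {..., -5, -1, 3, 7, 11, ...};
    T1/T2 are the corresponding sums taken from the published histogram."""
    s1 = sum(w for w in image if w % 4 == 1)
    s2 = sum(w for w in image if w % 4 == 3)
    t1 = sum(v * c for v, c in hist.items() if v % 4 == 1)
    t2 = sum(v * c for v, c in hist.items() if v % 4 == 3)
    if (s1, s2) == (t1, t2):
        return "even"
    if (s1, s2) == (t2, t1):
        return "odd"
    return None  # neither pattern fires for this candidate
```
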
A Few Tricks More

Look at the image elements w_i produced. If I knew what they should be, I could use linear algebra to solve the system. I do not know whether they are right or not, but often they are, or nearly so.

Suppose w_i = 1 is obtained by some run. It is very likely that the actual value should be 1, 5 or 9 (assuming an even number of bits wrong). Assume it is correct: then changing bits of the current solution to obtain the original solution must not change the value of w_i. This means half the bits x_j I change in the solution x must agree in sign with the corresponding entry a_ij of the ith row (and half must disagree). This reduces the complexity of the remaining search.
Overall

I have missed out the details, but basically this scheme is broken. There is just too much structure... and there is more.
Radical Viewpoint Analysis

Mutate problem P into variants P1, P2, ..., Pn-1, Pn.

Essentially, create mutant problems and attempt to solve them. If the solutions agree on particular elements then they generally do so for a reason, usually because those elements are correct. Think of mutation as an attempt to blow the search away from the actual original solution. Look for agreement between solutions: often nearly half the key can be obtained without any wrong bits.
Radical Viewpoint Analysis

Fix the bits where all three runs agree: go for unanimity. This is a more stressful variation of Knudsen and Meier's idea.
Democratic Viewpoint Analysis

Mutate problem P into variants P1, P2, ..., Pn-1, Pn.

Essentially the same as before, but this time go for substantial rather than unanimous agreement ("It's a 1", "It's a 1", "It's a 1", "No, it's a -1"). By choosing the amount of disagreement tolerated carefully, you can sometimes get over half the key this way; on occasion only 1 bit of the 115 most-agreed bits (out of 167) was incorrect.
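The democratic variant differs from unanimity only in the agreement threshold. A sketch, where min_agree is the tolerance parameter to be tuned by profiling; the example runs in the check are hypothetical:

```python
from collections import Counter

def majority_bits(runs, min_agree):
    """Fix bit j only when at least min_agree of the runs agree on its
    value; with min_agree == len(runs) this reduces to unanimity."""
    fixed = {}
    for j in range(len(runs[0])):
        value, count = Counter(run[j] for run in runs).most_common(1)[0]
        if count >= min_agree:
            fixed[j] = value
    return fixed
```
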
Multiple Clock Watchers Analysis

Mutate problem P into variants P1, P2, ..., Pn-1, Pn.

Essentially the same as for timing analysis, but this time add up the times, over all runs, at which each bit got stuck. As you might expect, bits that often get stuck early (i.e. have low aggregate times to getting stuck) generally do so at their correct values, so take the majority value. This also seems to have significant potential, but needs more work.
Quantum Computation

Everything I have reported so far assumes the classical computational paradigm. But this is the very assumption that gave rise to the biggest shock in cryptography. Let's not fall into the same trap: can heuristic search and quantum computing work together?
Grover's Algorithm

Consider a function f(x), where x ranges over 0..(2^N - 1) and there is a single value v such that some predicate P(v) holds. Grover's algorithm can find v in approximately O(2^(N/2)) steps. Thus a state space of size 2^100 requires O(2^50) steps.

Now return to the (101,117) PPP case. Finding a solution by quantum search over the 117 secret bits would require O(2^59) steps. But if we can obtain a solution with 108 bits correct, we can ask a different question: what are the indices of the 9 wrong bits? Assuming each index can be couched in 7 bits, we have 7 x 9 = 63 bits, which means Grover's algorithm can find the answer in O(2^32) steps.
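The step-count arithmetic above is easy to check; grover_exponent below (a name introduced here for illustration) returns the exponent e in the O(2^e) query count:

```python
import math

def grover_exponent(n_bits, n_solutions=1):
    """Grover needs ~ sqrt(2^N / M) queries for M marked states among 2^N;
    return the (rounded-up) exponent (N - log2(M)) / 2."""
    return math.ceil((n_bits - math.log2(n_solutions)) / 2)

# Direct quantum search over the 117 secret bits of a (101,117) instance:
direct = grover_exponent(117)      # ~O(2^59)
# Searching instead for the indices of the 9 wrong bits, 7 bits per index:
reduced = grover_exponent(7 * 9)   # ~O(2^32)
```
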
More Short Term

Can we view metaheuristic search as a means of problem reduction rather than problem solving? The AI community has developed methods that work very well on very highly constrained problems. I am currently experimenting with profiling, using properties of how near the search gets to the goal to place bounds on the remaining problem, which is then solved using linear programming.
Grover's Algorithm 2

And it's not all one way. If there are more states satisfying a predicate, one might expect the task of finding one of them to be easier than before. Indeed, if there are M states v satisfying the predicate P(v), then the search becomes of order O((2^N / M)^(1/2)).

So: characterise the positions from which heuristic search works effectively, use QC to get into that range, and then hill-climb to reach the optimum.
Speculation and Further Work

- Can we try failing millions of times and then start doing cryptanalysis on the results?
- Will the techniques work more widely? Why can I not break, say, DES or RSA using a technique like this? Is there a theorem to suggest not? No.
- Cryptanalysis of block ciphers largely works by approximations, e.g. functions of the form P[3].xor.P[35].xor.K[1].xor.K[22].xor.C[15].xor.C[52] that are true with some bias (e.g. 50.00001% of the time), where P[j] is bit j of a plaintext block, and C and K are similarly ciphertext and key. Can we derive these from sample data using annealing?
- How can we exploit the notion of a shifting computational paradigm?
- How well can we profile the distribution of results in order to isolate those at the extremes of correctness?
Speculation and Further Work

- There have been very few applications of these techniques to modern-day cryptography and its applications.
- We have successfully created Boolean functions with desirable cryptographic properties.
- We have also evolved protocols in belief logics whose abstract execution is a proof of their own correctness.
- Much more to come.