CS717
Checking Specific Algorithms
Greg Bronevetsky
CS717
Objective
• Primary goal: find checkers that can be applied to any program
• Along the way must look at checkers for specific algorithms
• Each checker exploits special properties of target algorithm
  – General patterns may become apparent
  – Useful for toolkit-style checker generator
• Library of algorithm-specific checkers used where appropriate
CS717
Outline
• Low-hanging fruit
  – Himanshu Gupta. "Result Verification Algorithms for Optimization Problems", 1995
    • Checkers for optimization programs
    • Existence of checkers for certain complexity classes
• Certification Trails
  – Gregory F. Sullivan, Dwight S. Wilson and Gerald M. Masson, “Certification of Computational Results”, 1995.
• Fundamentals of program testing & correcting
  – Manuel Blum, Michael Luby and Ronitt Rubinfeld. "Self-Testing & Correcting with Applications to Numerical Problems"
CS717
Low-Hanging Fruit Outline
• “Greg’s Theorem”
• Himanshu Gupta:
  – Checkers for optimization problems (primal-dual)
    • Max Flow
    • Min-cost Flow
    • Unweighted & Weighted bipartite matching
    • All Shortest Paths
  – Existence proofs for verifiers
    • Verifiers for NP-complete and NP-hard languages
    • Approximate verifiers
  – Verifiers with Certification Trails
CS717
Intro to Complexity
• Problems defined as sets of strings
  – Particular set = “language”
• Example language: set of palindromes
  – 001100, 01011010, etc.
• Membership in given language decidable by computing device of given power
  – ex: Turing machine running in polynomial time
  – Machine A on input x
    • Runs for time poly(|x|)
    • Returns ACCEPT : x ∈ L(A)
              REJECT : x ∉ L(A)
CS717
Intro to Complexity
• Complexity class: set of languages accepted by machines of given power
• Examples:
  – P : Turing machines with running time O(poly(|x|))
  – EXPSPACE : Turing machines that use O(2^poly(|x|)) memory
  – BC : Boolean circuits of bounded depth
• Big deal: can show some problems inherently harder than others
CS717
Basics of Reductions
• Problem A reducible to problem B if A can be solved easily given an algorithm for B
• Reduction:
  – Given x = input to A
  – Efficiently transform x to y = input to B
    • Efficiently usually means poly-time
  – Ask algorithm for B if y valid B-string
    • B is an “oracle”
  – x valid A-string ⇔ y valid B-string
• Thus, B at least as hard as A
CS717
Basics of Reductions
• Can compare difficulties of problems via reductions
• If all problems in class C1 reducible to problems in class C2 then C1 ⊆ C2
• Example:
  – P ⊆ NP
  – Unknown if NP ⊆ P
    • If yes, then P=NP
    • If no, then some problems in NP cannot be solved in polynomial time
CS717
Theorem 1
• Let A be an implementation of an algorithm
  – Runs in time O(p()), p some polynomial
• Does there exist checker C for A that runs in time < p()?
  – C accepts A’s input & output, returns PASS/FAIL
  – C knows A’s source code
  – Runs in time O(q()) < O(p()), asymptotically faster
• A is decision algorithm, so output is 1 bit
Input → A → {0,1} → C → {PASS, FAIL}
CS717
Theorem 1
• A’s input = x: n bits
• A’s output = y: 1 bit
• Suppose C can verify whether x ∈ L(A) using y
  – In O(q(|x|)) time
• Then can decide L(A) in O(q(|x|)) time
  – Given input x
  – Run C on <x,1>
  – If C(x,1) returns PASS, ACCEPT x
  – If C(x,1) returns FAIL, REJECT x
CS717
Theorem 1
• A reduced to C
• C runs in O(q()) time
• Thus, p()-time problems solvable in q()-time
  – Not true in general: by the time hierarchy theorem, some problems need more than q() time
• Therefore, for p()-time decision algorithm A, in general ∄ checker C running in < O(p()) time
CS717
Theorem 1.1
• Let’s make job simpler for checker
• Helper proof
  – Suppose A outputs helper proof
  – C can use helper proof to simplify checking
• Size of helper proof ≤ q(|x|)
  – Since C can only read q(|x|) bits of proof
Input → A → {0,1} + HelperProof → C → {PASS, FAIL}
CS717
Theorem 1.1
• A’s input = x: n bits
• A’s output = y: 1 bit + q(|x|)-bit proof
• Suppose C can verify whether x ∈ L(A) using y
  – In O(q(|x|)) time
• Then y is witness for x ∈ L(A)
CS717
Nondeterministic Time Classes
• Non-deterministic automaton for language L:
  – Given input x
  – Runs for O(f(|x|)) amount of time
  – Can make binary guesses {0,1}
  – If ∃ set of guesses s.t. automaton would accept, then x ∈ L
• At each guess, automaton splits
  – 0 branch, 1 branch
• If accepts along some path then whole automaton accepts
[Figure: binary tree of guesses, each node branching into 0 and 1]
CS717
Nondeterministic Time Classes
• String of guesses leading to ACCEPT called a “witness”
• Proves that x ∈ L
  – Can deterministically verify x ∈ L
  – No need to guess: witness gives correct guesses
• NP – runs for polynomial time ⇒ witnesses poly-length
• Nondet(f()) – runs in time O(f(|x|)) ⇒ witness length O(f(|x|))
CS717
Theorem 1.1
• A’s input = x: n bits
• A’s output = y: 1 bit + q(|x|)-bit proof
• Suppose C can verify whether x ∈ L(A) using y
  – In O(q(|x|)) time
• Then y is witness for x ∈ L(A)
• Thus, x ∈ L(A) decidable by nondet automaton running in O(q(|x|)) time
CS717
Theorem 1.1
• Guess all bits of proof
  – Takes q(|x|) guesses
• For each proof, run C on y=<1, proof>
• Accept if C says PASS on some y=<1, some proof>
[Figure: binary tree guessing the bits of y, with C run at each leaf]
CS717
Theorem 1.1
• x ∈ L(A) decidable by nondet automaton running in O(q(|x|)) time
• Thus, L(A) ∈ Nondet(q())
  – Recall: A runs in time O(p()) > O(q())
• Det(p()) ⊆ Nondet(q()) ⊊ Nondet(p()) ⇒ Det(p()) ≠ Nondet(p())
• Shown: If L(A) ∈ Det(p()) and ∃ proof-checker that runs in time O(q()) where q()<p() then Det(p()) ≠ Nondet(p())
CS717
Theorem 1.1
• Shown: If L(A) ∈ Det(p()) and ∃ checker that runs in time O(q()) where q()<p() then Det(p()) ≠ Nondet(p())
  – Known Det(p()) ≠ Nondet(p()) for few functions
  – In general, as hard as P ≠ NP
• Worst case scenario:
  – If discover efficient checkers, then Det(p()) ≠ Nondet(p())
    • Very difficult
  – If Det(p()) ≠ Nondet(p()) proven, no help to us
CS717
Low-Hanging Fruit Outline
• “Greg’s Theorem”
• Himanshu Gupta:
  – Checkers for optimization problems (primal-dual)
    • Max Flow
    • Min-cost Flow
    • Unweighted & Weighted bipartite matching
    • All Shortest Paths
  – Existence proofs for verifiers
    • Verifiers for NP-complete and NP-hard languages
    • Approximate verifiers
CS717
Definitions
• Problem: Π ⊆ I×S
  – I: set of inputs
  – S: set of solutions
• Algorithm: on input x∈I, returns y∈S s.t. <x,y>∈Π
• Verification problem: V(Π) ⊆ (I×S)×{PASS,FAIL}
  – <(x,y),PASS>∈V(Π) iff <x,y>∈Π
  – <(x,y),FAIL>∈V(Π) iff <x,y>∉Π
CS717
Maximum Flow
• Given:
  – Directed graph G=(V,E)
  – Edges labeled with capacities
    • c(u,v) = capacity of edge
  – Two special nodes: S and T
• Must find “maximum flow” from S to T
[Figure: example flow network from S to T with edge capacities c1 … c9]
CS717
Maximum Flow
• Find edge labeling f
  – f(u,v) = flow on edge u→v ≤ c(u,v)
• Sum of flows coming into node = Sum of flows coming out of node, except at S and T
  – Outflow of node S > inflow
  – Inflow of node T > outflow
• Net outflow of S = Net inflow of T = “Network flow”
• Problem: Find labeling f generating largest network flow
CS717
Residual Capacities
• Residual Capacity for flow f: res(u,v) = c(u,v) – f(u,v)
  (Capacity of edge unused by flow)
• Residual Graph Rf: graph on V,E where
    c_Rf(u,v) = res(u,v)   if res(u,v) > 0
                no edge    if res(u,v) = 0
• Augmenting Path: Path from S to T through Rf
CS717
Augmenting Paths Verifier
• Theorem: Rf contains augmenting path ⇔ f is not maximum flow
• Example: one common max flow algorithm
    Pick a flow
    While can find augmenting paths
        Adjust flow to remove augmenting path
• Thus, to check if given flow is max flow:
  – Ensure given flow valid
  – Construct Rf
  – Ensure that no augmenting path exists
    • Path search much faster than max-flow algorithm
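The three checking steps can be sketched in Python as follows. This is only an illustration: the function name `check_max_flow`, the dict-based edge encoding, and integer node labels are assumptions of the sketch, not part of the slides. The residual graph here also includes reverse edges (flow that can be cancelled), which is the standard construction and slightly more than the simplified definition given on the slide.

```python
from collections import deque

def check_max_flow(n, cap, flow, s, t):
    """Return True iff `flow` is a valid maximum s-t flow for capacities `cap`.

    n: number of nodes, labelled 0..n-1 (assumed encoding)
    cap, flow: dicts mapping a directed edge (u, v) to a number
    """
    # Step 1: validity. Flow only on real edges, within capacity.
    for e, f in flow.items():
        if e not in cap or f < 0 or f > cap[e]:
            return False
    # Conservation: inflow equals outflow everywhere except at s and t.
    net = [0] * n
    for (u, v), f in flow.items():
        net[u] -= f
        net[v] += f
    if any(net[v] != 0 for v in range(n) if v not in (s, t)):
        return False
    # Step 2: build the residual graph Rf (with reverse edges).
    res = {}
    for (u, v), c in cap.items():
        f = flow.get((u, v), 0)
        res[(u, v)] = res.get((u, v), 0) + c - f
        res[(v, u)] = res.get((v, u), 0) + f
    adj = [[] for _ in range(n)]
    for (u, v), r in res.items():
        if r > 0:
            adj[u].append(v)
    # Step 3: one BFS. The flow is maximum iff t is unreachable from s in Rf.
    seen = {s}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return t not in seen
```

The whole check is a constant number of passes over the edges plus one breadth-first search, which is why it is cheaper than recomputing the flow.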
CS717
Min-Cost Flow
• Now edges have cost
• Goal: find flow of given size with minimal cost
  – For any potential flow construct “Residual Graph” Rf
  – Theorem: given flow is min-cost iff Rf has no negative-cost cycles
• Thus, checker
  – Ensures alleged min-cost flow is valid flow
  – Constructs Rf
  – Ensures that no negative-cost cycles exist
    • Cycle search much faster than min-cost flow algorithm
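A minimal sketch of the negative-cycle step, assuming the flow's validity (capacity and conservation) has already been checked. The names `has_negative_cycle` and `is_min_cost` and the dict encoding of edges are illustrative assumptions. Cycle detection uses Bellman-Ford, one common choice; the slides do not say which search the checker uses.

```python
def has_negative_cycle(n, edges):
    """Bellman-Ford from a virtual source; edges are (u, v, cost) triples."""
    dist = [0] * n  # all-zero start acts like a virtual source joined to every node
    for _ in range(n):
        changed = False
        for u, v, c in edges:
            if dist[u] + c < dist[v]:
                dist[v] = dist[u] + c
                changed = True
        if not changed:
            return False  # converged, so no negative cycle
    return True  # still relaxing after n passes: a negative cycle exists

def is_min_cost(n, cap, cost, flow):
    """True iff the residual graph of a (valid) flow has no negative-cost cycle."""
    residual = []
    for (u, v), c in cap.items():
        f = flow.get((u, v), 0)
        if c - f > 0:
            residual.append((u, v, cost[(u, v)]))    # room to push more along u->v
        if f > 0:
            residual.append((v, u, -cost[(u, v)]))   # flow on u->v can be cancelled
    return not has_negative_cycle(n, residual)
```

For example, with two parallel unit-capacity routes of cost 2 and cost 10, a unit flow on the expensive route leaves a residual cycle of cost 2 - 10 < 0, so the checker rejects it.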
CS717
Matching
• Given bipartite weighted/unweighted graph find largest (or max-weight) subset of edges s.t.
  – No left node has >1 right node as neighbor
  – No right node has >1 left node as neighbor
[Figure: bipartite graph on node sets X and Y; maximal matching of size 3]
CS717
Verifier for Matching
• Matching also has augmenting paths
• Can verify matching:
  – Ensure that matching valid
  – Search for augmenting path
• Above problems examples of matroids
  – Intuitively: problems solvable via basic greedy algorithm
• All matroids have this structure
• Thus, can easily check matroid problems
CS717
Other Matroid Problems
• Minimal Spanning Tree
• Minimal Cut in Graph
• Maximal Basis in Linear Space
• …
CS717
Low-Hanging Fruit Outline
• “Greg’s Theorem”
• Himanshu Gupta:
  – Checkers for optimization problems (primal-dual)
    • Max Flow
    • Min-cost Flow
    • Unweighted & Weighted bipartite matching
    • All Shortest Paths
  – Existence proofs for verifiers
    • Verifiers for NP-complete and NP-hard languages
    • Approximate verifiers
CS717
All Shortest Paths
• Given undirected graph G=(V,E)
  – Each edge (i,j) has weight w(i,j)
• Problem: return the shortest path between every pair of nodes
• Assume output format:
  – P : predecessor matrix
    P[i,j] = k s.t. edge (k,j) lies on shortest path i→j
    (i.e. node k precedes node j on path i→j)
  – D : distance matrix
    D[i,j] = shortest distance from i to j
    (D[i,i] = 0)
[Figure: path from i to j, with P[i,j] the last node before j]
CS717
Checker Algorithm
• Step 1: ensure that P[] and D[] self-consistent
    foreach (i,j)
        Ensure that D[i,j] = D[i, P[i,j]] + w(P[i,j], j)
        If error seen, return FAIL
• Step 2: ensure output paths truly shortest
    foreach node i
        foreach edge (u,v)
            if (D[i,u] + w(u,v) < D[i,v]) return FAIL
• Return PASS
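The two steps translate almost line for line into Python. The sketch below assumes nodes are labelled 0..n-1, a connected graph with positive integer weights (so equality tests are exact), matrices stored as nested lists, and `P[i][i]` left unused; the function name and these encodings are assumptions made for illustration.

```python
def check_all_shortest_paths(n, w, D, P):
    """w: dict {(u, v): weight}, one entry per undirected edge.
    D[i][j]: claimed distance; P[i][j]: claimed predecessor of j on the i->j path.
    """
    # Make the weight lookup symmetric, since the graph is undirected.
    weight = {}
    for (u, v), c in w.items():
        weight[(u, v)] = c
        weight[(v, u)] = c
    # Step 1: P and D must be self-consistent.
    for i in range(n):
        if D[i][i] != 0:
            return "FAIL"
        for j in range(n):
            if j == i:
                continue
            k = P[i][j]
            if (k, j) not in weight:      # predecessor must be a real neighbour of j
                return "FAIL"
            if D[i][j] != D[i][k] + weight[(k, j)]:
                return "FAIL"
    # Step 2: no edge may offer a shorter route than the claimed distance.
    for i in range(n):
        for (u, v), c in weight.items():
            if D[i][u] + c < D[i][v]:
                return "FAIL"
    return "PASS"
```

Step 1 costs O(n²) and Step 2 costs O(n·|E|), with no shortest-path search performed by the checker itself.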
CS717
Checking Self-Consistency
• foreach (i,j)
    Ensure that D[i,j] = D[i, P[i,j]] + w(P[i,j], j)
• Clearly, if output correct, test returns PASS
[Figure: path i→j of length D[i,j], split into i→P[i,j] of length D[i, P[i,j]] and edge (P[i,j], j) of weight w(P[i,j], j)]
CS717
Checking Self-Consistency
• foreach (i,j)
    Ensure that D[i,j] = D[i, P[i,j]] + w(P[i,j], j)
• If output incorrect, two possible problems
  – P[ ] may not represent valid paths
  – D[ ] may not be true lengths of P[ ]’s paths
[Figure: path i→j of length D[i,j], split into i→P[i,j] of length D[i, P[i,j]] and edge (P[i,j], j) of weight w(P[i,j], j)]
CS717
Checking D[ ]
• Suppose D[i,j] ≠ i→j distance according to P[ ]
• Pick erroneous pair i,j with shortest true distance
  – Let k = P[i,j]
  – Check: D[i,j] ?= D[i, k] + w(k, j)
  – No errors in D[ ] for pairs closer than i and j
• Thus, D[i, k] correct distance from i to j’s predecessor
• So, D[i, k] + w(k, j) is correct i→j distance
  – Thus, will detect error
[Figure: path from i to j through k = P[i,j]]
CS717
Checking P[ ]
• P[i,*] must induce spanning tree on nodes
• Thm: Graph on n nodes is spanning tree ⇔ n-1 edges and acyclic
  – Clearly, n-1 edges since each node j≠i has P[i,j]
• Above check will fail if cycle exists
[Figure: tree of predecessor edges rooted at i]
CS717
Checking P[ ] : Acyclicity
• Suppose cycle exists
• Suppose all edges of u-v cycle have been checked (and passed), u→v is last
• Claim: if cycle exists then D[ ] has error
[Figure: node i and a cycle made of a v→u path plus edge u→v, with distances D[i,u] and D[i,v]]
CS717
Checking P[ ] : Acyclicity
• v→u path passed, so: D[i,u] = D[i,v] + w(path)
• In order for u→v to pass: D[i,v] = D[i,u] + w(u,v)
• But then, D[i,v] = w(u,v) + (D[i,v] + w(path))
  – Contradiction (for positive edge weights)!
• Thus, cycle exists ⇒ error in D[ ] detected
[Figure: node i and a cycle made of a v→u path plus edge u→v, with distances D[i,u] and D[i,v]; check applied: foreach (i,j) D[i,j] = D[i, P[i,j]] + w(P[i,j], j)]
CS717
Checker Algorithm
• Step 1: ensure that P[] and D[] self-consistent
    foreach (i,j)
        Ensure that D[i,j] = D[i, P[i,j]] + w(P[i,j], j)
        If error seen, return FAIL
• Step 2: ensure output paths truly shortest
    foreach node i
        foreach edge (u,v)
            if (D[i,u] + w(u,v) < D[i,v]) return FAIL
• Return PASS
CS717
Checking Shortest Paths
• Test ∀ nodes i, edges (u,v):
    D[i,u] + w(u,v) ≥? D[i,v] (FAIL if <)
• Test checks if path through u shorter than “official” path
• If output correct:
  – If u on shortest path, D[i,u] + w(u,v) = D[i,v]
  – If u not on shortest path, D[i,u] + w(u,v) ≥ D[i,v]
• Thus, on correct output PASS returned
[Figure: path from i to v with u as the node before v]
CS717
Checking Shortest Paths
• Suppose output incorrect: D[i,j] not length of shortest i→j path
  – Step 1 ensures D[i,j] true distance of path i→j according to P[ ]
• Let
  – D*[ ] = distance function for truly shortest solution
  – P*[ ] = predecessor function for shortest solution
CS717
Checking Shortest Paths
• Pick i,j s.t.
  – D[i,j] ≠ D*[i,j]
  – No other erroneous pair closer to each other (by P*[ ])
• Thus, ∃u≠P[i,j] s.t. D*[i,u] + w(u,j) < D[i,j]
  – Path thru u shorter than path thru P[i,j]
  – If shortest path is thru P[i,j], then D[i, P[i,j]] is wrong
• Big Question: Will test detect this?
[Figure: wrong path i→P[i,j]→j and shorter path i→u→j]
CS717
Checking Shortest Paths
• Thus, ∃u≠P[i,j] s.t. D*[i,u] + w(u,j) < D[i,j]
• For each node i, test looks at all edges
• When looks at u→j:
  – Evaluates: D[i,u] + w(u,j) ≥? D[i,j]
  – D[i,u] = D*[i,u]
    • Since i,j closest erroneous pair
  – Thus, D[i,u] + w(u,j) = D*[i,u] + w(u,j) < D[i,j]
[Figure: wrong path i→P[i,j]→j and shorter path i→u→j]
CS717
Checking Shortest Paths
• For each node i, test looks at all edges
• When looks at u→v:
  – Evaluates: D[i,u] + w(u,v) ≥? D[i,v]
  – D[i,u] = D*[i,u]
    • Since i,v closest erroneous pair
  – Thus, D[i,u] + w(u,v) = D*[i,u] + w(u,v) < D[i,v]
• Test returns FAIL
• Shown: if output not shortest paths ⇒ test returns FAIL
CS717
Summary of Checkers
• Shown checkers for
  – Matroids
  – All-shortest paths
• Checkers very specific to target algorithm
• In general:
  – Find invariant of output
  – Ensure it is true
CS717
Low-Hanging Fruit Outline
• “Greg’s Theorem”
• Himanshu Gupta:
  – Checkers for optimization problems (primal-dual)
    • Max Flow
    • Min-cost Flow
    • Unweighted & Weighted bipartite matching
    • All Shortest Paths
  – Existence proofs for verifiers
    • Verifiers for NP-complete and NP-hard languages
    • Approximate verifiers
CS717
Verifiers for NP-Complete Problems
• Given NP-complete decision problem Π ⊆ I×{0,1}
• Π’s verification problem: V(Π) ⊆ {I×{0,1}} × {P,F}
• Π reducible to V(Π)…
CS717
Verifiers for NP-Complete Problems
• Given NP-complete decision problem Π ⊆ I×{0,1}
• Π’s verification problem: V(Π) ⊆ {I×{0,1}} × {P,F}
• Π reducible to V(Π)
  – On input x, check if <<x,1>,P>∈V(Π)
  – Mapping computable in poly-time
  – Thus, V(Π) NP-complete
• Proven: if Π NP-complete then so is V(Π)
CS717
Verifiers for Some NP-Hard Problems
• Given NP-Hard problem H ⊆ I×ℕ
• On input x, returns y s.t.
  – y < O(f(|x|))
  – f(|x|) ≤ O(poly(|x|))
  – f() poly-time computable
• Verification problem V(H) ⊆ {I×ℕ}×{P,F}
CS717
Verifiers for Some NP-Hard Problems
• Reduction from H to V(H):
  – Given x, compute f(|x|)
  – Try all possible values for y∈{0 … f(|x|)}
    • If <<x,y>, P>∈V(H) then ACCEPT
  – If none accepts, REJECT
• Reduction is poly-time
• Thus, V(H) is NP-Hard (unless P=NP)
• Examples: min vertex cover, max clique, chromatic number, max cycle length, etc.
CS717
Low-Hanging Fruit Outline
• “Greg’s Theorem”
• Himanshu Gupta:
  – Checkers for optimization problems (primal-dual)
    • Max Flow
    • Min-cost Flow
    • Unweighted & Weighted bipartite matching
    • All Shortest Paths
  – Existence proofs for verifiers
    • Verifiers for NP-complete and NP-hard languages
    • Approximate verifiers
CS717
Maximization Problems
• Given NP-Hard Maximization problem H ⊆ I×ℕ
• Output: size of maximum solution to problem
• For each H ∃ NP-Complete Decision problem HD
  – Input <x,y>
  – ACCEPT iff maximal solution to problem is ≥ y
• Can use binary search to solve H using O(log |x|) calls to HD
CS717
Approximate Algorithms
• Given NP-Hard Maximization problem H ⊆ I×ℕ
• AH is (a,c) approximation algorithm for H if
  – On input x, outputs y
  – y ≤ max ≤ ay + c
  – max is maximal solution of H
  – Runs in poly time
• (a,c) approximation verifier: AVH
  – Input <x,y>
  – Returns PASS iff y ≤ max ≤ ay + c
CS717
Approximation Verifiers NP
• Thm: If poly-time (a,c) approximation algorithm AH exists for NP-Hard maximization problem H, then approximation verifier AVH is NP-Complete
• Proof: Reduce sibling decision problem HD to AVH
  – Calls AH
  – Uses AVH as oracle
CS717
Reducing HD
• Input <x,y>
• ACCEPT iff y ≤ max
  – max = size of maximal solution to input x
• Step 1: call AH
  – Get m = AH’s approximation of max
• Step 2: work
  – If y ≤ m, ACCEPT
  – If y > m
    • If y > am + c, REJECT
    • Else, use AVH Oracle to see if <x,y>∈L(AVH)
      – If <x,y>∈L(AVH), ACCEPT
      – Else, REJECT
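The case analysis can be written as a short function. In this sketch `decide_hd` takes the approximation algorithm AH and the verifier AVH as callables; `toy_ah` and `toy_avh` are hypothetical stand-ins on a trivially easy problem (the "maximum" is just the largest list element), included only so the control flow can be run. It assumes a ≥ 1, so that m < y implies am + c ≤ ay + c.

```python
def decide_hd(x, y, approx_alg, approx_verifier, a, c):
    """Decide whether y <= max(x), calling AH once and AVH at most once."""
    m = approx_alg(x)                 # Step 1: m satisfies m <= max <= a*m + c
    if y <= m:
        return True                   # y <= m <= max
    if y > a * m + c:
        return False                  # y > a*m + c >= max
    # Here m < y <= a*m + c, so max <= a*y + c holds automatically and the
    # oracle's answer depends only on whether y <= max.
    return approx_verifier(x, y)

# Toy stand-ins (not NP-hard): "max" is simply the largest element of x.
def toy_ah(x):
    return max(x) // 2                # satisfies m <= max <= 2*m + 1

def toy_avh(x, y, a=2, c=1):
    return y <= max(x) <= a * y + c
```

With `x = [3, 10, 7]` the approximation is m = 5, so queries with y ≤ 5 are accepted directly, queries with y > 11 are rejected directly, and only the range 6..11 reaches the oracle.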
CS717
Reducing HD
• Given input <x,y> to HD
• Decided whether y ≤ max by calling
  – AH – approximation algorithm
  – AVH – approximation verifier
• AH runs in poly time
  – Reduction is poly-time, so can run AH as part of reduction
• <x,y> transformed to input to AVH
  – Thus, AVH at least as hard as HD
CS717
Proven
• NP-Hard Maximization problem AND
• Poly-time approximation algorithm for it
• Approximation verifier for approximation algorithm is NP-Complete
CS717
Algorithm-specific Verifiers
• If poly-time approximation algorithm exists, trivial approximation verifier:
  – On input: <x,y>
  – Run fault-free version of approximation algorithm on x
  – See if returns y
• Verifier still poly-time
• Only works for specific approximation algorithm
  – Won’t work for different approximators of same problem
CS717
Low-Hanging Fruit Summary
• Covered several basic theorems
• Give idea of what is possible/impossible
• Fairly basic results– Equivalent to homework in graduate theory class
CS717
Outline
• Low-hanging fruit
  – Himanshu Gupta. "Result Verification Algorithms for Optimization Problems", 1995
    • Checkers for optimization programs
    • Existence of checkers for certain complexity classes
• Certification Trails
  – Gregory F. Sullivan, Dwight S. Wilson and Gerald M. Masson, “Certification of Computational Results”, 1995.
• Fundamentals of program testing & correcting
  – Manuel Blum, Michael Luby and Ronitt Rubinfeld. "Self-Testing & Correcting with Applications to Numerical Problems"
CS717
Certification Trails
• Algorithm runs on input
• Produces:
  – Regular output
  – Certification Trail: short proof that output matches input
• Certifier runs on <input, trail>
  – Produces same output or returns FAIL
  – Additional info from trail speeds recomputation
  – If trail wrong, may still output correctly
    • Certifiers presented here will just FAIL
CS717
Basic Definitions
• D = set of inputs
• S = set of valid outputs
• T = set of certification trails
• Original program
  – P: D → S×T
  – Accepts input, returns <output, trail>
• Certifier
  – C: D×T → S∪{FAIL}
  – Accepts <input, trail>, returns output or FAIL
CS717
Focus of Certification Trails
• Paper comes from Software Engineering background
• Thus, focus on detecting programmer errors– Argue that P and C different algorithms– Thus, implementations different– Low probability of same errors in P and C
• Hardware faults also mentioned
CS717
Certification Trails Outline
• Paper presents checkers for several common algorithms
  – Sorting
  – Convex Hull
• Neat approach
• Problem-specific and very manual
• Little insight into general procedure
  – Though, can create fault tolerant libraries
CS717
Sorting
• Given list of numbers, return the numbers in sorted order
• Trivial check:
  – Given allegedly sorted output
  – Check that order non-decreasing
• This doesn’t work:
  – Output must be permutation of input
  – Above checker would
    • On input [2, 4, 6, 8]
    • Accept [0, 0, 0, 0, 0, 0]
CS717
Correct Certifier
• Certification Trail: list of indexes
  – All input elements get ID
    • ith element gets ID i
  – At spot j of trail: ID of element in sorted position j

IDs:           0   1   2   3   4   5   6   7
Data:         12  45   9  26  33  17  11  92

IDs:           2   6   0   5   3   4   1   7
Sorted Data:   9  11  12  17  26  33  45  92

IDs in Trail:  2   6   0   5   3   4   1   7
CS717
Checking Permutation
• Certifier gets
  – Input numbers
  – Their IDs in sorted order
• Uses ID list to reorder input numbers
  – ID list serves as cheat sheet
  – Shows correct sort decisions

Data:            12  45   9  26  33  17  11  92
IDs in Trail:     2   6   0   5   3   4   1   7

Sorted(?) IDs:    2   6   0   5   3   4   1   7
Sorted(?) Data:   9  11  12  17  26  33  45  92
CS717
Checking Permutation
• Two things to check
  – Onto: For each sorted element, ∃ input element
  – 1-1 : All sorted elements refer to different input elements

Data:            12  45   9  26  33  17  11  92
IDs in Trail:     2   6   0   5   3   4   1   7

Sorted(?) IDs:    2   6   0   5   3   4   1   7
Sorted(?) Data:   9  11  12  17  26  33  45  92
CS717
Checking Permutation
• Onto: For each sorted element, ∃ input element
  – Check that all IDs in trail are valid
• 1-1 : All sorted elements refer to different input elements
  – If two IDs in trail equal then some input element not touched
    • Not copied to sorted list
    • Use touch counters

Input:   0 1 2 3 4 5 6 7
Sorted:  0 1 3 3 4 5 6 7   (ID 3 touched twice, ID 2 never touched)
CS717
Checking Sort
• To check sort
  – Traverse reordered list
  – Ensure non-decreasing
• Sorter time: O(n log n)
• Certifier time: O(n)
  – Asymptotically faster
• Big trick:
  – Trail summarizes decisions made by sorter
  – Enough info to quickly recompute
  – Can any problem be cast into set of big decisions?
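A sketch of the sorter/certifier pair in Python. `sort_with_trail` plays the role of the original program P and `certify_sort` the certifier C; both names are assumptions of this sketch, and Python's built-in `sorted` stands in for whatever sorting routine is being certified.

```python
def sort_with_trail(data):
    """Sorter P: returns (sorted output, trail of input IDs in sorted order)."""
    trail = sorted(range(len(data)), key=lambda i: data[i])
    return [data[i] for i in trail], trail

def certify_sort(data, trail):
    """Certifier C: single O(n) pass; returns the sorted list or "FAIL"."""
    n = len(data)
    if len(trail) != n:
        return "FAIL"
    touched = [0] * n                      # touch counters, one per input ID
    for i in trail:
        if not isinstance(i, int) or not 0 <= i < n:
            return "FAIL"                  # invalid ID: "onto" check fails
        touched[i] += 1
        if touched[i] > 1:
            return "FAIL"                  # duplicate ID: "1-1" check fails
    out = [data[i] for i in trail]         # reorder input using the trail
    for a, b in zip(out, out[1:]):
        if a > b:
            return "FAIL"                  # order is not non-decreasing
    return out
```

On the slide's data `[12, 45, 9, 26, 33, 17, 11, 92]` the trail is `[2, 6, 0, 5, 3, 4, 1, 7]`; a trail with a repeated ID, or the identity trail, is rejected.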
CS717
Certification Trails Outline
• Paper presents checkers for several common algorithms
  – Sorting
  – Convex Hull
• Neat approach
• Problem-specific and very manual
• Little insight into general procedure
  – Though, can create fault tolerant libraries
CS717
Convex Hull Problem
Given set of points on 2D plane, find subset that forms convex hull around all points.
CS717
Convex Hull: Step 1
P1 is the point with the least x-coordinate.
Points sorted in order of increasing slope relative to P1.
[Figure: points P1 … P8 in the plane]
CS717
Convex Hull: Invariant
All the points not on Hull are inside triangle formed by P1 and two successive points on Hull.
[Figure: points P1 … P8 with the hull and triangles fanning out from P1]
CS717
Convex Hull: Invariant
P3 not Hull point ⇒ clockwise angle between lines P2P3 and P3P4 ≥ 180º
[Figure: points P1 … P8, angle at P3 marked ≥ 180º]
CS717
Convex Hull: Invariant
If clockwise angle between lines P2P3 and P3P4 < 180º, then P3 is Hull point.
[Figure: points P1 … P8, angle at P3 marked < 180º]
CS717
Convex Hull Algorithm Outline
• Walk through P2 to Pn in slope order
• Keep adding points to hull
• If find point generating angle ≥ 180º
  – Back up, remove all such points
• Until added Pn
  – P1, P2 and Pn must be on convex hull
CS717
Convex Hull Algorithm
• Add P1, P2 and P3 to the Hull
• For Pk = P4 to Pn
    (... trying to add Pk to the Hull …)
  – Let QA and QB be the two points most recently added to the Hull
  – While the angle formed by QAQB and QBPk ≥ 180º
    • Remove QB from the Hull since it is inside the triangle P1, QA, Pk
    • Update QA and QB to the two points now most recently on the Hull
  – Add Pk to the Hull
CS717
Trail for Convex Hull
• Augment Program to output
  – {h1, h2, ..., hm} = indexes of points on hull
  – For each point Pi not on hull, proof of why not
    • Tuple (xi, hj, hk, hl) s.t. xi in triangle hjhkhl
      – xi internal point
      – hj, hk, hl hull points
[Figure: points P1 … P5 illustrating a containment triangle]
CS717
Convex Hull Checker
• Checker checks that:
  – There is 1-1 correspondence between input points and {x1, x2, ..., xm}∪{h1, h2, ..., hr}
    • xi internal points
    • hi hull points
  – Each point in triangle proofs lies in given triangle
• Basic error checking
  – Assures that all points accounted for
CS717
Convex Hull Checker
• Checker checks that:
  – For each triple of consecutive hull points hi hi+1 hi+2, lines hihi+1 and hi+1hi+2 form counter-clockwise angle < 180º
    • i.e. form convex corners of hull
  – ∃ unique locally maximal point on hull
• There exists theorem saying: shape with above properties is convex hull
[Figure: hull through P1, P6, P8 with each corner angle compared against 180º]
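The checker's four tests can be sketched as below. Several details are assumptions of this sketch rather than statements from the slides or the paper: points are distinct integer pairs, the hull is listed in counter-clockwise order, triangle proofs are a dict from interior index to three hull indices, and "locally maximal" is interpreted as comparing y and then x against both hull neighbours.

```python
def cross(o, a, b):
    """Twice the signed area of triangle o, a, b (> 0 for a left turn)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def in_triangle(p, a, b, c):
    """True if p lies inside or on the boundary of triangle a, b, c."""
    d = (cross(a, b, p), cross(b, c, p), cross(c, a, p))
    return not (min(d) < 0 and max(d) > 0)

def check_hull(points, hull, proofs):
    """points: list of distinct (x, y); hull: indices in counter-clockwise order;
    proofs: {interior index: (hj, hk, hl)} naming a containing triangle."""
    n, m = len(points), len(hull)
    hull_set = set(hull)
    # Check 1: hull and interior points together are exactly the input, no repeats.
    if m < 3 or len(hull_set) != m or hull_set & set(proofs):
        return "FAIL"
    if hull_set | set(proofs) != set(range(n)):
        return "FAIL"
    # Check 2: every interior point lies in its claimed triangle of hull points.
    for i, tri in proofs.items():
        if len(set(tri)) != 3 or not set(tri) <= hull_set:
            return "FAIL"
        if not in_triangle(points[i], *(points[t] for t in tri)):
            return "FAIL"
    # Check 3: every consecutive triple of hull points turns strictly left.
    for k in range(m):
        a, b, c = (points[hull[(k + d) % m]] for d in range(3))
        if cross(a, b, c) <= 0:
            return "FAIL"
    # Check 4: exactly one locally maximal hull point (compare y, then x).
    key = [(points[h][1], points[h][0]) for h in hull]
    peaks = sum(key[k] > key[k - 1] and key[k] > key[(k + 1) % m] for k in range(m))
    return "PASS" if peaks == 1 else "FAIL"
```

Each test is a single pass over the hull or the proofs, so no sorting is needed in the checker.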
CS717
Proving Correctness
• If hull correct, will return PASS
  – Since we check properties true of convex hulls
• If hull not correct
  – If bad encoding of proof
    • (i.e. points don’t refer to input)
    • First two checks will detect
  – If internal points should be on hull
    • Will not be in any triangle
  – If hull points not convex
    • Convexity assured by theorem, given points were valid (ensured by first check)
CS717
Checking Convex Hull
• Original Algorithm: O(n log n) time
  – Dominated by initial sort of points
• Checker: O(n) time
  – Asymptotically faster
• Very different algorithms
  – Little chance of similar errors in both
  – Hardware errors will affect both differently
CS717
Checking Convex Hull: Big Trick
• Trail contains key facts discovered by algorithm
  – Containment triangles
• Certifier checks main invariant
  – Convexity of hull
CS717
Outline
• Low-hanging fruit
  – Himanshu Gupta. "Result Verification Algorithms for Optimization Problems", 1995
    • Checkers for optimization programs
    • Existence of checkers for certain complexity classes
• Certification Trails
  – Gregory F. Sullivan, Dwight S. Wilson and Gerald M. Masson, “Certification of Computational Results”, 1995.
• Fundamentals of program testing & correcting
  – Manuel Blum, Michael Luby and Ronitt Rubinfeld. "Self-Testing & Correcting with Applications to Numerical Problems"
CS717
Goals
• Aimed at implementation errors
• Given implementation P of function f
  – Testing: must see if P(x)=f(x) for most inputs
    • If correct for all inputs, output PASS
    • If incorrect on too many inputs x, output FAIL
  – Correcting: if ∃ inputs x for which P(x)≠f(x) then
    • Figure out correct value f(x)
    • Assuming P correct on most inputs
• Tester and corrector may call P
CS717
Added Reliability
• Tester and corrector usually simpler than P
  – Use simple additions, etc.
  – Smaller probability of bugs
  – Smaller running time ⇒ smaller prob of random fault
• Tester and corrector run faster than P
  (Counting calls to P as constant time)
• Thus, very likely different implementations
  – Small probability of same bug in tester/corrector and P
CS717
Added Reliability
• One tester/corrector pair per function
  – Reusable across implementations
• Can design testers/correctors for specific libraries
  – Applications using lots of library code (ex: Java, C#) mostly covered
• Thus, can spend more time debugging
• Bottom Line:
  – P may be faulty
  – Tester/Corrector assumed reliable
CS717
General Technique: Testing
• Will test linearity property
  – Shared by many functions
  – f(x+y) = f(x) + f(y)
• Will show:
  – If linearity holds for implementation P on most inputs
  – Then ∃ linear function g s.t. P(x)=g(x) for most inputs x
    (i.e. if P mostly linear then close to actually linear function)
  – Since P(x)=f(x) on many inputs, g = f
CS717
General Technique: Correcting
• Focus on “random self-reducible” functions
• Can express f(x) in terms of f(y1), f(y2), … f(yk) where y1, y2, … yk random
  – Assumes simple reduction function R()
• To get P(x), run several times:
  – Pick random y1, y2, … yk
  – P(x) = R(P(y1), P(y2), … P(yk))
• If on input x P(x) wrong, now correct with high probability
CS717
True Definitions
• error(f, P, D) : probability that P(x)≠f(x) with x randomly chosen via distribution D
• Probabilistic oracle program
  – Probabilistic program M
  – Makes calls to external oracle program A
    • Calls counted as constant time
  – Syntax: M^A
CS717
True Definition: Self-Tester
• Let 0 ≤ ε1 < ε2 ≤ 1
• Confidence parameter β > 0
• (ε1, ε2)-self-testing program for f
  – Probabilistic oracle program Tf
  – If error(f, P, D) ≤ ε1 then Tf^P returns PASS with Prob ≥ 1-β
  – If error(f, P, D) ≥ ε2 then Tf^P returns FAIL with Prob ≥ 1-β
[Figure: error(f, P, D) axis from 0 to 1; PASS region up to ε1, FAIL region from ε2]
CS717
True Definition: Self-Corrector
• Let 0 ≤ ε < 1
• Confidence parameter β > 0
• ε-self-correcting program for f
  – Probabilistic oracle program Cf
  – If error(f, P, D) ≤ ε then Cf^P(x) = f(x) with Prob ≥ 1-β
    • x randomly selected from distribution D
CS717
Self-Tester/Corrector Pair
• Let 0 ≤ ε1 < ε2 ≤ 1
• Confidence parameter β > 0
• Tf (ε1, ε2)-self-tester for f
• Cf ε-self-corrector for f
• Cf applied when Tf doesn’t FAIL
[Figure: error(f, P, D) axis from 0 to 1; NOT FAIL region up to ε2, FAIL region from ε2]
CS717
Outline
• Self-correctors
  – Assume existence of self-testers
  – Simpler to prove correctness
• Self-Testers
  – Correctness harder to prove
  – Requires a bit of group theory
CS717
Self-Correctors
• Assume that for implementation P, error(f, P, D) ≤ ε known
  – Determined by self-tester
  – error(f, P, D) ≤ constant
• Assume function f self-reducible
  – f(x) = R(f(y1), f(y2), … f(yk))
  – y1, y2, … yk random
    • Not independent of each other
CS717
General Self-Corrector
• For i = 1 to q
  – Pick random y1, y2, … yk
  – answer_i = R(P(y1), P(y2), … P(yk))
• Output majority answer out of answer_1 … answer_q
• Note: For any call to P we ensure output in correct range
CS717
Improving Correctness
• Before: P(x)≠f(x) for some x’s
  – P(x)≠f(x) on ≤ constant fraction ε of inputs x
• Now:
  – Each iteration has prob of failure ≤ kε
    • Prob[R(P(x1), P(x2), …, P(xk)) ≠ R(f(x1), f(x2), …, f(xk)) = f(x)] ≤ kε
    • Since k calls to P
  – Goal: compute correct f(x) with prob ≥ 1-β
    • Each self-reduction is independent
    • Thus, error probability goes down exponentially
    • Can get ≥ 1-β success prob after “several” repeats
    • “Several” ≈ ln(1/β)
CS717
Self-Reducibility of Mod
• Want to compute f(x) = (x mod R)
  – R fixed
  – x ∈ [0…R·2^n − 1]
• Self-reducible:
  – (x mod R) = ((x1 mod R) + (x2 mod R)) mod R
  – x1 picked uniformly at random
  – x2 picked s.t. x = x1 + x2 (mod R·2^n)
• Since ([x1+x2] mod R) = ((x1 mod R) + (x2 mod R)) mod R
CS717
Self-Corrector for Mod
• For i = 1 to q
  – Pick random x1
  – Pick x2 s.t. x=x1+x2
  – answer_i = (P(x1) + P(x2)) mod R
• Return majority among answer_1 … answer_q
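A runnable sketch of this corrector. The function name, the default q = 25, the `rng` parameter, and the demonstration program `faulty_mod` are assumptions made for illustration. The split x = x1 + x2 is taken modulo the domain size R·2^n so that x2 stays in range, and the two answers are added mod R.

```python
import random
from collections import Counter

def self_correct_mod(P, x, R, n, q=25, rng=random):
    """Majority vote over q random self-reductions of x mod R.

    P: possibly faulty program for x mod R on inputs in [0, R*2**n - 1].
    """
    M = R * 2**n                       # size of the input domain
    answers = []
    for _ in range(q):
        x1 = rng.randrange(M)          # uniform over the domain
        x2 = (x - x1) % M              # so that x = x1 + x2 (mod M)
        answers.append((P(x1) + P(x2)) % R)
    return Counter(answers).most_common(1)[0][0]

# Hypothetical faulty implementation: wrong on every multiple of 50.
def faulty_mod(x, R=7):
    return (x % R + 1) % R if x % 50 == 0 else x % R
```

Here `faulty_mod(100)` is wrong, yet the corrector recovers 100 mod 7 = 2 with overwhelming probability because each random split hits a faulty input only about 4% of the time.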
CS717
Self-Reducibility of Modular Mult
• Want to compute f(x,y) = xy mod R
  – R fixed
  – x, y ∈ [0…2^n − 1]
• Self-Reducible
  – xy mod R = ((x1y1 mod R) + (x1y2 mod R) + (x2y1 mod R) + (x2y2 mod R)) mod R
  – x1, y1 picked uniformly at random
  – x2, y2 picked s.t. x = x1 + x2, y = y1 + y2
• Since x1y1 + x1y2 + x2y1 + x2y2
    = x1(y1 + y2) + x2(y1 + y2)
    = (x1 + x2)(y1 + y2) = xy
CS717
Self-Corrector for Modular Mult
• For i = 1 to q
  – Pick random x1, y1
  – Pick x2 s.t. x=x1+x2
  – Pick y2 s.t. y=y1+y2
  – answer_i = (P(x1,y1) + P(x1,y2) + P(x2,y1) + P(x2,y2)) mod R
• Return majority among answer_1 … answer_q
• Corrector for integer multiplication similar
CS717
Self-Reducibility of Matrix Mult
• Want to compute f(A,B) = AB
  – A, B matrices
• Self-Reducible
  – AB = A1B1 + A1B2 + A2B1 + A2B2
  – A1, B1 picked uniformly at random
  – A2 = A – A1, B2 = B – B1
• Since A1B1 + A1B2 + A2B1 + A2B2
    = A1(B1 + B2) + A2(B1 + B2)
    = (A1 + A2)(B1 + B2) = AB
CS717
Self-Corrector for Matrix Mult
• For i = 1 to q
  – Pick random A1, B1
  – Pick A2 s.t. A2 = A – A1
  – Pick B2 s.t. B2 = B – B1
  – answer_i = P(A1,B1) + P(A1,B2) + P(A2,B1) + P(A2,B2)
• Return majority among answer_1 … answer_q
• Polynomial multiplication works same way
CS717
Self-Correctors Summary
• Can design self-corrector for any random self-reducible function
  – Unknown how large this set is
• Self-Correctors very simple
  – Finite bound loop
  – Simple reduction function
• Thus, low probability of error in self-corrector
• Number of calls to P increases by log factor
CS717
Outline
• Self-correctors
  – Assume existence of self-testers
  – Simpler to prove correctness
• Self-Testers
  – Correctness harder to prove
  – Requires a bit of group theory
CS717
Moving to Self-Testing
• Can now self-correct
  – Self-correctors only work if error(f, P, D) ≤ constant
• But how can we know?
• Self-testing tells us
CS717
Self-Testing
• Given implementation P of function f
• Want to know error(f, P, D) = Prob[P(x)≠f(x)] for x chosen with distribution D
  – Assume P(x) = f(x) for many x’s
• Will present self-testers for linear functions
• Linear: f(x1+x2) = f(x1) + f(x2)
• Tests will bound error(f, P, D)
CS717
Linearity
• Mod
    f(x1+x2) = (x1+x2) mod R = ((x1 mod R) + (x2 mod R)) mod R
• Modular Mult
    f(x1+x2) = ((x1+x2)y) mod R = ((x1y mod R) + (x2y mod R)) mod R
    ...
CS717
Self-Testing Technique
• Central Claim: If P(x) is linear in many spots, then ∃ linear function g s.t. low prob of P(x)≠g(x)
• If P mostly linear then P very close to actually linear function
• Since assumed: for most x P(x) = f(x), then: the linear function P is close to is f()
CS717
Self-Testing Technique
• Thus self-tester repeatedly:
  – Picks random inputs x1, x2
  – Verifies P(x1+x2) = P(x1) + P(x2)
  – Verifies P(x1+1) = P(x1) + 1
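For the mod function, the two verifications can be sketched as a loop that reports the observed failure fraction. The function name, the trial count, and returning a fraction (rather than applying a PASS/FAIL threshold) are choices of this sketch; input sums are reduced modulo the domain size R·2^n and output sums modulo R.

```python
import random

def self_test_mod(P, R, n, trials=200, rng=random):
    """Fraction of linearity and neighbour checks that P fails."""
    M = R * 2**n                       # size of the input domain
    failures = 0
    for _ in range(trials):
        x1, x2 = rng.randrange(M), rng.randrange(M)
        # Linearity check: P(x1 + x2) = P(x1) + P(x2)
        if P((x1 + x2) % M) != (P(x1) + P(x2)) % R:
            failures += 1
        # Neighbour check: P(x1 + 1) = P(x1) + 1
        if P((x1 + 1) % M) != (P(x1) + 1) % R:
            failures += 1
    return failures / (2 * trials)
```

A correct `x % R` never fails either check. The constant program `P(x) = 0` passes every linearity check but fails every neighbour check, which shows why the second verification is needed: linearity alone cannot tell f apart from other linear functions.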
CS717
Intro to Group Theory
• To explain why this works, must introduce group theory
• Group: set of numbers with operation: <S, ∘>
  – Set: S
  – Binary Operation: ∘
    • Given two group elements, produces another group element
• Examples
  – Infinite group: Integers with ∘ = +
  – Finite group: Even integers [0…98], with ∘ = + mod 100
CS717
Identity Element
• Special element 0: ∀h∈S. 0 ∘ h = h
  – Called “Identity Element”
• Ex: Nonzero rationals, with ∘ = ×
  – x × y = another rational
  – 1 is identity since 1 × x = x
• Inverse: x^-1 s.t. x ∘ x^-1 = 0
CS717
Group Generators
• Powers: h^n = h ∘ h ∘ … ∘ h (n times)
  – a^n ∘ a^m = a^(n+m)
• Each group has elements g1,…,gc s.t.
    ∀h∈S. ∃n1,n2,…,nc. h = g1^n1 ∘ g2^n2 ∘ … ∘ gc^nc
• Thus, all elements expressible in terms of g1,…,gc
  – They “generate” the group
CS717
Subgroups
• Given group <S, ∘>
• Subgroup: <T, ∘> if T⊆S and T is itself a group under ∘
• Ex:
  – S = Even Integers [0…98], ∘ = + mod 100
  – T = Multiples of 4 [0…98]
• Each subgroup has own generators
  – <S, ∘> generated by 2
  – <T, ∘> generated by 4
• T = {0} always subgroup
CS717
Subgroups Example
• Let G = <integers [0…14], + mod 15>
• Group generator: 1
• Subgroups:
  – {0}: generated by 0
  – {0, 3, 6, 9, 12}: generated by 3
  – {0, 5, 10}: generated by 5
• All subgroups generated by some power of 1
  – In general, every subgroup generated by some combo of group generators
CS717
Functions on Groups
• ∘ is function from group to itself
• Can define others
  – Ex: given additive group, can define multiplication
• Functions from one group to another
  – G1: even integers [0...98]
  – G2: powers of 4 [0…186]
  – f : G1→G2
    • f(x∈G1) = (x2) ∈ G2
CS717
Homomorphisms
• Homomorphism: function on groups s.t.
  – f : GA→GB
    • GA = <SA, ∘A>
    • GB = <SB, ∘B>
  – f(x1 ∘A x2) = f(x1) ∘B f(x2)
  – i.e. Linear function on groups
• Example: modular exponentiation
  – GA = Integers [0...R·2^n], with +
  – GB = Integers [0...R], with × (mod R)
  – fa(x) = a^x mod R
  – fa(x1+x2) = a^(x1+x2) mod R = (a^x1 mod R) × (a^x2 mod R) = fa(x1) × fa(x2)
CS717
Homomorphisms and Subgroups
• Fact: homomorphisms map subgroups to subgroups
• f: G1→G2
    ∀x∈H1, H1 subgroup of G1
    f(x) are all the members of some H2, subgroup of G2
[Figure: homomorphism mapping a subgroup of G1 onto a subgroup of G2]
Self-Testing Technique
• Given implementation P of function f, f : GA→GB
• Linear Test:
    For i=1 to n
        Randomly pick x1, x2
        Check if P(x1 ∘A x2) = P(x1) ∘B P(x2)
• If GB has no finite subgroups (besides {0B}) then done
CS717
Self-Testing Technique
• Given implementation P of function f, f : GA→GB
• If GB has finite subgroups (besides {0B})
  Neighbor Test:
    For i=1 to n
        For each generator gi of group GA
            Randomly pick z
            Check if P(z ∘A gi) = P(z) ∘B F^i_neighbor(z)
  – F^i_neighbor(z) is f(z)’s neighbor in target subgroup of GB
CS717
Proof
• Method:
  – Assume that linearity and neighbor tests passed
    • i.e. error for tests bounded by constant
  – Prove that P mostly equal to some linear g()
• Steps
  – Define discrepancy function disc(x) = g(x) ∘ P^-1(x)
    • g() some linear function, operating on same groups as f()
    • P(x)=g(x) for most x ⇔ disc(x)=0 for most x
  – disc(x) is homomorphism ⇔ linearity test passes for x
    • Thus, if linearity test passed on many x’s, disc is probably homomorphism
CS717
Proof
• f() maps finite group to (possibly infinite) group
  – Mod – f : Z_{R·2^n} → Z_R
  – Integer Mult – f : Z_{2^n} × Z_{2^n} → Z
  – Modular Mult – f : Z_{2^n} × Z_{2^n} → Z_R
  – …
• g() and disc() map same groups
CS717
Proof
• Recall: homomorphisms map subgroups to subgroups
  – Thus, homomorphism disc() can only map domain group to finite subgroup of range group
• Question: Does range group have finite subgroups?
  – (besides {0B})
CS717
Proof
• Range group: no finite subgroups besides {0B}
• Then disc must map domain group to {0B}
• disc(x) = 0B
  – (On most x’s, since disc probably homomorphism)
[Figure: homomorphism from a finite group into an infinite group, landing on {0}]
CS717
Proof
• Range group: has real finite subgroups
• Each subgroup must have generators
  – Those generators are products of the group’s generators
• Neighbor test: for each generator gi, check P(z ∘A gi) = P(z) ∘B F^i_neighbor(z)
• Ensures any such subgroup must be {0B}
CS717
Proof
• Thus, disc(x) = 0B (with high probability)
• Recall: disc(x) = g(x) ∘ P^-1(x)
• Thus, P(x) = g(x) (with high probability)
  – g() still undefined linear function
• Assumption: P(x) = f(x) on many inputs
• Fact: if two linear functions equal on >1 point, they are same function
• Thus, g = f and P(x)=f(x) (with high probability)
CS717
Summary of Self-Testers
• Can verify if P(x)=f(x) with high probability
  – Useful information to know
  – Required for self-correctors to work reliably
• Simple implementation
  – Constant bound loop
  – A few additions, multiplications, etc.
• Can be self-corrected themselves
CS717
Summary
• Paper presents the basics of broader theory of self-testing/-correcting
• Some additional related papers:
  – S. Ravikumar and D. Sivakumar. "Efficient Self-Testing of Linear Recurrences".
    • Expands these testers to test linearity in bulk fashion for linear recurrences
  – Ronitt Rubinfeld and Madhu Sudan. "Robust Characterizations of Polynomials with Applications to Program Testing", SIAM Journal on Computing, 1996.
  – Funda Ergun, S. Ravi Kumar and Ronitt Rubinfeld. "Checking Approximate Computations of Polynomials and Functional Equations", IEEE Conference on Foundations of Computer Science, 1996.