CS717 Checking Specific Algorithms Greg Bronevetsky.

CS717

Objective

• Primary goal: find checkers that can be applied to any program

• Along the way must look at checkers for specific algorithms

• Each checker exploits special properties of target algorithm
  – General patterns may become apparent
  – Useful for toolkit-style checker generator

• Library of algorithm-specific checkers used where appropriate

CS717

Outline

• Low-hanging fruit
  – Himanshu Gupta. "Result Verification Algorithms for Optimization Problems", 1995
    • Checkers for optimization programs
    • Existence of checkers for certain complexity classes

• Certification Trails
  – Gregory F. Sullivan, Dwight S. Wilson and Gerald M. Masson, “Certification of Computational Results”, 1995.

• Fundamentals of program testing & correcting
  – Manuel Blum, Michael Luby and Ronitt Rubinfeld. "Self-Testing & Correcting with Applications to Numerical Problems"

CS717

Low-Hanging Fruit Outline

• “Greg’s Theorem”

• Himanshu Gupta:
  – Checkers for optimization problems (primal-dual)
    • Max Flow
    • Min-cost Flow
    • Unweighted & Weighted bipartite matching
    • All Shortest Paths
  – Existence proofs for verifiers
    • Verifiers for NP-complete and NP-hard languages
    • Approximate verifiers
  – Verifiers with Certification Trails

CS717

Intro to Complexity

• Problems defined as sets of strings
  – Particular set = “language”

• Example language: set of palindromes
  – 001100, 01011010, etc.

• Membership in given language decidable by computing device of given power
  – ex: Turing machine running in polynomial time
  – Machine A on input x
    • Runs for time poly(|x|)
    • Returns ACCEPT if x ∈ L(A), REJECT if x ∉ L(A)

CS717

Intro to Complexity

• Complexity class: set of languages accepted by machines of given power

• Examples:
  – P : Turing machines with running time O(poly(|x|))
  – EXPSPACE : Turing machines that use O(2^poly(|x|)) memory
  – BC : Boolean circuits of bounded depth

• Big deal: can show some problems inherently harder than others

CS717

Basics of Reductions

• Problem A reducible to problem B if can easily solve A given algorithm for B

• Reduction:
  – Given x = input to A
  – Efficiently transform x to y = input to B
    • Efficiently usually means poly-time
  – Ask algorithm for B if y valid B-string
    • B is an “oracle”
  – x valid A-string ⇔ y valid B-string

• Thus, B at least as hard as A

CS717

Basics of Reductions

• Can compare difficulties of problems via reductions

• If all problems in class C1 reducible to problems in class C2 then C1 ⊆ C2

• Example:
  – P ⊆ NP
  – Unknown if NP ⊆ P
    • If yes, then P=NP
    • If no, then some problems in NP cannot be solved in polynomial time

CS717

Theorem 1

• Let A be an implementation of an algorithm
  – Runs in time O(p()), p some polynomial

• Does there exist checker C for A that runs in time < p()?
  – C accepts A’s input & output, returns PASS/FAIL
  – C knows A’s source code
  – Runs in time O(q()) < O(p()), asymptotically faster

• A is decision algorithm, so output is 1 bit

[Figure: Input → A → {0,1} → C → {PASS, FAIL}]

CS717

Theorem 1

• A’s input = x: n bits
• A’s output = y: 1 bit
• Suppose C can verify whether x ∈ L(A) using y
  – In O(q(|x|)) time

• Then can decide L(A) in O(q(|x|)) time
  – Given input x
  – Run C on <x,1>
  – If C(x,1) returns PASS, ACCEPT x
  – If C(x,1) returns FAIL, REJECT x
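The wiring of this reduction can be sketched in a few lines of Python. The names `toy_checker` and `decide_with_checker` are hypothetical, and the toy checker below (for the palindrome language) is not actually faster than deciding directly; it only illustrates how a decider is obtained from a checker by always claiming output 1.

```python
PASS, FAIL = "PASS", "FAIL"

def toy_checker(x, y):
    """Hypothetical checker C: PASS iff claimed bit y matches 'x is a palindrome'."""
    return PASS if (x == x[::-1]) == bool(y) else FAIL

def decide_with_checker(checker, x):
    """Decide membership of x by asking the checker about the claimed output 1."""
    return "ACCEPT" if checker(x, 1) == PASS else "REJECT"
```

For example, `decide_with_checker(toy_checker, "001100")` accepts, while the non-palindrome `"0011"` is rejected.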

CS717

Theorem 1

• A reduced to C

• C runs in O(q()) time

• Thus, p()-time problems solvable in q()-time
  – This is not true in general: some p()-time problems have no asymptotically faster algorithm

• Therefore, for p()-time decision algorithm A, in general ∄ checker C running in < O(p()) time

CS717

Theorem 1.1

• Let’s make job simpler for checker
• Helper proof
  – Suppose A outputs helper proof
  – C can use helper proof to simplify checking

• Size of helper proof ≤ q(|x|)
  – Since C can only read q(|x|) bits of proof

[Figure: Input → A → {0,1} + Helper Proof → C → {PASS, FAIL}]

CS717

Theorem 1.1

• A’s input = x: n bits
• A’s output = y: 1 bit + q(|x|)-bit proof
• Suppose C can verify whether x ∈ L(A) using y

• Then y is witness for x ∈ L(A)

CS717

Nondeterministic Time Classes

• Non-deterministic automaton for language L:
  – Given input x
  – Runs for O(f(|x|)) amount of time
  – Can make binary guesses {0,1}
  – If ∃ set of guesses s.t. automaton would accept, then x ∈ L

• At each guess, automaton splits
  – 0 branch, 1 branch

• If accepts along some path then whole automaton accepts

[Figure: binary tree of guesses, each node branching on 0 and 1]

CS717

Nondeterministic Time Classes

• String of guesses leading to ACCEPT called a “witness”

• Proves that x ∈ L
  – Can deterministically verify x ∈ L
  – No need to guess: witness gives correct guesses

• NP – runs for polynomial time ⇒ witnesses poly-length

• Nondet(f()) – runs in time O(f(|x|)) ⇒ witness length O(f(|x|))

CS717

Theorem 1.1

• A’s input = x: n bits
• A’s output = y: 1 bit + q(|x|)-bit proof
• Suppose C can verify whether x ∈ L(A) using y

• Then y is witness for x ∈ L(A)

• Thus, x ∈ L(A) decidable by nondet automaton running in O(q(|x|)) time

CS717

Theorem 1.1

• Guess all bits of proof
  – Takes q(|x|) guesses

• For each proof, run C on y=<1, proof>
• Accept if C says PASS on some y=<1, some proof>

[Figure: tree of binary guesses for y, with a copy of C run at each leaf]

CS717

Theorem 1.1

• x ∈ L(A) decidable by nondet automaton running in O(q(|x|)) time

• Thus, L(A) ∈ Nondet(q())
  – Recall: A runs in time O(p()) > O(q())

• Det(p()) ⊆ Nondet(q()) ⊊ Nondet(p()) ⇒ Det(p()) ⊊ Nondet(p())

• Shown: If L(A) ∈ Det(p()) and ∃ proof-checker that runs in time O(q()) where q()<p() then Det(p()) ⊊ Nondet(p())

CS717

Theorem 1.1

• Shown: If L(A) ∈ Det(p()) and ∃ checker that runs in time O(q()) where q()<p() then Det(p()) ⊊ Nondet(p())
  – Known Det(p()) ⊊ Nondet(p()) for few functions
  – In general, as hard as P ≠ NP

• Worst case scenario:
  – If discover efficient checkers, then Det(p()) ⊊ Nondet(p())
    • Very difficult
  – If Det(p()) ⊊ Nondet(p()) proven, no help to us

CS717





– Existence proofs for verifiers
  • Verifiers for NP-complete and NP-hard languages
  • Approximate verifiers

CS717

Definitions

• Problem: Π ⊆ I×S
  – I: set of inputs
  – S: set of solutions

• Algorithm: on input x∈I, returns y∈S s.t. <x,y> ∈ Π

• Verification problem: V(Π) ⊆ (I×S)×{PASS,FAIL}
  – <(x,y),PASS> ∈ V(Π) iff <x,y> ∈ Π
  – <(x,y),FAIL> ∈ V(Π) iff <x,y> ∉ Π

CS717

Maximum Flow

• Given:
  – Directed graph G=(V,E)
  – Edges labeled with capacities
    • c(u,v) = capacity of edge
  – Two special nodes: s and t

• Must find “maximum flow” from s to t

[Figure: example flow network from S to T with edge capacities c1 … c9]

CS717

Maximum Flow

• Find edge labeling f
  – f(u,v) = flow on edge u→v, with f(u,v) ≤ c(u,v)

• Sum of flows coming into node = Sum of flows coming out of node, except:
  – Outflow of node s > inflow
  – Inflow of node t > outflow

• Net outflow of s = Net inflow of t = “Network flow”

• Problem: Find labeling f generating largest network flow

CS717

Residual Capacities

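• Residual Capacity for flow f: res(u,v) = c(u,v) – f(u,v)
  (Capacity of edge unused by flow)

• Residual Graph Rf: graph on V,E where edge (u,v) has capacity

$$
c_{R_f}(u,v) =
\begin{cases}
res(u,v) & \text{if } res(u,v) > 0 \\
\text{no edge} & \text{if } res(u,v) = 0
\end{cases}
$$

• Augmenting Path: Path from s to t through Rf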

CS717

Augmenting Paths Verifier

• Theorem: Rf contains augmenting path ⇔ f is not maximum flow

• Example: one common max flow algorithm
    Pick a flow
    While can find augmenting paths
        Adjust flow to remove augmenting path

• Thus, to check if given flow is max flow:
  – Ensure given flow valid
  – Construct Rf
  – Ensure that no augmenting path exists
    • Path search much faster than max-flow algorithm
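The three checking steps above can be sketched as follows. This is a minimal illustration, not the paper's code: the function name `check_max_flow` and the dict-based graph encoding are assumptions, and the residual graph uses the common convention of adding a backward edge wherever flow can be cancelled.

```python
from collections import deque

def check_max_flow(n, cap, flow, s, t):
    """Return True iff `flow` is a valid maximum s-t flow.

    n: number of nodes (0..n-1); cap, flow: dicts {(u, v): number}.
    """
    # Step 1: validity (capacity constraints, then flow conservation).
    for edge, f in flow.items():
        if f < 0 or f > cap.get(edge, 0):
            return False
    net = [0] * n
    for (u, v), f in flow.items():
        net[u] -= f
        net[v] += f
    if any(net[v] != 0 for v in range(n) if v not in (s, t)):
        return False
    # Step 2: residual graph (forward edge if spare capacity,
    # backward edge if there is flow that could be cancelled).
    residual = {}
    for (u, v), c in cap.items():
        f = flow.get((u, v), 0)
        if c - f > 0:
            residual.setdefault(u, []).append(v)
        if f > 0:
            residual.setdefault(v, []).append(u)
    # Step 3: breadth-first search for an augmenting path from s to t.
    seen, queue = {s}, deque([s])
    while queue:
        u = queue.popleft()
        for v in residual.get(u, []):
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return t not in seen
```

The whole check is linear in the size of the graph, which is the point of the slide: one path search instead of a full max-flow computation.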

CS717

Min-Cost Flow

• Now edges have cost
• Goal: find flow of given size with minimal cost
  – For any potential flow construct “Residual Graph” Rf
  – Theorem: given flow is min-cost iff Rf has no negative-cost cycles

• Thus, checker
  – Ensures alleged min-cost flow is valid flow
  – Constructs Rf
  – Ensures that no negative-cost cycles exist
    • Cycle search much faster than min-cost flow algorithm
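A sketch of the negative-cycle step, assuming the flow's validity has already been checked separately. The names `residual_arcs`, `has_negative_cycle` and `is_min_cost` are illustrative, and Bellman-Ford is one possible choice of cycle search, not necessarily the one used in the paper.

```python
def residual_arcs(edges, flow):
    """edges: {(u, v): (capacity, cost)}; flow: {(u, v): amount}."""
    arcs = []
    for (u, v), (cap, cost) in edges.items():
        f = flow.get((u, v), 0)
        if f < cap:
            arcs.append((u, v, cost))    # can push more flow at +cost
        if f > 0:
            arcs.append((v, u, -cost))   # can cancel flow, refunding cost
    return arcs

def has_negative_cycle(n, arcs):
    """Bellman-Ford from a virtual source connected to every node at cost 0."""
    dist = [0] * n
    for _ in range(n):
        changed = False
        for u, v, w in arcs:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                changed = True
        if not changed:
            return False
    return True  # still relaxing after n passes: a negative cycle exists

def is_min_cost(n, edges, flow):
    return not has_negative_cycle(n, residual_arcs(edges, flow))
```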

CS717

Matching

• Given bipartite weighted/unweighted graph find largest (or max-weight) subset of edges s.t.
  – No left node has >1 right node as neighbor
  – No right node has >1 left node as neighbor

[Figure: bipartite graph on node sets X and Y; maximal matching of size 3]

CS717

Verifier for Matching

• Matching also has augmenting paths
• Can verify matching:
  – Ensure that matching valid
  – Search for augmenting path; matching is maximum if none exists

• Above problems examples of matroids
  – Intuitively: problems solvable via basic greedy algorithm

• All matroids have this structure
• Thus, can easily check matroid problems

CS717

Other Matroid Problems

• Minimal Spanning Tree

• Minimal Cut in Graph

• Maximal Basis in Linear Space

• …

CS717






CS717

All Shortest Paths

• Given undirected graph G=(V,E)
  – Each edge (i,j) has weight w(i,j)

• Problem: return the shortest path between every pair of nodes

• Assume output format:
  – P : predecessor matrix
      P[i,j] = k s.t. edge (k,j) lies on shortest path i→j
      (i.e. node k precedes node j on path i→j)
  – D : distance matrix
      D[i,j] = shortest distance from i to j
      (D[i,i] = 0)

[Figure: path from i to j with P[i,j] as the last node before j]

CS717

Checker Algorithm

• Step 1: ensure that P[] and D[] self-consistent
    foreach (i,j)
        Ensure that D[i,j] = D[i, P[i,j]] + w(P[i,j], j)
        If error seen, return FAIL

• Step 2: ensure output paths truly shortest
    foreach node i
        foreach edge (u,v)
            if (D[i,u] + w(u,v) < D[i,v]) return FAIL

• Return PASS
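The two steps can be written out as a short Python sketch. The function name and data layout are assumptions: `w` lists each undirected edge once, weights are positive integers, the graph is connected, and `D`, `P` are n×n lists (diagonal entries of `P` are ignored).

```python
def check_all_shortest_paths(n, w, D, P):
    """Return True (PASS) iff D and P pass both checker steps."""
    # Make the undirected weight table symmetric.
    weight = {}
    for (u, v), c in w.items():
        weight[(u, v)] = c
        weight[(v, u)] = c
    # Step 1: P[] and D[] must be self-consistent.
    for i in range(n):
        if D[i][i] != 0:
            return False
        for j in range(n):
            if i == j:
                continue
            k = P[i][j]
            if (k, j) not in weight:      # predecessor must be adjacent to j
                return False
            if D[i][j] != D[i][k] + weight[(k, j)]:
                return False
    # Step 2: no edge may offer a shorter route than the reported distance.
    for i in range(n):
        for (u, v), c in weight.items():
            if D[i][u] + c < D[i][v]:
                return False
    return True
```

Step 1 costs O(n²) and Step 2 costs O(n·|E|), compared with roughly O(n³) for computing all shortest paths from scratch.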

CS717

Checking Self-Consistency

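• foreach (i,j)
    Ensure that D[i,j] = D[i, P[i,j]] + w(P[i,j], j)

• Clearly, if output correct, test returns PASS

[Figure: path from i to P[i,j] of length D[i, P[i,j]], followed by edge (P[i,j], j) of weight w(P[i,j], j); total length D[i,j]]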

CS717

Checking Self-Consistency

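• foreach (i,j)
    Ensure that D[i,j] = D[i, P[i,j]] + w(P[i,j], j)

• If output incorrect, two possible problems
  – P[ ] may not represent valid paths
  – D[ ] may not be true lengths of P[ ]’s paths

[Figure: path from i to P[i,j] of length D[i, P[i,j]], followed by edge (P[i,j], j) of weight w(P[i,j], j); total length D[i,j]]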

CS717

Checking D[ ]

• Suppose D[i,j] ≠ i→j distance according to P[ ]
• Pick such a pair i,j with shortest true distance
  – Let k = P[i,j]
  – Check: D[i,j] ?= D[i, k] + w(k, j)
  – No errors in D[ ] for pairs closer than i and j

• Thus, D[i, k] is correct distance from i to j’s predecessor
• So, D[i, k] + w(k, j) is correct i→j distance
  – Thus, will detect error

[Figure: path from i to j through predecessor k = P[i,j]]

CS717

Checking P[ ]

• P[i,*] must induce spanning tree on nodes

• Thm: Graph on n nodes is spanning tree ⇔ n-1 edges and acyclic
  – Clearly, n-1 edges since each node j≠i has P[i,j]

• Above check will fail if cycle exists

[Figure: spanning tree rooted at i]

CS717

Checking P[ ] : Acyclicity

• Suppose cycle exists
• Suppose all edges of u-v cycle have been checked (and passed), u→v is last

• Claim: if cycle exists then D[ ] has error

[Figure: root i, with a cycle formed by edge u→v and a v→u path; u and v labeled with D[i,u] and D[i,v]]

CS717

Checking P[ ] : Acyclicity

• v→u path passed, so: D[i,u] = D[i,v] + w(path)

• In order for u→v to pass: D[i,v] = D[i,u] + w(u,v)

• But then, D[i,v] = w(u,v) + (D[i,v] + w(path))
  – Contradiction! (assuming positive edge weights, w(u,v) + w(path) > 0)

• Thus, cycle exists ⇒ error in D[ ] detected

[Figure: same cycle as on the previous slide, checked by: foreach (i,j) D[i,j] = D[i, P[i,j]] + w(P[i,j], j)]

CS717

Checker Algorithm

• Step 1: ensure that P[] and D[] self-consistent
    foreach (i,j)
        Ensure that D[i,j] = D[i, P[i,j]] + w(P[i,j], j)
        If error seen, return FAIL

• Step 2: ensure output paths truly shortest
    foreach node i
        foreach edge (u,v)
            if (D[i,u] + w(u,v) < D[i,v]) return FAIL

• Return PASS

CS717

Checking Shortest Paths

• Test ∀ nodes i, edges (u,v): D[i,u] + w(u,v) ≥ D[i,v] ? (FAIL if <)

• Test checks if path through u shorter than “official” path

• If output correct:
  – If u on shortest path, D[i,u] + w(u,v) = D[i,v]
  – If u not on shortest path, D[i,u] + w(u,v) ≥ D[i,v]

• Thus, on correct output PASS returned

[Figure: path from i to v whose last edge is (u,v)]

CS717


• Suppose output incorrect: D[i,j] not length of shortest i→j path
  – Step 1 ensures D[i,j] true distance of path i→j according to P[ ]

• Let
  – D*[ ] = distance function for truly shortest solution
  – P*[ ] = predecessor function for shortest solution

CS717


• Pick i,j s.t.
  – D[i,j] ≠ D*[i,j] (D*, P* = true shortest distances and predecessors)
  – No other erroneous pair closer to each other (by P*[ ])

• Thus, ∃ u ≠ P[i,j] s.t. D*[i,u] + w(u,j) < D[i,j]
  – Path thru u shorter than path thru P[i,j]
  – If shortest path is thru P[i,j], then D[i, P[i,j]] is wrong

• Big Question: Will test detect this?

[Figure: i to j by the wrong path through P[i,j] and by the shorter path through u]

CS717


• Thus, ∃ u ≠ P[i,j] s.t. D*[i,u] + w(u,j) < D[i,j] (D* = true shortest distances)

• For each node i, test looks at all edges
• When looks at u→j:
  – Evaluates: D[i,u] + w(u,j) ≥ D[i,j] ?
  – D[i,u] = D*[i,u]
    • Since i,j closest erroneous pair
  – Thus, D[i,u] + w(u,j) = D*[i,u] + w(u,j) < D[i,j]

[Figure: i to j by the wrong path through P[i,j] and by the shorter path through u]

CS717


• For each node i, test looks at all edges
• When looks at u→v:
  – Evaluates: D[i,u] + w(u,v) ≥ D[i,v] ?
  – D[i,u] = D*[i,u] (D* = true shortest distances)
    • Since i,j closest erroneous pair
  – Thus, D[i,u] + w(u,v) = D*[i,u] + w(u,v) < D[i,v]

• Test returns FAIL

• Shown: if output not shortest paths, test returns FAIL

CS717

Summary of Checkers

• Shown checkers for
  – Matroids
  – All-shortest paths

• Checkers very specific to target algorithm

• In general:
  – Find invariant of output
  – Ensure it is true

CS717






CS717


Verifiers for NP-Complete Problems

• Given NP-complete decision problem Π ⊆ I×{0,1}

• Π’s verification problem: V(Π) ⊆ {I×{0,1}} × {P,F}

• Π reducible to V(Π)
  – On input x, check if <<x,1>,P> ∈ V(Π)
  – Mapping computable in poly-time
  – Thus, V(Π) NP-complete

• Proven: if Π NP-complete then so is V(Π)

CS717

Verifiers for Some NP-Hard Problems

• Given NP-Hard problem H ⊆ I×ℕ

• On input x, returns y s.t.
  – y < O(f(|x|))
  – f(|x|) ∈ O(poly(|x|))
  – f() poly-time computable

• Verification problem V(H) ⊆ {I×ℕ} × {P,F}

CS717

Verifiers for Some NP-Hard Problems

• Reduction from H to V(H):
  – Given x, compute f(|x|)
  – Try all possible values for y ∈ {0 … f(|x|)}
    • If <<x,y>, P> ∈ V(H) then ACCEPT
  – If no y accepted, REJECT

• Reduction is poly-time
• Thus, V(H) is NP-Hard (unless P=NP)

• Examples: min vertex cover, max clique, chromatic number, max cycle length, etc.

CS717






CS717

Maximization Problems

• Given NP-Hard Maximization problem H ⊆ I×ℕ

• Output: size of maximum solution to problem

• For each H ∃ NP-Complete Decision problem HD
  – Input <x,y>
  – ACCEPT iff maximal solution to problem is ≥ y

• Can use binary search to solve H using O(log |x|) calls to HD (solution size is at most poly(|x|))
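The binary search can be sketched as below, with max clique as a stand-in for H. The names `solve_by_decision` and `has_clique` are hypothetical, and the brute-force `has_clique` merely plays the role of the decision oracle HD; it is exponential and only meant for tiny examples.

```python
from itertools import combinations

def has_clique(graph, k):
    """Toy decision oracle HD: does the graph contain a clique of size >= k?"""
    n, edges = graph
    if k <= 1:
        return k <= n
    return any(
        all((a, b) in edges or (b, a) in edges for a, b in combinations(c, 2))
        for c in combinations(range(n), k)
    )

def solve_by_decision(x, upper_bound, decide):
    """Largest y in [0, upper_bound] with decide(x, y) true.

    Assumes decide(x, 0) holds and decide is monotone in y.
    """
    lo, hi = 0, upper_bound
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if decide(x, mid):
            lo = mid          # a solution of size mid exists, search higher
        else:
            hi = mid - 1      # no solution that large, search lower
    return lo
```

For a graph with n nodes the bound is n, so the loop makes about log2(n) oracle calls.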

CS717

Approximate Algorithms

• Given NP-Hard Maximization problem H ⊆ I×ℕ

• AH is (a,c) approximation algorithm for H if
  – On input x, outputs y
  – y ≤ max ≤ ay + c
  – max is maximal solution of H
  – Runs in poly time

• (a,c) approximation verifier: AVH
  – Input <x,y>
  – Returns PASS iff y ≤ max ≤ ay + c

CS717

Approximation Verifiers NP

• Thm: If ∃ poly-time (a,c) approximation algorithm AH for NP-Hard maximization problem H, then approximation verifier AVH is NP-Complete

• Proof: Reduce sibling decision problem HD to AVH
  – Calls AH
  – Uses AVH as oracle

CS717

Reducing HD

• Input <x,y>
• ACCEPT iff y ≤ max
  – max = size of maximal solution to input x

• Step 1: call AH
  – Get m = AH’s approximation of max

• Step 2: work
  – If y ≤ m, ACCEPT
  – If y > m
    • If y > am + c, REJECT
    • Else, use AVH Oracle to see if <x,y> ∈ L(AVH)
      – If <x,y> ∈ L(AVH), ACCEPT
      – Else, REJECT
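The case analysis above can be sketched directly. All names here are hypothetical: `toy_max` stands in for the (hard to compute) optimum, `toy_approx` is a (2,1)-approximation of it, and `toy_verifier` plays the AVH oracle. The sketch only shows that the three cases together decide y ≤ max.

```python
def decide_hd(x, y, approx, approx_verifier, a, c):
    """Decide 'y <= max' using approximation algorithm AH and verifier AVH."""
    m = approx(x)                  # Step 1: m <= max <= a*m + c
    if y <= m:
        return True                # y <= m <= max
    if y > a * m + c:
        return False               # y > a*m + c >= max
    return approx_verifier(x, y)   # m < y <= a*m + c: ask the oracle

def toy_max(x):
    return sum(x)                  # stand-in for the true optimum

def toy_approx(x):
    return toy_max(x) // 2         # satisfies m <= max <= 2m + 1

def toy_verifier(x, y, a=2, c=1):
    return y <= toy_max(x) <= a * y + c
```

In the oracle case y > m, so max ≤ am + c < ay + c always holds, and the verifier's answer depends only on whether y ≤ max.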

CS717

Reducing HD

• Given input <x,y> to HD
• Decided whether y ≤ max by calling
  – AH – approximation algorithm
  – AVH – approximation verifier

• AH runs in poly time
  – Reduction is poly-time, so can run AH as part of reduction

• <x,y> transformed to input to AVH
  – Thus, AVH at least as hard as HD

CS717

Proven

• NP-Hard Maximization problem AND

• Poly-time approximation algorithm for it

• Approximation verifier for approximation algorithm is NP-Complete

CS717

Algorithm-specific Verifiers

• If poly-time approximation algorithm exists, trivial approximation verifier:
  – On input: <x,y>
  – Run fault-free version of approximation algorithm on x
  – See if returns y

• Verifier still poly-time

• Only works for specific approximation algorithm
  – Won’t work for different approximators of same problem

CS717

Low-Hanging Fruit Summary

• Covered several basic theorems

• Give idea of what is possible/impossible

• Fairly basic results
  – Equivalent to homework in graduate theory class

CS717

Outline








CS717

Certification Trails

• Algorithm runs on input
• Produces:
  – Regular output
  – Certification Trail: short proof that output matches input

• Certifier runs on <input, trail>
  – Produces same output or returns FAIL
  – Additional info from trail speeds recomputation
  – If trail wrong, may still output correctly

• Certifiers presented here will just FAIL

CS717

Basic Definitions

• D = set of inputs
• S = set of valid outputs
• T = set of certification trails

• Original program
  – P: D → S×T
  – Accepts input, returns <output, trail>

• Certifier
  – C: D×T → S∪{FAIL}
  – Accepts <input, trail>, returns output or FAIL

CS717

Focus of Certification Trails

• Paper comes from Software Engineering background

• Thus, focus on detecting programmer errors
  – Argue that P and C different algorithms
  – Thus, implementations different
  – Low probability of same errors in P and C

• Hardware faults also mentioned

CS717

Certification Trails Outline

• Paper presents checkers for several common algorithms
  – Sorting
  – Convex Hull

• Neat approach
• Problem-specific and very manual
• Little insight into general procedure
  – Though, can create fault tolerant libraries

CS717

Sorting

• Given list of numbers, return the numbers in sorted order

• Trivial check:
  – Given allegedly sorted output
  – Check that order non-decreasing

• This doesn’t work:
  – Output must be permutation of input
  – Above checker would
    • On input [2, 4, 6, 8]
    • Accept [0, 0, 0, 0, 0, 0]

CS717

Correct Certifier

• Certification Trail: list of indexes
  – All input elements get ID
    • ith element gets ID i
  – At spot j of trail: ID of element in sorted position j

Input:
  IDs:    0   1   2   3   4   5   6   7
  Data:  12  45   9  26  33  17  11  92

Sorted:
  IDs:          2   6   0   5   3   4   1   7
  Sorted Data:  9  11  12  17  26  33  45  92

IDs in Trail: 2 6 0 5 3 4 1 7

CS717

Checking Permutation

• Certifier gets
  – Input numbers
  – Their IDs in sorted order

• Uses ID list to reorder input numbers
  – ID list serves as cheat sheet
  – Shows correct sort decisions

Data:          12  45   9  26  33  17  11  92
IDs in Trail:   2   6   0   5   3   4   1   7

Sorted(?) IDs:    2   6   0   5   3   4   1   7
Sorted(?) Data:   9  11  12  17  26  33  45  92

CS717


• Two things to check
  – Onto: For each sorted element, ∃ input element
  – 1-1 : All sorted elements refer to different input elements

Data:          12  45   9  26  33  17  11  92
IDs in Trail:   2   6   0   5   3   4   1   7

Sorted(?) IDs:    2   6   0   5   3   4   1   7
Sorted(?) Data:   9  11  12  17  26  33  45  92

CS717


• Onto: For each sorted element, ∃ input element
  – Check that all IDs in trail are valid

• 1-1 : All sorted elements refer to different input elements
  – If two IDs in trail equal then some input element not touched
    • Not copied to sorted list
    • Use touch counters

Input IDs:       0 1 2 3 4 5 6 7
Sorted(?) IDs:   0 1 3 3 4 5 6 7   (ID 3 touched twice, ID 2 never touched)

CS717

Checking Sort

• To check sort
  – Traverse reordered list
  – Ensure non-decreasing

• Sorter time: O(n log n)
• Certifier time: O(n)
  – Asymptotically faster

• Big trick:
  – Trail summarizes decisions made by sorter
  – Enough info to quickly recompute
  – Can any problem be cast into set of big decisions?
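Putting the permutation check and the order check together gives a linear-time certifier. The sketch below is an illustration under assumptions: the names `sort_with_trail` and `certify_sort` are made up here, the trail is a list of input indexes as on the earlier slides, and FAIL is represented by `None`.

```python
FAIL = None

def sort_with_trail(data):
    """Original program P: returns <sorted output, trail of input IDs>."""
    trail = sorted(range(len(data)), key=data.__getitem__)
    return [data[i] for i in trail], trail

def certify_sort(data, trail):
    """Certifier C: rebuilds the sorted list from the trail in O(n), or FAILs."""
    n = len(data)
    if len(trail) != n:
        return FAIL
    touched = [0] * n                      # touch counters for the 1-1 check
    for idx in trail:
        if not isinstance(idx, int) or not 0 <= idx < n:
            return FAIL                    # onto check: ID must be valid
        touched[idx] += 1
        if touched[idx] > 1:
            return FAIL                    # same input element used twice
    out = [data[i] for i in trail]
    for a, b in zip(out, out[1:]):
        if a > b:
            return FAIL                    # order must be non-decreasing
    return out
```

With the slide's data `[12, 45, 9, 26, 33, 17, 11, 92]`, the trail `[2, 6, 0, 5, 3, 4, 1, 7]` is accepted, while the trail `[0, 1, 3, 3, 4, 5, 6, 7]` fails the touch-counter check.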

CS717


CS717

Convex Hull Problem

Given set of points on 2D plane, find subset that forms convex hull around all points.

CS717

Convex Hull: Step 1

P1 is the point with the least x-coordinate.

Points sorted in order of increasing slope relative to P1.

[Figure: points P1 … P8, numbered in slope order around P1]

CS717

Convex Hull: Invariant

All the points not on Hull are inside triangle formed by P1 and two successive points on Hull

[Figure: points P1 … P8 with the hull and the triangles fanning out from P1]

CS717


P3 is not Hull point if clockwise angle between lines P2P3 and P3P4 ≥ 180º

[Figure: points P1 … P8, with the angle at P3 marked ≥ 180º]

CS717


If clockwise angle between lines P2P3 and P3P4 < 180º, then P3 is Hull point

[Figure: points P1 … P8, with the angle at P3 marked < 180º]

CS717

Convex Hull Algorithm Outline

• Walk through P2 to Pn in slope order

• Keep adding points to hull

• If find point generating angle ≥ 180º
  – Back up, remove all such points

• Until added Pn

– P1, P2 and Pn must be on convex hull

CS717

Convex Hull Algorithm

• Add P1, P2 and P3 to the Hull

• For Pk = P4 to Pn
  (... trying to add Pk to the Hull …)
  – Let QA and QB be the two points most recently added to the Hull
  – While the angle formed by QAQB and QBPk ≥ 180º
    • Remove QB from the Hull since it is inside the triangle P1, QA, Pk
    • Update QA and QB to the two points now most recently added
  – Add Pk to the Hull

CS717

Trail for Convex Hull

• Augment Program to output
  – {h1, h2, ..., hm} = indexes of points on hull
  – For each point Pi not on hull, proof of why not
    • Tuple (xi, hj, hk, hl) s.t. xi in triangle hjhkhl
      – xi internal point
      – hj, hk, hl hull points

[Figure: points P1 … P5, with an internal point inside a triangle of hull points]

CS717

Convex Hull Checker

• Checker checks that:
  – There is 1-1 correspondence between input points and {x1, x2, ..., xm} ∪ {h1, h2, ..., hr}
    • xi internal points
    • hi hull points
  – Each point in triangle proofs lies in given triangle

• Basic error checking
  – Assures that all points accounted for

CS717

Convex Hull Checker

• Checker checks that:
  – For each triple of consecutive hull points hi, hi+1, hi+2, lines hihi+1 and hi+1hi+2 form counter-clockwise angle ≤ 180º
    • i.e. form convex corners of hull
  – ∃ unique locally maximal point on hull

• There exists theorem saying: shape with above properties is convex hull

[Figure: hull through P1, P6, P8 with each corner angle marked ≤ 180º]
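The checks from the two checker slides can be combined into one linear-time routine. This is a simplified sketch under stated assumptions, not the paper's certifier: the name `check_hull` and the data layout are hypothetical, the hull is given as point indexes in counter-clockwise order, no three points are collinear, and no two hull points share a y-coordinate (so "locally maximal" can be tested by comparing y with both neighbors).

```python
def cross(o, a, b):
    """Twice the signed area of triangle o, a, b (> 0 for a left turn)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def in_triangle(p, a, b, c):
    """True if p lies inside or on the non-degenerate triangle a, b, c."""
    d = (cross(a, b, p), cross(b, c, p), cross(c, a, p))
    return not (min(d) < 0 and max(d) > 0)

def check_hull(points, hull, proofs):
    """points: list of (x, y); hull: indexes in CCW order;
    proofs: {interior index: (hj, hk, hl)} triangle of hull indexes."""
    n, m = len(points), len(hull)
    on_hull = set(hull)
    # 1-1 correspondence between input points and hull + interior points.
    if m < 3 or len(on_hull) != m or on_hull & set(proofs):
        return False
    if sorted(on_hull | set(proofs)) != list(range(n)):
        return False
    # Each interior point must lie in its claimed triangle of hull points.
    for i, tri in proofs.items():
        if len(tri) != 3 or not set(tri) <= on_hull:
            return False
        a, b, c = (points[t] for t in tri)
        if cross(a, b, c) == 0 or not in_triangle(points[i], a, b, c):
            return False
    # Every consecutive triple of hull points must form a convex corner.
    hp = [points[h] for h in hull]
    if any(cross(hp[i], hp[(i + 1) % m], hp[(i + 2) % m]) <= 0 for i in range(m)):
        return False
    # Exactly one locally maximal hull point (hull winds around only once).
    peaks = sum(1 for i in range(m) if hp[i - 1][1] < hp[i][1] > hp[(i + 1) % m][1])
    return peaks == 1
```

Each check touches every point or hull corner a constant number of times, which matches the O(n) checker time claimed on the later slide.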

CS717

Proving Correctness

• If hull correct, will return PASS
  – Since we check properties true of convex hulls

• If hull not correct
  – If bad encoding of proof
    • (i.e. points don’t refer to input)
    • First two checks will detect
  – If internal points should be on hull
    • Will not be in any triangle
  – If hull points not convex
    • Convexity assured by theorem, given points were valid (ensured by first check)

CS717

Checking Convex Hull

• Original Algorithm: O(n log n) time
  – Dominated by initial sort of points

• Checker: O(n) time
  – Asymptotically faster

• Very different algorithms
  – Little chance of similar errors in both
  – Hardware errors will affect both differently

CS717

Checking Convex Hull: Big Trick

• Trail contains key facts discovered by algorithm
  – Containment triangles

• Certifier checks main invariant
  – Convexity of hull

CS717

Outline








CS717

Goals

• Aimed at implementation errors
• Given implementation P of function f
  – Testing: must see if P(x)=f(x) for most inputs
    • If correct for all inputs, output PASS
    • If incorrect on too many inputs x, output FAIL
  – Correcting: if ∃ inputs x for which P(x)≠f(x) then
    • Figure out correct value f(x)
    • Assuming P correct on most inputs

• Tester and corrector may call P

CS717

Added Reliability

• Tester and corrector usually simpler than P
  – Use simple additions, etc.
  – Smaller probability of bugs
  – Smaller running time ⇒ smaller prob of random fault

• Tester and corrector run faster than P
  (Counting calls to P as constant time)

• Thus, very likely different implementations
  – Small probability of same bug in tester/corrector and P

CS717

Added Reliability

• One tester/corrector pair per function
  – Reusable across implementations

• Can design testers/correctors for specific libraries
  – Applications using lots of library code (ex: Java, C#) mostly covered

• Thus, can spend more time debugging

• Bottom Line:
  – P may be faulty
  – Tester/Corrector assumed reliable

CS717

General Technique: Testing

• Will test linearity property
  – Shared by many functions
  – f(x+y) = f(x) + f(y)

• Will show:
  – If linearity holds for implementation P on most inputs
  – Then ∃ linear function g s.t. P(x)=g(x) for most inputs x
    (i.e. if P mostly linear then close to actually linear function)
  – Since P(x)=f(x) on many inputs, g = f

CS717

General Technique: Correcting

• Focus on “random self-reducible” functions

• Can express f(x) in terms of f(y1), f(y2), … f(yk) where y1, y2, … yk random
  – Assumes simple reduction function R()

• To get f(x), run several times:
  – Pick random y1, y2, … yk
  – Compute candidate answer R(P(y1), P(y2), … P(yk))

• If P(x) wrong on input x, answer now correct with high probability

CS717

True Definitions

• error(f, P, D) : probability that P(x)≠f(x) with x randomly chosen via distribution D

• Probabilistic oracle program
  – Probabilistic program M
  – Makes calls to external oracle program A
    • Calls counted as constant time
  – Syntax: M^A

CS717

True Definition: Self-Tester

• Let 0 ≤ ε1 < ε2 ≤ 1

• Confidence parameter β > 0

• (ε1, ε2)-self-testing program for f
  – Probabilistic oracle program Tf
  – If error(f, P, D) ≤ ε1 then Tf^P returns PASS with Prob ≥ 1-β
  – If error(f, P, D) ≥ ε2 then Tf^P returns FAIL with Prob ≥ 1-β

[Figure: error(f, P, D) axis from 0 to 1; PASS region below ε1, FAIL region above ε2]

CS717

True Definition: Self-Corrector

• Let 0 ≤ ε < 1
• Confidence parameter β > 0

• ε-self-correcting program for f
  – Probabilistic oracle program Cf
  – If error(f, P, D) ≤ ε then Cf^P(x) = f(x) with Prob ≥ 1-β
    • x randomly selected from distribution D

CS717

Self-Tester/Corrector Pair

• Let 0 ≤ ε1 < ε2 ≤ 1

• Confidence parameter β > 0

• Tf: (ε1, ε2)-self-tester for f

• Cf: ε-self-corrector for f

• Cf applied when Tf doesn’t FAIL

[Figure: error(f, P, D) axis from 0 to 1; NOT FAIL region on the low-error side, FAIL region above ε2]

CS717

Outline

• Self-correctors
  – Assume existence of self-testers
  – Simpler to prove correctness

• Self-Testers
  – Correctness harder to prove
  – Requires a bit of group theory

CS717

Self-Correctors

• Assume that for implementation P, error(f, P, D) ≤ ε known
  – Determined by self-tester
  – ε constant

• Assume function f self-reducible
  – f(x) = R(f(y1), f(y2), … f(yk))
  – y1, y2, … yk random
    • Not independent of each other

CS717

General Self-Corrector

• For i = 1 to q
  – Pick random y1, y2, … yk
  – answer_i = R(P(y1), P(y2), … P(yk))

• Output majority answer out of answer_1 … answer_q

• Note: For any call to P we ensure output in correct range

CS717

Improving Correctness

• Before: P(x)≠f(x) for some x’s
  – P(x)≠f(x) on constant fraction ε of inputs x

• Now:
  – Each iteration has prob of failure ≤ kε
    • Prob[R(P(x1), P(x2), …, P(xk)) ≠ R(f(x1), f(x2), …, f(xk)) = f(x)] ≤ kε
    • Since k calls to P
  – Goal: compute correct f(x) with prob ≥ 1-β
    • Each self-reduction is independent
    • Thus, error probability goes down exponentially
    • Can get 1-β success prob after “several” repeats
    • “Several” ≈ ln(1/β)

CS717

Self-Reducibility of Mod

• Want to compute f(x) = (x mod R)
  – R fixed
  – x ∈ [0 … R·2^n − 1]

• Self-reducible:
  – (x mod R) = ((x1 mod R) + (x2 mod R)) mod R
  – x1 picked uniformly at random
  – x2 picked s.t. x = x1 + x2 (mod R·2^n)

• Since ([x1+x2] mod R) = ((x1 mod R) + (x2 mod R)) mod R

CS717

Self-Corrector for Mod

• For i = 1 to q
  – Pick random x1
  – Pick x2 s.t. x = x1 + x2
  – answer_i = (P(x1) + P(x2)) mod R

• Return majority among answer_1 … answer_q
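A minimal Python sketch of this corrector follows. The function name, the default q = 25 and the faulty implementation used in the example are assumptions made for illustration; inputs are taken from [0, R·2^n) and x2 is chosen modulo R·2^n so that it stays in range.

```python
import random
from collections import Counter

def self_correct_mod(P, x, R, n, q=25, rng=None):
    """Return the majority vote for x mod R, using the possibly faulty program P."""
    rng = rng or random.Random()
    N = R * 2**n                          # size of the input domain
    answers = []
    for _ in range(q):
        x1 = rng.randrange(N)             # uniformly random split point
        x2 = (x - x1) % N                 # x = x1 + x2 (mod N), N multiple of R
        answers.append((P(x1) + P(x2)) % R)
    return Counter(answers).most_common(1)[0][0]

def faulty_mod7(x):
    """Hypothetical buggy program: wrong on every multiple of 64."""
    return (x % 7 + 1) % 7 if x % 64 == 0 else x % 7
```

Here `faulty_mod7(64)` is wrong, but each random split hits a bad input with probability at most 2/64, so the majority over 25 splits recovers 64 mod 7 = 1 with overwhelming probability.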

CS717

Self-Reducibility of Modular Mult

• Want to compute f(x,y) = xy mod R
  – R fixed
  – x, y ∈ [0 … 2^n − 1]

• Self-Reducible
  – xy mod R = ((x1y1 mod R) + (x1y2 mod R) + (x2y1 mod R) + (x2y2 mod R)) mod R
  – x1, y1 picked uniformly at random
  – x2, y2 picked s.t. x = x1 + x2, y = y1 + y2

• Since x1y1 + x1y2 + x2y1 + x2y2 = x1(y1 + y2) + x2(y1 + y2) = (x1+x2)(y1 + y2) = xy

CS717

Self-Corrector for Modular Mult

• For i = 1 to q
  – Pick random x1, y1
  – Pick x2 s.t. x=x1+x2
  – Pick y2 s.t. y=y1+y2
  – answer_i = P(x1,y1) + P(x1,y2) + P(x2,y1) + P(x2,y2)

• Return majority among answer_1 … answer_q

• Corrector for integer multiplication similar

CS717

Self-Reducibility of Matrix Mult

• Want to compute f(A,B) = AB
  – A, B matrices

• Self-Reducible
  – AB = A1B1 + A1B2 + A2B1 + A2B2
  – A1, B1 picked uniformly at random
  – A2 = A – A1, B2 = B – B1

• Since A1B1 + A1B2 + A2B1 + A2B2 = A1(B1 + B2) + A2(B1 + B2) = (A1+A2)(B1 + B2) = AB

CS717

Self-Corrector for Matrix Mult

• For i = 1 to q
  – Pick random A1, B1
  – Pick A2 s.t. A2 = A – A1
  – Pick B2 s.t. B2 = B – B1
  – answer_i = P(A1,B1) + P(A1,B2) + P(A2,B1) + P(A2,B2)

• Return majority among answer_1 … answer_q

• Polynomial multiplication works same way

CS717

Self-Correctors Summary

• Can design self-corrector for any random self-reducible function
  – Unknown how large this set is

• Self-Correctors very simple
  – Finite bound loop
  – Simple reduction function

• Thus, low probability of error in self-corrector

• Number of calls to P increases by log factor

CS717


Moving to Self-Testing

• Can now self-correct
  – Self-correctors only work if error(f, P, D) ≤ ε, ε constant

• But how can we know?

• Self-testing tells us

CS717

Self-Testing

• Given implementation P of function f
• Want to know error(f, P, D) = Prob[P(x)≠f(x)] for x chosen with distribution D
  – Assume P(x) = f(x) for many x’s

• Will present self-testers for linear functions
• Linear: f(x1+x2) = f(x1) + f(x2)

• Tests will bound error(f, P, D)

CS717

Linearity

• Mod
    f(x1+x2) = (x1+x2) mod R = (x1 mod R) + (x2 mod R)   (sum taken mod R)

• Modular Mult
    f(x1+x2) = ((x1+x2)y) mod R = (x1y mod R) + (x2y mod R)   (sum taken mod R)

...

CS717

Self-Testing Technique

• Central Claim: If P(x) is linear in many spots, then ∃ linear function g s.t. low prob of P(x)≠g(x)

• If P mostly linear then P very close to actually linear function

• Since assumed: for most x, P(x) = f(x)
  Then: the linear function P is close to is f()

CS717


• Thus self-tester repeatedly:
  – Picks random inputs
  – Verifies P(x1+x2) = P(x1) + P(x2)
  – Verifies P(x1+1) = P(x1) + 1
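For the mod function, the two tests above might look like the sketch below. The function name, the number of trials and the PASS/FAIL strings are illustrative assumptions; this version fails on the first violation, whereas a real self-tester would estimate the failure rate and compare it with a threshold.

```python
import random

def self_test_mod(P, R, n, trials=200, rng=None):
    """Run the linearity and neighbor tests on program P for f(x) = x mod R."""
    rng = rng or random.Random()
    N = R * 2**n                                  # inputs live in [0, N)
    for _ in range(trials):
        x1, x2 = rng.randrange(N), rng.randrange(N)
        # Linearity test: P(x1 + x2) = P(x1) + P(x2), sums taken mod N and mod R.
        if P((x1 + x2) % N) != (P(x1) + P(x2)) % R:
            return "FAIL"
        # Neighbor test: P(x1 + 1) = P(x1) + 1.
        if P((x1 + 1) % N) != (P(x1) + 1) % R:
            return "FAIL"
    return "PASS"
```

A correct `lambda x: x % 7` always passes. The program `lambda x: (2 * x) % 7` is perfectly linear, so it survives the linearity test, but it is the wrong linear function and the neighbor test rejects it.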

CS717

Intro to Group Theory

• To explain why this works, must introduce group theory

• Group: set of numbers with operation: <S, ∘>
  – Set: S
  – Binary Operation: ∘
    • Given two group elements, produces another group element

• Examples
  – Infinite group: Integers with ∘ = +
  – Finite group: Even integers [0…98], with ∘ = + mod 100

CS717

Identity Element

• Special element 0: ∀h∈S. 0 ∘ h = h
  – Called “Identity Element”

• Ex: Nonzero rationals, with ∘ = ×
  – x × y = another rational
  – 1 is identity since 1 × x = x

• Inverse: x^-1 s.t. x ∘ x^-1 = 0

CS717

Group Generators

• Powers: h^n = h ∘ h ∘ h … ∘ h (n times)
  – a^n ∘ a^m = a^(n+m)

• Each group has elements g1,…,gc s.t.
  ∀h∈S. ∃ n1, n2, …, nc. $h = g_1^{n_1} \circ g_2^{n_2} \circ \cdots \circ g_c^{n_c}$

• Thus, all elements expressible in terms of g1,…gc
  – They “generate” the group

CS717

Subgroups

• Given group <S, ∘>
• Subgroup: <T, ∘> if T ⊆ S and <T, ∘> is itself a group
• Ex:
  – S = Even Integers [0…98], ∘ = + mod 100
  – T = Multiples of 4 [0…98]

• Each subgroup has own generators
  – <S, ∘> generated by 2
  – <T, ∘> generated by 4

• T = {0} always subgroup

CS717

Subgroups Example

• Let G = <integers [0…14], + mod 15>
• Group generator: 1
• Subgroups:
  – {0}: generated by 0
  – {0, 3, 6, 9, 12}: generated by 3
  – {0, 5, 10}: generated by 5

• All subgroups generated by some power of 1
  – In general, every subgroup generated by some combo of group generators

CS717

Functions on Groups

• ∘ is function from group to itself
• Can define others
  – Ex: given additive group, can define multiplication

• Functions from one group to another
  – G1: even integers [0...98]
  – G2: powers of 4 [0…186]
  – f : G1 → G2
    • f(x∈G1) = (x2) ∈ G2

CS717

Homomorphisms

• Homomorphism: function on groups s.t.
  – f : GA → GB
    • GA = <SA, ∘A>
    • GB = <SB, ∘B>
  – f(x1 ∘A x2) = f(x1) ∘B f(x2)
  – i.e. Linear function on groups

• Example: modular exponentiation
  – GA = Integers [0...R2^n], with +
  – GB = Integers [0...R], with ×
  – fa(x) = a^x mod R
  – fa(x1+x2) = a^(x1+x2) mod R = (a^x1 mod R) × (a^x2 mod R) = fa(x1) × fa(x2)

CS717

Homomorphisms and Subgroups

• Fact: homomorphisms map subgroups to subgroups

• f: G1 → G2

• ∀x∈H1, H1 subgroup of G1: the values f(x) are all the members of some H2, subgroup of G2

[Figure: homomorphism mapping a subgroup of G1 onto a subgroup of G2]

CS717


• Given implementation P of function f : GA → GB

• Linear Test:
    For i=1 to n
        Randomly pick x1, x2
        Check if $P(x_1 \circ_A x_2) = P(x_1) \circ_B P(x_2)$

• If GB has no finite subgroups (besides {0B}) then done

CS717


• Given implementation P of function f : GA → GB

• If GB has finite subgroups (besides {0B})
  Neighbor Test:
    For i=1 to n
        For each generator gi of group GA
            Randomly pick z
            Check if $P(z \circ_A g_i) = P(z) \circ_B F^i_{neighbor}(z)$
  – $F^i_{neighbor}(z)$ is f(z)’s neighbor in target subgroup of GB

CS717

Proof

• Method:
  – Assume that linearity and neighbor tests passed
    • i.e. error for tests bounded by constant
  – Prove that P mostly equal to some linear g()

• Steps
  – Define discrepancy function disc(x) = g(x) ∘ P^-1(x)
    • g() some linear function, operating on same groups as f()
    • P(x)=g(x) for most x ⇔ disc(x)=0 for most x
  – disc(x) is homomorphism ⇔ linearity test passes for x
    • Thus, if linearity test passed on many x’s, disc is probably homomorphism

CS717

Proof

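• f() maps finite group to (possibly infinite) group
  – Mod: $f : \mathbb{Z}_{R2^n} \to \mathbb{Z}_R$
  – Integer Mult: $f : \mathbb{Z}_{2^n} \times \mathbb{Z}_{2^n} \to \mathbb{Z}$
  – Modular Mult: $f : \mathbb{Z}_{2^n} \times \mathbb{Z}_{2^n} \to \mathbb{Z}_R$
  – …

• g() and disc() map same groups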

CS717

Proof

• Recall: homomorphisms map subgroups to subgroups
  – Thus, homomorphism disc() can only map domain group to finite subgroup of range group

• Question: Does range group have finite subgroups?
  – (besides {0B})

CS717

Proof

• Range group: no finite subgroups besides {0B}

• Then disc must map domain group to {0B}

• disc(x) = 0B
  – (On most x’s, since disc probably homomorphism)

[Figure: homomorphism from finite group into infinite group, landing on {0}]

CS717

Proof

• Range group: has real finite subgroups
• Each subgroup must have generators
  – Those generators are products of the group’s generators

• Neighbor test: for each generator gi, check $P(z \circ_A g_i) = P(z) \circ_B F^i_{neighbor}(z)$

• Ensures any such subgroup must be {0B}

CS717

Proof

• Thus, disc(x) = 0B (with high probability)

• Recall: disc(x) = g(x) ∘ P^-1(x)
• Thus, P(x) = g(x) (with high probability)
  – g() still undefined linear function

• Assumption: P(x) = f(x) on many inputs
• Fact: if two linear functions equal on >1 point, they are same function
• Thus, g = f and P(x)=f(x) (with high probability)

CS717

Summary of Self-Testers

• Can verify if P(x)=f(x) with high probability
  – Useful information to know
  – Required for self-correctors to work reliably

• Simple implementation
  – Constant bound loop
  – A few additions, multiplications, etc.

• Can be self-corrected themselves

CS717

Summary

• Paper presents the basics of broader theory of self-testing/-correcting

• Some additional related papers:
  – S. Ravikumar and D. Sivakumar. "Efficient Self-Testing of Linear Recurrences".
    • Expands these testers to test linearity in bulk fashion for linear recurrences
  – Ronitt Rubinfeld and Madhu Sudan. "Robust Characterizations of Polynomials with Applications to Program Testing", SIAM Journal on Computing, 1996.
  – Funda Ergun, S. Ravi Kumar and Ronitt Rubinfeld. "Checking Approximate Computations of Polynomials and Functional Equations", IEEE Conference on Foundations of Computer Science, 1996.

Date post:	01-Jan-2016