Post on 21-Dec-2015
transcript
Administrivia
Me: Ryan O’Donnell; email: odonnell@cs.cmu.edu
Office hours: Wean 7121, by appointment
Web site: http://www.cs.cmu.edu/~odonnell/boolean-analysis
Mailing list: Please sign up! Instructions on web page.
Blog: http://boolean-analysis.blogspot.com
Evaluation:
• About 5 problem sets.
• 2 or 2.5 sets of scribe notes, graded (worth the same as a problem set).
x     f(x)
0000   0
0001   1
0010   1
0011   1
0100   0
0101   1
0110   1
0111   1
1000   0
1001   1
1010   1
1011   1
1100   1
1101   1
1110   1
1111   1
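In code, such an f is literally nothing more than its truth table; a minimal Python sketch of the example above (the names `TABLE` and `f` are illustrative):

```python
# A boolean function f : {0,1}^4 -> {0,1} is exactly its 16-row truth
# table.  Store the table as a list indexed by the integer whose binary
# expansion is the input string x.
TABLE = [0, 1, 1, 1,   # rows 0000 .. 0011
         0, 1, 1, 1,   # rows 0100 .. 0111
         0, 1, 1, 1,   # rows 1000 .. 1011
         1, 1, 1, 1]   # rows 1100 .. 1111

def f(x):
    """Evaluate f on a 4-bit input given as a tuple of bits."""
    index = int("".join(map(str, x)), 2)
    return TABLE[index]

print(f((0, 0, 0, 0)))   # row 0000 -> 0
print(f((0, 1, 1, 1)))   # row 0111 -> 1
```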
What: Truth Table
To whom: Complexity theorists,
Circuit designers
What: Subset of the Discrete Cube (with Hamming Distance)
To whom: Geometers of the Cube: combinatorialists, coding theorists, metric space types
What: “Concept”
To whom: Machine Learning theorists
Objects
n "features"

Example message:
  From: Tami Curran <alexsoft@gmail.com>
  To: <cs-251-staff@cs.cmu.edu>
  Date: Nov 8 2006 - 12:55pm
  "Take that! Visit our new online pharmacy store and save up to 80%. Only we offer: All popular drugs are available (Viagra, Cialis, Levitra and much much more), World Wide Shipping, No Doctor Visits, No Prescriptions, 100% ... CLICK TO FIND OUT ABOUT MORE SPECIAL OFFERS AND VISIT OUR NEW ONLINE PHARMACY STORE"
Feature       value
"Viagra"        1
"Cialis"        1
"Levitra"       1
".com.ng"       0
"Credit"        0
"Mortgage"      0
"Lottery"       0
ALL CAPS        1

f(features of message) = SPAM / NOT-SPAM
What: Set System
To whom: Combinatorialists, extremal & algebraic
An n-element "universe"; each input is a set X ⊆ [n], so f indicates a collection of subsets:
a "Set System", or "Hypergraph", or "Simplicial Complex" (if f is monotone).
What: Graph Property
To whom: Statistical physicists, Probabilists, Random k-SAT-ers
Input: a graph with n "potential" edges (each bit says whether an edge is present).
f: a property of graphs; e.g., percolation (left-right crossing).
Also good for:
Ising Model
Erdős-Rényi random graph model
Random k-SAT satisfiability (for k-reg. hypergraphs)
What: Voting Scheme / Social Choice
To whom: Econometricians, political scientists
n voters; x = the votes (0 = one candidate, 1 = the other); f(x) = the winner.
Examples: majority, electoral college, dictatorship.
What: Set of integers
To whom: Number theorists, additive combinatorialists
• "How dense a set do you need to guarantee an arithmetic progression of length k?"
• "Suppose f indicates the primes; is there a nontrivial solution to f(x) f(x+a) f(x+2a) = 1?"
“Fourier / Harmonic Analysis of Boolean Functions”
= a set of techniques for studying structural properties of boolean functions.
What does it mean for f to be…
• “simple”
• “fair”
• “symmetric”
• “spread out / concentrated”
• “pseudo- or quasi-random”
• “low-degree” ?
“Juntas”
Definition: f : {0,1}^n → {0,1} is called an r-junta
if f actually depends only on some subset of r out of the n coordinates.
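A brute-force junta check is a direct translation of this definition; in the following Python sketch (the example function `g` and helper names are illustrative), a coordinate is relevant iff flipping it changes f on some input:

```python
from itertools import product

def relevant_coordinates(f, n):
    """Coordinates i such that flipping bit i changes f on some input."""
    relevant = set()
    for x in product((0, 1), repeat=n):
        for i in range(n):
            y = list(x)
            y[i] ^= 1                      # flip the ith bit
            if f(x) != f(tuple(y)):
                relevant.add(i)
    return relevant

def is_r_junta(f, n, r):
    """f is an r-junta iff it has at most r relevant coordinates."""
    return len(relevant_coordinates(f, n)) <= r

# g depends only on coordinates 0 and 2, so it is a 2-junta of 5 variables.
g = lambda x: x[0] & x[2]
print(is_r_junta(g, 5, 2))   # True
print(is_r_junta(g, 5, 1))   # False
```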
Fourier Analysis
– Temperature
As t → ∞: temperature changes according to the Heat Equation, a differential equation.
Basic solutions: 1, sin(2x), cos(2x), sin(4x), cos(4x), sin(6x), …
Every (nice enough) f is expressible as a linear combination of these "frequencies".
Fourier Analysis of Boolean Functions
Basic solutions: Parity (XOR) functions on the 2^n subsets of coordinates.
Every f : {0,1}^n → ℝ is expressible as a linear combination of these "frequencies":
the Fourier expansion of f, with the Fourier coefficients of f.
– Displacement?
As t → ∞: changes via a "Diffusion" differential equation.
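For small n these frequencies are easy to compute by brute force; a Python sketch (illustrative helper names) that expands f over the parity basis, using the ±1 encoding F(x) = (−1)^f(x):

```python
from itertools import product

def fourier_coefficients(f, n):
    """Fourier coefficients of f : {0,1}^n -> {0,1}, in the +-1 encoding
    F(x) = (-1)^f(x).  Each parity chi_S(x) = (-1)^{sum_{i in S} x_i} is
    one 'frequency'; f_hat(S) = E_x[F(x) * chi_S(x)]."""
    coeffs = {}
    inputs = list(product((0, 1), repeat=n))
    for S in product((0, 1), repeat=n):          # S as an indicator vector
        total = 0
        for x in inputs:
            chi = (-1) ** sum(x[i] for i in range(n) if S[i])
            total += (-1) ** f(x) * chi
        coeffs[S] = total / len(inputs)
    return coeffs

# Example: f = x1 XOR x2 *is* the parity chi_{1,2}, so all its Fourier
# weight sits on S = {1,2}.
coeffs = fourier_coefficients(lambda x: x[0] ^ x[1], 2)
print(coeffs[(1, 1)])   # 1.0
print(coeffs[(0, 0)])   # 0.0
```

Note that the squared coefficients sum to 1 (Parseval), since F is ±1-valued.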
Hallmarks of Fourier Analysis
1. Uniform probability distribution on {0,1}n.
2. Discrete cube graph structure.
Energy
Definition: For f : {0,1}^n → {0,1},
  𝓔(f) = Σ_i Pr_x[ f(x) ≠ f(x with ith bit flipped) ]
is the average sensitivity, or edge-boundary (normalized), or total influence, or energy.
Energy
Highest energy f?   Parity on all bits / its negation: 𝓔 = n.
Lowest energy f?   Constants: 𝓔 = 0.
Lowest energy balanced f?   f(x) = x_i, or ¬x_i. ("Dictator")
Majority?
Random function?  ≈ n/2.
(Homework: f "balanced" ⇒ 𝓔(f) ≥ 1.)
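These answers can be checked mechanically for small n; a brute-force Python sketch of the energy computation (function names are illustrative):

```python
from itertools import product

def energy(f, n):
    """Total influence / energy: E(f) = sum_i Pr_x[f(x) != f(x^{flip i})]."""
    inputs = list(product((0, 1), repeat=n))
    total = 0.0
    for i in range(n):
        flips = sum(f(x) != f(x[:i] + (1 - x[i],) + x[i + 1:]) for x in inputs)
        total += flips / len(inputs)
    return total

n = 4
parity   = lambda x: sum(x) % 2
constant = lambda x: 0
dictator = lambda x: x[0]
# Matches the slide: parity has energy n, constants 0, a dictator 1.
print(energy(parity, n), energy(constant, n), energy(dictator, n))
```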
Connection to Circuit Complexity
Theorem: [Linial-Mansour-Nisan + Håstad]
If f is computable by a circuit of size S and depth D, then
  𝓔(f) ≤ O(log^(D−1) S).
In particular, f ∈ AC0 ⇒ 𝓔(f) ≤ polylog(n).
Hence:
• Parity ∉ AC0.  Majority ∉ AC0.
• Pseudorandom function generators ∉ AC0.
Lowest Possible Energy
Lowest energy balanced function that “depends essentially on all n inputs”?
Example ("Tribes"): an OR (∨) of ≈ n / log₂ n ANDs (∧), each on a disjoint block of ≈ log₂ n − Θ(log log n) bits.
  𝓔(Tribes) = Θ(log n)

Friedgut's Theorem: For all f : {0,1}^n → {0,1} and all ε > 0,
f is ε-close to a 2^(O(𝓔(f)/ε))-junta.
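A minimal Python sketch of the Tribes function; here the tribe width w is left as an explicit parameter rather than the balanced ≈ log₂ n − Θ(log log n) tuning from the slide:

```python
def tribes(x, w):
    """Tribes: an OR of ANDs ('tribes') of width w over disjoint blocks.
    (On the slide w is tuned so the function is near-balanced; here it is
    just a parameter for illustration.)"""
    blocks = [x[i:i + w] for i in range(0, len(x), w)]
    return int(any(all(block) for block in blocks))

# Tribes outputs 1 iff some tribe votes unanimously 1.
print(tribes((1, 1, 0, 1, 0, 0), 2))   # tribe (1,1) is all-ones -> 1
print(tribes((1, 0, 0, 1, 0, 1), 2))   # no all-ones tribe -> 0
```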
Influences
Definition: The influence of the ith coordinate on f is
  Inf_i(f) = Pr_x[ f(x) ≠ f(x^(⊕i)) ],
where x^(⊕i) is x with its ith bit flipped, and x is uniform (the "Impartial Culture" (IC) assumption).
I.e., the probability the ith voter is a "swing voter".
AKA Banzhaf Power Index.
Proposition: 𝓔(f) = Σ_i Inf_i(f)
Influences
For a fair voting scheme, do you want influences large or small?
Inf_i(Parity) = 1.
Inf_i(x_j) = 1 if i = j, 0 else.
Inf_i(Majority_n) = ?   Inf_i(Tribes_n) = ?
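Individual influences can be computed the same brute-force way; a Python sketch checking the Parity and Dictator rows above (names illustrative):

```python
from itertools import product

def influence(f, n, i):
    """Inf_i(f) = Pr_x[f(x) != f(x with bit i flipped)]: under the
    Impartial Culture (uniform) assumption, the probability that voter i
    is a swing voter."""
    inputs = list(product((0, 1), repeat=n))
    return sum(f(x) != f(x[:i] + (1 - x[i],) + x[i + 1:])
               for x in inputs) / len(inputs)

n = 3
parity   = lambda x: sum(x) % 2
dictator = lambda x: x[1]          # "x_j" with j = 1
print([influence(parity, n, i) for i in range(n)])    # [1.0, 1.0, 1.0]
print([influence(dictator, n, i) for i in range(n)])  # [0.0, 1.0, 0.0]
```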
Influential Coalitions
Theorem: [Kahn-Kalai-Linial]
If f : {0,1}^n → {0,1} is any balanced voting scheme,
at least one candidate can bribe an o(1) fraction of the voters
and win with probability 1 − o(1).
Corollary of:
KKL Theorem: For every balanced f, there is an i with Inf_i(f) ≥ Ω(log n / n).
After collecting o(n) voters, they control the outcome with probability 1 − o(1). (Both theorems sharp: Tribes.)
Miscounted Votes
Definition: The noise sensitivity of f at ε is
  NS_ε(f) = Pr[ f(x) ≠ f(y) ],  where y = flip each bit of x indep. w.p. ε.
Aside: In the diffusion process, time t corresponds to noise rate ε = ½ − ½·exp(−t).
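Noise sensitivity is easy to estimate by Monte Carlo; a Python sketch (the trial count and seed are arbitrary choices) checking that NS_ε(Dictator) = ε:

```python
import random

def noise_sensitivity(f, n, eps, trials=20000):
    """Monte Carlo estimate of NS_eps(f) = Pr[f(x) != f(y)], where x is
    uniform and y flips each bit of x independently with probability eps."""
    disagreements = 0
    for _ in range(trials):
        x = [random.randint(0, 1) for _ in range(n)]
        y = [b ^ (random.random() < eps) for b in x]   # eps-noisy copy
        disagreements += f(x) != f(y)
    return disagreements / trials

random.seed(0)
dictator = lambda x: x[0]
# NS_eps(Dictator) = eps exactly, so the estimate should be close to 0.1.
est = noise_sensitivity(dictator, 10, 0.1)
print(est)
```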
The Best Scheme Against Miscounts
Theorems: For all 0 ≤ ε ≤ ½:
  NS_ε(Dictator) = ε.
  NS_ε(Majority_n) → arccos(1 − 2ε)/π   as n → ∞.
  NS_ε(ElectoralCollege) ≈ ?   as n → ∞.
  NS_ε(Tribes_n) → ½   as n → ∞.
Majority Is Stablest Theorem:
If f is balanced and
  NS_ε(f) ≤ arccos(1 − 2ε)/π − δ,
then Inf_i(f) ≥ τ(ε, δ) > 0 for at least one i.
Applications to P vs. NP
Q: Is it possible that for every language L in NP,
there is a poly-size family of circuits computing L on
100% of all inputs (of length n, for each n)?
A: No, assuming NP ⊄ P/poly.
What about 99%?
What about 75%?
What about 51%?
How hard is NP on average?
Avg.-case NP: Slightly hard ⇒ Very hard
Say f ∈ NP, balanced, and "slightly hard": the best poly-size circuit is 99% right.
Impagliazzo's Hard Core Theorem: ∃ H ⊂ {0,1}^n of size 2% · 2^n
such that no poly-size circuit can compute f on a ≥ (½ + negl.) fraction of H.
Let F : {0,1}^(10^6 · n) → {0,1} be Tribes_(10^6) applied to the values of f on 10^6 disjoint n-bit blocks; F ∈ NP. (Why?)
On a typical input to F, about 2% · 10^6 of the f-inputs come from H.
NS_2%(Tribes_(10^6)) ≈ 49%
Theorem: ⇒ F is not 51%-computable by poly-circuits.
The Opposite of Pseudorandom
Given f ’s value on M random points, can you predict f at other points?
One idea: Take some weighted majority of known f-values, based on Hamming distance.
Can this work with M ≪ 2^n ?
Examples: [table of M random labeled pairs (x_i, f(x_i)): bit strings with their f-values]
Predict: f(00010101) = ?
“Learning f
(from random examples)”
Learning from Random Examples
Works if f has "long-range correlations", e.g., small 𝓔(f) or small noise sensitivity.
LMN Algorithm: This will work (using an appropriate weighted majority) if
  M ≥ n^(O(𝓔(f))).
E.g., depth-D, poly-size circuits are predictable after only n^(polylog(n)) examples.
A similar theorem exists for functions with small NS_ε.
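A toy version of such a predictor can be sketched as follows: estimate all Fourier coefficients of degree ≤ d from the random examples, then predict with the sign of the estimated low-degree part. This is an illustration in the LMN spirit, not the algorithm as analyzed; the target function and parameters are arbitrary choices.

```python
import random
from itertools import combinations, product

def low_degree_learn(examples, n, d):
    """Estimate F_hat(S) for all |S| <= d from labeled examples (x, f(x)),
    using the +-1 encoding F(x) = (-1)^f(x), and predict with the sign of
    the resulting low-degree approximation."""
    sets = [S for k in range(d + 1) for S in combinations(range(n), k)]
    est = {}
    for S in sets:
        vals = [(-1) ** (fx + sum(x[i] for i in S)) for x, fx in examples]
        est[S] = sum(vals) / len(vals)          # empirical E[F(x) chi_S(x)]

    def predict(x):
        approx = sum(est[S] * (-1) ** sum(x[i] for i in S) for S in sets)
        return 0 if approx >= 0 else 1          # sign of F gives the bit
    return predict

random.seed(1)
n = 6
target = lambda x: x[0] & x[1]                  # an AND: exactly degree-2 spectrum
examples = []
for _ in range(500):
    x = tuple(random.randint(0, 1) for _ in range(n))
    examples.append((x, target(x)))

predict = low_degree_learn(examples, n, d=2)
accuracy = sum(predict(x) == target(x)
               for x in product((0, 1), repeat=n)) / 2 ** n
print(accuracy)
```

Since the AND target has all its Fourier weight at degree ≤ 2, the degree-2 estimate should predict it near-perfectly once the coefficient estimates are accurate.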
Learning with Queries
Goldreich–Levin Theorem: From any "one-way function" g : {0,1}^n → {0,1}^n,
one can produce a "hard-core predicate" f : {0,1}^(2n) → {0,1}.
Proof by contraposition: it gives a learning algorithm, using queries, for
learning f's large Fourier coefficients.
GL algorithm put to positive use in Learning Theory:
Theorem: [Mansour] Poly-size DNFs (depth-2 circuits) are learnable with queries
in time n^(O(log log n)), via Fourier techniques.
Jackson’s Theorem: Improved to poly time & queries, by adding an ML technique.
Quasirandomness
Fix a small set of simple statistical tests; quasirandom if you pass all of them.
For graphs: Graph G with edge density p is quasirandom if,
for each O(1)-size graph H,
G has roughly the "expected" number of copies of H.
For boolean functions: Function f with E[ f ] = p is quasirandom if,
(one weak possible notion) for each O(1)-junta h : {0,1}^n → {0,1},
f has roughly 0 correlation with h.
(I.e., given h(x), you’d still guess p for Pr[ f(x) = 1].)
Quasirandomness & “Tests”
Håstad's Test:
● Pick x ~ {0,1}^n uniformly.
● Pick y ~ {0,1}^n uniformly.
● Set z = x ⊕ y.
● Set w ~ z: flip each bit of z independently w.p. ε.
● Test whether f(x) ⊕ f(y) ⊕ f(w) = 0.

f balanced and random: would pass with probability ≈ ½.
f a Dictator: would pass with probability 1 − ε.
Theorem: If f is balanced and quasirandom, it passes the test with probability ≤ ½ + o(1).
Almost the canonical Fourier Analysis problem; where we'll start.

Example:
x = 101000011011111
y = 000100000101110
z = x ⊕ y = 101100011110001
w =         001100010100000   (a noisy copy of z)
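The test is easy to simulate; a Python sketch comparing a Dictator against a random truth table (parameters and seed are arbitrary choices):

```python
import random

def hastad_test_rate(f, n, eps, trials=20000):
    """Estimate the pass probability of Håstad's test: pick x, y uniform,
    z = x XOR y, w = eps-noisy copy of z; pass iff f(x)^f(y)^f(w) == 0."""
    passes = 0
    for _ in range(trials):
        x = [random.randint(0, 1) for _ in range(n)]
        y = [random.randint(0, 1) for _ in range(n)]
        z = [a ^ b for a, b in zip(x, y)]
        w = [b ^ (random.random() < eps) for b in z]
        passes += (f(x) ^ f(y) ^ f(w)) == 0
    return passes / trials

random.seed(0)
n, eps = 8, 0.05
dictator = lambda x: x[0]
# A random truth table stands in for a "balanced and random" f.
table = [random.randint(0, 1) for _ in range(2 ** n)]
rand_f = lambda x: table[int("".join(map(str, x)), 2)]

dict_rate = hastad_test_rate(dictator, n, eps)
rand_rate = hastad_test_rate(rand_f, n, eps)
print(dict_rate, rand_rate)   # ~0.95 (= 1 - eps) vs ~0.5
```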
Håstad’s “Hardness of Approximation”
Corollary: [Håstad’s Test + “PCP machinery”]
Given a system of 3-variable linear equations mod 2, e.g.,
x1 ⊕ x3 ⊕ x7 = 0
x2 ⊕ x4 ⊕ x7 = 1
x1 ⊕ x5 ⊕ x6 = 0
x6 ⊕ x8 ⊕ x9 = 0
which is 99%-satisfiable, no efficient algorithm can find
a solution satisfying 51% of the equations. (Unless P = NP.)
Proof Idea
The test yields an NP-hardness gadget for reductions:
m-coloring graphs → 3-variable mod-2 equations.
Each vertex gets a block of 2^m variables, x_000, x_001, …
Blocks are 99%-satisfiable because of Dictators; these "encode" the m colors.
Håstad's Test Theorem:
any f satisfying ≥ 51% of a block is noticeably correlated with O(1) coordinates,
⇒ "decodable" to O(1) Dictators/colors.
E.g., prob .05 of testing f(000) ⊕ f(010) ⊕ f(011) = 0
→ equation of weight .05: x_000 ⊕ x_010 ⊕ x_011 = 0.