Date post: 18-Aug-2020
Hypothesis Testing for Structured Probability Distributions Ilias Diakonikolas USC Joint work with Daniel Kane (UCSD) Vladimir Nikishkin (Edinburgh)
Page 1: Hypothesis Testing for Structured Probability Distributions2016.highlightsofalgorithms.org/wp-content/uploads/... · Unified Framework for Identity Testing: Leads to sample-optimal

Hypothesis Testing for Structured Probability Distributions

Ilias Diakonikolas USC

Joint work with Daniel Kane (UCSD)

Vladimir Nikishkin (Edinburgh)


What this talk is about

Basic object of study: probability distributions over an ordered domain: $[n] = \{1, \ldots, n\}$ or $I = [a, b] \subseteq \mathbb{R}$.

Notation: $p, q$: either pmf or pdf.


Explaining the title:
•  Let $\mathcal{D}$ be a family of probability distributions.
•  Identity Testing Problem:
   −  Distinguish between the cases $p = q$ and $\mathrm{dist}(p, q) > \epsilon$.
   −  Minimize sample size, computation time.

[Figure: samples 1, 2, 2, 4, 3, … drawn from an unknown $p \in \mathcal{D}$; samples 2, 1, 2, 3, 1, … drawn from a known or unknown $q \in \mathcal{D}$.]

Total Variation Distance: $d_{TV}(p, q) = (1/2)\|p - q\|_1$
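As a small illustration of the distance used above, the following sketch computes the total variation distance between two pmfs over the same finite domain. The function name and the example vectors are illustrative assumptions, not part of the talk.

```python
def total_variation(p, q):
    """Total variation distance d_TV(p, q) = (1/2) * ||p - q||_1 for two pmfs
    given as equal-length sequences of probabilities."""
    if len(p) != len(q):
        raise ValueError("p and q must be defined over the same domain")
    return 0.5 * sum(abs(pi - qi) for pi, qi in zip(p, q))


# Hypothetical example over the domain [4].
p = [0.5, 0.25, 0.25, 0.0]
q = [0.25, 0.25, 0.25, 0.25]
print(total_variation(p, q))
```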


This Talk

Unified Framework for Identity Testing: leads to sample-optimal and computationally efficient estimators for a variety of structured distribution families (with matching information-theoretic lower bounds).


Outline

§  Introduction, Related and Prior Work
§  Framework Overview
§  Testing Identity to a Fixed Distribution
§  Testing Closeness between two Unknown Distributions
§  Future Directions and Concluding Remarks


Distribution Testing (Hypothesis Testing)

Given samples (observations) from one (or more) unknown probability distribution(s) (model), decide whether it satisfies a certain property.
•  Introduced by Karl Pearson (’99).
•  Classical problem in statistics [Neyman-Pearson’33, Lehmann-Romano’05].
•  Last twenty years (TCS): property testing [Goldreich-Ron’00, Batu et al. FOCS’00/JACM’13].


Related Work – Property Testing (I)

Focus has been on arbitrary distributions over support of size $n$.

Testing Identity to an explicitly known Distribution:
•  [Goldreich-Ron’00]: $O(\sqrt{n}/\epsilon^4)$ upper bound for uniformity testing (collision statistics).
•  [Batu et al., FOCS’01]: $O(\sqrt{n}) \cdot \mathrm{poly}(1/\epsilon)$ upper bound for testing identity to any known distribution.
•  [Paninski ’03]: upper bound of $O(\sqrt{n}/\epsilon^2)$ for uniformity testing, assuming $\epsilon = \Omega(n^{-1/4})$. Lower bound of $\Omega(\sqrt{n}/\epsilon^2)$.
•  [Valiant-Valiant, FOCS’14, D-Kane-Nikishkin, SODA’15]: upper bound of $O(\sqrt{n}/\epsilon^2)$ for identity testing to any known distribution.
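To make the "collision statistics" idea concrete, here is a minimal sketch of a collision-based uniformity test. It relies on the fact that the fraction of colliding sample pairs is an unbiased estimate of $\sum_i p_i^2$, which equals $1/n$ for the uniform distribution and is at least $(1 + \epsilon^2)/n$ when $\|p - U\|_1 \geq \epsilon$. The acceptance threshold $(1 + \epsilon^2/2)/n$, the function names, and the sample size used in the example are illustrative assumptions, not the constants from the cited papers.

```python
import random
from collections import Counter


def collision_rate(samples):
    """Fraction of unordered sample pairs that collide.
    This is an unbiased estimate of sum_i p_i^2."""
    m = len(samples)
    counts = Counter(samples)
    colliding_pairs = sum(c * (c - 1) // 2 for c in counts.values())
    return colliding_pairs / (m * (m - 1) / 2)


def looks_uniform(samples, n, eps):
    """Accept iff the collision rate is at most (1 + eps^2 / 2) / n.
    The midpoint threshold is an assumption made for this sketch."""
    return collision_rate(samples) <= (1 + eps ** 2 / 2) / n


# Hypothetical experiment: uniform over [n] versus a distribution at L1 distance eps.
rng = random.Random(0)
n, eps, m = 100, 0.5, 20000
uniform_samples = [rng.randrange(n) for _ in range(m)]
weights = [1 + eps] * (n // 2) + [1 - eps] * (n // 2)
far_samples = rng.choices(range(n), weights=weights, k=m)
print(looks_uniform(uniform_samples, n, eps), looks_uniform(far_samples, n, eps))
```

The example deliberately uses far more samples than the $O(\sqrt{n}/\epsilon^4)$ bound requires, so that the outcome does not depend on the random seed.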


Related Work – Property Testing (II)

Focus has been on arbitrary distributions over support of size $n$.

Testing Closeness between two unknown distributions:
•  [Batu et al., FOCS’00]: $O(n^{2/3} \log n / \epsilon^{8/3})$ upper bound for testing closeness between two unknown discrete distributions.
•  [P. Valiant, STOC’08]: lower bound of $\Omega(n^{2/3})$ for constant error.
•  [Chan-D-Valiant-Valiant, SODA’14]: tight upper bound and lower bound of $O(\max\{n^{2/3}/\epsilon^{4/3}, n^{1/2}/\epsilon^2\})$.


Summary of Related Work

Support size: $n$, total variation distance error: $\epsilon$.

•  Learning: tight bound $\Theta(n/\epsilon^2)$ [folklore].
•  Testing Identity: tight bound $\Theta(n^{1/2}/\epsilon^2)$ [Valiant-Valiant’14, D-Kane-Nikishkin’15].
•  Testing Closeness: tight bound $\Theta(\max\{n^{2/3}/\epsilon^{4/3}, n^{1/2}/\epsilon^2\})$ [Chan-D-Valiant-Valiant’14].
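To get a feel for how far apart these three rates are, the sketch below evaluates them for a concrete support size and error, ignoring the constants hidden in the $\Theta(\cdot)$ notation. The function names and the example values of $n$ and $\epsilon$ are illustrative assumptions.

```python
def learning_samples(n, eps):
    """Theta(n / eps^2), constants ignored."""
    return n / eps ** 2


def identity_samples(n, eps):
    """Theta(n^(1/2) / eps^2), constants ignored."""
    return n ** 0.5 / eps ** 2


def closeness_samples(n, eps):
    """Theta(max(n^(2/3) / eps^(4/3), n^(1/2) / eps^2)), constants ignored."""
    return max(n ** (2 / 3) / eps ** (4 / 3), n ** 0.5 / eps ** 2)


# Hypothetical parameters: support size one million, error 0.1.
n, eps = 10 ** 6, 0.1
for name, f in [("learning", learning_samples),
                ("identity", identity_samples),
                ("closeness", closeness_samples)]:
    print(f"{name:10s} ~ {f(n, eps):.3g} samples")
```

The two terms inside the closeness bound balance at $\epsilon = n^{-1/4}$; for larger $\epsilon$ the $n^{2/3}/\epsilon^{4/3}$ term dominates.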


Estimating Structured Distributions

•  Statistical Estimation well-understood for arbitrary discrete distributions.

•  How about for structured distributions?
•  Long line of work in statistics since the 1950’s [Grenander’56, Rao’69, Wegman’70, Birge’87,…]. Focus has been on density estimation (learning).
•  [Batu-Kumar-Rubinfeld, STOC’04]: identity testing of monotone distributions.
•  [Daskalakis-D-Servedio-Valiant-Valiant, SODA’13]: generalization to k-modal distributions.


Types of Structured Distributions

•  Distributions with “shape restrictions”: monotone, bimodal, log-concave.
•  Simple combinations of simple distributions:
   −  Mixtures of simple distributions (e.g., mixtures of Gaussians).
   −  Sums of simple distributions (e.g., Poisson Binomial Distributions).



First Step: Changing the metric

Identity Testing Problem for family $\mathcal{D}$. Given (sample) access to $p, q \in \mathcal{D}$:
•  Output “YES” (with high probability) if $p = q$ (completeness).
•  Output “NO” (with high probability) if $\|p - q\|_1 \geq \epsilon$ (soundness).

Reduces to Identity Testing Problem under $\mathcal{A}_k$-distance. Given (sample) access to $p, q$:
•  Output “YES” (with high probability) if $p = q$.
•  Output “NO” (with high probability) if $\|p - q\|_{\mathcal{A}_k} \geq \epsilon$.


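$\mathcal{A}_k$-Distance between Distributions (I)

Definition. For $k \geq 2$ and $p, q : \mathbb{R} \to [0, 1]$, we define the $\mathcal{A}_k$-distance between $p, q$ as follows:

$$\|p - q\|_{\mathcal{A}_k} = \sup_{\mathcal{I} = (I_i)_{i=1}^{k}} \sum_{i=1}^{k} |p(I_i) - q(I_i)|,$$

where the supremum ranges over partitions of the domain into $k$ intervals $I_1, I_2, \ldots, I_k$.

Facts:
•  For $k = 2$, (essentially) equivalent to the Kolmogorov distance.
•  For any $k \geq 2$, we have $\|p - q\|_{\mathcal{A}_k} \leq \|p - q\|_1$.
•  We have: $\lim_{k \to \infty} \|p - q\|_{\mathcal{A}_k} = \|p - q\|_1$.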
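For distributions over a finite domain $[n]$, the definition above can be evaluated exactly by dynamic programming over interval endpoints. The sketch below is a brute-force $O(k n^2)$ illustration under that assumption; it is not the algorithm from the talk, and the function name and example pmfs are hypothetical. It uses the fact that splitting an interval never decreases the sum (triangle inequality), so the maximum over at most $k$ intervals equals the maximum over exactly $\min(k, n)$ nonempty intervals.

```python
from itertools import accumulate


def ak_distance(p, q, k):
    """A_k distance between two pmfs over [n]: the maximum, over partitions of
    the domain into consecutive intervals, of sum_i |p(I_i) - q(I_i)|.
    Uses exactly min(k, n) nonempty intervals, which attains the maximum
    over all partitions with at most k intervals."""
    n = len(p)
    # prefix[j] = sum of (p - q) over the first j domain points
    prefix = [0.0] + list(accumulate(pi - qi for pi, qi in zip(p, q)))
    neg_inf = float("-inf")
    # best[j] = best value when the first j points are covered by the
    # intervals placed so far (initially: zero intervals cover zero points)
    best = [0.0] + [neg_inf] * n
    for _ in range(min(k, n)):
        new = [neg_inf] * (n + 1)
        for j in range(1, n + 1):
            for i in range(j):
                if best[i] > neg_inf:
                    new[j] = max(new[j], best[i] + abs(prefix[j] - prefix[i]))
        best = new
    return best[n]


# Hypothetical example: an alternating perturbation of the uniform distribution.
p = [0.4, 0.1, 0.4, 0.1]
q = [0.25, 0.25, 0.25, 0.25]
for k in (1, 2, 3, 4):
    print(k, round(ak_distance(p, q, k), 6))
```

In this example the value for $k = 2$ is twice the Kolmogorov distance, and the value for $k = n$ coincides with $\|p - q\|_1$, in line with the facts listed above.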


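$\mathcal{A}_k$-Distance between Distributions (II)

Definition. For $k \geq 1$ and $p, q : \mathbb{R} \to [0, 1]$, we define the $\mathcal{A}_k$-distance between $p, q$ as follows:

$$\|p - q\|_{\mathcal{A}_k} = \sup_{\mathcal{I} = (I_i)_{i=1}^{k}} \sum_{i=1}^{k} |p(I_i) - q(I_i)|,$$

where the supremum ranges over partitions of the domain into $k$ intervals $I_1, I_2, \ldots, I_k$.

Upper Bound on Sample Complexity: For a family $\mathcal{D}$ of one-dimensional distributions and $\epsilon > 0$, let $k = k(\mathcal{D}, \epsilon)$ be the smallest integer such that for any $p, q \in \mathcal{D}$ it holds

$$\|p - q\|_1 \leq \|p - q\|_{\mathcal{A}_k} + \epsilon/2.$$

Then, the parameter $k$ is the “right” complexity measure for estimating a property of the family $\mathcal{D}$.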


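Overview of Framework

L1-Identity Tester for $\mathcal{D}$, for each $\epsilon > 0$:
•  Approximation (Existential Step): $k = k(\mathcal{D}, \epsilon)$, the min $k$ s.t. for all $p, q \in \mathcal{D}$: $\|p - q\|_1 \leq \|p - q\|_{\mathcal{A}_k} + \epsilon/2$.
•  Identity Tester under $\mathcal{A}_k$-distance (Algorithmic Step). Error parameter: $\epsilon' = \epsilon/2$. Output: YES/NO.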
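The correctness of this two-step reduction follows directly from the definition of $k = k(\mathcal{D}, \epsilon)$. The short derivation below spells it out, writing $\epsilon' = \epsilon/2$ for the error parameter passed to the $\mathcal{A}_k$ tester (the symbol $\epsilon'$ is a naming assumption made here).

```latex
\begin{align*}
p = q
  &\implies \|p - q\|_{\mathcal{A}_k} = 0, \\
\|p - q\|_1 \ge \epsilon
  &\implies \|p - q\|_{\mathcal{A}_k}
     \ge \|p - q\|_1 - \epsilon/2
     \ge \epsilon/2 = \epsilon'.
\end{align*}
```

So a tester that separates $p = q$ from $\|p - q\|_{\mathcal{A}_k} \geq \epsilon'$ also separates $p = q$ from $\|p - q\|_1 \geq \epsilon$ for $p, q \in \mathcal{D}$.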


Second Step: Design $\mathcal{A}_k$-Distance Tester

Identity Testing Problem under $\mathcal{A}_k$-distance. Given (sample) access to $p, q$:
•  Output “YES” (with high probability) if $p = q$.
•  Output “NO” (with high probability) if $\|p - q\|_{\mathcal{A}_k} \geq \epsilon$.

Two fundamentally different regimes:
•  One of $p, q$ known explicitly [Testing Identity to Fixed Distribution].
•  Both $p, q$ unknown [Testing Closeness].


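$\mathcal{A}_k$-distance vs L1 distance

•  Learning, tight bound. Support $[n]$, L1 distance: $\Theta(n/\epsilon^2)$. $\mathcal{A}_k$-distance: $\Theta(k/\epsilon^2)$ [VC].
•  Testing Identity, tight bound. Support $[n]$, L1 distance: $\Theta(n^{1/2}/\epsilon^2)$. $\mathcal{A}_k$-distance: $\Theta(k^{1/2}/\epsilon^2)$.
•  Testing Closeness, tight bound. Support $[n]$, L1 distance: $\Theta(\max\{n^{2/3}/\epsilon^{4/3}, n^{1/2}/\epsilon^2\})$. $\mathcal{A}_k$-distance: no bound shown on this slide.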



$\mathcal{A}_k$-Testing Identity to Fixed Distribution

Theorem [D-Kane-Nikishkin’15] For any $\epsilon > 0$, $k \geq 2$, and any explicit $q$ there exists a computationally efficient algorithm that distinguishes between the case $p = q$ versus $\|p - q\|_{\mathcal{A}_k} \geq \epsilon$ with constant error probability using $O(k^{1/2}/\epsilon^2)$ samples from $p$. Moreover, this sample size is information-theoretically necessary for this task.

Remark:
•  The upper bound holds both for discrete and continuous distributions.


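Applications: L1-Identity Testing for Structured Distributions

Distribution family: sample size; parameter $k$.
•  t-flat: $O(t^{1/2}/\epsilon^2)$; $k = O(t)$.
•  t-piecewise degree-d: $O((t(d+1))^{1/2}/\epsilon^2)$; $k = O(t(d+1))$.
•  Log-concave: $O(\epsilon^{-9/4})$; $k = O(\epsilon^{-1/2})$.
•  Log-concave t-mixture: $O(t^{1/2}/\epsilon^{9/4})$; $k = O(t\epsilon^{-1/2})$.
•  t-modal over $[n]$: $O((t \log n)^{1/2}/\epsilon^{5/2})$; $k = O(t \log(n)/\epsilon)$.
•  MHR over $[n]$: $O((\log n)^{1/2}/\epsilon^{5/2})$; $k = O(\log(n)/\epsilon)$.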
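Each sample size in this table is obtained by substituting the listed value of $k$ into the $O(k^{1/2}/\epsilon^2)$ bound for $\mathcal{A}_k$-identity testing (the factor of 2 lost by running the tester with error $\epsilon/2$ is absorbed into the $O(\cdot)$). Two rows worked out as a check:

```latex
\begin{align*}
\text{log-concave: }
  & k = O(\epsilon^{-1/2})
    \implies \frac{k^{1/2}}{\epsilon^{2}}
    = O\!\left(\frac{\epsilon^{-1/4}}{\epsilon^{2}}\right)
    = O(\epsilon^{-9/4}), \\
\text{$t$-modal over $[n]$: }
  & k = O\!\left(\frac{t \log n}{\epsilon}\right)
    \implies \frac{k^{1/2}}{\epsilon^{2}}
    = O\!\left(\frac{(t \log n)^{1/2}}{\epsilon^{5/2}}\right).
\end{align*}
```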


$\mathcal{A}_k$-Identity Testing: Basic Facts

Lemma: Identity testing reduces to uniformity testing.
Proof Idea: Appropriately “stretch” the domain size.

Henceforth, focus on uniformity testing.

Observation: If we know the partition $J_1, J_2, \ldots, J_k$ maximizing the discrepancy, i.e., $\|p - U\|_{\mathcal{A}_k} = \sum_{j=1}^{k} |p(J_j) - U(J_j)|$, can reduce to L1-identity testing over domain of size $k$.


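$\mathcal{A}_k$-Uniformity Testing: First Approach

•  Partition the domain into $\ell = 10k/\epsilon$ intervals $I_1, \ldots, I_\ell$ of equal length.
•  Apply an L1-uniformity tester on the reduced distributions over these intervals.

Claim: $\|p - U\|_{\mathcal{A}_k} - \epsilon/2 \leq \sum_{i=1}^{\ell} |p(I_i) - U(I_i)| \leq \|p - U\|_{\mathcal{A}_k}$

Sample Complexity: $O(\ell^{1/2}/\epsilon^2) = O(k^{1/2}/\epsilon^{5/2})$

[Figure: the fine partition $I_1, I_2, I_3, \ldots, I_\ell$ drawn against the discrepancy-maximizing partition $J_1, J_2, \ldots, J_k$.]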
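The first step of this approach, passing from samples of $p$ to samples of the reduced distribution over the $\ell$ intervals, can be sketched as follows. The sketch assumes the domain is $[0, 1)$ and rounds $10k/\epsilon$ up to an integer; both choices, and the function name, are assumptions made for illustration. Any L1-uniformity tester over a domain of size $\ell$ can then be run on the returned indices.

```python
import math


def reduce_to_intervals(samples, k, eps):
    """Map samples in [0, 1) to indices 0..ell-1 of ell = ceil(10 * k / eps)
    equal-length intervals. The indices are samples from the reduced
    distribution, and the uniform distribution on [0, 1) reduces to the
    uniform distribution over the ell indices."""
    ell = math.ceil(10 * k / eps)
    # min(...) guards against a sample rounding up to the right endpoint
    reduced = [min(int(x * ell), ell - 1) for x in samples]
    return ell, reduced


# Hypothetical usage with k = 1 and eps = 0.5, giving ell = 20 intervals.
ell, reduced = reduce_to_intervals([0.0, 0.04, 0.06, 0.999], k=1, eps=0.5)
print(ell, reduced)
```

With $\ell = 10k/\epsilon$, an L1 tester needing $O(\ell^{1/2}/\epsilon^2)$ samples gives the $O(k^{1/2}/\epsilon^{5/2})$ bound stated above.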


$\mathcal{A}_k$-Uniformity Testing: Optimal Algorithm

•  Construct several oblivious decompositions of the domain.
•  Use L2-uniformity tester over the reduced distributions.

In more detail:
•  Consider $M = \log(1/\epsilon)$ equal-length interval partitions of the domain. Partition $\mathcal{I}^{(j)}$ consists of $\ell_j = k \cdot 2^j$ intervals.
•  For each $j$, apply an L2-uniformity tester with L2-error $\epsilon_j = \epsilon \cdot 2^{3j/8} / \ell_j^{1/2}$.
•  Accept if and only if all testers accept.

Structural Lemma: One of the partitions will detect the discrepancy.
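The schedule of partitions and error parameters described above can be tabulated with a few lines of code. This sketch only computes the pairs $(\ell_j, \epsilon_j)$; it does not implement the L2-uniformity tester itself. Taking the logarithm base 2, rounding $M$ up, indexing $j$ from 0, and dropping all constant factors are assumptions made here.

```python
import math


def partition_schedule(k, eps):
    """Return a list of (j, ell_j, eps_j) for j = 0, ..., M - 1, where
    M = ceil(log2(1 / eps)), ell_j = k * 2^j is the number of equal-length
    intervals in partition j, and eps_j = eps * 2^(3j/8) / sqrt(ell_j) is
    the L2 error passed to the j-th tester."""
    M = max(1, math.ceil(math.log2(1 / eps)))
    schedule = []
    for j in range(M):
        ell_j = k * 2 ** j
        eps_j = eps * 2 ** (3 * j / 8) / math.sqrt(ell_j)
        schedule.append((j, ell_j, eps_j))
    return schedule


# Hypothetical parameters.
for j, ell_j, eps_j in partition_schedule(k=4, eps=1 / 16):
    print(j, ell_j, round(eps_j, 5))
```

Since $\epsilon_j = (\epsilon/\sqrt{k}) \cdot 2^{-j/8}$, the L2 error shrinks slowly as the partitions get finer.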



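$\mathcal{A}_k$-distance vs L1 distance

•  Learning, tight bound. Support $[n]$, L1 distance: $\Theta(n/\epsilon^2)$. $\mathcal{A}_k$-distance: $\Theta(k/\epsilon^2)$ [VC].
•  Testing Identity, tight bound. Support $[n]$, L1 distance: $\Theta(n^{1/2}/\epsilon^2)$. $\mathcal{A}_k$-distance: $\Theta(k^{1/2}/\epsilon^2)$.
•  Testing Closeness, tight bound. Support $[n]$, L1 distance: $\Theta(\max\{n^{2/3}/\epsilon^{4/3}, n^{1/2}/\epsilon^2\})$. $\mathcal{A}_k$-distance: $\Theta(\max\{k^{2/3}/\epsilon^{4/3}, k^{1/2}/\epsilon^2\})$.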


$\mathcal{A}_k$-Equivalence Testing

Theorem For any $\epsilon > 0$ and $k \geq 2$, and any distributions $p, q$ there exists a computationally efficient algorithm that distinguishes between the case $p = q$ versus $\|p - q\|_{\mathcal{A}_k} \geq \epsilon$ with constant error probability using $O(\max\{k^{4/5}/\epsilon^{6/5}, k^{1/2}/\epsilon^2\})$ samples. Moreover, this sample size is information-theoretically necessary for this task.

Remarks:
•  The upper bound holds both for discrete and continuous distributions.
•  The lower bound applies to continuous distributions or discrete distributions over a domain of size $N \geq 2^{\mathrm{poly}(k)}$.


$\mathcal{A}_k$-Closeness Testing: Basic Facts

•  No oblivious decomposition can work: discrepancy may be hidden in intervals even though reduced distributions are the same.
•  Can partition the domain into “light” intervals, and apply standard closeness tester on reduced distributions over these intervals.
•  Inherently leads to $\Omega(k)$ sample algorithms: need adaptive partition in which at least one distribution has small mass.
•  How do we obtain $o(k)$ sample size?


$\mathcal{A}_k$-Closeness Testing Algorithm

Consider the following “order-based” algorithm:
•  Let $m = O(k^{4/5}/\epsilon^{6/5})$. Draw $m_1 = \mathrm{Poi}(m)$ samples $S_p$ from $p$, and $m_2 = \mathrm{Poi}(m)$ samples $S_q$ from $q$.
•  Let $S$ be the union of $S_p$ and $S_q$ sorted in increasing order.
•  Let $Z = \#(\text{pairs of consecutive elements of } S \text{ from same distribution}) - \#(\text{pairs of consecutive elements of } S \text{ from different distributions})$.
•  If $Z > 3\sqrt{m}$, return “NO”; otherwise, return “YES.”
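The order-based statistic is simple enough to sketch directly. The code below computes $Z$ from two given sample lists and applies the $3\sqrt{m}$ threshold. It takes the samples as inputs rather than drawing $\mathrm{Poi}(m)$ of them, and it assumes the sample values are distinct (as for continuous distributions); ties are broken by label, which is an arbitrary choice made here. The function names are hypothetical.

```python
import math


def order_statistic_z(samples_p, samples_q):
    """Z = #(adjacent pairs in the sorted union with the same source)
         - #(adjacent pairs in the sorted union with different sources)."""
    labeled = sorted([(x, "p") for x in samples_p] + [(x, "q") for x in samples_q])
    z = 0
    for (_, a), (_, b) in zip(labeled, labeled[1:]):
        z += 1 if a == b else -1
    return z


def order_based_closeness_test(samples_p, samples_q, m):
    """Return "NO" (the distributions look different) if Z > 3 * sqrt(m),
    and "YES" otherwise."""
    z = order_statistic_z(samples_p, samples_q)
    return "NO" if z > 3 * math.sqrt(m) else "YES"


# Hypothetical inputs: perfectly separated samples versus perfectly interleaved ones.
separated_p = [0.01 * i for i in range(50)]
separated_q = [1 + 0.01 * i for i in range(50)]
interleaved_p = [2 * i for i in range(50)]
interleaved_q = [2 * i + 1 for i in range(50)]
print(order_based_closeness_test(separated_p, separated_q, m=50))
print(order_based_closeness_test(interleaved_p, interleaved_q, m=50))
```

The tester only looks at the relative order of the samples, which is why it needs no partition of the domain.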


Closeness Testing: Sketch of Analysis

•  Bound the mean and variance of $Z$ and use concentration.
•  Completeness: $E[Z] = 0$ and $\mathrm{Var}[Z] = 2m - 1$.
•  Soundness: main technical step is bounding $E[Z]$ from below.
   -  Easy to argue: $\mathrm{Var}[Z] = O(m)$.
   -  Highly non-trivial: $E[Z] = \Omega(m^3 \epsilon^3 / k^2)$.
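These two bounds explain where the sample size $m = O(k^{4/5}/\epsilon^{6/5})$ comes from. In the soundness case the mean of $Z$ must clear the $3\sqrt{m}$ threshold by more than the $O(\sqrt{m})$ standard deviation; ignoring constants, this gives:

```latex
\begin{align*}
\frac{m^{3}\epsilon^{3}}{k^{2}} \gtrsim \sqrt{m}
  \iff m^{5/2} \gtrsim \frac{k^{2}}{\epsilon^{3}}
  \iff m \gtrsim \frac{k^{4/5}}{\epsilon^{6/5}}.
\end{align*}
```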



Future Directions

Unified Technique for Identity Testing: Use $\mathcal{A}_k$-distance as a proxy.

Concrete Open Problems:
•  Understanding the regime $\log n \leq k \leq n$ [DKN’16].
•  Testing Other Properties of Structured Distributions: Independence, Entropy, etc.

A Few Open-ended Challenges:
•  Other Criteria: Privacy, Communication
•  High-Dimensional Structured Distributions
•  Tradeoffs between sample size and computational efficiency?

Thank you for your attention!


Sketch of Lower Bound (I)

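•  Suppose algorithm only considers ordering of the samples.
•  Consider the following instance:

[Figure: hard instance. Labels recoverable from the slide: regions marked $p$, $q$, and $p = q$; masses $\epsilon/(2k)$, $\epsilon/k$, and $1 - \epsilon$; $k$ buckets.]

•  If less than 3 samples land in a mini-bucket, no useful information.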


Sketch of Lower Bound (II)

•  If less than 3 samples land in a mini-bucket, no useful information for an order-based tester.
•  Expected number of buckets with 3 samples: $k \cdot (m\epsilon/k)^3$.
•  Need this quantity to be at least on the order of $\sqrt{m}$.

How about for general testers?
•  Can embed above instance into larger domain, so that order-based testers suffice.
•  Non-constructive argument (Ramsey’s theorem).

