Page 1: Learning Mixtures of Structured Distributions over Discrete Domains 

Learning Mixtures of Structured Distributions over Discrete Domains

Xiaorui Sun, Columbia University

Joint work with Siu-On Chan (UC Berkeley), Ilias Diakonikolas (U. Edinburgh), and Rocco Servedio (Columbia University)

Page 2: Learning Mixtures of Structured Distributions over Discrete Domains 

Density Estimation
• PAC-type learning model
• A class 𝒞 of possible target distributions over [n]
• Learner
  – Knows the class 𝒞 but does not know the target distribution p ∈ 𝒞
  – Independently draws a few samples from p
  – Outputs (a succinct description of) a distribution h that is ε-close to p
• Total variation distance is the standard distance measure in statistics
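As a concrete illustration (not part of the original slides), a minimal Python sketch of the total variation distance between two distributions on a finite domain, represented as probability vectors; the function name is illustrative.

def total_variation(p, q):
    """Total variation distance between two distributions on the same finite
    domain: half of the L1 distance between their probability vectors."""
    assert len(p) == len(q)
    return 0.5 * sum(abs(pi - qi) for pi, qi in zip(p, q))

# Example: a fair coin vs. a 3/4-biased coin are at distance 1/4.
print(total_variation([0.5, 0.5], [0.75, 0.25]))  # 0.25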

Page 3: Learning Mixtures of Structured Distributions over Discrete Domains 

Learn a structured distribution

• If 𝒞 = {all distributions over [n]}, then Ω(n/ε²) samples are required

• Much better sample complexity is possible for structured distributions
  – Poisson binomial distributions [DDS12a]: Õ(1/ε³) samples
  – Monotone / k-modal distributions [Bir87, DDS12b]: O(log(n)/ε³) samples for monotone, and roughly a factor of k more for k-modal

Page 4: Learning Mixtures of Structured Distributions over Discrete Domains 

This work: Learn mixture of structured distributions

• Learn a mixture of k distributions
  – A class 𝒞 of distributions over [n]
  – The target distribution q is a mixture of k distributions from 𝒞
  – i.e. q = w₁p₁ + … + wₖpₖ, where each pᵢ ∈ 𝒞, each wᵢ ≥ 0, and w₁ + … + wₖ = 1 (sampling sketch below)

• Our result: learn mixtures of several classes of structured distributions
  – Sample complexity close to optimal
  – Efficient running time
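A small illustrative Python sketch of how such a mixture generates a sample: pick component i with probability wᵢ, then draw from pᵢ. The names are made up for illustration.

import random

def sample_mixture(weights, components):
    """Draw one sample from the mixture sum_i w_i * p_i over {0, ..., n-1}.
    'components' is a list of probability vectors; 'weights' sums to 1."""
    i = random.choices(range(len(weights)), weights=weights)[0]  # pick component i w.p. w_i
    p = components[i]
    return random.choices(range(len(p)), weights=p)[0]           # draw from p_i

# Example: an even mixture of two biased coins.
print(sample_mixture([0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]]))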

Page 5: Learning Mixtures of Structured Distributions over Discrete Domains 

Our results: learning mixture of log-concave

• Log-concave distribution p over [n]
  – the support of p is a contiguous interval
  – p(i)² ≥ p(i−1) · p(i+1) for 1 < i < n

[Figure: a log-concave distribution over {1, …, n}]
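A quick Python sketch of the log-concavity condition stated above (contiguous support and p(i)² ≥ p(i−1)·p(i+1)); illustrative only.

def is_log_concave(p, tol=1e-12):
    """Check that the support of p is a contiguous interval and that
    p[i]^2 >= p[i-1] * p[i+1] holds at every interior point."""
    support = [i for i, pi in enumerate(p) if pi > 0]
    if support and support != list(range(support[0], support[-1] + 1)):
        return False  # support has an internal gap
    return all(p[i] ** 2 + tol >= p[i - 1] * p[i + 1] for i in range(1, len(p) - 1))

# A binomial-like pmf is log-concave; a bimodal pmf is not.
print(is_log_concave([0.1, 0.2, 0.4, 0.2, 0.1]))  # True
print(is_log_concave([0.4, 0.1, 0.4, 0.1, 0.0]))  # False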

Page 6: Learning Mixtures of Structured Distributions over Discrete Domains 

Our results: log-concave

• Algorithm to learn a mixture of k log-concave distributions
  – Sample complexity: Õ(k/ε⁴) samples (independent of n)
  – Running time: near-linear in the number of samples (in bit operations)

• Lower bound: samples

Page 7: Learning Mixtures of Structured Distributions over Discrete Domains 

Our results: mixture of unimodal

• Unimodal distribution p over [n]
  – there is a mode m ∈ [n] s.t. p is non-decreasing on {1, …, m} and non-increasing on {m, …, n}

[Figure: a unimodal distribution over {1, …, n}]
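A short illustrative Python check of unimodality: the pmf may rise, but once it starts to fall it must never rise again.

def is_unimodal(p, tol=1e-12):
    """Check that p is non-decreasing up to some mode and non-increasing afterwards."""
    seen_decrease = False
    for i in range(len(p) - 1):
        if p[i + 1] < p[i] - tol:
            seen_decrease = True            # the pmf has started to decrease
        elif p[i + 1] > p[i] + tol and seen_decrease:
            return False                    # it went back up: at least two modes
    return True

print(is_unimodal([0.1, 0.3, 0.4, 0.2]))   # True
print(is_unimodal([0.3, 0.1, 0.2, 0.4]))   # False (two modes)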

Page 8: Learning Mixtures of Structured Distributions over Discrete Domains 

Our results: mixture of unimodal

• A mixture of 2 unimodal distributions may have 2 modes, so the mixture itself need not be unimodal

• Algorithm to learn a mixture of k unimodal distributions
  – Sample complexity: Õ(k·log(n)/ε⁴) samples
  – Running time: near-linear in the number of samples (in bit operations)

• Lower bound: samples

Page 9: Learning Mixtures of Structured Distributions over Discrete Domains 

Our results: mixture of MHR

• Monotone hazard rate (MHR) distribution
  – Hazard rate of p at i: H(i) = p(i) / (p(i) + p(i+1) + … + p(n)), defined where the tail sum is positive
  – p is MHR if H is a non-decreasing function over [n]

[Figure: an MHR distribution over {1, …, n}]
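An illustrative Python sketch of the hazard-rate condition above: compute H(i) as p(i) divided by the tail sum and check that it never decreases.

def is_mhr(p, tol=1e-12):
    """Check that the hazard rate p(i) / (p(i) + ... + p(n-1)) is non-decreasing
    over the positions where the tail sum is positive (0-indexed)."""
    tails, t = [], 0.0
    for pi in reversed(p):
        t += pi
        tails.append(t)
    tails.reverse()                                   # tails[i] = p(i) + ... + p(n-1)
    hazards = [p[i] / tails[i] for i in range(len(p)) if tails[i] > 0]
    return all(hazards[i] <= hazards[i + 1] + tol for i in range(len(hazards) - 1))

# A geometric-style pmf has a constant hazard rate, hence is MHR.
print(is_mhr([0.5, 0.25, 0.125, 0.125]))  # True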

Page 10: Learning Mixtures of Structured Distributions over Discrete Domains 

Our results: mixture of MHR

• Algorithm to learn a mixture of k MHR distributions
  – Sample complexity: Õ(k·log(n/ε)/ε⁴) samples
  – Running time: near-linear in the number of samples (in bit operations)

• Lower Bound: samples

Page 11: Learning Mixtures of Structured Distributions over Discrete Domains 

Compare with parameter estimation

• Parameter estimation [KMV10, MV10]
  – Learn a mixture of k Gaussians
  – Independently draw samples from the mixture
  – Estimate the parameters of each Gaussian component accurately

• The number of samples inherently depends exponentially on k, even for a mixture of one-dimensional normal distributions [MV10]

Page 12: Learning Mixtures of Structured Distributions over Discrete Domains 

Compare with parameter estimation

• Parameter estimation needs at least exp(k) samples to learn a mixture of k binomial distributions
  – Similar to the lower bound in [MV10]

• Density estimation makes it possible to estimate nonparametric classes of distributions
  – E.g. log-concave, unimodal, MHR

• Density estimation learns a mixture of k binomial distributions over [n] with a number of samples independent of n
  – Binomial distributions are log-concave, so the log-concave result applies

Page 13: Learning Mixtures of Structured Distributions over Discrete Domains 

Outline

• Learning algorithm based on decomposition

• Structural results for log-concave, unimodal, MHR distributions

Page 14: Learning Mixtures of Structured Distributions over Discrete Domains 

Flat decomposition

• Key definition: a distribution p is (ε, t)-flat if there exists a partition ℐ of [n] into t intervals such that d_TV(p, p_flat(ℐ)) ≤ ε
  – such an ℐ is an (ε, t)-flat decomposition for p

• p_flat(ℐ) is obtained by "flattening" p within each interval of ℐ
  – p_flat(ℐ)(i) = p(I)/|I| for every interval I ∈ ℐ and every i ∈ I
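As a concrete illustration of the flattening operation just defined, a small Python sketch (names are illustrative): the mass of each interval is spread uniformly over that interval.

def flatten(p, intervals):
    """Flattened version of pmf p: within each interval (a, b), 0-indexed and
    inclusive, the mass p[a] + ... + p[b] is spread uniformly."""
    q = [0.0] * len(p)
    for a, b in intervals:
        mass = sum(p[a:b + 1])
        for i in range(a, b + 1):
            q[i] = mass / (b - a + 1)
    return q

# Flattening with respect to the partition {[0,1], [2,3]}.
print(flatten([0.1, 0.3, 0.4, 0.2], [(0, 1), (2, 3)]))  # [0.2, 0.2, 0.3, 0.3]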

Page 15: Learning Mixtures of Structured Distributions over Discrete Domains 

Flat decomposition

[Figure: a distribution over {1, …, n} and its flattening with respect to a partition into intervals]

Page 16: Learning Mixtures of Structured Distributions over Discrete Domains 

Learn (ε, t)-flat distributions

• Main general Thm: Let 𝒞 = {all (ε, t)-flat distributions over [n]}. There is an algorithm which draws Õ(t/ε³) samples from any p ∈ 𝒞 and outputs a hypothesis h such that d_TV(p, h) = O(ε).

• Linear running time with respect to the number of samples

Page 17: Learning Mixtures of Structured Distributions over Discrete Domains 

Easier problem: known decomposition

• Given
  – Samples from an (ε, t)-flat distribution p
  – An (ε, t)-flat decomposition ℐ for p

• Idea: estimate the probability mass of every interval in ℐ

• O(t/ε²) samples are enough
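A minimal Python sketch of this idea (assuming 0-indexed samples and inclusive intervals; the names are illustrative): estimate each interval's mass empirically and spread it uniformly over the interval.

def learn_with_known_partition(samples, intervals, n):
    """Hypothesis for a known flat decomposition: empirical mass of each
    interval, spread uniformly over that interval."""
    m = len(samples)
    h = [0.0] * n
    for a, b in intervals:
        mass = sum(1 for x in samples if a <= x <= b) / m   # empirical mass of [a, b]
        for i in range(a, b + 1):
            h[i] = mass / (b - a + 1)
    return h

# With intervals = [(0, 4), (5, 9)] and n = 10, h is piecewise constant on the two intervals.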

Page 18: Learning Mixtures of Structured Distributions over Discrete Domains 

Real problem: unknown decomposition

• Only given samples from an (ε, t)-flat distribution p

• Some (ε, t)-flat decomposition ℒ for p exists, but it is unknown

• A useful fact [DDS+13]: if ℒ is an (ε, t)-flat decomposition of p and 𝒥 is a "refinement" of ℒ, then 𝒥 is a (2ε, |𝒥|)-flat decomposition of p
  – So if we can find any refinement of ℒ, it is good enough (common-refinement sketch below)
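For illustration, a Python sketch of the common refinement of two interval partitions of {0, …, n−1}, each described by the sorted right endpoints of its intervals; the names and the representation are illustrative assumptions.

def common_refinement(ends1, ends2):
    """Common refinement of two interval partitions, each given as the sorted
    list of right endpoints of its intervals (the last endpoint is n - 1).
    The result refines both and has at most len(ends1) + len(ends2) intervals."""
    ends = sorted(set(ends1) | set(ends2))
    intervals, start = [], 0
    for e in ends:
        intervals.append((start, e))
        start = e + 1
    return intervals

# Over {0,...,9}: refining {[0,4],[5,9]} and {[0,2],[3,9]}.
print(common_refinement([4, 9], [2, 9]))  # [(0, 2), (3, 4), (5, 9)]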

Page 19: Learning Mixtures of Structured Distributions over Discrete Domains 

Unknown flat decomposition (cont)

• Idea: partition [n] into intervals, each with small probability mass under p
  – Achieved by sampling from p (see the sketch below)

[Figure: interval decompositions 𝒦 and ℒ of {1, …, n}]
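One natural way to carry out this idea, sketched in Python under the assumption that interval breakpoints are placed at empirical quantiles of the sample (an illustration, not necessarily the exact procedure from the paper):

def empirical_equal_mass_partition(samples, n, num_intervals):
    """Partition {0, ..., n-1} into intervals of roughly equal empirical mass,
    using empirical quantiles of the sample as right endpoints."""
    s = sorted(samples)
    m = len(s)
    ends = {s[j * m // num_intervals] for j in range(1, num_intervals)}
    ends.add(n - 1)                       # the last interval always ends at n - 1
    intervals, start = [], 0
    for e in sorted(ends):
        if e >= start:
            intervals.append((start, e))
            start = e + 1
    return intervals

# Example: 1000 draws from a distribution on {0,...,9}, split into 4 parts:
# intervals = empirical_equal_mass_partition(samples, 10, 4)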

Page 20: Learning Mixtures of Structured Distributions over Discrete Domains 

Unknown flat decomposition (cont)

• There exists an (unknown) partition 𝒥
  – A refinement of both 𝒦 and ℒ
  – At most |𝒦| + |ℒ| intervals

[Figure: interval decompositions 𝒦 and ℒ of {1, …, n}]

Page 21: Learning Mixtures of Structured Distributions over Discrete Domains 

Unknown flat decomposition (cont)

• There exists a partition 𝒥
  – A refinement of both 𝒦 and ℒ
  – At most |𝒦| + |ℒ| intervals
  – A (2ε, |𝒥|)-flat decomposition for p (since 𝒥 refines ℒ)

[Figure: the common refinement 𝒥 of 𝒦 and ℒ over {1, …, n}]

Page 22: Learning Mixtures of Structured Distributions over Discrete Domains 

Unknown flat decomposition (cont)

• Compare the flattenings of p with respect to 𝒥 and with respect to 𝒦

[Figure: flattenings of p with respect to 𝒥 and with respect to 𝒦, over {1, …, n}]

Page 23: Learning Mixtures of Structured Distributions over Discrete Domains 

Unknown flat decomposition (cont)

• If the total probability mass of every interval of 𝒦 is at most ε/t, then the flattenings of p with respect to 𝒦 and 𝒥 are O(ε)-close

• Partition [n] into O(t/ε) intervals, each with probability mass at most ε/t
  – Õ(t/ε³) samples are enough (end-to-end sketch below)
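Putting the pieces together, an end-to-end sketch that reuses the two illustrative helpers above (empirical_equal_mass_partition and learn_with_known_partition); it glosses over how many samples and intervals the analysis actually calls for.

def learn_flat_distribution(samples, n, num_intervals):
    """Sketch of the overall scheme: build a partition with small empirical
    mass per interval, then output the empirical distribution flattened on it."""
    intervals = empirical_equal_mass_partition(samples, n, num_intervals)
    return learn_with_known_partition(samples, intervals, n)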

Page 24: Learning Mixtures of Structured Distributions over Discrete Domains 

Learn (ε, t)-flat distributions

• Main general Thm: Let 𝒞 = {all (ε, t)-flat distributions over [n]}. There is an algorithm which draws Õ(t/ε³) samples from any p ∈ 𝒞 and outputs a hypothesis h such that d_TV(p, h) = O(ε).

Page 25: Learning Mixtures of Structured Distributions over Discrete Domains 

Learn mixture of distributions

• Lem: A mixture of k (ε, t)-flat distributions is (2ε, kt)-flat
  – Tight for interesting distribution classes

• Thm (learn a mixture): Let q be a mixture of k (ε, t)-flat distributions. There is an algorithm which draws Õ(kt/ε³) samples and outputs a hypothesis h s.t. d_TV(q, h) = O(ε).

Page 26: Learning Mixtures of Structured Distributions over Discrete Domains 

First application: learning mixtures of log-concave distributions

• Recall the definition:
  – the support of p is a contiguous interval
  – p(i)² ≥ p(i−1) · p(i+1) for 1 < i < n

• Lem: Every log-concave distribution is (ε, O(log(1/ε)/ε))-flat

• Learn a mixture of k log-concave distributions with Õ(k/ε⁴) samples

Page 27: Learning Mixtures of Structured Distributions over Discrete Domains 

Second application: learning mixtures of unimodal distributions

• Lem: Every unimodal distribution over [n] is (ε, O(log(n)/ε))-flat [Bir87, DDS+13]

• Learn a mixture of k unimodal distributions with Õ(k·log(n)/ε⁴) samples

Page 28: Learning Mixtures of Structured Distributions over Discrete Domains 

Third application: learning mixtures of MHR distributions

• Monotone hazard rate (MHR) distribution
  – Hazard rate of p at i: H(i) = p(i) / (p(i) + p(i+1) + … + p(n))
  – p is MHR if H is a non-decreasing function over [n]

• Lem: Every MHR distribution over [n] is (ε, O(log(n/ε)/ε))-flat

• Learn a mixture of k MHR distributions with Õ(k·log(n/ε)/ε⁴) samples

Page 29: Learning Mixtures of Structured Distributions over Discrete Domains 

Conclusion and further directions

• Flat decomposition is a useful way to study mixtures of structured distributions

• Extend to higher dimensions?

• Efficient algorithms with optimal sample complexity?

Distribution    Sample complexity       Lower bound
Log-concave     Õ(k/ε⁴)
Unimodal        Õ(k·log(n)/ε⁴)
MHR             Õ(k·log(n/ε)/ε⁴)

Page 30: Learning Mixtures of Structured Distributions over Discrete Domains 

Thank you!

