
Introduction to ...

Date post: 17-Apr-2018
Introduction to Artificial Intelligence, Lecture 13 – Approximate Inference, CS/CNS/EE 154, Andreas Krause.
Transcript
Page 1: Introduction to Artificial Intelligence (courses.cms.caltech.edu/cs154/slides/cs154-13-bnet3-annotated.pdf)

Introduction to Artificial Intelligence

Lecture 13 – Approximate Inference

CS/CNS/EE 154

Andreas Krause


Bayesian networks

- Compact representation of distributions over a large number of variables
- (Often) allow efficient exact inference (computing marginals, etc.)

Example: HailFinder has 56 variables with ~3 states each, so the full joint has ~10^26 terms (> 10,000 years on top supercomputers).

[Figure: HailFinder network in the JavaBayes applet]


Typical queries: Conditional distribution

- Compute the distribution of some variables given values for others

[Figure: Bayesian network over E, B, A, J, M]


Typical queries: Maximization

- MPE (most probable explanation): given values for some variables, compute the most likely assignment to all remaining variables
- MAP (maximum a posteriori): compute the most likely assignment to some variables

[Figure: Bayesian network over E, B, A, J, M]


Hardness of inference for general BNs

- Computing conditional distributions:
  - Exact solution: #P-complete
  - NP-hard to obtain any nontrivial approximation
- Maximization:
  - MPE: NP-complete
  - MAP: NP^PP-complete
- Inference in general BNs is really hard


Inference

- Can exploit structure (conditional independence) to perform exact inference efficiently in many practical situations
- For BNs where exact inference is not feasible, can use algorithms for approximate inference (later)


Variable elimination algorithm

- Given a BN and a query P(X | E=e)
- Choose an ordering of X1,…,Xn
- Set up initial factors: fi = P(Xi | Pai)
- For i = 1:n, Xi ∉ {X,E}:
  - Collect all factors f that include Xi
  - Generate a new factor g by marginalizing out Xi: $g = \sum_{x_i} \prod_j f_j$, where the product runs over the collected factors
  - Add g to the set of factors (in place of the collected ones)
- Renormalize P(x,e) to get P(x | e)
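
The elimination loop above can be sketched in a few lines of Python. This is a minimal illustration, not the course's reference implementation: it assumes binary variables, represents a factor as a `(variables, table)` pair, and uses a hypothetical chain A → B → C with made-up probabilities. The helper names (`multiply`, `sum_out`, `restrict`) are introduced here for illustration only.

```python
from itertools import product

# A factor is a pair (variables, table): `variables` is a tuple of names and
# `table` maps each 0/1 assignment tuple (one value per variable) to a number.

def multiply(f, g):
    """Pointwise product of two factors over the union of their variables."""
    (fv, ft), (gv, gt) = f, g
    out_vars = fv + tuple(v for v in gv if v not in fv)
    table = {}
    for values in product((0, 1), repeat=len(out_vars)):
        a = dict(zip(out_vars, values))
        table[values] = ft[tuple(a[v] for v in fv)] * gt[tuple(a[v] for v in gv)]
    return out_vars, table

def sum_out(f, var):
    """Marginalize `var` out of factor f."""
    fv, ft = f
    keep = tuple(v for v in fv if v != var)
    table = {}
    for values, p in ft.items():
        key = tuple(x for v, x in zip(fv, values) if v != var)
        table[key] = table.get(key, 0.0) + p
    return keep, table

def restrict(f, evidence):
    """Zero out entries of f that disagree with the observed values."""
    fv, ft = f
    return fv, {vals: (p if all(evidence.get(v, x) == x for v, x in zip(fv, vals)) else 0.0)
                for vals, p in ft.items()}

def variable_elimination(factors, query, evidence, order):
    """Return P(query | evidence) as a dict {value: probability}."""
    factors = [restrict(f, evidence) for f in factors]
    for var in order:
        if var == query or var in evidence:
            continue
        touching = [f for f in factors if var in f[0]]
        if not touching:
            continue
        g = touching[0]
        for f in touching[1:]:
            g = multiply(g, f)
        # Replace the collected factors by the new factor with `var` summed out.
        factors = [f for f in factors if var not in f[0]] + [sum_out(g, var)]
    joint = factors[0]
    for f in factors[1:]:
        joint = multiply(joint, f)
    for var in evidence:
        if var in joint[0]:
            joint = sum_out(joint, var)
    z = sum(joint[1].values())  # z = P(e); dividing renormalizes P(x, e)
    return {vals[0]: p / z for vals, p in joint[1].items()}

# Hypothetical chain A -> B -> C (numbers are illustrative assumptions).
factors = [
    (("A",), {(0,): 0.7, (1,): 0.3}),
    (("A", "B"), {(0, 0): 0.8, (0, 1): 0.2, (1, 0): 0.1, (1, 1): 0.9}),
    (("B", "C"), {(0, 0): 0.9, (0, 1): 0.1, (1, 0): 0.3, (1, 1): 0.7}),
]
posterior = variable_elimination(factors, "A", {"C": 1}, ["A", "B", "C"])
```

Evidence is handled here by zeroing out inconsistent table entries, which is one of several equivalent choices; slicing the tables would work as well.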


Reusing computation

- Often, want to compute conditional distributions of many variables for fixed observations
  - E.g., probability of Pits at different locations given observed Breezes
- Repeatedly performing variable elimination is wasteful (many factors are recomputed)
- Need the right data structure to avoid recomputation: message passing on factor graphs


Factor graphs

- P(C,D,G,I,S,L) = P(C) P(I) P(D|C) P(G|D,I) P(S|I,G) P(L|S)

[Figure: Bayesian network over C, D, I, G, S, L and its factor graph with factors f1 (CD), f2 (DIG), f3 (IGS), f4 (SL)]


Factor graph

- A factor graph for a Bayesian network is a bipartite graph consisting of
  - Variables and
  - Factors
- Each factor is associated with a subset of variables, and every CPD of the Bayesian network has to be assigned to one of the factor nodes

[Figure: Bayesian network over C, D, I, G, S, L and its factor graph with factors f1 (CD), f2 (DIG), f3 (IGS), f4 (SL)]


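Sum-product message passing on factor graphs

- Messages from node v to factor u: $\mu_{v \to u}(x_v) = \prod_{u' \in N(v) \setminus \{u\}} \mu_{u' \to v}(x_v)$
- Messages from factor u to node v: $\mu_{u \to v}(x_v) = \sum_{x_u \sim x_v} f_u(x_u) \prod_{v' \in N(u) \setminus \{v\}} \mu_{v' \to u}(x_{v'})$

Here N(·) denotes the neighbors in the factor graph, and the sum runs over all assignments x_u to the variables of factor u that agree with x_v.

[Figure: factor graph over C, D, I, G, S, L with factors f1 (CD), f2 (DIG), f3 (IGS), f4 (SL)]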


Example messages

[Figure: factor graph for the chain A, B, C with factors f1 (AB) = P(A) P(B|A) and f2 (BC) = P(C|B)]
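
The chain on this slide is small enough to write every message out by hand. The sketch below assumes binary variables and made-up probabilities (the slide's own numbers are not in the transcript). A variable forwards the product of the messages from its other factors, and a factor multiplies its table by the incoming messages and sums over its other variables.

```python
# Chain A - f1 - B - f2 - C with binary variables; all numbers are illustrative.
p_a = [0.7, 0.3]
p_b_given_a = [[0.8, 0.2], [0.1, 0.9]]  # p_b_given_a[a][b]
p_c_given_b = [[0.9, 0.1], [0.3, 0.7]]  # p_c_given_b[b][c]

# Factor tables as on the slide: f1 = P(A) P(B|A), f2 = P(C|B).
f1 = [[p_a[a] * p_b_given_a[a][b] for b in (0, 1)] for a in (0, 1)]  # f1[a][b]
f2 = p_c_given_b                                                      # f2[b][c]

def normalize(m):
    z = sum(m)
    return [x / z for x in m]

# Pass towards C. A leaf variable sends the all-ones message.
a_to_f1 = [1.0, 1.0]
f1_to_b = [sum(f1[a][b] * a_to_f1[a] for a in (0, 1)) for b in (0, 1)]
b_to_f2 = f1_to_b  # B has only one other factor, so it forwards that message
f2_to_c = [sum(f2[b][c] * b_to_f2[b] for b in (0, 1)) for c in (0, 1)]

# Pass back towards A.
c_to_f2 = [1.0, 1.0]
f2_to_b = [sum(f2[b][c] * c_to_f2[c] for c in (0, 1)) for b in (0, 1)]
b_to_f1 = f2_to_b
f1_to_a = [sum(f1[a][b] * b_to_f1[b] for b in (0, 1)) for a in (0, 1)]

# Marginal of a variable: normalized product of all incoming factor messages.
marg_a = normalize(f1_to_a)
marg_b = normalize([f1_to_b[b] * f2_to_b[b] for b in (0, 1)])
marg_c = normalize(f2_to_c)
```

Because the factor graph of a chain is a tree, one pass in each direction is enough and the resulting marginals are exact.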


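Belief propagation on polytrees

- Belief propagation (aka sum-product) is exact for polytree Bayesian networks
  - Factor graph of a polytree is a tree
- Choose one node as root
- Send messages from leaves to root, and from root to leaves
- After convergence: $P(X_v = x_v) \propto \prod_{u \in N(v)} \mu_{u \to v}(x_v)$, where $\mu_{u \to v}$ is the message from factor u to variable v and N(v) is the set of factors adjacent to v
- Thus: immediately have correct values for all marginals!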


What if we have loops?

- Can still apply belief propagation even if we have loops
  - Just run it, close your eyes and hope for the best!
  - Use the normalized product of the incoming messages at each variable as an approximation of its marginal
- In general, not guaranteed to converge…
- Even if it converges, may converge to incorrect marginals…
- However, in practice often still useful!
  - E.g., turbo codes, etc.
- “Loopy belief propagation”

[Figure: Bayesian network over C, D, I, G, S, L]


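Behavior of Loopy BP

- Loopy BP multiplies the same factors multiple times
  - As a result, BP is often overconfident

[Figure: loop over X1, X2, X3, X4, with a plot of P(X1 = 1) (axis from 0 to 1) against iteration number, comparing the true posterior with the BP estimate]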


Does Loopy BP always converge?

- No! Can oscillate!
- Typically, the more “deterministic” the potentials, the more severe the oscillation

Graphs from K. Murphy, UAI ‘99


What about MPE queries?

- E.g.: What’s the most likely assignment to the unobserved variables, given the observed ones?
- Use max-product (same as sum-product/BP, but with max instead of sums!)

[Figure: Bayesian network over E, B, A, J, M]


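Max-product message passing on factor graphs

- Messages from nodes to factors: $\mu_{v \to u}(x_v) = \prod_{u' \in N(v) \setminus \{u\}} \mu_{u' \to v}(x_v)$
- Messages from factors to nodes: $\mu_{u \to v}(x_v) = \max_{x_u \sim x_v} f_u(x_u) \prod_{v' \in N(u) \setminus \{v\}} \mu_{v' \to u}(x_{v'})$

Here N(·) denotes the neighbors in the factor graph, and the max runs over all assignments x_u to the variables of factor u that agree with x_v.

[Figure: factor graph over C, D, I, G, S, L with factors f1 (CD), f2 (DIG), f3 (IGS), f4 (SL)]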
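
As a small illustration of max-product, the sketch below finds the most probable joint assignment on a hypothetical binary chain A, B, C with factors f1(A,B) = P(A) P(B|A) and f2(B,C) = P(C|B). The numbers are assumptions, there is no evidence, and the function name `max_product_chain` is invented for this example. Each max message also records which value achieved the maximum, so the assignment can be read off by backtracking.

```python
# Illustrative factor tables for a binary chain A - f1 - B - f2 - C.
f1 = [[0.7 * 0.8, 0.7 * 0.2],   # f1[a][b] = P(A=a) * P(B=b | A=a)
      [0.3 * 0.1, 0.3 * 0.9]]
f2 = [[0.9, 0.1],               # f2[b][c] = P(C=c | B=b)
      [0.3, 0.7]]

def max_product_chain(f1, f2):
    """Return (most probable (a, b, c), its probability) for the chain."""
    # Message f1 -> B: maximize over a, remembering the maximizer for each b.
    best_a = [max((0, 1), key=lambda a: f1[a][b]) for b in (0, 1)]
    f1_to_b = [f1[best_a[b]][b] for b in (0, 1)]
    # Message f2 -> C: maximize over b, remembering the maximizer for each c.
    best_b = [max((0, 1), key=lambda b: f2[b][c] * f1_to_b[b]) for c in (0, 1)]
    f2_to_c = [f2[best_b[c]][c] * f1_to_b[best_b[c]] for c in (0, 1)]
    # Pick the best value at the root C, then backtrack through the pointers.
    c = max((0, 1), key=lambda c: f2_to_c[c])
    b = best_b[c]
    a = best_a[b]
    return (a, b, c), f2_to_c[c]

mpe_assignment, mpe_probability = max_product_chain(f1, f2)
```

Replacing each `max` with a sum (and dropping the pointers) turns this back into the sum-product pass, which is the sense in which the two algorithms are "the same".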


Sampling-based inference

- So far: deterministic inference techniques
  - Variable elimination
  - (Loopy) belief propagation
- Will now introduce stochastic approximations
  - Algorithms that “randomize” to compute expectations
  - In contrast to the deterministic methods, guaranteed to converge to the right answer (if you wait long enough)
  - More exact, but slower than deterministic variants


Computing expectations

- Often, we’re not necessarily interested in computing marginal distributions, but certain expectations:
  - Moments (mean, variance, …)
  - Event probabilities


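Sample approximations of expectations

- x1,…,xN are independent samples from RV X
- Law of large numbers: $E_P[f(X)] = \lim_{N \to \infty} \frac{1}{N} \sum_{i=1}^{N} f(x_i)$
- Here, the convergence is with probability 1 (almost sure convergence)
- Finite samples: $E_P[f(X)] \approx \frac{1}{N} \sum_{i=1}^{N} f(x_i)$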
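
A finite-sample estimate is just an average of f over the drawn samples. The sketch below is a minimal illustration. The function name `monte_carlo_mean` and the choice of example (estimating E[X^2] for X uniform on [0,1], whose true value is 1/3) are assumptions made here, not part of the lecture.

```python
import random

def monte_carlo_mean(f, sampler, n, rng):
    """Average f over n independent draws produced by sampler(rng)."""
    return sum(f(sampler(rng)) for _ in range(n)) / n

rng = random.Random(0)  # fixed seed so the run is reproducible
estimate = monte_carlo_mean(lambda x: x * x, lambda r: r.random(), 100_000, rng)
```

Event probabilities fit the same pattern by taking f to be an indicator function, so that the average is the fraction of samples in which the event occurs.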


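How many samples do we need?

- Hoeffding inequality: suppose f is bounded in [0,C]. Then $P\left(\left|E_P[f(X)] - \frac{1}{N}\sum_{i=1}^{N} f(x_i)\right| > \varepsilon\right) \le 2\exp(-2N\varepsilon^2/C^2)$
- Thus, the probability of error decreases exponentially in N!
- Need to be able to draw samples from P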
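
To see what the exponential decay means for the sample size, the bound can be solved for N. The short derivation below assumes the standard two-sided Hoeffding bound for independent samples with f bounded in [0, C]. The failure probability δ is a symbol introduced here for illustration.

```latex
\begin{align*}
P\left( \left| E_P[f(X)] - \frac{1}{N} \sum_{i=1}^{N} f(x_i) \right| > \varepsilon \right)
  &\le 2 \exp\left( -\frac{2 N \varepsilon^2}{C^2} \right), \\
2 \exp\left( -\frac{2 N \varepsilon^2}{C^2} \right) \le \delta
  \quad &\Longleftrightarrow \quad
  N \ge \frac{C^2}{2 \varepsilon^2} \ln \frac{2}{\delta}.
\end{align*}
```

The required N grows only logarithmically in 1/δ but quadratically in 1/ε, so halving the tolerated error costs four times as many samples.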


Sampling from a Bernoulli distribution

- Most random number generators produce (approximately) uniformly distributed random numbers
- How can we draw samples from X ~ Bernoulli(p)?
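
One common answer is to threshold a uniform draw: a uniform number on [0,1) falls below p with probability p. The sketch below assumes Python's standard `random` module as the uniform generator. The function name `sample_bernoulli` is made up for this example.

```python
import random

def sample_bernoulli(p, rng):
    """Return 1 with probability p and 0 otherwise."""
    # rng.random() is uniform on [0, 1), so the event {u < p} has probability p.
    return 1 if rng.random() < p else 0

rng = random.Random(1)
draws = [sample_bernoulli(0.3, rng) for _ in range(100_000)]
frequency = sum(draws) / len(draws)
```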


Sampling from a Multinomial

- X ~ Mult([µ1,…,µk]) where µi = P(X=i); ∑i µi = 1
- Function g: [0,1] → {1,…,k} assigns state g(x) to each x
- Draw sample x from the uniform distribution on [0,1]
- Return g(x)

[Figure: interval [0,1] partitioned into consecutive segments of lengths µ1, µ2, µ3, …, µk]
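
The partition of [0,1] into segments can be implemented with cumulative sums and a binary search. The sketch below is one possible realization of the function g, under the assumption that states are numbered 1..k as on the slide. The name `sample_categorical` is invented for this example.

```python
import random
from bisect import bisect_right
from itertools import accumulate

def sample_categorical(mu, rng):
    """Draw a state in 1..k with P(state = i) = mu[i-1]."""
    cumulative = list(accumulate(mu))      # right endpoints of the segments
    u = rng.random() * cumulative[-1]      # uniform draw, rescaled in case sum(mu) is not exactly 1
    index = bisect_right(cumulative, u)    # first segment whose right endpoint exceeds u
    return min(index, len(mu) - 1) + 1     # clamp guards against floating-point round-off

rng = random.Random(2)
mu = [0.2, 0.5, 0.3]
draws = [sample_categorical(mu, rng) for _ in range(100_000)]
frequencies = [draws.count(state) / len(draws) for state in (1, 2, 3)]
```

The Bernoulli case is the special case k = 2 with a single threshold.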


Forward sampling from a BN


Monte Carlo sampling from a BN

- Sort variables in topological ordering X1,…,Xn
- For i = 1 to n do
  - Sample xi ~ P(Xi | X1=x1, …, Xi-1=xi-1)
- Works even with loopy models!

[Figure: Bayesian network over C, D, I, G, S, L, J, H]
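
The procedure above can be sketched for the six-variable network whose factorization appears earlier in the lecture, P(C) P(I) P(D|C) P(G|D,I) P(S|I,G) P(L|S). In a Bayesian network the conditional P(Xi | X1,…,Xi-1) depends only on the parents of Xi, which is what the code uses. The variables are assumed binary and every probability in the tables is a made-up illustration.

```python
import random

# Parents of each variable, following the factorization from the lecture.
PARENTS = {"C": (), "I": (), "D": ("C",), "G": ("D", "I"), "S": ("I", "G"), "L": ("S",)}

# P(X = 1 | parent values); all numbers are illustrative assumptions.
P_ONE = {
    "C": {(): 0.5},
    "I": {(): 0.3},
    "D": {(0,): 0.2, (1,): 0.7},
    "G": {(0, 0): 0.3, (0, 1): 0.9, (1, 0): 0.1, (1, 1): 0.5},
    "S": {(0, 0): 0.1, (0, 1): 0.4, (1, 0): 0.6, (1, 1): 0.95},
    "L": {(0,): 0.1, (1,): 0.8},
}

# A topological ordering: every variable appears after all of its parents.
ORDER = ["C", "I", "D", "G", "S", "L"]

def forward_sample(rng):
    """Draw one joint sample by sampling each variable given its sampled parents."""
    x = {}
    for var in ORDER:
        p_one = P_ONE[var][tuple(x[p] for p in PARENTS[var])]
        x[var] = 1 if rng.random() < p_one else 0
    return x

rng = random.Random(0)
samples = [forward_sample(rng) for _ in range(50_000)]
```

Each sample costs one pass over the variables regardless of whether the undirected structure has loops, which is why this works where exact inference may be expensive.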


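Computing probabilities through sampling

- Want to estimate probabilities
- Draw N samples from the BN
- Marginals: $P(X_A = x_A) \approx \frac{\mathrm{Count}(x_A)}{N}$
- Conditionals: $P(X_A = x_A \mid X_B = x_B) \approx \frac{\mathrm{Count}(x_A, x_B)}{\mathrm{Count}(x_B)}$

Here Count(·) is the number of samples that agree with the given assignment.

[Figure: Bayesian network over C, D, I, G, S, L, J, H]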


Rejection sampling

- Collect samples over all variables
- Throw away samples that disagree with xB
- Can be problematic if XB = xB is a rare event
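
A minimal sketch of rejection sampling, assuming a hypothetical two-variable network A → B with made-up probabilities. The names `sample_joint` and `rejection_estimate` are invented here. The function also returns how many samples survived, which makes the cost of rare evidence visible.

```python
import random

def sample_joint(rng):
    """Forward-sample the illustrative network A -> B."""
    a = 1 if rng.random() < 0.3 else 0                    # P(A=1) = 0.3
    b = 1 if rng.random() < (0.9 if a else 0.2) else 0    # P(B=1 | A=a)
    return {"A": a, "B": b}

def rejection_estimate(query, evidence, n, rng):
    """Estimate P(query | evidence) from n joint samples; also return the number kept."""
    accepted = hits = 0
    for _ in range(n):
        x = sample_joint(rng)
        if any(x[v] != val for v, val in evidence.items()):
            continue  # sample disagrees with the evidence: throw it away
        accepted += 1
        hits += all(x[v] == val for v, val in query.items())
    estimate = hits / accepted if accepted else float("nan")
    return estimate, accepted

rng = random.Random(0)
estimate, accepted = rejection_estimate({"A": 1}, {"B": 1}, 100_000, rng)
```

Only about a fraction P(B = 1) of the samples is kept, so the effective sample size shrinks in proportion to the probability of the evidence.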


Sample complexity for probability estimates

- Absolute error:
- Relative error:
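
The slide's formulas are not preserved in this transcript, so the sketch below rests on two assumed standard bounds rather than on the lecture's exact statements. For absolute error ε it uses the Hoeffding bound 2 exp(-2 N ε^2). For relative error ε it uses the multiplicative Chernoff bound 2 exp(-N p ε^2 / 3), where p is the probability being estimated. Both function names and the failure probability `delta` are introduced here for illustration.

```python
import math

def samples_for_absolute_error(eps, delta):
    """Smallest N with 2 * exp(-2 * N * eps**2) <= delta (assumed Hoeffding bound)."""
    return math.ceil(math.log(2 / delta) / (2 * eps ** 2))

def samples_for_relative_error(eps, delta, p):
    """Smallest N with 2 * exp(-N * p * eps**2 / 3) <= delta (assumed Chernoff bound)."""
    return math.ceil(3 * math.log(2 / delta) / (p * eps ** 2))

n_abs = samples_for_absolute_error(0.01, 0.05)
n_rel_common = samples_for_relative_error(0.1, 0.05, 0.1)
n_rel_rare = samples_for_relative_error(0.1, 0.05, 0.001)
```

Under these assumptions the absolute-error requirement does not depend on the probability p at all, while the relative-error requirement scales as 1/p, which is what makes rare events expensive to estimate accurately.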


Sampling from rare events

- Estimating conditional probabilities P(XA | XB=xB) using rejection sampling is hard!
  - The more observations, the smaller P(XB = xB) becomes
- Want to sample directly from the posterior distribution!


Gibbs sampling

- Start with initial assignment x(0) to all variables
- For t = 1 to ∞ do
  - Set x(t) = x(t-1)
  - For each variable Xi
    - Set vi = values of all x(t) except xi
    - Sample x(t)i from P(Xi | vi)
- For large enough t, the sampling distribution will be “close” to the true posterior distribution!
- Key challenge: computing conditional distributions P(Xi | vi)
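
A minimal sketch of the loop above, assuming a hypothetical binary chain A → B → C with made-up probabilities and with C observed. Observed variables are held fixed and only A and B are resampled, which is an assumption about how evidence is handled, since the slide does not spell it out. The conditional P(Xi | vi) is obtained by evaluating the joint for both values of Xi and normalizing. The burn-in length is an arbitrary choice.

```python
import random

# Illustrative CPTs for the chain A -> B -> C.
P_A = [0.7, 0.3]
P_B_GIVEN_A = [[0.8, 0.2], [0.1, 0.9]]  # P_B_GIVEN_A[a][b]
P_C_GIVEN_B = [[0.9, 0.1], [0.3, 0.7]]  # P_C_GIVEN_B[b][c]

def joint(a, b, c):
    return P_A[a] * P_B_GIVEN_A[a][b] * P_C_GIVEN_B[b][c]

def gibbs_estimate_a(c_obs, n_steps, burn_in, rng):
    """Estimate P(A = 1 | C = c_obs) from a Gibbs chain over (A, B)."""
    a, b = 0, 0  # arbitrary initial assignment
    count_a1 = kept = 0
    for t in range(n_steps):
        # Resample A from P(A | b, c): proportional to the joint with b, c fixed.
        w = [joint(v, b, c_obs) for v in (0, 1)]
        a = 1 if rng.random() < w[1] / (w[0] + w[1]) else 0
        # Resample B from P(B | a, c), using the freshly sampled a.
        w = [joint(a, v, c_obs) for v in (0, 1)]
        b = 1 if rng.random() < w[1] / (w[0] + w[1]) else 0
        if t >= burn_in:  # discard early samples taken before the chain has mixed
            kept += 1
            count_a1 += a
    return count_a1 / kept

rng = random.Random(0)
estimate = gibbs_estimate_a(c_obs=1, n_steps=60_000, burn_in=1_000, rng=rng)
```

Evaluating the full joint is wasteful in larger networks: only the factors that mention Xi matter, since all others cancel in the normalization.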

