THE LEARNING AND USE OF GRAPHICAL MODELS FOR IMAGE INTERPRETATION
Thesis for the degree of Master of Science
By Leonid Karlinsky
Under the supervision of Professor Shimon Ullman
Introduction – Graphical Models

(Figure: a Bayesian Network (BN) – node A with children B and C, node B with children D and E, annotated with the conditional probability tables P(B|A), P(C|A), P(D|B), P(E|B).)

The joint distribution factors into a product over the nodes X_i and their parents Π_{X_i}:

P(A, B, C, D, E) = ∏_i P(X_i | Π_{X_i})
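The factorization above can be sketched numerically. A minimal example, assuming binary variables and made-up CPT values for the network in the figure (A → B, C and B → D, E):

```python
from itertools import product

# Hypothetical CPTs; each conditional table maps a parent value to a
# distribution over {0, 1}.
p_a = {0: 0.6, 1: 0.4}
p_b_given_a = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.2, 1: 0.8}}
p_c_given_a = {0: {0: 0.5, 1: 0.5}, 1: {0: 0.9, 1: 0.1}}
p_d_given_b = {0: {0: 0.6, 1: 0.4}, 1: {0: 0.1, 1: 0.9}}
p_e_given_b = {0: {0: 0.8, 1: 0.2}, 1: {0: 0.3, 1: 0.7}}

def joint(a, b, c, d, e):
    """P(A,B,C,D,E) as the product of one CPT entry per node."""
    return (p_a[a] * p_b_given_a[a][b] * p_c_given_a[a][c]
            * p_d_given_b[b][d] * p_e_given_b[b][e])

# A valid factorization defines a proper distribution: it sums to 1.
total = sum(joint(*v) for v in product([0, 1], repeat=5))
```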
Introduction

(Figure: road map of the thesis – Graphical Models; Learning (loop free / loopy); Using (loop free / loopy); Tasks; Scenarios.)
Part I: MaxMI Training

(Figure: road map repeated – Graphical Models; Learning (loop free / loopy); Using (loop free / loopy); Tasks; Scenarios.)
Classification

(Figure: class node C connected to features F_1,…,F_7, modeled by P(C, F; θ).)

Goal: classify C on new examples with minimum error, using a subset F of "trained" features.

Training tasks:
• Best features F
• Best parameters θ
• Efficient model P(C, F; θ)

Best = Maximal MI
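"Best = maximal MI" can be illustrated with a plain empirical plug-in estimate of MI(C; F) from (class, feature) samples (a sketch, not the thesis code):

```python
import math
from collections import Counter

def mutual_information(pairs):
    """Empirical MI(C; F) in bits from a list of (c, f) samples."""
    n = len(pairs)
    p_cf = Counter(pairs)                    # joint counts
    p_c = Counter(c for c, _ in pairs)       # marginal counts of C
    p_f = Counter(f for _, f in pairs)       # marginal counts of F
    return sum((k / n) * math.log2(k * n / (p_c[c] * p_f[f]))
               for (c, f), k in p_cf.items())

perfect = mutual_information([(0, 0), (0, 0), (1, 1), (1, 1)])   # F == C
useless = mutual_information([(0, 0), (0, 1), (1, 0), (1, 1)])   # F independent of C
```

A perfectly informative feature attains MI = H(C) = 1 bit here, while an independent one scores 0.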
MaxMI Training - The Past

• Model: simple "flat" structure, NCC thresholds.
• Training: features and thresholds selected one by one:

(F_i, θ_i) = argmax_{F_i, θ_i} min_j [ MI(C; F_i, F_j; θ_i, θ_j) − MI(C; F_j; θ_j) ]

(Figure: flat model – C connected directly to features 1…6.)

Conditional independence given C was assumed; the assumption increased the MI upper bound.
MaxMI Training – Our Approach

(Figure: tree-structured model over features 1…7.)

Learn the model and all θ_i together, maximizing:

MI(C; F; θ)
MaxMI Training – Learning

MaxMI: decompose the MI over the tree, where F_{π(i)} is the parent of F_i:

MI(C; F; θ) = Σ_i MI(F_i; C | F_{π(i)}; θ_i, θ_{π(i)})

Maximize for all θ together; efficiently learn the parameters using GDL.
MaxMI Training – Assumptions

1. TAN model structure – Tree Augmented Naïve Bayes [Friedman, 97]:

P(C, F_1,…,F_n) = P(C) · ∏_i P(F_i | F_{π(i)}, C)

2. Feature Tree (FT) – C can be removed while preserving the feature tree:

P(F_1,…,F_n) = ∏_i P(F_i | F_{π(i)})

Under these assumptions the MI decomposes:

MI(C; F; θ) = Σ_i MI(F_i; C | F_{π(i)}; θ_i, θ_{π(i)})
MaxMI Training – TAN and θ

1. The TAN structure is unknown.
2. Learn θ and the TAN structure jointly such that MI(C; F; θ) is maximized, with:
   • asymptotic correctness – the FT property holds
   • efficiency

(Figure: candidate tree over features 1…7.)
MaxMI Training – MaxMI hybrid

Goal: find the Legal pair maximizing the objective,

(θ, TAN) = argmax_{(θ, TAN) Legal} MI(C; F; θ)

where (θ, TAN) is Legal when both the TAN factorization

P(C, F_1,…,F_n; θ) = P(C) · ∏_i P(F_i | F_{π(i)}, C; θ)

and the FT property

∀ F_i, F_{π(i)}: P(F_1,…,F_n; θ) = ∏_i P(F_i | F_{π(i)}; θ)

hold. For Legal (θ, TAN), MI is maximal when the tree maximizes a sum of edge weights:

TAN = argmax_TAN Σ_i w_MM(F_i, F_{π(i)}),  with  w_MM(F_i, F_{π(i)}) = MI(F_i; C | F_{π(i)}; θ_i, θ_{π(i)})

The argmax over trees of a sum of edge weights is a maximum-spanning-tree problem [Chow & Liu, 68]. The Chow-Liu weight w_FT(F_i, F_{π(i)}) = MI(F_i; F_{π(i)}) yields the best feature tree [Chow & Liu, 68], and w_TAN(F_i, F_{π(i)}) = MI(F_i; F_{π(i)} | C) yields the best TAN [Friedman, 97]. The chain rule of mutual information ties the three together:

w_MM(F_i, F_{π(i)}) = w_TAN(F_i, F_{π(i)}) − w_FT(F_i, F_{π(i)}) + MI(F_i; C; θ_i)
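The edge-weight relation w_MM = w_TAN − w_FT + MI(F_i; C) is an instance of the chain rule MI(F_i; C | F_j) = MI(F_i; F_j | C) − MI(F_i; F_j) + MI(F_i; C). A sketch that checks it on a random joint distribution over binary (C, F_i, F_j); all names below are illustrative:

```python
import itertools
import math
import random

random.seed(0)
# A random joint distribution over (C, Fi, Fj), all binary (axes 0, 1, 2).
states = list(itertools.product([0, 1], repeat=3))
weights = [random.random() for _ in states]
total = sum(weights)
p = {s: w / total for s, w in zip(states, weights)}

def marg(axes):
    """Marginal distribution over the listed axes (0=C, 1=Fi, 2=Fj)."""
    out = {}
    for s, v in p.items():
        key = tuple(s[i] for i in axes)
        out[key] = out.get(key, 0.0) + v
    return out

def mi(ax, bx, cond=()):
    """MI(X_ax; X_bx | X_cond) in bits for the joint p."""
    pj = marg(ax + bx + cond)
    pa, pb, pc = marg(ax + cond), marg(bx + cond), marg(cond)
    out = 0.0
    for s, v in pj.items():
        a, b = s[:len(ax)], s[len(ax):len(ax) + len(bx)]
        c = s[len(ax) + len(bx):]
        num = v * (pc[c] if cond else 1.0)
        out += v * math.log2(num / (pa[a + c] * pb[b + c]))
    return out

w_mm = mi((1,), (0,), cond=(2,))    # MI(Fi; C | Fj)
w_tan = mi((1,), (2,), cond=(0,))   # MI(Fi; Fj | C)
w_ft = mi((1,), (2,))               # MI(Fi; Fj)
mi_ic = mi((1,), (0,))              # MI(Fi; C)
```

The chain rule guarantees w_mm == w_tan − w_ft + mi_ic for any joint distribution, which relates the three structure-learning weights up to the per-feature term MI(F_i; C).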
MaxMI Training – MaxMI hybrid

Convergent algorithm: alternate parameter learning with tree restructuring, scoring edges by the combined weight

w_MM(F_i, F_{π(i)}) = w_TAN(F_i, F_{π(i)}) − w_FT(F_i, F_{π(i)}) + MI(F_i; C; θ_i)

where w_FT and w_TAN are the Chow-Liu and TAN edge weights.
MaxMI Training – empirical results

(Chart: Cow Parts Model, feature-centered, test DB of 2256 examples – misses, false alarms, and total errors for LMI vs. MaxMI.)

(Chart: Cow Parts Model, parent-centered, test DB of 2256 examples – misses, false alarms, and total errors for LMI vs. MaxMI.)

(Chart: Cow Parts Model, classification errors on the test DB of 2256 examples – misses, false alarms, and total errors for LMI, MaxMI, and MaxMI+TAN.)
MaxMI Training – Generalizations

• Train any parameters θ_i.
• Any low-TREEWIDTH structure.
• Even without the assumptions: the exact decomposition

MI(C; F; θ) = Σ_i MI(F_i; C | F_{π(i)}; θ_i, θ_{π(i)})

is replaced by a surrogate with entropy correction terms:

MI(C; F; θ) ≈ Σ_i [ H(F_i | F̂_{π(i)}; θ̂_i, θ̂_{π(i)}) − H(F_i | F_{π(i)}, C; θ_i, θ_{π(i)}) ]
Back to the Goals

(Figure: road map repeated – Graphical Models; Learning (loop free / loopy); Using (loop free / loopy); Tasks; Scenarios.)
Part II: Loopy MAP approximation

(Figure: road map repeated – Graphical Models; Learning (loop free / loopy); Using (loop free / loopy); Tasks; Scenarios.)
Loopy network example

(Figure: loopy pairwise network over x_1,…,x_5 with factors f_12(x_1, x_2), f_13(x_1, x_3), f_14(x_1, x_4), f_15(x_1, x_5), f_24(x_2, x_4), f_25(x_2, x_5), f_35(x_3, x_5).)

Want to solve MAP:

{x_1,…,x_n} = argmax_{x_1,…,x_n} ∏_{i,j} f_ij(x_i, x_j)

NP-hard in general! [Cooper 90, Shimony 94]
Our approach – opening loops

(Figure: the same network with its loops opened – in the loop-closing factors f_24, f_35, f_25, copy variables z_2, z_3, z_4 replace one occurrence of the original variables.)

Now, we can maximize (the opened network is loop free):

{x_1,…,x_n} = argmax_{x_1,…,x_n, z_k,…} ∏_{(i,j) tree} f_ij(x_i, x_j) · ∏_k f_{jk}(x_j, z_k)

The assignment is legal for the loopy problem if z_k = x_k for all k.
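A toy version of loop opening, with hypothetical factor tables on a single triangle x_1, x_2, x_3: duplicating x_1 as z_1 inside f_31 turns the graph into a tree. The unrestricted opened maximum upper-bounds the loopy MAP, and restricting to legal assignments (z_1 = x_1) recovers it exactly:

```python
import itertools

# Hypothetical pairwise factor tables over {0, 1}.
f12 = {(0, 0): 2., (0, 1): 1., (1, 0): 1., (1, 1): 3.}
f23 = {(0, 0): 1., (0, 1): 4., (1, 0): 2., (1, 1): 1.}
f31 = {(0, 0): 3., (0, 1): 1., (1, 0): 1., (1, 1): 2.}

def loopy_score(x1, x2, x3):
    return f12[x1, x2] * f23[x2, x3] * f31[x3, x1]

def opened_score(x1, x2, x3, z1):
    # The loop is cut: f31 reads the copy z1 instead of x1.
    return f12[x1, x2] * f23[x2, x3] * f31[x3, z1]

map_loopy = max(loopy_score(*s) for s in itertools.product([0, 1], repeat=3))
map_open = max(opened_score(*s) for s in itertools.product([0, 1], repeat=4))
map_legal = max(opened_score(x1, x2, x3, x1)        # legal: z1 == x1
                for x1, x2, x3 in itertools.product([0, 1], repeat=3))
```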
Our approach – opening loops

Legally maximize g(y, x, z):

(y_l, x_l, z_l) = argmax_{y, x, z : z = x} g(y, x, z)

Can maximize unrestricted (loop free):

(y_m, x_m, z_m) = argmax_{y, x, z} g(y, x, z)

Usually z_m ≠ x_m.

Our solution – slow connections.
Our approach – slow connections

Maximize (loop-free, use GDL) with z fixed to a constant Z:

(y(Z), x(Z)) = argmax_{y, x} g(y, x, Z)

Now legalize (set Z ← x(Z)) and return to step one. Iterate until convergence.
This is the Maximize-and-Legalize algorithm.

(Figure: the opened network – fast variables y_1, x_2, x_3, x_4, y_5 and slow copies Z_2, Z_3, Z_4; the maximizers x_2(Z), x_3(Z), x_4(Z), y_1(Z), y_5(Z) depend on the constants.)
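The two steps can be sketched as coordinate ascent on a small opened triangle (hypothetical factors; brute-force enumeration stands in for the GDL maximize step):

```python
import itertools

# Loop x1-x2-x3 opened by routing f31 through a slow copy z1 of x1.
f12 = {(0, 0): 2., (0, 1): 1., (1, 0): 1., (1, 1): 3.}
f23 = {(0, 0): 1., (0, 1): 4., (1, 0): 2., (1, 1): 1.}
f31 = {(0, 0): 3., (0, 1): 1., (1, 0): 1., (1, 1): 2.}

def maximize(z1):
    """Maximize step: exact MAP of the tree with the slow copy clamped."""
    return max(itertools.product([0, 1], repeat=3),
               key=lambda s: f12[s[0], s[1]] * f23[s[1], s[2]] * f31[s[2], z1])

z1, seen = 0, set()
while z1 not in seen:       # stop when the slow variable revisits a value
    seen.add(z1)
    x1, x2, x3 = maximize(z1)
    z1 = x1                 # legalize step: copy the fast value back
```

On exit the assignment is legal by construction (z1 == x1); for these particular tables it also attains the true loopy MAP score of 8.0, though in general only a local optimum is guaranteed.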
Our approach – slow connections

When will this work? The intuition: z-minors.

• Strong z-minor: for every y, x and constant Z, the clamped maximizer (y(Z), x(Z)) also dominates after legalization, i.e. g(y(Z), x(Z), x(Z)) ≥ g(y, x, z) over legal assignments. Consequence: global maximum in a single step.

• Weak z-minor: the clamped maximizer only improves locally, so each Maximize-and-Legalize step does not decrease the legal objective. Consequence: local optimum after several steps.
Making the assumptions true – selecting z-variables

The intuition: recursive z-selection.
• Recursive strong z-minor: single step, global maximum!
• Recursive weak z-minor: iterations, local maximum.
• Slow connections may run at different or the same speed.

The Remove – Contract – Split algorithm selects the z-variables.

(Figure: Remove – Contract – Split on a small graph over x_1, x_2, x_3, x_4, introducing slow connections z_1, z_3.)
Making the assumptions true – approximating the function

The intuition: recursively "chip away" small parts of the function.

(Figure: a loop over x_1, x_2, x_3 with factor f_2(x_2, x_3); successive stages chip the factor away onto the slow connection z_1.)
Existing approximation algorithms

• Clustering: triangulation [Pearl, 88]
• Loopy Belief Revision [McEliece, 98]
• Bethe-Kikuchi Free-Energy: CCCP [Yuille, 02]
• Tree Re-Parametrization (TRP) [Wainwright, 03]
Experimental Results

(Chart: average approximation and average match, 1000 samples, 31 nodes, 4 values per node, for: weak z-minor with different speeds; weak z-minor with same speed; random z-variable selection; LBR with 50 messages; LBR with 10 messages; ignore siblings.)
Experimental Results

(Chart: average approximation and average match, 1000 samples, 31 nodes, 2 values per node, for: weak z-minor with different speeds; weak z-minor with same speed; random z-variable selection; LBR with 50 messages; LBR with 10 messages; ignore siblings.)
Maximum MI vs. Minimum P_E
Classification Specifics

• How do we classify a new example? MAP:

C = argmax_C P(C | F; θ)

• What are "the best" features and parameters? Maximize MI:

(F, θ) = argmax_{F, θ} MI(C; F; θ)

• Why maximize MI? It is tightly related to P_E (more reasons – if time permits):

MI(C; F) = H(C) − H(C | F)
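The identity MI(C; F) = H(C) − H(C | F) can be checked on a small joint table (the values below are made up):

```python
import math

# Hypothetical joint P(C, F) for binary C and F.
p = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

def entropy(dist):
    return -sum(v * math.log2(v) for v in dist.values() if v > 0)

p_c = {c: p[c, 0] + p[c, 1] for c in (0, 1)}
p_f = {f: p[0, f] + p[1, f] for f in (0, 1)}

# H(C|F) via the chain rule H(C|F) = H(C,F) - H(F).
h_c_given_f = entropy(p) - entropy(p_f)
# MI from its definition as a KL divergence from independence.
mi = sum(v * math.log2(v / (p_c[c] * p_f[f])) for (c, f), v in p.items())
```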
MaxMI Training - The Past - Reasons

• Why did it work? Conditional independence given C was assumed! Under that assumption the pairwise selection score

min_j [ MI(C; F_i, F_j; θ_i, θ_j) − MI(C; F_j; θ_j) ]

approximates the true MI gain of adding F_i; without it, it only increases an upper bound on the MI.

• What was missing? Maximizing the "whole" MI. Learning the model structure.

(Figure: flat model – C connected directly to features 1…6.)
MaxMI Training – JT

The Junction Tree (JT) structure = the TAN structure. GDL is exponential in the TREEWIDTH; here TREEWIDTH = 2, so each term MI(F_i; C | F_{π(i)}; θ_i, θ_{π(i)}) is computed efficiently.
MaxMI Training – EM

Why not EM? The EM algorithm [Redner, Walker, 84] trains CPTs by maximizing the likelihood over the training set:

θ = argmax_θ Σ_{y ∈ Y} log p(y; θ)

EM assumes static training data – not true in our scenario!
MaxMI Training – MaxMI hybrid solution

[Chow, Liu 68] "Best" Feature Tree:
w_FT(F_i, F_{π(i)}) = MI(F_i; F_{π(i)}; θ_i, θ_{π(i)})

[Friedman, et al. 97] "Best" TAN:
w_TAN(F_i, F_{π(i)}) = MI(F_i; F_{π(i)} | C; θ_i, θ_{π(i)})

[We, 2004] Maximal MI:
w_MM(F_i, F_{π(i)}) = MI(F_i; C | F_{π(i)}; θ_i, θ_{π(i)})
MaxMI Training – MaxMI hybrid solution

By the chain rule of mutual information:

w_MM(F_i, F_j) = w_TAN(F_i, F_j) − w_FT(F_i, F_j) + MI(F_i; C; θ_i)

Summing over tree edges:

Score_MM = Score_TAN − Score_FT + Σ_i MI(F_i; C; θ_i)

so argmax_TAN Score_TAN equals argmax_TAN (Score_MM + Score_FT):
• Increase of Score_MM alone: not guaranteed in general.
• Non-decrease of Score_TAN: holds, and gives TAN asymptotic correctness.
MaxMI Training – MaxMI hybrid

Convergent variant: interpolate the edge weights with a mixing coefficient λ ∈ [0, 1],

w_MM(F_i, F_{π(i)}) = w_TAN(F_i, F_{π(i)}) − (1 − λ) · w_FT(F_i, F_{π(i)}) + MI(F_i; C; θ_i)

where λ = 0 recovers the exact chain-rule identity.
MaxMI Training – empirical results

(Figure: detection results before and after training.)
MaxMI Training – empirical results

(Chart: Face Parts Model, classification errors on the test DB of 2257 examples – misses, false alarms, and total errors for MaxMI vs. original training, MaxMI+TAN constrained, MaxMI+TAN greedy, MaxMI+TAN hybrid, MaxMI (threshold only), and MaxMI+TAN O&U soft EM.)
MaxMI Training – empirical results

Face Parts Model (test DB: 2257 examples, training DB: 767 examples, class entropy on training DB: 0.792690834)

Training method                                        | MI model to class (training DB) | Errors (test DB) | Errors (training DB)
MaxMI Training                                         | 0.758242464                     | 135              | 25
Original Training                                      | 0.722429352                     | 136              | 35
MaxMI Training with constrained TAN restructure        | 0.756855168                     | Miss=62, FA=36   | Miss=15, FA=3
MaxMI Training with greedy TAN restructure             | 0.746516913                     | Miss=30, FA=44   | Miss=16, FA=3
Alternative MaxMI Training with TAN restructure        | 0.74711484                      | Miss=33, FA=109  | N/A
Threshold-only training (without restructure)          | 0.738676981                     | Miss=84, FA=46   | Miss=30, FA=5
O&U model training (from all-observed model + soft EM) | N/A                             | 67               | N/A
MaxMI Training – empirical results

Cow Parts Model (test DB: 2256 examples, training DB: 961 examples, class entropy on training DB: 0.46535663; MI model to class: N/A)

Training method                                        | Errors (test DB) | Errors (training DB)
Original Training                                      | Miss=84, FA=64   | Miss=36, FA=16
MaxMI Training                                         | Miss=53, FA=42   | Miss=25, FA=17
MaxMI Training with constrained TAN restructure        | Miss=32, FA=48   | Miss=17, FA=12
MaxMI Training with greedy TAN restructure             | Miss=59, FA=30   | Miss=23, FA=16
O&U model training (from all-observed model + soft EM) | 89               | N/A
Remove – Contract – Split

(Figure: illustration of the Remove – Contract – Split z-selection algorithm.)
Making the assumptions true

Approximating the function:
• Strong z-minor – Challenge: selecting proper Z constants. Benefit: single-step convergence.
• Weak z-minor – Drawback: exponential in the number of "chips". Benefit: less restrictive.

Chipping scheme: a factor g(y, x) is replaced in stages by mixtures of g(y, x), g(y, Z) with the constant Z, and g(y, z) with the slow copy z, so each stage moves a small part of the function onto the slow connection.
The clique tree

(Figure: a clique tree over cliques C_1, C_2, C_3, C_4, …, C_k; cliques carry node functions f_i(v_i), f_j(v_j) and edges carry the weights w_ij = log f(v_i, v_j).)
Experimental Results

A2 (same "slow" speed):
Model Size                                 | Nodes | Values | Samples | Avg. Approximation | Avg. Mismatch | Avg. Match (%)
Depth=3, Branching=5                       | 31    | 4      | 1000    | 94.11%             | 15-16         | 50.31%
Depth=3, Branching=5                       | 31    | 3      | 1000    | 94.55%             | 11-12         | 63.70%
Depth=3, Branching=5                       | 31    | 2      | 1000    | 97.16%             | 4-5           | 84.60%
Natural feature trees, 4 cliques of size 7 | 25    | 2      | ~2000   | 98.34%             | 1-2           | 93.62%

A2 (different "slow" speed):
Model Size                                 | Nodes | Values | Samples | Avg. Approximation | Avg. Mismatch | Avg. Match (%)
Depth=3, Branching=5                       | 31    | 4      | 1000    | 98.26%             | 10-11         | 65.22%
Depth=3, Branching=5                       | 31    | 3      | 1000    | 98.08%             | 7-8           | 74.51%
Depth=3, Branching=5                       | 31    | 2      | 1000    | 98.55%             | 3-4           | 88.62%
Natural feature trees, 4 cliques of size 7 | 25    | 2      | ~2000   | 97.85%             | 3-4           | 86.14%
Experimental Results

Random slow connections:
Model Size                                 | Nodes | Values | Samples | Avg. Approximation | Avg. Mismatch | Avg. Match (%)
Depth=3, Branching=5                       | 31    | 4      | 1000    | 82.70%             | 20-21         | 34.58%
Depth=3, Branching=5                       | 31    | 3      | 1000    | 81.52%             | 16-17         | 45.48%
Depth=3, Branching=5                       | 31    | 2      | 1000    | 79.37%             | 11-12         | 62.23%
Natural feature trees, 4 cliques of size 7 | 25    | 2      | ~2000   | N/A                | N/A           | N/A

Loopy Belief Revision (50 messages per node):
Model Size                                 | Nodes | Values | Samples | Avg. Approximation | Avg. Mismatch | Avg. Match (%)
Depth=3, Branching=5                       | 31    | 4      | 1000    | N/A                | N/A           | N/A
Depth=3, Branching=5                       | 31    | 3      | 1000    | 89.17%             | 13-14         | 55.31%
Depth=3, Branching=5                       | 31    | 2      | 1000    | 88.73%             | 8-9           | 72.80%
Natural feature trees, 4 cliques of size 7 | 25    | 2      | ~2000   | 93.34%             | 3-4           | 87.73%
Experimental Results

Loopy Belief Revision (10 messages per node):
Model Size                                 | Nodes | Values | Samples | Avg. Approximation | Avg. Mismatch | Avg. Match (%)
Depth=3, Branching=5                       | 31    | 4      | 1000    | 87.65%             | 17-18         | 41.95%
Depth=3, Branching=5                       | 31    | 3      | 1000    | 86.74%             | 14-15         | 54.02%
Depth=3, Branching=5                       | 31    | 2      | 1000    | 85.78%             | 8-9           | 71.80%
Natural feature trees, 4 cliques of size 7 | 25    | 2      | ~2000   | N/A                | N/A           | N/A

Ignore sibling loopy links:
Model Size                                 | Nodes | Values | Samples | Avg. Approximation | Avg. Mismatch | Avg. Match (%)
Depth=3, Branching=5                       | 31    | 4      | 1000    | 74.04%             | 21-22         | 29.25%
Depth=3, Branching=5                       | 31    | 3      | 1000    | 71.89%             | 19-20         | 38.56%
Depth=3, Branching=5                       | 31    | 2      | 1000    | 69.38%             | 13-14         | 56.09%
Natural feature trees, 4 cliques of size 7 | 25    | 2      | ~2000   | 73.45%             | 9-10          | 63.88%
MaxMI Training – extensions

• Observed and unobserved model: MaxMI augmented to support O&U training; observed-only training + EM heuristic; complete training.
• Constrained and greedy TAN restructure.
• MaxMI vs. MinPE in the ideal scenario – characterization and comparison.
• Future research directions.
MaxMI vs. MinPE

MinPE:
(F, θ, model structure) = argmin_{F, θ, structure} P_E( argmax_C P(C | F; θ) )

MaxMI:
(F, θ, model structure) = argmax_{F, θ, structure} MI(C; F; θ)

Fano & inverse Fano (binary C):
H(C | F; θ_F) ≤ H(P_E)
P_E ≤ (1/2) · H(C | F; θ_F)
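Both bounds can be verified numerically; in the sketch below P_E is the Bayes error of the MAP classifier argmax_C P(C | F), and the joint values are hypothetical:

```python
import math

def h2(p):
    """Binary entropy in bits."""
    return 0.0 if p in (0.0, 1.0) else -p * math.log2(p) - (1 - p) * math.log2(1 - p)

# Hypothetical joint P(C, F) for binary C and F.
p = {(0, 0): 0.35, (0, 1): 0.15, (1, 0): 0.10, (1, 1): 0.40}
p_f = {f: p[0, f] + p[1, f] for f in (0, 1)}

# Bayes error of argmax_c P(c | F): for each f, the mass of the losing class.
p_err = sum(min(p[0, f], p[1, f]) for f in (0, 1))
# H(C | F) = sum_f P(f) * H(C | F=f).
h_c_given_f = sum(p_f[f] * h2(p[0, f] / p_f[f]) for f in (0, 1))

fano_holds = h_c_given_f <= h2(p_err)        # Fano
inv_fano_holds = p_err <= h_c_given_f / 2.0  # inverse Fano (binary C)
```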
MaxMI vs. MinPE – ideal scenario

Setting: n-valued C, k-valued F.

MinPE:
Arrange the class values by decreasing probability, P(C = i−1) ≥ P(C = i) for i = 2,…,n.
Select F so that each feature value predicts one of the k most probable classes:
∀ i = 1,…,k: i = argmax_j P(C = j | F = i).

MaxMI:
Divide the class values into k groups of maximal entropy:
{A_1,…,A_k} = argmax_{A_1 ∪ … ∪ A_k = {1,…,n}} H(A), with P(A_j) = Σ_{i ∈ A_j} P(C = i).
Select F to indicate the group:
∀ j = 1,…,k: P(C = i | F = j) = P(C = i) / P(A_j) for i ∈ A_j, and 0 otherwise.
MaxMI vs. MinPE – ideal scenario

In general MaxMI ≠ MinPE; in special cases MaxMI = MinPE.

Implications: with an increase in the number of allowed guesses, MaxMI → MinPE.