Page 1: “Ideal Parent”  Structure Learning

“Ideal Parent” Structure Learning

School of Engineering & Computer Science

The Hebrew University, Jerusalem, Israel

Gal Elidan

with Iftach Nachman and Nir Friedman

Page 2: “Ideal Parent”  Structure Learning

Learning Structure

Input: variables and data (instances).
Output: a network structure over the variables (e.g., S, C, E, D).

Search procedure:
Init: start with an initial structure.
1. Consider local changes (candidate structures over S, C, E, D).
2. Score each candidate (e.g., -17.23, -19.19, -23.13).
3. Apply the best modification.

Problems:
- Need to score many candidates.
- Each one requires costly parameter optimization.
Structure learning is often impractical.

The "Ideal Parent" approach (a concrete sketch of the plain loop follows below):
- Approximate the improvements of changes (fast).
- Optimize & score only promising candidates (slow).
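The loop above can be made concrete. A minimal illustrative sketch (not the authors' implementation) follows: it uses a linear Gaussian, BIC-style family score, considers only add-parent moves, and marks where the ideal-parent approximation would prune candidates before the costly scoring step; the function names (bic_family, creates_cycle, greedy_search) are assumptions of this sketch.

```python
import numpy as np

def bic_family(data, child, parents):
    """Gaussian log-likelihood of one family minus a BIC penalty."""
    n = data.shape[0]
    X = np.column_stack([data[:, parents], np.ones(n)]) if parents else np.ones((n, 1))
    y = data[:, child]
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    sigma2 = max(resid @ resid / n, 1e-12)
    return -0.5 * n * (np.log(2 * np.pi * sigma2) + 1.0) - 0.5 * np.log(n) * (len(parents) + 2)

def creates_cycle(parents, child, cand):
    """True if making `cand` a parent of `child` would close a directed cycle."""
    stack, seen = [cand], set()
    while stack:
        node = stack.pop()
        if node == child:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(parents[node])
    return False

def greedy_search(data, max_parents=3):
    n_vars = data.shape[1]
    parents = {i: [] for i in range(n_vars)}              # Init: empty structure
    while True:
        best_delta, best_move = 0.0, None
        for child in range(n_vars):                       # 1. consider local changes
            if len(parents[child]) >= max_parents:
                continue
            base = bic_family(data, child, parents[child])
            for cand in range(n_vars):
                if cand == child or cand in parents[child]:
                    continue
                if creates_cycle(parents, child, cand):
                    continue
                # In the ideal-parent approach, most candidates would be
                # filtered here by the cheap approximation before scoring.
                delta = bic_family(data, child, parents[child] + [cand]) - base   # 2. score
                if delta > best_delta:
                    best_delta, best_move = delta, (child, cand)
        if best_move is None:
            return parents
        child, cand = best_move
        parents[child].append(cand)                       # 3. apply best modification

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    s = rng.normal(size=500)
    c = 0.8 * s + rng.normal(scale=0.3, size=500)
    e = 0.5 * s - 0.7 * c + rng.normal(scale=0.3, size=500)
    print(greedy_search(np.column_stack([s, c, e])))
```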

Page 3: “Ideal Parent”  Structure Learning

Linear Gaussian Networks

[Figure: an example network over A, B, C, D, E; each variable has a linear Gaussian CPD, e.g., P(E | C).]
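For reference, a linear Gaussian CPD has the standard form (the weights w and the variance sigma^2 are the parameters fitted whenever a family is scored):

\[
P(X \mid u_1, \ldots, u_k) \;=\; \mathcal{N}\!\Big(w_0 + \sum_{i=1}^{k} w_i u_i,\; \sigma^2\Big).
\]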

Page 4: “Ideal Parent”  Structure Learning

The "Ideal Parent" Idea

Goal: score only promising candidates.

[Figure: a child X with parents U; across the instances, the parent profile gives the prediction Pred(X|U), which is compared to the child profile.]

Page 5: “Ideal Parent”  Structure Learning

The "Ideal Parent" Idea

Goal: score only promising candidates.

Step 1: Compute the optimal hypothetical parent Y (the ideal profile), i.e., the profile for which Pred(X|U,Y) matches the child profile across the instances.
Step 2: Search the potential parents Z1, Z2, Z3, Z4 for one whose profile is "similar" to the ideal profile.

[Figure: parent and child profiles over the instances, the ideal profile Y, and the candidate parent profiles Z1-Z4.]

Page 6: “Ideal Parent”  Structure Learning

The "Ideal Parent" Idea

Goal: score only promising candidates.

Step 1: Compute the optimal hypothetical parent Y (the ideal profile) so that Pred(X|U,Y) matches the child profile.
Step 2: Search the potential parents Z1, Z2, Z3, Z4 for one whose profile is "similar" to the ideal profile (here Z2).
Step 3: Add the new parent Z2, optimize the parameters, and obtain the predicted profile Pred(X|U,Z).
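For the linear Gaussian case, Steps 1-2 can be sketched as below, under the assumption (made explicit here) that the ideal profile is the residual left after predicting X from its current parents, and that candidates are ranked by how strongly their profiles align with that residual. The function names are illustrative.

```python
# Sketch of Steps 1-2 for a linear Gaussian family: the ideal profile is taken
# to be the residual of the current prediction Pred(X|U), and candidate parents
# are ranked by how well they align with it. Illustrative, not the authors' code.
import numpy as np

def ideal_profile(x, U):
    """Residual of the child profile x after a least-squares fit on parents U."""
    n = x.shape[0]
    A = np.column_stack([U, np.ones(n)]) if U.shape[1] else np.ones((n, 1))
    coef, *_ = np.linalg.lstsq(A, x, rcond=None)
    return x - A @ coef                                   # Step 1: profile of Y

def rank_candidates(y, candidates):
    """Order candidate profiles by normalized alignment with the ideal profile."""
    sims = {name: abs(y @ z) / (np.linalg.norm(z) + 1e-12)
            for name, z in candidates.items()}
    return sorted(sims.items(), key=lambda kv: kv[1], reverse=True)   # Step 2

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    U = rng.normal(size=(200, 2))                         # existing parents
    z_good = rng.normal(size=200)                         # a genuinely useful parent
    x = U @ np.array([1.0, -0.5]) + 2.0 * z_good + rng.normal(scale=0.2, size=200)
    cands = {"Z1": rng.normal(size=200), "Z2": z_good, "Z3": rng.normal(size=200)}
    print(rank_candidates(ideal_profile(x, U), cands))    # Z2 should rank first
```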

Page 7: “Ideal Parent”  Structure Learning

Choosing the best parent Z

Our goal: choose the Z that maximizes the gain in likelihood, i.e., the likelihood of X with parents U and Z minus the likelihood of X with parents U alone.

We define a similarity measure between the ideal profile y and a candidate profile z.

Theorem: this similarity gives the likelihood improvement when only z is optimized.
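As a hedged reconstruction of this similarity (assuming the standard inner-product form, with y the ideal profile, z the candidate's profile, and sigma^2 the fixed variance of the child's CPD):

\[
C_1(\vec{y}, \vec{z}) \;=\; \frac{\langle \vec{y}, \vec{z} \rangle^{2}}{2\,\sigma^{2}\,\langle \vec{z}, \vec{z} \rangle}.
\]

The best candidate is then the z with the largest C_1 against the ideal profile y.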

Page 8: “Ideal Parent”  Structure Learning

Similarity vs. Score

[Plots: the true score of candidate parents vs. the C2 similarity and vs. the C1 similarity.]

C2 is more accurate: in C1 the effect of the fixed variance is large. C1 will be useful later.

We now have an efficient approximation for the score.

Page 9: “Ideal Parent”  Structure Learning

Ideal Parent in Search

Structure search involves:
- Add parent: O(N²) candidates
- Replace parent: O(N·E) candidates
- Delete parent: O(E) candidates
- Reverse edge: O(E) candidates

[Figure: candidate structures over S, C, E, D with scores -17.23, -19.19, -23.13.]

The vast majority of evaluations are replaced by the ideal approximation; only K candidates per family are optimized and scored (see the sketch below).
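A sketch of how the approximation plugs into the loop: every add-parent candidate receives the cheap similarity against the ideal profile, and only the K most promising candidates are fully optimized and scored. The similarity form and the helper names are assumptions of this sketch.

```python
# Sketch of the K-candidate filter: approximate every add-parent move with the
# cheap similarity, then fully optimize and score only the K most promising.
# `full_score` stands in for the costly step; names are illustrative.
import numpy as np

def ideal_profile(x, U):
    n = x.shape[0]
    A = np.column_stack([U, np.ones(n)]) if U.shape[1] else np.ones((n, 1))
    coef, *_ = np.linalg.lstsq(A, x, rcond=None)
    return x - A @ coef

def full_score(x, U):
    """Costly step: refit the family and return its Gaussian log-likelihood."""
    n = x.shape[0]
    A = np.column_stack([U, np.ones(n)])
    coef, *_ = np.linalg.lstsq(A, x, rcond=None)
    sigma2 = max(float(np.mean((x - A @ coef) ** 2)), 1e-12)
    return -0.5 * n * (np.log(2 * np.pi * sigma2) + 1.0)

def best_add_parent(x, U, candidates, K=3):
    y = ideal_profile(x, U)
    sims = [(abs(y @ z) / (np.linalg.norm(z) + 1e-12), j)
            for j, z in enumerate(candidates)]                    # fast approximation
    shortlist = [j for _, j in sorted(sims, reverse=True)[:K]]
    scored = [(full_score(x, np.column_stack([U, candidates[j]])), j)
              for j in shortlist]                                 # slow step, K only
    return max(scored)

if __name__ == "__main__":
    rng = np.random.default_rng(2)
    U = rng.normal(size=(300, 1))
    cands = [rng.normal(size=300) for _ in range(20)]
    x = 0.7 * U[:, 0] + 1.5 * cands[7] + rng.normal(scale=0.1, size=300)
    print(best_add_parent(x, U, cands, K=3))                      # candidate 7 wins
```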

Page 10: “Ideal Parent”  Structure Learning

Gene Expression Experiment

4 gene expression datasets with 44 (Amino), 89 (Metabolism) and 173 (2 × Conditions) variables.

[Plots: test log-likelihood and speedup relative to greedy as a function of K (1-5), for Amino, Metabolism, Conditions (AA) and Conditions (Met).]

Only 0.4%-3.6% of the changes are evaluated. Speedup: 1.8-2.7.

Page 11: “Ideal Parent”  Structure Learning

Scope

Conditional probability distributions (CPDs) of the form: a link function plus white noise (the form is spelled out below).

General requirement: g(U) can be any function that is invertible w.r.t. each u_i.

Examples: Linear Gaussian, Chemical Reaction, Sigmoid Gaussian.
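Spelled out as an assumed reconstruction, the family of CPDs and two example link functions are

\[
X \;=\; g(u_1, \ldots, u_k ; \theta) + \epsilon, \qquad \epsilon \sim \mathcal{N}(0, \sigma^2),
\]

with, for example, the linear Gaussian link g(u; w) = w_0 + \sum_i w_i u_i and (in one common parameterization) the sigmoid Gaussian link g(u; w) = \big(1 + e^{-(w_0 + \sum_i w_i u_i)}\big)^{-1}.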

Page 12: “Ideal Parent”  Structure Learning

Sigmoid Gaussian CPD

Problem: there is no simple closed form for the similarity measures.

[Plots: the sigmoid link g(z); the likelihoods P(X = 0.5 | Z) and P(X = 0.85 | Z); the exact ideal values Y(0.5) and Y(0.85) vs. a linear approximation around Y = 0.]

Solution: use a linear approximation around Y = 0; the sensitivity to Z then depends on the gradient at the specific instance (see the expansion below).
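In effect this is a first-order expansion of the link in the hypothetical parent y (treated as an extra argument, as in Pred(X|U,Y)); a sketch under that assumption:

\[
g(\vec{u}, y) \;\approx\; g(\vec{u}, 0) + y \left.\frac{\partial g(\vec{u}, y)}{\partial y}\right|_{y=0}
\qquad\Longrightarrow\qquad
y \;\approx\; \frac{x - g(\vec{u}, 0)}{\left.\partial g(\vec{u}, y)/\partial y \right|_{y=0}},
\]

so the sensitivity of each instance is set by the gradient of the link at that instance, which is exactly what the gradient correction on the next slide rescales.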

Page 13: “Ideal Parent”  Structure Learning

Sigmoid Gaussian CPD

[Plots: equi-likelihood potential of Z for X = 0.5 vs. X = 0.85, before and after the gradient correction (Z rescaled by the per-instance gradients, 0.25 for g = 0.5 and 0.1275 for g = 0.85).]

After the gradient correction we can use the same similarity measure as before.

Page 14: “Ideal Parent”  Structure Learning

Sigmoid Gene Expression

4 gene expression datasets with 44 (Amino), 89 (Metabolism) and 173 (Conditions) variables.

[Plots: test log-likelihood vs. K (0-20) for Amino, Metabolism, Conditions (AA) and Conditions (Met), compared to greedy; speedup vs. K.]

Only 2.2%-6.1% of the moves are evaluated; 18-30 times faster.

Page 15: “Ideal Parent”  Structure Learning

Adding New Hidden Variables

Idea: introduce a hidden parent for nodes with similar ideal profiles.

For the linear Gaussian case we obtain a bound on the improvement. Challenge: find the hidden profile that maximizes this bound.

[Figure: ideal profiles Y1-Y5 of nodes X1-X5 across the instances; a hidden parent H is added for X1, X2, X4, whose ideal profiles are similar.]

Page 16: “Ideal Parent”  Structure Learning

Scoring a parent

The hidden profile must lie in the span of the cluster members' ideal profiles, i.e., of the columns of the matrix formed from them. Setting the problem up this way (with A invertible), the score becomes a Rayleigh quotient of that matrix, and the optimum is the eigenvector with the largest eigenvalue. Finding h* therefore amounts to solving an eigenvector problem whose size, |A|, is the size of the cluster (a sketch follows below).
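A minimal sketch of that computation, assuming the Rayleigh quotient is over the Gram matrix of the cluster members' ideal profiles (so the problem size equals the cluster size) and that the optimum lies in their span; the names are illustrative.

```python
# Sketch: the optimal hidden-parent profile h* from the |A| x |A| Gram matrix
# of the cluster members' ideal profiles (columns of Y). Assumes the
# Rayleigh-quotient formulation described above; not the authors' code.
import numpy as np

def best_hidden_profile(Y):
    """Return (h*, largest eigenvalue) for a matrix Y of ideal profiles."""
    G = Y.T @ Y                              # |A| x |A|, |A| = cluster size
    eigvals, eigvecs = np.linalg.eigh(G)     # eigenvalues in ascending order
    a = eigvecs[:, -1]                       # leading eigenvector
    h = Y @ a                                # h* lies in the span of the profiles
    return h / np.linalg.norm(h), float(eigvals[-1])

if __name__ == "__main__":
    rng = np.random.default_rng(3)
    h_true = rng.normal(size=100)            # shared hidden cause
    Y = np.column_stack([2.0 * h_true, -1.5 * h_true, 0.8 * h_true]) \
        + rng.normal(scale=0.1, size=(100, 3))
    h_star, top = best_hidden_profile(Y)
    print(round(abs(np.corrcoef(h_star, h_true)[0, 1]), 3))   # close to 1
```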

Page 17: “Ideal Parent”  Structure Learning

Finding the best Cluster

Compute pairwise scores between the ideal profiles of X1-X4, using the scoring above; each pairwise term is computed only once:

{X1, X2}: 12.35
{X1, X3}: 14.12
{X3, X4}: 3.11

Page 18: “Ideal Parent”  Structure Learning

Finding the best Cluster

Start from the highest-scoring pair, {X1, X3} (14.12). Adding X2 raises the cluster score to 18.45; further adding X4 lowers it to 16.79.

Select the cluster with the highest score, add the hidden parent, and continue with the search (a sketch of this agglomeration follows below).
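A sketch of this agglomeration: pair scores are computed once, the best pair seeds the cluster, and members are added greedily while the score improves. The score function here is a hypothetical stand-in (the largest Gram eigenvalue of the ideal profiles minus a size penalty), not the actual cluster score from the previous slides.

```python
# Sketch of the greedy agglomeration: score all pairs of ideal profiles once,
# seed the cluster with the best pair, then keep adding the variable that most
# improves the cluster score, stopping when no addition helps. The score used
# here is a hypothetical stand-in, not the actual score from the slides.
import numpy as np

def stand_in_score(profiles, members):
    Y = np.column_stack([profiles[i] for i in members])
    return float(np.linalg.eigvalsh(Y.T @ Y)[-1]) - 20.0 * len(members)

def find_best_cluster(profiles, score_fn):
    n = len(profiles)
    pairs = {(i, j): score_fn(profiles, [i, j])
             for i in range(n) for j in range(i + 1, n)}          # compute only once
    members = list(max(pairs, key=pairs.get))                     # best-scoring pair
    best = pairs[tuple(members)]
    while True:
        gains = [(score_fn(profiles, members + [c]), c)
                 for c in set(range(n)) - set(members)]
        if not gains or max(gains)[0] <= best:
            return members, best
        best, new_member = max(gains)
        members.append(new_member)

if __name__ == "__main__":
    rng = np.random.default_rng(4)
    h = rng.normal(size=80)                                       # shared hidden cause
    profiles = [h + rng.normal(scale=0.2, size=80) for _ in range(3)] \
             + [rng.normal(size=80) for _ in range(2)]
    print(find_best_cluster(profiles, stand_in_score))            # the three correlated profiles
```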

Page 19: “Ideal Parent”  Structure Learning

Bipartite Network

Instances from a biological expert network with 7 (hidden) parents and 141 (observed) children.

[Plots: test log-likelihood and train log-likelihood as a function of the number of instances (10-100), for Greedy, Ideal K=2, Ideal K=5 and Gold.]

The speedup is roughly x10; greedy takes over 2.5 days!

Page 20: “Ideal Parent”  Structure Learning

Summary
- New method for significantly speeding up structure learning in continuous variable networks
- Offers a promising time vs. performance tradeoff
- Guided insertion of new hidden variables

Future work
- Improve cluster identification for the non-linear case
- Explore additional distributions and the relation to GLMs
- Combine the ideal parent approach as a plug-in with other search approaches

