Bayesian Logic Programs

Kristian Kersting, Luc De Raedt
Machine Learning Lab, Albert-Ludwigs-University, Freiburg, Germany

Summer School on Relational Data Mining
17 and 18 August 2002, Helsinki, Finland

Context

Real-world applications involve uncertainty and complex, structured domains.

• Logic: objects, relations, functors
• Probability theory: discrete, continuous

Bayesian networks + Logic Programming (Prolog) → Bayesian logic programs

Outline

• Bayesian Logic Programs
  • Examples and Language
  • Semantics and Support Networks
• Learning Bayesian Logic Programs
  • Data Cases
  • Parameter Estimation
  • Structural Learning

Bayesian Logic Programs

• Probabilistic models structured using logic
• Extend Bayesian networks with notions of objects and relations
• Probability density over (countably) infinitely many random variables
• Flexible discrete-time stochastic processes
• Generalize pure Prolog, Bayesian networks, dynamic Bayesian networks, dynamic Bayesian multinets, hidden Markov models, ...

Bayesian Networks

• One of the successes of AI
• State-of-the-art to model uncertainty, in particular the degree of belief
• Advantage [Russell, Norvig 96]: "strict separation of qualitative and quantitative aspects of the world"
• Disadvantage [Breese, Ngo, Haddawy, Koller, ...]: propositional character, no notion of objects and relations among them

Stud farm (Jensen '96)

• The colt John has been born recently on a stud farm.
• John suffers from a life-threatening hereditary disease carried by a recessive gene. The disease is so serious that John is displaced instantly, and since the stud farm wants the gene out of production, his parents are taken out of breeding.
• What are the probabilities for the remaining horses to be carriers of the unwanted gene?

Bayesian networks [Pearl '88]

Based on the stud farm example [Jensen '96].

[Figure: Bayesian network over the nodes bt_ann, bt_brian, bt_cecily, bt_unknown1, bt_unknown2, bt_dorothy, bt_eric, bt_gwenn, bt_fred, bt_henry, bt_irene and bt_john. The parents Pa(bt_john) are bt_henry and bt_irene.]

(Conditional) probability distribution of bt_john:

P(bt_john)        bt_henry  bt_irene
(1.0,0.0,0.0)     aa        aa
(0.5,0.5,0.0)     aa        aA
(0.0,1.0,0.0)     aa        AA
...
(0.33,0.33,0.33)  AA        AA

Example results:
P(bt_cecily=aA | bt_john=aA) = 0.1499
P(bt_john=AA | bt_ann=aA) = 0.6906
P(bt_john=AA) = 0.9909

Bayesian networks (contd.)

• acyclic graphs
• probability distribution over a finite set of random variables $X_1, \dots, X_n$:

$$
\begin{aligned}
P(X_1, \dots, X_n) &= P(X_1 \mid X_2, \dots, X_n) \cdot P(X_2 \mid X_3, \dots, X_n) \cdots P(X_n) \\
&= P(X_1 \mid \mathrm{Pa}(X_1)) \cdot P(X_2 \mid \mathrm{Pa}(X_2)) \cdots P(X_n \mid \mathrm{Pa}(X_n)) \\
&= \prod_{i=1}^{n} P(X_i \mid \mathrm{Pa}(X_i))
\end{aligned}
$$
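As a small illustration of this factorization, the sketch below computes a joint distribution as a product of local conditional distributions. The three binary variables, their chain structure and all numbers are invented for the example and are not taken from the stud farm network.

```python
from itertools import product

# Hypothetical chain X3 -> X2 -> X1 over binary variables (not from the slides).
# parents[X] lists Pa(X); cpd[X] maps the parent values to P(X = 1 | Pa(X)).
parents = {"X1": ("X2",), "X2": ("X3",), "X3": ()}
cpd = {
    "X1": {(0,): 0.1, (1,): 0.8},
    "X2": {(0,): 0.3, (1,): 0.6},
    "X3": {(): 0.5},
}


def joint(assignment):
    """Return P(X1, ..., Xn) as the product of the local P(Xi | Pa(Xi))."""
    p = 1.0
    for var, pa in parents.items():
        p_true = cpd[var][tuple(assignment[q] for q in pa)]
        p *= p_true if assignment[var] == 1 else 1.0 - p_true
    return p


# Sum the joint over all 2^3 assignments.
total = sum(
    joint(dict(zip(parents, values)))
    for values in product((0, 1), repeat=len(parents))
)
```

Summing `joint` over all eight assignments gives 1 (up to rounding), as required for a probability distribution.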

From Bayesian Networks to Bayesian Logic Programs

bt_ann. bt_brian. bt_cecily.
bt_unknown1. bt_unknown2.
bt_dorothy | bt_ann, bt_brian.
bt_eric | bt_brian, bt_cecily.
bt_gwenn | bt_ann, bt_unknown2.
bt_fred | bt_unknown1, bt_ann.
bt_henry | bt_fred, bt_dorothy.
bt_irene | bt_eric, bt_gwenn.
bt_john | bt_henry, bt_irene.

The body of each clause lists the parents of its head, e.g. Pa(bt_fred) = {bt_unknown1, bt_ann}.

(Conditional) probability distribution of bt_john | bt_henry, bt_irene:

P(bt_john)        bt_henry  bt_irene
(1.0,0.0,0.0)     aa        aa
(0.5,0.5,0.0)     aa        aA
(0.0,1.0,0.0)     aa        AA
...
(0.33,0.33,0.33)  AA        AA

From Bayesian Networks to Bayesian Logic Programs

% apriori nodes
bt_ann. bt_brian. bt_cecily. bt_unknown1. bt_unknown2.

% aposteriori nodes
bt_henry | bt_fred, bt_dorothy.
bt_irene | bt_eric, bt_gwenn.
bt_fred | bt_unknown1, bt_ann.
bt_dorothy | bt_brian, bt_ann.
bt_eric | bt_brian, bt_cecily.
bt_gwenn | bt_unknown2, bt_ann.
bt_john | bt_henry, bt_irene.

Domain: e.g. finite, discrete, continuous

(Conditional) probability distribution:

P(bt_john)        bt_henry  bt_irene
(1.0,0.0,0.0)     aa        aa
(0.5,0.5,0.0)     aa        aA
(0.0,1.0,0.0)     aa        AA
...
(0.33,0.33,0.33)  AA        AA

From Bayesian Networks to Bayesian Logic Programs

% apriori nodes
bt(ann). bt(brian). bt(cecily). bt(unknown1). bt(unknown2).

% aposteriori nodes
bt(henry) | bt(fred), bt(dorothy).
bt(irene) | bt(eric), bt(gwenn).
bt(fred) | bt(unknown1), bt(ann).
bt(dorothy) | bt(brian), bt(ann).
bt(eric) | bt(brian), bt(cecily).
bt(gwenn) | bt(unknown2), bt(ann).
bt(john) | bt(henry), bt(irene).

(Conditional) probability distribution:

P(bt(john))       bt(henry)  bt(irene)
(1.0,0.0,0.0)     aa         aa
(0.5,0.5,0.0)     aa         aA
(0.0,1.0,0.0)     aa         AA
...
(0.33,0.33,0.33)  AA         AA

From Bayesian Networks to Bayesian Logic Programs

% ground facts / apriori
bt(ann). bt(brian). bt(cecily). bt(unknown1). bt(unknown2).

father(unknown1,fred). mother(ann,fred).
father(brian,dorothy). mother(ann,dorothy).
father(brian,eric). mother(cecily,eric).
father(unknown2,gwenn). mother(ann,gwenn).
father(fred,henry). mother(dorothy,henry).
father(eric,irene). mother(gwenn,irene).
father(henry,john). mother(irene,john).

% rules / aposteriori
bt(X) | father(F,X), bt(F), mother(M,X), bt(M).

(Conditional) probability distribution:

P(bt(X))          father(F,X)  bt(F)  mother(M,X)  bt(M)
(1.0,0.0,0.0)     true         Aa     true         aa
...
(0.33,0.33,0.33)  false        AA     false        AA

Dependency graph = Bayesian network

[Figure: dependency graph over the ground atoms of the program. Each atom bt(X) derived by the rule has the parents father(F,X), bt(F), mother(M,X) and bt(M).]

Nodes:
bt(ann), bt(brian), bt(cecily), bt(unknown1), bt(unknown2), bt(dorothy), bt(eric), bt(gwenn), bt(fred), bt(henry), bt(irene), bt(john)
father(unknown1,fred), father(brian,dorothy), father(brian,eric), father(unknown2,gwenn), father(fred,henry), father(eric,irene), father(henry,john)
mother(ann,fred), mother(ann,dorothy), mother(cecily,eric), mother(ann,gwenn), mother(dorothy,henry), mother(gwenn,irene), mother(irene,john)

[Figure: restricted to the bt atoms, this is the Bayesian network over bt_ann, ..., bt_john from the stud farm example.]
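The step from the single rule bt(X) | father(F,X), bt(F), mother(M,X), bt(M). to the dependency graph can be sketched as a grounding procedure. The Python below is only an illustration: the dictionary encoding of the father/mother facts and the function name `dependency_graph` are assumptions made for this sketch, not part of any BLP system.

```python
# Ground facts of the stud farm program, encoded as child -> parent
# (hypothetical encoding chosen for this sketch).
father = {"fred": "unknown1", "dorothy": "brian", "eric": "brian",
          "gwenn": "unknown2", "henry": "fred", "irene": "eric",
          "john": "henry"}
mother = {"fred": "ann", "dorothy": "ann", "eric": "cecily",
          "gwenn": "ann", "henry": "dorothy", "irene": "gwenn",
          "john": "irene"}


def dependency_graph(father, mother):
    """Ground the rule bt(X) | father(F,X), bt(F), mother(M,X), bt(M).

    Returns a dict mapping every ground atom to the list of its parents.
    """
    graph = {}
    for child in father:
        f, m = father[child], mother[child]
        graph[f"bt({child})"] = [
            f"father({f},{child})", f"bt({f})",
            f"mother({m},{child})", f"bt({m})",
        ]
    # Atoms that only occur in clause bodies are facts: no parents.
    for body in list(graph.values()):
        for atom in body:
            graph.setdefault(atom, [])
    return graph


graph = dependency_graph(father, mother)
```

For example, `graph["bt(john)"]` lists father(henry,john), bt(henry), mother(irene,john) and bt(irene), while `graph["bt(ann)"]` is empty.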

Bayesian Logic Programs - a first definition

A BLP $B$ consists of
• a finite set of Bayesian clauses.
• To each clause $c$ in $B$ a conditional probability distribution $\mathrm{cpd}(c)$ is associated: $\mathrm{cpd}(c) = P(\mathrm{head}(c) \mid \mathrm{body}(c))$

• Proper random variables ~ LH(B)
• Graphical structure ~ dependency graph
• Quantitative information ~ CPDs

Bayesian Logic Programs - Examples

The logical part is pure Prolog.

Markov chain (MC):

% apriori nodes
nat(0).
% aposteriori nodes
nat(s(X)) | nat(X).

Random variables: nat(0), nat(s(0)), nat(s(s(0))), ...

Hidden Markov model (HMM):

% apriori nodes
state(0).
% aposteriori nodes
state(s(Time)) | state(Time).
output(Time) | state(Time).

Random variables: state(0), output(0), state(s(0)), output(s(0)), ...

Dynamic Bayesian network (DBN):

% apriori nodes
n1(0).
% aposteriori nodes
n1(s(TimeSlice)) | n2(TimeSlice).
n2(TimeSlice) | n1(TimeSlice).
n3(TimeSlice) | n1(TimeSlice), n2(TimeSlice).

Random variables: n1(0), n2(0), n3(0), n1(s(0)), n2(s(0)), n3(s(0)), ...
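To see how the two HMM clauses generate an unbounded set of random variables, the following sketch unrolls them for a fixed number of time points. The helper names `peano` and `unroll_hmm` and the cut-off parameter `steps` are illustrative assumptions; the program itself describes infinitely many ground atoms.

```python
def peano(n):
    """Write the natural number n as the term 0, s(0), s(s(0)), ..."""
    term = "0"
    for _ in range(n):
        term = f"s({term})"
    return term


def unroll_hmm(steps):
    """Ground atoms of the HMM program for the first `steps` time points.

    Returns a dict mapping each ground atom to the list of its parents:
    state(s(T)) depends on state(T), output(T) depends on state(T).
    """
    graph = {}
    for t in range(steps):
        time = peano(t)
        graph[f"state({time})"] = [] if t == 0 else [f"state({peano(t - 1)})"]
        graph[f"output({time})"] = [f"state({time})"]
    return graph


hmm = unroll_hmm(2)
```

With `steps = 2` the result contains state(0), output(0), state(s(0)) and output(s(0)), matching the first two time slices listed on the slide.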

Associated CPDs

• represent generically the CPD for each ground instance of the corresponding Bayesian clause.

P(bt(X))          father(F,X)  bt(F)  mother(M,X)  bt(M)
(1.0,0.0,0.0)     true         Aa     true         aa
...
(0.33,0.33,0.33)  false        AA     false        AA

Multiple ground instances of clauses having the same head atom?

Combining Rules

Multiple ground instances of clauses having the same head atom?

% ground facts as before

% rules
bt(X) | father(F,X), bt(F).
bt(X) | mother(M,X), bt(M).

These rules give cpd(bt(john) | father(henry,john), bt(henry)) and cpd(bt(john) | mother(irene,john), bt(irene)).

But we need cpd(bt(john) | father(henry,john), bt(henry), mother(irene,john), bt(irene)) !!!

Combining Rules (contd.)

P(A|B) and P(A|C) --CR--> P(A|B,C)

A combining rule (CR) is any algorithm which
• combines a set of PDFs $\{\mathrm{cpd}(A \mid A_{i1}, \dots, A_{in_i}) \mid 1 \le i \le m\}$ into the (combined) PDF $\mathrm{cpd}(A \mid B_1, \dots, B_k)$, where $\{B_1, \dots, B_k\} \subseteq \bigcup_{i=1}^{m} \{A_{i1}, \dots, A_{in_i}\}$
• has an empty output if and only if the input is empty

E.g. noisy-or, regression, ...
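Noisy-or, one of the combining rules named here, can be sketched for the Boolean case as follows. This is a simplified illustration under stated assumptions: every variable is Boolean, each input distribution is summarized by a single number P(A=true | cause=true), and an inactive cause never triggers A. The function name `noisy_or` and the numbers 0.8 and 0.5 are invented for the example.

```python
from itertools import product


def noisy_or(cause_probs):
    """Combine P(A=true | cause_i=true) = p_i into P(A=true | all causes).

    cause_probs maps a cause name to p_i. The result maps each tuple of
    truth values (ordered like cause_probs) to P(A=true | those values).
    An empty input yields an empty output.
    """
    if not cause_probs:
        return {}
    names = list(cause_probs)
    table = {}
    for values in product((False, True), repeat=len(names)):
        # A stays false only if every active cause independently fails.
        fail = 1.0
        for name, active in zip(names, values):
            if active:
                fail *= 1.0 - cause_probs[name]
        table[values] = 1.0 - fail
    return table


combined = noisy_or({"B": 0.8, "C": 0.5})
```

Here `combined[(True, True)]` is 1 - 0.2 * 0.5 = 0.9 (up to rounding), so P(A|B) and P(A|C) have been turned into one table over both B and C.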

Bayesian Logic Programs - a definition

A BLP $B$ consists of
• a finite set of Bayesian clauses.
• To each clause $c$ in $B$ a conditional probability distribution $\mathrm{cpd}(c)$ is associated: $\mathrm{cpd}(c) = P(\mathrm{head}(c) \mid \mathrm{body}(c))$
• To each Bayesian predicate $p$ a combining rule $\mathrm{cr}(p)$ is associated to combine CPDs of multiple ground instances of clauses having the same head

• Proper random variables ~ LH(B)
• Graphical structure ~ dependency graph
• Quantitative information ~ CPDs and CRs

Discrete-Time Stochastic Process

• Family of random variables $(X_t)_{t \in J}$ over a domain $X$, where the index set $J$ ranges over the time points $0, 1, 2, \dots$
• For each linearization of the partial order induced by the dependency graph, a Bayesian logic program specifies a discrete-time stochastic process

Theorem of Kolmogorov

Existence and uniqueness of probability measure:

• $X$: a Polish space
• $H(J)$: set of all non-empty, finite subsets of $J$
• $P_I$: the probability measure over $X^I$, $I \in H(J)$
• If the projective family $(P_I)_{I \in H(J)}$ exists, then there exists a unique probability measure

Consistency Conditions

• The probability measure $P_I$, $I \in H(J)$, is represented by a finite Bayesian network which is a subnetwork of the dependency graph over LH(B): the support network
• (Elimination order): All stochastic processes represented by a Bayesian logic program B specify the same probability measure over LH(B).

Support network

[Figure: dependency graph of the stud farm program over its bt, father and mother ground atoms, used to illustrate support networks.]

Support network

• The support network $N(x)$ of $x \in \mathrm{LH}(B)$ is the induced subnetwork of $S(x) = \{y \in \mathrm{LH}(B) \mid y \text{ is influencing } x\}$
• The support network $N(\mathbf{x})$ of $\mathbf{x} \subseteq \mathrm{LH}(B)$ is defined as $N(\mathbf{x}) = \bigcup_{x \in \mathbf{x}} N(x)$
• Computation utilizes And/Or trees
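For a finite ground dependency graph, a support network can be collected by walking from the query atoms back to all atoms that influence them. The sketch below uses a plain graph traversal rather than the And/Or trees mentioned on the slide. The dict-of-parents representation and the name `support_network` are assumptions of this example, and the query atoms themselves are kept in the result.

```python
def support_network(graph, query_atoms):
    """Induced subnetwork on the query atoms and everything influencing them.

    graph maps each ground atom to the list of its parents.
    """
    support = set()
    stack = list(query_atoms)
    while stack:
        atom = stack.pop()
        if atom not in support:
            support.add(atom)
            stack.extend(graph.get(atom, []))
    return {atom: list(graph.get(atom, [])) for atom in support}


# Fragment of the stud farm dependency graph (facts have no parents).
graph = {
    "bt(eric)": ["father(brian,eric)", "bt(brian)",
                 "mother(cecily,eric)", "bt(cecily)"],
    "bt(dorothy)": ["father(brian,dorothy)", "bt(brian)",
                    "mother(ann,dorothy)", "bt(ann)"],
}
support = support_network(graph, ["bt(eric)"])
```

The support network of bt(eric) contains bt(eric) and its four parents, but neither bt(dorothy) nor bt(ann), which do not influence the query.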

Queries using And/Or trees

• A probabilistic query ?- Q1, ..., Qn | E1=e1, ..., Em=em. asks for the distribution P(Q1, ..., Qn | E1=e1, ..., Em=em).
• Example: ?- bt(eric).
• An Or node is proven if at least one of its successors is provable.
• An And node is proven if all of its successors are provable.

[Figure: And/Or tree for ?- bt(eric). The Or node bt(eric) has the And node father(brian,eric), bt(brian), mother(cecily,eric), bt(cecily) as successor, whose successors are the Or nodes father(brian,eric), bt(brian), mother(cecily,eric) and bt(cecily). In the resulting support network, bt(eric) has the parents father(brian,eric), bt(brian), mother(cecily,eric) and bt(cecily), annotated with the (combined) cpd.]

Consistency Condition (contd.)

The projective family $(P_I)_{I \in H(J)}$ exists if
• the dependency graph is acyclic, and
• every random variable is influenced by a finite set of random variables only

A program satisfying these conditions is a well-defined Bayesian logic program.
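For a finite set of ground atoms the second condition holds trivially, so checking well-definedness reduces to an acyclicity test. The sketch below does this with a depth-first search and, as a by-product, returns one linearization of the partial order induced by the dependency graph. The function name `linearize` and the dict-of-parents representation are assumptions of this example.

```python
def linearize(graph):
    """Return the atoms in an order where parents precede children.

    graph maps each ground atom to the list of its parents.
    Returns None if the dependency graph contains a cycle.
    """
    order = []
    state = {}  # atom -> 1 while being visited, 2 when finished

    def visit(atom):
        if state.get(atom) == 2:
            return True
        if state.get(atom) == 1:
            return False  # reached an atom that is still open: cycle
        state[atom] = 1
        for parent in graph.get(atom, []):
            if not visit(parent):
                return False
        state[atom] = 2
        order.append(atom)
        return True

    for atom in graph:
        if not visit(atom):
            return None
    return order


acyclic = {"state(s(0))": ["state(0)"], "output(0)": ["state(0)"], "state(0)": []}
order = linearize(acyclic)
cyclic = linearize({"a": ["b"], "b": ["a"]})
```

In `order`, state(0) comes before both of its children, while the cyclic example yields `None`.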

Relational Character

The rules and their CPDs stay the same; only the ground facts change from one family to the next.

% ground facts
bt(ann). bt(brian). bt(cecily). bt(unknown1). bt(unknown2).

father(unknown1,fred). mother(ann,fred).
father(brian,dorothy). mother(ann,dorothy).
father(brian,eric). mother(cecily,eric).
father(unknown2,gwenn). mother(ann,gwenn).
father(fred,henry). mother(dorothy,henry).
father(eric,irene). mother(gwenn,irene).
father(henry,john). mother(irene,john).

% rules
bt(X) | father(F,X), bt(F), mother(M,X), bt(M).

P(bt(X))          father(F,X)  bt(F)  mother(M,X)  bt(M)
(1.0,0.0,0.0)     true         Aa     true         aa
...
(0.33,0.33,0.33)  false        AA     false        AA

% ground facts
bt(susanne). bt(ralf). bt(peter). bt(uta).

father(ralf,luca). mother(susanne,luca). ...

% ground facts
bt(petra). bt(silvester). bt(anne). bt(wilhelm). bt(beate).

father(silvester,claudien). mother(beate,claudien). father(wilhelm,marta). mother(anne, marthe). ...

Bayesian Logic Programs - Summary

• First order logic extension of Bayesian networks
• constants, relations, functors
• discrete and continuous random variables
• ground atoms = random variables
• CPDs associated to clauses
• Dependency graph = (possibly) infinite Bayesian network
• Generalize dynamic Bayesian networks and definite clause logic (range-restricted)

Applications

• Probabilistic, logical
• Description and prediction
• Regression
• Classification
• Clustering

• Computational Biology
• APrIL IST-2001-33053
• Web Mining
• Query approximation
• Planning, ...

Other frameworks

• Probabilistic Horn Abduction [Poole 93]
• Distributional Semantics (PRISM) [Sato 95]
• Stochastic Logic Programs [Muggleton 96; Cussens 99]
• Relational Bayesian Nets [Jaeger 97]
• Probabilistic Logic Programs [Ngo, Haddawy 97]
• Object-Oriented Bayesian Nets [Koller, Pfeffer 97]
• Probabilistic Frame-Based Systems [Koller, Pfeffer 98]
• Probabilistic Relational Models [Koller 99]

Learning Bayesian Logic Programs

[Figure: data and background knowledge are fed into a learning algorithm, which outputs a Bayesian logic program, shown as the clause A | E, B. together with its CPD table P(A) over the values of E and B (entries .9/.1, .7/.3, .8/.2 and .01/.99).]

Why Learning Bayesian Logic Programs?

Inductive Logic Programming + learning within Bayesian networks → learning within Bayesian Logic Programs

Of interest to different communities?
• scoring functions, pruning techniques, theoretical insights, ...

What is the data about?

A data case $D_i \in D$ is a partially observed joint state of a finite, nonempty subset $\mathbf{x} \subseteq \mathrm{LH}(B)$.

Example data cases (? marks an unobserved value):

{ m(ann,dorothy)=true, f(brian,dorothy)=true, pc(brian)=b, bt(ann)=a, bt(brian)=?, bt(dorothy)=a }

{ m(cecily,fred)=true, f(henry,fred)=true, bt(cecily)=ab, bt(henry)=b, bt(fred)=?, m(kim,bob)=true, f(fred,bob)=true, bt(kim)=?, bt(bob)=b }
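One possible in-memory representation of such data cases is a mapping from ground atoms to values, with a missing value for unobserved atoms. The sketch below encodes the two data cases from this slide in Python; using `None` for "?" and the helper names `observed` and `hidden` are assumptions of this example.

```python
# The two data cases from the slide; None stands for an unobserved value.
data_cases = [
    {
        "m(ann,dorothy)": True, "f(brian,dorothy)": True,
        "pc(brian)": "b", "bt(ann)": "a",
        "bt(brian)": None, "bt(dorothy)": "a",
    },
    {
        "m(cecily,fred)": True, "f(henry,fred)": True,
        "bt(cecily)": "ab", "bt(henry)": "b", "bt(fred)": None,
        "m(kim,bob)": True, "f(fred,bob)": True,
        "bt(kim)": None, "bt(bob)": "b",
    },
]


def observed(case):
    """Evidence part of a data case: atoms with a known value."""
    return {atom: value for atom, value in case.items() if value is not None}


def hidden(case):
    """Atoms of a data case whose value is not observed."""
    return sorted(atom for atom, value in case.items() if value is None)
```

For the first case `hidden` returns only bt(brian); for the second it returns bt(fred) and bt(kim).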

Learning Task

Given:
• a set $D = \{D_1, \dots, D_n\}$ of data cases
• a Bayesian logic program $B$

Goal: for each $c \in B$, the parameters $\lambda(c) = (c_1, \dots, c_{e(c)})$ of $\mathrm{cpd}(c)$ that best fit the given data

Parameter Estimation (contd.)

• "best fit" ~ ML-Estimation:

$$\lambda^* = \arg\max_{\lambda \in H} P_B(D \mid \lambda)$$

where the hypothesis space $H$ is spanned by the product space over the possible values of $\lambda(c)$:

$$H := \mathop{\times}_{c \in B} \lambda(c)$$

Parameter Estimation (contd.)

Assumption: $D_1, \dots, D_n$ are independently sampled from identical distributions (e.g. totally separated families), so

$$
\begin{aligned}
\lambda^* &= \arg\max_{\lambda \in H} P_B(D \mid \lambda) \\
&= \arg\max_{\lambda \in H} \ln P_B(D \mid \lambda) \\
&= \arg\max_{\lambda \in H} \ln \prod_i P_B(D_i \mid \lambda) \\
&= \arg\max_{\lambda \in H} \sum_i \ln P_B(D_i \mid \lambda)
\end{aligned}
$$
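That maximizing the likelihood of independent data cases is the same as maximizing the sum of their log-likelihoods can be checked numerically on a toy problem. The sketch below uses a single Bernoulli parameter and a grid search; the data, the grid and the function names are invented for this illustration and have nothing to do with the stud farm program.

```python
import math

# Four independent observations of one binary variable (made-up data).
data = [1, 1, 0, 1]


def likelihood(theta):
    """Product of the per-case probabilities, P(D | theta)."""
    p = 1.0
    for d in data:
        p *= theta if d == 1 else 1.0 - theta
    return p


def log_likelihood(theta):
    """Sum of the per-case log-probabilities, ln P(D | theta)."""
    return sum(math.log(theta if d == 1 else 1.0 - theta) for d in data)


# Candidate parameters 0.01, 0.02, ..., 0.99 (endpoints left out: ln 0).
grid = [k / 100 for k in range(1, 100)]
best_by_product = max(grid, key=likelihood)
best_by_log_sum = max(grid, key=log_likelihood)
```

Both searches select the same grid point, the relative frequency 3/4 of the value 1 in the data.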

Parameter Estimation (contd.)

$$
\begin{aligned}
\lambda^* &= \arg\max_{\lambda \in H} \sum_i \ln P_B(D_i \mid \lambda) \\
&= \arg\max_{\lambda \in H} \sum_i \ln P_{N(\mathrm{var}(D_i))}(D_i \mid \lambda) \\
&= \arg\max_{\lambda \in H} \sum_i \ln P_N(D_i \mid \lambda)
\end{aligned}
$$

• Second line: $P_{N(\mathrm{var}(D_i))}$ is a marginal of $P_B$, and the support network $N(\mathrm{var}(D_i))$ is an ordinary BN.
• Third line: $P_{N(\mathrm{var}(D_i))}$ is a marginal of $P_N$, where $N := \bigcup_i N(\mathrm{var}(D_i))$.

Parameter Estimation (contd.)

• Reduced to a problem within Bayesian networks: given structure, partially observed random variables
• EM [Dempster, Laird, Rubin, '77], [Lauritzen, '91]
• Gradient Ascent [Binder, Koller, Russell, Kanazawa, '97], [Jensen, '99]
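To make the EM option concrete, here is a minimal sketch for the smallest possible case: one binary parent x, one binary child y, the prior of x held fixed, and x missing in some data cases. The network, the data and all names are assumptions of this example; it is a generic textbook EM step, not the procedure of the cited papers.

```python
import math

prior = [0.5, 0.5]  # P(x = 0), P(x = 1), held fixed in this sketch
# Data cases (x, y); x = None means that x was not observed.
data = [(0, 0), (1, 1), (1, 1), (None, 1), (None, 0), (0, 0), (None, 1)]


def case_likelihood(cpd, x, y):
    """P(case): sum over x when it is hidden. cpd[j][k] = P(y=j | x=k)."""
    xs = (0, 1) if x is None else (x,)
    return sum(prior[k] * cpd[y][k] for k in xs)


def log_likelihood(cpd):
    return sum(math.log(case_likelihood(cpd, x, y)) for x, y in data)


def em_step(cpd):
    """One EM iteration: expected counts, then renormalization."""
    counts = [[0.0, 0.0], [0.0, 0.0]]
    for x, y in data:
        xs = (0, 1) if x is None else (x,)
        z = case_likelihood(cpd, x, y)
        for k in xs:
            # E-step: posterior weight of x = k given this case.
            counts[y][k] += prior[k] * cpd[y][k] / z
    # M-step: normalize each column (fixed parent value k).
    return [[counts[j][k] / (counts[0][k] + counts[1][k]) for k in (0, 1)]
            for j in (0, 1)]


cpd = [[0.6, 0.4], [0.4, 0.6]]
history = [cpd]
for _ in range(20):
    cpd = em_step(cpd)
    history.append(cpd)
```

Each iteration leaves every column of the CPD normalized, and the log-likelihood of the data does not decrease from one entry of `history` to the next.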

Decomposable CRs

• Parameters of the clauses and not of the support network.

[Figure: a single ground instance of a Bayesian clause, and multiple ground instances of the same Bayesian clause with heads A_1, ..., A_m (parents A_11, ..., A_1n_1 through A_m1, ..., A_mn_m); node A is attached to the CPD for the combining rule.]


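Gradient Ascent

$$\frac{\partial \ln P_N(D \mid \lambda)}{\partial\, \mathrm{cpd}(c)_{jk}}
= \sum_{\text{subst. } \theta} \sum_{j',k'} \frac{\partial \ln P_N(D \mid \lambda)}{\partial\, \mathrm{cpd}(c\theta)_{j'k'}} \cdot \frac{\partial\, \mathrm{cpd}(c\theta)_{j'k'}}{\partial\, \mathrm{cpd}(c)_{jk}}
= \sum_{\text{subst. } \theta} \frac{\partial \ln P_N(D \mid \lambda)}{\partial\, \mathrm{cpd}(c\theta)_{jk}}$$

where $c$ is a Bayesian clause, $c\theta$ is a Bayesian ground clause, and

$$\frac{\partial\, \mathrm{cpd}(c\theta)_{j'k'}}{\partial\, \mathrm{cpd}(c)_{jk}} = \begin{cases} 1, & \text{if } j = j' \text{ and } k = k' \\ 0, & \text{if } j \neq j' \text{ or } k \neq k' \end{cases}$$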

Gradient Ascent

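$$\frac{\partial \ln P_N(D \mid \lambda)}{\partial\, \mathrm{cpd}(c)_{jk}}
= \sum_{\text{subst. } \theta} \frac{\partial \ln P_N(D \mid \lambda)}{\partial\, \mathrm{cpd}(c\theta)_{jk}}
= \sum_{\text{subst. } \theta} \sum_{i=1}^{n} \frac{P_N(\mathrm{head}(c\theta) = u_j, \mathrm{body}(c\theta) = \mathbf{u}_k \mid D_i, \lambda)}{\mathrm{cpd}(c)_{jk}}$$

The posterior probabilities in the numerator are computed by a Bayesian network inference engine.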
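The gradient formula sums posterior probabilities over all ground instances of a clause and over all data cases, then divides by the current CPD entry. A minimal sketch of that accumulation follows. It assumes the inference engine is supplied as a function `posterior`; the toy `observed_posterior` handles only fully observed cases, and the atoms `h(a)`, `b(a)`, `h(b)`, `b(b)` are hypothetical placeholders.

```python
def clause_gradient(cpd, ground_instances, data_cases, posterior):
    """Gradient of the log-likelihood w.r.t. each entry cpd[j][k] of one clause.

    cpd[j][k]         current value of P(head = u_j | body = u_k)
    ground_instances  the ground instances (substitutions) of the clause
    posterior(g, j, k, case) stands in for the Bayesian network inference
        engine and must return P(head(g) = u_j, body(g) = u_k | case).
    """
    grad = [[0.0 for _ in row] for row in cpd]
    for g in ground_instances:
        for case in data_cases:
            for j, row in enumerate(cpd):
                for k, value in enumerate(row):
                    grad[j][k] += posterior(g, j, k, case) / value
    return grad

def observed_posterior(g, j, k, case):
    """Toy 'inference engine' for fully observed cases: an indicator."""
    head_atom, body_atom = g
    return 1.0 if case[head_atom] == j and case[body_atom] == k else 0.0

# Hypothetical clause with two ground instances and binary variables.
cpd = [[0.5, 0.5], [0.5, 0.5]]
instances = [("h(a)", "b(a)"), ("h(b)", "b(b)")]
cases = [
    {"h(a)": 1, "b(a)": 0, "h(b)": 1, "b(b)": 1},
    {"h(a)": 0, "b(a)": 0, "h(b)": 1, "b(b)": 1},
]
grad = clause_gradient(cpd, instances, cases, observed_posterior)
```

Because both ground instances contribute to the same `grad` table, the parameters stay attached to the clause rather than to individual nodes of the network. A real update step would also have to keep each column of `cpd` a probability distribution, which this sketch leaves out.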

Algorithm

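Expectation-Maximization

1. Initialize parameters

2. E-Step and M-Step, i.e. compute expected counts for each clause and treat the expected counts as counts:

$$\mathrm{cpd}(c)_{jk} \leftarrow \frac{\sum_{\text{subst. } \theta} \sum_{i=1}^{n} P_N(\mathrm{head}(c\theta) = u_j, \mathrm{body}(c\theta) = \mathbf{u}_k \mid D_i, \lambda)}{\sum_{\text{subst. } \theta} \sum_{i=1}^{n} P_N(\mathrm{body}(c\theta) = \mathbf{u}_k \mid D_i, \lambda)}$$

3. If not converged, iterate to 2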
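The three steps can be sketched on a deliberately small model. The code below is an illustration under stated assumptions, not the procedure from the talk: each ground instance has one body variable and one head variable, the head is always observed, the body is sometimes missing, the prior over the body is fixed and known, and exact inference is done by hand instead of by a Bayesian network engine. All names and data are hypothetical.

```python
import math

def em_shared_cpd(observations, prior, cpd, iterations=20):
    """EM for one CPD shared by many ground instances of a clause.

    observations  list of (head_value, body_value) pairs, one per ground
                  instance and data case; body_value is None when unobserved
    prior[k]      fixed P(body = k), assumed known in this toy setting
    cpd[j][k]     initial P(head = j | body = k)
    Returns the re-estimated cpd and the log-likelihood seen at the start
    of each iteration.
    """
    n_head, n_body = len(cpd), len(cpd[0])
    history = []
    for _ in range(iterations):
        joint = [[0.0] * n_body for _ in range(n_head)]  # expected counts
        loglik = 0.0
        for head, body in observations:
            if body is not None:
                weights = [1.0 if k == body else 0.0 for k in range(n_body)]
                loglik += math.log(prior[body] * cpd[head][body])
            else:
                # E-step: posterior over the hidden body value.
                unnorm = [prior[k] * cpd[head][k] for k in range(n_body)]
                total = sum(unnorm)
                weights = [u / total for u in unnorm]
                loglik += math.log(total)
            for k in range(n_body):
                joint[head][k] += weights[k]
        history.append(loglik)
        # M-step: expected joint counts divided by expected body counts.
        body_counts = [sum(joint[j][k] for j in range(n_head))
                       for k in range(n_body)]
        cpd = [[joint[j][k] / body_counts[k] if body_counts[k] > 0 else cpd[j][k]
                for k in range(n_body)] for j in range(n_head)]
    return cpd, history

# Hypothetical observations pooled over all ground instances and data cases.
obs = [(1, 1), (1, 1), (0, 0), (0, 0), (1, 0), (1, None), (0, None), (1, None)]
cpd, history = em_shared_cpd(obs, prior=[0.5, 0.5],
                             cpd=[[0.6, 0.4], [0.4, 0.6]])
```

The M-step line mirrors the update rule: the numerator pools expected joint counts over every instance of the clause, the denominator pools the expected body counts, and the recorded log-likelihood never decreases from one iteration to the next.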

Experimental Evidence

• [Koller, Pfeffer '97]: support network is a good approximation

• [Binder et al. '97]: equality constraints speed up learning

m(M,X)   f(F,X)   pdf(c)(h(X) | h(M), h(F))
true     true     N(0.5*h(M) + 0.5*h(F), s)
true     false    N(165, s)
false    true     N(165, s)
false    false    N(165, s)

• 100 data cases
• constant step-size
• Estimation of means
  • 13 iterations
• Estimation of the weights
  • sum = 1.0
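The conditional density table above can be read as a lookup for the mean of h(X). A minimal sketch follows, with some assumptions made explicit: the function names are hypothetical, and `s` is treated as a standard deviation, which the slide does not state.

```python
import random

DEFAULT_MEAN = 165.0  # mean used by the table whenever m(M,X) or f(F,X) is false

def cpd_mean(is_mother, is_father, h_mother, h_father):
    """Mean of h(X) per the table: equally weighted parents' values if both
    relations hold, otherwise the default mean."""
    if is_mother and is_father:
        return 0.5 * h_mother + 0.5 * h_father
    return DEFAULT_MEAN

def sample_h(is_mother, is_father, h_mother, h_father, s, rng=random):
    """Draw h(X) from N(mean, s); s is assumed to be a standard deviation."""
    return rng.gauss(cpd_mean(is_mother, is_father, h_mother, h_father), s)

both_parents = cpd_mean(True, True, 170.0, 180.0)
one_parent = cpd_mean(True, False, 170.0, 180.0)
```

The two weights of 0.5 in the first row are the ones that sum to 1.0 in the weight-estimation experiment.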

Outline

• Bayesian Logic Programs
  • Examples and Language
  • Semantics and Support Networks

• Learning Bayesian Logic Programs
  • Data Cases
  • Parameter Estimation
  • Structural Learning

Structural Learning

• Combination of Inductive Logic Programming and Bayesian network learning

• Datalog fragment of Bayesian logic programs (no functors)

• intensional Bayesian clauses

Idea - CLAUDIEN

Learning from interpretations:

• all data cases are Herbrand interpretations

• a hypothesis should reflect what is in the data

Probabilistic extension:

• all data cases are partially observed joint states of Herbrand interpretations

• all hypotheses have to be (logically) true in all data cases

What is the data about?

{m(ann,dorothy)=true, f(brian,dorothy)=true, pc(brian)=b, bt(ann)=a, bt(brian)=?, bt(dorothy)=a}

{m(cecily,fred)=true, f(henry,fred)=true, bt(cecily)=ab, bt(henry)=b, bt(fred)=?}

{m(kim,bob)=true, f(fred,bob)=true, bt(kim)=?, bt(bob)=b}

...


Claudien - Learning From Interpretations

• $D$: set of data cases

• $C$: set of all clauses that can be part of hypotheses

$H \subseteq C$ is (logically) valid iff $\forall D_i \in D$: $H$ is logically true in $D_i$

$H \subseteq C$ is a logical solution iff $H$ is a logically maximally general valid hypothesis

$H \subseteq C$ is a probabilistic solution iff $H$ is (logically) valid and the Bayesian network induced by B on $D$ is acyclic
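The (logical) validity condition can be sketched for function-free clauses. This is a minimal illustration under stated assumptions: atoms are tuples such as `("m", "ann", "john")`, variables are strings starting with an uppercase letter, a data case is reduced to the set of ground atoms it mentions, and the example interpretation is hypothetical. It checks validity only, not maximal generality or acyclicity.

```python
def is_var(term):
    """Prolog convention: variables start with an uppercase letter."""
    return term[:1].isupper()

def match(atom, fact, theta):
    """Extend substitution theta so that atom equals the ground fact, else None."""
    if atom[0] != fact[0] or len(atom) != len(fact):
        return None
    theta = dict(theta)
    for term, value in zip(atom[1:], fact[1:]):
        if is_var(term):
            if theta.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return theta

def body_substitutions(body, interpretation, theta=None):
    """Yield every substitution mapping all body atoms into the interpretation."""
    theta = {} if theta is None else theta
    if not body:
        yield theta
        return
    for fact in interpretation:
        extended = match(body[0], fact, theta)
        if extended is not None:
            yield from body_substitutions(body[1:], interpretation, extended)

def substitute(atom, theta):
    """Apply a substitution to an atom."""
    return (atom[0],) + tuple(theta.get(t, t) for t in atom[1:])

def clause_true_in(clause, interpretation):
    """A clause holds if every grounding of its body brings its head along."""
    head, body = clause
    return all(substitute(head, theta) in interpretation
               for theta in body_substitutions(body, interpretation))

def is_valid(hypothesis, data_cases):
    """Valid iff every clause is logically true in every data case."""
    return all(clause_true_in(c, d) for c in hypothesis for d in data_cases)

# Hypothetical interpretation: the ground atoms one data case mentions.
interp = frozenset({("m", "ann", "john"), ("mc", "ann"),
                    ("mc", "john"), ("pc", "ann")})
good = (("mc", "X"), [("m", "M", "X")])   # mc(X) | m(M,X).
bad = (("pc", "X"), [("m", "M", "X")])    # pc(X) | m(M,X).
good_is_valid = is_valid([good], [interp])
bad_is_valid = is_valid([bad], [interp])
```

Here `good` holds because `mc(john)` is present whenever `m(ann,john)` is, while `bad` fails because `pc(john)` is missing from the interpretation.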


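Learning Task

Given:
• a set of data cases $D = \{D_1, \ldots, D_n\}$
• a set of Bayesian logic programs $\mathcal{H}$
• a scoring function $\mathrm{score}_D : \mathcal{H} \to \mathbb{R}$

Goal: probabilistic solution $H^* \in \mathcal{H}$ that matches the data $D$ best according to $\mathrm{score}_D$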

Algorithm

Example

Original Bayesian logic program:
mc(X) | m(M,X), mc(M), pc(M).
pc(X) | f(F,X), mc(F), pc(F).
bt(X) | mc(X), pc(X).

Data cases:
{m(ann,john)=true, pc(ann)=a, mc(ann)=?, f(eric,john)=true, pc(eric)=b, mc(eric)=a, mc(john)=ab, pc(john)=a, bt(john)=?}
...

[Figure: network over the ground atoms m(ann,john), f(eric,john), mc(ann), pc(ann), mc(eric), pc(eric), mc(john), pc(john), bt(john)]


Example

Original Bayesian logic program:
mc(X) | m(M,X), mc(M), pc(M).
pc(X) | f(F,X), mc(F), pc(F).
bt(X) | mc(X), pc(X).

Initial hypothesis:
mc(X) | m(M,X).
pc(X) | f(F,X).
bt(X) | mc(X).

[Figure: network over the ground atoms for ann, eric and john]

Example

Original Bayesian logic program:
mc(X) | m(M,X), mc(M), pc(M).
pc(X) | f(F,X), mc(F), pc(F).
bt(X) | mc(X), pc(X).

Initial hypothesis:
mc(X) | m(M,X).
pc(X) | f(F,X).
bt(X) | mc(X).

[Figure: network over the ground atoms for ann, eric and john]

Example

Original Bayesian logic program:
mc(X) | m(M,X), mc(M), pc(M).
pc(X) | f(F,X), mc(F), pc(F).
bt(X) | mc(X), pc(X).

Initial hypothesis:
mc(X) | m(M,X).
pc(X) | f(F,X).
bt(X) | mc(X).

Refinement:
mc(X) | m(M,X).
pc(X) | f(F,X).
bt(X) | mc(X), pc(X).

[Figure: network over the ground atoms for ann, eric and john]

Example

Original Bayesian logic program:
mc(X) | m(M,X), mc(M), pc(M).
pc(X) | f(F,X), mc(F), pc(F).
bt(X) | mc(X), pc(X).

Initial hypothesis:
mc(X) | m(M,X).
pc(X) | f(F,X).
bt(X) | mc(X).

Refinement:
mc(X) | m(M,X).
pc(X) | f(F,X).
bt(X) | mc(X), pc(X).

Refinement:
mc(X) | m(M,X), mc(X).
pc(X) | f(F,X).
bt(X) | mc(X), pc(X).

[Figure: network over the ground atoms for ann, eric and john]

Example

Original Bayesian logic program:
mc(X) | m(M,X), mc(M), pc(M).
pc(X) | f(F,X), mc(F), pc(F).
bt(X) | mc(X), pc(X).

Initial hypothesis:
mc(X) | m(M,X).
pc(X) | f(F,X).
bt(X) | mc(X).

Refinement:
mc(X) | m(M,X).
pc(X) | f(F,X).
bt(X) | mc(X), pc(X).

Refinement:
mc(X) | m(M,X), pc(X).
pc(X) | f(F,X).
bt(X) | mc(X), pc(X).

[Figure: network over the ground atoms for ann, eric and john]

Example

Original Bayesian logic program:
mc(X) | m(M,X), mc(M), pc(M).
pc(X) | f(F,X), mc(F), pc(F).
bt(X) | mc(X), pc(X).

Initial hypothesis:
mc(X) | m(M,X).
pc(X) | f(F,X).
bt(X) | mc(X).

Refinement:
mc(X) | m(M,X).
pc(X) | f(F,X).
bt(X) | mc(X), pc(X).

Refinement:
mc(X) | m(M,X), pc(X).
pc(X) | f(F,X).
bt(X) | mc(X), pc(X).

...

[Figure: network over the ground atoms for ann, eric and john]
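The walk-through above, from an initial hypothesis through successive refinements, resembles a greedy search. The sketch below is an assumption about how such a search could look, since the transcript does not preserve the algorithm slide. The refinement operator only adds one atom to one clause body, the self-dependency guard is a crude stand-in for a real acyclicity check, and `toy_score` peeks at a known target program purely to make the example runnable; a real score would fit parameters and evaluate, for instance, the log-likelihood of the data.

```python
def refinements(hypothesis, candidate_atoms):
    """Yield every hypothesis obtained by adding one atom to one clause body."""
    for head, body in hypothesis.items():
        for atom in candidate_atoms:
            # Skip atoms already present and direct self-dependencies.
            if atom not in body and atom != head:
                refined = dict(hypothesis)
                refined[head] = body | {atom}
                yield refined

def hill_climb(initial, candidate_atoms, score):
    """Greedy search: move to the best-scoring refinement until none improves."""
    current, current_score = initial, score(initial)
    while True:
        scored = [(score(h), h) for h in refinements(current, candidate_atoms)]
        if not scored:
            return current
        best_score, best = max(scored, key=lambda pair: pair[0])
        if best_score <= current_score:
            return current
        current, current_score = best, best_score

# Target used only by the toy score (hypothetical stand-in for data).
TARGET = {
    "mc(X)": frozenset({"m(M,X)", "mc(M)", "pc(M)"}),
    "pc(X)": frozenset({"f(F,X)", "mc(F)", "pc(F)"}),
    "bt(X)": frozenset({"mc(X)", "pc(X)"}),
}

def toy_score(hypothesis):
    """Negative number of body atoms that differ from the target program."""
    return -sum(len(hypothesis[head] ^ TARGET[head]) for head in TARGET)

initial = {
    "mc(X)": frozenset({"m(M,X)"}),
    "pc(X)": frozenset({"f(F,X)"}),
    "bt(X)": frozenset({"mc(X)"}),
}
atoms = ["mc(X)", "pc(X)", "mc(M)", "pc(M)", "mc(F)", "pc(F)"]
learned = hill_climb(initial, atoms, toy_score)
```

With this toy score the search adds one missing body atom per step and stops once every further refinement lowers the score.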

Properties

• All relevant random variables are known

• First order equivalent of Bayesian network setting

• Hypothesis postulates true regularities in the data

• Logical solutions as initial hypotheses

• Highlights Background Knowledge

Example Experiments

mc(X) | m(M,X), mc(M), pc(M).
pc(X) | f(F,X), mc(F), pc(F).
bt(X) | mc(X), pc(X).

Data: sampling from 2 families, each 1000 samples
Score: LogLikelihood
Goal: learn the definition of bt

CLAUDIEN:
mc(X) | m(M,X).
pc(X) | f(F,X).

Bloodtype:
mc(X) | m(M,X), mc(M), pc(M).
pc(X) | f(F,X), mc(F), pc(F).

bt(X) | mc(X), pc(X).   (highest score)

Conclusion

• EM-based and gradient-based methods for ML parameter estimation

• Link between ILP and learning Bayesian networks

• CLAUDIEN setting used to define and to traverse the search space

• Bayesian network scores used to evaluate hypotheses


Thanks !

