
Learning from observations Chapter 18. Two types of learning in AI Deductive: Deduce rules/facts...

Date post: 02-Jan-2016
Upload: berenice-casey
Page 1: Learning from observations Chapter 18. Two types of learning in AI Deductive: Deduce rules/facts from already known rules/facts. (We have already dealt.

Learning from observations

Chapter 18


Two types of learning in AI

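Deductive: Deduce rules/facts from already known rules/facts. (We have already dealt with this.)

$A \Rightarrow B \Rightarrow C \;\Longrightarrow\; A \Rightarrow C$

Inductive: Learn new rules/facts from a data set D.

$D = \{\mathbf{x}(n), y(n)\}_{n = 1 \ldots N} \;\Longrightarrow\; A \Rightarrow C$

We will be dealing with the latter, inductive learning, now.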


Two tracks

Regression: Learning function values

Classification: Learning categories


Inductive learning - example A

• f(x) is the target function
• An example is a pair [x, f(x)]
• Learning task: find a hypothesis h such that h(x) ≈ f(x) given a training set of examples D = {[xi, f(xi)]}, i = 1, 2, …, N

x = (0, 0, 1, 0, 1, 0, 1, 1, 1), f(x) = 1

x = (0, 0, 1, 1, 1, 0, 0, 1, 1), f(x) = 1

x = (0, 1, 0, 1, 1, 0, 0, 1, 1), f(x) = 0

Etc...

Inductive learning – example B

Construct h so that it agrees with f.

The hypothesis h is consistent if it agrees with f on all

observations.

Ockham’s razor: Select the simplest consistent hypothesis.

How do we achieve good generalization?

[Figure panels: consistent linear fit; consistent 7th-order polynomial fit; inconsistent linear fit; consistent 6th-order polynomial fit; consistent sinusoidal fit]
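The tension between consistency and simplicity can be sketched numerically. The minimal Python sketch below uses seven invented observations (hypothetical values, not the data behind the slide figures) that roughly follow y = x. It compares an inconsistent least-squares line with the consistent 6th-order polynomial that passes through every point. The helper names `fit_line` and `fit_interpolant` are assumptions made for this illustration.

```python
def fit_line(xs, ys):
    """Least-squares line h(x) = a + b*x (closed form); generally inconsistent."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    a = my - b * mx
    return lambda x: a + b * x


def fit_interpolant(xs, ys):
    """Lagrange polynomial of degree len(xs) - 1; consistent with every example."""
    def h(x):
        total = 0.0
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            term = yi
            for j, xj in enumerate(xs):
                if j != i:
                    term *= (x - xj) / (xi - xj)
            total += term
        return total
    return h


def max_abs_error(h, xs, ys):
    """Largest disagreement between h and the observations."""
    return max(abs(h(x) - y) for x, y in zip(xs, ys))


# Hypothetical observations (invented for illustration): roughly y = x.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [0.1, 0.9, 2.2, 2.8, 4.1, 5.0, 5.9]

line = fit_line(xs, ys)
poly = fit_interpolant(xs, ys)  # 6th-order polynomial through all 7 points

train_err_line = max_abs_error(line, xs, ys)  # small but non-zero
train_err_poly = max_abs_error(poly, xs, ys)  # zero: poly is consistent

# One step beyond the data, where the trend suggests a value near 7.
print("line(7) =", line(7.0), " poly(7) =", poly(7.0))
```

On this invented data the consistent polynomial extrapolates far from the trend while the simpler line stays close to it, which is the behaviour Ockham's razor guards against.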


Inductive learning – example C

[Figure: plot of y against x]

Example from V. Pavlovic @ Rutgers


Sometimes a consistent hypothesis is worse than an inconsistent one.

The inductive learning problem

[Figure: our hypothesis space H, the target f(x), the optimal hypothesis hopt(x) ∈ H, and the error between them]

Find an appropriate hypothesis space H and find h(x) ∈ H with minimum “distance” to f(x) (the “error”).

The learning problem is realizable if f(x) ∈ H.

The real inductive learning problem

Data is never noise-free and never available in infinite amounts, so we get variation in data and model.

The generalization error is a function of both the training data and the hypothesis selection method.

Find an appropriate hypothesis space H and minimize the expected distance to f(x) (the “generalization error”).

[Figure: the sets {f(x)} and {hopt(x)}, with the generalization error Egen between them]


Hypothesis spaces (examples)

$f(x) = 0.5 + x + x^2 + 6x^3$

$H_1 = \{a + bx\}$ (linear); $H_2 = \{a + bx + cx^2\}$ (quadratic); $H_3 = \{a + bx + cx^2 + dx^3\}$ (cubic)

$H_1 \subset H_2 \subset H_3$
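To make the nested hypothesis spaces concrete, the following sketch fits the best linear, quadratic and cubic hypothesis to noise-free samples of the cubic target above and reports the remaining mean squared error. The sampling grid on [-1, 1] and the helper names (`solve`, `fit_poly`, `mse`) are assumptions chosen for this illustration; the fit uses plain least squares via the normal equations.

```python
def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    z = [0.0] * n
    for r in reversed(range(n)):
        z[r] = (M[r][n] - sum(M[r][c] * z[c] for c in range(r + 1, n))) / M[r][r]
    return z


def fit_poly(xs, ys, degree):
    """Least-squares coefficients [a, b, c, ...] from the normal equations."""
    m = degree + 1
    A = [[sum(x ** (i + j) for x in xs) for j in range(m)] for i in range(m)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(m)]
    return solve(A, b)


def mse(coeffs, xs, ys):
    """Mean squared distance between the polynomial hypothesis and the targets."""
    def h(x):
        return sum(c * x ** k for k, c in enumerate(coeffs))
    return sum((h(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def f(x):
    """Target function from the slide."""
    return 0.5 + x + x ** 2 + 6 * x ** 3


# Assumed sampling grid: 21 equally spaced points on [-1, 1].
xs = [i / 10 for i in range(-10, 11)]
ys = [f(x) for x in xs]

errors = {d: mse(fit_poly(xs, ys, d), xs, ys) for d in (1, 2, 3)}
cubic = fit_poly(xs, ys, 3)
for d, e in errors.items():
    print(f"degree {d}: mean squared error {e:.6f}")
```

Only the cubic space contains f(x), so only there does the error vanish (the problem is realizable in H3 but not in H1 or H2).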


Learning problems

The hypothesis takes as input a set of attributes x and returns a “decision” h(x), the predicted (estimated) output value for the input x.

Discrete valued function ⇒ classification

Continuous valued function ⇒ regression


Classification

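Assign each input to one of several classes.

$X^D \to C^K$: input space → output (category) space

$\mathbf{x} = (x_1, x_2, \ldots, x_D) \in X^D$

$\mathbf{c} = (c_1, c_2, \ldots, c_K) \in C^K$, for example $\mathbf{c} = (0, 1, 0)$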


EX. Pattern Classification

Objective: To recognize horses in images

Procedure: Features => Classifier => Cross-validation

Classifier

[Figure: example images labeled by the classifier as “Horse” or “Non-horse”]

Regression

The “fixed regression model”

$f(\mathbf{x}) = g(\mathbf{x}, \theta) + \varepsilon$

x: observed input
f(x): observed output
g(x, θ): true underlying function
ε: i.i.d. noise process with zero mean
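A small simulation can illustrate the fixed regression model. The sketch below assumes, purely for illustration, that the true underlying function is a straight line with parameters θ = (1, 2) and that the noise is Gaussian with standard deviation 0.5; neither choice comes from the slides. It then recovers θ from the noisy observations by least squares.

```python
import random


def g(x, theta):
    """Assumed true underlying function: a straight line with parameters theta."""
    return theta[0] + theta[1] * x


def simulate(n, theta, noise_std, rng):
    """Draw n observations f(x) = g(x, theta) + eps with zero-mean i.i.d. noise."""
    xs = [rng.uniform(0.0, 10.0) for _ in range(n)]
    ys = [g(x, theta) + rng.gauss(0.0, noise_std) for x in xs]
    return xs, ys


def fit_line(xs, ys):
    """Least-squares estimate (intercept, slope) of theta."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return my - slope * mx, slope


rng = random.Random(0)      # fixed seed so the run is repeatable
theta_true = (1.0, 2.0)     # hypothetical parameters
xs, ys = simulate(2000, theta_true, 0.5, rng)
theta_hat = fit_line(xs, ys)
print("estimated theta:", theta_hat)
```

Because the noise has zero mean, the estimate approaches the true parameters as the number of observations grows.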


Ex: Predict price for cotton futures

Input: Past history of closing prices, and trading volume

Output: Predicted closing price

Question?

Let’s look at a classification problem: predicting whether a certain person will choose a particular restaurant.


Method: Decision trees

• “Divide and conquer”: Split data into smaller and smaller subsets.

• Splits usually on a single variable

[Figure: binary tree. The root tests x1 > ?; each of its yes/no branches leads to a test x2 > ?, which again has yes/no branches.]
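The tree structure described above can be written down as a small data structure. In this sketch the thresholds (0.5, 0.3, 0.7) and the leaf labels "A" to "D" are hypothetical placeholders for the "?" entries on the slide, and the class names `Leaf` and `Split` are assumptions for the illustration.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    label: str


@dataclass
class Split:
    feature: int        # index of the single variable tested at this node
    threshold: float    # the node asks: x[feature] > threshold ?
    yes: Union["Split", Leaf]
    no: Union["Split", Leaf]


def predict(node, x):
    """Follow yes/no branches until a leaf is reached."""
    while isinstance(node, Split):
        node = node.yes if x[node.feature] > node.threshold else node.no
    return node.label


# Hypothetical thresholds and labels: root tests x1, both children test x2.
tree = Split(0, 0.5,
             yes=Split(1, 0.3, yes=Leaf("A"), no=Leaf("B")),
             no=Split(1, 0.7, yes=Leaf("C"), no=Leaf("D")))

print(predict(tree, [0.9, 0.4]))
```

Each path from the root to a leaf corresponds to one region of the input space, which is the "divide and conquer" idea.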


The wait@restaurant decision tree

This is our true function. Can we learn this tree from examples?

Inductive learning of decision tree

Simplest: Construct a decision tree with one leaf for every example = memory-based learning. Not very good generalization.

Advanced: Split on each variable so that the purity of each split increases (i.e. either only yes or only no).

Purity is measured with, e.g., entropy.


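$\text{Entropy} = -P(\text{yes}) \ln[P(\text{yes})] - P(\text{no}) \ln[P(\text{no})]$

General form: $\text{Entropy} = -\sum_i P(v_i) \ln P(v_i)$

Any logarithm base can be used; the base only rescales the entropy and does not change which split is best.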
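The general form translates directly into a few lines of code. This is a minimal sketch: the function name `entropy` and its `base` parameter are assumptions for the illustration, and terms with zero probability are skipped, following the usual convention that 0 · log 0 = 0.

```python
import math


def entropy(counts, base=math.e):
    """Entropy of a node given the number of examples in each class."""
    total = sum(counts)
    h = 0.0
    for c in counts:
        if c > 0:                      # convention: 0 * log(0) = 0
            p = c / total
            h -= p * math.log(p, base)
    return h


print(entropy([6, 6]))            # equally likely classes, natural logarithm
print(entropy([6, 6], base=10))   # same node, base-10 logarithm
print(entropy([4, 0]))            # pure node
```

A node with equally likely classes gives the maximum value for two classes (ln 2 with natural logarithms), and a pure node gives zero.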


The entropy is maximal when all possibilities are equally likely.

The goal of the decision tree is to decrease the entropy in each node.

Entropy is zero in a pure “yes” node (or pure “no” node).

The second law of thermodynamics: Elements in a closed system tend to seek their most probable distribution; in a closed system, entropy never decreases.

Entropy is a measure of disorder in a system: the lower the entropy, the more ordered the system.

Decision tree learning algorithm

Create pure nodes whenever possible

If pure nodes are not possible, choose the split that leads to the largest decrease in entropy.
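The two rules above can be sketched as a short recursive procedure. This is an illustrative ID3-style sketch, not necessarily the exact algorithm used in the lecture: the function names, the dictionary representation of the tree and the six toy examples (with attributes named `patrons` and `hungry`) are all assumptions made for the example, and the toy data is not the restaurant table.

```python
import math
from collections import Counter


def entropy(labels):
    """Entropy (natural logarithm) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def split_entropy(rows, labels, attr):
    """Weighted average entropy of the subsets produced by splitting on attr."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attr], []).append(label)
    return sum(len(g) / len(labels) * entropy(g) for g in groups.values())


def learn_tree(rows, labels, attrs):
    if len(set(labels)) == 1:          # pure node: stop and return the class
        return labels[0]
    if not attrs:                      # nothing left to split on: majority vote
        return Counter(labels).most_common(1)[0][0]
    # The parent entropy is the same for every candidate, so the lowest
    # entropy after the split is the largest entropy decrease.
    best = min(attrs, key=lambda a: split_entropy(rows, labels, a))
    rest = [a for a in attrs if a != best]
    tree = {"split": best, "branches": {}}
    for value in sorted({row[best] for row in rows}):
        idx = [i for i, row in enumerate(rows) if row[best] == value]
        tree["branches"][value] = learn_tree(
            [rows[i] for i in idx], [labels[i] for i in idx], rest)
    return tree


# Hypothetical toy examples (invented for illustration).
rows = [
    {"patrons": "none", "hungry": "no"},
    {"patrons": "none", "hungry": "yes"},
    {"patrons": "some", "hungry": "no"},
    {"patrons": "some", "hungry": "yes"},
    {"patrons": "full", "hungry": "yes"},
    {"patrons": "full", "hungry": "no"},
]
labels = ["F", "F", "T", "T", "T", "F"]

tree = learn_tree(rows, labels, ["patrons", "hungry"])
print(tree)
```

On this toy data the first split is on `patrons`, which yields two pure nodes immediately; only the remaining mixed node needs a further split on `hungry`.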


Decision tree learning example

10 attributes:

1. Alternate: Is there a suitable alternative restaurant nearby? {yes,no}

2. Bar: Is there a bar to wait in? {yes,no}

3. Fri/Sat: Is it Friday or Saturday? {yes,no}

4. Hungry: Are you hungry? {yes,no}

5. Patrons: How many are seated in the restaurant? {none, some, full}

6. Price: Price level {$,$$,$$$}

7. Raining: Is it raining? {yes,no}

8. Reservation: Did you make a reservation? {yes,no}

9. Type: Type of food {French,Italian,Thai,Burger}

10. Wait: {0-10 min, 10-30 min, 30-60 min, >60 min}


Decision tree learning example

T = True, F = False

6 True, 6 False

$\text{Entropy} = -\tfrac{6}{12}\log_{10}\tfrac{6}{12} - \tfrac{6}{12}\log_{10}\tfrac{6}{12} = 0.30$

The value 0.30 corresponds to base-10 logarithms; with natural logarithms the same node has entropy $\ln 2 \approx 0.69$.

Decision tree learning example

Alternate? Yes: 3 T, 3 F. No: 3 T, 3 F.

$\text{Entropy} = \tfrac{6}{12}\left[-\tfrac{3}{6}\log_{10}\tfrac{3}{6} - \tfrac{3}{6}\log_{10}\tfrac{3}{6}\right] + \tfrac{6}{12}\left[-\tfrac{3}{6}\log_{10}\tfrac{3}{6} - \tfrac{3}{6}\log_{10}\tfrac{3}{6}\right] = 0.30$

Entropy decrease = 0.30 – 0.30 = 0

Decision tree learning example

Bar? Yes: 3 T, 3 F. No: 3 T, 3 F.

$\text{Entropy} = \tfrac{6}{12}\left[-\tfrac{3}{6}\log_{10}\tfrac{3}{6} - \tfrac{3}{6}\log_{10}\tfrac{3}{6}\right] + \tfrac{6}{12}\left[-\tfrac{3}{6}\log_{10}\tfrac{3}{6} - \tfrac{3}{6}\log_{10}\tfrac{3}{6}\right] = 0.30$

Entropy decrease = 0.30 – 0.30 = 0

Decision tree learning example

Fri/Sat? Yes: 2 T, 3 F. No: 4 T, 3 F.

$\text{Entropy} = \tfrac{5}{12}\left[-\tfrac{2}{5}\log_{10}\tfrac{2}{5} - \tfrac{3}{5}\log_{10}\tfrac{3}{5}\right] + \tfrac{7}{12}\left[-\tfrac{4}{7}\log_{10}\tfrac{4}{7} - \tfrac{3}{7}\log_{10}\tfrac{3}{7}\right] = 0.29$

Entropy decrease = 0.30 – 0.29 = 0.01

Decision tree learning example

Hungry? Yes: 5 T, 2 F. No: 1 T, 4 F.

$\text{Entropy} = \tfrac{7}{12}\left[-\tfrac{5}{7}\log_{10}\tfrac{5}{7} - \tfrac{2}{7}\log_{10}\tfrac{2}{7}\right] + \tfrac{5}{12}\left[-\tfrac{1}{5}\log_{10}\tfrac{1}{5} - \tfrac{4}{5}\log_{10}\tfrac{4}{5}\right] = 0.24$

Entropy decrease = 0.30 – 0.24 = 0.06

Decision tree learning example

Raining? Yes: 2 T, 2 F. No: 4 T, 4 F.

$\text{Entropy} = \tfrac{4}{12}\left[-\tfrac{2}{4}\log_{10}\tfrac{2}{4} - \tfrac{2}{4}\log_{10}\tfrac{2}{4}\right] + \tfrac{8}{12}\left[-\tfrac{4}{8}\log_{10}\tfrac{4}{8} - \tfrac{4}{8}\log_{10}\tfrac{4}{8}\right] = 0.30$

Entropy decrease = 0.30 – 0.30 = 0

Decision tree learning example

Reservation? Yes: 3 T, 2 F. No: 3 T, 4 F.

$\text{Entropy} = \tfrac{5}{12}\left[-\tfrac{3}{5}\log_{10}\tfrac{3}{5} - \tfrac{2}{5}\log_{10}\tfrac{2}{5}\right] + \tfrac{7}{12}\left[-\tfrac{3}{7}\log_{10}\tfrac{3}{7} - \tfrac{4}{7}\log_{10}\tfrac{4}{7}\right] = 0.29$

Entropy decrease = 0.30 – 0.29 = 0.01

Decision tree learning example

Patrons? None: 2 F. Some: 4 T. Full: 2 T, 4 F.

$\text{Entropy} = \tfrac{2}{12}\left[-\tfrac{0}{2}\log_{10}\tfrac{0}{2} - \tfrac{2}{2}\log_{10}\tfrac{2}{2}\right] + \tfrac{4}{12}\left[-\tfrac{4}{4}\log_{10}\tfrac{4}{4} - \tfrac{0}{4}\log_{10}\tfrac{0}{4}\right] + \tfrac{6}{12}\left[-\tfrac{2}{6}\log_{10}\tfrac{2}{6} - \tfrac{4}{6}\log_{10}\tfrac{4}{6}\right] = 0.14$

(taking $0 \log 0 = 0$)

Entropy decrease = 0.30 – 0.14 = 0.16

Decision tree learning example

Price? $: 3 T, 3 F. $$: 2 T. $$$: 1 T, 3 F.

$\text{Entropy} = \tfrac{6}{12}\left[-\tfrac{3}{6}\log_{10}\tfrac{3}{6} - \tfrac{3}{6}\log_{10}\tfrac{3}{6}\right] + \tfrac{2}{12}\left[-\tfrac{2}{2}\log_{10}\tfrac{2}{2} - \tfrac{0}{2}\log_{10}\tfrac{0}{2}\right] + \tfrac{4}{12}\left[-\tfrac{1}{4}\log_{10}\tfrac{1}{4} - \tfrac{3}{4}\log_{10}\tfrac{3}{4}\right] = 0.23$

(taking $0 \log 0 = 0$)

Entropy decrease = 0.30 – 0.23 = 0.07

Decision tree learning example

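Type? French: 1 T, 1 F. Italian: 1 T, 1 F. Thai: 2 T, 2 F. Burger: 2 T, 2 F.

$\text{Entropy} = \tfrac{2}{12}\left[-\tfrac{1}{2}\log_{10}\tfrac{1}{2} - \tfrac{1}{2}\log_{10}\tfrac{1}{2}\right] + \tfrac{2}{12}\left[-\tfrac{1}{2}\log_{10}\tfrac{1}{2} - \tfrac{1}{2}\log_{10}\tfrac{1}{2}\right] + \tfrac{4}{12}\left[-\tfrac{2}{4}\log_{10}\tfrac{2}{4} - \tfrac{2}{4}\log_{10}\tfrac{2}{4}\right] + \tfrac{4}{12}\left[-\tfrac{2}{4}\log_{10}\tfrac{2}{4} - \tfrac{2}{4}\log_{10}\tfrac{2}{4}\right] = 0.30$

Entropy decrease = 0.30 – 0.30 = 0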

Decision tree learning example

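Est. waiting time? 0-10: 4 T, 2 F. 10-30: 1 T, 1 F. 30-60: 1 T, 1 F. > 60: 2 F.

$\text{Entropy} = \tfrac{6}{12}\left[-\tfrac{4}{6}\log_{10}\tfrac{4}{6} - \tfrac{2}{6}\log_{10}\tfrac{2}{6}\right] + \tfrac{2}{12}\left[-\tfrac{1}{2}\log_{10}\tfrac{1}{2} - \tfrac{1}{2}\log_{10}\tfrac{1}{2}\right] + \tfrac{2}{12}\left[-\tfrac{1}{2}\log_{10}\tfrac{1}{2} - \tfrac{1}{2}\log_{10}\tfrac{1}{2}\right] + \tfrac{2}{12}\left[-\tfrac{0}{2}\log_{10}\tfrac{0}{2} - \tfrac{2}{2}\log_{10}\tfrac{2}{2}\right] = 0.24$

(taking $0 \log 0 = 0$)

Entropy decrease = 0.30 – 0.24 = 0.06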

Decision tree learning example

Patrons? None: 2 F. Some: 4 T. Full: 2 T, 4 F.

Largest entropy decrease (0.16) achieved by splitting on Patrons.

The Full branch still needs a further split (X?). Continue like this, making new splits, always purifying nodes.
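The comparison of the ten candidate splits can be reproduced from the True/False counts given on the preceding slides. The sketch below is a check of that arithmetic, not the lecture's own code. The counts are copied from the slides, while the function names and the use of base-10 logarithms (which is what reproduces the slide values such as 0.30 for the root) are assumptions of this illustration.

```python
import math


def entropy10(true_count, false_count):
    """Two-class entropy with base-10 logarithms, taking 0 * log 0 = 0."""
    total = true_count + false_count
    h = 0.0
    for c in (true_count, false_count):
        if c > 0:
            h -= (c / total) * math.log10(c / total)
    return h


def entropy_after_split(branches):
    """Weighted average entropy over the branches of a split."""
    n = sum(t + f for t, f in branches)
    return sum((t + f) / n * entropy10(t, f) for t, f in branches)


# (True, False) counts per branch, as listed on the slides.
splits = {
    "Alternate": [(3, 3), (3, 3)],
    "Bar": [(3, 3), (3, 3)],
    "Fri/Sat": [(2, 3), (4, 3)],
    "Hungry": [(5, 2), (1, 4)],
    "Patrons": [(0, 2), (4, 0), (2, 4)],
    "Price": [(3, 3), (2, 0), (1, 3)],
    "Raining": [(2, 2), (4, 4)],
    "Reservation": [(3, 2), (3, 4)],
    "Type": [(1, 1), (1, 1), (2, 2), (2, 2)],
    "Est. waiting time": [(4, 2), (1, 1), (1, 1), (0, 2)],
}

root = entropy10(6, 6)
decrease = {name: root - entropy_after_split(b) for name, b in splits.items()}
best = max(decrease, key=decrease.get)

for name, d in sorted(decrease.items(), key=lambda item: -item[1]):
    print(f"{name:18s} entropy decrease = {d:.2f}")
print("best split:", best)
```

The ranking is the same for any logarithm base, so Patrons remains the best first split whether base 10 or natural logarithms are used.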


Decision tree learning example

Induced tree (from examples)


Decision tree learning example

True tree


Decision tree learning example

Induced tree (from examples)

We cannot make the tree more complex than what the data supports.

How do we know it is correct?

How do we know that h ≈ f?

(Hume's Problem of Induction)

Try h on a new test set of examples

(cross validation)

...and assume the ”principle of uniformity”, i.e. the result we get on this test data should be indicative of results on future data. Causality is constant.

Learning curve for the decision tree algorithm on 100 randomly generated examples in the restaurant domain. The graph summarizes 20 trials.

Cross-validation

Use a “validation set”.

[Figure: the data set divided into Dtrain and Dval; evaluating on Dval gives Eval]

$E_{gen} \approx E_{val}$

Split your data set into two parts, one for training your model and the other for validating your model. The error on the validation data is called the “validation error” (Eval).
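A hold-out split of this kind might look as follows. The sketch assumes, for illustration only, synthetic data from a noisy straight line, a 70/30 split, a least-squares line as the model and mean squared error as the error measure; none of these choices come from the slides.

```python
import random


def fit_line(xs, ys):
    """Least-squares line; returns (intercept, slope)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return my - slope * mx, slope


def mse(model, data):
    """Mean squared error of the line on a list of (x, y) pairs."""
    a, b = model
    return sum((a + b * x - y) ** 2 for x, y in data) / len(data)


rng = random.Random(1)  # fixed seed so the run is repeatable

# Hypothetical data set: y = 1 + 2x plus zero-mean noise with std 0.5.
data = []
for _ in range(200):
    x = rng.uniform(0.0, 10.0)
    data.append((x, 1.0 + 2.0 * x + rng.gauss(0.0, 0.5)))

rng.shuffle(data)                   # avoid any ordering effects
n_train = len(data) * 7 // 10       # 70 % for training, 30 % for validation
d_train, d_val = data[:n_train], data[n_train:]

model = fit_line([x for x, _ in d_train], [y for _, y in d_train])
e_train = mse(model, d_train)
e_val = mse(model, d_val)           # estimate of the generalization error
print("E_train =", e_train, " E_val =", e_val)
```

The validation examples play no part in fitting the model, which is what lets Eval stand in for the generalization error.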


K-Fold Cross-validation

More accurate than using only one validation set.

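[Figure: the data set partitioned three times, each time with a different part used as Dval and the remaining parts as Dtrain, giving Eval(1), Eval(2), Eval(3)]

$E_{gen} \approx \langle E_{val} \rangle = \frac{1}{K} \sum_{k=1}^{K} E_{val}(k)$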
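The averaging over folds can be sketched in a few lines. In this illustration the model is deliberately trivial (it predicts the mean of the training targets), the six data values are invented, and the function names and the way examples are assigned to folds (every K-th example) are assumptions; any learner and error measure could be substituted.

```python
def k_fold_error(data, K, fit, error):
    """Average validation error over K folds: (1/K) * sum_k E_val(k)."""
    fold_errors = []
    for k in range(K):
        d_val = [d for i, d in enumerate(data) if i % K == k]
        d_train = [d for i, d in enumerate(data) if i % K != k]
        model = fit(d_train)
        fold_errors.append(error(model, d_val))
    return sum(fold_errors) / K, fold_errors


def fit_mean(train):
    """Trivial model: always predict the mean of the training targets."""
    return sum(train) / len(train)


def squared_error(model, val):
    """Mean squared error of the constant prediction on the validation fold."""
    return sum((y - model) ** 2 for y in val) / len(val)


# Hypothetical targets (invented for illustration).
data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]

e_gen, e_val = k_fold_error(data, 3, fit_mean, squared_error)
print("E_val per fold:", e_val)   # [4.5, 2.25, 4.5]
print("estimate of E_gen:", e_gen)  # 3.75
```

Every example is used for validation exactly once, so the estimate depends less on one particular split than a single validation set does.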


Question?

If the total number of training samples is small, how can we conduct the cross-validation?

PAC

Any hypothesis that is consistent with a sufficiently large set of training (and test) examples is unlikely to be seriously wrong; it is probably approximately correct (PAC).

What is the relationship between the generalization error and the number of samples needed to achieve this generalization error?


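The error

[Figure: the instance space X with the regions covered by f and h; the error is the region where f and h disagree]

X = the set of all possible examples (instance space).
D = the distribution of these examples.
H = the hypothesis space (h ∈ H).
N = the number of training data.

$\text{error}(h) = P[\,h(\mathbf{x}) \neq f(\mathbf{x}) \mid \mathbf{x} \text{ drawn from } D\,]$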
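Since error(h) is a probability under the distribution D, it can be estimated by sampling. The sketch below assumes a one-dimensional instance space with D uniform on [0, 1], a target f that answers "yes" above 0.5 and a hypothesis h that answers "yes" above 0.6; these are invented for illustration, and they make the true error equal to 0.1 (the interval where the two disagree).

```python
import random


def f(x):
    """Hypothetical target concept."""
    return x > 0.5


def h(x):
    """Hypothetical learned hypothesis; disagrees with f on (0.5, 0.6]."""
    return x > 0.6


def estimate_error(h, f, draw, n):
    """Monte Carlo estimate of P[h(x) != f(x)] for x drawn by draw()."""
    disagreements = 0
    for _ in range(n):
        x = draw()
        if h(x) != f(x):
            disagreements += 1
    return disagreements / n


rng = random.Random(0)   # D: uniform distribution on [0, 1)
err = estimate_error(h, f, rng.random, 100_000)
print("estimated error(h):", err)
```

The estimate converges to the true disagreement probability as the number of samples grows.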


How do we make learning work?

Use simple hypotheses

Always start with the simple ones first

Constrain H with priors

Do we know something about the domain?

Do we have reasonable a priori beliefs on parameters?

Use many observations

Easy to say...

Cross-validation...

