Date post: 07-Jun-2020
Antti Sorjamaa Input and Variable Selection for Local Models
Antti Sorjamaa

Input and Variable Selection for Local Models

Outline

• Basic Concepts
• Input selection
• Models
  – k-NN, Lazy Learning
• Variable Selection
  – Leave-one-out, Bootstraps
• Results

Selection: The Word of Today

• Inputs
  – Input selection method (Wrapper or Filter)
• Model
• Parameters
  – Validation method
  – Bounds
• Local or Global
• Data sets
  – Learning, validation, test

Selection Principle

• Notations:

$$x_n \in \mathbb{R}^d, \quad y_n \in \mathbb{R}, \quad \hat{y}_n = g(x_n, \theta)$$

$$E_{\text{training}}, \quad E_{\text{validation}}$$

• Generalization Error

$$E_{\text{gen}}(\theta) = \lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} \left( g(x_n, \theta) - y_n \right)^2$$

to minimize

Input Selection

• Exhaustive method
  – “Brute force”
  – All 2^d input combinations explored
• Forward or Backward
  – Only d(d-1) input sets evaluated
  – Local minima problems → Suboptimal
• Forward-Backward
  – “More Optimal” ← More input sets estimated
  – Time consumption unknown beforehand
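
The forward variant can be sketched as follows. This is a minimal illustration, not the author's implementation: it assumes NumPy, uses a k-NN regressor scored by leave-one-out error as the selection criterion, and the names `knn_loo_error` and `forward_selection` and the synthetic data are hypothetical. A full forward pass evaluates at most d + (d-1) + ... + 1 = d(d+1)/2 input sets, which is quadratic in d like the count on the slide, against 2^d for brute force.

```python
import numpy as np

def knn_loo_error(X, y, k):
    """Leave-one-out mean squared error of a k-NN regressor."""
    # Pairwise squared Euclidean distances between all samples.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d2, np.inf)           # a point may not be its own neighbour
    nearest = np.argsort(d2, axis=1)[:, :k]
    return float(np.mean((y[nearest].mean(axis=1) - y) ** 2))

def forward_selection(X, y, k=3):
    """Greedy forward search: repeatedly add the input that lowers the error most."""
    d = X.shape[1]
    selected, best_err, n_evaluated = [], np.inf, 0
    while len(selected) < d:
        candidates = [j for j in range(d) if j not in selected]
        errs = [knn_loo_error(X[:, selected + [j]], y, k) for j in candidates]
        n_evaluated += len(candidates)
        if min(errs) >= best_err:          # no candidate improves: stop (possibly a local minimum)
            break
        best_err = min(errs)
        selected.append(candidates[int(np.argmin(errs))])
    return selected, best_err, n_evaluated

# Synthetic data (assumption): only inputs 0 and 2 carry information.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = 2.0 * X[:, 0] - X[:, 2] + 0.05 * rng.normal(size=200)

selected, err, n_evaluated = forward_selection(X, y)
print(selected, round(err, 3), n_evaluated)
```

The early stop is what makes the search cheap and also what makes it suboptimal: once no single addition helps, combinations that would only help together are never tried.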

Input Selection (2)

• Input selection with Forward-Backward method
  – Initialization

Action     Possible Selected Inputs
           1   2   3   4   5
Initial    x   x           x
1              x           x
2          x               x
3          x   x   x       x
4          x   x       x   x
5          x   x
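
One Forward-Backward step, as tabulated above, toggles each of the d inputs in turn: a selected input is removed, an unselected one is added. The sketch below assumes that reading of the table; `neighbours`, `forward_backward` and the toy error function are hypothetical names introduced only for illustration, and in practice `error` would be a validation estimate such as a leave-one-out error.

```python
def neighbours(selected, d):
    """All input sets reachable by toggling exactly one of the inputs 1..d."""
    return [selected ^ {i} for i in range(1, d + 1)]

def forward_backward(initial, d, error):
    """Move to the best one-toggle neighbour until no neighbour lowers the error."""
    current = frozenset(initial)
    current_err = error(current)
    while True:
        best = min(neighbours(current, d), key=error)
        if error(best) >= current_err:     # local minimum reached
            return set(current), current_err
        current, current_err = best, error(best)

# Toy criterion (assumption): distance to a hidden "true" input set {1, 3}.
def toy_error(selected):
    return len(selected ^ {1, 3})

print(neighbours(frozenset({1, 2, 5}), 5))
result, result_err = forward_backward({1, 2, 5}, 5, toy_error)
print(result, result_err)
```

The number of passes through the loop depends on the data and the starting point, which is why the time consumption cannot be known beforehand.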

k-NN

• Fast and reliable
• Can be used as a part or as a whole approximator
• Can be used with many different methods
• Only inputs and k need to be determined beforehand

k-NN

For Classification:

[Figure: a query point “?” between Class 1 and Class 2, with its 1-NN and 3-NN neighbourhoods]

For Regression:

$$\hat{y}_i = \frac{\sum_{j=1}^{k} y_{P(j)}}{k}$$
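
The regression formula averages the outputs of the k nearest neighbours, where P(j) is the index of the j-th nearest training point. A minimal sketch, assuming NumPy and Euclidean distance; `knn_predict` and the toy data are hypothetical.

```python
import numpy as np

def knn_predict(X_train, y_train, x_query, k):
    """k-NN regression: mean output of the k training points closest to x_query."""
    dist = np.linalg.norm(X_train - x_query, axis=1)
    nearest = np.argsort(dist)[:k]         # indices P(1), ..., P(k)
    return float(y_train[nearest].mean())

X_train = np.array([[0.0], [1.0], [2.0], [3.0], [10.0]])
y_train = X_train[:, 0] ** 2               # outputs 0, 1, 4, 9, 100

pred_1nn = knn_predict(X_train, y_train, np.array([1.6]), k=1)
pred_3nn = knn_predict(X_train, y_train, np.array([1.6]), k=3)
print(pred_1nn)   # 4.0, the output of the nearest point x = 2
print(pred_3nn)   # (4 + 1 + 9) / 3, from the points x = 2, 1, 3
```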

Lazy Learning

• Local, linear model
• Laziness
  – “Do nothing until query”
  – No learning mandatory
• Compared to k-NN (local, constant model)
  – More time consuming than k-NN
  – Almost as diversified
• Locality can be “globalized” incrementally

Local, linear model

[Figure: horizontal axis from 0 to 1, vertical axis from -1.5 to 2.5]

Local, linear model

[Figure: horizontal axis from 0.78 to 0.88, vertical axis from -0.6 to 0.2]

Formula

$$y_i = f(x_i) + e_i$$

$$\sum_{i=1}^{N} \left\{ \left( y_i - x_i' \beta \right)^2 K\!\left( \frac{d(x_i, x_q)}{h} \right) \right\}$$

Formula

$$\sum_{i=1}^{N} \left\{ \left( y_i - x_i' \beta \right)^2 K\!\left( \frac{d(x_i, x_q)}{h} \right) \right\}$$

Simplified version: K → KNN

$$P = (X'X)^{-1}$$

$$\hat{\beta} = P X' y$$

$$\hat{y}_q = x_q' \hat{\beta}$$
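
With the kernel simplified to a k-NN indicator, the weighted criterion reduces to ordinary least squares on the k nearest neighbours of the query. A minimal sketch under stated assumptions: NumPy, Euclidean distance, and a constant 1 prepended to each input so the local model has an intercept (the slides do not say whether x already contains one). `lazy_linear_predict` is a hypothetical name.

```python
import numpy as np

def lazy_linear_predict(X_train, y_train, x_query, k):
    """Fit a linear model on the k nearest neighbours of x_query and evaluate it there."""
    dist = np.linalg.norm(X_train - x_query, axis=1)
    nearest = np.argsort(dist)[:k]
    # Neighbour matrix X with a leading column of ones (intercept, an assumption).
    X = np.hstack([np.ones((k, 1)), X_train[nearest]])
    y = y_train[nearest]
    P = np.linalg.inv(X.T @ X)             # P = (X'X)^{-1}
    beta = P @ X.T @ y                     # beta_hat = P X' y
    x_q = np.concatenate([[1.0], x_query])
    return float(x_q @ beta)               # y_hat_q = x_q' beta_hat

# On exactly linear data the local model reproduces the line.
X_train = np.linspace(0.0, 1.0, 50).reshape(-1, 1)
y_train = 3.0 * X_train[:, 0] + 1.0
y_q = lazy_linear_predict(X_train, y_train, np.array([0.83]), k=5)
print(y_q)
```

Nothing is fitted until `x_query` arrives, which is the laziness the slides describe; the cost is one neighbour search and one small least-squares solve per query.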

“Do nothing until query”

[Figure: horizontal axis from 0.78 to 0.88, vertical axis from -0.6 to 0.2]

“Do nothing until query”

New input needs an output approximation

• Validate optimal inputs
  – Search nearest neighbors
  – Validate optimal neighborhood size
• Build linear model
• Calculate the needed estimate

Example: y(t) = LL(y(t-1), y(t-2))

[Figure: 1st parameter value (horizontal axis, -6 to 6) against 2nd parameter value (vertical axis, -6 to 6). Annotation: k local or global!]

Lazy Learning

• Locality can be “globalized” incrementally
  – Global k instead of Local k
  – Globally selected inputs instead of Local
• More Globalization → Less Laziness
• Best amount of Globalization should be determined for each case
  – Intensive validation and testing
  – Different attached methods

Leave-One-Out (LOO)

[Diagram: DATA split into a validation set of 1 sample and a learning set]

• A model is built → Error
• Procedure repeated N times

$$E_{\text{gen}} = \frac{1}{N} \sum_{i=1}^{N} \left( \hat{y}_i - y_i \right)^2$$
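
For k-NN the leave-one-out loop needs no refitting: each sample is predicted from its k nearest neighbours among the other N-1 samples. The sketch below uses this to choose k; it assumes NumPy, and the function name, the synthetic data and the candidate range of k are illustrative assumptions.

```python
import numpy as np

def knn_loo_error(X, y, k):
    """Leave-one-out MSE of k-NN: each point is predicted without itself."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d2, np.inf)           # exclude the held-out point
    nearest = np.argsort(d2, axis=1)[:, :k]
    y_hat = y[nearest].mean(axis=1)
    return float(np.mean((y_hat - y) ** 2))

# Noisy sine data (assumption, for illustration only).
rng = np.random.default_rng(0)
X = rng.uniform(-3.0, 3.0, size=(300, 1))
y = np.sin(X[:, 0]) + 0.3 * rng.normal(size=300)

errors = {k: knn_loo_error(X, y, k) for k in range(1, 31)}
best_k = min(errors, key=errors.get)
print(best_k, round(errors[best_k], 4))
```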

Recursive formula for LOO

$$P(k+1) = P(k) - \frac{P(k)\, x(k+1)\, x'(k+1)\, P(k)}{1 + x'(k+1)\, P(k)\, x(k+1)}$$

$$\gamma(k+1) = P(k+1)\, x(k+1)$$

$$e(k+1) = y(k+1) - x'(k+1)\, \hat{\beta}(k)$$

$$\hat{\beta}(k+1) = \hat{\beta}(k) + \gamma(k+1)\, e(k+1)$$
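
The recursion updates the local linear model when the (k+1)-th neighbour is added, so models for successive neighbourhood sizes do not have to be refitted from scratch. The sketch below implements one update step and checks it against a batch least-squares fit. It assumes NumPy and a symmetric P, which holds for P = (X'X)^{-1}; `rls_update` and the synthetic data are hypothetical.

```python
import numpy as np

def rls_update(P, beta, x_new, y_new):
    """Add one sample to a least-squares fit using the recursive formulas."""
    Px = P @ x_new                                       # P(k) x(k+1); equals (x' P)' since P is symmetric
    P_next = P - np.outer(Px, Px) / (1.0 + x_new @ Px)   # P(k+1)
    gamma = P_next @ x_new                               # gamma(k+1)
    e = y_new - x_new @ beta                             # e(k+1), computed with beta_hat(k)
    return P_next, beta + gamma * e, e                   # beta_hat(k+1)

# Synthetic regression data with an intercept column (assumption).
rng = np.random.default_rng(0)
X = np.hstack([np.ones((20, 1)), rng.normal(size=(20, 2))])
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=20)

# Initialise from the first 5 samples, then add the remaining ones one by one.
P = np.linalg.inv(X[:5].T @ X[:5])
beta = P @ X[:5].T @ y[:5]
for x_new, y_new in zip(X[5:], y[5:]):
    P, beta, e = rls_update(P, beta, x_new, y_new)

beta_batch = np.linalg.lstsq(X, y, rcond=None)[0]
print(beta, beta_batch)
```

Each update costs a few matrix-vector products instead of a new matrix inversion, which keeps a sweep over many neighbourhood sizes affordable.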

Recursive formula for LOO

[Figure: horizontal axis from 0.78 to 0.88, vertical axis from -0.6 to 0.2. Label: $\hat{\beta}(k)$]

Recursive formula for LOO

[Figure: horizontal axis from 0.78 to 0.88, vertical axis from -0.6 to 0.2. Labels: $x(k+1)$, $\hat{\beta}(k+1)$]

Recursive formula for LOO

[Figure: $e(k)$ against $k$, with the optimal k marked]

Bootstrap Resampling

[Diagram: World ≠ Sample; Sample = New World, from which a New Sample is drawn]

Bootstrap Resampling

[Diagram: New World → New sample]

Definition:

$$\text{optimism}(\theta) \triangleq E_{sam,gen}(\theta) - E_{sam,sam}(\theta)$$

$$E_{sam,gen}(\theta) = E_{sam,sam}(\theta) + \text{optimism}(\theta)$$

Estimate:

$$\widehat{\text{optimism}}(\theta) = \frac{1}{B} \sum_{b=1}^{B} \left( E^{b}_{new,sam}(\theta) - E^{b}_{new,new}(\theta) \right)$$

$$E_{sam,gen}(\theta) = E_{sam,sam}(\theta) + \widehat{\text{optimism}}(\theta)$$
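
In the notation above, the first subscript is the set the model is built from and the second is the set it is evaluated on. The sketch below estimates the optimism for a k-NN regressor; it assumes NumPy, and `knn_error`, `bootstrap_error`, B = 50 and the synthetic data are illustrative choices rather than the settings used in the slides.

```python
import numpy as np

def knn_error(X_train, y_train, X_eval, y_eval, k):
    """MSE on (X_eval, y_eval) of a k-NN regressor built from (X_train, y_train)."""
    d2 = ((X_eval[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(d2, axis=1)[:, :k]
    return float(np.mean((y_train[nearest].mean(axis=1) - y_eval) ** 2))

def bootstrap_error(X, y, k, B=50, seed=0):
    """Return (E_sam,sam + estimated optimism, estimated optimism)."""
    rng = np.random.default_rng(seed)
    n = len(y)
    optimism = 0.0
    for _ in range(B):
        idx = rng.integers(0, n, size=n)                          # "new" sample, drawn with replacement
        e_new_sam = knn_error(X[idx], y[idx], X, y, k)            # built on new, tested on sample
        e_new_new = knn_error(X[idx], y[idx], X[idx], y[idx], k)  # built on new, tested on new
        optimism += (e_new_sam - e_new_new) / B
    e_sam_sam = knn_error(X, y, X, y, k)
    return e_sam_sam + optimism, optimism

# Noisy sine data (assumption, for illustration only).
rng = np.random.default_rng(1)
X = rng.uniform(-3.0, 3.0, size=(200, 1))
y = np.sin(X[:, 0]) + 0.3 * rng.normal(size=200)

e_gen_1, optimism_1 = bootstrap_error(X, y, k=1)
e_gen_10, optimism_10 = bootstrap_error(X, y, k=10)
print(e_gen_1, e_gen_10)
```

For k = 1 every evaluation point that also belongs to the training set is its own nearest neighbour, so both E_sam,sam and E_new,new are zero and the estimate rests entirely on the optimism term.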

Bootstrap 632

+ 0.632 is derived from the probability of a single data point being selected to the bootstrap set
+ Unbiased and faster to evaluate

Bootstrap:

$$\widehat{\text{optimism}}(\theta) = \frac{1}{B} \sum_{b=1}^{B} \left( E^{b}_{new,sam}(\theta) - E^{b}_{new,new}(\theta) \right)$$

Bootstrap 632:

$$\widehat{\text{optimism}}_{632}(\theta) = \frac{1}{B} \sum_{b=1}^{B} E^{b}_{new,\overline{new}}(\theta)$$

$$E_{gen} = (1 - 0.632)\, E_{sam,sam} + 0.632\, \widehat{\text{optimism}}_{632}(\theta)$$

$$\overline{new} = sample - new$$
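
A data point is drawn at least once in N draws with replacement with probability 1 - (1 - 1/N)^N, which tends to 1 - 1/e ≈ 0.632. The sketch below checks that number and computes the Bootstrap 632 estimate for a k-NN regressor, evaluating each bootstrap model only on the points left out of its sample. It assumes NumPy; the function names, B = 50 and the synthetic data are illustrative.

```python
import numpy as np

def knn_error(X_train, y_train, X_eval, y_eval, k):
    """MSE on (X_eval, y_eval) of a k-NN regressor built from (X_train, y_train)."""
    d2 = ((X_eval[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(d2, axis=1)[:, :k]
    return float(np.mean((y_train[nearest].mean(axis=1) - y_eval) ** 2))

def bootstrap_632_error(X, y, k, B=50, seed=0):
    """(1 - 0.632) * E_sam,sam + 0.632 * mean error on the left-out points."""
    rng = np.random.default_rng(seed)
    n = len(y)
    left_out_errors = []
    for _ in range(B):
        idx = rng.integers(0, n, size=n)            # "new" sample
        out = np.setdiff1d(np.arange(n), idx)       # sample minus new
        if out.size == 0:                           # guard; practically never happens for large n
            continue
        left_out_errors.append(knn_error(X[idx], y[idx], X[out], y[out], k))
    optimism_632 = float(np.mean(left_out_errors))
    e_sam_sam = knn_error(X, y, X, y, k)
    return (1.0 - 0.632) * e_sam_sam + 0.632 * optimism_632

# Probability that a given point appears in a bootstrap set of size n = 1000.
p_selected = 1.0 - (1.0 - 1.0 / 1000) ** 1000
print(round(p_selected, 4))

# Noisy sine data (assumption, for illustration only).
rng = np.random.default_rng(1)
X = rng.uniform(-3.0, 3.0, size=(200, 1))
y = np.sin(X[:, 0]) + 0.3 * rng.normal(size=200)
e_632 = bootstrap_632_error(X, y, k=10)
print(e_632)
```

Only one model evaluation per bootstrap replicate is needed here, against two for the plain bootstrap optimism.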

The Method

• Input selection with brute force
  – All 2^d input combinations explored
• Using k-NN as approximator
• k selected with Leave-one-out, Bootstrap and Bootstrap 632
  – Best k selected with each method
• Best input combination selected with each method
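
The procedure above can be sketched for the leave-one-out case as a double loop over input subsets and values of k. This is an illustration under stated assumptions, not the code behind the reported results: it assumes NumPy, skips the empty input set (so 2^d - 1 subsets are scored), and the function names, the candidate k values and the synthetic data are hypothetical.

```python
import itertools
import numpy as np

def knn_loo_error(X, y, k):
    """Leave-one-out MSE of a k-NN regressor."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d2, np.inf)
    nearest = np.argsort(d2, axis=1)[:, :k]
    return float(np.mean((y[nearest].mean(axis=1) - y) ** 2))

def exhaustive_selection(X, y, k_values):
    """Score every non-empty input subset with every k; keep the best pair."""
    d = X.shape[1]
    best_err, best_subset, best_k = np.inf, None, None
    n_subsets = 0
    for size in range(1, d + 1):
        for subset in itertools.combinations(range(d), size):
            n_subsets += 1
            for k in k_values:
                err = knn_loo_error(X[:, list(subset)], y, k)
                if err < best_err:
                    best_err, best_subset, best_k = err, subset, k
    return best_err, best_subset, best_k, n_subsets

# Synthetic data (assumption): only inputs 0 and 2 carry information.
rng = np.random.default_rng(0)
X = rng.normal(size=(150, 4))
y = 2.0 * X[:, 0] - X[:, 2] + 0.05 * rng.normal(size=150)

best_err, best_subset, best_k, n_subsets = exhaustive_selection(X, y, k_values=[1, 3, 5])
print(best_subset, best_k, round(best_err, 3), n_subsets)
```

Swapping `knn_loo_error` for a bootstrap estimate gives the other two variants; the subset loop stays the same and its cost doubles with every extra candidate input.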

Results

Method          Selected inputs           k    Ê_gen    Test error
LOO             t - {1, 2, 3, 5, 7, 8}    15   0.9219   1.1650
Bootstrap       t - {1, 2, 4, 5, 7, 8}    1    0.6054   1.8458
Bootstrap 632   t - {1, 2, 3, 5, 7, 8}    16   0.9333   1.1625

• Darwin Sea Pressure Data: 1400 values
  – 1000 values for training and 400 for testing

The Method 2

• Input selection
  – For k-NN, all 2^d input possibilities explored
  – For LL, Backward Selection and continuous
• k selected with Leave-one-out
  – Best k selected with each method combination
• Best input combination selected with each method combination
• Testing k-NN selected inputs with LL

Results 2

• Santa Fe Data: 10 000 values
  – 1000 values for training and 9000 for testing

Method       k    Learning     Calculation time   Test      Prediction 40 steps
                  LOO error    (minutes)          MSE       MSE
LL           56   42.32        2.58               42.0746   1765.6
LL pruned    59   19.42        13.95              20.6037   148.37
k-NN         3    57.71        33.78              53.5387   1252.1
k-NN + LL    15   33.57        0.20               31.4548   1770.1

Conclusions

• Leave-one-out is a fast and good method for selecting inputs
• Bootstraps can select a more optimal number of neighbours for k-NN
• Inputs selected with k-NN are not as good to use with LL as the ones selected with LL → k-NN is not a good filter for LL

Questions?

Publications:

• A. Sorjamaa, A. Lendasse, and M. Verleysen, “Pruned Lazy Learning Models for Time Series Prediction,” ESANN 2005, pp. 509–514.
• A. Sorjamaa, N. Reyhani, and A. Lendasse, “Input and Structure Selection for k-NN Approximator,” IWANN 2005, Lecture Notes in Computer Science, vol. 3512, pp. 985–991.
• C. Atkeson, A. Moore, and S. Schaal, “Locally Weighted Learning,” AI Review, 11:11-73, April 1997.

