Joseph Xu, Soar Workshop 2012: Learning Modal Continuous Models

Date post: 17-Jan-2016
Transcript
Page 1:

Joseph Xu
Soar Workshop 2012

Learning Modal Continuous Models

Page 2:

Setting: Continuous Environment

• Input to the agent is a set of objects with continuous properties
– Position, rotation, scaling, ...

• Output is a fixed-length vector of continuous numbers

• Agent runs in lock-step with environment

• Fully observable

[Figure: the agent and the environment exchange data at every step. Input to the agent: objects A and B, each described by position (px, py, pz) and rotation (rx, ry, rz) values, e.g. position 3.4, 3.9, 0.0 and rotation 0.0, 0.0, 0.0. Output from the agent: a vector of continuous numbers, e.g. -9.0, 5.8.]

Page 3:

Levels of Problem Solving

[Figure: problem solving methods and the knowledge they require, arranged along a scale of characteristics.]

Problem Solving Method: Motor Babbling; Continuous Sampling Methods (RRT); Symbolic Model Free Methods (RL); Symbolic Planning

Knowledge Required: None; Goal Recognition; Continuous Model; Symbolic Abstraction; Symbolic Model

Characteristics: from Slower Task Completion, Specific Solutions to Faster Task Completion, General Solutions

Page 4:

Continuous Model Learning

• Learn a function y = f(x, u)

• x: current continuous state vector

• u: current output vector

• y: state vector in next time step

[Figure: the continuous state x and the output u map to the next state y.]

Page 5:

Locally Weighted Regression

[Figure: a query (x, u), e.g. the motor command left voltage: -0.6, right voltage: 1.2, is answered by finding its k nearest neighbors and fitting a weighted linear regression to them.]

$f(x, u) = \sum_i w_i x_i + \sum_j w_j u_j$
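A minimal Python sketch of locally weighted regression as outlined on this slide, under stated assumptions rather than as the author's implementation: find the k stored examples nearest to the query (x, u) by Euclidean distance, weight each by a Gaussian of that distance, and fit a weighted linear model that predicts one component of the next state. The function names, the Gaussian kernel, the `bandwidth` parameter, and the small `ridge` term that keeps the system solvable are all assumptions made for illustration.

```python
import math


def solve_linear_system(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def lwr_predict(query, inputs, targets, k=5, bandwidth=1.0, ridge=1e-9):
    """Predict a scalar target at `query` from its k nearest training inputs.

    Each input is the concatenation of a state vector x and an output vector u.
    """
    dist = [math.dist(query, p) for p in inputs]
    nearest = sorted(range(len(inputs)), key=dist.__getitem__)[:k]
    dim = len(query) + 1  # one weight per feature plus a bias term
    # Accumulate the weighted normal equations (A^T W A) c = A^T W y.
    ata = [[ridge if r == c else 0.0 for c in range(dim)] for r in range(dim)]
    atb = [0.0] * dim
    for i in nearest:
        w = math.exp(-((dist[i] / bandwidth) ** 2))  # assumed Gaussian kernel
        row = list(inputs[i]) + [1.0]
        for r in range(dim):
            atb[r] += w * row[r] * targets[i]
            for c in range(dim):
                ata[r][c] += w * row[r] * row[c]
    coef = solve_linear_system(ata, atb)
    return sum(c * v for c, v in zip(coef, list(query) + [1.0]))


# Hypothetical data: each input is an (x, u) pair with one state dimension and
# one output dimension; the target is the next state value.
train_inputs = [(float(x), float(u)) for x in range(5) for u in range(5)]
train_targets = [2.0 * x - u + 0.5 for x, u in train_inputs]
prediction = lwr_predict((1.5, 2.5), train_inputs, train_targets, k=8)
```

Because the weights depend only on Euclidean distance to the query, neighbors that are close in coordinates but interact differently with other objects still get averaged together.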

Page 6:

Problems with LWR

• Euclidean distance doesn’t capture relational similarity

• Averages over neighbors exhibiting different types of interactions

[Figure: a query scene and its nearest-neighbor scenes.]

Page 7:

Problems with LWR

• Euclidean distance doesn’t capture relational similarity

• Averages over neighbors exhibiting different types of interactions

[Figure: a query scene, two neighbor scenes, and the resulting prediction.]

Page 8:

Modal Models

• Object behavior can be categorized into different modes
– Behavior within a single mode is usually simple and smooth (inertia, gravity, etc.)
– Behaviors across modes can be discontinuous and complex (collisions, drops)
– Modes can often be distinguished by discrete spatial relationships between objects

• Learn two-level models composed of:
– A classifier that determines the active mode using spatial relationships
– A set of linear functions (initial hypothesis), one for each mode

[Figure: the scene feeds a mode classifier, which selects the Mode 1, Mode 2, or Mode 3 model to produce the prediction.]
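The two-level structure can be sketched in a few lines of Python. This is an illustrative sketch only: the class names, the `touch` relation key, and the two example modes (the pushed block stays put unless the blocks touch, in which case it moves by dx) are hypothetical and are not taken from the author's system.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Mapping, Sequence


@dataclass
class LinearModel:
    """One mode's model: a linear function of the continuous features."""
    weights: Sequence[float]
    bias: float = 0.0

    def predict(self, features: Sequence[float]) -> float:
        return self.bias + sum(w * f for w, f in zip(self.weights, features))


@dataclass
class ModalModel:
    """A classifier over spatial relations plus one linear model per mode."""
    classify: Callable[[Mapping[str, bool]], int]
    modes: Dict[int, LinearModel]

    def predict(self, relations: Mapping[str, bool], features: Sequence[float]):
        mode = self.classify(relations)                    # level 1: pick the mode
        return mode, self.modes[mode].predict(features)   # level 2: linear model


# Hypothetical example: predict the pushed block's next x from (block_x, dx).
pushed_block_x = ModalModel(
    classify=lambda rel: 2 if rel["touch"] else 1,
    modes={
        1: LinearModel(weights=[1.0, 0.0]),  # no contact: x stays the same
        2: LinearModel(weights=[1.0, 1.0]),  # contact: x moves by dx
    },
)
no_contact = pushed_block_x.predict({"touch": False}, [3.0, 0.5])
contact = pushed_block_x.predict({"touch": True}, [3.0, 0.5])
```

Keeping the discontinuity in the classifier lets each per-mode function stay simple.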

Page 9:

Unsupervised Learning of Modes From Data

[Figure: over time the environment alternates between Mode 1 and Mode 2. The training data consist of continuous features (e.g. 0.5, 1.1, -0.2, 4, 17) paired with a value of 𝒚 (e.g. 21.9); Expectation Maximization separates the data into Learned Mode 1 and Learned Mode 2.]

Page 10:

Expectation Maximization

• Expectation: Assuming your current model parameters are correct, what is the likelihood that model m generated data point i?

• Maximization: Assuming each data point was generated by the most probable model, modify each model’s parameters to maximize the likelihood of generating the data

• Iterate until convergence to a local maximum
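The loop above can be sketched as a hard-assignment EM over a set of linear models. This is a simplified illustration, not the author's algorithm: it assumes one input feature, models of the form y = a*x + b, and equal Gaussian noise in every mode, so that the most probable model for a point is simply the one with the smallest squared residual. The function names and the toy data are hypothetical.

```python
def fit_line(points):
    """Least-squares fit of y = a*x + b to a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    a = sxy / sxx if sxx > 0 else 0.0
    return a, mean_y - a * mean_x


def hard_em(points, initial_models, max_iters=50):
    """Alternate between assigning points to modes and refitting each mode."""
    models = list(initial_models)
    assignment = None
    for _ in range(max_iters):
        # Expectation (hard): pick the most probable model for each point,
        # which under equal Gaussian noise is the smallest squared residual.
        new_assignment = [
            min(range(len(models)),
                key=lambda m: (y - (models[m][0] * x + models[m][1])) ** 2)
            for x, y in points
        ]
        if new_assignment == assignment:
            break  # assignments stopped changing: a local optimum
        assignment = new_assignment
        # Maximization: refit each model to the points assigned to it.
        for m in range(len(models)):
            members = [p for p, a in zip(points, assignment) if a == m]
            if len(members) >= 2:
                models[m] = fit_line(members)
    return models, assignment


# Hypothetical data from two modes: y = 1 and y = 2x + 5.
data = [(float(x), 1.0) for x in range(10)]
data += [(float(x), 2.0 * x + 5.0) for x in range(10)]
learned_models, learned_modes = hard_em(data, [(0.0, 0.0), (1.0, 6.0)])
```

As the slide notes, the result is only a local maximum, so the outcome depends on the initial models.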

Page 11:

Learning Classifier

[Figure: at each time step the scene with objects A and B is encoded as spatial relations, e.g. left-of(A,B) = 1, right-of(A,B) = 0, on-top(A,B) = 0, touch(A,B) = 0. These binary attributes (e.g. 1000101011011) are paired with a class (e.g. 1, from the sequence 1111222211), where the classes correspond to Learned Mode 1 and Learned Mode 2 produced by Expectation Maximization from the continuous training data (e.g. 0.5, 1.1, -0.2, 4, 17 and 21.9).]

Page 12:

Learning Classifier

[Figure: classifier training data with binary attributes (e.g. 1000101011011) and a class per example (e.g. 1, from the sequence 1111222211). The learned decision tree tests touch(A, B) and left-of(A, B), with 1/0 branches leading to mode 1 and mode 2 leaves.]

Use linear model for items in same model
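A small Python sketch of learning a decision tree over binary spatial-relation attributes, with the mode labels as classes. The slides show a tree over touch(A, B) and left-of(A, B) but do not say which tree learner was used, so the entropy-based splitting below, the function names, and the toy data are assumptions for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (bits) of a list of class labels."""
    total = len(labels)
    return -sum(c / total * math.log2(c / total) for c in Counter(labels).values())


def majority(labels):
    return Counter(labels).most_common(1)[0][0]


def build_tree(rows, labels, attributes):
    """Build a tree from rows (dicts of attribute -> 0/1) and mode labels.

    Returns either a class label (leaf) or a tuple (attribute, {0: subtree, 1: subtree}).
    """
    if len(set(labels)) == 1 or not attributes:
        return majority(labels)

    def split_entropy(attr):
        # Weighted entropy of the labels after splitting on attr.
        total = 0.0
        for value in (0, 1):
            subset = [l for r, l in zip(rows, labels) if r[attr] == value]
            if subset:
                total += len(subset) / len(labels) * entropy(subset)
        return total

    best = min(attributes, key=split_entropy)
    rest = [a for a in attributes if a != best]
    branches = {}
    for value in (0, 1):
        idx = [i for i, r in enumerate(rows) if r[best] == value]
        if idx:
            branches[value] = build_tree(
                [rows[i] for i in idx], [labels[i] for i in idx], rest)
        else:
            branches[value] = majority(labels)  # unseen value: fall back
    return best, branches


def classify(tree, row):
    """Follow the tree's tests until a leaf (mode label) is reached."""
    while isinstance(tree, tuple):
        attr, branches = tree
        tree = branches[row[attr]]
    return tree


# Hypothetical training data: mode 1 only when A touches B and is left of B.
rows = [{"touch": t, "left_of": l, "on_top": o}
        for t in (0, 1) for l in (0, 1) for o in (0, 1)]
modes = [1 if r["touch"] and r["left_of"] else 2 for r in rows]
tree = build_tree(rows, modes, ["touch", "left_of", "on_top"])
```

In this toy data the `on_top` attribute carries no information about the mode, so the tree never tests it.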

Page 13:

Prediction Accuracy Experiment

• 2 Block Environment
– Agent has two outputs (dx, dy) which control the x and y offsets of the controlled block at every time step
– The pushed block can’t be moved except by pushing it with the controlled block
– Blocks are always axis-aligned and there’s no momentum

• Training
– Instantiate Soar agent in a variety of spatial configurations
– Run 10 time steps, each step is a training example

• Testing
– Instantiate Soar agent in some configuration
– Check accuracy of prediction for next time step
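To make the setup concrete, here is a hypothetical Python stand-in for the two-block environment and the collection of training examples. The slides do not specify block sizes or how contact is resolved, so the unit-square blocks, the center-point positions, and the axis-by-axis push rule are all assumptions, and the function names are invented for this sketch.

```python
SIZE = 1.0  # assumed edge length of both square blocks


def overlaps(a, b):
    """True if two axis-aligned blocks (given by center points) overlap."""
    return abs(a[0] - b[0]) < SIZE and abs(a[1] - b[1]) < SIZE


def step(controlled, pushed, dx, dy):
    """Move the controlled block by (dx, dy); push the other block on contact.

    There is no momentum: the pushed block only moves far enough to stop
    overlapping the controlled block, along the axis of motion.
    """
    c, p = list(controlled), list(pushed)
    for axis, delta in ((0, dx), (1, dy)):
        c[axis] += delta
        if delta != 0 and overlaps(c, p):
            direction = 1.0 if delta > 0 else -1.0
            p[axis] = c[axis] + direction * SIZE
    return tuple(c), tuple(p)


def rollout(controlled, pushed, actions):
    """Run one scenario; each time step yields one (x, u, y) training example."""
    examples = []
    for dx, dy in actions:
        next_c, next_p = step(controlled, pushed, dx, dy)
        x = controlled + pushed   # current continuous state vector
        u = (dx, dy)              # current output vector
        y = next_c + next_p       # state vector in the next time step
        examples.append((x, u, y))
        controlled, pushed = next_c, next_p
    return examples


# Ten time steps of pushing to the right, giving ten training examples.
examples = rollout((0.0, 0.0), (1.5, 0.0), [(0.5, 0.0)] * 10)
```

The first step of this scenario involves no contact and the later steps do, so a single rollout already contains examples from two different behaviors.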

Page 14:

Prediction Accuracy – Pushed Block

[Plot: Average Error (log scale, 1E-8 to 1E+4) against Training Scenarios (10 to 80), with series MM x, MM y, SM x, and SM y.]

Page 15:

Classification Performance

[Plot: Errors (0 to 9) against Training Scenarios (0 to 90), with series X errors and Y errors.]

Page 16:

Prediction Performance Without Classification Errors

[Plot: Average Error (log scale, 1E-08 to 1E+04) against Training Scenarios (0 to 90), with series Best X, Best Y, Real X, and Real Y.]

Page 17:

Levels of Problem Solving

[Figure: the same levels-of-problem-solving diagram as on Page 3, listing Motor Babbling, Continuous Sampling Methods (RRT), Symbolic Model Free Methods (RL), and Symbolic Planning against the knowledge required: None, Goal Recognition, Continuous Model, Symbolic Abstraction, Symbolic Model.]

Page 18:

Symbolic Abstraction

• Lump continuous states sharing symbolic properties into a single symbolic state

• Should be Predictable
– Planning requires accurate model (ex. STRIPS operators)
– Tends to require more states, more symbolic properties

• Should be General
– Fast planning and transferable solutions
– Tends to require fewer states, fewer symbolic properties

[Figure: many continuous configurations of C1 and C2 grouped into two symbolic states, S1: intersect(C1, C2) and S2: ~intersect(C1, C2).]
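The lumping of continuous states into the two symbolic states S1 and S2 can be sketched as follows. Only the predicate name intersect(C1, C2) and the state labels come from the slide; the `Box` representation (lower-left corner plus width and height) and the function names are assumptions for illustration.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned box; (x, y) is the lower-left corner (an assumed encoding)."""
    x: float
    y: float
    width: float
    height: float


def intersect(a: Box, b: Box) -> bool:
    """Spatial predicate: true when the interiors of the two boxes overlap."""
    return (a.x < b.x + b.width and b.x < a.x + a.width
            and a.y < b.y + b.height and b.y < a.y + a.height)


def abstract_state(c1: Box, c2: Box) -> str:
    """Map a continuous configuration to the symbolic state it belongs to."""
    return "S1" if intersect(c1, c2) else "S2"


# Different continuous configurations collapse to the same symbolic state.
overlapping = [abstract_state(Box(0.0, 0.0, 2.0, 2.0), Box(dx, 1.0, 2.0, 2.0))
               for dx in (0.5, 1.0, 1.5)]
separated = abstract_state(Box(0.0, 0.0, 1.0, 1.0), Box(3.0, 3.0, 1.0, 1.0))
```

Adding more predicates would split these states further, which is the trade-off between predictability and generality described in the bullets.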

Page 19:

Symbolic Abstraction

• Hypothesis: a contiguous region of continuous space that shares a single behavioral mode is a good abstract state
– Planning within modes is simple because of linear behavior
– Combinatorial search occurs at symbolic level

• Spatial predicates used in the continuous model decision tree are a reasonable approximation

Page 20:

Abstraction Experiment

• 3 blocks, goal is to push c2 to t
• Demonstrate a solution trace to agent
• Agent stores sequence of abstract states in solution in epmem
• Agent tries to follow plan in analogous task
• Abstraction should include predicates about c1, c2, t, and avoid predicates about d1, d2, d3

[Figure: two task layouts, each containing C1, C2, the target t, and the objects d1, d2, d3.]

Page 21:

Generalization Performance

[Bar chart: Number of Tasks Solved (axis 0 to 30) by Abstraction Type (Learned, 10 Rnd, 40 Rnd, 80 Rnd, All). Bar labels in extraction order: 28.1, 1.7, 7, 10.1, 10.3. Annotations: 80 Tasks Total; (16 average).]

Page 22:

Conclusions

• For continuous environments with interacting objects, modal models are more general and accurate than uniform models

• The relationships that distinguish between modes serve as a useful symbolic abstraction over continuous state

• All this work takes Soar toward being able to autonomously learn and improve behavior in continuous environments

Page 23:

Evaluation

Coal
• Scaling issues: linear regression is exponential in number of objects
• Linear modes are insufficient for more complex physics such as bouncing -> catastrophic failure

Nuggets
• Modal model learning is more accurate and general than uniform models
• Abstraction learning results are promising, but preliminary

