Laboratory for Perceptual Robotics – Department of Computer Science

Hierarchical Mechanisms for Robot Programming

Shiraj Sen, Stephen Hart, Rod Grupen

Laboratory for Perceptual Robotics, University of Massachusetts Amherst

May 30, 2008

NEMS '08

Outline

Hierarchical mechanisms for robot programming

• Representation
  • Action: potential functions, value functions
  • State representation
• Programming
  • User defined
  • Reinforcement learning: intrinsic, extrinsic

Hierarchical Actions

[Figure: three stacked closed-loop controllers (Σ, G, H blocks) exchanging force/velocity references and feedback signals.]

• Closed-loop primitive actions: potential fields ϕ
• Programs: value functions Φ; greedy traversal avoids local minima

Primitive Action Programming Interface

Sensory Error
• Visual (u_ref)
• Tactile (f_ref)
• Configuration variables (θ_ref)
• Operational space (x_ref)

Potential Functions (ϕ)
• Spring potential fields (ϕ_h)
• Collision-free motion fields (ϕ_c)
• Kinematic conditioning fields (ϕ_cond)

Motor Variables
• Subsets of: configuration variables, operational space variables

Primitive actions: a = [expression missing in source]

Nullspace projection: combines a1 and a2 [operator missing in source]
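The slide names nullspace projection as the way two primitive actions are combined but does not show the formula. The following is a minimal sketch of the general technique, assuming that a1 is the higher-priority action with a single-row task Jacobian and that a2 is the subordinate action. The function names, the Jacobian, and the command values are hypothetical and are not taken from the slides.

```python
def nullspace_project(j1, dq2):
    """Project the subordinate command dq2 into the nullspace of row Jacobian j1.

    For a single-row Jacobian, N = I - J1^T J1 / (J1 J1^T), so the projected
    command is dq2 - j1 * (j1 . dq2) / (j1 . j1).
    """
    norm_sq = sum(j * j for j in j1)
    if norm_sq == 0.0:
        return list(dq2)  # a1 constrains nothing, so a2 passes through
    scale = sum(j * d for j, d in zip(j1, dq2)) / norm_sq
    return [d - scale * j for j, d in zip(j1, dq2)]


def compose(dq1, j1, dq2):
    """Combined command: a1 unchanged, plus a2 restricted to a1's nullspace."""
    projected = nullspace_project(j1, dq2)
    return [a + b for a, b in zip(dq1, projected)]


# Hypothetical 3-joint example: a1 only cares about joint 0.
j1 = [1.0, 0.0, 0.0]
dq1 = [0.5, 0.0, 0.0]        # a1's joint-velocity command
dq2 = [0.3, -0.2, 0.1]       # a2's joint-velocity command
dq = compose(dq1, j1, dq2)   # a2's joint-0 component is removed
```

Here the subordinate action keeps only the motion that does not disturb the superior action's task direction, which is the property the projection is meant to guarantee.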

State Representation

Discrete abstraction of action dynamics: 4-level logic in control predicate p_i

• X: unknown
• -: no reference
• 1: convergence
• 0: descending gradient
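As an illustration of the 4-level predicate, here is a minimal sketch that maps a controller's status to X, -, 1, or 0 while it descends a one-dimensional spring potential. The convergence test (change in potential below a threshold `eps`), the gain, and all function names are assumptions made for this example; the slides do not specify them.

```python
def control_predicate(activated, has_reference=False, phi_dot=0.0, eps=1e-3):
    """Map controller status to the 4-level logic (thresholds are assumed)."""
    if not activated:
        return "X"   # unknown: controller has not been run
    if not has_reference:
        return "-"   # no reference signal available
    if abs(phi_dot) < eps:
        return "1"   # convergence: potential has stopped changing
    return "0"       # descending gradient


def spring_potential(x, x_ref):
    """Quadratic spring potential around the reference."""
    return 0.5 * (x - x_ref) ** 2


def run_until_converged(x, x_ref, gain=0.5, max_steps=100):
    """Greedy descent on the potential, recording the predicate per step."""
    trace = []
    phi = spring_potential(x, x_ref)
    for _ in range(max_steps):
        x -= gain * (x - x_ref)            # gradient of phi is (x - x_ref)
        new_phi = spring_potential(x, x_ref)
        p = control_predicate(True, True, new_phi - phi)
        trace.append(p)
        phi = new_phi
        if p == "1":
            break
    return x, trace


x_final, trace = run_until_converged(1.0, 0.0)
```

The trace reads 0 while the potential is still decreasing and ends with 1 once the per-step change falls below the threshold.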

Hierarchical Programming

A program is defined as an MDP over a vector of controller predicates:

S = (p_1, …, p_N)

Absorbing states in the value function capture "convergence" of programs.

Learn value functions using reinforcement learning.

[Figure: predicate values X, -, 1, 0.]
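To make the idea of learning a value function over predicate vectors concrete, here is a minimal tabular Q-learning sketch on a toy two-controller problem. The slides do not say which reinforcement learning algorithm was used; Q-learning, the toy transition rule in `step`, the reward of 1.0 at the absorbing state, and every parameter value are assumptions chosen for illustration.

```python
import random

ACTIONS = (0, 1)  # hypothetical controllers a0 and a1


def step(state, action):
    """Toy dynamics: a0 always converges; a1 converges only after a0 has."""
    p = list(state)
    if action == 0:
        p[0] = "1"
    else:
        p[1] = "1" if p[0] == "1" else "-"
    nxt = tuple(p)
    done = nxt[1] == "1"               # absorbing state: a1 has converged
    return nxt, (1.0 if done else 0.0), done


def q_learn(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.3, seed=0):
    """Tabular Q-learning over states S = (p_0, p_1)."""
    rng = random.Random(seed)
    q = {}
    for _ in range(episodes):
        state = ("X", "X")
        for _ in range(20):
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q.get((state, a), 0.0))
            nxt, reward, done = step(state, action)
            best_next = 0.0 if done else max(
                q.get((nxt, a), 0.0) for a in ACTIONS)
            old = q.get((state, action), 0.0)
            q[(state, action)] = old + alpha * (
                reward + gamma * best_next - old)
            state = nxt
            if done:
                break
    return q


def greedy(q, state):
    """Greedy action with respect to the learned values."""
    return max(ACTIONS, key=lambda a: q.get((state, a), 0.0))


q = q_learn()
```

In this toy problem the greedy policy runs a0 first and then a1, ending in the absorbing state where p_1 = 1.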

Intrinsic Reward

Goal: build deep control knowledge

Reward controllable interaction with the world
• controllers with direct feedback from the external world.

[Figure: skill catalog (Stack, Insert, Grasp, Touch, Track) and predicate values X, -, 1, 0 marking a convergence event.]
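One way to read the intrinsic reward described here is as a signal emitted when a controller that receives feedback from the external world reaches convergence. The sketch below encodes that reading; the exact trigger (any transition from a non-1 predicate to 1), the reward magnitude, and the function name are assumptions, not details given in the slides.

```python
def intrinsic_reward(prev_p, next_p, has_external_feedback):
    """Return 1.0 on a convergence event of an externally referenced controller.

    prev_p and next_p are predicate values from {"X", "-", "0", "1"}.
    """
    convergence_event = prev_p != "1" and next_p == "1"
    if has_external_feedback and convergence_event:
        return 1.0
    return 0.0


# A visual tracking controller converging on a feature earns reward;
# a controller with no feedback from the external world does not.
r_track = intrinsic_reward("0", "1", has_external_feedback=True)
r_internal = intrinsic_reward("0", "1", has_external_feedback=False)
```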

Experimental Demonstration

Motor units
• Two 7-DOF Barrett WAMs
• Two 4-DOF Barrett Hands
• 2-DOF pan/tilt stereo head

Sensory feedback
• Visual
  • Hue
  • Saturation
  • Intensity
  • Texture
• Tactile
  • 6-axis finger-tip F/T sensors
• Proprioceptive

[Figure: Dexter]

STAGE 1: SaccadeTrack - 25 Learning Episodes

S_st = (p_saccade, p_track)

Catalog: Track-saturation

[Figure: policy diagram over the states (X, X), (X, -), (0, X), (1, X), (X, 0), (X, 1), using the actions a_saccade and a_track; the rewarding action is marked.]

STAGE 2: ReachGrab - 25 Learning Episodes

S_rg = (p_st, p_reach, p_grab)

Catalog: Touch, Track-saturation

[Figure: policy diagram; the rewarding action is marked.]
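The ReachGrab state includes p_st, so the previously learned SaccadeTrack program appears in the new program as a single predicate. The sketch below shows one possible way to express that composition. The class names, the choice of absorbing states, and the rule that maps a program's state to a predicate are all hypothetical; the slides only establish that a program's status enters the parent's state vector.

```python
class Controller:
    """Leaf action; its predicate is set by the (omitted) control loop."""

    def __init__(self, name):
        self.name = name
        self.p = "X"

    def predicate(self):
        return self.p


class Program:
    """A program over child predicates that can itself be used as a child."""

    def __init__(self, name, children, absorbing):
        self.name = name
        self.children = children
        self.absorbing = absorbing  # child-state tuples treated as converged

    def state(self):
        return tuple(c.predicate() for c in self.children)

    def predicate(self):
        s = self.state()
        if s in self.absorbing:
            return "1"              # program has reached an absorbing state
        if all(p == "X" for p in s):
            return "X"              # nothing has been run yet
        return "0"                  # program is still in progress


saccade, track = Controller("saccade"), Controller("track")
# Assumed absorbing state: tracking has converged.
st = Program("SaccadeTrack", [saccade, track], absorbing={("X", "1")})

reach, grab = Controller("reach"), Controller("grab")
rg = Program("ReachGrab", [st, reach, grab], absorbing={("1", "1", "1")})

before = rg.state()
track.p = "1"                       # the tracking controller converges
after = rg.state()
```

After the tracking controller converges, the first element of the ReachGrab state changes from X to 1 while the reach and grab predicates remain unknown.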

STAGE 3: VisualInspect - 25 Learning Episodes

S_vi = (p_rg, p_cond, p_track(blue))

Catalog: Touch, Track-saturation, Track-blue

[Figure: policy diagram; the rewarding action is marked.]

STAGE 4: Grasp – User Defined Reward

S_grasp = (p_rg, p_moment, p_force)

Catalog: Touch, Track-saturation, Track-blue, Grasp

[Figure: policy diagram over the states (X, X, X), (X, -, -), (1, X, X), (X, 0, 0), (X, 1, 0), (X, 0, 1), (X, 1, 1), using the ReachGrab program and the actions a_moment and a_force; the rewarding action is marked. Predicate values: X, -, 1, 0.]

STAGE 5: PickAndPlace – User Defined Reward

S_pnp = (p_g, p_transport, p_moment)

[Figure: policy diagram over the states (X, X, X), (X, -, -), (1, X, X), (X, 0, -), (X, 0, 0), (X, 1, 0), (X, 1, 1), using the Grasp program and the actions a_transport and a_moment; the rewarding action is marked. Predicate values: X, -, 1, 0.]

Conclusions

• Mechanisms for creating hierarchical programs.
  • Recursive formulation of potential functions and value functions.
  • Control-theoretic representation for action, state, and intrinsic reward.
• Experimental demonstration of programming manipulation skills using staged learning episodes.
• Intrinsic reward pushes out new behavior and models the affordances of objects.

Thank You