Doctoral School Robotics Program: Human-Robot Interaction
Autonomous Robots Class: Human-Robot Interaction
Social learning and skill acquisition via teaching and imitation

Aude G. Billard
Learning Algorithms and Systems Laboratory (LASA)
EPFL, Swiss Federal Institute of Technology, Lausanne, Switzerland

A.G. Billard, AR Class - EDIC/EDPR

Overview of the Class

9h15-12h00:
  Interfaces and interaction modalities
  Non-verbal cues and expressiveness in interactions: gesture, posture, social spaces and facial expressions
  User-centred design of social robots: humanoids, androids, etc.
  Motivations and emotions in robots
  Social intelligence for robots
  Social learning and skill acquisition via teaching and imitation

14h15-17h00:
  Robots in education, therapy and rehabilitation
  Evaluation methods and methodologies for HRI research
  Ethical issues in human-robot interaction research

WHY IS PROGRAMMING BY DEMONSTRATION BENEFICIAL FOR ROBOTS?
Why is learning beneficial?
  Assistance with routine tasks which cannot be fully automated
  User-friendly means of reprogramming the robot
  Means of sharing knowledge with a companion robot

Transmitting Human Skills and Knowledge to Robots
Why is it not that simple?

Transmitting Human Skills and Knowledge to Robots
Learning human skills by imitation includes learning:
  What to imitate?
  How to imitate?
  When to imitate?
  Who to imitate?

C. L. Nehaniv, K. Dautenhahn (2000): Of Hummingbirds and Helicopters: An Algebraic Framework for Interdisciplinary Studies of Imitation and Its Applications. In: J. Demiris and A. Birk, eds., Interdisciplinary Approaches to Robot Learning, World Scientific Series in Robotics and Intelligent Systems, Vol. 24.

The Transfer Problem
  [Figure: demonstrator and imitator]

What to imitate?
  Same object, same target location
  Same direction of motion
  Same speed, same force
  Same posture

How to imitate? The correspondence problem
  [Figure: demonstration versus imitation]
  No solutions (smaller range of motion)
  Find the closest solution according to a metric
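
As a minimal sketch of "find the closest solution according to a metric", assume the demonstrated posture is a vector of joint angles, the imitator has simple per-joint limits, and the metric is a per-joint weighted squared error. None of these choices come from the slides; the function name and the numbers are hypothetical. Under these assumptions the closest feasible posture is obtained by clipping each joint to its limits.

```python
import numpy as np

def closest_feasible_posture(demo_angles, lower, upper, weights=None):
    """Project demonstrated joint angles onto the imitator's joint limits."""
    demo_angles = np.asarray(demo_angles, dtype=float)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    if weights is None:
        weights = np.ones_like(demo_angles)
    weights = np.asarray(weights, dtype=float)
    # With box limits and a per-joint weighted squared-error metric, the
    # minimisation decouples per joint, so the optimum is the clipped angle.
    imitation = np.clip(demo_angles, lower, upper)
    cost = float(np.sum(weights * (imitation - demo_angles) ** 2))
    return imitation, cost

# Hypothetical demonstrator posture (rad) and imitator limits.
demo = [1.8, -0.4, 2.5]
imitation, cost = closest_feasible_posture(
    demo, lower=[-1.0, -1.0, -1.0], upper=[1.5, 1.0, 2.0]
)
print(imitation, cost)
```

A metric that couples joints (for example, one defined on the hand position in Cartesian space) would no longer decouple this way and would need a proper optimisation step.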

Calinon, S. and Billard, A. (2007) Incremental Learning of Gestures by Imitation in a Humanoid Robot. In Proceedings of the ACM/IEEE International Conference on Human-Robot Interaction (HRI).

Argall, B., Sauser, E. and Billard, A. (2010) Policy Adaptation Through Tactile Feedback. In Proceedings of the AISB Symposium 2010.

Gesture Recognition
  How are actions perceived? How is information parsed?

Imitation
  Level of granularity: what is copied? Should the robot copy the intention, the goal or the dynamics of the movement?

Motor Learning
  How is information transferred across multiple modalities (visuo-motor, auditory-motor)?

Imitation Learning: Following an imitation mechanism

While following the teacher, the learner robot learns to associate a word with a meaning in terms of sensory inputs.
  Billard et al., ESANN 1997; Billard & Dautenhahn, Robotics & Autonomous Systems, 1998; Billard & Hayes, 1999, 2000

Teaching a path in a maze:
  Demiris & Hayes, 1994, 1996
Teaching how to climb a hill:
  Dautenhahn, Robotics & Autonomous Systems, 1995
Teaching a path in the environment:
  Billard & Hayes, Adaptive Behavior, 1999
  Moga, Gaussier, Applied Artificial Intelligence, 2000
  Kaiser et al., Robotics & Autonomous Systems, 2002
  Nicolescu & Mataric, AGENTS 2003
Teaching a vocabulary:
  Billard 1997, 1998, 1999
  Vogt & Steels, ECAL, 1999

Imitation Learning: One-Shot Learning Methods
  Segmentation of the demonstration into primitives
  Classification of gestures into predefined states (e.g. grasp, collision)
  Built-in controller for producing sequences of states

  Kuniyoshi et al., IEEE Trans. on Robotics and Automation, 1994
  Dillmann et al., Robotics & Autonomous Systems, 2001
  Ritter et al., Rev. Neuroscience, 2003
  Aleotti et al., Robotics & Autonomous Systems, 2004
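
To illustrate the segmentation step, here is a small sketch that assumes each frame of the demonstration has already been classified into one of the predefined states. It collapses the per-frame labels into segments and then into the state sequence a built-in controller would replay. The state names other than "grasp" and the function name are hypothetical, and the cited systems use considerably richer models.

```python
from itertools import groupby

def segment_into_primitives(frame_states):
    """Collapse per-frame state labels into (state, first_frame, last_frame)."""
    segments = []
    index = 0
    for state, run in groupby(frame_states):
        length = len(list(run))
        segments.append((state, index, index + length - 1))
        index += length
    return segments

# Hypothetical per-frame classification of one demonstration.
frames = ["approach", "approach", "grasp", "grasp", "grasp", "transport", "release"]
segments = segment_into_primitives(frames)
# The sequence of states handed to the controller, one entry per primitive.
plan = [state for state, _, _ in segments]
print(segments)
print(plan)
```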

Robot Programming by Demonstration: One-Shot Learning Methods
  Sensors: data gloves, fixed cameras, speech processing
  Actuators: mobile robot, 7-DOF arm, two-finger gripper
  R. Dillmann, Robotics & Autonomous Systems 47:2-3, 2004

Imitation Learning: One-Shot Learning Methods
  Explicit teaching/learning:
    Reasoning about tasks
    Verbal instructions
  Gesture recognition: for each sensor, a context-dependent model based on background knowledge is provided (example: opening the refrigerator door, extracting the bottle and closing the door).
  Task reproduction: action sequences are stored in a tree-like structure of macro-operators.
  R. Dillmann, Robotics & Autonomous Systems 47:2-3, 2004

Imitation Learning: Robot Programming by Demonstration: Grasping
  Because of the large range of possible shapes, generalizing pre-programmed grasps to new and general objects is a rather hard task:
    Orientation of the hand
    Positioning of the fingers (correspondence problem!)
    Tactile forces, stable object contact
  Steil et al., Robotics & Autonomous Systems 47:2-3, 2004

Imitation Learning: Robot Programming by Demonstration: Grasping
  (i) a naïve imitation strategy, in which the observed joint angle trajectories (after their transformation into the three-finger geometry) were directly applied to control the fingers of the TUM hand during the grasp, until complete closure around the object;
  (ii) a strategy in which the visually observed hand posture is matched to the initial conditions of a power grip, a precision grip, a three-finger grip and a two-finger grip, respectively, in order to identify the grip type.
  Steil et al., Robotics & Autonomous Systems 47:2-3, 2004

Robot Programming by Demonstration
  Other related works are, e.g.:
    Kuniyoshi et al., ICRA, 1994
    Aleotti et al., Robotics & Autonomous Systems 47:2-3, 2004
    Zhang & Roessler, Robotics & Autonomous Systems 47:2-3, 2004

Imitation Learning: Learning motion through Dynamical Systems
  Locally weighted learning
  Learning primitives of the system
  Ijspeert, Nakanishi, Schaal, ICRA01, NIPS02

  Goals: use dynamical systems with well-defined attractor properties to
    1) learn by imitation,
    2) be able to modulate the learned trajectory when perceptual variables are varied and/or perturbations occur.
  Discrete movements: single point attractor
  Rhythmic movements: limit cycle attractor
  [Figure: phase plots (y, dy/dt) of the two attractor types]

  Principle:
    Use a basic linear dynamical system with a single attractor (stable linear system)
    Modulate this system with a pdf estimate of the dynamics of the movement

  The learned trajectory is not sufficient to control the actual robot's walking pattern. Phase resetting using foot contact information is necessary. On-line adjustment of the phase of the central pattern generator (CPG) by sensory feedback from the environment is essential to achieve successful locomotion.
  Nakanishi et al., Robotics & Autonomous Systems 47:2-3, 79-91, 2004

  Drawback: the stable DS takes over if there is a long delay in the reproduction of the movement, leading the movement to depart from the original trajectory.
    Needs a heuristic to rescale the time parameter of the DS
    Needs a time-independent encoding
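
The principle of a stable linear system modulated by an extra term can be sketched in a few lines. This is an illustration under stated assumptions, not the formulation of the cited papers: the linear part is a critically damped spring-damper pulling y towards the goal, and the modulation is a hand-set forcing function of a decaying phase variable s rather than a learned one. The names rollout, alpha, beta, tau and forcing are all hypothetical.

```python
import numpy as np

def rollout(goal, y0, forcing, alpha=25.0, tau=1.0, dt=0.001, duration=3.0):
    """Euler-integrate tau*dz = alpha*(beta*(goal - y) - z) + f, tau*dy = z."""
    beta = alpha / 4.0  # makes the linear part critically damped
    y, z, s = y0, 0.0, 1.0
    path = [y]
    for _ in range(int(duration / dt)):
        f = forcing(s) * s            # modulation fades as the phase s decays
        dz = (alpha * (beta * (goal - y) - z) + f) / tau
        dy = z / tau
        ds = -2.0 * s / tau           # phase variable, decays from 1 towards 0
        z += dz * dt
        y += dy * dt
        s += ds * dt
        path.append(y)
    return np.array(path)

# Unmodulated system: a plain point attractor at the goal.
plain = rollout(goal=1.0, y0=0.0, forcing=lambda s: 0.0)
# Same attractor with a hypothetical forcing term that reshapes the path.
shaped = rollout(goal=1.0, y0=0.0, forcing=lambda s: 200.0 * np.sin(6.0 * s))
print(plain[-1], shaped[-1])
```

Both rollouts end near the goal, but only the shaped one follows a non-trivial path. Because the forcing term fades with the phase, which itself advances with time, a long pause during reproduction leaves only the linear attractor active; this is the time-dependency drawback noted above.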

Imitation Learning: Learning of Dynamical Systems through RL
  Remove the time-dependency.
  Learn estimates of autonomous dynamical systems, i.e. assume that all the demonstrations are instances of a global dynamics of robot motion.
  [Video]
  One must ensure that the system is asymptotically stable at the target.
  M. Khansari and A. Billard, BM: An Iterative Algorithm to Learn Stable Non-Linear Dynamical Systems with Gaussian Mixture Models, Proceedings of the IEEE Int. Conf. on Robotics and Automation (ICRA), 2010
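
For intuition only, the stability requirement can be checked directly in the simplest autonomous case, a linear system dx/dt = A (x - target). The cited work deals with non-linear systems built from Gaussian mixtures and is not reproduced here; the matrix A, the target and the function names below are made-up illustrations.

```python
import numpy as np

def is_asymptotically_stable(A):
    """A linear DS dx/dt = A (x - target) is asymptotically stable at the
    target iff every eigenvalue of A has a strictly negative real part."""
    return bool(np.all(np.linalg.eigvals(A).real < 0.0))

def reproduce(A, target, x0, dt=0.01, steps=2000):
    """Euler rollout; the velocity depends on the state only, not on time."""
    x = np.array(x0, dtype=float)
    for _ in range(steps):
        x = x + dt * (A @ (x - target))
    return x

A = np.array([[-1.0, 2.0], [-2.0, -1.0]])   # hypothetical dynamics (stable spiral)
target = np.array([0.5, 0.2])
# Two different starting points, never "demonstrated", reach the same target.
end_a = reproduce(A, target, x0=[2.0, -1.0])
end_b = reproduce(A, target, x0=[-3.0, 4.0])
print(is_asymptotically_stable(A), end_a, end_b)
```

Since the system has no explicit time variable, a pause or a perturbation simply moves the state, and the motion resumes from wherever the robot currently is.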

Imitation Learning: Learning of the Metric
  Learning task constraints as constraints continuous in space and time
  Estimating the joint probability density function of the signals

  The robot should learn that the important feature in this task is that the queen should be moved 2 steps forward vertically.
  Once the robot has learned the rule of motion for the queen, it can apply this rule for moving the queen from locations not seen during the demonstrations.

  S. Calinon, F. Guenter and A. Billard, On Learning, Representing and Generalizing a Task in a Humanoid Robot, IEEE Trans. on SMC, Part B, 2007

Imitation Learning: Learning of the Metric
  Encoding a task using Gaussian Mixture Regression
    Probability density function: [equation]
    Gaussian function: [equation]
  Extraction of the constraints
    Expected output: [equation]

  S. Calinon, F. Guenter and A. Billard, On Learning, Representing and Generalizing a Task in a Humanoid Robot, IEEE Trans. on SMC, Part B, 2007
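
As a hedged illustration of Gaussian Mixture Regression, the sketch below uses the textbook form for a two-dimensional mixture over an input t and an output x: each component contributes its conditional mean of x given t, weighted by how responsible that component is for t. The notation, the mixture parameters and the function names are assumptions and are not taken from the slides. The returned variance follows the law of total variance; a small value marks a part of the task where the demonstrations agree.

```python
import numpy as np

def gaussian_pdf(x, mean, var):
    """Univariate Gaussian density N(x; mean, var)."""
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)

def gmr(t, priors, means, covs):
    """Expected output E[x | t] and Var[x | t] for a 2-D GMM over (t, x).

    priors: (K,), means: (K, 2), covs: (K, 2, 2); index 0 is t, index 1 is x.
    """
    # Responsibility of each component for the query input t.
    h = priors * gaussian_pdf(t, means[:, 0], covs[:, 0, 0])
    h = h / h.sum()
    # Per-component conditional mean and variance of x given t.
    gain = covs[:, 1, 0] / covs[:, 0, 0]
    cond_mean = means[:, 1] + gain * (t - means[:, 0])
    cond_var = covs[:, 1, 1] - gain * covs[:, 0, 1]
    mean = np.sum(h * cond_mean)
    var = np.sum(h * (cond_var + cond_mean ** 2)) - mean ** 2
    return mean, var

# Hypothetical two-component model of a demonstrated signal x(t).
priors = np.array([0.5, 0.5])
means = np.array([[0.0, 0.0], [10.0, 5.0]])
covs = np.array([[[1.0, 0.5], [0.5, 1.0]], [[1.0, 0.5], [0.5, 1.0]]])
mean_start, var_start = gmr(0.0, priors, means, covs)
mean_end, var_end = gmr(10.0, priors, means, covs)
print(mean_start, var_start, mean_end, var_end)
```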

INCREMENTAL LEARNING
  [Movie]
  6 demonstrations of moving the white Knight to catch the black King.
  [Figure: trajectories of the hand with respect to the first and second object. Left: superimposed trajectories of each of the 6 demonstrations. Middle: Gaussian Mixture Model. Right: Gaussian Mixture Regression.]

Imitation Learning: Reinforcement Learning Methods
  Learning the optimal controller
  Model of the physical system (pendulum)
  Reinforcement and locally weighted learning
  Atkeson & Schaal, ICML, 1997

Imitation Learning: Learning DS through Inverse RL
  [Figure: intended trajectory, expert demonstrations, time indices t]
  If the time indices t are unknown, inference is hard.
  If t is known, we have a standard HMM.
  Make an initial guess for t, then alternate between:
    Fix t and run EM on the resulting HMM.
    Choose a new t using dynamic programming.
  Pieter Abbeel and Andrew Y. Ng. Apprenticeship Learning via Inverse Reinforcement Learning. In Proceedings of ICML, 2004.
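
The step "choose a new t using dynamic programming" can be illustrated with dynamic time warping, a standard dynamic-programming alignment. This is a swapped-in stand-in and not the cited algorithm: there is no HMM and no EM step, the signals are one-dimensional, and the function name and data are hypothetical. It only shows how time indices into a reference trajectory can be chosen for each sample of a demonstration.

```python
import numpy as np

def align_time_indices(demo, reference):
    """Map each demo sample to a reference index with dynamic time warping."""
    n, m = len(demo), len(reference)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (demo[i - 1] - reference[j - 1]) ** 2
            cost[i, j] = d + min(cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1])
    # Backtrack from the end to recover a monotone alignment path.
    i, j = n, m
    indices = [0] * n
    while i > 0 and j > 0:
        indices[i - 1] = j - 1
        step = int(np.argmin([cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1]]))
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    return indices

# Hypothetical data: the demonstration lingers at two points of the reference.
reference = [0.0, 1.0, 2.0, 3.0, 4.0]
demo = [0.0, 0.0, 1.0, 2.0, 2.0, 3.0, 4.0]
indices = align_time_indices(demo, reference)
print(indices)
```

In an alternating scheme of the kind described above, an alignment step like this would be interleaved with re-estimating the reference trajectory from the aligned demonstrations.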

Learning of Dynamical Systems through RL
  Quadratic programming problem (QP): quadratic objective, linear constraints.
  Constraint generation for path constraints.
  Give a local linear model of the dynamics of the helicopter.
  Pieter Abbeel and Andrew Y. Ng. Apprenticeship Learning via Inverse Reinforcement Learning. In Proceedings of ICML, 2004.

Imitation Learning: Reinforcement Learning Methods
  No model of the physics
  Use a statistical estimate of the motion through a mixture of Gaussians, and use the estimated variance as bounds for the search via RL.

  Episodic Natural Actor Critic (Peters et al., 2005) is applied to learn a new trajectory, so as to overcome the obstacle.
  Gaussian stochastic policy: [equation]
  Reward function: [equation] with a shape constraint term and a reaching constraint term
  [Figure: evolution of the reward during the learning phase; reproduction after the learning phase]

  F. Guenter, M. Hersch, S. Calinon and A. Billard. Reinforcement Learning for Imitating Constrained Reaching Movements. RSJ Advanced Robotics (2007)
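
To make the two ingredients concrete, here is a rough sketch of a reward with a shape term and a reaching term, together with a Gaussian exploration policy around the demonstrated trajectory. The exact formulas of the cited paper are not reproduced; the quadratic form, the weights w_shape and w_reach, the fixed standard deviation and all names are assumptions chosen for illustration, and no actor-critic update is shown.

```python
import numpy as np

def reward(trajectory, demonstrated, target, w_shape=1.0, w_reach=10.0):
    """Hypothetical reward: penalise shape deviation and final reaching error."""
    shape_cost = np.sum((trajectory - demonstrated) ** 2)
    reach_cost = np.sum((trajectory[-1] - target) ** 2)
    return -(w_shape * shape_cost + w_reach * reach_cost)

def sample_policy(mean_trajectory, std, rng):
    """Gaussian stochastic policy: perturb the mean trajectory point-wise."""
    return mean_trajectory + rng.normal(0.0, std, size=mean_trajectory.shape)

rng = np.random.default_rng(0)
# Hypothetical demonstrated 2-D reaching movement of 20 samples.
demonstrated = np.linspace([0.0, 0.0], [1.0, 1.0], 20)
target = np.array([1.0, 1.0])
# Stands in for the variance estimated by the mixture of Gaussians.
std = np.full(demonstrated.shape, 0.05)
candidate = sample_policy(demonstrated, std, rng)
print(reward(demonstrated, demonstrated, target), reward(candidate, demonstrated, target))
```

Bounding the exploration noise by the demonstrated variance keeps the search close to the parts of the movement that the demonstrations agree on.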

SUMMARY
  Learning new tasks relies on various means of teaching the robots.
  Imitation learning is useful insofar as it gives hints as to the optimal solution.
  The robot must, however, rely on generic skills of its own to adapt the demonstration to its own body and to the context.
  Learning of complex skills is overall relatively slow and must proceed incrementally.