
Learning Parametric Inverse Dynamics Models from Multiple Conditions for Fast Adaptive Computed Torque Control

Yasuhito Horiguchi, Takamitsu Matsubara* and Masatsugu Kidode

Graduate School of Information Science, Nara Institute of Science and Technology, Japan

*[email protected]

Abstract: In this paper, we propose a novel approach for learning an inverse dynamics model of a serial-link robot from data alone, so that it is suitable for computed torque control under unknown conditions, i.e., adaptive computed torque control. In our approach, we first collect a varied set of data from the robot under multiple conditions, each of which is constructed by putting loads on the body of the robot or giving the robot tools and bags to imitate real environmental situations. Then, a subspace representation that contains the various inverse dynamics models is extracted from the data as the Parametric Inverse Dynamics Model (PIDM), which is composed of basis functions and weight coefficients. Using the PIDM, fast adaptive computed torque control under an unknown condition can be performed efficiently by automatically adjusting the weight coefficients of the basis functions, rather than by solving a high-dimensional learning problem as in previous studies. To validate our approach, we applied the proposed method to adaptive computed torque control on trajectory-tracking tasks with a 7-DoF anthropomorphic manipulator and demonstrated its effectiveness. Our approach achieved fast adaptation of computed torque control for the manipulator (within two seconds) even under unknown conditions such as holding and mounting unknown objects.

I. INTRODUCTION

Serial-link robots that operate in our daily living environment have become crucial for applications such as human care, assistance, and rehabilitation. For safe physical interaction with humans, such applications require controllers that offer not only accuracy but also compliance, rather than conventional high-gain PID controllers. Computed torque control (e.g., [1]–[4]) has been shown to be an effective approach for precise and compliant control, even for fast movements. The method requires a precise model of the robot dynamics as a function $f : \mathbf{q}, \dot{\mathbf{q}}, \ddot{\mathbf{q}} \mapsto \mathbf{u}$, where $\mathbf{q}, \dot{\mathbf{q}}, \ddot{\mathbf{q}}$ are the angles, velocities, and accelerations of all the joints, and $\mathbf{u}$ is the vector of torques at all the joints. In this paper, the function $f(\cdot)$ is called the Inverse Dynamics Model (IDM).

A typical approach for building the IDM $f(\cdot)$ is based on rigid-body dynamics. Such a model is derived analytically from the Newton-Euler equations or Lagrange's equations (e.g., [3], [4]). Although it requires knowledge of the inertial parameters, such as the mass, moment of inertia, and center of mass of every link, in addition to the kinematic information of the robot, several techniques can estimate these parameters from a reasonable amount of data obtained from actual robot movements, as in [1]. However, as presented in [5]–[7], a rigid-body model often cannot represent the complex robot dynamics precisely, especially because of nonlinear actuator dynamics, complex friction, and the flexibility of cables or tubes, none of which are modeled in rigid-body dynamics. Such modeling errors may significantly degrade computed torque control, so that precise and compliant control is no longer achieved.

To obtain a more accurate model of the robot dynamics, over the last decade several researchers have applied statistical machine learning to this problem, treating it as a nonparametric nonlinear regression problem without using a rigid-body model [5]–[7]. This approach has proved effective for representing complex robot dynamics accurately. Although learning requires a sufficient quantity of training data, once learning is complete the obtained model can be used for computed torque control in real time, as long as the dynamics of the robot do not change.

However, in a real daily living environment, the robot dynamics may change because of tasks such as holding and mounting objects (as in Fig. 1). In this paper, we refer to each such situation as a condition. The robot often meets multiple conditions while performing tasks in a real environment. In such cases, since the dynamics of the robot change, computed torque control can no longer provide precise and compliant control because of the modeling errors in the IDM. On-line model learning [5], [8], which modifies the IDM using data obtained sequentially during movement, might be applicable; however, it is impractical in these cases because it needs a large amount of data to complete the adaptation, owing to the high-dimensional learning parameters associated with the nonlinear dynamics of the robot.

In this paper, we propose a novel approach for learning the IDM of a serial-link robot from data alone, so that it is suitable for computed torque control under unknown conditions, i.e., adaptive computed torque control. In our approach, unlike most previous studies, we first collect a varied set of data from the robot under multiple conditions, each of which is constructed by putting loads on the body of the robot or giving the robot tools and bags to imitate real environmental situations. Then, a subspace representation that contains the various inverse dynamics models is extracted from the data as the Parametric Inverse Dynamics Model (PIDM), which is composed of basis functions and weight coefficients. Using the PIDM, fast adaptive computed torque control under an unknown condition can be achieved efficiently by automatically adjusting the weight coefficients of the basis functions, rather than by solving a high-dimensional learning problem as in the previous studies [5]–[7].

Fig. 1. Examples of several conditions in a real environment.

A data set captured from the robot under multiple conditions was also used in [9], [10] for learning IDMs. However, in these approaches, the adaptation procedure for unknown conditions is formulated as a nonlinear optimization problem with a large computational cost due to the complex model, and it cannot be applied to real-time adaptive control. The modular-structured approach [11] prepares multiple inverse dynamics models to cope with a wide range of variations and uncertainties in the real environment, with each module covering one or a few conditions. It is a general framework; however, how to prepare a proper set of modules is still an open issue. Our approach can be interpreted as intermediate between these two methods. It provides a practical solution to the problem of adaptive computed torque control in terms of fast adaptation and small computational effort. From a data set that contains certain variations and uncertainties of the environment, a practical representation of the inverse dynamics model for adaptive computed torque control is extracted through a reasonable learning procedure. The main contributions of this paper are a novel model and its learning procedure, and a demonstration of their effectiveness through application to adaptive computed torque control with a real 7-DoF anthropomorphic manipulator.

Section II presents our proposed method for achieving fast adaptive computed torque control. Section III describes a learning procedure for a PIDM from data collected under several conditions. Section IV demonstrates the effectiveness of our method on a trajectory-tracking problem with an anthropomorphic manipulator in a real environment. Section V concludes the paper.

II. LEARNING A PIDM FROM MULTIPLE CONDITIONS FOR FAST ADAPTIVE COMPUTED TORQUE CONTROL

This section describes our proposed method for fast adaptive computed torque control. In Section II-A, we first introduce the IDM for computed torque control as a preliminary. In Section II-B, we propose a novel form of the IDM, called the Parametric Inverse Dynamics Model (PIDM), that efficiently represents multiple IDMs of the robot under several conditions. Section II-C presents a novel scheme for adaptive computed torque control with the PIDM.

A. IDMs for Computed Torque Control

The dynamics of an $N$-DoF robotic manipulator attached to a base can be modeled by

$$\mathbf{u} = f(\mathbf{q}, \dot{\mathbf{q}}, \ddot{\mathbf{q}}) \tag{1}$$

where $\mathbf{q}, \dot{\mathbf{q}}, \ddot{\mathbf{q}} \in \mathbb{R}^N$ are the joint angles, velocities, and accelerations, and $\mathbf{u} \in \mathbb{R}^N$ denotes the input torque. In general, the model $f(\cdot)$ is a nonlinear function that includes the rigid-body dynamics as well as the effects of hydraulic tubes, cable drives, complex friction, and nonlinear actuator dynamics (see, e.g., [6]).

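Computed torque control [1], [2], [6] for a robotic manipulator at the state $\mathbf{q}, \dot{\mathbf{q}}$ that is to track a nominal trajectory $\{\mathbf{q}_d, \dot{\mathbf{q}}_d, \ddot{\mathbf{q}}_d\}$ calculates the required torque with the IDM by

$$\mathbf{u} = f(\mathbf{q}, \dot{\mathbf{q}}, \ddot{\mathbf{q}}_{ref}) \tag{2}$$

where $\ddot{\mathbf{q}}_{ref} = \ddot{\mathbf{q}}_d + \mathbf{K}_p(\mathbf{q}_d - \mathbf{q}) + \mathbf{K}_v(\dot{\mathbf{q}}_d - \dot{\mathbf{q}})$ is the reference acceleration, and $\mathbf{K}_p$ and $\mathbf{K}_v$ are the feedback gains for position and velocity, respectively.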
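As a minimal sketch of the control law in Eq. (2), the following Python snippet computes the reference acceleration and passes it to an IDM. The function name `computed_torque`, the stand-in model `toy_idm` (unit inertia with viscous friction), and all numerical values are assumptions made for illustration only; they are not the model or the gains used in this paper.

```python
import numpy as np

def computed_torque(idm, q, dq, q_des, dq_des, ddq_des, Kp, Kv):
    """Return the torque command u = f(q, dq, ddq_ref) of Eq. (2)."""
    # Reference acceleration: feedforward term plus PD feedback on the tracking error.
    ddq_ref = ddq_des + Kp @ (q_des - q) + Kv @ (dq_des - dq)
    return idm(q, dq, ddq_ref)

def toy_idm(q, dq, ddq):
    """Hypothetical stand-in IDM: unit inertia, viscous friction, no gravity."""
    return ddq + 0.1 * dq

N = 2
Kp = np.diag([100.0, 100.0])   # illustrative position gains
Kv = np.diag([20.0, 20.0])     # illustrative velocity gains
q, dq = np.zeros(N), np.zeros(N)
q_des = np.array([0.1, -0.1])
dq_des, ddq_des = np.zeros(N), np.zeros(N)

u = computed_torque(toy_idm, q, dq, q_des, dq_des, ddq_des, Kp, Kv)
print(u)
```

Because the IDM enters only through the `idm` argument, the same routine can in principle be reused with any learned model, which is the property that the adaptive scheme of Section II-C relies on.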

However, it would be difficult to use this method for robots in our daily living environment, because the dynamics of the robot often change with tasks and conditions such as holding and mounting objects. On-line model learning [8] has been explored as a way to track the changes in the dynamics directly; however, it requires a large amount of data collected over a long motion-execution time because of the high dimensionality of the learning parameters in the model $f(\cdot)$, especially for robots with a large number of DoFs (e.g., $N = 7$).

B. Parametric Inverse Dynamics Model (PIDM)

Although the IDM changes greatly with tasks and conditions such as holding and mounting objects, several properties are preserved, such as the lengths of all the links except the end link and the moments of inertia of most links. This suggests that if we have multiple IDMs of the robot corresponding to various conditions, they could be used to efficiently estimate the IDM of the robot under an unknown condition. That is, we can assume that there is a set of common factors among the IDMs under multiple conditions and that each model can be represented by a combination of these common factors.

Fig. 2. A schematic diagram of fast adaptive control with the PIDM. In the PIDM, the inverse dynamics model under condition $i$ can be represented by a model with the corresponding weight coefficient vector $\mathbf{w}_i$. Thus, if the condition of the robot changes, the resulting change in the dynamics of the robot can be represented by a change in the weight coefficient vector $\mathbf{w}$.

With this in mind, we assume a model that can represent multiple inverse dynamics models of the robot under several conditions with only a small number of parameters, as follows:

$$\mathbf{u} = f^p(\mathbf{q}, \dot{\mathbf{q}}, \ddot{\mathbf{q}}; \mathbf{w}) = \sum_{j=1}^{J} w_j f^e_j(\mathbf{q}, \dot{\mathbf{q}}, \ddot{\mathbf{q}}) \tag{3}$$

where $\mathbf{f}^e = [f^e_1, \cdots, f^e_J]^T$ ($f^e_j : \mathbf{q}, \dot{\mathbf{q}}, \ddot{\mathbf{q}} \mapsto \mathbf{u}$) is referred to as the Eigen Inverse Dynamics Models (EIDMs), $\mathbf{w} = [w_1, \cdots, w_J]^T$ is the weight coefficient vector, and $J$ is its dimension. The change in the dynamics of the robot caused by a change of condition is captured by a change in the weight coefficient vector $\mathbf{w}$. Since the EIDMs $\mathbf{f}^e$ are unknown, in the next section we propose a novel procedure for learning them from a data set captured from the robot under multiple conditions.
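The weighted combination in Eq. (3) can be sketched in a few lines of Python. The three basis functions below are hypothetical placeholders chosen only to make the example runnable; in the proposed method the EIDMs are learned from data as described in Section III.

```python
import numpy as np

def pidm(eidms, w, q, dq, ddq):
    """Evaluate u = sum_j w_j * f_j^e(q, dq, ddq) as in Eq. (3)."""
    # Stack the J basis outputs into a (J, N) array, then mix them with w.
    basis_out = np.stack([f(q, dq, ddq) for f in eidms])
    return w @ basis_out

# Hypothetical toy basis functions (J = 3), not the learned EIDMs of the paper.
eidms = [
    lambda q, dq, ddq: ddq,        # inertia-like term
    lambda q, dq, ddq: dq,         # friction-like term
    lambda q, dq, ddq: np.sin(q),  # gravity-like term
]

w = np.array([2.0, 0.5, 1.0])
q = np.array([0.0, np.pi / 2])
dq = np.array([1.0, 2.0])
ddq = np.array([0.5, 0.0])

u = pidm(eidms, w, q, dq, ddq)
print(u)
```

Only the $J$ entries of `w` change between conditions in this representation; the basis functions stay fixed, which is what keeps the adaptation problem low-dimensional.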

C. Fast Adaptive Computed Torque Control with PIDM

Figure 2 depicts a schematic diagram of the proposed method for fast adaptive computed torque control with the PIDM. Assuming that a proper PIDM is available, precise and compliant adaptive control under an unknown condition can be achieved simply by adjusting the weight coefficient vector $\mathbf{w}$ to fit the current condition, rather than by estimating a large number of model parameters in the rigid-body dynamics [2]–[4] or re-training kernel machines [5], [8].

III. LEARNING PROCEDURE FOR PIDM FROM DATA COLLECTED UNDER SEVERAL CONDITIONS

In this section, we present a learning procedure for the PIDM from data. Our approach requires a varied set of training data $\mathcal{D} = \{\mathcal{D}_1, \cdots, \mathcal{D}_M\}$, where each $\mathcal{D}_i = \{\mathbf{X}_i, \mathbf{U}_i\}$ is obtained from the robot under condition $i$. Here $\mathbf{X}_i = [\mathbf{x}_{i,1}, \cdots, \mathbf{x}_{i,L_i}]$, $\mathbf{x}_{i,l} = [\mathbf{q}_l^T, \dot{\mathbf{q}}_l^T, \ddot{\mathbf{q}}_l^T]^T$, $\mathbf{U}_i = [\mathbf{u}_{i,1}, \cdots, \mathbf{u}_{i,L_i}] \in \mathbb{R}^{N \times L_i}$, $L_i$ is the number of data points under condition $i$, and $M$ is the number of conditions. Each condition can be constructed artificially, for example, by putting weights on links or giving an object to the hand, to obtain the training data $\mathcal{D}$ (as shown in Fig. 1). In the following, we assume that the data set $\mathcal{D}$ is available.

A. The Objective Function for Learning PIDM

To derive a learning procedure for the PIDM $f^p(\cdot)$ from data, we first need to quantify the difference between two IDMs. Since the IDM is a function $f : \mathbf{x} \mapsto \mathbf{u}$ with input $\mathbf{x}$ and output $\mathbf{u}$, we define the difference between two models $f_1(\cdot)$ and $f_2(\cdot)$ as the mean squared Euclidean distance between their outputs for the same inputs:

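$$E(f_1, f_2) = \frac{1}{C} \sum_{c=1}^{C} \| f_1(\mathbf{x}_c) - f_2(\mathbf{x}_c) \|^2 \tag{4}$$

$$= \frac{1}{C} \sum_{c=1}^{C} \| \mathbf{u}_{1,c} - \mathbf{u}_{2,c} \|^2 \tag{5}$$

where $C$ is the number of data points and $\mathbf{u}_{1,c} = f_1(\mathbf{x}_c)$ is the output of the model $f_1$ for the input $\mathbf{x}_c$ (and likewise $\mathbf{u}_{2,c} = f_2(\mathbf{x}_c)$).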
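The difference measure of Eqs. (4) and (5) is straightforward to compute once two models can be evaluated on a shared set of inputs. The sketch below assumes the squared-norm reading of the equations; the helper name `model_difference` and the two toy models are hypothetical.

```python
import numpy as np

def model_difference(f1, f2, X):
    """Mean squared output difference of two IDMs over shared inputs X.

    X has shape (C, 3N): one row per input x_c = [q, dq, ddq].
    """
    U1 = np.array([f1(x) for x in X])   # outputs u_{1,c}, shape (C, N)
    U2 = np.array([f2(x) for x in X])   # outputs u_{2,c}, shape (C, N)
    return float(np.mean(np.sum((U1 - U2) ** 2, axis=1)))

# Two toy models with N = 2 whose outputs differ by a constant offset of 1 per joint.
f1 = lambda x: x[:2]
f2 = lambda x: x[:2] + 1.0

X = np.random.default_rng(0).normal(size=(50, 6))
E = model_difference(f1, f2, X)
print(E)
```

The measure is only meaningful when both models are evaluated at the same inputs, which is why the learning procedure of Section III-B begins with a data alignment step.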

Since the PIDM is a model that can represent several IDMs, the difference $E(f^p, f_i)$ between the PIDM $f^p(\cdot)$ and any model $f_i(\cdot)$ should be sufficiently small. We define the objective function of the learning procedure for a PIDM with data from a robot under $M$ conditions as:

$$E_{all} = \frac{1}{MC} \sum_{m=1}^{M} \sum_{c=1}^{C} \| \mathbf{u}_{m,c} - f^p(\mathbf{x}_c; \mathbf{w}_m) \|^2. \tag{6}$$

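The learning procedure finds the EIDMs $\mathbf{f}^e(\mathbf{x})$ in $f^p(\mathbf{x}; \mathbf{w})$ and the corresponding weight coefficients for all $M$ conditions, $\mathbf{w}_{1:M} = \{\mathbf{w}_1, \cdots, \mathbf{w}_M\}$, that minimize the objective function $E_{all}$:

$$\{\mathbf{w}^*_{1:M}, \mathbf{f}^e(\mathbf{x})^*\} \leftarrow \arg\min_{\mathbf{w}_{1:M}, \mathbf{f}^e(\mathbf{x})} E_{all}. \tag{7}$$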

B. Learning Procedures

We present a learning procedure for the PIDM from data, which consists of three steps: (1) data alignment, (2) extraction of the EIDMs, and (3) learning the PIDM. Step (1), data alignment, finds a subset $\mathcal{D}_c$ of $\mathcal{D}$ that contains data for each condition such that all conditions share the same inputs. Step (2), extraction of the EIDMs, extracts the targets of all EIDMs from $\mathcal{D}_c$ as the basis target matrix $\mathbf{F}_{basis}$; the EIDMs $\mathbf{f}^e$ are then learned as smooth functions using a nonlinear regression technique. Finally, step (3), learning the PIDM, forms the PIDM $f^p(\cdot)$ using $\mathcal{D}_c$ and $\mathbf{F}_{basis}$.

1) Data Alignment: We assume a varied set of training data $\mathcal{D} = \{\mathcal{D}_1, \cdots, \mathcal{D}_M\}$ obtained from a robot under $M$ conditions. As pre-processing for the subsequent steps, the data alignment procedure generates the aligned torque matrix $\mathbf{U}_{all} \in \mathbb{R}^{M \times NC}$ from $\mathcal{D}$, where $N$ is the number of joints and $C$ is the number of contents. A content is an input $\mathbf{x}$ that is common to all conditions (i.e., included in $\mathcal{D}_m$ for every $m$). All the contents are collected in the content matrix $\mathbf{X}_c \in \mathbb{R}^{3N \times C}$, which is also generated by the alignment procedure. Thus, the $(i, j)$ element $\mathbf{U}_{all}(i, j)$ is the torque generated by the robot under the $i$-th condition for the $\lceil j/N \rceil$-th content at joint number $((j - 1) \bmod N) + 1$.

Fig. 3. Test cases for experiments: (a) hold a bag by a hand; (b) mount a holder on a link.

Note that it is impossible to obtain such a data set directly, because we cannot know the torques required to track the same trajectory $[\mathbf{q}, \dot{\mathbf{q}}, \ddot{\mathbf{q}}]$ under all conditions unless we have exact IDMs for all conditions. Our procedure avoids this difficulty: it first generates data for all conditions independently to obtain $\mathcal{D}$, then executes an alignment procedure in the input space to approximately find $\mathcal{D}_c = \{\mathbf{X}_c, \mathbf{U}_{all}\}$ from $\mathcal{D}$. In the alignment procedure, we use the k-d tree method [12] to search efficiently for the torque candidate vector.
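One possible reading of the alignment step is sketched below. It is an assumption-laden illustration rather than the exact procedure: the inputs of the first condition serve as candidate contents, each candidate is matched to its nearest input in every other condition, and candidates without a sufficiently close match everywhere are dropped. A brute-force nearest-neighbour search replaces the k-d tree of [12] to keep the example dependency-free, and the function name `align` and the threshold `max_dist` are hypothetical.

```python
import numpy as np

def align(datasets, max_dist):
    """Approximate alignment in the input space.

    datasets: list of (X_m, U_m), X_m of shape (L_m, 3N), U_m of shape (L_m, N).
    Returns X_c of shape (C, 3N) (the transpose of the content matrix in the text)
    and U_all of shape (M, N*C).
    """
    X_ref = datasets[0][0]                      # candidate contents
    keep = np.ones(len(X_ref), dtype=bool)
    matched = []
    for X_m, U_m in datasets:
        # Pairwise distances between candidates and this condition's inputs.
        d = np.linalg.norm(X_ref[:, None, :] - X_m[None, :, :], axis=2)
        nn = d.argmin(axis=1)                   # nearest input per candidate
        keep &= d[np.arange(len(X_ref)), nn] <= max_dist
        matched.append(U_m[nn])                 # torque of the matched sample
    X_c = X_ref[keep]
    # Row m lists the N joint torques of content 1, then those of content 2, ...
    U_all = np.stack([U[keep].reshape(-1) for U in matched])
    return X_c, U_all

# Tiny example with N = 1 joint and M = 2 conditions.
X1 = np.array([[0.0, 0.0, 0.0], [1.0, 1.0, 1.0], [5.0, 5.0, 5.0]])
U1 = np.array([[1.0], [2.0], [3.0]])
X2 = np.array([[1.05, 1.0, 1.0], [0.0, 0.02, 0.0]])
U2 = np.array([[20.0], [10.0]])

X_c, U_all = align([(X1, U1), (X2, U2)], max_dist=0.1)
print(X_c)
print(U_all)
```

The brute-force search costs memory and time proportional to the product of the data set sizes; a k-d tree, as used in this paper, makes each query much cheaper for data sets of realistic size.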

2) Extraction of the EIDMs: The targets for learning the EIDMs $\mathbf{f}^e$ can be extracted from $\mathbf{U}_{all}$ by a matrix factorization based on the Singular Value Decomposition (SVD). The SVD of $\mathbf{U}_{all}$ leads to the factorial representation $\mathbf{U}_{all} = \mathbf{Y} \mathbf{\Sigma} \mathbf{V}^T \approx \mathbf{W} \mathbf{F}_{basis}$. We define the linear coefficient matrix $\mathbf{W} = [\mathbf{w}_1^T \cdots \mathbf{w}_M^T]^T \in \mathbb{R}^{M \times J}$ to be the first $J (\leq M)$ columns of $\mathbf{Y}$, and the basis target matrix $\mathbf{F}_{basis} = [{\mathbf{f}^1_{basis}}^T \cdots {\mathbf{f}^J_{basis}}^T]^T \in \mathbb{R}^{J \times NC}$ to be the first $J$ rows of $\mathbf{\Sigma} \mathbf{V}^T$. The dimension $J$ can be determined from the singular value spectrum, and it satisfies $J \ll M$ if there is sufficient correlation among the IDMs for all conditions. This procedure can yield a compact and effective representation of the PIDM.
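The factorization above can be sketched with NumPy as follows. On aligned data, Eq. (6) equals $\|\mathbf{U}_{all} - \mathbf{W}\mathbf{F}_{basis}\|_F^2 / (MC)$, and the truncated SVD gives the best rank-$J$ approximation in this sense (Eckart-Young theorem). The rule used here for choosing $J$, the smallest rank whose squared singular values reach a given share of the total, is an assumption; the text states only that $J$ is determined from the singular value spectrum. The function name `extract_basis` is hypothetical.

```python
import numpy as np

def extract_basis(U_all, energy=0.90):
    """Split U_all (M, N*C) into W (M, J) and F_basis (J, N*C) by truncated SVD."""
    Y, s, Vt = np.linalg.svd(U_all, full_matrices=False)
    # Smallest J whose leading squared singular values reach the requested share.
    ratio = np.cumsum(s ** 2) / np.sum(s ** 2)
    J = min(int(np.searchsorted(ratio, energy)) + 1, len(s))
    W = Y[:, :J]                       # first J columns of Y
    F_basis = s[:J, None] * Vt[:J]     # first J rows of Sigma V^T
    return W, F_basis

# Synthetic aligned torques: M = 10 conditions that truly share 2 common factors.
rng = np.random.default_rng(0)
U_all = rng.normal(size=(10, 2)) @ rng.normal(size=(2, 21))

W, F_basis = extract_basis(U_all, energy=0.999)
rel_err = np.linalg.norm(U_all - W @ F_basis) ** 2 / np.linalg.norm(U_all) ** 2
print(W.shape, F_basis.shape, rel_err)
```

Row $m$ of `W` then plays the role of $\mathbf{w}_m$, and each row of `F_basis` can be reshaped to $(C, N)$ to serve as the regression targets of one EIDM.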

The EIDMs $\mathbf{f}^e$ are then learned with the content matrix $\mathbf{X}_c$ as inputs and $\mathbf{F}_{basis}$ as the corresponding outputs using a nonlinear regression technique. Following the success of Gaussian process regression (GPR) in learning IDMs [6], [7], we use a sparse Gaussian process with pseudo-inputs (SPGP) [13], which has a lower computational cost for prediction than full GPR while reportedly matching its performance. With the SPGP, we learn a smooth mapping $f^e_j(\cdot)$ from $\mathbf{X}_c$ and $\mathbf{f}^j_{basis}$ independently for each $j$, which together give $\mathbf{f}^e(\cdot)$.
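To illustrate the regression step without an SPGP implementation, the sketch below fits the posterior mean of a full Gaussian process with a squared-exponential kernel and hand-fixed hyperparameters. This is a deliberately simplified stand-in, not the SPGP of [13]: there are no pseudo-inputs and no hyperparameter optimization. The names `rbf` and `fit_eidm` and the values of `length` and `noise` are assumptions.

```python
import numpy as np

def rbf(A, B, length):
    """Squared-exponential kernel matrix between the rows of A and the rows of B."""
    d2 = np.sum(A ** 2, axis=1)[:, None] + np.sum(B ** 2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-0.5 * np.maximum(d2, 0.0) / length ** 2)

def fit_eidm(X_c, f_target, length=1.0, noise=1e-4):
    """Fit one basis function. X_c: (C, D) inputs, f_target: (C, N) targets.

    Returns a predictor mapping one input of shape (D,) to an output of shape (N,).
    """
    K = rbf(X_c, X_c, length) + noise * np.eye(len(X_c))
    alpha = np.linalg.solve(K, f_target)          # GP posterior-mean weights
    return lambda x: (rbf(np.atleast_2d(x), X_c, length) @ alpha)[0]

# One-dimensional toy problem: recover sin(x) from 30 samples.
X_train = np.linspace(0.0, 3.0, 30)[:, None]
y_train = np.sin(X_train)
f_e = fit_eidm(X_train, y_train, length=0.5)

pred = f_e(np.array([1.55]))
print(pred, np.sin(1.55))
```

Prediction with this full model costs time proportional to the number of training points, which is the cost that a sparse approximation such as the SPGP is meant to reduce for real-time use.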

3) Learning of PIDM: The PIDM $f^p(\cdot)$ is finally formed as the weighted linear combination of the EIDMs, $\mathbf{u} = f^p(\mathbf{q}, \dot{\mathbf{q}}, \ddot{\mathbf{q}}; \mathbf{w}) = \sum_{j=1}^{J} w_j f^e_j(\mathbf{q}, \dot{\mathbf{q}}, \ddot{\mathbf{q}})$. By setting $\mathbf{w}_m$ to the $m$-th row of $\mathbf{W}$, the function $f^p(\cdot; \mathbf{w}_m)$ approximately represents the IDM of the robot under the $m$-th condition. Thus, the subspace spanned by the EIDMs contains a variety of IDMs, and it is suitable for fast real-time adaptive computed torque control of the robot even under unknown conditions.

This learning procedure is inspired by studies on style-content separation in several different contexts, such as face recognition [14], the synthesis of human-like graphics [15], and learning stylistic movement primitives [16], [17]. Our method can be interpreted as a modification of these methods that makes them particularly suitable for adaptive computed torque control.

IV. EXPERIMENTS

In this section, we describe the experiments conducted to validate our proposed method. The experimental design and settings are presented in Section IV-A, and the results are shown in Section IV-B.

Fig. 4. Objects used for training and test conditions: (a) objects for training conditions (a bar with weight attachments); (b) objects for test conditions (a bag, 1.75 kg, and a holder, 1.65 kg). In (a), A and B on the bar indicate the positions to attach the weights.

Fig. 5. Examples of training conditions for each case: (a) training condition $c_6$ for case (1); (b) training condition $c'_9$ for case (2).

A. Experimental Design and Settings

To validate our approach, we conduct experiments using a 7-DoF anthropomorphic manipulator (Barrett WAM) on a trajectory-tracking task. As examples of typical situations in our daily living environment, we select two cases: (1) the object-holding case and (2) the object-mounting case, as shown in Fig. 3.

1) For case (1): We assume a situation where the robot often meets several conditions, in each of which it holds an unknown object in its hand. The change in the robot dynamics caused by holding the unknown object would result in poor trajectory-tracking performance. The effectiveness of our approach is validated by demonstrating that fast real-time adaptation improves the tracking performance.

To learn a PIDM suitable for this case, we artificially prepare 10 training conditions ($c_1, \cdots, c_{10}$), focusing on the case in which the robot holds an unknown object; the objects are made from a bar and several attachments of different weights, as in Fig. 4(a). In each condition, the robot holds a different object in its hand, as in Fig. 5(a); the conditions differ in the weight, moment of inertia, and center of mass of the end link. The details of all training conditions are given in Table I. Under all conditions, the data are captured every 10 ms under PD tracking control of a nominal figure-of-eight trajectory (8 s for one period) in the task space of the end effector, and we obtain 8000 data points in total as $\mathcal{D}$.

As a test condition for validation, we select a typical bag (1.75 kg) to be held, as shown in Fig. 3(a) and Fig. 4(b). The test condition is not included in the training conditions; thus it is an unknown condition for the robot.

TABLE I
THE WEIGHT SPECIFICATIONS FOR TRAINING CONDITIONS IN CASE (1)

Training weight (kg) at each position, by condition number
Position   c1    c2    c3    c4    c5    c6    c7    c8    c9    c10
A          0.0   0.5   0.0   1.0   0.0   0.25  1.0   0.25  0.5   0.5
B          0.0   0.0   0.5   0.0   1.0   0.25  1.0   0.25  0.25  0.5

TABLE II
THE WEIGHT SPECIFICATIONS FOR TRAINING CONDITIONS IN CASE (2)

Training weight (kg) at each position, by condition number
Position   c′1   c′2   c′3   c′4   c′5   c′6   c′7   c′8   c′9   c′10
A′         0.0   1.0   0.0   2.0   0.0   1.0   0.5   1.0   1.5   2.0
B′         0.0   0.0   1.0   0.0   2.0   1.0   1.0   1.5   1.5   2.0

We first set the robot to a standard condition (no object) and initialize the weight coefficient vector of the PIDM to zeros. We then apply our proposed method to the trajectory-tracking task to adapt to this condition, additionally using a low-gain PD controller (footnote 1). Then, the condition is suddenly switched to the test condition by giving the bag to the hand. The proposed method is applied again to adapt to this new condition. The performance of the adaptive control is evaluated by the tracking performance on the nominal trajectory. The tracking error is defined in the task space of the end effector as $e(t) = \|\mathbf{r}(t) - \mathbf{r}_d(t)\|_2$, where $\mathbf{r}(t)$ is the Cartesian position of the end effector at time $t$ and $\mathbf{r}_d(t)$ is its desired position.

2) For case (2): We assume a situation where the robot often meets several conditions, in each of which it has an unknown object mounted on a link.

We prepare 10 training conditions ($c'_1, \cdots, c'_{10}$), focusing on the case in which an unknown object is mounted on a link of the robot; the objects are made from several attachments of different weights. In each condition, different attachments are mounted on the links, as shown in Table II, Fig. 4(b), and Fig. 5(b). In all conditions, the data are captured every 10 ms under PD tracking control of the same nominal trajectory, and we obtain 8000 data points in total as $\mathcal{D}'$.

As a test condition for validation, we select a holder (1.65 kg) mounted on a link, as shown in Fig. 3(b). Again, the test condition is not included in the training conditions. As in case (1), the performance of the proposed method is evaluated by the tracking errors.

B. Experimental Results

Footnote 1: In the experiments, the same PD gains were used throughout: $\mathbf{k}_p = [90, 250, 60, 50, 50, 50, 8]^T$ for position and $\mathbf{k}_v = [0.4, 0.8, 0.4, 0.2, 0.5, 0.5, 0.05]^T$ for velocity.

The resulting trajectories of the end effector are presented in Fig. 6, and the time courses of the tracking errors for both cases are depicted in Fig. 7. In both cases (1) and (2), the dimension of the weight coefficient vector was set to $J = 3$ so that the PIDM explained more than 90% of the training data. Adaptive control for trajectory tracking was performed by estimating the weight coefficient vector $\mathbf{w}$ on-line using a Newton-like method [18] with the recursive update rule $\mathbf{w}(k+1) \leftarrow \lambda \mathbf{w}(k) + g(\mathbf{x}(k), \mathbf{u}(k))$, where $\lambda$ is a time-forgetting factor. In this experiment, $\lambda$ was set experimentally to 0.99.
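The correction term $g(\cdot)$ is not spelled out above, so the following sketch should be read as one plausible instance of on-line weight estimation rather than the rule actually used in the experiments. It applies recursive least squares with a forgetting factor, a standard Newton-like estimator, to the linear-in-weights model $\mathbf{u} = \mathbf{\Phi}(\mathbf{x})\mathbf{w}$, where column $j$ of $\mathbf{\Phi}$ is the output of the $j$-th EIDM. The class name `WeightAdapter`, the initial covariance `p0`, and the synthetic data are assumptions.

```python
import numpy as np

class WeightAdapter:
    """Recursive least squares with forgetting for the PIDM weights (illustrative)."""

    def __init__(self, J, forgetting=0.99, p0=1e3):
        self.w = np.zeros(J)          # weights start at zero, as in the experiments
        self.P = p0 * np.eye(J)       # large initial covariance: weak prior on w
        self.lam = forgetting

    def update(self, Phi, u):
        """Phi: (N, J) EIDM outputs at the current state; u: (N,) measured torque."""
        S = self.lam * np.eye(len(u)) + Phi @ self.P @ Phi.T
        K = np.linalg.solve(S, Phi @ self.P).T          # gain P Phi^T S^-1
        self.w = self.w + K @ (u - Phi @ self.w)        # correct by the torque error
        self.P = (self.P - K @ Phi @ self.P) / self.lam # discount old information
        return self.w

# Synthetic check with N = 7 joints and J = 3 weights.
rng = np.random.default_rng(0)
w_true = np.array([1.0, -2.0, 0.5])
adapter = WeightAdapter(J=3)
for _ in range(200):
    Phi = rng.normal(size=(7, 3))     # stand-in for the EIDM outputs
    adapter.update(Phi, Phi @ w_true)

print(adapter.w)
```

Because only $J$ weights are estimated from $N$ torque measurements per sample, a few informative samples already determine the weights in this noiseless setting, which is consistent in spirit with the fast adaptation reported below.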

For case (1), as shown in Fig. 7(i) and (ii), the tracking error was initially large (region (a)), but it was reduced quickly and significantly by our proposed method (region (b)), which converged within two seconds. Then, the condition was suddenly switched to the test condition, which again resulted in a large tracking error (region (c)); however, the error was quickly reduced by applying our proposed method, again within two seconds (region (d)). For case (2), very similar results were obtained, as shown in Fig. 7(iii) and (iv). These results validate the effectiveness of the proposed method for fast adaptive computed torque control.

Furthermore, to evaluate the tracking performance with the PIDM adapted to the test conditions (regions (d) and (d') in Fig. 7), we calculated the average tracking errors over two periods of the trajectory after the adaptation had converged. For comparison, we obtained the average tracking errors without adaptation, i.e., using only the low-gain PD controller (low-PD). This can be considered an "upper bound" on the tracking error. Moreover, to obtain a "lower bound", we learned IDMs by applying the SPGP in both cases and obtained the average tracking errors using those IDMs (SPGP). Note that the SPGP needs a sufficient amount of data over the whole nominal trajectory from each test condition to learn the IDM. In this experiment, the data for one period of the nominal trajectory from the test condition were given, which corresponds to a movement four times longer than that required for adaptation by the proposed method. All the results are shown in Table III. The tracking errors with the proposed method were much smaller than those of low-PD and very close to those of SPGP. These results validate the effectiveness of the proposed method in terms of accuracy.
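The averaging described above can be sketched as follows, using the tracking error $e(t) = \|\mathbf{r}(t) - \mathbf{r}_d(t)\|_2$ from Section IV-A, the 10 ms sampling interval, and the 8 s period stated earlier. The function name `average_tracking_error` and the synthetic trajectories are hypothetical; no experimental data are reproduced here.

```python
import numpy as np

def average_tracking_error(r, r_des, start, n_samples):
    """Mean of e(t) = ||r(t) - r_d(t)||_2 over n_samples samples beginning at start.

    r, r_des: arrays of shape (T, 3) with measured and desired end-effector positions.
    """
    e = np.linalg.norm(r - r_des, axis=1)
    return float(e[start:start + n_samples].mean())

dt, period = 0.01, 8.0                      # sampling interval and trajectory period [s]
n_two_periods = int(round(2 * period / dt)) # 1600 samples

# Synthetic data: a constant 3-4-5 offset gives an error of exactly 0.05 m.
r_des = np.zeros((2000, 3))
r = r_des + np.array([0.03, 0.04, 0.0])

avg = average_tracking_error(r, r_des, start=200, n_samples=n_two_periods)
print(n_two_periods, avg)
```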

V. SUMMARY

In this paper, we proposed a novel approach for fast adaptive computed torque control of robotic manipulators, for managing situations where the dynamics of the robot change with tasks and conditions such as holding objects and hanging tools. Through experiments with a 7-DoF anthropomorphic manipulator, we have shown that our approach can achieve rapid adaptation (within two seconds) for a robotic manipulator even under unknown conditions such as holding and mounting unknown objects.

Compared to on-line learning, which takes 40 s for adaptation [8], our approach achieves significantly faster convergence with competitive accuracy, as shown by the experimental results, because it effectively uses information about the dynamics of the robot under several other conditions to learn a compact representation in the form of the PIDM.

Fig. 6. Resulting trajectories with the proposed method, plotted against the reference trajectory (axes in meters). (i) and (ii) show the results for cases (1) and (2), respectively. (a) shows the results in the first and second periods of the trajectory, and (b) shows the results in the third and fourth periods. In all cases, fast adaptive control is performed successfully.

TABLE III
AVERAGE TRACKING ERROR

            low-PD (m)   Proposed (m)   SPGP (m)
Case (1)    0.059        0.018          0.023
Case (2)    0.042        0.014          0.013

The method proposed in this paper extends our preliminary study conducted in simulation [19] so that it is applicable to real robots. The effectiveness of the method for a real 7-DoF anthropomorphic manipulator was demonstrated through experiments in real environments.

Our future work includes applying the proposed method to whole-body humanoid robots. We will also extend the proposed method so that it actively creates training conditions for efficiently learning the PIDM.

REFERENCES

[1] C. H. An, C. G. Atkeson, and J. Hollerbach, Model-Based Control of a Robot Manipulator. MIT Press, 1988.

[2] M. W. Spong, S. Hutchinson, and M. Vidyasagar, Robot Dynamics and Control. New York: John Wiley and Sons, 2006.

[3] J. J. Craig, Introduction to Robotics: Mechanics and Control, 3rd ed. Prentice Hall, 2004.

[4] R. Featherstone, Robot Dynamics Algorithms. Kluwer Academic Publishers, 1987.

[5] S. Vijayakumar and S. Schaal, “Locally weighted projection regression: An O(n) algorithm for incremental real time learning in high dimensional space,” in Proceedings of the Seventeenth International Conference on Machine Learning, 2000, pp. 1079–1086.

[6] D. Nguyen-Tuong, M. Seeger, and J. Peters, “Computed torque control with nonparametric regression models,” in American Control Conference (ACC), 2008, pp. 212–217.

[7] D. Nguyen-Tuong, J. Peters, and M. Seeger, “Local Gaussian process regression for real time online model learning and control,” in IEEE/RSJ International Conference on Intelligent Robots and Systems, 2008, pp. 365–372.

[8] D. Nguyen-Tuong and J. Peters, “Incremental sparsification for real-time online model learning,” in Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics (AISTATS 2010), vol. 9, 2010, pp. 557–564.

Fig. 7. Time course of the tracking errors (error in meters against time in seconds, from 0 to 32 s) with snapshots of the experimental scenes. (i)-(ii) and (iii)-(iv) are obtained from the experiments of cases (1) and (2), respectively. In regions (a) and (a'), the robot shows poor tracking performance because the PIDM is not yet adapted. Then, in regions (b) and (b'), our proposed method is applied to adapt to the condition and the tracking performance quickly improves. In regions (c) and (c'), the object is given to the robot, which switches the condition to a new one. In regions (d) and (d'), the proposed method is applied again and successfully improves the tracking performance.

[9] M. K. Ciliz and K. S. Narendra, “Adaptive control of robotic manipulators using multiple models and switching,” International Journal of Robotics Research, vol. 15, no. 6, pp. 592–610, 1996.

[10] K. M. A. Chai, C. K. I. Williams, S. Klanke, and S. Vijayakumar, “Multi-task Gaussian process learning of robot inverse dynamics,” in Advances in Neural Information Processing Systems, vol. 21, 2008, pp. 1–8.

[11] D. M. Wolpert and M. Kawato, “Multiple paired forward and inverse models for motor control,” Neural Networks, vol. 11, no. 7, pp. 1317–1329, 1998.

[12] J. Bentley, “Multidimensional binary search trees used for associative searching,” Communications of the ACM, vol. 18, 1975, pp. 509–517.

[13] E. Snelson and Z. Ghahramani, “Sparse Gaussian process using pseudo-inputs,” in Advances in Neural Information Processing Systems, 2006, pp. 1257–1264.

[14] J. B. Tenenbaum and W. T. Freeman, “Separating style and content with bilinear models,” Neural Computation, vol. 12, pp. 1247–1283, 2000.

[15] M. Brand and A. Hertzmann, “Style machines,” in Proceedings of SIGGRAPH, 2000, pp. 183–192.

[16] T. Matsubara, S. Hyon, and J. Morimoto, “Learning stylistic dynamic movement primitives from multiple demonstrations,” in IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2010), 2010, pp. 1277–1283.

[17] T. Matsubara, S. Hyon, and J. Morimoto, “Learning parametric dynamic movement primitives from multiple demonstrations,” Neural Networks, vol. 24, pp. 493–500, 2011.

[18] M. Kawato, “Feedback-error-learning for neural network for supervised motor learning,” Advanced Neural Computers, pp. 365–372, 1990.

[19] Y. Horiguchi, T. Matsubara, and M. Kidode, “Learning basis representations of inverse dynamics models for real-time adaptive control,” in 17th International Conference on Neural Information Processing (ICONIP 2010), vol. 6444, 2010, pp. 668–675.

