10/29/04 1
Acquisition of Control Knowledge of Nonholonomic System by Active Learning Method
Yoshitaka Sakurai Nakaji Honda Junji Nishino
Presented by: Pujan Ziaie
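Japanese title: 能動学習法を用いた非ホロノミック系の知識制御の獲得 (Acquisition of control knowledge of a nonholonomic system using the active learning method)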
Paper Information
Journal of Advanced Computational Intelligence and Intelligent Informatics
– Received August 28, 2002; accepted December 13, 2002
Proc. of 2003 IEEE International Conference on Systems
– pp. 2400-2405 (2003.10)
About author
– Yoshitaka Sakurai (Ph.D. student)
– University of Electro-Communications, Department of Systems Engineering, Honda Lab.
Introduction
ALM (Active Learning Method)
IDS (Ink Drop Spread)
Simulation for Gymnastic Bar Action
– Mathematical Model & Equations
– Active Learning Approach
Conclusion
Active Learning Method
Why ALM?
– No need to know the inner structure of the system
– Improves its performance on its own
Characteristics
Construction
Modeling
ALM Characteristics
Using SISO systems
– Choosing the most effective data
Accumulation of knowledge by experience
– Reinforcement learning (reward or punishment)
Estimation of overall information from fragmentary information
ALM Construction
Similar to human learning
[Block diagram labels: Knowledge Acquisition Part, Controller, System Under Control, Modeling, Data collection, Evaluation, Database, Control Rule, Sampling Rule, Storage of I/O data, IDS, Trial & Error]
ALM Modeling (1)
Dividing the MIMO system into SISO systems
Dividing the input domains into fuzzy regions
Extracting the continuous narrow path
Calculating the output as the sum of (adaptability of each region * region output)
[Diagram labels: MIMO System; SISO (×3); Combination Rule (×2)]
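ALM Modeling (2)
Example: 2 inputs → 1 output
[Figure: input X1 divided into fuzzy regions VS, S, M, L, VL, with X1 = a falling in regions M and L (memberships βM, βL); the X2-y plots of regions M and L are read at X2 = b (labels yM, yL, ZM, ZL)]
In general: y = βVS * ZVS + βS * ZS + βM * ZM + βL * ZL + βVL * ZVL
For X1 = a & X2 = b: y = βM * ZM + βL * ZL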
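The combination rule in this example can be illustrated with a minimal sketch. The numeric values of the adaptabilities and region outputs below are made up for illustration and are not taken from the paper; the function name `combine` is hypothetical.

```python
def combine(adaptability, region_output):
    """Return y = sum over fuzzy regions of (adaptability * region output)."""
    return sum(adaptability[k] * region_output[k] for k in adaptability)


# Illustrative numbers only (assumed, not from the paper): at X1 = a only
# the regions M and L have non-zero adaptability.
beta = {"VS": 0.0, "S": 0.0, "M": 0.7, "L": 0.3, "VL": 0.0}

# Region outputs Z_k, as they would be read from each region's narrow path
# at X2 = b (again assumed values).
z = {"VS": 1.0, "S": 2.0, "M": 4.0, "L": 6.0, "VL": 9.0}

y_full = combine(beta, z)                           # all five terms
y_active = beta["M"] * z["M"] + beta["L"] * z["L"]  # only the active terms
```

Because the adaptabilities of VS, S and VL are zero at X1 = a, the five-term sum and the two-term sum give the same value.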
Ink Drop Spread method
What is IDS?
• Extracts the narrow path by applying a fuzzy process to the input-output data
Why use IDS?
• Creates a continuous narrow path
• Measures the amount of data distribution (for extracting the most effective input)
IDS Algorithm (1)
Using an irradiation pyramid on the data plane
IDS Algorithm (2)
[Figure labels: Data plane; Projected plane; Combining the lights; Narrow path]
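A rough sketch of the idea on these two slides, under stated assumptions: each data point spreads a pyramid-shaped drop on a discretized data plane, the drops are summed, and a narrow path is read off per input column. The slides do not say how the narrow path is extracted, so the intensity-weighted mean used here is an assumption, as are the function names, the grid representation and the `radius` parameter.

```python
def ink_drop_spread(points, grid_size, radius):
    """Sum a pyramid-shaped drop around each (x, y) grid point.

    Returns plane[x][y], the accumulated intensity on the data plane.
    """
    plane = [[0.0] * grid_size for _ in range(grid_size)]
    for px, py in points:
        for x in range(max(0, px - radius), min(grid_size, px + radius + 1)):
            for y in range(max(0, py - radius), min(grid_size, py + radius + 1)):
                # Square pyramid: intensity falls off linearly with the
                # Chebyshev distance from the data point.
                dist = max(abs(x - px), abs(y - py))
                plane[x][y] += radius + 1 - dist
    return plane


def narrow_path(plane):
    """Per x column, return the intensity-weighted mean y (None if no ink).

    Assumption: the slides do not specify the extraction rule.
    """
    path = []
    for column in plane:
        total = sum(column)
        if total == 0:
            path.append(None)
        else:
            path.append(sum(y * v for y, v in enumerate(column)) / total)
    return path


plane = ink_drop_spread([(2, 2), (6, 6)], grid_size=9, radius=2)
path = narrow_path(plane)
```

With only two data points the overlapping drops already yield a path value for every column from 0 to 8, which is the sense in which the spread produces a continuous path from sparse data.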
IDS Algorithm (3)
Sample of IDS
[Figure labels: Gathering more data (through feedback); Gathering more data]
Control process (1)
Defining the control structure
• Dividing the inputs into regions according to their range
• Selecting the most efficient input for the required output (by a human or by the controller)
• Defining the evaluation rule for selecting suitable data
Control process (2)
Control cycle
1. Gather data by using the control rules
– The first time, using random numbers
– After the first time, using the developed controller
2. Evaluate the gathered data
3. Improve the partial knowledge function (if the data is suitable)
4. Repeat from step 1
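The four steps of the control cycle can be sketched as a loop. This is only a skeleton under assumptions: the reward function, the line-fitting "modeling" step and all names (`run_trial`, `fit_controller`, `control_cycle`) are hypothetical stand-ins, and the rule "keep a trial only if it scores better than the best so far" is one possible reading of the evaluation step.

```python
import random


def run_trial(controller, reward, steps, rng):
    """Step 1: gather I/O data and accumulate a score for the trial."""
    data, score = [], 0.0
    for _ in range(steps):
        x = rng.uniform(-1.0, 1.0)          # observed input
        if controller is None:
            u = rng.uniform(-1.0, 1.0)      # first time: random numbers
        else:
            u = controller(x)               # later: the developed controller
        data.append((x, u))
        score += reward(x, u)
    return data, score


def fit_controller(database):
    """Toy stand-in for modeling: least-squares line u = k * x."""
    sxx = sum(x * x for x, _ in database)
    sxu = sum(x * u for x, u in database)
    k = sxu / sxx if sxx > 0 else 0.0
    return lambda x: k * x


def control_cycle(reward, trials, steps, seed=0):
    rng = random.Random(seed)
    database, controller = [], None
    best_score, history = float("-inf"), []
    for _ in range(trials):                                      # 4. repeat
        data, score = run_trial(controller, reward, steps, rng)  # 1. gather
        if score > best_score:                                   # 2. evaluate
            best_score = score
            database.extend(data)                                # 3. improve
            controller = fit_controller(database)
        history.append(best_score)
    return controller, history


controller, history = control_cycle(
    lambda x, u: -(u - 0.5 * x) ** 2, trials=5, steps=50
)
```

Unlike ALM, this skeleton has no exploration after the first trial, so it only shows the order of the steps and not how the knowledge keeps improving.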
Control process (3)
Output calculation method
1. Remove the most efficient input from the inputs
2. Build the input state tree according to the valid fuzzy regions
3. Extract a narrow path of the most efficient input and the output for each leaf of the tree
4. Calculate the final output value as the sum of the output of each node multiplied by the adaptability of that node
[Figure: input Xn divided into fuzzy regions VS, S, M, L, VL, with memberships βM and βL at Xn = a. The output of each node comes from its narrow path; its adaptability is obtained by multiplying the membership values of the nodes from the root to the leaf]
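The output calculation above can be sketched as follows. Evenly spaced triangular membership functions are an assumption (the slides do not give their shape), and the names `memberships`, `model_output` and `narrow_paths` are hypothetical. Each leaf of the tree is one combination of valid fuzzy regions of the remaining inputs.

```python
from itertools import product
from math import prod


def memberships(x, centers):
    """Non-zero triangular memberships of x as {region index: beta}.

    Assumes evenly spaced region centers.
    """
    width = centers[1] - centers[0]
    betas = {i: 1.0 - abs(x - c) / width for i, c in enumerate(centers)}
    return {i: b for i, b in betas.items() if b > 0.0}


def model_output(other_inputs, x_eff, centers, narrow_paths):
    """Sum over leaves of (leaf adaptability * leaf narrow-path output).

    other_inputs: the inputs left after removing the most efficient one.
    x_eff: value of the most efficient input.
    narrow_paths: maps a leaf (tuple of region indices) to a function of x_eff.
    """
    valid = [memberships(x, centers) for x in other_inputs]
    y = 0.0
    for leaf in product(*(m.items() for m in valid)):
        state = tuple(index for index, _ in leaf)
        beta = prod(b for _, b in leaf)  # memberships from root to leaf
        y += beta * narrow_paths[state](x_eff)
    return y


centers = [0.0, 1.0, 2.0]
# Made-up narrow paths: leaf (i, j) outputs (i + j) * x_eff.
paths = {
    (i, j): (lambda x, s=i + j: s * x)
    for i, j in product(range(3), repeat=2)
}
y = model_output((0.5, 1.0), x_eff=2.0, centers=centers, narrow_paths=paths)
```

Here the first remaining input lies halfway between regions 0 and 1 and the second sits exactly on region 1, so only two leaves contribute, each with adaptability 0.5.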
Gymnastic Bar Action
Model of Bar Gymnast
– 4 joints & 5 links
Link 0 is not driven
– θ0 depends on the position of the center of gravity of the model and the shape of the posture
The mass of the head is assumed to be 0
GOAL: achieve the largest swing angle
Equations:
– θi: relative angle between link i-1 and link i at each joint i (i = 0..4)
– T: kinetic energy
– V: potential energy (gravity)
– L: T - V → Lagrangian equation
– Ii: moment of inertia
– xi, yi: coordinates of the center of gravity of the i-th link
– Ni: torque applied at each joint i
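For reference, the quantities listed above fit the standard Lagrangian form for a planar link chain. This is a generic textbook sketch, not the paper's own derivation. The link masses m_i, the gravitational acceleration g and the absolute angular velocities ω_i are symbols introduced here, and the sketch assumes I_i is taken about each link's center of gravity and y_i is measured upward.

```latex
\begin{align}
L &= T - V \\
T &= \sum_{i=0}^{4} \left[ \tfrac{1}{2} m_i \left( \dot{x}_i^2 + \dot{y}_i^2 \right)
     + \tfrac{1}{2} I_i \omega_i^2 \right],
     \qquad \omega_i = \sum_{k=0}^{i} \dot{\theta}_k \\
V &= \sum_{i=0}^{4} m_i \, g \, y_i \\
\frac{d}{dt} \frac{\partial L}{\partial \dot{\theta}_i}
  - \frac{\partial L}{\partial \theta_i} &= N_i, \qquad i = 0, \dots, 4
\end{align}
```

If the undriven link 0 is read as having no torque at its joint, then N_0 = 0 in the last equation, so θ0 can only be influenced indirectly through the motion of the other joints.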
Acquisition of knowledge
Does a little kid learn the gymnastic bar by solving the Lagrangian equation?!
NO!
They try to learn from the environment by trial and error
ALM applied to the Model of Bar Gymnast
[Block diagram labels: Knowledge Acquisition Part, Controller, Simulator, Data collection, Evaluation, Database, Control Rule, Sampling Rule, IDS, Sequential Database, Modeling, IO Model, IDS Diagrams]
[Diagram notes: After some specified time, comparing with the previous largest swing angles; Probability based on distribution]
Simulation properties
Sampling rate: every 1/1000 sec
Evaluation: every 2 minutes
Angle range & division:
– θ0: -180 to 180 → 8 MFs
– θ1: 0 to 130 → 5 MFs
– θ2: -180 to 0 → 5 MFs
– θ3: -130 to 30 → 5 MFs
– θ4: 0 to 130 → 5 MFs
Most effective input of each output (joint torque) → the angle of the same joint
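The angle ranges and divisions listed above can be turned into a fuzzy partition as in this sketch. Only the ranges and the number of membership functions (MFs) come from the slide. Evenly spaced triangular MFs, the key names `theta0`..`theta4` and both function names are assumptions for illustration.

```python
# (low, high, number of MFs) per joint angle, as listed on the slide.
ANGLE_PARTITIONS = {
    "theta0": (-180.0, 180.0, 8),
    "theta1": (0.0, 130.0, 5),
    "theta2": (-180.0, 0.0, 5),
    "theta3": (-130.0, 30.0, 5),
    "theta4": (0.0, 130.0, 5),
}


def mf_centers(low, high, count):
    """Evenly spaced MF centers covering [low, high] (assumed layout)."""
    step = (high - low) / (count - 1)
    return [low + i * step for i in range(count)]


def angle_memberships(name, angle):
    """Triangular membership of `angle` in each fuzzy region of a joint."""
    low, high, count = ANGLE_PARTITIONS[name]
    step = (high - low) / (count - 1)
    return [
        max(0.0, 1.0 - abs(angle - c) / step)
        for c in mf_centers(low, high, count)
    ]


theta1_centers = mf_centers(*ANGLE_PARTITIONS["theta1"])
theta1_at_40 = angle_memberships("theta1", 40.0)
```

With this layout an angle inside the range belongs to at most two neighboring regions, and its two memberships add up to 1.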
Conclusion
ALM is a strong, flexible method for some complicated control problems
Mathematics is completely useless for many control problems
Advantages of this approach
– flexibility
– ease
Disadvantages
– imperfect information-collecting rule
– still too crisp
Why did I choose this paper?
I liked it
It was quite a challenge
It was brand new
I have some ideas to improve it
– using fuzzy approaches for the output
– correcting membership functions instead of adding new data
Acknowledgment
Special thanks to
– Sakurai-san for giving me his time and answering my questions
– Yamazaki-san, who helped me write the Japanese translation of technical words
– Serata-san, who arranged an appointment with Sakurai-san
Thank you all for listening
Any easy questions?!
Declaration Slide
Sampling rule
• Probability based on distribution function
[Figure: axes Xn and y; probability function]
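One way to read this sampling rule is sketched below: for a given input, the accumulated data distribution along the output axis is normalized into a probability function, and the next output is drawn from it. The slide gives no further details, so the normalization, the uniform fallback for an empty column and the name `sample_output` are all assumptions.

```python
import random


def sample_output(column, rng):
    """Draw an output index with probability proportional to its intensity.

    column: data-distribution intensities along y for one value of Xn.
    """
    total = sum(column)
    if total == 0:
        # No data yet for this input: fall back to a uniform draw (assumed).
        return rng.randrange(len(column))
    return rng.choices(range(len(column)), weights=column, k=1)[0]


rng = random.Random(0)
draws = [sample_output([0.0, 1.0, 3.0, 1.0, 0.0], rng) for _ in range(200)]
```

Outputs with more accumulated data are drawn more often, while cells with zero intensity are never selected once some data exists for that input.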
Declaration Slide
Input tree, e.g.:
– four inputs (x0..x3)
– x1 is the most efficient
[Figure: input Xn divided into fuzzy regions VS, S, M, L, VL with memberships βM and βL at Xn = a; tree with levels x0, x2, x3 and branch memberships β0a, β0b, β2c, β2d, β3e, β3f; each leaf uses a partial knowledge function]
adaptability of this state (1): β_L1_S1 = β0b * β2c * β3e
x1 output for this state (1): f_L1_S1(x1)
y = β_L1_S1*f_L1_S1(x1) + β_L1_S2*f_L1_S2(x1) + β_L1_S3*f_L1_S3(x1) + β_L1_S4*f_L1_S4(x1)