Differential Evolution with Multi-strategy Adaptation
Wenyin Gong
School of Computer Science,
China University of Geosciences, Wuhan, China
E-mail: [email protected]
March 29, 2019
W. Gong, Multi-strategy based DE, 1/35
Outline
Differential Evolution
SaJADE (Proposal)
DE with Adaptive Operator Selection (Proposal)
Cheap Surrogate Models for EAs (Proposal)
Conclusions
Introduction

DE
Differential Evolution (DE; Storn & Price, 1997) is a simple yet powerful population-based, direct-search, self-adaptive algorithm for global optimization, mainly over real-valued parameters.

Advantages of DE
• Simple structure (the main algorithm is around 30 lines of C);
• Ease of use (only three control parameters: NP, CR, and F);
• Speed and robustness.
General Framework of DE

Algorithm 1: The pseudo-code of the classical DE algorithm
Input: Control parameters: NP, CR, and F
Output: The best final solution
1   Generate the initial population with population size NP;
2   Evaluate the population;
3   while not terminated do
4       for i = 1 to NP do                /* Generate offspring */
5           Select the parents r1 ≠ r2 ≠ r3 ≠ i;
6           Generate the offspring ui with a DE reproduction strategy;
7           Evaluate the offspring ui;
8       for i = 1 to NP do                /* Survival selection */
9           if ui is better than its parent xi then
10              xi = ui;
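As a cross-check of Algorithm 1, the loop can be sketched in Python. This is illustrative code, not the author's reference implementation; the 5-D sphere objective and all parameter values are assumptions for the example.

```python
import numpy as np

def de_rand_1_bin(f, bounds, NP=30, CR=0.9, F=0.5, max_gens=200, seed=1):
    """Classical DE (Algorithm 1) using the 'DE/rand/1/bin' strategy."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    D = len(lo)
    pop = rng.uniform(lo, hi, size=(NP, D))      # 1: initial population
    fit = np.array([f(x) for x in pop])          # 2: evaluate
    for _ in range(max_gens):                    # 3: until terminated
        offspring = np.empty_like(pop)
        for i in range(NP):                      # 4: generate offspring
            r1, r2, r3 = rng.choice([r for r in range(NP) if r != i],
                                    size=3, replace=False)
            jrnd = rng.integers(D)
            u = pop[i].copy()
            for j in range(D):                   # binomial crossover
                if rng.random() < CR or j == jrnd:
                    u[j] = pop[r1, j] + F * (pop[r2, j] - pop[r3, j])
            offspring[i] = u
        for i in range(NP):                      # 8: survival selection
            fu = f(offspring[i])
            if fu <= fit[i]:                     # offspring replaces parent
                pop[i], fit[i] = offspring[i], fu
    return pop[fit.argmin()], fit.min()

sphere = lambda x: float(np.sum(x ** 2))
best_x, best_f = de_rand_1_bin(sphere, (np.full(5, -5.0), np.full(5, 5.0)))
```

On this smooth unimodal problem the loop reliably drives the sphere value close to zero within the assumed budget.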
Pseudo-code of the “DE/rand/1/bin” Strategy

Algorithm 2: The pseudo-code of the “DE/rand/1/bin” strategy
Input: Parameters CR, F, and parents xi, xr1, xr2, xr3
Output: The offspring ui
1   jrnd = rndint(1, D);
2   for j = 1 to D do
3       if rndreal(0, 1) < CR or j == jrnd then
4           ui,j = xr1,j + F · (xr2,j − xr3,j)    /* "DE/rand/1" */
5       else
6           ui,j = xi,j
where D is the dimension of the decision variables; rndint(1, D) returns a random integer between 1 and D; and rndreal(0, 1) returns a random real number in [0, 1].
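Algorithm 2 translates almost line for line into Python. The sketch below (illustrative, with 0-based indices instead of the slide's 1-based ones) generates a single offspring:

```python
import random

def de_rand_1_bin_offspring(xi, xr1, xr2, xr3, CR, F, rng=random):
    """Algorithm 2: generate one offspring u_i from x_i and three parents."""
    D = len(xi)
    jrnd = rng.randrange(D)                 # rndint(1, D), here 0-based
    u = list(xi)
    for j in range(D):
        if rng.random() < CR or j == jrnd:  # binomial crossover test
            u[j] = xr1[j] + F * (xr2[j] - xr3[j])   # "DE/rand/1" mutation
    return u

u = de_rand_1_bin_offspring([0.0] * 4, [1.0] * 4, [2.0] * 4, [1.5] * 4,
                            CR=1.0, F=0.5)
# With CR = 1.0 every component is mutated: 1.0 + 0.5 * (2.0 - 1.5) = 1.25
```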
Mutation Strategies in DE

DE offers many mutation strategies, and different strategies have different characteristics.
• “DE/rand/1”: the classic DE strategy; less greedy, slower to converge, more reliable, and more suitable for multi-modal problems.
• Strategies based on the best-so-far solution (xbest): converge faster and suit unimodal functions, e.g., “DE/best/1”.
• Strategies based on two difference vectors: provide better perturbation, e.g., “DE/rand/2”.
• Strategies based on the current solution (xi): perform local search and converge faster, but stagnate quickly, e.g., “DE/current-to-rand/1”.
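For reference, the strategies named above can be written down as follows. These are the standard textbook formulas, not code from the talk; F and K are the usual scale factors, and each x argument is a parent vector.

```python
import numpy as np

def rand_1(xr1, xr2, xr3, F):            # "DE/rand/1"
    return xr1 + F * (xr2 - xr3)

def best_1(xbest, xr1, xr2, F):          # "DE/best/1"
    return xbest + F * (xr1 - xr2)

def rand_2(xr1, xr2, xr3, xr4, xr5, F):  # "DE/rand/2", two difference vectors
    return xr1 + F * (xr2 - xr3) + F * (xr4 - xr5)

def current_to_rand_1(xi, xr1, xr2, xr3, K, F):  # "DE/current-to-rand/1"
    return xi + K * (xr1 - xi) + F * (xr2 - xr3)
```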
Reference

Paper: From parameter adaptation to strategy selection
W. Gong, Z. Cai, C.X. Ling, and H. Li, “Enhanced differential evolution with adaptive strategies for numerical optimization,” IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, 2011, 41(2): 397-413.

Paper & Source Codes: Available online at http://www.escience.cn/people/wygong
Basic Idea

Individual representation
The i-th individual Xi is represented as follows:

Xi = ⟨xi, ηi⟩ = ⟨xi,1, · · · , xi,D, ηi⟩    (1)

where the parameter ηi ∈ [0, 1) is used to control the selection of different mutation strategies.
Basic Idea

Strategy selection
Suppose that we have K strategies in the strategy pool. For the i-th target vector, its mutation strategy Si ∈ {1, · · · , K} is obtained as:

Si = ⌊ηi × K⌋ + 1    (2)

Example
If K = 4 and ηi ∈ [0, 0.25), then Si = 1; that is, if Xi is the target vector, the first strategy in the pool is selected to generate the mutant vector.
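Equation (2) is a one-liner; a hypothetical `select_strategy` helper makes the example above concrete:

```python
import math

def select_strategy(eta, K):
    """Equation (2): map eta in [0, 1) to a strategy index in {1, ..., K}."""
    return math.floor(eta * K) + 1

# With K = 4, the pool splits [0, 1) into four equal sub-ranges:
assert select_strategy(0.10, 4) == 1   # eta in [0, 0.25)   -> strategy 1
assert select_strategy(0.30, 4) == 2   # eta in [0.25, 0.5) -> strategy 2
assert select_strategy(0.99, 4) == 4   # eta in [0.75, 1)   -> strategy 4
```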
Strategy Selection

Questions
To implement the strategy adaptation, we need to address two questions:
1. Which mutation strategies should be chosen to form the strategy pool?
2. How do we update the strategy parameter ηi?
Strategy Pool

Strategy pool
• Until now, there has been no theoretical study on the choice of the optimal pool size or of the strategies that should form the pool (Qin et al., 2009).
• In this work, four strategies proposed in JADE (Zhang & Sanderson, 2009) are chosen:
  ◦ 1-2: “DE/current-to-pbest” with/without archive;
  ◦ 3-4: “DE/rand-to-pbest” with/without archive.
Strategy Parameter Adaptation

Method 1: Inspired by JADE (Zhang & Sanderson, 2009)
In this method, the parameter ηi is generated as:

ηi = rndni(µs, 1/6),  if g = 1
ηi = rndni(µs, 0.1),  otherwise    (3)

µs = (1 − c) × µs + c × meanA(Hs)    (4)

where rndni(µs, 0.1) denotes a normally distributed random number with mean µs and standard deviation 0.1; Hs is the set of all successful strategy parameters ηi at generation g; c is a positive constant in [0, 1]; and meanA(·) is the usual arithmetic mean.
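A sketch of Method 1 follows. The slides do not specify how out-of-range samples are repaired, so clamping into [0, 1) is an assumption here, as are the c = 0.1 default and the helper names.

```python
import random

def sample_eta(mu_s, g, rng=random):
    """Equation (3): eta_i ~ N(mu_s, sigma), with sigma = 1/6 at g = 1."""
    sigma = 1.0 / 6.0 if g == 1 else 0.1
    eta = rng.gauss(mu_s, sigma)
    return min(max(eta, 0.0), 1.0 - 1e-12)   # assumed repair: clamp to [0, 1)

def update_mu_s(mu_s, successful_etas, c=0.1):
    """Equation (4): blend mu_s toward the mean of the successful eta_i."""
    if not successful_etas:                  # H_s empty: keep mu_s unchanged
        return mu_s
    mean_a = sum(successful_etas) / len(successful_etas)
    return (1 - c) * mu_s + c * mean_a
```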
Strategy Parameter Adaptation

Method 2: Inspired by jDE (Brest et al., 2006)
In this method, the parameter ηi is regenerated as:

ηi = rndreal[0, 1),  if rndreal[0, 1] < δ
ηi = ηi,             otherwise    (5)
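Equation (5) can be sketched as follows; the default δ = 0.1 is an assumption (it matches the rate typically used in jDE), not a value stated on the slide.

```python
import random

def jde_update_eta(eta, delta=0.1, rng=random):
    """Equation (5): with probability delta, resample eta uniformly in [0, 1);
    otherwise keep the inherited value."""
    return rng.random() if rng.random() < delta else eta
```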
Strategy Parameter Adaptation

Method 3: Inspired by the DE mutation
In this method, the parameter ηi is generated as:

ηi = ηi + Ki · (ηbest − ηi) + Ki · (ηr1 − ηr2)    (6)

where ηbest is the strategy parameter of the best individual in the current population; r1, r2 ∈ [1, NP] with r1 ≠ r2 ≠ i; and Ki = rndreal[0, 1]. If ηi ∉ [0, 1), it is truncated to [0, 1).

Remark
Since ηi is a real parameter, many other techniques could be used to handle it.
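A sketch of Method 3, with `etas` holding the strategy parameters of the whole population (the function name and interface are illustrative assumptions):

```python
import random

def de_update_eta(etas, i, best, rng=random):
    """Equation (6): DE-mutation-style update of eta_i, truncated to [0, 1)."""
    NP = len(etas)
    r1, r2 = rng.sample([r for r in range(NP) if r != i], 2)  # r1 != r2 != i
    K = rng.random()                                          # K_i in [0, 1]
    eta = etas[i] + K * (etas[best] - etas[i]) + K * (etas[r1] - etas[r2])
    return min(max(eta, 0.0), 1.0 - 1e-12)                    # truncate
```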
The Algorithm

Algorithm 3: The pseudo-code of our approach
Input: Control parameters: NP, CR, and F
Output: The best final solution
1   Generate and evaluate the initial population;
2   while not terminated do
3       for i = 1 to NP do                /* Generate offspring */
4           Update the strategy parameter ηi;
5           Calculate Si = ⌊ηi × K⌋ + 1;
6           Generate the offspring ui with the selected strategy;
7           Evaluate the offspring ui;
8       for i = 1 to NP do                /* Survival selection */
9           if ui is better than its parent xi then
10              xi = ui;
11              ηi → Hs                   /* for method 1 */
12      Update µs = (1 − c) × µs + c × meanA(Hs)    /* for method 1 */
Reference

Papers: Strategy adaptation via learning automata
• W. Gong, A. Fialho, and Z. Cai, “Adaptive strategy selection in differential evolution,” GECCO 2010, 2010, 409-416.
• W. Gong, A. Fialho, Z. Cai, and H. Li, “Adaptive strategy selection in differential evolution for numerical optimization: An empirical study,” Information Sciences, 2011, 181(24): 5364-5386.

Paper & Source Codes: Available online at http://www.escience.cn/people/wygong
Framework of DE-AOS

Objective: Inspired by SaDE (Qin et al., 2009)
Autonomously select the strategy to apply among the available ones, based on its impact in the past.
Reward Assignment

Relative fitness improvement
The relative fitness improvement based credit assignment is adopted (Ong et al., 2004). For a minimization problem, the relative fitness improvement ξi of the i-th solution is defined as follows:

ξi = (fbest / cfi) · |pfi − cfi|    (7)

where i = 1, · · · , NP; fbest is the fitness of the best-so-far solution in the population; and pfi and cfi are the fitness of the target parent and of its offspring, respectively.
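Equation (7) in code, with a hypothetical worked example:

```python
def relative_improvement(f_best, pf, cf):
    """Equation (7): xi_i = (f_best / cf_i) * |pf_i - cf_i|."""
    return (f_best / cf) * abs(pf - cf)

# An offspring (cf = 2.0) that improves on its parent (pf = 3.0), with a
# best-so-far fitness of 1.0, earns a credit of (1.0 / 2.0) * 1.0 = 0.5.
```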
Reward Assignment

Reward assignment: Average reward

Ra(t) = ( Σ_{i=1}^{|Sa|} Sa(i) ) / |Sa|    (8)

where Sa is the set of all relative fitness improvements achieved by applications of strategy a (a = 1, · · · , n) during generation t, and |Sa| is the number of elements in Sa. If |Sa| = 0, then Ra(t) = 0.
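Equation (8) in code, including the |Sa| = 0 case:

```python
def average_reward(S_a):
    """Equation (8): mean credit collected by strategy a; 0 if unused."""
    return sum(S_a) / len(S_a) if S_a else 0.0
```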
Strategy Selection

Probability matching (Goldberg, 1990)
Update the quality of a strategy a:

Qa(t + 1) = Qa(t) + α [Ra(t) − Qa(t)]    (9)

Update the probability of a strategy a:

pa(t + 1) = pmin + (1 − n × pmin) · Qa(t + 1) / Σ_{i=1}^{n} Qi(t + 1)    (10)

where Ra(t) is the reward of strategy a at generation t; Qa(t) is its estimated quality; α ∈ (0, 1] is a user-defined adaptation rate; and pmin ∈ (0, 1) is a user-defined minimal probability, used to ensure that no strategy gets lost (Thierens, 2005).
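Equations (9) and (10) can be sketched as follows. The α and pmin values are illustrative, and the probabilities are computed from the freshly updated qualities; note the probabilities always sum to 1 and never fall below pmin.

```python
def probability_matching(Q, R, alpha=0.8, p_min=0.05):
    """Equations (9)-(10): update qualities Q from rewards R, then derive
    selection probabilities (each bounded below by p_min)."""
    n = len(Q)
    Q = [q + alpha * (r - q) for q, r in zip(Q, R)]          # Eq. (9)
    total = sum(Q)                                           # assumed > 0
    p = [p_min + (1 - n * p_min) * q / total for q in Q]     # Eq. (10)
    return Q, p

Q, p = probability_matching([1.0, 1.0, 1.0, 1.0], [0.4, 0.1, 0.0, 0.0])
```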
Strategy Selection

Adaptive pursuit (Thierens, 2005)
Update the probabilities as follows:

pa∗(t + 1) = pa∗(t) + β [pmax − pa∗(t)]    (11)

and, for all a ≠ a∗:

pa(t + 1) = pa(t) + β [pmin − pa(t)]    (12)

where

a∗ = argmaxa Qa(t + 1),
pmax = 1 − (n − 1) · pmin,

and β ∈ (0, 1] is a user-defined pursuit rate.
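Equations (11) and (12) in code (β and pmin values are illustrative). Note the update preserves Σ pa = 1, since pmax + (n − 1) · pmin = 1.

```python
def adaptive_pursuit(p, Q, beta=0.8, p_min=0.05):
    """Equations (11)-(12): pursue the strategy with the highest quality."""
    n = len(p)
    p_max = 1 - (n - 1) * p_min
    a_star = max(range(n), key=lambda a: Q[a])   # argmax_a Q_a(t + 1)
    return [pa + beta * ((p_max if a == a_star else p_min) - pa)
            for a, pa in enumerate(p)]

p = adaptive_pursuit([0.25, 0.25, 0.25, 0.25], [0.1, 0.9, 0.3, 0.2])
```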
The Pseudo-code of DE-AOS

Algorithm 4: The pseudo-code of DE-AOS
Input: Parameters of DE-AOS
Output: The best final solution
1   Choose the strategy for each solution uniformly;
2   while not terminated do
3       Select the strategy for each solution based on the probabilities;
4       Generate the offspring based on the selected strategy;
5       Evaluate the offspring;
6       Perform selection between offspring and parents;
7       Calculate the reward of each strategy;
8       Update the quality of each strategy;
9       Update the probability/score of each strategy;
References

Paper: Learning guided strategy adaptation
W. Gong, A. Zhou, and Z. Cai, “A multioperator search strategy based on cheap surrogate models for evolutionary optimization,” IEEE Transactions on Evolutionary Computation, 2015, 19(5): 746-758.

Paper & Source Codes: Available online at http://www.escience.cn/people/wygong
General Idea
Operator Selection

How to select an offspring: Cheap surrogate model
The density contribution of each offspring is evaluated by the density estimation function

F̂(y) = (1/µ) Σ_{i=1}^{µ} [ (Ri/µ) · (1/w) · φ( ||y − xi||2 / w ) ]    (13)

Explanations
1. ||x||2 = √( Σ_{j=1}^{n} xj² ) denotes the L2 norm.
2. Ri is the rank of solution xi in the population sorted from best to worst, calculated as

Ri = µ − i + 1
Operator Selection

Explanations (cont.)
3. φ(u) is the kernel function. Different kernels can be used in Equation (13); in this work, two kernels are used:
  ◦ The Epanechnikov kernel:

    φ(u) = (3/4)(1 − u²) · 1{|u|≤1}    (14)

  ◦ The normal kernel:

    φ(u) = (1/√(2π)) exp(−u²/2)    (15)

These two kernels are selected because (a) the Epanechnikov kernel is optimal in the minimum-variance sense, and (b) the normal kernel is often used in pattern recognition.
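Equations (13)-(15) can be sketched as follows. The helper names are illustrative assumptions, and `pop` is assumed to be sorted from best to worst so that Ri = µ − i + 1:

```python
import math

def epanechnikov(u):
    """Equation (14): 0.75 * (1 - u^2) on |u| <= 1, else 0."""
    return 0.75 * (1 - u * u) if abs(u) <= 1 else 0.0

def normal_kernel(u):
    """Equation (15): standard normal density."""
    return math.exp(-u * u / 2) / math.sqrt(2 * math.pi)

def density(y, pop, w, kernel=epanechnikov):
    """Equation (13): rank-weighted kernel density of candidate y.
    pop is sorted best to worst; i is 1-based, so R_i = mu - i + 1."""
    mu = len(pop)
    total = 0.0
    for i, x in enumerate(pop, start=1):
        u = math.dist(y, x) / w                    # ||y - x_i||_2 / w
        total += ((mu - i + 1) / mu) * kernel(u) / w
    return total / mu
```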
Operator Selection

Explanations (cont.)
4. w is the window width and is calculated as

w = √( (1/n) Σ_{j=1}^{n} (aj − bj)² )    (16)

where aj = max_{i=1,···,µ} xi,j and bj = min_{i=1,···,µ} xi,j. Note that, to ensure |u| ≤ 1 in Equation (14), when a new candidate point y is generated by a reproduction operator, if yj < bj or yj > aj, then bj or aj is updated immediately.
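Equation (16) in code (the incremental update of aj and bj mentioned above is omitted for brevity; the function name is an assumption):

```python
import math

def window_width(pop):
    """Equation (16): root mean square of the per-dimension ranges a_j - b_j."""
    n = len(pop[0])
    a = [max(x[j] for x in pop) for j in range(n)]   # per-dimension maxima
    b = [min(x[j] for x in pop) for j in range(n)]   # per-dimension minima
    return math.sqrt(sum((a[j] - b[j]) ** 2 for j in range(n)) / n)

w = window_width([[0.0, 1.0], [2.0, 4.0]])   # ranges are 2 and 3 per dimension
```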
Framework
Conclusions

1. Strategy selection in DE is a difficult task for the problem at hand.
2. Three multi-strategy adaptation techniques for DE were introduced.
3. The performance of the resulting DE variants is promising.
4. Can reinforcement learning also be used for strategy/operator selection?
.
Thank you!A : GONG, WenyinA : School of Computer Science,
China University of Geosciences,Wuhan, 430074, China
E- : [email protected] : h p://www.escience.cn/people/wygong