An Introduction to Metaheuristics
Chun-Wei Tsai
Electrical Engineering, National Cheng Kung University
Page 2
Outline
Optimization Problem and Metaheuristics
Metaheuristic Algorithms
• Hill Climbing (HC)
• Simulated Annealing (SA)
• Tabu Search (TS)
• Genetic Algorithm (GA)
• Ant Colony Optimization (ACO)
• Particle Swarm Optimization (PSO)
Performance Consideration
Conclusion and Discussion
Page 3
Optimization Problem
The optimization problems
– continuous
– discrete
The combinatorial optimization problem (COP) is a kind of discrete optimization problem
Most COPs are NP-hard
Page 4
The problem definition of COP
The combinatorial optimization problem
P = (S, f) can be defined as:
opt f(s), s ∈ S

where opt is either min or max, x = {x1, x2, . . . , xn} is a set of variables, D1, D2, . . . , Dn are the variable domains, and f : D1 × D2 × · · · × Dn → R+ is the objective function to be optimized. In addition, S = {s | s ∈ D1 × D2 × · · · × Dn} is the search space. To solve P, one has to find a solution s ∈ S with an optimal objective function value.
[Figure: a search space with domains D1 × D2 × D3 × D4; solution 1 = (1, 2, 3, 4) and solution 2 = (2, 2, 3, 3) are two candidate solutions.]
Page 5
Combinatorial Optimization Problem and Metaheuristics (1/3)
Complex Problems
– NP-complete problem (Time)
• The optimal solution cannot be found in a reasonable time with limited computing resources.
• E.g., Traveling Salesman Problem
– Large scale problem (Space)
• In general, this kind of problem cannot be handled efficiently with limited memory space.
• E.g., Data Clustering Problem, astronomy, MRI
Page 6
Combinatorial Optimization Problem and Metaheuristics (2/3)
Traveling Salesman Problem (n! possible tours)
– Shortest Routing Path
[Figure: two example routing paths of different lengths.]
Page 7
Combinatorial Optimization Problem and Metaheuristics (3/3)
Metaheuristics
– Metaheuristics work by guessing the right directions for finding the true or near-optimal solution of a complex problem, so that the space searched, and thus the time required, can be significantly reduced.
[Figure: rather than enumerating every candidate solution s1, . . . , s5 in the space D1 × D2, a metaheuristic evaluates only a promising subset on its way to opt.]
Page 8
The Concept of Metaheuristic Algorithms
The word “meta” means “higher level,” while the word “heuristics” means “to find.” (Glover, 1986)
The operators of metaheuristics
– Transition: searches for new solutions (exploration and exploitation).
– Evaluation: evaluates the objective function value of the problem in question.
– Determination: decides the search directions.
[Figure: one iteration of the three operators: transition generates s1 = (1,1) and s2 = (2,2); evaluation yields o1 = 5 and o2 = 3; determination then picks the search direction toward the better solution.]
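To make the three operators concrete, here is a minimal Python sketch of the loop they form. This is an illustration, not code from the slides; the operator functions transit, evaluate, and determine are assumed to be supplied by the user, and maximization is assumed.

def metaheuristic(transit, evaluate, determine, s0, n_iter=100):
    # Generic loop built from the three operators named on this slide.
    s = s0
    best, best_f = s0, evaluate(s0)
    for _ in range(n_iter):
        candidates = transit(s)                          # transition: generate candidate solutions
        scored = [(evaluate(c), c) for c in candidates]  # evaluation: objective values
        s = determine(scored, s)                         # determination: choose the search direction
        f = evaluate(s)
        if f > best_f:                                   # remember the best solution seen so far
            best, best_f = s, f
    return best, best_f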
Page 9
An Example: Bulls and Cows
Check all candidate solutions
Guess → Feedback → Deduction
– Secret number: 9305
– Opponent's try: 1234 → feedback: 0A1B
– Opponent's try: 5678 → feedback: 0A1B
– Deduction: the digits 0 and 9 must both appear in the secret number
(example from Wikipedia)
Each guess, feedback, and deduction cycle corresponds to one round of the transition, evaluation, and determination operators.
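The feedback plays the role of the evaluation operator. A small Python helper (ours, not from the slides) that scores a guess in the usual xAyB form:

def bulls_and_cows(secret: str, guess: str) -> str:
    # A = right digit in the right position; B = right digit, wrong position.
    bulls = sum(s == g for s, g in zip(secret, guess))
    common = sum(min(secret.count(d), guess.count(d)) for d in set(guess))
    return f"{bulls}A{common - bulls}B"

print(bulls_and_cows("9305", "1234"))  # 0A1B, as on the slide
print(bulls_and_cows("9305", "5678"))  # 0A1B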
Page 10
Classification of Metaheuristics (1/2)
The most important way to classify metaheuristics
– population-based vs. single-solution-based (Blum and Roli, 2003)
The single-solution-based algorithms work on a single solution, thus the name
– Hill Climbing
– Simulated Annealing
– Tabu Search
The population-based algorithms work on a population of solutions, thus the name
– Genetic Algorithm
– Ant Colony Optimization
– Particle Swarm Optimization
Page 11
Classification of Metaheuristics (2/2)
Single-solution-based
– Hill Climbing
– Simulated Annealing
– Tabu Search
Population-based
– Genetic Algorithm
Swarm Intelligence
– Ant Colony Optimization
– Particle Swarm Optimization
Page 12
Hill Climbing (1/2)
A greedy algorithm that uses a heuristic adaptation of the objective function to explore the landscape for better solutions
begin
  t ← 0
  Randomly create a string vc
  Repeat
    Evaluate vc
    Select m new strings from the neighborhood of vc
    Let vn be the best of the m new strings
    If f(vc) < f(vn) then vc ← vn
    t ← t + 1
  Until t ≥ N
end
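A runnable Python version of this pseudocode for a bit-string problem; the neighborhood (single-bit flips) and the one-max objective are our illustrative choices, not specified on the slide:

import random

def hill_climbing(f, n_bits=5, m=3, n_iter=100):
    vc = [random.randint(0, 1) for _ in range(n_bits)]
    for _ in range(n_iter):
        neighbors = []
        for _ in range(m):                 # m new strings from the neighborhood of vc
            vn = vc[:]
            i = random.randrange(n_bits)
            vn[i] = 1 - vn[i]              # flip one random bit
            neighbors.append(vn)
        vn = max(neighbors, key=f)         # the best of the m new strings
        if f(vn) > f(vc):                  # accept only strictly better moves
            vc = vn
    return vc

print(hill_climbing(sum))                  # one-max: f counts the 1 bits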
Page 13
Hill Climbing (2/2)
[Figure: starting from a given point, hill climbing can become trapped at a local optimum of the search space instead of reaching the global optimum.]
Page 14
Simulated Annealing (1/3)
Metropolis et al., 1953
Inspired by the annealing process in thermodynamics and metallurgy
To escape local optima, SA accepts worse moves with a controlled probability governed by a temperature parameter
The temperature is gradually lowered so that the search eventually converges
Page 15
Simulated Annealing (2/3)
begin
  t ← 0
  Randomly create a string vc
  Repeat
    Evaluate vc
    Select 3 new strings from the neighborhood of vc
    Let vn be the best of the 3 new strings
    If f(vc) < f(vn) then vc ← vn
    Else if (T > random()) then vc ← vn
    Update T according to the annealing schedule
    t ← t + 1
  Until t ≥ N
end
Example: vc = 01110, f(vc) = 3 (the number of 1 bits)
Neighbors: n1 = 00110, n2 = 11110, n3 = 01100
Best neighbor: vn = 11110, so vc ← vn = 11110
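A runnable Python sketch of SA for the same bit-string setting. Note one deliberate deviation, on our part, from the pseudocode above: worse moves are accepted with the standard Metropolis probability e^(delta/T) rather than the simplified test T > random():

import math
import random

def simulated_annealing(f, n_bits=5, n_iter=200, temp=1.0, alpha=0.95):
    vc = [random.randint(0, 1) for _ in range(n_bits)]
    for _ in range(n_iter):
        vn = vc[:]
        i = random.randrange(n_bits)
        vn[i] = 1 - vn[i]                      # one random neighbor of vc
        delta = f(vn) - f(vc)
        if delta > 0 or random.random() < math.exp(delta / temp):
            vc = vn                            # accept better moves, or worse ones with prob. e^(delta/T)
        temp *= alpha                          # annealing schedule: cool the temperature
    return vc

print(simulated_annealing(sum))                # one-max, as in the trace above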
Page 16
Simulated Annealing (3/3)
[Figure: by occasionally accepting worse moves, SA can climb out of local optima and reach the global optimum.]
Page 17
Tabu Search (1/3)
Fred W. Glover, 1989
To avoid falling into local optima and re-searching the same solutions, recently visited solutions are saved in a list, called the tabu list (a short-term memory whose size is a parameter). When a new solution is generated, it is inserted into the tabu list and stays there until it is replaced by a newer solution in a first-in-first-out manner.
http://spot.colorado.edu/~glover/
Page 18
Tabu Search (2/3)
begin
  t ← 0
  Randomly create a string vc
  Repeat
    Evaluate vc
    Select 3 new strings from the neighborhood of vc that are not in the tabu list
    Let vn be the best of the 3 new strings
    If f(vc) < f(vn) then vc ← vn
    Update tabu list TL
    t ← t + 1
  Until t ≥ N
end
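A runnable Python sketch of the pseudocode, again on bit strings with single-bit-flip neighbors (our choice); the tabu list is the FIFO short-term memory described on the previous slide:

import random
from collections import deque

def tabu_search(f, n_bits=5, m=3, n_iter=100, tabu_size=7):
    vc = tuple(random.randint(0, 1) for _ in range(n_bits))
    tabu = deque([vc], maxlen=tabu_size)       # short-term memory, FIFO replacement
    best = vc
    for _ in range(n_iter):
        flips = [vc[:i] + (1 - vc[i],) + vc[i + 1:] for i in range(n_bits)]
        admissible = [s for s in flips if s not in tabu]   # neighbors not in the tabu list
        if not admissible:
            break                              # the whole neighborhood is tabu
        vc = max(random.sample(admissible, min(m, len(admissible))), key=f)
        tabu.append(vc)                        # update tabu list TL
        if f(vc) > f(best):
            best = vc
    return best

print(tabu_search(sum))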
Page 19
Tabu Search (3/3)
[Figure: the tabu list keeps the search from cycling back into already-visited local optima, guiding it toward the global optimum.]
Page 20
Genetic Algorithm (1/5)
John H. Holland, 1975
The genetic algorithm is one of the most important population-based algorithms.
Schema Theorem
– Short, low-order, above-average schemata receive exponentially increasing trials in subsequent generations of a genetic algorithm.
David E. Goldberg
– http://www.illigal.uiuc.edu/web/technical-reports/
Page 21
Genetic Algorithm (2/5)
Page 22
Genetic Algorithm (3/5)
Page 23
Genetic Algorithm (4/5)
Initialization operators
Selection operators
– Evaluate the fitness function (or the objective function)
– Determine the search direction
Reproduction operators
Crossover operators
– Recombine the solutions to generate new candidate solutions
Mutation operators
– Help the search avoid getting trapped in local optima
Page 24
Genetic Algorithm (5/5)
begin
  t ← 0
  Initialize Pt
  Evaluate Pt
  While (not terminated) do
  begin
    t ← t + 1
    Select Pt from Pt−1
    Apply crossover and mutation to Pt
    Evaluate Pt
  end
end
Example (one-max, fitness = number of 1 bits):
Population: p1 = 01110, p2 = 01110, p3 = 11100, p4 = 00010
Fitness: f1 = 3, f2 = 3, f3 = 3, f4 = 1
Selection probabilities: s1 = 0.3, s2 = 0.3, s3 = 0.3, s4 = 0.1
After selection, p4 is replaced by 11100 (so s4 = 0.3)
Crossover pairs with cut points: p1 = 011|10, p2 = 01|110, p3 = 111|00, p4 = 11|100
Offspring: c1 = 01100, c2 = 01100, c3 = 11110, c4 = 11110
After mutation: c1 = 01101, c2 = 01110, c3 = 11010, c4 = 11111
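The trace above can be reproduced with a compact GA. The Python below is a generic sketch; the one-max objective, roulette-wheel selection, one-point crossover, and bit-flip mutation are our illustrative choices, and the population size is assumed even:

import random

def genetic_algorithm(f, n_bits=5, pop_size=4, n_gen=50, pm=0.05):
    pop = [[random.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    for _ in range(n_gen):
        fit = [f(p) + 1e-9 for p in pop]                     # tiny offset avoids all-zero weights
        pop = random.choices(pop, weights=fit, k=pop_size)   # roulette-wheel selection
        nxt = []
        for a, b in zip(pop[0::2], pop[1::2]):               # one-point crossover on pairs
            cut = random.randrange(1, n_bits)
            nxt += [a[:cut] + b[cut:], b[:cut] + a[cut:]]
        pop = [[1 - g if random.random() < pm else g for g in c] for c in nxt]  # bit-flip mutation
    return max(pop, key=f)

print(genetic_algorithm(sum))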
Page 25
Ant Colony Optimization (1/5)
Marco Dorigo, 1992
Ant colony optimization (ACO) is another well-known population-based metaheuristic, originated by Dorigo from an observation of the behavior of ants
Ants are able to find the shortest path between a food source and the nest by exploiting pheromone information
http://iridia.ulb.ac.be/~mdorigo/HomePageDorigo/
Page 26
Ant Colony Optimization (2/5)
[Figure: ants exploring two paths between nest and food; pheromone accumulates faster on the shorter path, which eventually attracts most of the ants.]
Page 27
Ant Colony Optimization (3/5)
Initialize the pheromone value of each path
While the termination criterion is not met
  Create the ant population s = {s1, s2, . . . , sn}
  Each ant si moves one step to the next city according to the pheromone rule
  Update the pheromone
End
Page 28
Ant Colony Optimization (4/5)
Solution construction
– The probability of choosing sub-solution j as the next sub-solution after i is defined as follows:

p(i, j) = [τij]^α [ηij]^β / Σ l∈Ni [τil]^α [ηil]^β, for j ∈ Ni

where Ni is the set of feasible (or candidate) sub-solutions that can be the next sub-solution of i; τij is the pheromone value between the sub-solutions i and j; ηij is a heuristic value, also called the heuristic information; and α and β weight the pheromone and heuristic terms.
Page 29
Ant Colony Optimization (5/5)
Pheromone update is employed for updating the pheromone value on each edge e(i, j), which is defined as follows:

τij ← (1 − ρ) τij + Σ k=1..m Δτij^k, where Δτij^k = Q / Lk if ant k traversed e(i, j) and 0 otherwise

where m is the number of ants; Lk represents the length of the tour created by ant k; ρ is the pheromone evaporation rate; and Q is a constant.
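A small Python sketch of both rules; the values of alpha, beta, rho, and q are typical parameter choices on our part, and tau and eta are assumed to be nested dictionaries indexed by sub-solutions:

import random

def choose_next(i, feasible, tau, eta, alpha=1.0, beta=2.0):
    # Roulette-wheel choice with weight tau[i][j]^alpha * eta[i][j]^beta.
    weights = [tau[i][j] ** alpha * eta[i][j] ** beta for j in feasible]
    r, acc = random.random() * sum(weights), 0.0
    for j, w in zip(feasible, weights):
        acc += w
        if acc >= r:
            return j
    return feasible[-1]

def update_pheromone(tau, tours, lengths, rho=0.5, q=1.0):
    for i in tau:                             # evaporation on every edge
        for j in tau[i]:
            tau[i][j] *= 1.0 - rho
    for tour, length in zip(tours, lengths):  # each ant k deposits q / L_k along its tour
        for i, j in zip(tour, tour[1:]):
            tau[i][j] += q / length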
Page 30
Particle Swarm Optimization (1/4)
James Kennedy and Russ Eberhart, 1995
Particle swarm optimization originates from an observation of social behavior by Kennedy and Eberhart
Key notions: global best, local best, and trajectory
http://clerc.maurice.free.fr/pso/
Page 31
Particle Swarm Optimization (2/4)
IPSO, http://appshopper.com/education/pso
http://abelhas.luaforge.net/
Page 32
Particle Swarm Optimization (3/4)
Create the initial population (particle positions) s = {s1, s2, . . . , sn} and particle velocities v = {v1, v2, . . . , vn}
While the termination criterion is not met
  Evaluate the fitness value fi of each particle si
  For each particle
    Update the particle position and velocity
    If (fi < f′i) update the local best: f′i = fi
    If (fi < fg) update the global best: fg = fi
End
[Figure: a particle's new motion combines its current motion (trajectory), a pull toward its local (personal) best, and a pull toward the global best.]
Page 33
Particle Swarm Optimization (4/4)
The particle's position and velocity update equations:

velocity: vi^(k+1) = w vi^k + c1 r1 (pbi − si^k) + c2 r2 (gb − si^k)
position: si^(k+1) = si^k + vi^(k+1)

where vi^k is the velocity of particle i at iteration k; w, c1, and c2 are weighting factors; r1 and r2 are uniformly distributed random numbers between 0 and 1; si^k is the current position of particle i at iteration k; pbi is the pbest of particle i; and gb is the gbest of the group.

A larger w favors global search ability; a smaller w favors local search ability.
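One particle update implemented directly from these equations; the parameter values are common defaults, not prescribed by the slide:

import random

def pso_step(s, v, pb, gb, w=0.7, c1=2.0, c2=2.0):
    # New velocity = inertia + pull toward personal best + pull toward global best.
    r1, r2 = random.random(), random.random()
    v_new = [w * v[d] + c1 * r1 * (pb[d] - s[d]) + c2 * r2 * (gb[d] - s[d])
             for d in range(len(s))]
    s_new = [s[d] + v_new[d] for d in range(len(s))]   # position update
    return s_new, v_new

# Example: one particle in 2-D.
print(pso_step([0.0, 0.0], [0.1, 0.1], pb=[1.0, 1.0], gb=[2.0, 2.0]))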
Page 34
Summary
Page 35
Performance Consideration
Enhancing the Quality of the End Result
– How to balance intensification and diversification
– Initialization Method
– Hybrid Method
– Operator Enhancement
Reducing the Running Time
– Parallel Computing
– Hybrid Metaheuristics
– Redesigning the procedure of Metaheuristics
Page 36
Large Scale Problem
Methods for solving large scale problems (Xu and Wunsch, 2008)
– random sampling
– data condensation
– density-based approaches
– grid-based approaches
– divide and conquer
– incremental learning
Page 37
How to Balance Intensification and Diversification
Intensification
– Local Search, 2-opt, n-opt
Diversification
– Keeping Diversity
– Fitness Sharing
– Increase the number of individuals
• More computing resource
– Re-create
Too much intensification → local optimum
Too much diversification → random search
[Figure: intensification concentrates the search around opt, while diversification spreads it across the search space.]
Page 38
Reducing the Running Time
Parallel computing
– Parallelism generally does not reduce the total amount of computation, only the elapsed (wall-clock) time.
– master-slave model, fine-grained model (cellular model) and coarse-grained model (island model) [Cantú-Paz, 1998; Cantú-Paz and Goldberg, 2000]
[Figure: the coarse-grained (island) model: the population is divided into sub-population islands 1–4, which exchange individuals through a migration procedure.]
Page 39
Multiple-Search Genetic Algorithm (1/3)
[Figure: the evolutionary process of MSGA.]
Page 40
Multiple-Search Genetic Algorithm (2/3)
Multiple Search Genetic Algorithm (MSGA) vs. Learnable Evolution Model (LEM)
TSP instance pcb442
[Figure: convergence comparison of Tsai's MSGA and Michalski's LEM.]
Page 41
Multiple-Search Genetic Algorithm (3/3)
MSGA may face the premature convergence problem because the diversity of the population may decrease too quickly.
Each iteration may take more computation time than an iteration of the original algorithm.
Page 42
Pattern Reduction Algorithm
Concept
Assumptions and Limitations
The Proposed Algorithm
– Detection
– Compression
– Removal
Page 43
Concept (1/4)
Our observation shows that many of the computations performed by most, if not all, metaheuristic algorithms during their convergence process are redundant.
(Data courtesy of Su and Chang)
Page 44
Concept (2/4)
Page 45
Concept (3/4)
Page 46
Concept (4/4)
[Figure: a population with chromosomes C1 = 0010 and C2 = 1110. A plain metaheuristic processes all s = 4 sub-solutions of each chromosome at every generation g = 1, 2, . . . , n. With pattern reduction, the sub-solutions that no longer change are compressed after the first generation, so generations g = 2, . . . , n process only s = 2 sub-solutions per chromosome.]
Page 47
Assumptions and Limitations
Assumptions
– Some of the sub-solutions at a certain point in the evolution process will eventually end up being part of the final solution (Schema Theorem, Holland 1975)
– Pattern Reduction (PR) is able to detect these sub-solutions as early as possible during the evolution process of metaheuristics.
Limitations
– The proposed algorithm requires that the sub-solutions be integer or binary encoded (i.e., combinatorial optimization problem).
Page 48
Some Results of PR
Page 49
The Proposed Algorithm
Create the initial solutions P = {p1, p2, . . . , pn}
While termination criterion is not met
Apply the transition, evaluation, and determination operators of the metaheuristics in question to P
/* Begin PR */
Detect the sub-solutions R = {r1, r2, . . . , rm} that have a high probability of never being changed again
Compress the sub-solutions in R into a single pattern, say, c
Remove the sub-solutions in R from P; that is, P = P \ R
P = P ∪ {c}
/* End PR */
End
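A minimal Python rendering of the PR pass; detect and compress stand for the detection and compression modules described on the following slides and must be supplied per problem, and treating P as a flat collection of sub-solutions is a simplification on our part:

def pattern_reduction_pass(P, detect, compress):
    R = detect(P)                      # sub-solutions with a high probability of never changing again
    if not R:
        return P
    c = compress(R)                    # compress all of R into a single pattern c
    P = [p for p in P if p not in R]   # P = P \ R
    P.append(c)                        # P = P ∪ {c}
    return P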
Page 50
Detection
Time-Oriented
– Detect patterns that have not changed in a certain number of iterations (aka static patterns)
Space-Oriented
– Detect sub-solutions that are common at certain loci
Problem-Specific
– E.g., for k-means, we assume that patterns near a centroid are unlikely to be reassigned to another cluster.
Space-oriented example: at iteration T1, P1 = 1352476 and P2 = 7352614 share the sub-solution 352 at the same loci; by Tn it has been compressed into C1, giving P1 = 1 C1 476 and P2 = 7 C1 614.
Time-oriented example: T1 = 1352476, T2 = 7352614, T3 = 7352416, . . . ; the sub-solution 352 stays unchanged across iterations, so it is eventually compressed, giving Tn = 7 C1 416.
Page 51
Problem-Specific
[Figure: problem-specific detection for k-means on P = 12 patterns x1, . . . , x12 partitioned into clusters 0, 1, and 2 with their means. Patterns close to a cluster mean are unlikely to be reassigned, so they are detected, compressed, and removed, reducing the number of patterns from P = 12 to P = 9.]
Page 52
Compression and Removal
The compression module compresses all the sub-solutions to be removed, whereas the removal module removes them once they are compressed.
Lossy Method
– May cause a “small” loss in the quality of the end result.
Page 53
An Example
Page 54
Simulation Environment
The empirical analysis was conducted on an IBM X3400 machine with a 2.0 GHz Xeon CPU and 8 GB of memory, using CentOS 5.0 running Linux kernel 2.6.18.
Enhancement in percentage
– ((Tn − To) / To) × 100
TSP
– Traditional Genetic Algorithm (TGA), HeSEA, Learnable Evolution Model (LEM), Ant Colony System (ACS), Tabu Search (TS), Tabu GA, Simulated Annealing (SA).
Clustering
– Standard k-means (KM), Relational k-means (RKM), Kernel k-means (KKM), Scheme Kernel k-means (SKKM), Triangle Inequality k-means (TKM), Genetic k-means Algorithm (GKA), and Particle Swarm Optimization (PSO)
Page 55
The Results of Traveling Salesman Problem (1/2)
Page 56
The Results of Traveling Salesman Problem (2/2)
Page 57
The Results of Data Clustering Problem (1/3)
Data sets for Clustering
Page 58
The Results of Data Clustering Problem (2/3)
Page 59
The Results of Data Clustering Problem (3/3)
Page 60
Time Complexity
Ideally, the running time of “k-means with PR” is independent of the number of iterations.
In reality, however, our experimental result shows that setting the removal bound to 80% gives the best result.
The time complexity of k-means is O(nkld), where n is the number of patterns, k the number of clusters, l the number of iterations, and d the number of dimensions.
Page 61
Conclusion and Discussion (1/2)
In this presentation, we introduced the
– Combinatorial Optimization Problem
– Several Metaheuristic Algorithms
– The Performance Enhancement Method
• MSGA, PREGA and so on.
Future Work
– Developing an efficient algorithm that will not only eliminate all the redundant computations but also guarantee that the quality of the end result of “metaheuristics with PR” is preserved or even enhanced with respect to that of the metaheuristics by themselves.
– Applying the proposed framework to other optimization problems and metaheuristics.
Page 62
Conclusion and Discussion (2/2)
Future Work
– Applying the proposed framework to continuous optimization problems.
– Developing more efficient detection, compression, and removal methods.
Discussion
• E-mail: cwtsai87@gmail.com
• MSN: cwtsai87@yahoo.com.tw
• Web site: http://cwtsai.ee.ncku.edu.tw/
• Chun-Wei Tsai is currently a postdoctoral fellow at the Department of Electrical Engineering, National Cheng Kung University.
• His research interests include evolutionary computation, web information retrieval, e-Learning, and data mining.
Page 64
Framework for metaheuristics
Create the initial solutions s = {s1, s2, . . . , sn}
While the termination criterion is not met
  Transit s to s′
  Evaluate the objective function value of each solution s′i in s′
  Determine s
End

Example: s1 = 01100 and s2 = 10011 with f1 = 2 and f2 = 3 are transited to s′1 = 01110 and s′2 = 10001.
[Figure: the selection probabilities of the two solutions evolve over generations, plotted as fitness versus iteration: g = 1: p1 = 2/5, p2 = 3/5; g = 2: p1 = 2/6, p2 = 4/6; g = 3: p1 = 3/7, p2 = 4/7; . . . ; g = n: p1 = 4/9, p2 = 5/9.]
Page 65
Initialization Methods (1/4)
In general, the initial solutions of metaheuristics are randomly generated, and it may take a tremendous number of iterations, and thus a great deal of time, to converge. Common remedies (a sketch of the greedy method follows the list):
– Sampling
– Dimension reduction
– Greedy method
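As an illustration of the greedy method, a nearest-neighbor construction for TSP can seed a metaheuristic with a reasonable, non-random tour; this Python sketch is ours, not from the slides:

import math

def nearest_neighbor_tour(cities, start=0):
    # Always visit the closest not-yet-visited city.
    unvisited = set(range(len(cities))) - {start}
    tour = [start]
    while unvisited:
        last = cities[tour[-1]]
        nxt = min(unvisited, key=lambda j: math.dist(last, cities[j]))
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour

print(nearest_neighbor_tour([(0, 0), (3, 0), (0, 4), (3, 4)]))  # e.g., [0, 1, 3, 2]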
Page 66
Initialization Methods (2/4)
Page 67
Initialization Methods (3/4)
Page 68
Initialization Methods (4/4)
[Figure: fitness versus iteration (average and total) for refined versus random initialization.]
The refinement initialization method can provide a more stable solution and enhance the performance of metaheuristics.
The risk of using refinement initialization methods is that they may cause metaheuristics to fall into local optima.
Page 69
Local Search Methods (1/2)
The local search methods play an important role in fine-tuning the solution found by metaheuristics.
Example: the neighborhood of the permutation 1 3 8 5 6 7 4 9 2 0 obtained by swapping one chosen element (here the 3) with each of the other positions.
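A generator for that swap neighborhood in Python; the choice of which position to swap is a parameter (the slide's example swaps the second element):

def swap_neighborhood(perm, pos=1):
    # All permutations obtained by swapping perm[pos] with every other position.
    out = []
    for j in range(len(perm)):
        if j != pos:
            p = list(perm)
            p[pos], p[j] = p[j], p[pos]
            out.append(p)
    return out

for n in swap_neighborhood([1, 3, 8, 5, 6, 7, 4, 9, 2, 0]):
    print(n)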
Page 70
Local Search Methods (2/2)
We have to take into account how to balance metaheuristics and local search: a longer computation time may provide a better result, but there is no guarantee.
Page 71
Hybrid Methods
A hybrid method combines the pros of different metaheuristic algorithms to enhance the overall performance
– E.g., GA plays the role of global search while SA plays the role of local search.
– Again, we may have to balance the performance of the hybrid algorithm.
Page 72
Data Clustering Problem
– Partitioning the n patterns into k groups or clusters based on some similarity metric.
[Figure: vector quantization with k-means: image1 is encoded through a codebook to reconstruct image2.]
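For reference, a bare-bones k-means in Python; random initialization and Euclidean distance are the usual choices, and an empty cluster keeps its old centroid:

import math
import random

def k_means(points, k, n_iter=20):
    centroids = random.sample(points, k)
    for _ in range(n_iter):
        clusters = [[] for _ in range(k)]
        for p in points:                     # assign each pattern to its nearest centroid
            i = min(range(k), key=lambda c: math.dist(p, centroids[c]))
            clusters[i].append(p)
        centroids = [tuple(sum(x) / len(cl) for x in zip(*cl)) if cl else centroids[i]
                     for i, cl in enumerate(clusters)]   # recompute each mean
    return centroids

print(k_means([(0, 0), (0, 1), (5, 5), (6, 5)], k=2))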