Automatic Algorithm Configuration: Methods, Applications, and Perspectives
Thomas Stützle
IRIDIA, CoDE, Université Libre de Bruxelles (ULB), Brussels, Belgium
iridia.ulb.ac.be/~stuetzle
Outline
1. Context
2. Automatic algorithm configuration
3. Automatic configuration methods
4. Applications
5. Concluding remarks
WCCI 2016, Vancouver, Canada 2
Optimization problems arise everywhere!
Most such problems are computationally very hard (NP-hard!)
The algorithmic solution of hard optimization problems is one of the OR/CS success stories!
- Exact (systematic search) algorithms
  - Branch&Bound, Branch&Cut, constraint programming, ...
  - guarantees on optimality but often time/memory consuming
- Approximate algorithms
  - heuristics, local search, metaheuristics, hyperheuristics, ...
  - rarely provable guarantees but often fast and accurate

Much active research on hybrids between exact and approximate algorithms!
Design choices and parameters everywhere
Today's high-performance optimizers involve a large number of design choices and parameter settings
- exact solvers
  - design choices include alternative models, pre-processing, variable selection, value selection, branching rules, ...
  - many design choices have associated numerical parameters
  - example: the SCIP 3.0.1 solver (fastest non-commercial MIP solver) has more than 200 relevant parameters that influence the solver's search mechanism
- approximate algorithms
  - design choices include solution representation, operators, neighborhoods, pre-processing, strategies, ...
  - many design choices have associated numerical parameters
  - example: multi-objective ACO algorithms with 22 parameters (plus several still hidden ones)
Example: Ant Colony Optimization
ACO, Probabilistic solution construction

[Figure: an ant at node i chooses probabilistically among the feasible moves to nodes j, k, or g, guided by the pheromone trails τij and the heuristic information ηij]
Applying Ant Colony Optimization
ACO design choices and numerical parameters
- solution construction
  - choice of constructive procedure
  - choice of pheromone model
  - choice of heuristic information
  - numerical parameters
    - α, β influence the weight of pheromone and heuristic information, respectively
    - q0 determines greediness of construction procedure
    - m, the number of ants
- pheromone update
  - which ants deposit pheromone and how much?
  - numerical parameters
    - ρ: evaporation rate
    - τ0: initial pheromone level
- local search
- ... many more ...
Parameter types
- categorical parameters (design)
  - choice of constructive procedure, choice of recombination operator, choice of branching strategy, ...
- ordinal parameters (design)
  - neighborhoods, lower bounds, ...
- numerical parameters (tuning, calibration)
  - integer or real-valued parameters
  - weighting factors, population sizes, temperature, hidden constants, ...
  - numerical parameters may be conditional to specific values of categorical or ordinal parameters

Design and configuration of algorithms involves setting categorical, ordinal, and numerical parameters
Designing optimization algorithms
Challenges
- many alternative design choices
- nonlinear interactions among algorithm components and/or parameters
- performance assessment is difficult

Traditional design approach
- trial-and-error design guided by expertise/intuition
- prone to over-generalizations, implicit independence assumptions, limited exploration of design alternatives

Can we make this approach more principled and automatic?
Towards automatic algorithm configuration
Automated algorithm configuration
- apply powerful search techniques to design algorithms
- use computation power to explore design spaces
- assist algorithm designer in the design process
- free human creativity for higher-level tasks
Offline configuration and online parameter control
Offline configuration
- configure algorithm before deploying it
- configuration on training instances
- related to algorithm design

Online parameter control
- adapt parameter settings while solving an instance
- typically limited to a set of known crucial algorithm parameters
- related to parameter calibration

Offline configuration techniques can be helpful to configure (online) parameter control strategies
Offline configuration
Configurators
Approaches to configuration
- experimental design techniques
  - e.g. CALIBRA [Adenso-Díaz, Laguna, 2006], [Ridge & Kudenko, 2007], [Coy et al., 2001], [Ruiz, Stützle, 2005]
- numerical optimization techniques
  - e.g. MADS [Audet & Orban, 2006], various [Yuan et al., 2012]
- heuristic search methods
  - e.g. meta-GA [Grefenstette, 1985], ParamILS [Hutter et al., 2007, 2009], gender-based GA [Ansótegui et al., 2009], linear GP [Oltean, 2005], REVAC(++) [Eiben et al., 2007, 2009, 2010], ...
- model-based optimization approaches
  - e.g. SPO [Bartz-Beielstein et al., 2005, 2006, ...], SMAC [Hutter et al., 2011, ...], GGA++ [Ansótegui, 2015]
- sequential statistical testing
  - e.g. F-race, iterated F-race [Birattari et al., 2002, 2007, ...]

General, domain-independent methods required: (i) applicable to all variable types, (ii) multiple training instances, (iii) high performance, (iv) scalable
The racing approach
[Figure: a set of candidate configurations Θ is evaluated on a growing stream of instances i; the set of surviving candidates narrows over time]

- start with a set of initial candidates
- consider a stream of instances
- sequentially evaluate candidates
- discard inferior candidates as sufficient evidence is gathered against them
- ... repeat until a winner is selected or until computation time expires
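The elimination loop can be sketched as follows; the `evaluate` function, the warm-up length, and the crude mean-plus-spread elimination rule are placeholder choices (F-Race replaces the latter with proper statistical tests):

```python
import random
import statistics

def race(configurations, instances, evaluate, min_survivors=1, warmup=5):
    """Sequentially evaluate candidate configurations on a stream of
    instances, discarding clearly inferior candidates along the way."""
    results = {c: [] for c in configurations}
    alive = list(configurations)
    for step, instance in enumerate(instances):
        for c in alive:
            results[c].append(evaluate(c, instance))
        if step + 1 < warmup or len(alive) <= min_survivors:
            continue  # gather some evidence before discarding anyone
        means = {c: statistics.mean(results[c]) for c in alive}
        best = min(means, key=means.get)
        spread = statistics.stdev(results[best]) or 1e-9
        # placeholder elimination rule: drop candidates far worse than the best
        alive = [c for c in alive if means[c] <= means[best] + 2 * spread]
    return min(alive, key=lambda c: statistics.mean(results[c]))

# toy scenario: configuration c has true cost (c - 3)^2 plus noise
winner = race(range(6), range(30), lambda c, i: (c - 3) ** 2 + random.random())
```

Because inferior candidates stop being evaluated early, the evaluation budget concentrates on the promising ones.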
The F-Race algorithm
Statistical testing
1. family-wise tests for differences among configurations
   - Friedman two-way analysis of variance by ranks
2. if Friedman rejects H0, perform pairwise comparisons to the best configuration
   - apply Friedman post-test
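Step 1 can be reproduced in a few lines; the cost values below are made up for illustration (ties are ignored for simplicity):

```python
def friedman_statistic(costs):
    """Friedman two-way analysis of variance by ranks:
    costs[c][i] = cost of configuration c on instance i (blocked design)."""
    k = len(costs)      # number of configurations
    n = len(costs[0])   # number of instances (blocks)
    rank_sums = [0.0] * k
    for i in range(n):
        order = sorted(range(k), key=lambda c: costs[c][i])
        for rank, c in enumerate(order, start=1):
            rank_sums[c] += rank
    return 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)

# illustrative costs of three configurations on six instances
costs = [
    [12.1, 10.3, 11.8, 13.0, 12.5, 11.1],
    [14.2, 12.9, 13.5, 15.1, 14.0, 13.3],
    [12.0, 10.5, 11.6, 13.2, 12.4, 11.0],
]
stat = friedman_statistic(costs)
# chi-squared critical value at alpha = 0.05 with k - 1 = 2 degrees of freedom
reject_h0 = stat > 5.99  # True here: proceed to pairwise post-tests
```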
Some applications of F-race
International timetabling competition
- winning algorithm configured by F-race [Chiarandini et al., 2006]
- interactive injection of new configurations

Vehicle routing and scheduling problem
- first industrial application
- improved commercialized algorithm [Becker et al., 2005]

F-race in stochastic optimization
- evaluate "neighbours" using F-race (solution cost is a random variable!)
- good performance if variance of solution cost is high [Birattari et al., 2006]
Iterated race
Racing is a method for the selection of the best configuration and is independent of the way the set of configurations is sampled

Iterated race
    sample configurations from initial distribution
    while not terminate():
        apply race
        modify sampling distribution
        sample configurations
The irace package: sampling
[Figure: discrete sampling distributions over parameter values x1, x2, x3, updated from one iteration to the next]
Iterated racing: sampling distributions
Numerical parameter Xd ∈ [x_d, x̄_d] ⇒ truncated normal distribution

    N(µ_d^z, σ_d^i) restricted to [x_d, x̄_d]
    µ_d^z = value of parameter d in elite configuration z
    σ_d^i = decreases with the number of iterations

Categorical parameter Xd ∈ {x1, x2, ..., x_nd} ⇒ discrete probability distribution

    Pr_z{Xd = xj}:  x1: 0.1, x2: 0.3, ..., x_nd: 0.4

- updated by increasing the probability of the parameter value in the elite configuration
- other probabilities are reduced
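The two sampling mechanisms can be sketched as follows; the fixed update rate is a simplification (irace shrinks σ and shifts probabilities as a function of the iteration counter):

```python
import random

def sample_numerical(mu, sigma, lo, hi, rng=random):
    """Sample from a normal distribution truncated to [lo, hi]
    by simple rejection (adequate for mild truncation)."""
    while True:
        x = rng.gauss(mu, sigma)
        if lo <= x <= hi:
            return x

def update_categorical(probs, elite_value, rate=0.1):
    """Move probability mass towards the value used by an elite configuration."""
    new = {v: (1.0 - rate) * p for v, p in probs.items()}
    new[elite_value] += rate
    return new

probs = {"as": 0.25, "mmas": 0.25, "eas": 0.25, "ras": 0.25}
probs = update_categorical(probs, "mmas")   # mass shifts towards "mmas"
rho = sample_numerical(mu=0.5, sigma=0.2, lo=0.01, hi=1.0)
```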
The irace package
Manuel López-Ibáñez, Jérémie Dubois-Lacoste, Thomas Stützle, and Mauro Birattari. The irace package, Iterated Race for Automatic Algorithm Configuration. Technical Report TR/IRIDIA/2011-004, IRIDIA, Université Libre de Bruxelles, Belgium, 2011.

The irace Package: User Guide, 2016, Technical Report TR/IRIDIA/2016-004

http://iridia.ulb.ac.be/irace

- implementation of Iterated Racing in R
  - Goal 1: flexible
  - Goal 2: easy to use
- but no knowledge of R necessary
- parallel evaluation (MPI, multi-core, grid engine, ...)
- initial candidates
- forbidden configurations

irace has shown to be effective for configuration tasks with several hundreds of variables
The irace package: usage
[Diagram: irace reads the parameter space, the training instances, and the configuration scenario; it calls targetRunner with a configuration θ and an instance i, and targetRunner returns the cost c(θ, i)]
Example application of irace: ACOTSP
- Thomas Stützle. ACOTSP: A software package of various ant colony optimization algorithms applied to the symmetric traveling salesman problem, 2002. http://www.aco-metaheuristic.org/aco-code/
- ACOTSP: ant colony optimization algorithms for the TSP

Command-line program:
$ ./acotsp -i instance -t 20 --mmas --ants 10 --rho 0.95 ...

Goal: find best parameter settings of ACOTSP for solving random Euclidean TSP instances with n ∈ [500, 5000] within 20 CPU seconds
Example application of irace: ACOTSP
$ cat parameters-acotsp.txt
# name switch type values conditions
algorithm "--" c (as,mmas,eas,ras,acs)
localsearch "--localsearch " c (0, 1, 2, 3)
alpha "--alpha " r (0.00, 5.00)
beta "--beta " r (0.00, 10.00)
rho "--rho " r (0.01, 1.00)
ants "--ants " i (5, 100)
q0 "--q0 " r (0.0, 1.0) | algorithm == "acs"
rasrank "--rasranks " i (1, 100) | algorithm == "ras"
elitistants "--elitistants " i (1, 750) | algorithm == "eas"
nnls "--nnls " i (5, 50) | localsearch %in% c(1,2,3)
dlb "--dlb " c (0, 1) | localsearch %in% c(1,2,3)
Example application of irace: ACOTSP
$ cat targetRunner

#!/bin/bash
INSTANCE=$1
CANDIDATENUM=$2
shift 2    # the remaining arguments are the candidate's parameters
CAND_PARAMS=$*
STDOUT="c${CANDIDATENUM}.stdout"
FIXED_PARAMS=" --time 1 --tries 1 --quiet "
acotsp $FIXED_PARAMS -i $INSTANCE $CAND_PARAMS 1> $STDOUT
COST=$(grep -oE 'Best [-+0-9.e]+' $STDOUT | cut -d' ' -f2)
echo "${COST}"
exit 0
Example application of irace: ACOTSP
$ ls Instances/
$ cat tune-conf
instanceDir = "./Instances"
maxExperiments = 1000
digits = 2
✔ Good to go:

$ irace --parallel 2 --debug-level 1

- --parallel to execute in parallel
- --debug-level to see what irace is executing
Example application of irace: ACOTSP and more
- Initial configurations:

$ cat default.txt
algorithm localsearch alpha beta rho ants nnls dlb q0
as 0 1.0 1.0 0.95 10 NA NA NA

- Logical expressions that forbid configurations:

$ cat forbidden.txt
(alpha == 0.0) & (beta == 0.0)
Other configurators: ParamILS, SMAC
ParamILS Framework
ParamILS is an iterated local search method that works in the parameter space

[Figure: ILS schematic over the solution space S: a perturbation moves the current local optimum s* to s', from which local search reaches a new local optimum s*']
Main design choices for ParamILS
Parameter encoding
- only categorical parameters; numerical parameters need to be discretized

Initialization
- select best configuration among default and several random configurations

Local search
- 1-exchange neighborhood search in random order

Perturbation
- change several randomly chosen parameters

Acceptance criterion
- always select the better configuration
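A sketch of the 1-exchange local search step, with configurations as dicts; `cost` stands in for actually running the target algorithm, and the toy quadratic objective is made up for illustration:

```python
import random

def one_exchange_ls(config, domains, cost, rng=random):
    """First-improvement search in the 1-exchange neighbourhood:
    change one parameter at a time, scanning parameters in random order."""
    improved = True
    while improved:
        improved = False
        params = list(domains)
        rng.shuffle(params)
        for p in params:
            for v in domains[p]:
                if v == config[p]:
                    continue
                candidate = dict(config, **{p: v})
                if cost(candidate) < cost(config):
                    config = candidate
                    improved = True
                    break  # first improvement: restart from the new configuration
            if improved:
                break
    return config

# toy example: quadratic cost with optimum at alpha = 2, beta = 3
domains = {"alpha": [1, 2, 3], "beta": [1, 2, 3, 4]}
cost = lambda c: (c["alpha"] - 2) ** 2 + (c["beta"] - 3) ** 2
result = one_exchange_ls({"alpha": 1, "beta": 1}, domains, cost)
```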
Main design choices for ParamILS
Evaluation of incumbent
- BasicILS: each configuration is evaluated on the same number of N instances
- FocusedILS: the number of instances on which the best configuration is evaluated increases at run time (intensification)

Adaptive capping
- mechanism for early pruning of the evaluation of poor candidate configurations
- particularly effective when configuring algorithms for minimization of computation time
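For runtime minimization, adaptive capping can be sketched as a bound on the challenger's accumulated runtime (a simplified sketch; the runtimes below are illustrative):

```python
def capped_total_time(runtimes, incumbent_total):
    """Evaluate a challenger instance by instance, but stop as soon as its
    accumulated runtime reaches the incumbent's total: it cannot win anymore,
    so the remaining runs are pruned."""
    total = 0.0
    for i, t in enumerate(runtimes):
        budget_left = incumbent_total - total
        if budget_left <= 0:
            return total, i, False      # pruned before run i
        total += min(t, budget_left)    # run i is capped at the remaining budget
    return total, len(runtimes), total < incumbent_total

# incumbent needs 10 s over the instance set; this challenger is pruned after two runs
total, runs_done, wins = capped_total_time([6.0, 5.0, 4.0], incumbent_total=10.0)
```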
ParamILS: BasicILS vs. FocusedILS
[Figure: median run length of the configured SAPS solver vs. configuration CPU time for BasicILS(1), BasicILS(10), BasicILS(100), and FocusedILS]

example: comparison of BasicILS and FocusedILS for configuring the SAPS solver for SAT-encoded quasi-group with holes, taken from [Hutter et al., 2007]
Model-based Approaches (SPOT, SMAC)
Idea: use surrogate model M to predict performance of configurations

Algorithmic scheme
    generate and evaluate initial set of configurations Θ0
    choose best-so-far configuration θ* ∈ Θ0
    while tuning budget available:
        learn surrogate model M : Θ → R
        use model M to generate promising configurations Θp
        evaluate configurations in Θp
        Θ0 := Θ0 ∪ Θp
        update θ* ∈ Θ0
    end
    output: θ*
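A minimal, runnable instance of this scheme for a single numerical parameter; the 1-nearest-neighbour predictor is a stand-in for the random forests used by SMAC, and the quadratic target is illustrative:

```python
import random

def smbo(cost, lo, hi, budget=40, n_init=5, n_candidates=50, rng=random):
    """Surrogate model-based search over the interval [lo, hi]."""
    xs = [rng.uniform(lo, hi) for _ in range(n_init)]
    history = [(x, cost(x)) for x in xs]          # evaluated (config, cost) pairs

    def predict(x):
        # surrogate M: predicted cost = cost of the nearest evaluated point
        return min(history, key=lambda h: abs(h[0] - x))[1]

    for _ in range(budget - n_init):
        candidates = [rng.uniform(lo, hi) for _ in range(n_candidates)]
        x = min(candidates, key=predict)          # most promising under the model
        history.append((x, cost(x)))              # evaluate with the real target
    return min(history, key=lambda h: h[1])[0]    # best-so-far configuration

random.seed(0)
best = smbo(lambda x: (x - 2.0) ** 2, 0.0, 5.0)   # true optimum at x = 2
```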
Sequential model-based algorithm configuration (SMAC) [Hutter et al., 2011]

SMAC extends surrogate model-based configuration to complex algorithm configuration tasks and across multiple instances

Main design decisions
- random forests for M ⇒ categorical & numerical parameters
- aggregate predictions from Mi for each instance i
- local search on the surrogate model surface (expected improvement criterion, EIC) ⇒ promising configurations
- instance features ⇒ improve performance predictions
- intensification mechanism (inspired by FocusedILS)
- further extensions ⇒ capping
Applications
Applications of automatic configuration tools
- configuration of "black-box" solvers
  - e.g. mixed integer programming solvers, continuous optimizers
- supporting tool in algorithm engineering
  - e.g. metaheuristics for probabilistic TSP, re-engineering PSO
- bottom-up generation of heuristic algorithms
  - e.g. heuristics for SAT, FSP, etc.; metaheuristic frameworks
- design of configurable algorithm frameworks
  - e.g. SATenstein, MOACO, UACOR, MOEAs
Example, configuration of “black-box” solvers
Mixed-integer programming solvers
Mixed integer programming (MIP) solvers
[Hutter, Hoos, Leyton-Brown, Stützle, 2009; Hutter, Hoos, Leyton-Brown, 2010]

- powerful commercial (e.g. CPLEX) and non-commercial (e.g. SCIP) solvers available
- large number of parameters (tens to hundreds)
- default configurations not necessarily best for specific problems

Benchmark set   Default   Configured          Speedup
Regions200      72        10.5 (11.4 ± 0.9)   6.8
Conic.SCH       5.37      2.14 (2.4 ± 0.29)   2.51
CLS             712       23.4 (327 ± 860)    30.43
MIK             64.8      1.19 (301 ± 948)    54.54
QP              969       525 (827 ± 306)     1.85

FocusedILS tuning CPLEX, 10 runs, 2 CPU days, 63 parameters
Tune known algorithms; example IPOP-CMAES

- IPOP-CMAES is a state-of-the-art continuous optimizer
- configuration done on benchmark problems (instances) distinct from the test set (CEC'05 benchmark function set) using seven numerical parameters

[Figure: per-function average errors of the tuned variant, iCMAES-tsc, vs. the default-parameter variant, iCMAES-dp, on the CEC'05 functions; 30D, 100 runs: 8 wins, 4 losses, 13 draws; 50D, 100 runs: 13 wins, 4 losses, 8 draws]
Example, supporting tool in algorithm engineering
Tuning in-the-loop (re)design of continuous optimizers
Tuning in-the-loop (re)design of continuous optimizers
[Montes de Oca, Aydın, Stützle, 2011]

- re-design of an incremental PSO algorithm for large-scale continuous optimization
- six steps
  - local search, call and control strategy of LS, PSO rules, bound constraint handling, stagnation handling, restarts
- iterated F-race used at each step to configure up to 10 parameters
- configuration done on 19 functions of dimension 10
- scaling examined until dimension 1000

configuration results can help the designer to gain insight useful for further development
Tuning in-the-loop (re)design of continuous optimizers
[Montes de Oca, Aydın, Stützle, 2011]

[Figure: boxplots of objective function values (log scale, 1e-14 to 1e+02) for the default and tuned variants at Stage I and Stage VI]

comparison on 100D, median values across 19 functions
Example, bottom-up generation of algorithms
Automatic design of hybrid SLS algorithms
Automatic design of hybrid SLS algorithms
[Marmion, Mascia, López-Ibáñez, Stützle, 2013]

Approach
- decompose single-point SLS methods into components
- derive generalized metaheuristic structure
- component-wise implementation of metaheuristic part

Implementation
- represent possible algorithm compositions by a grammar
- instantiate the grammar using a parametric representation
- allows use of standard automatic configuration tools
- shows good performance when compared to, e.g., grammatical evolution [Mascia, López-Ibáñez, Dubois-Lacoste, Stützle, 2014]
General Local Search Structure: ILS
    s0 := initSolution
    s* := ls(s0)
    repeat
        s' := perturb(s*, history)
        s*' := ls(s')
        s* := accept(s*, s*', history)
    until termination criterion met

- many SLS methods instantiable from this structure
- abilities
  - hybridization
  - recursion
  - problem-specific implementation at low level
  - separation of generic and problem-specific components
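A runnable instance of this template on a toy problem; the rugged integer objective and the +/-1 neighbourhood are made up for illustration:

```python
import random

# a hypothetical rugged objective on the integers 0..100
def cost(x):
    return (x - 57) ** 2 + 10 * (x % 7)

def local_search(x):
    """ls: steepest descent over the +/-1 neighbourhood."""
    while True:
        best = min((x - 1, x, x + 1),
                   key=lambda y: cost(y) if 0 <= y <= 100 else float("inf"))
        if best == x:
            return x
        x = best

def ils(iterations=200, rng=random):
    s_star = local_search(rng.randint(0, 100))   # s0 := initSolution; s* := ls(s0)
    for _ in range(iterations):
        s0 = max(0, min(100, s_star + rng.randint(-10, 10)))  # s' := perturb(s*)
        s_prime = local_search(s0)               # s*' := ls(s')
        if cost(s_prime) <= cost(s_star):        # accept: keep better-or-equal
            s_star = s_prime
    return s_star

random.seed(42)
best = ils()  # escapes the local optima of the sawtooth landscape
```

The perturbation jumps far enough (±10) to leave the basin of a local optimum, which pure descent cannot do.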
Example instantiations of some metaheuristics
        perturb              ls    accept
SA      random move          ∅     Metropolis
PII     random move          ∅     Metropolis, fixed T
TS      ∅                    TS    ∅
ILS     any                  any   any
IG      destruct/construct   any   any
GRASP   rand. greedy sol.    any   ∅
Grammar
<algorithm> ::= <initialization> <ils>
<initialization> ::= random | <pbs_initialization>
<ils> ::= ILS(<perturb>, <ls>, <accept>, <stop>)
<perturb> ::= none | <initialization> | <pbs_perturb>
<ls> ::= <ils> | <descent> | <sa> | <rii> | <pii> | <vns> | <ig> | <pbs_ls>
<accept> ::= alwaysAccept | improvingAccept <comparator>
| prob(<value_prob_accept>) | probRandom | <metropolis>
| threshold(<value_threshold_accept>) | <pbs_accept>
<descent> ::= bestDescent(<comparator>, <stop>)
| firstImprDescent(<comparator>, <stop>)
<sa> ::= ILS(<pbs_move>, no_ls, <metropolis>, <stop>)
<rii> ::= ILS(<pbs_move>, no_ls, probRandom, <stop>)
<pii> ::= ILS(<pbs_move>, no_ls, prob(<value_prob_accept>), <stop>)
<vns> ::= ILS(<pbs_variable_move>, firstImprDescent(improvingStrictly),
improvingAccept(improvingStrictly), <stop>)
<ig> ::= ILS(<deconst-construct_perturb>, <ls>, <accept>, <stop>)
<comparator> ::= improvingStrictly | improving
<value_prob_accept> ::= [0, 1]
<value_threshold_accept> ::= [0, 1]
<metropolis> ::= metropolisAccept(<init_temperature>, <final_temperature>,
<decreasing_temperature_ratio>, <span>)
<init_temperature> ::= {1, 2,..., 10000}
<final_temperature> ::= {1, 2,..., 100}
<decreasing_temperature_ratio> ::= [0, 1]
<span> ::= {1, 2,..., 10000}
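The key implementation idea, one categorical parameter per nonterminal, conditional on its parent production being selected, can be sketched on a tiny, abridged fragment of the grammar above:

```python
# toy fragment of the grammar: one categorical "parameter" per nonterminal
# selects its production (production lists abridged for illustration)
grammar = {
    "<accept>": ["alwaysAccept", "improvingAccept", "<metropolis>"],
    "<metropolis>": ["metropolisAccept(<init_temperature>)"],
    "<init_temperature>": ["100", "1000", "10000"],
}

def instantiate(symbol, params):
    """Expand `symbol` using the production indices recorded in `params`."""
    if symbol not in grammar:
        return symbol  # terminal
    out = grammar[symbol][params[symbol]]
    for nt in grammar:  # expand any embedded nonterminals recursively
        if nt in out:
            out = out.replace(nt, instantiate(nt, params))
    return out

# the configurator tunes `params` like ordinary categorical parameters,
# each conditional on its parent production having been selected
params = {"<accept>": 2, "<metropolis>": 0, "<init_temperature>": 1}
derived = instantiate("<accept>", params)  # "metropolisAccept(1000)"
```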
System overview
Flow-shop problem with weighted tardiness
- Automatic configuration:
  - 1, 2, or 3 levels of recursion (r)
  - 80, 127, and 174 parameters, respectively
  - budget: r × 10 000 trials, each of 30 seconds

[Figure: boxplots of fitness values on six instance groups for the automatically generated hybrids ALS1, ALS2, ALS3 and the state-of-the-art iterated greedy algorithm (soa-IG)]

results are competitive or superior to the state-of-the-art algorithm
Summary
Contributions
- approach to automate design and analysis of (hybrid) metaheuristics
- not a silver bullet: it needs the right components, especially low-level problem-specific ones
- better or equal performance to state-of-the-art for PFSP-WT, UBQP, TSP-TW
- directly extensible for unbiased comparisons of metaheuristics

Future work
- extensions to other methods and templates
- dealing with complexity of hybrid algorithms
- increase generality to tackle full problem classes
Example, design configurable algorithm framework
Multi-objective ant colony optimization (MOACO)
Multi-objective Optimization
- many real-life problems are multi-objective
- no a priori knowledge ⇒ Pareto optimality
MOACO framework
[López-Ibáñez, Stützle, 2012]

- algorithm framework for multi-objective ACO algorithms
- can instantiate MOACO algorithms from the literature
- 10 parameters control the multi-objective part
- 12 parameters control the underlying pure "ACO" part

Example of a top-down approach to algorithm configuration
MOACO framework
irace + hypervolume = automatic configuration of multi-objective solvers!
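For two objectives the hypervolume reduces to a simple sweep; a sketch assuming both objectives are minimized (the front below is illustrative):

```python
def hypervolume_2d(points, ref):
    """Hypervolume of a set of bi-objective points w.r.t. reference point
    `ref`, assuming both objectives are minimized."""
    # keep points that dominate the reference, sorted by the first objective
    pts = sorted(p for p in points if p[0] < ref[0] and p[1] < ref[1])
    hv, prev_y = 0.0, ref[1]
    for x, y in pts:
        if y < prev_y:  # nondominated step of the sweep
            hv += (ref[0] - x) * (prev_y - y)
            prev_y = y
    return hv

# a small approximation front; irace can rank configurations by this scalar
front = [(1.0, 4.0), (2.0, 2.0), (3.0, 1.0)]
hv = hypervolume_2d(front, ref=(5.0, 5.0))  # 12.0
```

Replacing the per-run cost c(θ, i) by the (negated) hypervolume of the returned front turns any single-objective configurator into a multi-objective one.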
Automatic configuration of multi-objective ACO

[Figure: hypervolume boxplots (0.5 to 1.0) on euclidAB100.tsp, euclidAB300.tsp, and euclidAB500.tsp, comparing five automatically configured MOACO variants, MOACO (1)-(5), against MOAQ, BicriterionAnt (1 col, 3 col), MACS, COMPETants, PACO, and mACO-1 to mACO-4]
Automatic configuration of multi-objective ACO

[Figure: hypervolume boxplots (0.85 to 1.10) on euclidAB100.tsp, euclidAB300.tsp, and euclidAB500.tsp, comparing BicriterionAnt (3 col), configured BicriterionAnt-aco variants (1)-(5), MOACO (5), configured MOACO-aco variants (1)-(5), and fully configured MOACO-full variants (1)-(5)]
Summary
- We propose a new MOACO algorithm that ...
- We propose an approach to automatically design MOACO algorithms:
  1. synthesize state-of-the-art knowledge into a flexible MOACO framework
  2. explore the space of potential designs automatically using irace
- Other examples:
  - single-objective frameworks for MIP: CPLEX, SCIP
  - single-objective framework for SAT: SATenstein
  - multi-objective algorithm frameworks (TP+PLS, MOEA)
Example, new applications
Multi-objective evolutionary algorithms (MOEA)
Multi-objective evolutionary algorithms
Pareto-based (NSGA-II, SPEA2) | Indicator-based (IBEA, SMS-EMOA) | Weight-based (MOGLS, MOEA/D)

We focus on building an automatically configurable component-wise framework for Pareto- and indicator-based MOEAs
MOEA Framework — outline
Preference relations in mating / replacement
Component          Parameters
Preference         〈Set-partitioning, Quality, Diversity〉
BuildMatingPool    〈Preference_Mat, Selection〉
Replacement        〈Preference_Rep, Removal〉
ReplacementExt     〈Preference_Ext, Removal_Ext〉
Representing known MOEAs
          BuildMatingPool                         | Replacement
Alg.      SetPart   Quality  Diversity  Selection | SetPart     Quality  Diversity  Removal
MOGA      rank      —        niche-sh.  DT        | —           —        —          generational
NSGA-II   depth     —        crowding   DT        | depth       —        crowding   one-shot
SPEA2     strength  —        kNN        DT        | strength    —        kNN        sequential
IBEA      —         binary   —          DT        | —           binary   —          one-shot
HypE      —         I_h^H    —          DT        | depth       I_h^H    —          sequential
SMS       —         —        —          random    | depth-rank  I_H^1    —          —

(All MOEAs above use a fixed-size population and no external archive; in addition, SMS-EMOA uses λ = 1)
Experimental setup
- Benchmarks
  - DTLZ (7) and WFG (9) with 2, 3, and 5 objectives
- Scenarios
  - fixed budget, fixed computation time
- Training / testing sets
  - D_training = {20, 21, ..., 60} \ D_testing,  D_testing = {30, 40, 50}
- Configuration setup
  - all compared algorithms fine-tuned
  - tuning budget: 25 000 algorithm runs
Experimental results
Algorithms ordered from best to worst per scenario; sum of ranks in parentheses (lower is better):

DTLZ 2-obj (ΔR = 126)   DTLZ 3-obj (ΔR = 127)   DTLZ 5-obj (ΔR = 107)
AutoD2     (1339)       AutoD3     (1500)       AutoD5     (1002)
SPEA2D2    (1562)       IBEAD3     (1719)       SMSD5      (1550)
IBEAD2     (1940)       SMSD3      (1918)       IBEAD5     (1867)
NSGA-IID2  (2143)       HypED3     (2019)       SPEA2D5    (2345)
HypED2     (2338)       SPEA2D3    (2164)       NSGA-IID5  (2346)
SMSD2      (2406)       NSGA-IID3  (2528)       HypED5     (2674)
MOGAD2     (2970)       MOGAD3     (2851)       MOGAD5     (2915)

WFG 2-obj (ΔR = 169)    WFG 3-obj (ΔR = 130)    WFG 5-obj (ΔR = 97)
AutoW2     (1692)       AutoW3     (1375)       AutoW5     (1170)
SPEA2W2    (2097)       SMSW3      (1796)       SMSW5      (1567)
NSGA-IIW2  (2542)       IBEAW3     (1843)       IBEAW5     (1746)
SMSW2      (2621)       SPEA2W3    (2600)       SPEA2W5    (2747)
IBEAW2     (2777)       NSGA-IIW3  (3315)       NSGA-IIW5  (3029)
HypEW2     (2851)       HypEW3     (3431)       MOGAW5     (4268)
MOGAW2     (4320)       MOGAW3     (4540)       HypEW5     (4373)
Additional remarks
- additional results
  - time-constrained scenarios
  - cross-benchmark comparison
  - applications to multi-objective flow-shop scheduling
- extensions
  - more comprehensive benchmark sets
  - design space analysis (e.g. ablation)
  - extensions of template (weights, local search, etc.)

Time has come to automatically configure MOEAs (and other algorithms)
Example, new applications
Improving automatically the anytime behavior of algorithms
“Anytime” Algorithms [Zilberstein, 1996]
"Anytime" algorithms aim to produce as high-quality results as possible, independent of the computation time allowed.
[Figure: solution-quality-over-time curves (relative deviation from best-known vs. time in seconds) for the fixed settings "ants 1" and "ants 400", and for a strategy that varies the number of ants over time]
Brute-Force Approach
1. Choose many parameter settings
2. Run lots of experiments
3. Visually compare SQT plots
After about one year:
+ strategies for varying ants, β, or q0 that significantly improve the anytime behaviour of MMAS on the TSP
- extremely time consuming
- subjective / biased
New approach
[López-Ibáñez, Stützle, 2011]

- multi-objective optimization
  + objectively defined comparison
  + performance assessment techniques (hypervolume)
- automatic configuration
  + most effort done by the computer
  + best configurations selected by the computer: unbiased
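Scoring a run's anytime behaviour then amounts to the hypervolume of its time-quality trace; the reference point and the traces below are illustrative:

```python
def anytime_score(trace, t_max, q_max):
    """Hypervolume dominated by a solution-quality-over-time trace w.r.t.
    reference point (t_max, q_max); trace = [(time, best quality so far)]
    with increasing times and decreasing quality (minimization)."""
    hv, prev_q = 0.0, q_max
    for t, q in trace:
        if q < prev_q and t < t_max:
            hv += (t_max - t) * (prev_q - q)
            prev_q = q
    return hv

# run A improves early; run B reaches the same final quality, but late
run_a = [(1, 0.6), (5, 0.2), (50, 0.1)]
run_b = [(40, 0.6), (80, 0.1)]
better = anytime_score(run_a, 100, 1.0) > anytime_score(run_b, 100, 1.0)
```

A configurator maximizing this scalar automatically prefers parameter-variation strategies with good behaviour across the whole time axis.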
Experimental comparison
[Figure: solution-quality-over-time curves (relative deviation from best-known vs. time in seconds); hypervolume in parentheses: default (1.1599), auto var ants (1.1865), auto var beta (1.182), auto var rho (1.1813), auto var q0 (1.1935), auto var ALL (1.2012)]
Conclusions on configuring anytime algorithms
- less effort: 1 week instead of a year!
- same or even better results
- improving the anytime behaviour of metaheuristics becomes much easier

We can use offline configuration of online strategies for improving anytime behaviour
1. implement several online strategies
2. let offline automatic configuration choose the best strategy for our algorithm / problem

Further work: improving the anytime behavior of the SCIP solver v2.1.0, configuring more than 200 parameters as a proof of concept.
Improving anytime behavior of SCIP
[Figure: relative percentage deviation (RPD) from best-known vs. time in seconds; hypervolume in parentheses: default (0.9834), auto quality (0.9826), auto time (0.9767), auto anytime (0.9932)]

Applying SCIP to the winner determination problem for combinatorial auctions; 1000 training, 1000 test instances, 300 secs CPU time; configuration budget of 5000 runs
Few other topics
Scaling to expensive instances
What if my problem instances are too difficult/large?

- Cloud computing / large computing clusters
  - J. Styles and H. H. Hoos. Automatically Configuring Algorithms for Scaling Performance. LION, 2012 (extensions also at GECCO 2013)
  - tune on small instances, then extend to increasingly larger ones
- F. Mascia, M. Birattari, and T. Stützle. Tuning algorithms for tackling large instances: An experimental protocol. Learning and Intelligent Optimization, LION 7, 2013.
  - tune on small / medium-size instances, then scale parameter values to difficult ones
Configuring configurators
What about automatically configuring the configurator?
... and configuring the configurator of the configurator?

✔ it can be done (Hutter et al., 2009) but ...
✘ it is costly, and iterating further leads to diminishing returns
AClib: A Benchmark Library for Algorithm Configuration
F. Hutter, M. López-Ibáñez, C. Fawcett, M. Lindauer, H. H. Hoos, K. Leyton-Brown, and T. Stützle. AClib: A Benchmark Library for Algorithm Configuration. Learning and Intelligent Optimization Conference (LION 8), 2014.

http://www.aclib.net/

- standard benchmark for experimenting with configurators
- 182 heterogeneous scenarios
  - SAT, MIP, ASP, time-tabling, TSP, multi-objective, machine learning
- extensible ⇒ new scenarios welcome!
Concluding remarks
Why automatic algorithm configuration?
- improvement over manual, ad-hoc methods for tuning
- reduction of development time and human intervention
- increase in the number of degrees of freedom that can be considered
- empirical studies, comparisons of algorithms
- support for end users of algorithms
Towards a shift of paradigm in algorithm design
Conclusions
Automatic configuration
- leverages computing power for software design
- is rewarding w.r.t. development time and algorithm performance

Future work
- more powerful configurators
- more and more complex applications
- paradigm shift in optimization software development
Acknowledgements
IRIDIA
External collaborators
Research funding
F.R.S.-FNRS; projects ANTS (ARC), Meta-X (ARC), COMEX (PAI), MIBISOC (FP7), COLOMBO (FP7), FRFC, Metaheuristics Network (FP5)