Local search and optimization
• Local search = use a single current state and move to neighboring states.
• Advantages:
  – Uses very little memory
  – Often finds reasonable solutions in large or infinite state spaces.
• Also useful for pure optimization problems.
  – Find the best state according to some objective function.
  – e.g. survival of the fittest as a metaphor for optimization.
Local search and optimization
Hill-climbing search
• “Is a loop that continuously moves in the direction of increasing value”
  – It terminates when a peak is reached.
• Hill climbing does not look ahead beyond the immediate neighbors of the current state.
• Hill climbing chooses randomly among the set of best successors if there is more than one.
• Hill climbing is a.k.a. greedy local search.
Hill-climbing search

function HILL-CLIMBING(problem) returns a state that is a local maximum
  inputs: problem, a problem
  local variables: current, a node
                   neighbor, a node

  current ← MAKE-NODE(INITIAL-STATE[problem])
  loop do
    neighbor ← a highest-valued successor of current
    if VALUE[neighbor] ≤ VALUE[current] then return STATE[current]
    current ← neighbor
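The pseudocode above can be sketched in Python as follows; the function and argument names are mine, not from the slide:

```python
def hill_climbing(initial_state, successors, value):
    """Steepest-ascent hill climbing: repeatedly move to the
    highest-valued neighbor; stop when no neighbor improves."""
    current = initial_state
    while True:
        best = max(successors(current), key=value)
        if value(best) <= value(current):
            return current  # peak reached (possibly only a local maximum)
        current = best

# Toy example: climb toward the maximum of f(x) = -(x - 3)^2
f = lambda x: -(x - 3) ** 2
peak = hill_climbing(0, lambda x: [x - 1, x + 1], f)
# → 3
```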
3.3 The K-Means Algorithm
1. Choose a value for K, the total number of clusters.
2. Randomly choose K points as cluster centers.
3. Assign the remaining instances to their closest cluster center.
4. Calculate a new cluster center for each cluster.
5. Repeat steps 3 and 4 until the cluster centers do not change.
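A minimal Python sketch of these steps (the names and the Euclidean-distance choice are my assumptions; the steps themselves do not fix a distance measure):

```python
import math
import random

def k_means(points, k, rng=random.Random(0)):
    """Steps 2-5 of the algorithm: random centers, then alternate
    assignment (step 3) and center recomputation (step 4) until the
    assignment of instances to clusters stops changing (step 5)."""
    centers = rng.sample(points, k)          # step 2
    assignment = None
    while True:
        new_assignment = [min(range(k), key=lambda i: math.dist(p, centers[i]))
                          for p in points]   # step 3
        if new_assignment == assignment:     # step 5: no change -> stop
            return centers, assignment
        assignment = new_assignment
        for i in range(k):                   # step 4: center = cluster mean
            cluster = [p for p, a in zip(points, assignment) if a == i]
            if cluster:
                centers[i] = tuple(sum(c) / len(cluster) for c in zip(*cluster))

# The six instances from Table 3.6
points = [(1.0, 1.5), (1.0, 4.5), (2.0, 1.5), (2.0, 3.5), (3.0, 2.5), (5.0, 6.0)]
centers, assignment = k_means(points, 2)
```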
Table 3.6 • K-Means Input Values
Instance   X     Y
1          1.0   1.5
2          1.0   4.5
3          2.0   1.5
4          2.0   3.5
5          3.0   2.5
6          5.0   6.0
[Figure: scatter plot of the Table 3.6 instances, f(x) versus x, axes 0–7]
Table 3.7 • Several Applications of the K-Means Algorithm (K = 2)
Outcome   Cluster Centers   Cluster Points   Squared Error
1         (2.67,4.67)       2, 4, 6          14.50
          (2.00,1.83)       1, 3, 5
2         (1.5,1.5)         1, 3             15.94
          (2.75,4.125)      2, 4, 5, 6
3         (1.8,2.7)         1, 2, 3, 4, 5    9.60
          (5,6)             6
[Figure: the Table 3.6 instances plotted again, f(x) versus x, axes 0–7]
Simulated annealing
• Escape local maxima by allowing some “bad” moves.
  – Idea: gradually decrease their size and frequency.
• Origin: metallurgical annealing
• Bouncing-ball analogy:
  – Shaking hard (= high temperature).
  – Shaking less (= lowering the temperature).
• If T decreases slowly enough, the best state is reached.
• Applied to VLSI layout, airline scheduling, etc.
Simulated annealing

function SIMULATED-ANNEALING(problem, schedule) returns a solution state
  inputs: problem, a problem
          schedule, a mapping from time to “temperature”
  local variables: current, a node
                   next, a node
                   T, a “temperature” controlling the probability of downward steps

  current ← MAKE-NODE(INITIAL-STATE[problem])
  for t ← 1 to ∞ do
    T ← schedule[t]
    if T = 0 then return current
    next ← a randomly selected successor of current
    ΔE ← VALUE[next] − VALUE[current]
    if ΔE > 0 then current ← next
    else current ← next only with probability e^(ΔE/T)
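A Python rendering of this loop; the linear cooling schedule and the toy objective below are my own choices for illustration, not from the slide:

```python
import math
import random

def simulated_annealing(start, successor, value, schedule, rng=random.Random(0)):
    """Always accept uphill moves; accept a downhill move only with
    probability e^(dE/T), where T is the current temperature."""
    current = start
    t = 1
    while True:
        T = schedule(t)
        if T <= 0:
            return current
        nxt = successor(current, rng)
        dE = value(nxt) - value(current)
        if dE > 0 or rng.random() < math.exp(dE / T):
            current = nxt
        t += 1

# Toy run: maximize f(x) = -(x - 3)^2 with random unit steps
f = lambda x: -(x - 3) ** 2
step = lambda x, rng: x + rng.choice([-1, 1])
cooling = lambda t: max(0.0, 2.0 * (1 - t / 500))  # linear cool-down to T = 0
best = simulated_annealing(0, step, f, cooling)
```

As T approaches zero the downhill-acceptance probability vanishes, so the final iterations behave like hill climbing around whatever region the earlier, hotter phase settled in.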
Local beam search
• Keep track of k states instead of one
  – Initially: k random states
  – Next: determine all successors of the k states
  – If any successor is a goal: finished
  – Else: select the k best successors and repeat.
• Major difference from random-restart search
  – Information is shared among the k search threads.
• Can suffer from lack of diversity.
  – Stochastic variant: choose k successors at random, with probability proportional to state value.
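The steps above can be sketched as follows (a minimal sketch; the goal test and the toy problem are my own additions):

```python
import heapq

def local_beam_search(starts, successors, value, is_goal, max_iters=100):
    """Expand all k current states; stop if any successor is a goal,
    otherwise keep only the k best successors (shared across the k
    search 'threads') and repeat."""
    states = list(starts)
    k = len(states)
    for _ in range(max_iters):
        pool = [s for st in states for s in successors(st)]
        for s in pool:
            if is_goal(s):
                return s
        states = heapq.nlargest(k, pool, key=value)
    return max(states, key=value)

# Toy run: search for x = 7 from three scattered start points
f = lambda x: -abs(x - 7)
found = local_beam_search([0, 20, -5], lambda x: [x - 1, x + 1], f,
                          lambda x: x == 7)
# → 7
```

Note how the beam quickly abandons the hopeless start at 20: after one round all k slots are spent near the most promising region, which is exactly the information sharing (and the diversity risk) the slide describes.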
Genetic algorithms
• Variant of local beam search with sexual recombination.
The Genetic Algorithm
• Encode the individual potential solutions in suitable representations
  – Knowledge representation
• Use mating and mutation within the population to produce a new generation
  – Operator selection
• A fitness function judges which individuals are the “best” life forms
  – The design of a fitness function
The General Form

Genetic algorithms
An Example of Crossover
The CNF-satisfaction Problem
Representation
• A sequence of six bits, e.g. 101010
Genetic Operators
• Crossover
• Mutation
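For the bit-string representation above, these two operators might look like this in Python (function names are mine):

```python
import random

def single_point_crossover(p1, p2, point):
    """Swap the tails of two bit strings after position `point`."""
    return p1[:point] + p2[point:], p2[:point] + p1[point:]

def mutate(bits, rng=random.Random(0)):
    """Flip one randomly chosen bit."""
    i = rng.randrange(len(bits))
    return bits[:i] + ("0" if bits[i] == "1" else "1") + bits[i + 1:]

c1, c2 = single_point_crossover("101010", "010101", 3)
# c1 = "101101", c2 = "010010"
```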
Fitness Function
• The truth value of the whole expression under the assignment
  – Hard to judge the “quality” of a failing assignment
• Better: the number of clauses the bit pattern satisfies
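The clause-counting fitness measure is easy to compute; the sample clauses below are hypothetical, since the slide does not give the actual CNF expression:

```python
def clauses_satisfied(bits, clauses):
    """Count how many clauses the bit-string assignment satisfies.
    Literal +i means variable i is true, -i means it is false
    (variables are 1-indexed; bit '1' = true)."""
    assign = [b == "1" for b in bits]
    def literal_true(lit):
        v = assign[abs(lit) - 1]
        return v if lit > 0 else not v
    return sum(any(literal_true(l) for l in clause) for clause in clauses)

# Hypothetical 6-variable CNF: (x1 v -x3) ^ (x2 v x4) ^ (-x1 v x5 v x6)
clauses = [[1, -3], [2, 4], [-1, 5, 6]]
fitness = clauses_satisfied("101010", clauses)
# → 2 (the middle clause fails: x2 and x4 are both false)
```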
The Traveling Salesperson Problem
• NP-hard problem
Representation
• Bit representation
  – Hard to apply crossover and mutation
• Give each city a numeric name
  – e.g. (192465783)
  – How should crossover and mutation work?
Genetic Operators
• Order crossover (Davis 1985)
  – Guarantees legitimate tours, visiting all cities exactly once
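A simple variant of order crossover can be sketched as follows. (Davis's original fills positions starting after the cut and wrapping around; this version fills left to right, but preserves the same legality guarantee. The first parent is the tour (192465783) from the slide; the second parent and the cut points are made up.)

```python
def order_crossover(parent1, parent2, i, j):
    """Keep parent1's segment [i:j] in place; fill the remaining
    slots with parent2's cities in the order they appear in parent2,
    skipping cities already in the segment. The child visits every
    city exactly once."""
    segment = parent1[i:j]
    fill = [c for c in parent2 if c not in segment]
    return fill[:i] + segment + fill[i:]

child = order_crossover([1, 9, 2, 4, 6, 5, 7, 8, 3],
                        [4, 5, 9, 1, 8, 7, 6, 2, 3], 3, 6)
# → [9, 1, 8, 4, 6, 5, 7, 2, 3]
```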
Mutation
• Simply reversing the whole order would not work
• Cut out a piece, invert it, and replace it
• Randomly select a city and place it in a new, randomly selected location
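Both mutation forms described above, inverting a cut-out piece in place and relocating a single city, can be sketched as follows (names and the example cut points are mine):

```python
import random

def inversion_mutation(tour, i, j):
    """Cut out tour[i:j], reverse it, and splice it back in place."""
    return tour[:i] + tour[i:j][::-1] + tour[j:]

def relocation_mutation(tour, rng=random.Random(0)):
    """Remove a randomly chosen city and reinsert it at a random spot."""
    t = list(tour)
    city = t.pop(rng.randrange(len(t)))
    t.insert(rng.randrange(len(t) + 1), city)
    return t

mutated = inversion_mutation([1, 9, 2, 4, 6, 5, 7, 8, 3], 2, 5)
# → [1, 9, 6, 4, 2, 5, 7, 8, 3]
```

Both operators permute the existing cities, so the result is always a legal tour.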
The Genetic Algorithm
• A variant of informed search
  – Successor states are generated by combining two parent states
• Procedure
  – Knowledge representation
  – Operator selection
  – The design of a fitness function
3.4 Genetic Learning

• Genetic learning operators
  – Crossover
  – Mutation
  – Selection
Genetic Algorithms and Supervised Learning

[Figure: the supervised genetic learning loop — population elements are scored by a fitness function against the training data; elements are kept or thrown out, and the survivors become candidates for crossover and mutation]
Table 3.8 • An Initial Population for Supervised Genetic Learning
Population   Income   Life Insurance   Credit Card
Element      Range    Promotion        Insurance     Sex      Age
1            20–30K   No               Yes           Male     30–39
2            30–40K   Yes              No            Female   50–59
3            ?        No               No            Male     40–49
4            30–40K   Yes              Yes           Male     40–49
Table 3.9 • Training Data for Genetic Learning
Training   Income   Life Insurance   Credit Card
Instance   Range    Promotion        Insurance     Sex      Age
1          30–40K   Yes              Yes           Male     30–39
2          30–40K   Yes              No            Female   40–49
3          50–60K   Yes              No            Female   30–39
4          20–30K   No               No            Female   50–59
5          20–30K   No               No            Male     20–29
6          30–40K   No               No            Male     40–49
[Figure: a crossover example — population elements #1 (20–30K, No, Yes, Male, 30–39) and #2 (30–40K, Yes, No, Female, 50–59) exchange attribute values, producing new #2 (30–40K, Yes, Yes, Male, 30–39) and new #1 (20–30K, No, No, Female, 50–59); attributes shown: Income Range, Life Insurance Promotion, Credit Card Insurance, Sex, Age]
Table 3.10 • A Second-Generation Population
Population   Income   Life Insurance   Credit Card
Element      Range    Promotion        Insurance     Sex      Age
1            20–30K   No               No            Female   50–59
2            30–40K   Yes              Yes           Male     30–39
3            ?        No               No            Male     40–49
4            30–40K   Yes              Yes           Male     40–49
Genetic Algorithms and Unsupervised Clustering
[Figure: unsupervised genetic clustering — P instances I1…Ip, each with attributes a1…an, are evaluated against K candidate solutions S1…SK, each solution holding its own set of cluster-center elements E11, E12, …, Ek1, Ek2]
Table 3.11 • A First-Generation Population for Unsupervised Clustering
                        S1           S2           S3
Solution elements       (1.0,1.0)    (3.0,2.0)    (4.0,3.0)
(initial population)    (5.0,5.0)    (3.0,5.0)    (5.0,1.0)
Fitness score           11.31        9.78         15.55
Solution elements       (5.0,1.0)    (3.0,2.0)    (4.0,3.0)
(second generation)     (5.0,5.0)    (3.0,5.0)    (1.0,1.0)
Fitness score           17.96        9.78         11.34
Solution elements       (5.0,5.0)    (3.0,2.0)    (4.0,3.0)
(third generation)      (1.0,5.0)    (3.0,5.0)    (1.0,1.0)
Fitness score           13.64        9.78         11.34
General Considerations
• Global optimization is not guaranteed.
• The fitness function determines the complexity of the algorithm.
• Genetic algorithms can explain their results, provided the fitness function is understandable.
• Transforming the data to a form suitable for genetic learning can be a challenge.
3.5 Choosing a Data Mining Technique
Initial Considerations
• Is learning supervised or unsupervised?
• Is explanation required?
  – Neural networks and regression models are black-box techniques.
• What is the interaction between input and output attributes?
• What are the data types of the input and output attributes?
Further Considerations
• Do We Know the Distribution of the Data?
  – Many statistical techniques assume the data are normally distributed.
• Do We Know Which Attributes Best Define the Data?
  – Decision trees and certain statistical approaches select the most relevant attributes.
  – Neural network, nearest neighbor, and various clustering approaches use all attributes equally.
Further Considerations
• Does the Data Contain Missing Values?
  – Neural networks cannot process missing values directly.
• Is Time an Issue?
  – Decision trees build quickly.
• Which Technique Is Most Likely to Give the Best Test Set Accuracy?
  – Multiple-model approaches (Chp. 11)