Date post: | 02-Jan-2016 |
Upload: | horatio-owens |
Introduction to GAs: Genetic Algorithms
How to apply GAs to SNA?
Thanks to the sources of all pictures and information referenced in these slides.
Agenda
Introduction
Genetic Algorithms
Steps of GA Design
Example of Genetic Algorithm
• Single Objective Optimization
Benefits and Applications of GAs
Applying GAs to SNA
Introduction
Optimization problems: how do we solve them?
• Single-objective optimization problems
• Multi-objective optimization problems
http://bizzbangbuzz.blogspot.com/2010/03/five-modes-of-decision-making.html
An Evolutionary Algorithm (EA) is often a more suitable optimization technique. Why?
• It uses natural selection and recombination to find an optimal configuration for a specific problem within specific constraints.
• It yields good-quality approximate solutions to large real-world problems.
• It is flexible and robust, although time consuming.
• It is appropriate when traditional methods break down.
The result is an approximate solution to a real problem.
The Genetic Algorithm (GA) is a member of the family of Evolutionary Algorithms (EA).
The major classes of EA:
• Genetic Algorithm (GA)
• Evolutionary Programming (EP)
• Genetic Programming (GP)
• Evolution Strategy (ES)
The idea of evolutionary computing was introduced by I. Rechenberg in the 1960s.
The idea of the GA was proposed by John Holland in the 1970s.
In 1992, John Koza used GAs to evolve programs that perform certain tasks, a method called “Genetic Programming”.
Genetic Algorithm
A GA uses the principles of evolution and natural selection to solve problems. How?
• A GA maintains a population of structures that evolves according to rules of selection and search operators such as crossover (recombination) and mutation.
• Each individual in the population receives a measure of fitness from the objective function, and reproduction favours the individuals with higher fitness.
• A GA is an iterative process: the selection operation acts on the current population according to a fixed rule, and the crossover and mutation operations are applied to the selected individuals.
Biological Terms of Genetic Algorithm
All organisms consist of many cells, and each cell contains the same set of chromosomes.
Chromosomes are strings of DNA.
The characteristics carried by a chromosome are defined by its genes.
A gene can take several forms, which are called alleles.
Different alleles of a gene produce different characteristics.
The set of chromosomes is called the genotype.
A genotype defines an encoded solution in the search space.
A phenotype is a solution in the real problem domain.
Terms of Genetic Algorithm
Chromosome (string): Solution
Genes (bits): Part of solution
Locus: Position of gene
Alleles: Value of gene
Phenotype: Decoded solution
Genotype: Encoded solution
Genetic Algorithm’s Processes
The initial population of individuals is a set of solutions, each represented by a chromosome.
The initial chromosomes are generated randomly.
Every iteration of the evolutionary process is one generation of the algorithm.
Individuals in the current population are decoded and evaluated according to predefined quality criteria, referred to as the fitness function.
Individuals are selected according to their fitness values, and existing members of the current solution pool are replaced by newly created members.
The new members of the population are created by the crossover and mutation operations.
Individuals with higher fitness (the better members) have more chance to reproduce than those with lower fitness, so better members survive and weaker ones are eliminated.
The GA process is repeated until the convergence criterion is satisfied.
Steps of Genetic Algorithm
[Flowchart with the boxes: Encoding, Step 0: Initialization, Step 1: Evaluation, Step 2: Parent Selection, Step 3: Crossover, Step 4: Mutation, Step 5: Elitist Strategy, Step 6: Termination Test (branches Yes / No), Update the optimal solutions, Optimal Solutions, Decoding, Solution; the two halves are labelled Genotype and Phenotype]
Step 0: Initialization randomly generates an initial population.
Step 1: Evaluation decodes the strings to solutions, calculates the value of the objective function for each solution, and then transforms that value into the value of the fitness function for each string in the genotype world.
Step 2: Selection selects a number of pairs of strings from the current population according to the selection probability.
Step 3: Crossover applies the crossover operator, with the crossover probability, to each pair selected in Step 2 to generate new strings.
Step 4: Mutation applies the mutation operator, with the mutation probability, to each of the generated strings.
Step 5: Elitist strategy randomly removes a string from the current population and adds the best string of the previous population to the current one.
Step 6: Termination test: if the termination condition is satisfied, stop the algorithm. Otherwise, return to Step 1.
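Steps 0 to 6 can be sketched as one loop. This is a minimal illustration, not the slides' own implementation: the function name `run_ga`, the parameter defaults, the fitness-proportional selection, and the fixed generation budget used as the termination test are all assumptions, and the fitness function is assumed to be non-negative.

```python
import random

def run_ga(fitness, n_bits=5, pop_size=20, p_cross=0.8, p_mut=0.02,
           max_gen=50, seed=0):
    """Steps 0-6 for bit-list chromosomes; returns the best string found."""
    rng = random.Random(seed)
    # Step 0: random initial population.
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(max_gen):  # Step 6: assumed test, a fixed generation budget
        # Step 1: evaluate (the small offset keeps the weights valid if all are 0).
        weights = [fitness(c) + 1e-9 for c in pop]
        children = []
        while len(children) < pop_size:
            # Step 2: pick a pair of parents with fitness-proportional probability.
            a, b = rng.choices(pop, weights=weights, k=2)
            a, b = a[:], b[:]
            # Step 3: one-point crossover with probability p_cross.
            if rng.random() < p_cross:
                cut = rng.randint(1, n_bits - 1)
                a, b = a[:cut] + b[cut:], b[:cut] + a[cut:]
            # Step 4: flip each bit with probability p_mut.
            for child in (a, b):
                children.append([1 - g if rng.random() < p_mut else g
                                 for g in child])
        children = children[:pop_size]
        # Step 5: a random string is replaced by the best of the previous population.
        children[rng.randrange(pop_size)] = best[:]
        pop = children
        best = max(pop, key=fitness)
    return best

def decode(bits):
    """Genotype (bit list) to phenotype (integer)."""
    return int("".join(map(str, bits)), 2)

best = run_ga(lambda c: decode(c) ** 2)
print(best, decode(best))
```

Because the elite string is always copied forward, the best fitness in the population never decreases from one generation to the next.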
Steps of GA Design
Initial Population and Representation
The population consists of individuals, which are potential solutions in the problem domain.
The parameters of the problem domain are encoded to form the initial population.
The individuals in the population are chromosomes.
The population typically consists of 30-100 individuals; a GA with fewer than 30 individuals is called a MicroGA.
Each solution to an optimization problem is encoded as a finite-length string.
There are many types of solution representation. The basic representations of parameters are binary coding and permutation coding.
Binary Coding
A string consists of the binary digits “0” and “1”.
Each bit in a string represents a characteristic of a solution.
The whole string represents the complete solution.
The decision variables in the parameter set are encoded as a binary string by using the Gray code method or the Hamming code method.
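As an illustration of binary coding, the sketch below encodes an integer decision variable both as plain binary and as reflected Gray code. The helper names are hypothetical. In Gray code, neighbouring integers always differ in exactly one bit, which is not true of plain binary.

```python
def to_binary(x, n_bits):
    """Plain binary string of fixed width."""
    return format(x, f"0{n_bits}b")

def from_binary(bits):
    return int(bits, 2)

def to_gray(x, n_bits):
    """Reflected Gray code: XOR the value with itself shifted right by one."""
    return format(x ^ (x >> 1), f"0{n_bits}b")

def from_gray(bits):
    """Invert the Gray code by XOR-ing together all right shifts."""
    g = int(bits, 2)
    x = 0
    while g:
        x ^= g
        g >>= 1
    return x

print(to_binary(15, 5), to_binary(16, 5))  # plain binary: every bit changes
print(to_gray(15, 5), to_gray(16, 5))      # Gray code: one bit changes
```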
Permutation Coding
It is used for sequencing problems, such as scheduling problems and traveling salesman problems, where the permutation string consists of the numbers “1” to “n”.
Each number corresponds to a job in a scheduling problem or a city in a traveling salesman problem, and “n” is the number of jobs or cities.
Permutation representation: TSP
Problem:
• Given n cities
• Find a complete tour with minimal length
Encoding:
• Label the cities 1, 2, … , n
• One complete tour is one permutation (e.g. for n = 4, [1,2,3,4] and [3,4,2,1] are OK)
Search space is BIG: for 30 cities there are 30! ≈ 10^32 possible tours
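A short sketch of how a permutation chromosome can be evaluated for the TSP. The distance matrix `dist` is made up for illustration, and the function names are hypothetical; cities are labelled 1..n as in the encoding above.

```python
import math

def tour_length(tour, dist):
    """Length of the closed tour, returning to the first city at the end."""
    n = len(tour)
    return sum(dist[tour[i] - 1][tour[(i + 1) % n] - 1] for i in range(n))

def is_valid_tour(tour):
    """A valid chromosome visits every city 1..n exactly once."""
    return sorted(tour) == list(range(1, len(tour) + 1))

# Hypothetical symmetric distances between 4 cities.
dist = [[0, 2, 9, 10],
        [2, 0, 6, 4],
        [9, 6, 0, 3],
        [10, 4, 3, 0]]

print(tour_length([1, 2, 3, 4], dist))
print(tour_length([3, 4, 2, 1], dist))
print(math.factorial(30))  # number of permutations of 30 cities
```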
There is a problem with the binary code: the binary representation obscures the nature of the search. There are, however, many other encoding strategies.
With a real-valued representation of the parameters, chromosomes do not need to be decoded to the phenotype, which is fast and uses less memory.
An integer representation is easier to handle than the binary code, because it can use look-up tables to decode the representation.
The binary string, consisting of “0” and “1”, is called a genotype.
The solution decoded from the binary string is called a phenotype.
The GA searches over the binary strings, that is, in genotype space.
After the algorithm converges, the genotype is decoded.
Objective and Fitness Function
An objective function measures how suitable each individual in the population is in the problem domain.
For a minimization problem, the fittest individual is the one that gives the objective function its lowest value.
For a maximization problem, the fittest individual is the one that gives the objective function its highest value.
A fitness function transforms the objective function value into the fitness value that is assigned to the individual.
The objective function value, f(x), is calculated from the value of “x”, a variable decoded from, for instance, a binary string.
If the objective function value of a solution “x” is better, the binary string in the genotype that corresponds to “x” receives a better fitness value.
After the objective function value of solution “x” is transformed to a fitness value, the solutions with higher fitness values are kept for future reproduction. The transformation depends on whether the objective is to be maximized or minimized.
The probability that an individual reproduces depends on its fitness value.
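One simple way to turn objective values into fitness values is a linear shift, so that a higher fitness always means a better individual. This is only one possible transformation, chosen here as an assumption; the function names are hypothetical.

```python
def fitness_for_max(objective_values):
    """Maximization: shift so the worst individual has fitness 0."""
    low = min(objective_values)
    return [v - low for v in objective_values]

def fitness_for_min(objective_values):
    """Minimization: reverse the order so the lowest objective gets the highest fitness."""
    high = max(objective_values)
    return [high - v for v in objective_values]

print(fitness_for_max([169, 1, 576, 361]))
print(fitness_for_min([3, 1, 2]))
```

Note that the worst individual receives fitness 0 and would never be selected by a purely fitness-proportional rule; adding a small constant avoids that.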
Selection
Selection is the process of determining which individuals are chosen for reproduction and how many offspring each chosen individual will produce.
It transforms the fitness values of individuals into probabilities of reproduction.
For instance, binary strings that have higher fitness values have more chance of being selected as parents.
An efficient selection method is motivated by the need to keep the overall time complexity of the GA low.
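Roulette-wheel (fitness-proportional) selection is one common way to map fitness values to reproduction probabilities. The slides do not name a specific method, so this choice and the function names are assumptions; the sample strings and their fitness values f(x) = x^2 are used only as illustrative data.

```python
import random

def selection_probabilities(fitness_values):
    """Each individual's share of the total fitness."""
    total = sum(fitness_values)
    return [f / total for f in fitness_values]

def roulette_select(population, fitness_values, rng):
    """Spin once: walk the cumulative fitness until the spin is passed."""
    spin = rng.random() * sum(fitness_values)
    running = 0.0
    for individual, f in zip(population, fitness_values):
        running += f
        if spin < running:
            return individual
    return population[-1]  # guards against floating-point round-off

population = ["01101", "00001", "11000", "10011"]
fitness_values = [int(s, 2) ** 2 for s in population]  # 169, 1, 576, 361

probs = selection_probabilities(fitness_values)
print([round(p, 3) for p in probs])
print(roulette_select(population, fitness_values, random.Random(0)))
```

One spin costs O(n) time in the population size; this is the kind of cost the remark about overall time complexity refers to.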
New Population Creation
The crossover and mutation operations are the major parts of the process and largely determine the performance of a GA.
The crossover operator is applied frequently, that is, with a high probability.
The mutation operator is applied rarely, that is, with a low probability.
Mutation is a random operator that serves to introduce diversity into the population.
It changes an element of a binary string generated by the crossover operation, replacing a bit with the digit “0” or “1”.
Mutation involves a single parent and produces one offspring, whereas crossover involves two parents and produces two offspring.
The crossover operator takes two individuals and cuts their chromosome strings at a randomly chosen position, which gives two head segments and two tail segments.
The tail segments are swapped to produce two new full-length chromosomes, so each of the two offspring inherits some genes from each parent.
The basic form is one-point crossover, a basic operator for binary coding.
The crossover point is selected at random between two adjacent elements; the parents are cut into head and tail parts there, and the tail parts are swapped.
The crossover probability is normally 0.6 to 1.0.
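A minimal sketch of one-point crossover on bit strings. The function name, the default `p_cross`, and the optional `cut` argument (used here to make the example reproducible) are assumptions.

```python
import random

def one_point_crossover(p1, p2, rng, p_cross=0.8, cut=None):
    """Swap the tails of two equal-length strings at one cut point."""
    if rng.random() >= p_cross:
        return p1, p2  # no crossover: the parents are copied unchanged
    if cut is None:
        cut = rng.randint(1, len(p1) - 1)  # a cut between two adjacent genes
    return p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]

rng = random.Random(0)
print(one_point_crossover("01101", "11000", rng, p_cross=1.0, cut=4))
```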
N-point Crossover
• Choose n random crossover points
• Split along those points
• Glue parts, alternating between parents
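The three steps of n-point crossover can be sketched as follows, for chromosomes stored as lists. The function name is hypothetical.

```python
import random

def n_point_crossover(p1, p2, n, rng):
    """Cut at n distinct random points and alternate the source parent."""
    cuts = sorted(rng.sample(range(1, len(p1)), n))  # choose n crossover points
    c1, c2 = [], []
    prev, swap = 0, False
    for cut in cuts + [len(p1)]:
        a, b = (p2, p1) if swap else (p1, p2)
        c1.extend(a[prev:cut])  # glue the next segment onto each child
        c2.extend(b[prev:cut])
        prev, swap = cut, not swap  # alternate parents after every cut
    return c1, c2

c1, c2 = n_point_crossover([0] * 8, [1] * 8, 3, random.Random(1))
print(c1)
print(c2)
```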
Uniform Crossover
• Assign 'heads' to one parent, 'tails' to the other
• Flip a coin for each gene of the first child
• Make an inverse copy of the gene for the second child
• Inheritance is independent of position
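A minimal sketch of uniform crossover with a fair coin; the function name is hypothetical.

```python
import random

def uniform_crossover(p1, p2, rng):
    """For each gene, a coin flip decides which parent the first child copies."""
    c1, c2 = [], []
    for g1, g2 in zip(p1, p2):
        if rng.random() < 0.5:  # heads: child 1 copies parent 1
            c1.append(g1)
            c2.append(g2)
        else:                   # tails: child 1 copies parent 2
            c1.append(g2)
            c2.append(g1)
    return c1, c2

c1, c2 = uniform_crossover([0] * 8, [1] * 8, random.Random(2))
print(c1)
print(c2)
```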
Crossover operators for permutations
Many specialised operators have been devised which focus on combining order or adjacency information from the two parents
Plain one-point crossover (cut after the third gene) produces children that are not valid permutations:
Parent 1: 1 2 3 4 5
Parent 2: 5 4 3 2 1
Child 1: 1 2 3 2 1
Child 2: 5 4 3 4 5
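One example of a specialised operator is a simplified order-based crossover: the child keeps a slice of the first parent in place and fills the remaining positions with the missing values in the order in which they occur in the second parent. The slides do not name a specific operator, so this variant and its function name are assumptions.

```python
import random

def order_crossover(p1, p2, rng, cuts=None):
    """Keep p1[i:j] in place; fill the other positions in the order of p2."""
    n = len(p1)
    i, j = cuts if cuts is not None else sorted(rng.sample(range(n + 1), 2))
    segment = p1[i:j]
    kept = set(segment)
    rest = [g for g in p2 if g not in kept]  # missing values, in p2's order
    return rest[:i] + segment + rest[i:]

child = order_crossover([1, 2, 3, 4, 5], [5, 4, 3, 2, 1], random.Random(0), cuts=(1, 3))
print(child)
```

Unlike the plain crossover in the example above, the child is always a valid permutation.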
Mutation Operation
It is applied to each offspring individually after the crossover operation.
It provides a small amount of random search and helps to ensure that no point in the search space has zero probability of being examined.
The mutation probability is small, typically 0.001 to 0.1.
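A minimal sketch of bit-flip mutation on a binary string, where each bit is flipped independently with probability `p_mut`. The function name and default value are assumptions.

```python
import random

def bit_flip_mutation(bits, rng, p_mut=0.01):
    """Flip each bit independently with probability p_mut."""
    return "".join(
        ("1" if b == "0" else "0") if rng.random() < p_mut else b
        for b in bits
    )

rng = random.Random(0)
print(bit_flip_mutation("01100", rng, p_mut=0.1))
```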
Mutation for Permutations
• Pick two allele values at random
• Move the second to follow the first, shifting the rest along to accommodate
• Note that this preserves most of the order and the adjacency information
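The permutation mutation described above (often called insert mutation) can be sketched as follows. The function name and the optional `positions` argument, which fixes the two positions for a reproducible example, are assumptions.

```python
import random

def insert_mutation(perm, rng, positions=None):
    """Move the gene at the second position to just after the first position."""
    if positions is None:
        positions = sorted(rng.sample(range(len(perm)), 2))
    i, j = positions  # i < j
    child = perm[:]
    gene = child.pop(j)        # take out the second chosen gene
    child.insert(i + 1, gene)  # re-insert it right after the first one
    return child

print(insert_mutation([1, 2, 3, 4, 5, 6, 7, 8, 9], random.Random(0), positions=(1, 4)))
```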
Crossover OR mutation?
Decade-long debate: which one is better / necessary?
Answer: it depends on the problem, but in general it is good to have both, because each has a different role.
Exploration: discovering promising areas in the search space, i.e. gaining information on the problem
Exploitation: optimising within a promising area, i.e. using information
There is co-operation AND competition between them.
Crossover is explorative: it makes a big jump to an area somewhere “in between” two (parent) areas.
Mutation is exploitative: it creates random small diversions, thereby staying near (in the area of) the parent.
Only crossover can combine information from two parents.
Only mutation can introduce new information (alleles).
To hit the optimum you often need a ‘lucky’ mutation.
Elitist Strategy
Elitism is a method that copies the best chromosome (string) to the new population (next generation).
It protects the best string from being changed by the genetic operators (crossover and mutation).
It can increase the performance of the GA, because the best string has more chance to be a parent string.
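A minimal sketch of an elitist step: the best string of the previous population overwrites a randomly chosen member of the new population. The function name is hypothetical, and replacing a random member (rather than, say, the worst one) is one possible design choice.

```python
import random

def apply_elitism(previous_pop, new_pop, fitness, rng):
    """Copy the best string of the previous population into the new one."""
    best = max(previous_pop, key=fitness)
    result = list(new_pop)
    result[rng.randrange(len(result))] = best  # a random member is replaced
    return result

fitness = lambda s: int(s, 2) ** 2
previous_pop = ["11000", "00001"]
new_pop = ["00010", "00011", "00100"]
print(apply_elitism(previous_pop, new_pop, fitness, random.Random(0)))
```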
Termination
The termination test checks the quality of the best members of the population against the problem definition.
If the solution is not acceptable and the maximum number of iterations has not been reached, the GA process is repeated.
Typical termination criteria:
• Number of iterations
• Acceptable value: convergence value
Example of Genetic Algorithm
Maximize f(x) = x^2, with x in the set {0,…,31}.
Objective function: f(x) = x^2
Decision variable: x
Constraint: x ∈ {0,…,31}
Binary Representation
Encoding (representation) of chromosomes: 0=00000, 1=00001, 2=00010, 3=00011,…
Generate the initial population at random: 01101, 00001, 11000, 10011,…
Evaluate the fitness according to f(x): 01101 decodes to x = 13, so f(x) = 169
Select 2 individuals for crossover based on their fitness.
parents: 0110|1 (decoded value 13) and 1100|0 (decoded value 24)
offspring: 0110|0 and 1100|1
mutated: 01101 and 11000
Repeat until termination.
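The numbers in this example can be checked with a few lines of code. The helper names `decode` and `f` are hypothetical; the strings and the cut position after the fourth bit are taken from the example.

```python
def decode(bits):
    """Genotype (5-bit string) to phenotype (integer x)."""
    return int(bits, 2)

def f(x):
    """Objective function of the example."""
    return x * x

population = ["01101", "00001", "11000", "10011"]
for s in population:
    print(s, decode(s), f(decode(s)))

# One-point crossover of the two selected parents after the fourth bit.
p1, p2 = "01101", "11000"
cut = 4
o1, o2 = p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]
print(o1, decode(o1), f(decode(o1)))
print(o2, decode(o2), f(decode(o2)))
```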
[Figure slides for the example: Maximization, Crossover, Mutation]
Benefits of Genetic Algorithms
• Concept is easy to understand
• Modular, separate from application
• Supports multi-objective optimization
• Good for “noisy” environments
• Always an answer; answer gets better with time
• Inherently parallel; easily distributed
Applying GAs to SNA
1. Community Detection
The information contained in a social network is often represented as a graph.
The idea of graph partitioning from graph theory can be applied to split a graph into groups of nodes based on its topology information.
The clustering process combines different measures of network topology (Density, Centralization, Heterogeneity, Neighbourhood, Clustering Coefficient).
GAs are used to find the best K communities in a network, where any particular node can belong to several communities.
Solution: a group of communities
“1”: the node belongs to the community
“0”: the node does not belong to the community
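The sketch below shows one way such a 0/1 encoding could be decoded. It assumes, purely for illustration, that the chromosome is the concatenation of K rows of N bits, one row per community and one bit per node; the cited paper may lay out the chromosome differently. The function name is hypothetical.

```python
def decode_communities(chromosome, n_nodes, k):
    """Split a flat 0/1 list into k membership rows and list each row's nodes."""
    communities = []
    for c in range(k):
        row = chromosome[c * n_nodes:(c + 1) * n_nodes]
        communities.append({node for node, bit in enumerate(row) if bit == 1})
    return communities

# Assumed layout: 2 communities over 5 nodes (node ids 0..4).
chromosome = [1, 1, 1, 0, 0,
              0, 0, 1, 1, 1]
communities = decode_communities(chromosome, n_nodes=5, k=2)
print(communities)
```

Node 2 has a “1” in both rows, so it belongs to both communities, which is how this encoding allows overlapping communities.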
**Evolutionary Clustering Algorithm for Community Detection Using Graph-based Information
2. Friend Recommendation (Link Recommendation)
A friend recommendation system is based on the structural properties of social networks.
**Friend recommendations in social networks using genetic algorithms and network topology
References
Bello-Orgaz, G., Camacho, D., “Evolutionary clustering algorithm for community detection using graph-based information,” IEEE Congress on Evolutionary Computation (CEC), 2014.
Naruchitparames, J., Gunes, M.H., Louis, S.J., “Friend recommendations in social networks using genetic algorithms and network topology,” IEEE Congress on Evolutionary Computation (CEC), 2011.