
Linkage Learning for Pittsburgh LCS: Making Problems Tractable

Date post: 29-Nov-2014
Upload: xavier-llora
Description:
Presentation by Xavier Llorà, Kumara Sastry, & David E. Goldberg showing how linkage learning is possible on Pittsburgh style learning classifier systems
Transcript
  • 1. Linkage Learning for Pittsburgh LCS: Making Problems Tractable. Xavier Llorà, Kumara Sastry, & David E. Goldberg, Illinois Genetic Algorithms Lab, University of Illinois at Urbana-Champaign, {xllora,kumara,deg}@illigal.ge.uiuc.edu
  • 2. Motivation and Early Work. Can we apply Wilson's idea of evolving rule sets formed only by maximally accurate and general rules to Pittsburgh LCS? Previous multi-objective approaches: bottom up (Bernadó, 2002), using panmictic populations and multimodal optimization (sharing/crowding for niche formation); and top down (Llorà, Goldberg, Traus, & Bernadó, 2003), which explicitly addresses accuracy and generality and uses them to produce compact rule sets. The compact classifier system (CCS) is rooted in the bottom-up approach. NIGEL 2006. Llorà, X., Sastry, K., and Goldberg, D.
  • 3. Maximally Accurate and General Rules. Accuracy and generality can be computed as α(r) = (n_t^+(r) + n_t^#(r)) / n_t and γ(r) = n_t^+(r) / n_m. Fitness combines accuracy and generality: f(r) = α(r) · γ(r). Such a measure can be applied either to rules or to rule sets. The CCS uses this fitness and a compact genetic algorithm (cGA) to evolve such rules. One cGA run provides one rule; multiple rules are required to form a rule set.
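To make the accuracy/generality trade-off concrete, here is a minimal sketch of a rule-fitness evaluator in the spirit of accuracy times generality. The helper names (`matches`, `fitness`) and the exact counts used (correct-among-matched for accuracy, matched-among-all for generality) are my rendering, not necessarily the talk's exact definitions:

```python
def matches(rule, example):
    """A ternary {0,1,#} rule matches an example where every non-# position agrees."""
    return all(r == "#" or r == x for r, x in zip(rule, example))

def fitness(rule, action, examples):
    """Accuracy-times-generality fitness for a single rule.

    examples: list of (bitstring, correct_action) pairs; n_t = len(examples).
    """
    n_t = len(examples)
    matched = [(x, a) for x, a in examples if matches(rule, x)]
    n_m = len(matched)
    if n_m == 0:
        return 0.0
    n_plus = sum(1 for _, a in matched if a == action)  # correctly classified
    alpha = n_plus / n_m   # accuracy: correct among matched examples
    gamma = n_m / n_t      # generality: matched among all examples
    return alpha * gamma
```

Under this sketch, a maximally accurate rule scores α = 1, and among such rules the fitness rewards the ones that match the most examples.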
  • 4. The cGA Can Make It. Rules may be obtained by optimizing f(r) = α(r) · γ(r). The basic cGA scheme:
    1. Initialization: p_{x_i}^0 = 0.5
    2. Model sampling (two individuals are generated)
    3. Evaluation (f(r))
    4. Selection (tournament selection)
    5. Probabilistic model update
    6. Repeat steps 2-5 until the termination criteria are met
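The six steps above can be sketched as follows. `pop_size` is the cGA's virtual population size (it sets the 1/N update step), and the convergence test is a common choice rather than necessarily the one used in the talk:

```python
import random

def cga(fitness, n_bits, pop_size=50, max_iters=3000, rng=random):
    """Compact GA sketch: keep a probability vector, sample two individuals,
    run a tournament, and nudge the model toward the winner."""
    p = [0.5] * n_bits                                   # 1. initialization
    for _ in range(max_iters):
        a = [int(rng.random() < pi) for pi in p]         # 2. model sampling
        b = [int(rng.random() < pi) for pi in p]
        winner, loser = (a, b) if fitness(a) >= fitness(b) else (b, a)  # 3-4.
        for i in range(n_bits):                          # 5. model update
            if winner[i] != loser[i]:
                step = 1.0 / pop_size
                p[i] += step if winner[i] == 1 else -step
                p[i] = min(1.0, max(0.0, p[i]))
        if all(pi <= 0.05 or pi >= 0.95 for pi in p):    # 6. termination
            break
    return [int(round(pi)) for pi in p]
```

Note that the model stores one independent probability per gene, which is exactly why the cGA cannot represent linkage between positions.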
  • 5. cGA Model Perturbation. Facilitates the evolution of different rules and exposes the frequency of appearance of each optimal rule. Initial model perturbation: p_{x_i}^0 = 0.5 + U(-0.4, 0.4). Experiments using the 3-input multiplexer: 1,000 independent runs, visualizing the pair-wise relations of the genes.
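A perturbed initialization in this style is a one-liner; `delta = 0.4` matches the U(-0.4, 0.4) interval on the slide:

```python
import random

def perturbed_init(n_bits, delta=0.4, rng=random):
    """Perturbed cGA start: p_i = 0.5 + U(-delta, +delta), biasing each
    independent run toward a different optimal rule."""
    return [0.5 + rng.uniform(-delta, delta) for _ in range(n_bits)]
```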
  • 6. But One Rule Is Not Enough. Model perturbation in the cGA evolves different rules, but the goal is to evolve a population of rules that solve the problem together. The fitness measure f(r) can also be applied to rule sets. Two mechanisms: spawning new populations until the solution is met, and fusing populations when they represent the same rule.
  • 7. Spawning and Fusing Populations
  • 8. Experiments & Scalability. Analysis using multiplexer problems (3-, 6-, and 11-input). The number of rules in [O] grows exponentially: it grows as 2^i, where i is the number of inputs. Assuming equal probability of hitting a rule (a binomial model), the number of runs needed to obtain all the rules in [O] also grows exponentially. cGA success as a function of the problem size: 3-input, 97%; 6-input, 73.93%; 11-input, 43.03% (scalability measured over 10,000 independent runs).
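Under the slide's equal-probability assumption, the expected number of independent runs needed to recover all k rules in [O] can be formalized with the classic coupon-collector estimate k·H_k. This is my formalization; the talk's binomial model may differ in detail:

```python
def expected_runs(k):
    """Coupon-collector estimate: expected number of independent runs needed
    to observe all k rules when each run yields one of k equally likely rules."""
    return k * sum(1.0 / i for i in range(1, k + 1))
```

Even under this optimistic assumption the estimate grows as k·ln k, so an exponentially growing [O] makes the multiple-runs strategy intractable.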
  • 9. Scalability of CCS
  • 10. So? Open questions: multiple runs are not an option, so could the poor cGA scalability be the result of the existence of linkage? The χ-ary extended compact classifier system (eCCS) needs to answer whether it can: (1) perform linkage learning to improve the scalability of the rule-learning process, and (2) evolve [O] in a single run (rule niching?). The eCCS answer: use the extended compact genetic algorithm (Harik, 1999), with rule niching via restricted tournament replacement (Harik, 1995).
  • 11. Extended Compact Genetic Algorithm. A probabilistic model-building GA (Harik, 1999) that builds models of good solutions as linkage groups. Key idea: a good probability distribution implies linkage learning. Key components: representation, the marginal product model (MPM), i.e., the marginal distribution over a gene partition; quality, the minimum description length (MDL) metric, following Occam's razor (all things being equal, simpler models are better); and search method, a greedy heuristic search.
  • 12. Marginal Product Model (MPM). Partition the variables into clusters: the model is a product of marginal distributions on a partition of the genes, and the gene partition maps to linkage groups. Example MPM over x_1 ... x_l: [1, 2, 3], [4, 5, 6], ..., [l-2, l-1, l], where each three-gene group stores the marginal probabilities {p000, p001, p010, p100, p011, p101, p110, p111}.
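Under an MPM, an individual's probability is simply the product of its linkage groups' marginals. The function below is a minimal sketch; the names and data layout are mine:

```python
def mpm_probability(individual, partitions, marginals):
    """P(x) under a marginal product model: the product, over linkage groups,
    of the probability of the group's value combination.

    partitions: list of gene-index tuples, e.g. [(0, 1, 2), (3, 4, 5)]
    marginals:  one dict per group, mapping a value tuple -> probability
    """
    prob = 1.0
    for group, table in zip(partitions, marginals):
        key = tuple(individual[i] for i in group)
        prob *= table.get(key, 0.0)
    return prob
```

Sampling from an MPM works group by group in the same way, which is how the ecGA exchanges whole building blocks instead of single bits.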
  • 13. Minimum Description Length Metric. Hypothesis: for an optimal model, model size and error are minimal. Model complexity C_m: the number of bits required to store all marginal probabilities. Compressed population complexity C_p: the entropy of the marginal distributions over all partitions. MDL metric: C_c = C_m + C_p.
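One common way to instantiate the two terms for binary genes, following Harik's eCGA formulation: C_m charges log2(N+1) bits for each of the 2^k - 1 free frequencies of a k-gene group, and C_p is N times the summed entropies of the group marginals. The details below (function name, exact accounting) are a sketch under that assumption:

```python
import math
from collections import Counter

def mdl_metric(population, partitions):
    """C_c = C_m + C_p for a marginal product model over a binary population."""
    n = len(population)
    c_m = 0.0
    c_p = 0.0
    for group in partitions:
        # C_m: log2(n+1) bits for each of the (2^k - 1) free frequencies
        c_m += math.log2(n + 1) * (2 ** len(group) - 1)
        # C_p: n times the entropy of this group's marginal distribution
        counts = Counter(tuple(ind[i] for i in group) for ind in population)
        for c in counts.values():
            c_p += -n * (c / n) * math.log2(c / n)
    return c_m + c_p
```

When two genes always co-vary, merging them removes a full bit of entropy per individual from C_p at a small C_m cost, so the metric rewards discovering that linkage.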
  • 14. Building an Optimal MPM.
    1. Assume independent genes: ([1], [2], ..., [l])
    2. Compute the MDL metric C_c
    3. Form all combinations of two-subset merges, e.g., {([1,2], [3], ..., [l]), ([1,3], [2], ..., [l]), ..., ([1], [2], ..., [l-1, l])}
    4. Compute the MDL metric for all model candidates
    5. Select the candidate with the minimum MDL metric
    6. If it improves on the current C_c, accept the model and go to step 2; else the current model is optimal
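The greedy search can be sketched as below for binary genes; the MDL metric from the previous slide is repeated here (under the same Harik-style accounting assumptions) so the sketch is self-contained:

```python
import math
from collections import Counter

def mdl(population, partitions):
    """MDL metric C_c = C_m + C_p over a binary population."""
    n = len(population)
    cost = 0.0
    for group in partitions:
        cost += math.log2(n + 1) * (2 ** len(group) - 1)               # C_m
        counts = Counter(tuple(ind[i] for i in group) for ind in population)
        cost += sum(-n * (c / n) * math.log2(c / n) for c in counts.values())  # C_p
    return cost

def greedy_mpm(population, n_genes):
    """Greedy MPM search: start fully independent, repeatedly apply the pairwise
    merge that lowers the MDL metric most, and stop when no merge improves it."""
    model = [(i,) for i in range(n_genes)]
    best = mdl(population, model)
    while len(model) > 1:
        candidates = []
        for i in range(len(model)):
            for j in range(i + 1, len(model)):
                merged = [g for k, g in enumerate(model) if k not in (i, j)]
                merged.append(model[i] + model[j])
                candidates.append((mdl(population, merged), merged))
        score, merged = min(candidates, key=lambda t: t[0])
        if score >= best:        # no merge improves: current model is optimal
            break
        model, best = merged, score
    return model
```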
  • 15. Extended Compact Genetic Algorithm.
    1. Initialize the population (usually random initialization)
    2. Evaluate the fitness of the individuals
    3. Select promising solutions (e.g., tournament selection)
    4. Build the probabilistic model: optimize the structure & parameters to best fit the selected individuals (automatic identification of sub-structures)
    5. Sample the model to create new candidate solutions (effective exchange of building blocks)
    6. Repeat steps 2-5 until some convergence criteria are met
  • 16. Models Built by the ecGA. Use the model-building procedure of the extended compact GA: partition the genes into (mutually) independent groups, start with the lowest-complexity model, and search for the least-complex, most-accurate model.
    Model structure (metric):
    [X0] [X1] [X2] [X3] [X4] [X5] [X6] [X7] [X8] [X9] [X10] [X11] (1.0000)
    [X0] [X1] [X2] [X3] [X4X5] [X6] [X7] [X8] [X9] [X10] [X11] (0.9933)
    [X0] [X1] [X2] [X3] [X4X5X7] [X6] [X8] [X9] [X10] [X11] (0.9819)
    [X0] [X1] [X2] [X3] [X4X5X6X7] [X8] [X9] [X10] [X11] (0.9644)
    ...
    [X0] [X1] [X2] [X3] [X4X5X6X7] [X8X9X10X11] (0.9273)
    ...
    [X0X1X2X3] [X4X5X6X7] [X8X9X10X11] (0.8895)
  • 17. Modifying the ecGA for Rule Learning. Rules are described using χ-ary alphabets {0, 1, #}, so the eCCS uses a χ-ary version of the ecGA (Sastry and Goldberg, 2003; de la Ossa, Sastry, and Lobo, 2006). Maximally general and maximally accurate rules may be obtained using f(r) = α(r) · γ(r). Multiple rules need to be maintained within a run (niching), and we need an efficient niching method that does not adversely affect the quality of the probabilistic models: restricted tournament replacement (Harik, 1995).
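Restricted tournament replacement can be sketched as follows. `window` is the RTR window size (the number of randomly drawn members the candidate competes against); the function and parameter names are mine:

```python
import random

def rtr_insert(population, fitness, candidate, window, rng=random):
    """Restricted tournament replacement: the candidate competes against the
    most similar of `window` randomly drawn members, replacing it only if fitter."""
    def hamming(a, b):
        return sum(x != y for x, y in zip(a, b))
    drawn = rng.sample(range(len(population)), window)
    closest = min(drawn, key=lambda i: hamming(population[i], candidate))
    if fitness(candidate) > fitness(population[closest]):
        population[closest] = candidate
```

Because a new individual can only displace a genotypically similar one, distinct optimal rules occupy separate niches instead of overwriting each other, which is what lets a single run retain all of [O].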
  • 18. Experiments. Goals: (1) is linkage learning useful to solve the multiplexer problem using Pittsburgh LCS, and (2) how far can we push it? Multiplexer problems: the address bits determine which input to use, so there is an underlying structure, isn't there? The largest multiplexer solved so far using Pittsburgh approaches (11-input) required matching all the examples, with no linkage learning available. We borrowed the population-sizing theory from the ecGA.
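For reference, the k-address-bit multiplexer itself is easy to state in code; the input length must be k + 2^k, i.e., 3, 6, 11, 20, ...:

```python
def multiplexer(bits):
    """k-address-bit multiplexer: the first k bits, read as a binary number,
    select which of the following 2**k data bits is the output."""
    k = 1
    while k + 2 ** k < len(bits):
        k += 1
    assert k + 2 ** k == len(bits), "input length must be k + 2**k"
    address = int("".join(str(b) for b in bits[:k]), 2)
    return bits[k + address]
```

This address/data split is the underlying structure the slide alludes to: which data bit matters depends on the address bits, which is precisely a linkage relation between positions.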
  • 19. eCCS Models for Different Multiplexers. The building block size increases.
  • 20. eCCS Scalability. Follows the facet-wise theory: (1) grows exponentially with the number of address bits (the building block size), and (2) quadratically with the problem size.
  • 21. Conclusions. The eCCS builds on competent GAs, and the facet-wise models from GA theory hold. The eCCS is able to: (1) perform linkage learning to improve the scalability of the rule-learning process, and (2) evolve [O] in a single run. The eCCS shows the need for linkage learning in Pittsburgh LCS to effectively solve multiplexer problems; it solved the 20-input, 37-input, and 70-input multiplexer problems for the first time using a Pittsburgh LCS.
  • 22. Linkage Learning for Pittsburgh LCS: Making Problems Tractable. Xavier Llorà, Kumara Sastry, & David E. Goldberg, Illinois Genetic Algorithms Lab, University of Illinois at Urbana-Champaign, {xllora,kumara,deg}@illigal.ge.uiuc.edu
