7.Genalgo Application

Date post: 04-Apr-2018
Upload: srikar-chintala

  • 7/30/2019 7.Genalgo Application

    Part II: Applications of GAs

    GA and the Internet

    Genetic search based on multiple mutation approaches

    GAs are useful and efficient when

    The search space is large, complex, or poorly understood

    Domain knowledge is scarce, or expert knowledge is difficult to encode to narrow the search space

    No mathematical analysis is available

    Traditional search methods fail

    For problem solving and for modeling

    Applications

    GAs are applied to many scientific and engineering problems, as well as in business and entertainment, including:

    1. Optimization: GAs are used in a wide variety of optimization tasks, including numerical optimization and combinatorial problems such as the Traveling Salesman Problem, the Job Scheduling Problem, and video and sound quality optimization.

    2. Automatic programming: GAs are used to evolve or generate computer programs for a specific task automatically.

    3. Machine and robot learning

    4. Models of social systems

    5. Interactions between evolution and learning

    Some Applications of GAs

    Internet search

    Data mining

    Software guided circuit design

    Control systems design

    Stock price prediction

    Path finding

    Mobile robots

    Search

    Optimization

    Trend spotting

    Algorithm Phases

    Process set of URLs given by user

    Select all links from input set

    Evaluate fitness function for all genomes

    Perform crossover, mutation, and reproduction

    Satisfactory solution obtained?

    The End
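
The phases above can be sketched as a simple loop. This is only an illustration under stated assumptions: genomes are taken to be URLs, and `get_links`, `fitness`, `mutate`, `population_size`, and `good_enough` are hypothetical names supplied by the caller, not part of the original system. Following links from the surviving URLs stands in for crossover here.

```python
def genetic_search(input_urls, get_links, fitness, mutate,
                   generations=5, population_size=4, good_enough=0.9):
    """Sketch of the phases: collect links, score, select, vary, repeat."""
    # Process the user's URLs and select all links from the input set.
    population = set(input_urls)
    for url in input_urls:
        population.update(get_links(url))
    best = []
    for _ in range(generations):
        # Evaluate the fitness function for all genomes (URLs).
        ranked = sorted(population, key=fitness, reverse=True)
        if not ranked:
            break
        # Reproduction: only the fittest genomes survive.
        best = ranked[:population_size]
        # Satisfactory solution obtained? If so, stop.
        if fitness(best[0]) >= good_enough:
            break
        # Crossover stand-in: follow links from the survivors.
        population = set(best)
        for url in best:
            population.update(get_links(url))
        # Mutation: inject URLs that are not reachable through links.
        population.update(mutate(best))
    return best
```

A call passes plain functions, for example `genetic_search(["a"], lambda u: links.get(u, []), lambda u: scores.get(u, 0.0), lambda best: [])` over in-memory dictionaries `links` and `scores`.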

    Introduction

    GA can be used for intelligent internet search.

    GA is used in cases when the search space is relatively large.

    GA is an adaptive search method.

    GA is a heuristic search method.

    System for GA Internet Search

    Designed at the Faculty of Electrical Engineering, University of Belgrade

    [Figure: system architecture. Components: Control Program, Agent, Spider, Generator, Topic, Space, Time. Data: input set, current set, output set, top data, net data.]

    Spider

    Spider is a software package that picks up Internet documents from user-supplied input, to a depth specified by the user.

    Spider takes one URL and fetches all links, and the documents they contain, up to a predefined depth.

    The fetched documents are stored on the local hard disk with the same structure as at the original location.

    Spider's task is to produce the first generation.

    Spider is used during crossover and mutation.
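
A depth-limited fetch like the one Spider performs can be sketched as a breadth-first traversal. This is a minimal sketch, not the original package: `fetch_links` is a hypothetical callback that returns the links found in a document, and storing the documents on disk is left out.

```python
from collections import deque

def spider(start_url, depth, fetch_links):
    """Breadth-first fetch of every URL reachable within `depth` link hops."""
    seen = {start_url}
    queue = deque([(start_url, 0)])
    fetched = []
    while queue:
        url, level = queue.popleft()
        fetched.append(url)
        if level == depth:
            continue  # depth limit reached: do not follow this page's links
        for link in fetch_links(url):
            if link not in seen:
                seen.add(link)
                queue.append((link, level + 1))
    return fetched
```

With `depth=1` the function returns the start URL plus the pages it links to directly, which matches how Agent is described as calling Spider.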

    Agent

    Agent takes as input a set of URLs and calls Spider for every one of them, with depth 1.

    Then Agent extracts keywords from each document and stores them on the local hard disk.
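
The slides do not say how Agent extracts keywords. One simple possibility, shown here purely as an assumption, is to count word frequencies and keep the most frequent words that are not stop words. The names `STOP_WORDS` and `extract_keywords` are illustrative.

```python
import re
from collections import Counter

# A tiny illustrative stop-word list; a real one would be much longer.
STOP_WORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "for", "on"}

def extract_keywords(text, top_n=3):
    """Return the `top_n` most frequent non-stop-words in `text`."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    return [word for word, _ in counts.most_common(top_n)]
```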

    Generator

    Generator generates a set of URLs from given keywords, using a conventional search engine.

    It takes as input the desired topic, calls the Yahoo search engine, and submits a query looking for all documents covering the specific topic.

    Generator stores the URL and topic of each Web page found in a database called topdata.

    Topic

    Topic uses the topdata DB in order to insert random URLs from the database into the current set.

    Topic performs mutation.
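
Topic mutation can be sketched as follows, assuming topdata is available as a list of `(url, topic)` pairs. The `rate` parameter (the fraction of the current set size to add) and the `rng` argument are assumptions made for the example.

```python
import random

def topic_mutation(current_set, topdata, topic, rate=0.5, rng=random):
    """Insert random topdata URLs for `topic` into a copy of the current set."""
    # Only URLs stored under the same topic and not already present qualify.
    candidates = [url for url, t in topdata
                  if t == topic and url not in current_set]
    n_new = min(len(candidates), int(rate * len(current_set)))
    return set(current_set) | set(rng.sample(candidates, n_new))
```

Passing a seeded `random.Random` instance as `rng` makes a run reproducible, which helps when comparing mutation rates.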

    Space

    Space takes as input the current set from the Agent application and injects into it those URLs from the netdata database that appeared with the greatest frequency in the output sets of previous searches.

    Time

    Time takes a set of URLs from Agent and inserts the ones with the greatest frequency into the netdata DB.

    The netdata DB consists of three fields: URL, topic, and count.

    The DB is updated in each algorithm iteration.
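
The interplay between Time (which writes to netdata) and Space (which reads from it) can be sketched with an in-memory counter. The class `NetData`, its method names, and the `top_n` cutoff are assumptions for illustration; the original uses a database with the three fields listed above.

```python
from collections import Counter

class NetData:
    """In-memory stand-in for the netdata DB: (URL, topic) -> count."""

    def __init__(self):
        self.counts = Counter()

    def time_update(self, urls, topic, top_n=2):
        """Time: record the most frequent URLs of this iteration's set."""
        for url, n in Counter(urls).most_common(top_n):
            self.counts[(url, topic)] += n

    def space_inject(self, current_set, topic, top_n=2):
        """Space: add the historically most frequent URLs for `topic`."""
        by_topic = Counter({url: n for (url, t), n in self.counts.items()
                            if t == topic})
        frequent = {url for url, _ in by_topic.most_common(top_n)}
        return set(current_set) | frequent
```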

    How Does The System Work?

    [Figure: system architecture annotated with command flow and data flow between the Control Program, Agent, Spider, Generator, Topic, Space, and Time, and the input set, current set, output set, top data, and net data.]

    GA and the Internet: Conclusion

    GA for Internet search, in contrast to other GAs, is much faster and more efficient than conventional solutions, such as standard Internet search engines.

    Genetic Search Based on Multiple Mutation Approaches

    Concept and its improvements adapted to specific applications in e-business, and a concrete software package

    Main problems in finding information on the Internet:

    How to quickly find and efficiently retrieve potentially useful information, given the fast growth in the quantity and variety of Internet sites

    Searches with indexing engines return a huge number of documents, many of which are completely unrelated to what the user originally attempted to find

    Documents placed at the top of the result list are often less acceptable than the lower ones

    The indexing process may take days, weeks, or even longer, because of the volume of new information being created daily

    Links Based Approach

    The question is:

    How to locate and retrieve the needed information before it gets indexed?

    The efficient way to locate new, not-yet-indexed information:

    Using links-based approaches:

    - genetic search

    - simulated annealing

    Best result:

    indexing-based approaches + links-based approaches

    Genetic Search Algorithm

    GENETIC ALGORITHM OF ZERO ORDER, with no mutation

    Start:

    A model Web presentation that contains all the needed types of information (the fitness function is evaluated).

    It is assumed that it includes URL pointers to other similar Web presentations, and these are downloaded.

    The Web presentations that survive the fitness function are assumed to include additional URL pointers, and their related Web presentations are downloaded next.

    After the end-of-search condition is met, the Web presentations are ranked according to their fitness value.
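
The zero-order algorithm (link following with no mutation) can be sketched as below. The survival `threshold`, the `max_pages` limit used as the end-of-search condition, and the callbacks `get_links` and `fitness` are assumptions made for the example; the slides do not fix any of them.

```python
def zero_order_search(model_url, get_links, fitness,
                      threshold=0.5, max_pages=50):
    """Follow links only from pages that survive the fitness test."""
    scores = {model_url: fitness(model_url)}
    frontier = [model_url]
    while frontier and len(scores) < max_pages:
        next_frontier = []
        for url in frontier:
            for link in get_links(url):
                if link in scores or len(scores) >= max_pages:
                    continue
                scores[link] = fitness(link)      # download and evaluate
                if scores[link] >= threshold:     # survivors are expanded next
                    next_frontier.append(link)
        frontier = next_frontier
    # End of search: rank all downloaded presentations by fitness value.
    return sorted(scores, key=scores.get, reverse=True)
```

Pages that fail the fitness test are still ranked, but their links are never followed, so anything reachable only through them is missed. That gap is what mutation is meant to address.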

    Genetic Search Algorithm

    Types of mutation:

    Topic-oriented database mutation

    Semantic mutations

    - based on the principles of spatial locality

    - based on the principles of temporal locality

    Logical reasoning and semantic considerations are involved in picking out URLs for mutation.

    Innovations Required by Domain Area

    APPLICATION LEVEL

    LEVEL OF THE GENERAL PROJECT APPROACH AND PRODUCT ARCHITECTURE

    ALGORITHMIC LEVEL

    IMPLEMENTATION LEVEL

    Application Level

    Statistical analysis and data mining have to be performed in order to figure out the common and typical patterns of behavior and need

    The state of the art of mutual referencing has to be determined

    The trends and asymptotic situations foreseen for the time of project finalization have to be determined

    Level of the General Project Approach and Product Architecture

    Decisions have to be made about the most important goals to be achieved:

    Maximizing the speed of search

    Maximizing the sophistication of search

    Maximizing specific effects of interest for a given institution or a customer

    Maximizing a combination of the above

    Decisions on this level affect the applicability of the final product/tool.

    Algorithmic Level

    Develop an efficient mutation algorithm of interest for the application

    in the direction of database architecture and design

    in introducing the elements of semantic-based mutation

    Semantics-based mutations are especially of interest for chaotic markets, typical of new markets in developed countries or traditional markets in under-developed countries.

    Semantics-based Mutation

    Mutation based on spatial localities

    After a fruitful Web presentation is reached (using a traditional algorithm with mutation), the site of the same Internet service provider is searched for other presentations on the same or a similar topic

    Explanation:

    In chaotic markets, it is very unlikely that service/product offers from the same small geographic area link to each other on their Web presentations

    After a successful side trip based on spatial mutation, one continues with the traditional database mutation.
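
Spatial mutation can be illustrated by selecting candidate URLs that share a host with the fruitful page. Treating "same host name" as a proxy for "same Internet service provider" is an assumption of this sketch, as are the names `spatial_mutation` and `known_urls`; the topic check on the candidates is left out.

```python
from urllib.parse import urlsplit

def spatial_mutation(fruitful_url, known_urls):
    """Return other known URLs served from the same host as `fruitful_url`."""
    host = urlsplit(fruitful_url).hostname
    return [u for u in known_urls
            if u != fruitful_url and urlsplit(u).hostname == host]
```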

    Semantics-based Mutation

    Mutation based on temporal localities

    One comes back periodically to a Web presentation which was fruitful in the past

    One comes back periodically to other Web presentations developed by the author who created some fruitful Web presentations in the past

    Temporal mutation can use direct revisits or a number of indirect forms of revisit.
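
The direct-revisit form of temporal mutation can be sketched as a periodic lookup in a history of past fitness scores. The `history` mapping, the `period`, and the `min_fitness` cutoff for "fruitful" are all assumptions of this example; revisiting other pages by the same author is not shown.

```python
def temporal_mutation(history, iteration, period=3, min_fitness=0.7):
    """Every `period` iterations, return past fruitful URLs to revisit."""
    if iteration % period != 0:
        return []  # not a revisit iteration
    return [url for url, score in history.items() if score >= min_fitness]
```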

    Implementation Level

    Utilization of novel technologies, for maximal performance and minimal implementation complexity

    Important for:

    - good flexibility

    - extendibility

    - reliability

    - availability

    Utilization of mobile platforms and mobile agents

    Implementation Level

    Static agents

    - one has to download megabytes of information

    - treat that information with decision-making code whose size is measured in kilobytes

    - derive the final business-related decision, which is binary in size (one bit: yes or no)

    A huge amount of data is transferred through the network in vain, because only a small percentage of the fetched documents will turn out to be useful

    Mobile agents

    - they would browse through the network and perform the search locally, on the remote servers, transferring only the needed documents and data

    - they load the network only with kilobytes and a single bit

    Simulation Result

    Links-based approach in the static domain

    How various mutation strategies can affect the search efficiency

    A set of software packages has been developed to perform Internet search using genetic algorithms (by Veljko Milutinovic, Dragana Cvetkovic, and Jelena Mirkovic)

    As the fitness function, they measured the average Jaccard's score for the output documents, while changing the type and rate of mutation
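
The Jaccard score of two sets is the size of their intersection divided by the size of their union. The sketch below averages it over the output documents. Comparing keyword sets against a single reference keyword set is an assumption here, since the slides do not say what the output documents are compared with.

```python
def jaccard(a, b):
    """Jaccard score |A & B| / |A | B| of two collections, as sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0  # convention chosen here for two empty sets
    return len(a & b) / len(a | b)

def average_jaccard(reference_keywords, documents_keywords):
    """Mean Jaccard score of each document's keywords against the reference."""
    if not documents_keywords:
        return 0.0
    total = sum(jaccard(reference_keywords, d) for d in documents_keywords)
    return total / len(documents_keywords)
```

For example, `{"ga", "search", "web"}` and `{"ga", "web", "robot"}` share two of four distinct keywords, so their score is 0.5.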

    Simulation Result

    The simulation result for topic mutation

    The simulation result for temporal and spatial mutation combined with topic mutation

    Simulation Result

    The simulation result for topic, spatial, and temporal mutation combined.

    Constant increase in the quality of pages found.

    Conclusion: Evolution

    Tutorial download: galeb.etf.bg.ac.yu/~vm (option: Tutorials)
