
GA by Prashant & Vivek

Date post: 06-Apr-2018
Upload: patkar-college

Transcript
  • 8/3/2019 GA by Prashant & Vivek

    1/23

    New Approach for Classification of Protein by Genetic Algorithm

    By Prashant Makwana and Vivek Soni

  • 2/23

    Introduction

    • Why do we need classification?
        • The amount of protein data is huge and still increasing
        • To determine structure and function
        • For database management (DBMS) purposes

  • 3/23

    What is GA?

    • A class of probabilistic optimization algorithms
    • Inspired by the biological evolution process
    • Uses the concepts of natural selection and genetic inheritance (Darwin, 1859)
    • Originally developed by John Holland (1975)

  • 4/23

    Cont.

    • Definition: a class of evolutionary algorithm (EA) that generates useful solutions to optimization problems using techniques inspired by natural evolution, such as inheritance, mutation, selection, and crossover.
    • Genetic algorithms find application in bioinformatics, phylogenetics, computational science, economics, manufacturing, mathematics, physics, and other fields.

  • 5/23

    Classes of Search Techniques

    Search techniques
        • Calculus-based techniques
            • Fibonacci
            • Sort
        • Guided random search techniques
            • Tabu search
            • Hill climbing
            • Simulated annealing
            • Evolutionary algorithms
                • Genetic programming
                • Genetic algorithms
        • Enumerative techniques
            • BFS
            • DFS
            • Dynamic programming

  • 6/23

    GA Overview

    • GA terms
        • Population of strings (chromosomes, i.e. the genotypes of the genome)
        • Fitness function
    • Initialization
    • Selection
    • Reproduction
    • Termination

  • 7/23

    Cont.

    • Initialization
        • The initial population of chromosomes is generated at random
    • Selection
        • A proportion of the existing population is selected to breed a new generation
        • Selection is based on the fitness function
    • Reproduction
        • A second-generation population of solutions is generated from the selected individuals through the genetic operators: crossover (also called recombination) and/or mutation.
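The three steps on this slide can be sketched in Python as follows. This is only an illustration, not the authors' implementation: the bit-string encoding, the function names, fitness-proportionate (roulette-wheel) selection, single-point crossover, and bit-flip mutation are all assumptions chosen for simplicity.

```python
import random


def init_population(size, length, rng):
    """Initialization: create `size` random bit-string chromosomes."""
    return [[rng.randint(0, 1) for _ in range(length)] for _ in range(size)]


def select(population, fitness, k, rng):
    """Selection: draw k parents with probability proportional to fitness.

    Assumes `fitness` returns non-negative numbers.
    """
    weights = [fitness(ind) for ind in population]
    if sum(weights) == 0:
        weights = None  # all-zero fitness: fall back to uniform selection
    return rng.choices(population, weights=weights, k=k)


def crossover(a, b, rng):
    """Single-point crossover of two equal-length parents (length >= 2)."""
    point = rng.randint(1, len(a) - 1)
    return a[:point] + b[point:], b[:point] + a[point:]


def mutate(chromosome, rate, rng):
    """Bit-flip mutation: flip each gene independently with probability `rate`."""
    return [1 - g if rng.random() < rate else g for g in chromosome]
```

Passing a seeded `random.Random` instance as `rng` keeps runs reproducible, which helps when comparing operator settings.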

  • 8/23

    Cont.

    • Termination
        • This generational process is repeated until a termination condition has been reached. Common terminating conditions are:
            1. A solution is found that satisfies minimum criteria
            2. A fixed number of generations is reached
            3. The allocated budget (computation time/money) is reached
            4. The highest-ranking solution's fitness is reaching or has reached a plateau, such that successive iterations no longer produce better results
            5. Manual inspection
            6. Combinations of the above
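The automatable conditions in this list (1 to 4) can be combined into a single check, which also covers condition 6. The sketch below is an assumption-laden illustration: the function name `should_terminate`, its parameters, and the plateau test (no improvement over a fixed window of generations) are hypothetical and not taken from the slides. Manual inspection (condition 5) is left to the user.

```python
import time


def should_terminate(best_history, generation, start_time, *,
                     target_fitness=None, max_generations=None,
                     time_budget_s=None, plateau_window=None):
    """Return True if any enabled termination condition holds.

    best_history: best fitness of each generation so far (higher is better).
    start_time: value of time.monotonic() when the run started.
    """
    best = max(best_history) if best_history else None
    # 1. A solution satisfies the minimum criteria.
    if target_fitness is not None and best is not None and best >= target_fitness:
        return True
    # 2. Fixed number of generations reached.
    if max_generations is not None and generation >= max_generations:
        return True
    # 3. Allocated computation-time budget reached.
    if time_budget_s is not None and time.monotonic() - start_time >= time_budget_s:
        return True
    # 4. Plateau: the last `plateau_window` generations brought no improvement.
    if plateau_window is not None and len(best_history) > plateau_window:
        recent = max(best_history[-plateau_window:])
        earlier = max(best_history[:-plateau_window])
        if recent <= earlier:
            return True
    return False
```

Leaving a parameter at `None` disables that condition, so any combination can be switched on per run.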

  • 9/23

    GA procedure

    1. Choose the initial population of individuals
    2. Evaluate the fitness of each individual in that population
    3. Repeat on this generation until termination (time limit, sufficient fitness achieved, etc.):
        • Select the best-fit individuals for reproduction
        • Breed new individuals through crossover and mutation operations to give birth to offspring
        • Evaluate the individual fitness of the new individuals
        • Replace the least-fit individuals in the population with the new individuals
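The procedure above maps onto a short loop. The following is a minimal sketch under stated assumptions, not the authors' code: chromosomes are bit strings, the best-fit individuals are chosen by simple truncation, and the function name `run_ga` and all default parameter values are hypothetical.

```python
import random


def run_ga(fitness, length, pop_size=20, n_parents=10,
           mutation_rate=0.05, max_generations=100, target=None, seed=0):
    """Return (best_fitness, best_individual) for a bit-string GA."""
    rng = random.Random(seed)
    # 1. Choose the initial population of individuals.
    population = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(pop_size)]
    # 2. Evaluate the fitness of each individual (kept sorted, best first).
    scored = sorted(((fitness(ind), ind) for ind in population),
                    key=lambda pair: pair[0], reverse=True)
    # 3. Repeat until termination (generation limit or target fitness).
    for _ in range(max_generations):
        if target is not None and scored[0][0] >= target:
            break
        # Select the best-fit individuals for reproduction.
        parents = [ind for _, ind in scored[:n_parents]]
        # Breed offspring through single-point crossover and bit-flip mutation.
        offspring = []
        while len(offspring) < n_parents:
            a, b = rng.sample(parents, 2)
            point = rng.randint(1, length - 1)
            child = a[:point] + b[point:]
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            offspring.append(child)
        # Evaluate the fitness of the new individuals.
        scored_offspring = [(fitness(c), c) for c in offspring]
        # Replace the least-fit individuals with the new individuals.
        survivors = scored[:pop_size - len(offspring)]
        scored = sorted(survivors + scored_offspring,
                        key=lambda pair: pair[0], reverse=True)
    return scored[0]
```

As a toy objective, `run_ga(sum, length=12, target=12)` maximizes the number of ones in a 12-bit string. Because the fittest survivors are always kept, the best fitness never decreases from one generation to the next.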

  • 10/23

    Our Work

    • Protein: olfactory receptor
    • Organism: human
    • Database: UniProt
    • Datasets: 5 sets of 50 FASTA sequences each

  • 11/23

    Cont.

    1. Find the conserved domains of each protein sequence using PROSITE
    2. Perform a multiple sequence alignment (MSA) for each set
    3. Develop a PSSM (position-specific scoring matrix) for each set
    4. Apply the GA operators: mutation and crossover
    5. Generate new sets with the GA operators
    6. Calculate the determinant for each set
    7. Calculate the fitness for each set
    8. Repeat steps 4, 5, 6, and 7 until termination
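Step 3 can be illustrated with a small log-odds PSSM built directly from an alignment. This is a sketch under assumptions the slides do not state: a uniform background frequency of 1/20, a pseudocount of 1 per residue, gaps ignored, and hypothetical function names `build_pssm` and `score`. The determinant and fitness steps are not sketched, because the slides do not specify how they are computed from the matrix.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(alignment, pseudocount=1.0):
    """Return a PSSM as a list of {residue: log-odds score}, one dict per column.

    alignment: equal-length aligned sequences; '-' marks a gap and is ignored.
    """
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    background = 1.0 / len(AMINO_ACIDS)  # assumption: uniform background
    pssm = []
    for col in range(length):
        # Start every residue at the pseudocount to avoid log(0).
        counts = {aa: pseudocount for aa in AMINO_ACIDS}
        for seq in alignment:
            residue = seq[col].upper()
            if residue in counts:
                counts[residue] += 1.0
        total = sum(counts.values())
        pssm.append({aa: math.log2(counts[aa] / total / background)
                     for aa in AMINO_ACIDS})
    return pssm


def score(pssm, sequence):
    """Sum the column scores of an ungapped sequence of the same length."""
    return sum(column[res] for column, res in zip(pssm, sequence.upper()))
```

Positive entries mark residues that occur more often in a column than the background predicts, so conserved positions stand out as large positive scores.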

  • 12/23

    Cont.

    • The final constant values (fitness) are then compared with the fitness of a new set of sequences to check the family criteria.

    [Diagram: Set 1, Set 2, Set 3, ..., Set n → range of final constant values for the given protein family; the fitness of the new sequence set is compared with this range → the new sequence belongs to the given protein family]
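The comparison on this slide amounts to a range check. The sketch below assumes the family range is simply the minimum and maximum of the final fitness values of the training sets; the function names and the optional `tolerance` parameter are hypothetical, and the slides do not say how the range is actually derived.

```python
def family_range(set_fitness_values):
    """Range (min, max) spanned by the final fitness values of the sets."""
    return min(set_fitness_values), max(set_fitness_values)


def belongs_to_family(new_fitness, set_fitness_values, tolerance=0.0):
    """True if the new fitness falls inside the family range (plus tolerance)."""
    low, high = family_range(set_fitness_values)
    return low - tolerance <= new_fitness <= high + tolerance
```

With placeholder values such as `[0.82, 0.91, 0.87, 0.85, 0.89]` for five sets (illustrative numbers, not results from this work), a new fitness of 0.86 would be accepted and 0.50 rejected.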

  • 13/23

    Reason for each step

    • Use of PROSITE
        • To determine the conserved domain of a given sequence
        • The conserved domain is selected as the parameter for classification
    • Use of PSSM
        • To obtain numerical values for the given sets, which are then used by the GA
    • Use of mutation and crossover
        • To cover the maximum possible variation within the given sets

  • 14/23

    Why GA? Advantages of GA

    • It solves problems with multiple solutions.
    • A genetic algorithm is easy to understand and requires practically no knowledge of mathematics.
    • Genetic algorithms are easily transferred to existing simulations and models.

  • 15/23

    Disadvantages

    • Certain optimization problems cannot be solved by means of genetic algorithms. This occurs when the fitness function is poorly known.
    • GAs require a large number of response (fitness) function evaluations, depending on the number of individuals and the number of generations.
    • A GA is usually slower than traditional techniques.
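The evaluation cost mentioned above can be estimated with simple arithmetic. The helper below is a hypothetical sketch; it assumes the initial population is evaluated once and that only the offspring are evaluated in each later generation.

```python
def fitness_evaluations(pop_size, generations, offspring_per_generation=None):
    """Estimate the total number of fitness-function calls in one GA run."""
    if offspring_per_generation is None:
        # Assumption: full generational replacement.
        offspring_per_generation = pop_size
    return pop_size + generations * offspring_per_generation
```

For illustration, 50 individuals over 100 generations with full replacement need 50 + 100 × 50 = 5,050 evaluations, so an expensive fitness function quickly dominates the run time.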

  • 16/23

    Future of GA in Bioinformatics

    • The future approach for our work will be to consider more parameters for protein classification.
    • Protein structure prediction by combining GA with other algorithms
    • Immune system models: GAs have been used to model various aspects of the natural immune system, including somatic mutation during an individual's lifetime and the discovery of multi-gene families during evolutionary time.

  • 17/23

    Cont.

    • Ecological models: GAs have been used to model ecological phenomena such as host-parasite co-evolution and symbiosis.

  • 18/23

    Conclusion

  • 19/23

    References

    • Jin Xiong, Essential Bioinformatics
    • http://en.wikipedia.org/wiki/Genetic_algorithm
    • http://www.informatics.indiana.edu/fil/CAS/PPT/Davis/sld001.htm
    • http://fasta.bioch.virginia.edu/fasta_www2/chaps.cgi

  • 20/23

    UniProt (Olfactory Receptor)

  • 21/23

    CHAPS

  • 22/23

    PSSM
