
An Extended Dead-End Elimination Algorithm to Determine Gap-Free Lists of Low Energy States

Date post: 31-Dec-2015
Upload: imani-short
Description:
An Extended Dead-End Elimination Algorithm to Determine Gap-Free Lists of Low Energy States. EDDA KLOPPMANN, G. MATTHIAS ULLMANN, TORSTEN BECKER. Improved Pruning algorithms and Divide-and-Conquer strategies for Dead-End Elimination, with application to protein design. - PowerPoint PPT Presentation
Transcript
Page 1: An Extended Dead-End Elimination Algorithm to Determine Gap-Free Lists of Low Energy States

An Extended Dead-End Elimination Algorithm to Determine Gap-Free Lists of Low Energy States

EDDA KLOPPMANN, G. MATTHIAS ULLMANN, TORSTEN BECKER

Page 2: An Extended Dead-End Elimination Algorithm to Determine Gap-Free Lists of Low Energy States

Improved Pruning algorithms and Divide-and-Conquer strategies for Dead-End Elimination, with application to protein design

Ivelin Georgiev, Ryan H. Lilien, Bruce R. Donald

2006

Page 3: An Extended Dead-End Elimination Algorithm to Determine Gap-Free Lists of Low Energy States

Dead End Elimination Motivation

• Structure determines function

• Lowest free energy state is most probable by laws of thermodynamics

• Direct calculation rarely possible

So:

• Conformation space is discretized

• Allows for exhaustive search

• Desire for an algorithm which deterministically finds the lowest energy state while circumventing combinatorial exhaustion

Page 4: An Extended Dead-End Elimination Algorithm to Determine Gap-Free Lists of Low Energy States

DEE Overview (Desmet et al., 1992)

• Originally applied to predict side chain positions in homology modeling

• Views proteins as a set of residues (sites), each of which may adopt a finite number of rotamers (forms)

• DEE identifies the highest energy forms of sites which are incompatible with the state of lowest energy

• High energy forms are considered dead-ends and pruned from consideration

Page 5: An Extended Dead-End Elimination Algorithm to Determine Gap-Free Lists of Low Energy States

DEE Overview Continued

• DEE solves the combinatorial problem of identifying the global energy minimum for a discrete system with pairwise interactions

• Energy is expressed in terms of intrinsic energies of sites and pairwise interactions between sites

• Each site adopts a discrete form that determines its contribution to the total energy
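As a rough illustration of the energy model on this slide, the sketch below sums intrinsic and pairwise terms for one state and finds the minimum by exhaustive enumeration, which is the baseline DEE is meant to avoid. The function names, the nested-list data layout, and all numbers are assumptions made for the example; none of them come from the papers.

```python
from itertools import product


def state_energy(state, intrinsic, pair):
    """Total energy of one state: intrinsic terms plus pairwise interactions.

    state[mu] is the form adopted by site mu.
    intrinsic[mu][f] is the intrinsic energy of form f at site mu.
    pair[(mu, nu)][f][g] (mu < nu) is the interaction of form f at mu with g at nu.
    """
    n = len(state)
    energy = sum(intrinsic[mu][state[mu]] for mu in range(n))
    for mu in range(n):
        for nu in range(mu + 1, n):
            energy += pair[(mu, nu)][state[mu]][state[nu]]
    return energy


def brute_force_minimum(intrinsic, pair):
    """Exhaustive search over all states; cost grows exponentially with sites."""
    forms = [range(len(site)) for site in intrinsic]
    return min(product(*forms), key=lambda s: state_energy(s, intrinsic, pair))


# Hypothetical toy system: 3 sites, 2 forms each (made-up energies).
intrinsic = [[0.0, 1.0], [0.5, 0.0], [0.0, 0.2]]
pair = {
    (0, 1): [[0.0, 2.0], [-1.0, 0.0]],
    (0, 2): [[0.0, 0.0], [0.0, 0.5]],
    (1, 2): [[0.3, 0.0], [0.0, -0.4]],
}
best = brute_force_minimum(intrinsic, pair)
print(best, state_energy(best, intrinsic, pair))
```

With N sites of two forms each, the enumeration visits 2^N states, which is why pruning matters for realistic systems.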

Page 6: An Extended Dead-End Elimination Algorithm to Determine Gap-Free Lists of Low Energy States

DEE Theory

• DEE identifies and eliminates forms of sites which cannot contribute to the lowest energy conformation in order to circumvent an exhaustive search

• The DEE criterion employs rotameric energy interactions to identify and prune rotamers that are provably not part of the global minimum energy conformation (GMEC).

• DEE criterion compares the energy of two forms of a site μ, dμ and cμ

• If all states that contain dμ are higher in energy than the corresponding states that contain cμ, dμ is a dead end and is removed from consideration
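The comparison of dμ against cμ can be checked without enumerating states. The sketch below uses one common sufficient test, the form usually attributed to Goldstein: the intrinsic energy difference plus, for every other site, the smallest possible difference in interaction energy must be positive. The helper names and the toy numbers are hypothetical, and the slides do not state which variant of the criterion the authors implement.

```python
def interaction(pair, mu, f, nu, g):
    """Pairwise energy of form f at site mu with form g at site nu.

    pair is keyed by (smaller site index, larger site index).
    """
    if mu < nu:
        return pair[(mu, nu)][f][g]
    return pair[(nu, mu)][g][f]


def is_dead_end(mu, d, c, intrinsic, pair):
    """True if every state with form d at site mu is higher in energy than
    the corresponding state with form c at site mu."""
    gap = intrinsic[mu][d] - intrinsic[mu][c]
    for nu in range(len(intrinsic)):
        if nu == mu:
            continue
        # Worst case for d: the form g at site nu that favours d the most.
        gap += min(
            interaction(pair, mu, d, nu, g) - interaction(pair, mu, c, nu, g)
            for g in range(len(intrinsic[nu]))
        )
    return gap > 0


# Hypothetical toy system: 3 sites, 2 forms each (made-up energies).
intrinsic = [[0.0, 1.0], [0.5, 0.0], [0.0, 0.2]]
pair = {
    (0, 1): [[0.0, 2.0], [-1.0, 0.0]],
    (0, 2): [[0.0, 0.0], [0.0, 0.5]],
    (1, 2): [[0.3, 0.0], [0.0, -0.4]],
}
print(is_dead_end(1, 1, 0, intrinsic, pair))  # form 1 of site 1 versus form 0
```

In this toy system form 1 of site 1 is a dead end relative to form 0, so the search space for that site shrinks to a single form.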

Page 7: An Extended Dead-End Elimination Algorithm to Determine Gap-Free Lists of Low Energy States

Motivation for X-DEE (Kloppmann et al., 2007)

• Proteins are flexible systems which may adopt several functionally relevant states

• Preference for a more complete picture of the available low energy states

• X-DEE produces a gap-free list of low energy states (i.e., complete up to a given distance from the global energy minimum)

• Implemented to determine the lowest energy protonation states of proteins

Page 8: An Extended Dead-End Elimination Algorithm to Determine Gap-Free Lists of Low Energy States

X-DEE Intuition

• General idea is to exclude a list of states from the search space explored by DEE in order to construct a gap-free list

• Basic idea: If a gap-free list of k low energy states {x1, · · ·, xk} is already known, the (k + 1)th state can be found by restricting the search for the lowest energy state to the set of all states M excluding the set of already known states

• General idea: restrict the DEE search space to the set M \ L, where M is the complete set of states and L is the list of states to be excluded, for any given list L of states.

– In case L is not gap-free, identify the state of lowest energy not included in L until a gap-free list of low energy states is obtained.
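The iterative construction can be sketched as a loop that repeatedly asks for the lowest energy state outside the list found so far. In the sketch below a plain dictionary lookup stands in for the restricted DEE search, so it only illustrates the control flow. The function names, the `window` parameter, and the energies are assumptions made for the example.

```python
def lowest_state_outside(energies, excluded):
    """Stand-in for a DEE search restricted to M \\ L (here: a plain scan)."""
    candidates = [s for s in energies if s not in excluded]
    if not candidates:
        return None
    return min(candidates, key=energies.get)


def gap_free_list(energies, window):
    """Collect all states within `window` of the global minimum, lowest first."""
    found = []
    e_min = None
    while True:
        state = lowest_state_outside(energies, set(found))
        if state is None:
            break
        if e_min is None:
            e_min = energies[state]  # first hit is the global minimum
        if energies[state] - e_min > window:
            break
        found.append(state)
    return found


# Hypothetical energies for a 2-site system with two forms per site.
energies = {(0, 0): 0.0, (0, 1): 0.4, (1, 0): 1.5, (1, 1): 0.9}
print(gap_free_list(energies, window=1.0))
```

Because each new state is the minimum of everything not yet listed, no state below the current energy can be missing, which is what makes the list gap-free.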

Page 9: An Extended Dead-End Elimination Algorithm to Determine Gap-Free Lists of Low Energy States

Excluding a list of states from consideration

• There is no straightforward way to exclude an arbitrary list of states L from the search space explored by DEE

• So, we aim to restrict a DEE search to a specific type of subset of M:

– Fixing a number of sites during a DEE search yields the state of lowest energy of a subset S of M characterized by the forms of the fixed sites

– So, applying DEE to the subset S of those states that have form f at site s will determine the state of lowest energy with form f at site s

• How do we do this?

Page 10: An Extended Dead-End Elimination Algorithm to Determine Gap-Free Lists of Low Energy States

Constructing a Search Basis

• The idea of X-DEE is to derive a search basis B composed of a set of search keys bS, such that L is excluded from the search and the complete set M \ L is searched.

• The authors present a recursive procedure “CreateSearchBasis” which, given the list of states L to be excluded, constructs the search keys of the search basis

• Initial conditions

– List L of states to be excluded from the search

– Associated list vector T that contains an element for each site and keeps track of the sites which are already fixed to specific forms

– Initially, all sites are unfixed (i.e., undefined)

Page 11: An Extended Dead-End Elimination Algorithm to Determine Gap-Free Lists of Low Energy States

Constructing a Search Basis: Overview

• With each recursion, L is divided into sublists and one additional site is fixed in the associated list vectors. CreateSearchBasis terminates when all sites of a list vector are fixed.

• With each recursion, search keys can be generated that differ from the list vector in one form. The search keys are added to the search basis B.

• CreateSearchBasis generates a set of search keys bS characterizing subsets S whose union represents M \ L.

Page 12: An Extended Dead-End Elimination Algorithm to Determine Gap-Free Lists of Low Energy States

Introducing Search Keys

• This subset S can be represented by a so-called search key bS = (h1, ∗2, · · · , ∗μ, · · · , ∗N), where:

– h is the specified form of site 1 and ∗ indicates that a site is undefined (the idea being undefined sites will be determined during the DEE search)

– For each site μ of the system, these search keys have a component bμ which is either fixed to a specific form or undefined.

• X-DEE will define search keys bS = (b1, · · · , bμ, · · · , bN) such that the subsets S represented by the individual search keys together represent M \ L.

• Determining the state of lowest energy of all subsets via the DEE algorithm yields the desired state of lowest energy of M \ L.
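One way to picture a search key is as a tuple with one entry per site, where a fixed site holds its form and an undefined site holds a placeholder. The sketch below uses Python's `None` for the ∗ symbol and expands a key into the subset S it represents. The representation and the names `UNDEFINED`, `states_of_key`, and `n_forms` are choices made for this example only.

```python
from itertools import product

UNDEFINED = None  # plays the role of the "∗" entry in a search key


def states_of_key(key, n_forms):
    """Enumerate the subset S represented by a search key.

    key[mu] is either a fixed form or UNDEFINED.
    n_forms[mu] is the number of forms available at site mu.
    """
    choices = [
        range(n_forms[mu]) if b is UNDEFINED else [b]
        for mu, b in enumerate(key)
    ]
    return list(product(*choices))


# Hypothetical system: sites 0 and 1 have two forms, site 2 has three.
n_forms = [2, 2, 3]
key = (1, UNDEFINED, UNDEFINED)  # site 0 fixed to form 1, other sites open
subset = states_of_key(key, n_forms)
print(len(subset), subset[:3])
```

In X-DEE the subset is never enumerated like this; the key only tells the DEE search which sites are fixed and which it may still vary.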

Page 13: An Extended Dead-End Elimination Algorithm to Determine Gap-Free Lists of Low Energy States

Recursive CreateSearchBasis (L, T)

1. Base case: Return if T does not contain any undefined sites.

Page 14: An Extended Dead-End Elimination Algorithm to Determine Gap-Free Lists of Low Energy States

Recursive CreateSearchBasis (L, T)

2. Find an undefined site μ with unused forms (i.e., forms which are not present in any of the state vectors in L). If no such site exists, choose the first undefined site and jump to step 4.

Page 15: An Extended Dead-End Elimination Algorithm to Determine Gap-Free Lists of Low Energy States

Recursive CreateSearchBasis (L, T)

3. Create a search key: For each unused form h of site μ, a search key b is defined by copying the list vector t to b and fixing site μ to form h in b; bμ = h.

So, each search key differs from the current list vector only at site μ.

Fixing site μ to forms h not occurring in L guarantees that the subset represented by b and L are disjoint, i.e., b represents a subset of M \ L. Now add b to the search basis B.

Page 16: An Extended Dead-End Elimination Algorithm to Determine Gap-Free Lists of Low Energy States

Recursive CreateSearchBasis (L, T)

4. Divide the state vectors in L into sublists Lsub such that site μ has form g in all state vectors x in Lsub, i.e., xμ = g for all states in Lsub.

To each sublist Lsub, a separate list vector tsub is assigned by copying list vector t to tsub and fixing site μ to the form g common to all state vectors in Lsub; tμ = g.

Page 17: An Extended Dead-End Elimination Algorithm to Determine Gap-Free Lists of Low Energy States

Recursive CreateSearchBasis (L, T)

5. Recurse on each sublist Lsub and its list vector tsub
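Steps 1 to 5 can be put together in a short recursive sketch. It follows the procedure as described on these slides, with `None` marking undefined sites; the function signature, the `n_forms` argument, and the example list of excluded states are assumptions for illustration, and the code is not the authors' implementation.

```python
from itertools import product


def create_search_basis(excluded, t, n_forms, basis):
    """Append to `basis` search keys whose subsets together cover M \\ L.

    excluded: list L of state tuples that all agree with the fixed sites of t.
    t: list vector as a tuple; None marks an undefined site.
    n_forms[mu]: number of forms of site mu.
    """
    undefined = [mu for mu, form in enumerate(t) if form is None]
    if not undefined:  # step 1: every site is fixed
        return basis
    # Step 2: look for an undefined site with forms unused in L;
    # fall back to the first undefined site (then no keys are created).
    mu, unused = undefined[0], []
    for site in undefined:
        used = {x[site] for x in excluded}
        free = [h for h in range(n_forms[site]) if h not in used]
        if free:
            mu, unused = site, free
            break
    # Step 3: one search key per unused form h of site mu.
    for h in unused:
        basis.append(t[:mu] + (h,) + t[mu + 1:])
    # Step 4: split L by the form found at site mu.
    sublists = {}
    for x in excluded:
        sublists.setdefault(x[mu], []).append(x)
    # Step 5: recurse with site mu fixed to the form common to each sublist.
    for g, sub in sublists.items():
        create_search_basis(sub, t[:mu] + (g,) + t[mu + 1:], n_forms, basis)
    return basis


def matches(key, state):
    """True if `state` lies in the subset represented by `key`."""
    return all(b is None or b == s for b, s in zip(key, state))


# Hypothetical example: 3 sites with 2 forms each, two states excluded.
n_forms = [2, 2, 2]
excluded = [(0, 0, 1), (0, 0, 0)]
basis = create_search_basis(excluded, (None, None, None), n_forms, [])
print(basis)
```

For this small example the two generated keys cover the six remaining states exactly once and contain neither excluded state, which can be confirmed by checking `matches` for every state produced by `product(range(2), repeat=3)`.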

Page 18: An Extended Dead-End Elimination Algorithm to Determine Gap-Free Lists of Low Energy States

Using the Search Keys

• All search keys in B are subjected to a DEE search yielding the states of lowest energies of the represented subsets S.

• These states include the state of lowest energy of M \ L.

• The completeness of the search basis B is provable

– Basic idea is to show (i) all subsets of states S represented by the search keys are subsets of M \ L and that (ii) the union of all subsets S represents the complete set M \ L

Page 19: An Extended Dead-End Elimination Algorithm to Determine Gap-Free Lists of Low Energy States

X-DEE Application Domain

• On the right: light absorption triggers Bacteriorhodopsin’s pumping cycle during which a proton is transferred from the cytoplasm to the extracellular space.

• Basic idea: Proteins contain protonatable residues whose charged state depends on their interaction with the protein environment.

• These protonatable residues are treated as sites, with each site adopting one of two forms (protonated, unprotonated).

Page 20: An Extended Dead-End Elimination Algorithm to Determine Gap-Free Lists of Low Energy States

X-DEE Application Domain

• Charge distribution of a protein is essential to its function

– In proteins, not only the state of lowest energy but also the next higher protonation states are commonly significantly populated and often play a functional role

Page 21: An Extended Dead-End Elimination Algorithm to Determine Gap-Free Lists of Low Energy States

X-DEE Performance Characteristics

• Total search keys generated depends approximately linearly on the number of states in L, which influences the number of search keys in two different ways:

– Each additional state in L increases the number of states to be excluded from the search and thereby tends to increase the number of generated keys

– Each additional state in L decreases the search space M \ L and thereby tends to decrease the number of generated keys

• Ultimately, the number of search keys will decrease with the number of states in L. However, as long as L is small compared to M \ L, an approximately linear increase of the total number of search keys can be observed

Page 22: An Extended Dead-End Elimination Algorithm to Determine Gap-Free Lists of Low Energy States

X-DEE Performance Characteristics

• Computational cost of X-DEE depends approximately linearly on the size of the system and the number of states to be excluded from the search

• For low energy states which are built up one after the other, the computational cost to determine an additional state remains on average constant.

Page 23: An Extended Dead-End Elimination Algorithm to Determine Gap-Free Lists of Low Energy States

Improved Pruning algorithms and Divide-and-Conquer strategies for Dead-End Elimination, with application to protein design

Ivelin Georgiev, Ryan H. Lilien, Bruce R. Donald

2006

Page 24: An Extended Dead-End Elimination Algorithm to Determine Gap-Free Lists of Low Energy States

DACS Motivation

• DACS: a provably-accurate divide-and-conquer enhancement to traditional-DEE.

• Protein design for a rigid backbone and using rotamers and a pairwise energy function is provably NP-hard

• Desire for provable, deterministic algorithms which make real guarantees (as opposed to heuristic methods, Monte Carlo, genetic algorithms, etc)

Page 25: An Extended Dead-End Elimination Algorithm to Determine Gap-Free Lists of Low Energy States

Traditional DEE

• The DEE criterion uses rotameric energy interactions to identify and prune rotamers that are provably not part of the GMEC

• A target rotamer is pruned if a competitor rotamer is found such that the lowest possible energy among conformations containing the target rotamer is higher than the worst possible energy among conformations containing the competitor

• DEE does not guarantee a unique solution: multiple unpruned conformations may remain after pruning with DEE is exhausted.

• If this happens, the DEE pruning stage is followed by an enumeration stage, in which the remaining conformations are examined and the GMEC is identified (exponential time)

• One improvement is to partition the search space
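The pruning test described above can be sketched as a comparison of two bounds on the part of the energy that involves the residue in question: the best case for the target against the worst case for the competitor. The function names, the data layout, and the numbers below are assumptions for illustration.

```python
def interaction(pair, i, r, j, s):
    """Pairwise energy of rotamer r at residue i with rotamer s at residue j."""
    if i < j:
        return pair[(i, j)][r][s]
    return pair[(j, i)][s][r]


def can_prune(i, target, competitor, intrinsic, pair):
    """Traditional DEE test: prune `target` at residue i if even its best
    case is worse than the worst case of `competitor`."""
    others = [j for j in range(len(intrinsic)) if j != i]
    best_target = intrinsic[i][target] + sum(
        min(interaction(pair, i, target, j, s) for s in range(len(intrinsic[j])))
        for j in others
    )
    worst_competitor = intrinsic[i][competitor] + sum(
        max(interaction(pair, i, competitor, j, s) for s in range(len(intrinsic[j])))
        for j in others
    )
    return best_target > worst_competitor


# Hypothetical 2-residue system with two rotamers each (made-up energies).
intrinsic = [[0.0, 5.0], [0.0, 0.0]]
pair = {(0, 1): [[0.0, 1.0], [0.5, -0.5]]}
print(can_prune(0, target=1, competitor=0, intrinsic=intrinsic, pair=pair))
```

Because the minimum and maximum are taken independently for each neighbouring residue, the test is conservative: it never prunes a rotamer of the GMEC, but it can fail to prune rotamers that a tighter criterion would remove.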

Page 26: An Extended Dead-End Elimination Algorithm to Determine Gap-Free Lists of Low Energy States

split-DEE and DACS

• By partitioning the conformational search space, split-DEE enhances the pruning efficiency of traditional-DEE

• In split-DEE, the conformational space can be divided into several partitions, such that for each partition, there is some competitor that has better conformational energies than a rotamer within that partition

• The advantage of split-DEE is that no single competitor is required to outperform a rotamer for every conformation: as long as a dominant competitor exists for each partition, the rotamer can be pruned

• We can still do better:

• DACS enhances split-DEE by performing DEE pruning within individual partitions

Page 27: An Extended Dead-End Elimination Algorithm to Determine Gap-Free Lists of Low Energy States

DACS as an enhancement to split-DEE (Divide-And-Conquer Splitting)

Like in split-DEE, the conformational space is divided into partitions

Page 28: An Extended Dead-End Elimination Algorithm to Determine Gap-Free Lists of Low Energy States

DACS as an enhancement to split-DEE (Divide-And-Conquer Splitting)

Within each partition, DEE pruning is applied to determine if there is a competitor rotamer at a residue that always outperforms our original rotamer

Page 29: An Extended Dead-End Elimination Algorithm to Determine Gap-Free Lists of Low Energy States

DACS as an enhancement to split-DEE (Divide-And-Conquer Splitting)

If DEE pruning does not produce a unique solution, enumeration of the conformations in the current partition must be performed by A*

Page 30: An Extended Dead-End Elimination Algorithm to Determine Gap-Free Lists of Low Energy States

DACS as an enhancement to split-DEE (Divide-And-Conquer Splitting)

The lowest-energy conformation among the local rigid-GMECs for all partitions is the overall rigid-GMEC
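The divide-and-conquer structure of these steps can be sketched as follows. The conformational space is split by the rotamer chosen at one residue, a local minimum is found in each partition, and the lowest of the local minima is returned. Plain enumeration stands in here for the DEE pruning and A* stages, so the sketch shows only how partition results are combined; the function names, the choice of split residue, and the energies are assumptions.

```python
from itertools import product


def conformation_energy(conf, intrinsic, pair):
    """Intrinsic plus pairwise energy of one rigid conformation."""
    n = len(conf)
    energy = sum(intrinsic[i][conf[i]] for i in range(n))
    for i in range(n):
        for j in range(i + 1, n):
            energy += pair[(i, j)][conf[i]][conf[j]]
    return energy


def local_gmec(split_site, rotamer, intrinsic, pair):
    """Lowest-energy conformation within one partition.

    Enumeration is a placeholder for DEE pruning followed by A*.
    """
    choices = [
        [rotamer] if i == split_site else range(len(site))
        for i, site in enumerate(intrinsic)
    ]
    return min(product(*choices),
               key=lambda c: conformation_energy(c, intrinsic, pair))


def dacs_gmec(split_site, intrinsic, pair):
    """One partition per rotamer at split_site; the best local GMEC wins."""
    local = [
        local_gmec(split_site, r, intrinsic, pair)
        for r in range(len(intrinsic[split_site]))
    ]
    return min(local, key=lambda c: conformation_energy(c, intrinsic, pair))


# Hypothetical toy system: 3 residues, 2 rotamers each (made-up energies).
intrinsic = [[0.0, 1.0], [0.5, 0.0], [0.0, 0.2]]
pair = {
    (0, 1): [[0.0, 2.0], [-1.0, 0.0]],
    (0, 2): [[0.0, 0.0], [0.0, 0.5]],
    (1, 2): [[0.3, 0.0], [0.0, -0.4]],
}
print(dacs_gmec(0, intrinsic, pair))
```

The partitions are independent of one another, so the calls to `local_gmec` could be distributed over several workers.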

Page 31: An Extended Dead-End Elimination Algorithm to Determine Gap-Free Lists of Low Energy States

split-Flags

• The general advantage of DACS over split-DEE is the ability to prune an additional combinatorial subset of the conformational space by exploiting partition-specific prunings

• The DEE pruning stage in DACS can incorporate any combination of the available provably-accurate traditional-DEE techniques

• The split-flags (Gordon et al., 2003) algorithm has similar intent

– If a target rotamer cannot be pruned for all partitions, the partitions in which it can be pruned are flagged as dead-ending.

– Like DACS, split-flags uses pruning information discarded by split-DEE

Page 32: An Extended Dead-End Elimination Algorithm to Determine Gap-Free Lists of Low Energy States

split-Flags vs DACS

• One advantage of DACS over split-flags stems from the divide-and-conquer paradigm.

– The cost of expanding the A* search tree depends combinatorially on the number of rotamers for each residue position

– A divide-and-conquer approach (which reduces the number of rotamers in each partition) is more efficient than directly finding the global solution

• A bonus of divide-and-conquer approaches is that they are naturally parallelizable, reducing real-world running time

Page 33: An Extended Dead-End Elimination Algorithm to Determine Gap-Free Lists of Low Energy States

min-DEE Overview

• Used when the protein design process incorporates rotameric energy minimization (traditional DEE is then no longer provably accurate)

• MinDEE is similar to traditional-DEE in that rotameric energy interactions are used to determine which rotamers are provably not part of the minGMEC and can be pruned.

• MinDEE guarantees that no rotamers are pruned which belong to the conformation with the lowest energy among all energy-minimized conformations

• Since rotamers are allowed to energy-minimize, lower and upper bounds on the self- and pairwise rotamer energies must be used, instead of the rigid-energy terms

Page 34: An Extended Dead-End Elimination Algorithm to Determine Gap-Free Lists of Low Energy States

min-DEE vs. DEE

• Without energy minimization, a rotamer stays in the same rigid conformation, independent of the rotamer identities for the remaining residues.

• With energy minimization, a rotamer may minimize from its initial conformation in order to accommodate a change in another rotamer

• So that one rotamer does not minimize into another, rotameric movement is constrained to a voxel of conformation space

• The most significant difference between traditional-DEE and MinDEE is the accounting for possible energy changes during minimization

Page 35: An Extended Dead-End Elimination Algorithm to Determine Gap-Free Lists of Low Energy States

DACS and minDEE

• It’s straightforward to modify DACS to incorporate energy minimization

• To only prune rotamers that are provably not part of the minGMEC, the traditional-DEE criteria in the DEE cycle of DACS must be discarded and their MinDEE equivalents used instead

Page 36: An Extended Dead-End Elimination Algorithm to Determine Gap-Free Lists of Low Energy States

MinDEE/A*

• Incorporates splitting, MinBounds (an approach that remains provably correct with energy minimization, analogous to that of Gordon et al., 2003 for traditional-DEE), and DACS for MinDEE

• A* is then applied in the enumeration stage to extract the minGMEC from the set of remaining conformations.

• Similar to DACS, the lowest-energy conformation among the rigid-GMECs for all mutation sequences is identified as the overall rigid-GMEC

Page 37: An Extended Dead-End Elimination Algorithm to Determine Gap-Free Lists of Low Energy States

DACS / MinDEE-A* Performance

• Partition-specific prunings

– By using a divide-and-conquer approach to partition the conformational space and identify partition-specific prunings, DACS allows for additional elimination after pruning with the original split-DEE and split-flags techniques is exhausted.

• Reduced cost of expanding A* search trees

– The improved execution times of DACS stem from the reduced cost of expanding the A* search trees for each partition, resulting from the divide-and-conquer approach as opposed to expanding the single A* tree for the full conformational space.

• Increased pruning efficiency

– MinDEE/A* benefits from increased pruning efficiency, and so works best on larger systems where the cost of expanding the search tree in the enumeration stage dominates the computation (rather than the energy minimization).

