
Ligand efficiency: nice concept, shame about the metrics

Peter W Kenny

http://fbdd-lit.blogspot.com | http://www.slideshare.net/pwkenny

Some stuff to think about

• We need to be honest with ourselves and make a clear distinction between what we know and what we believe

• If we do bad data analysis then how will we be able to convince people that drug discovery is really difficult?

Things that make drug discovery difficult

• Having to exploit targets that are weakly-linked to human disease

• Poor understanding and predictability of toxicity

• Inability to measure free (unbound) physiological concentrations of drug for remote targets (e.g. intracellular or on far side of blood brain barrier)

Dans la merde, FBDD & Molecular Design blog :

http://fbdd-lit.blogspot.com/2011/09/dans-la-merde.html



Molecular Design

• Control of behavior of compounds and materials by manipulation of molecular properties

• Sampling of chemical space

– For example, does fragment-based screening allow better control of sampling resolution?

• Hypothesis-driven or prediction-driven

– There’s more to molecular design than making predictions (from Molecular Design blog)

Montanari, Prokopczyk, Sala, Sartori (2013) JCAMD 27:655-664 DOI

Kenny (2009) JCIM 49:1234-1244 DOI

New year, new blog name, Molecular Design blog

http://fbdd-lit.blogspot.com/2015/02/theres-more-to-molecular-design-than.html

http://dx.doi.org/10.1007/s10822-013-9676-0

http://dx.doi.org/10.1021/ci9000234

http://fbdd-lit.blogspot.com/2015/01/new-year-new-blog-name.html

Target engagement potential (TEP): a basis for pharmaceutical molecular design?

TEP = log10([Drug(X,t)]free / Kd)

Design objectives

• Low Kd for target(s)

• High (hopefully undetectable) Kd for antitargets

• Ability to control [Drug(X,t)]free
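
A minimal sketch of the TEP calculation, assuming TEP is the base-10 logarithm of the ratio of free drug concentration to Kd, with both in the same units. The function name and the example concentrations are illustrative and are not taken from the article.

```python
import math

def target_engagement_potential(free_conc_molar: float, kd_molar: float) -> float:
    """Return log10([Drug]free / Kd); both arguments must share the same units."""
    if free_conc_molar <= 0 or kd_molar <= 0:
        raise ValueError("concentration and Kd must be positive")
    return math.log10(free_conc_molar / kd_molar)

# Hypothetical example: 100 nM free drug against a target with Kd = 10 nM
tep_target = target_engagement_potential(1e-7, 1e-8)
# Same free concentration against an antitarget with Kd = 10 uM
tep_antitarget = target_engagement_potential(1e-7, 1e-5)
```

On this reading, a positive TEP means the free concentration exceeds Kd, so the design objectives above amount to a high TEP for targets and a low TEP for antitargets.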

Kenny, Leitão & Montanari JCAMD 2014 28:699-710 DOI

http://dx.doi.org/10.1007/s10822-014-9757-8

Property-based design as search for ‘sweet spot’

Green and red lines represent the probability of achieving ‘satisfactory’ affinity and ‘satisfactory’ ADMET characteristics respectively. The blue line shows the product of these probabilities and characterizes the ‘sweet spot’. This way of thinking about the ‘sweet spot’ has similarities with the molecular complexity model proposed by Hann et al.
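
The product-of-probabilities picture can be sketched numerically. The logistic curves, their midpoints and the property axis below are entirely hypothetical and are chosen only to show how a maximum in the product arises. They are not the curves from the figure.

```python
import math

def logistic(x: float, midpoint: float, steepness: float) -> float:
    """Logistic curve rising from 0 to 1 as x increases (steepness > 0)."""
    return 1.0 / (1.0 + math.exp(-steepness * (x - midpoint)))

def p_affinity(prop: float) -> float:
    # Hypothetical: chance of satisfactory affinity rises with the property
    return logistic(prop, midpoint=2.0, steepness=1.5)

def p_admet(prop: float) -> float:
    # Hypothetical: chance of satisfactory ADMET falls with the property
    return 1.0 - logistic(prop, midpoint=4.0, steepness=1.5)

grid = [i / 10 for i in range(0, 71)]  # property values from 0.0 to 7.0
joint = [p_affinity(x) * p_admet(x) for x in grid]
sweet_spot = grid[joint.index(max(joint))]  # property value maximizing the product
```

With these symmetric assumed curves the product peaks midway between the two midpoints. Real curves need not be symmetric, and the position of the maximum depends entirely on their assumed shapes.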

Kenny & Montanari, JCAMD 2013 27:1-13 DOI

http://dx.doi.org/10.1021/ci000403i

http://dx.doi.org/10.1007/s10822-012-9631-5

Rules, guidelines and metrics

• It’s not a rule, it’s a guideline… OK why did you call it a rule?

• Strength of a trend tells us how rigidly we should adhere to guidelines based on that trend

• Think carefully about physicochemical basis of guidelines and metrics

– Using logD to define compound quality metrics suggests that compounds can be made better by simply increasing the extent of ionization

Know your data

• Assays are typically run in replicate making it possible to estimate assay variance

• Every assay has a finite dynamic range and it may not always be obvious what this is for a particular assay

• Dynamic range may have been sacrificed for throughput but this, by itself, does not make the assay bad

• We are likely to need to be able to analyse in-range and out-of-range data within a single unified framework

– See Lind (2010) QSAR analysis involving assay results which are only known to be greater than, or less than some cut-off limit. Mol Inf 29:845-852 DOI
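
One common way to treat in-range and out-of-range results together is a censored Gaussian likelihood (Tobit-style). The sketch below illustrates that general idea under an assumed normal error model. It is not claimed to be the method of the cited article, and the data, qualifier codes and function names are hypothetical.

```python
import math

def norm_logpdf(z: float) -> float:
    """Log density of the standard normal distribution."""
    return -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)

def norm_cdf(z: float) -> float:
    """P(Z <= z) for a standard normal variable."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def norm_sf(z: float) -> float:
    """P(Z > z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def censored_loglik(observations, predictions, sigma):
    """Log-likelihood for (value, qualifier) pairs under a Gaussian error model.

    '=' marks an in-range measurement, '<' a result only known to lie below
    `value`, and '>' a result only known to lie above `value`.
    """
    total = 0.0
    for (value, qualifier), pred in zip(observations, predictions):
        z = (value - pred) / sigma
        if qualifier == "=":
            total += norm_logpdf(z) - math.log(sigma)
        elif qualifier == "<":
            total += math.log(norm_cdf(z))
        elif qualifier == ">":
            total += math.log(norm_sf(z))
        else:
            raise ValueError(f"unknown qualifier: {qualifier!r}")
    return total

# Hypothetical pIC50 data: two in-range values and one reported as "< 4.0"
data = [(6.2, "="), (5.1, "="), (4.0, "<")]
good_model = censored_loglik(data, predictions=[6.0, 5.0, 3.0], sigma=0.5)
poor_model = censored_loglik(data, predictions=[6.0, 5.0, 5.5], sigma=0.5)
```

A model that predicts a value well above the cut-off for the "< 4.0" compound is penalized, so the out-of-range result still carries information rather than being discarded or treated as if it were exactly 4.0.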

http://dx.doi.org/10.1002/minf.201000074

Introduction to ligand efficiency metrics (LEMs)

• We use LEMs to normalize activity with respect to risk factors such as molecular size and lipophilicity

• What do we mean by normalization?

• How predictive are risk factors of bad outcomes?

• We make assumptions about underlying relationship between activity and risk factor(s) when we define an LEM

• LEM as measure of extent to which activity beats a trend?

Kenny, Leitão & Montanari (2014) JCAMD 28:699-710 DOI

Ligand efficiency metrics considered harmful, Molecular Design blog

http://dx.doi.org/10.1007/s10822-014-9757-8

http://fbdd-lit.blogspot.com/2014/08/ligand-efficiency-metrics-considered.html

Ligand efficiency metrics

Scale activity/affinity by risk factor

LE = −ΔG/HA

Offset activity/affinity by risk factor

LipE = pIC50 − ClogP

There is no reason that normalization of activity with respect to a risk factor should be restricted to either of these functional forms.
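
The two functional forms can be written out as a short sketch, using the usual sign convention LE = −ΔG/HA with ΔG = RT ln(Kd/C°). The temperature, the gas constant in kcal units, the default 1 M standard concentration and the example inputs are assumptions made for illustration.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1

def ligand_efficiency(kd_molar, heavy_atoms, temperature=298.15, c_std=1.0):
    """Scaled metric: -dG/HA in kcal/mol per heavy atom, dG = RT ln(Kd/C_std)."""
    delta_g = R_KCAL * temperature * math.log(kd_molar / c_std)
    return -delta_g / heavy_atoms

def lipophilic_efficiency(pic50, clogp):
    """Offset metric: pIC50 minus ClogP."""
    return pic50 - clogp

# Hypothetical compound: Kd = 1 uM, 20 heavy atoms, pIC50 = 7.5, ClogP = 3.0
le = ligand_efficiency(kd_molar=1e-6, heavy_atoms=20)
lipe = lipophilic_efficiency(pic50=7.5, clogp=3.0)
```

Note that `c_std` has to be supplied (or defaulted) before LE can be computed at all, whereas the offset form needs no such choice.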

Kenny, Leitão & Montanari (2014) JCAMD 28:699-710 DOI

http://dx.doi.org/10.1007/s10822-014-9757-8

Use trend actually observed in data for normalization rather than some arbitrarily assumed trend


Can we accurately claim to have normalized a data set if we have made no attempt to analyse it?

Green: line of fit; Purple: constant LE; Blue: constant LipE

http://dx.doi.org/10.1007/s10822-014-9757-8

There’s a reason why we say standard free energy of binding

ΔG° = ΔH° − TΔS° = RT ln(Kd/C°)

• Adoption of 1 M as standard concentration is arbitrary

• A view of a chemical system that changes with the choice of standard concentration is thermodynamically invalid (and, with apologies to Pauli, is ‘not even wrong’)

Kenny, Leitão & Montanari (2014) JCAMD 28:699-710 DOI

Efficient voodoo thermodynamics, FBDD & Molecular Design blog

http://dx.doi.org/10.1007/s10822-014-9757-8

http://fbdd-lit.blogspot.com/2013/03/efficient-voodoo-thermodynamics.html

Effect on LE of changing standard concentration

NHA   Kd/M     C/M   −(1/NHA) log10(Kd/C)
10    10^-3    1     0.30
20    10^-6    1     0.30
30    10^-9    1     0.30
10    10^-3    0.1   0.20
20    10^-6    0.1   0.25
30    10^-9    0.1   0.27
10    10^-3    10    0.40
20    10^-6    10    0.35
30    10^-9    10    0.33
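
The tabulated values can be reproduced with a few lines of code. The sketch assumes the last column is −(1/NHA) log10(Kd/C), which is what the numbers correspond to. The function and variable names are illustrative.

```python
import math

def scaled_affinity(kd_molar: float, n_heavy: int, c_molar: float) -> float:
    """Return -(1/NHA) * log10(Kd/C)."""
    return -math.log10(kd_molar / c_molar) / n_heavy

compounds = [(10, 1e-3), (20, 1e-6), (30, 1e-9)]  # (NHA, Kd/M) from the table

table = {}
for c in (1.0, 0.1, 10.0):
    table[c] = [round(scaled_affinity(kd, nha, c), 2) for nha, kd in compounds]
    print(f"C = {c} M: {table[c]}")
```

The three compounds look equally efficient only when C = 1 M. Moving to 0.1 M makes the largest compound look most efficient, and moving to 10 M reverses the ranking, although nothing about the compounds has changed.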

Analysis from Kenny, Leitão & Montanari (2014) JCAMD 28:699-710 DOI

Note that our article overlooked similar observations made 5 years earlier by Zhou & Gilson (2009) Chem Rev 109:4092-4107 DOI

http://dx.doi.org/10.1007/s10822-014-9757-8

http://dx.doi.org/10.1021/cr800551w

Scaling transformation of parallel lines by dividing Y by X (this is how ligand efficiency is calculated)

Size dependency of LE in this example is a consequence of the non-zero intercept
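
A small numerical illustration of the scaling transformation: for a line Y = a + bX, the ratio Y/X equals a/X + b, which is independent of X only when the intercept a is zero. The slope, intercepts and X values below are hypothetical.

```python
def scaled(intercept: float, slope: float, x: float) -> float:
    """Y/X for a point on the line Y = intercept + slope * X."""
    return (intercept + slope * x) / x

sizes = [10, 20, 40]

# Two parallel lines (same hypothetical slope), without and with an intercept
through_origin = [scaled(0.0, 0.3, x) for x in sizes]
with_intercept = [scaled(2.0, 0.3, x) for x in sizes]

for x, a, b in zip(sizes, through_origin, with_intercept):
    print(f"X = {x:2d}: Y/X = {a:.3f} (zero intercept), {b:.3f} (intercept 2.0)")
```

The line through the origin gives the same scaled value at every X, while the parallel line with a positive intercept gives a scaled value that falls as X grows, even though both lines respond identically to X.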


http://dx.doi.org/10.1007/s10822-014-9757-8

Affinity plotted against molecular weight for minimal binding elements against various targets in inhibitor deconstruction study showing variation in intercept term

Data from Hajduk (2006) JMC 49:6972–6976 DOI

Each line corresponds to a different target and no attempt has been made to indicate targets for individual data points. Is it valid to combine results from different assays when using LE?


http://dx.doi.org/10.1021/jm060511h

http://dx.doi.org/10.1007/s10822-014-9757-8

Offsetting transformation of lines with different slope and common intercept by subtracting X from Y (this is how lipophilic efficiency is calculated)

Thankfully (hopefully?) lipophilicity-dependent lipophilic efficiency has not yet been ‘discovered’
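
The offsetting transformation can be illustrated in the same way: for a line Y = a + bX, the difference Y − X equals a + (b − 1)X, which is independent of X only when the slope is exactly one. The intercept, slopes and X values are hypothetical.

```python
def offset(intercept: float, slope: float, x: float) -> float:
    """Y - X for a point on the line Y = intercept + slope * X."""
    return intercept + slope * x - x

logp_values = [1.0, 3.0, 5.0]

# Common intercept, different slopes
unit_slope = [offset(4.0, 1.0, x) for x in logp_values]
half_slope = [offset(4.0, 0.5, x) for x in logp_values]
```

Only the unit-slope line gives a constant offset value. For any other slope the offset metric itself varies with X, which is the lipophilicity dependence referred to above.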


http://dx.doi.org/10.1007/s10822-014-9757-8

What we try to capture when we use lipophilic efficiency

[Figure: diagram relating Water and Octanol, with quantities labelled logP, pIC50 and LipE]

There are two problems with this approach. Firstly, octanol is not an ideal non-polar reference state because it can form hydrogen bonds with solutes (and is also wet). Secondly, logP does not model the cost of transfer from water to octanol for ligands that bind as ionized forms.

http://dx.doi.org/10.1007/s10822-014-9757-8

Linear fit of ΔG to HA for published PKB ligands

Data from Verdonk & Rees (2008) ChemMedChem 3:1179-1180 DOI

[Figure: ΔG/kcal mol⁻¹ plotted against HA with line of fit, annotated with −ΔGrigid]

Fit: ΔG/kcal mol⁻¹ = 0.87 − (0.44 × HA); R² = 0.98; RMSE = 0.43 kcal mol⁻¹

http://dx.doi.org/10.1002/cmdc.200800132

Ligand efficiency, group efficiency and residuals plotted for PKB binding data

[Figure: LE/kcal mol⁻¹ HA⁻¹, GE/kcal mol⁻¹ HA⁻¹ and Resid/kcal mol⁻¹ plotted for the PKB ligands]

Residuals and group efficiency values show similar trends, with pyrazole (HA = 5) appearing as an outlier (GE is calculated using ΔGrigid). Using residuals to compare activity eliminates the need to use the ΔGrigid estimate (see Murray & Verdonk 2002 JCAMD 16:741-753 DOI), which is subject to uncertainty.

http://dx.doi.org/10.1023/A:1022446720849

Use residuals to quantify extent to which activity beats trend

• Normalize activity using trend(s) actually observed in data (this means we have to model the data)

• All risk factors can be treated within the same data-analytic framework

• Residuals are invariant with respect to choice of concentration units

• Uncertainty in residuals is not explicitly dependent on the value of the risk factor (not the case for scaled LEMs)

• Residuals can be used with other functional forms (e.g. non-linear and multi-linear)
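
The points above can be sketched with an ordinary least-squares fit. The data are hypothetical and a straight line is only one possible model. The example checks that residuals are unchanged when the concentration unit changes, because a change of unit shifts every log-transformed value by the same constant and the fitted intercept absorbs it.

```python
import math

def linear_fit(xs, ys):
    """Ordinary least-squares intercept and slope for y = a + b*x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def residuals(xs, ys):
    """Observed minus fitted values: how far each point beats the trend."""
    intercept, slope = linear_fit(xs, ys)
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]

# Hypothetical data: heavy atom counts and Kd values in mol/L
heavy_atoms = [10, 15, 20, 25, 30]
kd_molar = [2e-4, 1e-5, 3e-7, 5e-8, 1e-9]

pkd_molar = [-math.log10(kd) for kd in kd_molar]             # Kd expressed in M
pkd_micromolar = [-math.log10(kd * 1e6) for kd in kd_molar]  # Kd expressed in uM

resid_m = residuals(heavy_atoms, pkd_molar)
resid_um = residuals(heavy_atoms, pkd_micromolar)
```

The two residual lists agree to within rounding error, whereas a scaled metric computed from the same two sets of values would differ between them.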


http://dx.doi.org/10.1007/s10822-014-9757-8

Some more stuff to think about

• The function of a metric is to measure and not tell you what you want to hear

• Efficiency as response of activity to risk factor

• Efficiency as extent to which activity beats a trend

• Need to model activity data if you’re claiming to have normalized it

• Using LEMs distorts your perception of data unnecessarily