Post on 05-Aug-2015
The distribution of genetic variance across phenotypic space and the response to selection
Mark Blows
School of Biological Sciences University of Queensland
Nothing would happen without these guys at UQ
• Katrina McGuigan and Emma Hine – Long-term collaborators on multivariate responses to selection and the genetic analysis of high dimensional traits
• Steve Chenoweth, Julie Collet and Scott Allen – Collaboration on the genetic analysis of gene expression
Framework for today’s talk: Understanding the distribution of genetic variance
How does genetic covariance (pleiotropy) change the availability of genetic variance? (a bit of theory and data)
How does pleiotropy influence the response to selection in small sets of traits? (a selection experiment)
How widespread is pleiotropy among small sets of traits? (some random matrix theory)
What is the phenome-wide extent of pleiotropy? (genetic analysis of 1000s of gene expression traits)
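The geometry of genetic variation and multivariate evolution
A covariance matrix of genetic relationships among traits:
$$
\mathbf{G} = \begin{pmatrix}
v_1 & \mathrm{cov}_{1,2} & \cdots & \mathrm{cov}_{1,n} \\
 & v_2 & \cdots & \mathrm{cov}_{2,n} \\
 & & \ddots & \vdots \\
 & & & v_n
\end{pmatrix}
$$
$$
\Delta\mathbf{z} = \mathbf{G}\boldsymbol{\beta}
$$
A vector of the strength of selection acting on multiple traits:
$$
\boldsymbol{\beta} = \begin{pmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_n \end{pmatrix}
$$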
How does it all fit together? Spectral decomposition of Lande’s equation
$$
\Delta\mathbf{z} = \mathbf{G}\boldsymbol{\beta}
= \lambda_1 \mathbf{g}_{\max}\mathbf{g}_{\max}^{T}\boldsymbol{\beta}
+ \lambda_2 \mathbf{g}_2\mathbf{g}_2^{T}\boldsymbol{\beta}
+ \cdots
+ \lambda_n \mathbf{g}_n\mathbf{g}_n^{T}\boldsymbol{\beta}
$$
The first term is the projection of the direction of selection along the first eigenvector of G, weighted by its eigenvalue.
Consequence 1: If genetic variance is unevenly distributed in G (eigenvalues vary greatly in size), the response to selection will often be biased away from β.
Consequence 2: The response of individual traits may be in a direction opposite to the selection applied on them, particularly when β is in a direction with low genetic variance.
What does the distribution of genetic variance look like for functionally related traits?
Toy example: A nearly singular G matrix
$$
\mathbf{G} = \begin{pmatrix}
1 & -0.52 & -0.52 \\
-0.52 & 1 & -0.45 \\
-0.52 & -0.45 & 1
\end{pmatrix}
$$
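The two consequences of the spectral decomposition can be checked numerically on the toy matrix. The sketch below is illustrative only: the selection gradient `beta` (equal positive selection on all three traits) is a hypothetical choice, not a value from the talk.

```python
import numpy as np

# Toy G matrix from the slide (nearly singular).
G = np.array([
    [1.00, -0.52, -0.52],
    [-0.52, 1.00, -0.45],
    [-0.52, -0.45, 1.00],
])

# Spectral decomposition; eigh returns eigenvalues in ascending order,
# so reorder them from largest (g_max) to smallest.
eigvals, eigvecs = np.linalg.eigh(G)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Hypothetical selection gradient: equal positive selection on every trait.
beta = np.ones(3)

# Lande's equation: predicted response to selection.
dz = G @ beta

# The same response written as a sum of projections of beta onto each
# eigenvector, weighted by the corresponding eigenvalue.
dz_spectral = sum(lam * g * (g @ beta) for lam, g in zip(eigvals, eigvecs.T))

# Cosine of the angle between the response and the direction of selection.
cos_angle = dz @ beta / (np.linalg.norm(dz) * np.linalg.norm(beta))

print("eigenvalues:", eigvals)
print("response:", dz)
print("cosine of angle between response and beta:", cos_angle)
```

With this choice of `beta`, selection points almost exactly along the eigenvector with the smallest eigenvalue, so the response is small and the first trait moves against the direction in which it is selected.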
[Figure: chromatogram, signal strength (pA) against time (minutes, 8.0 to 10.5), with labelled peaks 5,9-C24, 5,9-C25, 9-C25, 9-C26, 2-Me-C26, 5,9-C27, 5,9-C29, 2-Me-C28 and 2-Me-C30]
99% of the genetic variance in the 10 Drosophila wing traits explained by 5 genetically independent traits (McGuigan and Blows 2007, Evolution)
98% of the genetic variance in 9 cuticular hydrocarbons is contained in 5 significant dimensions (Van Homrigh et al 2007, Current Biology)
High dimensional selection experiment: will the entire phenotypic space respond?
Select along all 8 genetic eigenvectors of the 8 dimensional phenotypic space
[Figure: chromatogram, signal strength (pA) against time (minutes, 8.0 to 10.5), with labelled peaks 5,9-C24, 5,9-C25, 9-C25, 9-C26, 2-Me-C26, 5,9-C27, 5,9-C29, 2-Me-C28 and 2-Me-C30]
Index: T1 0.659, T2 0.209, T3 0.052, T4 0.316, T5 0.325, T6 0.528, T7 -0.116, T8 0.146
Design of selection experiment
• 3 replicate populations for each of the 8 selection indices
• Select for 6 generations
• 50% truncation selection
• 2 control lines
Selection along all eight genetic eigenvectors
Distribution of standing genetic variance in the base population
[Figure; panel annotations: ✓ ✓ ✓ ✓ ✓ ? ? ?]
Hine et al 2014, American Naturalist
Black indicates a trait evolved in the direction opposite to its selection gradient
95% Bayesian uncertainty intervals
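How to compare base G and the response: The realised G matrix
Example for 3 traits and 2 selection indices (populations):
$$
\begin{pmatrix}
\Delta z_{11} \\ \Delta z_{12} \\ \Delta z_{13} \\ \Delta z_{21} \\ \Delta z_{22} \\ \Delta z_{23}
\end{pmatrix}
=
\begin{pmatrix}
w_{11} & w_{12} & w_{13} & 0 & 0 & 0 \\
0 & w_{11} & 0 & w_{12} & w_{13} & 0 \\
0 & 0 & w_{11} & 0 & w_{12} & w_{13} \\
w_{21} & w_{22} & w_{23} & 0 & 0 & 0 \\
0 & w_{21} & 0 & w_{22} & w_{23} & 0 \\
0 & 0 & w_{21} & 0 & w_{22} & w_{23}
\end{pmatrix}
\begin{pmatrix}
v_1 \\ \mathrm{cov}_{12} \\ \mathrm{cov}_{13} \\ v_2 \\ \mathrm{cov}_{23} \\ v_3
\end{pmatrix}
$$
The left-hand vector is the response to selection of traits ($\Delta z_{ij}$ is the response of trait $j$ in population $i$), the middle matrix holds the selection weights ($w_{ij}$ is the weight on trait $j$ in index $i$), and the right-hand vector holds the realised genetic (co)variances.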
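A minimal sketch of how such a system could be solved for the realised (co)variances is given below. It assumes the responses follow Δz = Gw exactly, and all numbers (`G_true`, `weights`) are hypothetical. It uses three selection indices rather than two: with only two indices the 6 x 6 system has rank 5, because the symmetry of G makes one equation redundant, so at least as many independent indices as traits are needed to recover every element.

```python
import numpy as np

def design_rows(w):
    """Rows mapping (v1, cov12, cov13, v2, cov23, v3) to the three trait
    responses of one population with selection weights w."""
    w1, w2, w3 = w
    return np.array([
        [w1, w2, w3, 0.0, 0.0, 0.0],   # response of trait 1
        [0.0, w1, 0.0, w2, w3, 0.0],   # response of trait 2
        [0.0, 0.0, w1, 0.0, w2, w3],   # response of trait 3
    ])

def unpack(theta):
    """Rebuild the symmetric 3 x 3 matrix from its six unique elements."""
    v1, c12, c13, v2, c23, v3 = theta
    return np.array([[v1, c12, c13], [c12, v2, c23], [c13, c23, v3]])

# Hypothetical "true" G and hypothetical selection weights (one row per index).
G_true = np.array([[1.0, 0.3, -0.2], [0.3, 0.8, 0.1], [-0.2, 0.1, 0.5]])
weights = np.array([[0.7, 0.5, 0.5], [0.2, -0.6, 0.8], [-0.5, 0.4, 0.3]])

# Stack the design rows and the responses over populations.
W = np.vstack([design_rows(w) for w in weights])
dz = np.concatenate([G_true @ w for w in weights])

# Least-squares solution for the realised (co)variances.
theta, *_ = np.linalg.lstsq(W, dz, rcond=None)
G_realised = unpack(theta)

# Rank of the system when only the first two indices are used.
rank_two_indices = np.linalg.matrix_rank(W[:6])
print(G_realised)
print("rank with two indices:", rank_two_indices)
```

With noisy responses the same least-squares call returns the best-fitting symmetric matrix rather than an exact solution.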
Note the overestimation of the genetic variance in the base population
Multivariate genetic variances successfully predicted responses
Hine et al 2014, American Naturalist
Can we make any generalizations about the spectral distribution of genetic variance?
40 5-d G matrices from Pitchers et al (2014)
Yes, genetic “nearly-null” spaces may be common.
But how do we know if the exponential decline in eigenvalues is more than expected in the absence of genetic covariance? In other words, what is the null model for multivariate quantitative genetics?
Recent collation of all estimated G matrices by Pitchers et al 2014, Phil. Trans. R. Soc. B
[Figure: panels (a), (b) and (c), genetic variance against eigenvector, with series for Life History, Morphology and Sexually Selected traits]
Blows and McGuigan 2015, Molecular Ecology
Random Matrix Theory: The quarter-circle law for P matrices
10 traits drawn from N(0,1), with 250 individuals measured, and the identity matrix as the covariance matrix
[Figure, panels A and B: phenotypic variance (VP) of each trait (1 to 10), and density of eigenvalues with the Marchenko-Pastur distribution; legend: individual trait, 1st eigenvalue]
The bulk and edge behaviour of eigenvalues of symmetrical matrices follow some surprisingly universal laws
Blows and McGuigan 2015, Molecular Ecology
SNP relatedness matrices will conform to this distribution
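The simulation described on this slide can be sketched in a few lines. This is an illustrative reconstruction, not the authors' code: the seed is arbitrary, and the bounds used are the standard asymptotic Marchenko-Pastur edges for unit variance, (1 ± sqrt(p/n))^2, which only approximate the behaviour of a 10-trait matrix.

```python
import numpy as np

rng = np.random.default_rng(1)   # arbitrary seed
n_ind, n_traits = 250, 10

# 10 uncorrelated traits drawn from N(0, 1), measured on 250 individuals.
X = rng.standard_normal((n_ind, n_traits))

# Phenotypic covariance matrix P and its eigenvalues, largest first.
P = np.cov(X, rowvar=False)
eigvals = np.linalg.eigvalsh(P)[::-1]

# Sample variances of the individual traits (diagonal of P).
trait_var = np.diag(P)

# Asymptotic Marchenko-Pastur edges for unit variance and ratio p/n.
gamma = n_traits / n_ind
mp_lower = (1 - np.sqrt(gamma)) ** 2
mp_upper = (1 + np.sqrt(gamma)) ** 2

print("trait variances:", np.round(trait_var, 3))
print("eigenvalues:", np.round(eigvals, 3))
print("Marchenko-Pastur edges:", mp_lower, mp_upper)
```

Although every trait has a true variance of 1 and no true covariance, the eigenvalues of P are spread more widely than the individual trait variances: the leading eigenvalue is never smaller than the largest trait variance, and the smallest eigenvalue is never larger than the smallest trait variance.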
The need for a null model in multivariate quantitative genetics
$$
\mathbf{G} = \begin{pmatrix}
v_1 & \mathrm{cov}_{1,2} & \cdots & \mathrm{cov}_{1,n} \\
 & v_2 & \cdots & \mathrm{cov}_{2,n} \\
 & & \ddots & \vdots \\
 & & & v_n
\end{pmatrix}
$$
1. 10 traits drawn at random from N(0,1), with the identity matrix as the covariance matrix
2. Experimental design: 50 lines, 5 individuals per line.
3. Use multivariate linear model to estimate among-line G for 200 replications of a 10-trait G matrix.
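The three steps above can be sketched as follows. The estimator is an assumption: the slide says only that a multivariate linear model was used, so this sketch swaps in a simple one-way MANOVA (method-of-moments) estimate, G = (MS_among - MS_within) / n, which, unlike a constrained likelihood fit, can return negative eigenvalues. The seed is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(2015)   # arbitrary seed
n_lines, n_per_line, n_traits, n_reps = 50, 5, 10, 200

def among_line_G(y):
    """Method-of-moments among-line covariance matrix.
    y has shape (lines, individuals per line, traits)."""
    n_lines, n_per_line, n_traits = y.shape
    line_means = y.mean(axis=1)
    dev_among = line_means - line_means.mean(axis=0)
    ms_among = n_per_line * dev_among.T @ dev_among / (n_lines - 1)
    dev_within = (y - line_means[:, None, :]).reshape(-1, n_traits)
    ms_within = dev_within.T @ dev_within / (n_lines * (n_per_line - 1))
    return (ms_among - ms_within) / n_per_line

# Eigenvalues (largest first) of G for each replicate data set, in which
# there is no true among-line variance or covariance at all.
eigs = np.empty((n_reps, n_traits))
for r in range(n_reps):
    y = rng.standard_normal((n_lines, n_per_line, n_traits))
    eigs[r] = np.linalg.eigvalsh(among_line_G(y))[::-1]

mean_eigs = eigs.mean(axis=0)
print("mean eigenvalue by rank:", np.round(mean_eigs, 3))
```

Even though the true G is a matrix of zeros, the ranked eigenvalues decline steadily from positive to negative values, so a decline in eigenvalues is not by itself evidence of genetic covariance.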
If the traits are simulated in this way, then:
[Figure, panels A and B: eigenvalue (0.0 to 3.5) against eigenvector (1 to 10) for the simulated G matrices, and frequency histogram of the TW statistic]
Clearly, there is a need for determining how real data deviate from the random expectation for the spectral distribution
The leading eigenvalue of G and the Tracy-Widom Law
[Figure: frequency histogram of the TW statistic]
10000 samples from the Tracy-Widom distribution for the leading eigenvalue
5% cut-off value of 0.979
200 leading eigenvalues from the simulated G (10 of 200 significantly deviate from random)
Caveat: Analytical expressions for the centering and scaling parameters of the TW distribution for variance-component matrices are unknown at present. Approximation method of Saccenti et al 2011 used.
Blows and McGuigan 2015, Molecular Ecology
Using the TW distribution for real data: 40 5-dimensional G matrices
[Figure, panels A and B: frequency histograms of the TW statistic; one bar annotated "1451 cases"]
Grey bars: 10000 samples from the TW distribution
Open bars: 10000 simulated values of the TW statistic for the leading eigenvalues of 5-d correlation matrices with random off-diagonal elements
40 leading eigenvalues from estimated G in Pitchers et al (2014) data set (26 significantly deviate from random expectation)
Analytical steps:
1. Simulate the structure of the real matrices (here, 5-d correlation matrices)
2. Scale leading eigenvalues to the TW distribution to establish scaling parameters
3. Use scaling parameters to adjust observed leading eigenvalues
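One way to read these three steps is as a moment-matching exercise, sketched below. Everything specific here is an assumption rather than the published procedure: random correlation matrices are generated as sample correlation matrices of uncorrelated normal data with a hypothetical sample size of 100, the centering and scaling are chosen so the simulated leading eigenvalues have the mean and standard deviation of the Tracy-Widom distribution for real symmetric matrices, and the "observed" eigenvalues are made up.

```python
import numpy as np

rng = np.random.default_rng(7)   # arbitrary seed

# Mean and standard deviation of the Tracy-Widom (beta = 1) distribution.
TW1_MEAN = -1.2065335745820
TW1_SD = 1.607781034581 ** 0.5

def leading_eigenvalue_of_random_corr(n_obs, n_traits):
    """Leading eigenvalue of a sample correlation matrix of uncorrelated data."""
    x = rng.standard_normal((n_obs, n_traits))
    return np.linalg.eigvalsh(np.corrcoef(x, rowvar=False))[-1]

# Step 1: simulate the structure of the real matrices (5-d correlation matrices).
sim = np.array([leading_eigenvalue_of_random_corr(100, 5) for _ in range(10000)])

# Step 2: centering and scaling parameters that map the simulated leading
# eigenvalues onto the first two moments of the TW distribution.
centre, scale = sim.mean(), sim.std(ddof=1)

def to_tw_scale(lead_eig):
    return TW1_MEAN + TW1_SD * (np.asarray(lead_eig) - centre) / scale

sim_tw = to_tw_scale(sim)

# Step 3: adjust observed leading eigenvalues with the same parameters and
# compare with the 5% cut-off quoted in the talk. These values are
# hypothetical; 1.0 is the smallest leading eigenvalue a correlation matrix
# can have.
observed = np.array([1.0, 2.4])
observed_tw = to_tw_scale(observed)
exceeds_cutoff = observed_tw > 0.979
print(observed_tw, exceeds_cutoff)
```

Matching only the first two moments does not guarantee that the simulated values follow the TW distribution in shape, which is why the simulated and theoretical histograms are worth comparing directly.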
Blows and McGuigan 2015, Molecular Ecology
What is the extent of genetic covariance across the phenome?
Genetic analysis of gene expression; 30 inbred lines derived from a natural population of D. serrata (data from McGuigan et al 2014, Genetics)
Approach: randomly allocate 8750 expression traits with significant genetic variance at 5% FDR to one of 1756 5-trait sets
Vast majority (95%) of 5-trait G had a leading eigenvalue larger than expected by chance
[Figure, panels A and B: frequency histogram of broad-sense heritability (0.0 to 1.0), and frequency histogram of the TW statistic]
Grey bars: TW distribution
Open bars: Simulated 5-trait sets
Dark bars: leading eigenvalue from 5-d G
What is the extent of mutational covariance? Mutational pleiotropy in gene expression
Genetic analysis of 41 mutation accumulation lines of D. serrata
Univariate parameters
[Figure, panels A and B: histograms of count of traits against among-line variance (0.0 to 1.0), labelled Standing VG and Mutational VM; annotations: Average = 0.344, Average = 0.033]
• Most traits (71%) unaffected by mutation in 27 generations
• 3385 traits (29%) with non-zero VM
• Mean mutational heritability of 0.001 (0.1% of the phenotypic variance generated by mutation each generation)
Evidence for mutational pleiotropy
• Randomly allocate the 3385 traits to 5-trait sets
• 21% of trait sets displayed significant mutational genetic covariance (at 5% FDR)
• Suggests a mutation affects 70 traits on average (assuming modules of equal size)
McGuigan et al 2014, Genetics
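Why do we see widespread genetic covariance among small sets of random traits?
Take the 8750 expression traits, arranged in the 5x5 matrices $\mathbf{B}_{i,j}$. There are too many elements in G to estimate ($3.83 \times 10^{7}$).
$$
\mathbf{G} = \begin{pmatrix}
\mathbf{B}_{1,1} & \mathbf{B}_{1,2} & \cdots & \mathbf{B}_{1,m} \\
 & \mathbf{B}_{2,2} & & \vdots \\
 & & \ddots & \\
 & & & \mathbf{B}_{m,m}
\end{pmatrix}
$$
Estimate only the 5x5 principal submatrices along the diagonal of K. Only need to estimate 223125 (0.6%) elements.
$$
\mathbf{K} = \begin{pmatrix}
\mathbf{B}_{1,1} & 0 & \cdots & 0 \\
 & \mathbf{B}_{2,2} & & \vdots \\
 & & \ddots & 0 \\
 & & & \mathbf{B}_{m,m}
\end{pmatrix}
$$
Complete an approximation of the 8750-d G using a geometric mean approach.
$$
\mathbf{G}_{nk} = \begin{pmatrix}
\mathbf{B}_{1,1}^{1/2} & 0 & \cdots & 0 \\
\mathbf{B}_{2,2}^{1/2} & 0 & \cdots & 0 \\
\vdots & \vdots & & \vdots \\
\mathbf{B}_{m,m}^{1/2} & 0 & \cdots & 0
\end{pmatrix}
\begin{pmatrix}
\mathbf{B}_{1,1}^{1/2} & \mathbf{B}_{2,2}^{1/2} & \cdots & \mathbf{B}_{m,m}^{1/2} \\
0 & 0 & \cdots & 0 \\
\vdots & \vdots & & \vdots \\
0 & 0 & \cdots & 0
\end{pmatrix}
$$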
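The construction can be illustrated on a small scale. This sketch rests on one reading of the slide's matrix product, namely that each off-diagonal block is approximated by the product of the square roots of the two corresponding diagonal blocks; that reading, the use of the symmetric matrix square root, and the randomly generated blocks are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)   # arbitrary seed

def random_cov(k):
    """A hypothetical k x k covariance block (stand-in for an estimated B_ii)."""
    a = rng.standard_normal((k + 5, k))
    return a.T @ a / (k + 5)

def sym_sqrt(b):
    """Symmetric square root of a positive semi-definite matrix."""
    vals, vecs = np.linalg.eigh(b)
    vals = np.clip(vals, 0.0, None)   # guard against tiny negative round-off
    return (vecs * np.sqrt(vals)) @ vecs.T

k, m = 5, 4                           # block size and number of blocks
blocks = [random_cov(k) for _ in range(m)]

# Stack the square roots of the diagonal blocks into one tall matrix. The zero
# padding in the slide's product contributes nothing, so it is omitted here.
S = np.vstack([sym_sqrt(b) for b in blocks])   # shape (k * m, k)
G_nk = S @ S.T                                  # shape (k * m, k * m)

print(G_nk.shape, np.linalg.matrix_rank(G_nk))
```

Under this reading the diagonal blocks of the completed matrix reproduce the estimated blocks exactly, while its rank cannot exceed the block size, so the off-diagonal structure is entirely implied by the diagonal blocks rather than estimated.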
What does the eigenstructure of BIG G reveal?
Blows et al 2015, American Naturalist
Pleiotropy exists among a large number of traits
[Figure, panels A and B: frequency histograms of trait loading on gmax (-0.36 to 0.24) and of the sum of +/- loadings (-15 to 35)]
Trait contributions to the leading eigenvector (gmax)
Null distribution of loadings on the leading eigenvector
Large number of traits influenced in the same direction by gmax
Consistent with the ubiquitous presence of correlated responses to selection
Blows et al 2015, American Naturalist
Gene enrichment analysis of GO terms
[Figure, panels A and B: frequency histograms of trait loading on gmax (-0.36 to 0.24) and of the sum of +/- loadings (-15 to 35)]
100 genes with lowest contributions to gmax
2 GO terms enriched: transferase activity (GO:0016740), cell surface (GO:0009986)
100 genes with highest contributions to gmax
31 GO terms enriched: captured many processes related to regulation of gene expression, including transcription factors
Blows et al 2015, American Naturalist
PLEIOTROPY
1. Reduces the availability of genetic variance in some dimensions, resulting in genetic nearly-null subspaces
2. Response to selection governed by the spectral distribution of genetic variance
3. Genetic nearly-null subspaces likely to be common, but more work in RMT needed
4. Mutational pleiotropy is widespread among expression traits for selection to act upon
5. Pleiotropy can be among a very large number of traits of disparate function