
Prediction of genetic Values using Neural Networks

Paulino Perez 1

Daniel Gianola 2

Jose Crossa 1

1CIMMyT-Mexico 2University of Wisconsin, Madison.

September, 2014

SLU,Sweden Prediction of genetic Values using Neural Networks 1/26


Contents

1 Introduction

2 Non linear models and NN

3 Model fitting

4 Case study: Wheat

5 Application examples


Introduction

High-density marker panels enable genomic selection (GS).
Marker-based models perform better than pedigree-based models (e.g. de los Campos et al., 2009).
Most research has been done with linear additive models (see eq. 1).
It might be possible to increase accuracy using non-linear models with dominance and additive effects.

$$y_i = \sum_{j=1}^{p} x_{ij}\beta_j + e_i \qquad (1)$$


Continued...

Recent studies with non-additive effects:


Non linear models and neural networks

$$y_i = \mu + f(\mathbf{x}_i) + e_i \qquad (2)$$

Any continuous non-linear function can be represented exactly as (Kolmogorov's theorem):

$$f(\mathbf{x}_i) = f(x_{i1}, \ldots, x_{ip}) = \sum_{q=1}^{2p+1} g\left(\sum_{r=1}^{p} \lambda_r h_q(x_{ir})\right) \qquad (3)$$

In neural networks (NN), non-linear functions are "approximated" as sums of finite series of smooth functions.
The most basic and best-known NN is the single hidden layer feed-forward neural network (SHLNN).


Continued...

Figure 1: Graphical representation of a SHLNN.


Continued...

Figure 2: Inputs (e.g. markers) and output (phenotype) for a SHLNN.


Continued...

Prediction has two (automated) steps:

1 Inputs are transformed non-linearly in the hidden layer.
2 Outputs from the hidden layer are combined to obtain predictions.

$$y_i = \mu + \overbrace{\sum_{k=1}^{S} w_k \underbrace{g_k\left(b_k + \sum_{j=1}^{p} x_{ij}\beta_j^{[k]}\right)}_{\text{output from hidden layer}}}^{\text{combine output from hidden layer}} + e_i$$

$g_k(\cdot)$ is the activation (transformation) function.
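The two prediction steps can be sketched in a few lines of base R. This is only an illustration of the formula above, not the brnn implementation: the function name `shlnn_predict`, the layout of `B` (one column of connection strengths per neuron), and the choice of `tanh` as activation are assumptions made for the example.

```r
# Forward pass of a single hidden layer feed-forward network (SHLNN).
# X : n x p matrix of inputs (e.g. markers)
# mu: intercept; w: S output weights; b: S biases
# B : p x S matrix, column k holds beta^{[k]} (assumed layout)
# g : activation function (tanh is an assumption for this sketch)
shlnn_predict <- function(X, mu, w, b, B, g = tanh) {
  lin <- sweep(X %*% B, 2, b, "+")  # step 1a: b_k + sum_j x_ij * beta_j^[k]
  H <- g(lin)                       # step 1b: hidden layer output, n x S
  as.vector(mu + H %*% w)           # step 2: combine the S hidden outputs
}

# Small simulated example (values are arbitrary)
set.seed(1)
n <- 5; p <- 3; S <- 2
X <- matrix(rnorm(n * p), n, p)
B <- matrix(rnorm(p * S), p, S)
yhat <- shlnn_predict(X, mu = 0.5, w = c(1, -1), b = c(0, 0.1), B = B)
```

With the identity function as `g` and S = 1 the sketch reduces to a linear model in the markers, which is one way to see the SHLNN as a generalization of eq. (1).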


Model fitting

Parameters to be estimated in a NN are the weights $(w_1, \ldots, w_S)$, biases $(b_1, \ldots, b_S)$, connection strengths $(\beta_1^{[1]}, \ldots, \beta_p^{[1]}; \ldots; \beta_1^{[S]}, \ldots, \beta_p^{[S]})$, $\mu$ and $\sigma_e^2$.

When the number of predictors (p) and of neurons (S) increases, the number of parameters to estimate grows quickly.

=⇒ This can cause over-fitting.

To prevent over-fitting, use penalized methods, via Bayesian approaches.


Empirical Bayes

MacKay (1995) developed an empirical Bayes framework for estimating the parameters of a NN.

Let $\theta = (w_1, \ldots, w_S, b_1, \ldots, b_S, \beta_1^{[1]}, \ldots, \beta_p^{[1]}; \ldots; \beta_1^{[S]}, \ldots, \beta_p^{[S]}, \mu)'$

$$p(\theta|\sigma_\theta^2) = MN(0, \sigma_\theta^2 I)$$

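Estimation requires two steps.

1) Obtain the conditional posterior modes of the elements in $\theta$, assuming $\sigma_\theta^2$, $\sigma_e^2$ known. These are obtained by maximizing

$$p(\theta|y, \sigma_\theta^2, \sigma_e^2) = \frac{p(y|\theta, \sigma_e^2)\,p(\theta|\sigma_\theta^2)}{p(y|\sigma_\theta^2, \sigma_e^2)} = \frac{p(y|\theta, \sigma_e^2)\,p(\theta|\sigma_\theta^2)}{\int_{R^m} p(y|\theta, \sigma_e^2)\,p(\theta|\sigma_\theta^2)\,d\theta}$$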

which is equivalent to minimizing the "augmented" sum of squares:

$$F(\theta) = \frac{1}{2\sigma_e^2}\sum_{i=1}^{n} e_i^2 + \frac{1}{2\sigma_\theta^2}\sum_{j=1}^{m} \theta_j^2 \qquad (4)$$
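As a minimal sketch, the augmented sum of squares in (4) can be written as a small R function. The name `augmented_ss` and its argument names are hypothetical, and the residuals are taken as squared, as in a penalized least-squares criterion.

```r
# Augmented sum of squares F(theta) from eq. (4).
# e           : vector of residuals e_i = y_i - yhat_i
# theta       : vector of all network parameters
# sigma2_e    : residual variance
# sigma2_theta: prior variance of the parameters
augmented_ss <- function(e, theta, sigma2_e, sigma2_theta) {
  sum(e^2) / (2 * sigma2_e) + sum(theta^2) / (2 * sigma2_theta)
}

# Toy values: the first term is 2/2 = 1, the second is 4/4 = 1
F_val <- augmented_ss(e = c(1, -1), theta = 2, sigma2_e = 1, sigma2_theta = 2)
```

A smaller `sigma2_theta` gives the second term more weight and shrinks the parameters harder towards zero, which is how the prior acts as a penalty.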


Continued...

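2) Update $\sigma_\theta^2$, $\sigma_e^2$ by maximizing the marginal likelihood of the data, $p(y|\sigma_\theta^2, \sigma_e^2)$.

The marginal log-likelihood is approximated as:

$$\log p(y|\sigma_\theta^2, \sigma_e^2) \approx k + \frac{n}{2}\log\beta + \frac{m}{2}\log\alpha - \frac{1}{2}\log|\Sigma|_{\theta=\theta_{map}} - F(\theta)|_{\theta=\theta_{map}}$$

where $\Sigma = \frac{\partial^2}{\partial\theta\,\partial\theta'} F(\theta)$, $\alpha = 1/(2\sigma_\theta^2)$ and $\beta = 1/(2\sigma_e^2)$.
It can be shown that this function is maximized when:

$$\alpha = \frac{\gamma}{2\sum_{j=1}^{m}\theta_j^2}, \qquad \beta = \frac{n - \gamma}{2\sum_{i=1}^{n} e_i^2}, \qquad \gamma = m - 2\alpha\,\mathrm{Trace}(\Sigma^{-1})$$

Iterate between 1 and 2 until convergence.

NOTE: Similar to using BLUP and ML in Gaussian linear models.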
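Step 2 can be sketched as a single R function that takes the current residuals, parameters and Hessian and returns the updated quantities. This is an illustration only: the name `update_alpha_beta` is hypothetical, `Sigma` is assumed to be the m × m matrix of second derivatives of F at the current mode, and the β update is written with the factor 2 in the denominator, matching β = 1/(2σ²ₑ) as in Foresee and Hagan (1997).

```r
# One update of alpha, beta and gamma (step 2 of the empirical Bayes scheme).
# e    : residuals at the current posterior mode
# theta: parameter estimates at the current posterior mode
# Sigma: m x m Hessian of F(theta) at the mode (assumed supplied by the caller)
# alpha: current value of alpha = 1 / (2 * sigma2_theta)
update_alpha_beta <- function(e, theta, Sigma, alpha) {
  n <- length(e)
  m <- length(theta)
  # effective number of parameters
  gamma <- m - 2 * alpha * sum(diag(solve(Sigma)))
  list(gamma = gamma,
       alpha = gamma / (2 * sum(theta^2)),
       beta  = (n - gamma) / (2 * sum(e^2)))
}

# Toy values chosen so the arithmetic is easy to follow:
# Trace(Sigma^-1) = 0.5, gamma = 2 - 2 * 0.5 * 0.5 = 1.5
upd <- update_alpha_beta(e = rep(1, 4), theta = c(1, 1),
                         Sigma = diag(4, 2), alpha = 0.5)
```

The explicit `solve(Sigma)` is the O(m³) inversion that makes this step expensive when m is large.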


Problems with the approach

Huge number of parameters to estimate,

$$m = 1 + S \times (1 + 1 + p)$$

where S is the number of neurons and p is the number of covariates.
The Gauss-Newton algorithm used to minimize (4) requires solving linear systems of order m × m, complexity O(m³).
The updating formulas for the variance components require inverting a matrix of order m × m, complexity O(m³).
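To give a feeling for how fast m grows, the count can be evaluated for the marker numbers used elsewhere in these slides (1,717 markers in the wheat case study, 300 markers in the application example). The helper name `n_params` is hypothetical.

```r
# Number of parameters in a SHLNN: mu, plus for each of the S neurons
# one weight, one bias and p connection strengths.
n_params <- function(S, p) 1 + S * (1 + 1 + p)

m_full   <- n_params(S = 2, p = 1717)  # all DArT markers, 2 neurons
m_subset <- n_params(S = 2, p = 300)   # first 300 markers, 2 neurons
m_ratio3 <- (m_full / m_subset)^3      # relative cost of an O(m^3) step
```

Here `m_full` is 3439 and `m_subset` is 605, so under the O(m³) costs above a full-marker fit needs roughly 180 times the work per iteration of the 300-marker fit.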

Alternatives:

Derivative-free algorithms (may have poor performance, unstable).
Parallel computing.


brnn

We developed an R package (brnn) that implements the empirical Bayes approach to fitting a NN. It will be available on the R mirrors in a few months.

Figure 3: Help page for the trainbr package.


Case study: additive genetic effects (wheat)

Prediction of grain yield (GY) and days to heading (DTH) in wheat lines:

306 wheat lines from the Global Wheat Program of CIMMyT.
1,717 binary markers (DArT).
Two traits analyzed:

1 GY (5 environments).
2 DTH (10 environments).

Bayesian regularized neural networks (BRNN) were fitted using the MCMC approach.

The predictive ability of BRNN was compared against standard models by generating 50 random partitions, with 90% of the observations in the training set and 10% in the testing set.
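The validation scheme can be sketched as follows. Everything here is a stand-in for illustration: the helper `make_partitions` is hypothetical, the data are simulated rather than the wheat data, and an ordinary linear model replaces BRNN so that the example runs with base R only.

```r
# Generate n_rep random training/testing partitions of n observations.
make_partitions <- function(n, n_rep = 50, test_frac = 0.10) {
  n_test <- round(test_frac * n)
  lapply(seq_len(n_rep), function(r) {
    test <- sort(sample.int(n, n_test))
    list(train = setdiff(seq_len(n), test), test = test)
  })
}

set.seed(123)
n <- 306                      # number of lines, as in the case study
parts <- make_partitions(n)   # 50 partitions, about 90% / 10%

# Simulated stand-in data and model (NOT the wheat data or BRNN)
d <- data.frame(x = rnorm(n))
d$y <- 2 * d$x + rnorm(n)

# Correlation between observed and predicted values in each testing set
cors <- sapply(parts, function(pt) {
  fit <- lm(y ~ x, data = d[pt$train, ])
  cor(d$y[pt$test], predict(fit, newdata = d[pt$test, ]))
})
mean_cor <- mean(cors)
```

Replacing the `lm` fit and `predict` call with any other model gives one correlation per partition for that model, computed on the same training and testing sets.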


Continued...

Table 1: Correlations between observed and predicted phenotypes for DTH and GY ("winner" underlined).

NOTE: Non-parametric methods better in 15/15 comparisons.


Continued...

Figure 4: Plot of the correlation for each of the 50 partitions and 10 environments for days to heading (DTH), for different combinations of models.


Application examples

Toy examples

#Example 1
#Noisy triangle wave function, similar to example 1 in Foresee and Hagan (1997)

library(brnn)

#Generating the data
x1=seq(0,0.23,length.out=25)
y1=4*x1+rnorm(25,sd=0.1)
x2=seq(0.25,0.75,length.out=50)
y2=2-4*x2+rnorm(50,sd=0.1)
x3=seq(0.77,1,length.out=25)
y3=4*x3-4+rnorm(25,sd=0.1)
x=c(x1,x2,x3)
y=c(y1,y2,y3)
X=as.matrix(x)

#Fit a network with 1 input, 2 neurons and 1 output
neurons=2
out=brnn(y,X,neurons=neurons)
cat("Message: ",out$reason,"\n")

#Plot the simulated data
plot(x,y,xlim=c(0,1),ylim=c(-1.5,1.5),main="Bayesian Regularization for ANN 1-2-1")

Note: Type library(brnn) and then demo('Example_1') to run this example in the R console.


Continued...

[Figure: plot of y against x for the noisy triangle wave data, x from 0 to 1 and y from −1.5 to 1.5; legend: Matlab, R.]


Continued...

#2 Inputs and 1 output
#The data used in Paciorek and Schervish (2004).
#The data is from a two input, one output function with Gaussian noise
#with mean zero and standard deviation 0.25.

data(twoinput)
X=normalize(as.matrix(twoinput[,1:2]))
y=as.vector(twoinput[,3])

neurons=10
out=brnn(y,X,neurons=neurons)
cat("Message: ",out$reason,"\n")

#Fitted surface evaluated on a 50 x 50 grid over the range of both inputs
f=function(x1,x2,theta,neurons) predictions.nn(X=cbind(x1,x2),theta,neurons)
x1=seq(min(X[,1]),max(X[,1]),length.out=50)
x2=seq(min(X[,2]),max(X[,2]),length.out=50)
z=outer(x1,x2,f,theta=out$theta,neurons=neurons) # fitted values on the grid

transformation_matrix=persp(x1, x2, z,
    main="Fitted model",
    sub=expression(y==italic(g)~(bold(x))+e),
    col="lightgreen",theta=30, phi=20,r=50, d=0.1,expand=0.5,
    ltheta=90, lphi=180,shade=0.75, ticktype="detailed",nticks=5)

#Add the fitted values at the observed inputs
points(trans3d(X[,1],X[,2], f(X[,1],X[,2],theta=out$theta,neurons=neurons),
    transformation_matrix), col = "red")


Continued...

[Figure: "Fitted model", y = g(x) + e. Perspective plot of the fitted surface z over x1 and x2, with the fitted values at the observed inputs drawn as points.]


Application for the wheat dataset

Warning: This analysis can take a while. We selected only some markers. You can select markers based on p-values, for example, or try to reduce the dimensionality of your problem by using the G matrix or principal component scores as input.

rm(list=ls())
setwd("/tmp")
library(brnn)
library(BLR)

#Load the wheat dataset
data(wheat)

#Normalize inputs
y=normalize(Y[,1])
X=normalize(X)

#Number of markers to use
p=300

#Fit the model with the FULL DATA, but only some markers.
#You can select the markers based on p-values, for example
out=brnn(y=y,X=X[,1:p],neurons=2)
cat("Message: ",out$reason,"\n")

#Obtain predictions
yhat_R=predictions.nn(X[,1:p],out$theta,neurons=2)
plot(y,yhat_R)


Continued...

[Figure: scatter plot of the predictions (yhat_R) against the observed values (y) for the wheat data.]

Notes:

The function predictions.nn obtains ŷ. This function takes as arguments the input matrix, the vector of estimated parameters and the number of neurons.
The vector of estimated parameters can be obtained using the function brnn.
The brnn software runs faster in the R version developed by Revolution Analytics in Linux environments.


References

de los Campos G., H. Naya, D. Gianola, J. Crossa, A. Legarra, E. Manfredi, K. Weigel and J. Cotes. 2009. Predicting Quantitative Traits with Regression Models for Dense Molecular Markers and Pedigree. Genetics 182: 375-385.

Foresee, F. D., and M. T. Hagan. 1997. Gauss-Newton approximation to Bayesian regularization. Proceedings of the 1997 International Joint Conference on Neural Networks.

Gianola D., Fernando R., Stella A. 2006. Genomic-assisted prediction of genetic values with semi-parametric procedures. Genetics 173: 1761-1776.

Gianola D., van Kaam J.B.C.H.M. 2008. Reproducing kernel Hilbert space regression methods for genomic-assisted prediction of quantitative traits. Genetics 178: 2289-2303.


Gianola, D., Okut, H., Weigel, K. and Rosa, G. 2011. Predicting complex quantitative traits with Bayesian neural networks: a case study with Jersey cows and wheat. BMC Genetics.

MacKay, D. 1995. Probable networks and plausible predictions - a review of practical Bayesian methods. Network: Computation in Neural Systems.
