Prediction of Genetic Values using Neural Networks
Paulino Perez 1
Daniel Gianola 2
Jose Crossa 1
1 CIMMYT, Mexico. 2 University of Wisconsin, Madison.
September 2014
SLU,Sweden Prediction of genetic Values using Neural Networks 1/26
Contents
1 Introduction
2 Non-linear models and NN
3 Model fitting
4 Case study: Wheat
5 Application examples
Introduction
High-density marker panels enable genomic selection (GS).
Marker-based models perform better than pedigree-based models (e.g., de los Campos et al., 2009).
Most research has been done with linear additive models (see eq. 1).
It might be possible to increase accuracy using non-linear models with dominance and additive effects.
y_i = \sum_{j=1}^{p} x_{ij} \beta_j + e_i \qquad (1)
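As a point of reference, a model of the form of eq. (1) can be fitted with a simple penalized (ridge) regression. The sketch below is purely illustrative: the binary markers are simulated and the shrinkage value lambda is arbitrary, not one of the models discussed in these slides.

```r
# Illustrative ridge fit of the additive marker model y_i = sum_j x_ij*beta_j + e_i
set.seed(1)
n <- 100; p <- 20
X <- matrix(rbinom(n * p, 1, 0.5), n, p)   # simulated binary markers (e.g., DArT-like)
beta_true <- rnorm(p, sd = 0.3)
y <- X %*% beta_true + rnorm(n, sd = 0.5)
lambda <- 1                                 # arbitrary shrinkage parameter
beta_hat <- solve(crossprod(X) + lambda * diag(p), crossprod(X, y))
yhat <- X %*% beta_hat
drop(cor(y, yhat))                          # fitted correlation
```

Penalization here plays the same role as the prior variance in the Bayesian approaches discussed later: it shrinks the marker effects toward zero to control over-fitting.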
Continued... Recent studies with non-additive effects:
Continued...
Non-linear models and neural networks
y_i = \mu + f(\mathbf{x}_i) + e_i \qquad (2)
Any non-linear function can be exactly represented as (Kolmogorov's theorem):
f(\mathbf{x}_i) = f(x_{i1}, \ldots, x_{ip}) = \sum_{q=1}^{2p+1} g\left( \sum_{r=1}^{p} \lambda_r h_q(x_{ir}) \right) \qquad (3)
In neural networks (NN), non-linear functions are "approximated" as sums of finite series of smooth functions.
The most basic and well-known NN is the Single Hidden Layer Feed-Forward Neural Network (SHLNN).
Continued...
Figure 1: Graphical representation of a SHLNN.
Continued...
Figure 2: Inputs (e.g., markers) and output (phenotype) for a SHLNN.
Continued...
Prediction has two (automated) steps:
1 Inputs are transformed non-linearly in the hidden layer.
2 Outputs from the hidden layer are combined to obtain predictions.
y_i = \mu + \overbrace{\sum_{k=1}^{S} w_k \underbrace{g_k\left( b_k + \sum_{j=1}^{p} x_{ij} \beta_j^{[k]} \right)}_{\text{output from hidden layer}}}^{\text{combine output from hidden layer}} + e_i
g_k(·) is the activation (transformation) function.
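The prediction equation above can be sketched directly in R. This is an illustrative forward pass only, assuming a tanh activation for g_k; the weights, biases and connection strengths below are arbitrary values, not fitted estimates.

```r
# SHLNN forward pass: y_i = mu + sum_k w_k * g_k(b_k + sum_j x_ij * beta_j^[k]) 
shlnn_predict <- function(X, mu, w, b, B) {
  # X: n x p inputs; w, b: length-S weights and biases; B: p x S connection strengths
  Z <- tanh(sweep(X %*% B, 2, b, "+"))  # hidden-layer outputs g_k(.), assumed tanh
  as.vector(mu + Z %*% w)               # combine hidden-layer outputs
}

set.seed(2)
X <- matrix(rnorm(10 * 3), 10, 3)       # 10 observations, 3 inputs
yhat <- shlnn_predict(X, mu = 0,
                      w = c(0.5, -0.5), b = c(0.1, -0.1),
                      B = matrix(rnorm(6), 3, 2))
length(yhat)                             # one prediction per observation
```

Because tanh is bounded in (-1, 1), the predictions here are bounded by |mu| plus the sum of the absolute output weights, which makes the role of w_k as a "combining" layer easy to see.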
Model fitting
The parameters to be estimated in a NN are the weights (w_1, \ldots, w_S), biases (b_1, \ldots, b_S), connection strengths (\beta_1^{[1]}, \ldots, \beta_p^{[1]}; \ldots; \beta_1^{[S]}, \ldots, \beta_p^{[S]}), \mu and \sigma^2_e.
When the number of predictors (p) and the number of neurons (S) increase, the number of parameters to estimate grows quickly.
=⇒ This can cause over-fitting.
To prevent over-fitting, use penalized methods, via Bayesian approaches.
Contents
1 Introduction
2 Non-linear models and NN
3 Model fitting: Empirical Bayes
4 Case study: Wheat
5 Application examples
Empirical Bayes
MacKay (1995) developed an Empirical Bayes framework for estimating the parameters of a NN.
Let \theta = (w_1, \ldots, w_S, b_1, \ldots, b_S, \beta_1^{[1]}, \ldots, \beta_p^{[1]}; \ldots; \beta_1^{[S]}, \ldots, \beta_p^{[S]}, \mu)'

p(\theta \mid \sigma^2_\theta) = MN(\mathbf{0}, \sigma^2_\theta \mathbf{I})
Estimation requires two steps:
1) Obtain the conditional posterior modes of the elements in \theta assuming \sigma^2_\theta and \sigma^2_e known. These are obtained by maximizing

p(\theta \mid y, \sigma^2_\theta, \sigma^2_e) = \frac{p(y \mid \theta, \sigma^2_e)\, p(\theta \mid \sigma^2_\theta)}{p(y \mid \sigma^2_\theta, \sigma^2_e)} = \frac{p(y \mid \theta, \sigma^2_e)\, p(\theta \mid \sigma^2_\theta)}{\int_{\mathbb{R}^m} p(y \mid \theta, \sigma^2_e)\, p(\theta \mid \sigma^2_\theta)\, d\theta},

which is equivalent to minimizing the "augmented" sum of squares:

F(\theta) = \frac{1}{2\sigma^2_e} \sum_{i=1}^{n} e_i^2 + \frac{1}{2\sigma^2_\theta} \sum_{j=1}^{m} \theta_j^2 \qquad (4)
Continued...
2) Update \sigma^2_\theta and \sigma^2_e by maximizing the marginal likelihood of the data, p(y \mid \sigma^2_\theta, \sigma^2_e).
The marginal log-likelihood is approximated as:

\log p(y \mid \sigma^2_\theta, \sigma^2_e) \approx k + \frac{n}{2} \log \beta + \frac{m}{2} \log \alpha - \frac{1}{2} \log |\Sigma| \Big|_{\theta=\theta_{MAP}} - F(\theta) \Big|_{\theta=\theta_{MAP}}

where \Sigma = \frac{\partial^2}{\partial \theta \partial \theta'} F(\theta), \alpha = 1/(2\sigma^2_\theta) and \beta = 1/(2\sigma^2_e). It can be shown that this function is maximized when:

\alpha = \frac{\gamma}{2 \sum_{j=1}^{m} \theta_j^2}, \qquad \beta = \frac{n - \gamma}{2 \sum_{i=1}^{n} e_i^2}, \qquad \gamma = m - 2\alpha\, \mathrm{Trace}(\Sigma^{-1})
Iterate between steps 1 and 2 until convergence.
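For intuition, the two-step iteration can be sketched for a plain linear model, where step 1 has a closed form. This is an illustrative sketch with simulated data, not the NN algorithm itself; it uses alpha = 1/(2*sigma_theta^2) and beta = 1/(2*sigma_e^2), so the Hessian of F is Sigma = 2*(beta*X'X + alpha*I) and gamma = m - 2*alpha*tr(Sigma^-1) = m - alpha*tr(A^-1).

```r
# MacKay-style iteration for a linear model y = X theta + e (illustrative only)
set.seed(3)
n <- 80; m <- 10
X <- matrix(rnorm(n * m), n, m)
y <- X %*% rnorm(m) + rnorm(n)

alpha <- 1; beta <- 1                       # arbitrary starting values
for (it in 1:50) {
  A     <- beta * crossprod(X) + alpha * diag(m)
  theta <- solve(A, beta * crossprod(X, y))  # step 1: conditional posterior mode
  gamma <- m - alpha * sum(diag(solve(A)))   # effective number of parameters
  e     <- y - X %*% theta
  alpha <- gamma / (2 * sum(theta^2))        # step 2: update variance parameters
  beta  <- (n - gamma) / (2 * sum(e^2))
}
c(gamma = gamma, sigma2_e = 1 / (2 * beta))  # gamma near m when the signal is strong
```

With simulated noise variance 1, the implied estimate 1/(2*beta) settles near 1, and gamma approaches m because all m coefficients are well determined by the data.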
NOTE: similar to using BLUP and ML in Gaussian linear models.
Problems with the approach
Huge number of parameters to estimate,
m = 1 + S × (1 + 1 + p)
where S is the number of neurons and p is the number of covariates.
The Gauss-Newton algorithm used to minimize (4) requires solving linear systems of order m × m, with complexity O(m³).
The updating formulas for the variance components require inverting a matrix of order m × m, also with complexity O(m³).
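To see how quickly m grows, the formula above can be evaluated for the wheat marker panel used later (1,717 markers); the function name below is just for illustration.

```r
# Number of parameters in a SHLNN: intercept + S * (weight + bias + p connection strengths)
m_params <- function(S, p) 1 + S * (1 + 1 + p)

m_params(2, 1717)    # 2 neurons on the wheat panel  -> 3439 parameters
m_params(10, 1717)   # 10 neurons                    -> 17191 parameters
```

Even a modest network therefore implies thousands of unknowns, which is why the O(m³) linear solves and matrix inversions dominate the cost of fitting.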
Alternatives:
Derivative-free algorithms (may have poor performance, unstable).
Parallel computing.
brnn
We developed an R package (brnn) that implements the Empirical Bayes approach to fitting a NN. It will be available in a few months on the R mirrors.
Figure 3: Help page for the trainbr package.
Case study: additive genetic effects (wheat)
Prediction of grain yield (GY) and days to heading (DTH) in wheat lines:
306 wheat lines from the Global Wheat Program of CIMMYT.
1,717 binary markers (DArT).
Two traits analyzed:
1 GY (5 environments).
2 DTH (10 environments).
Bayesian regularized neural networks (BRNN) were fitted using the MCMC approach.
The predictive ability of BRNN was compared against standard models by generating 50 random partitions with 90% of the observations in training and 10% in testing.
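The partitioning scheme can be sketched as follows. Everything here is a stand-in: the data are simulated with dimensions mirroring the wheat set, and a simple ridge regression replaces BRNN so the sketch stays self-contained.

```r
# 50 random 90/10 train/test splits; predictive correlation per partition
set.seed(4)
n <- 306; p <- 50
X <- matrix(rnorm(n * p), n, p)
y <- X %*% rnorm(p, sd = 0.2) + rnorm(n)

cors <- replicate(50, {
  test <- sample(n, round(0.1 * n))                     # 10% held out
  b <- solve(crossprod(X[-test, ]) + diag(p),            # fit on the 90% (ridge stand-in)
             crossprod(X[-test, ], y[-test]))
  drop(cor(y[test], X[test, ] %*% b))                    # correlation in the test set
})
mean(cors)   # average predictive correlation across the 50 partitions
```

Averaging the correlation over many random partitions, rather than using a single split, is what makes comparisons like those in Table 1 stable.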
Continued...
Table 1: Correlations between observed and predictedphenotypes for DTH and GY (“winner” underlined).
NOTE: Non-parametric methods were better in 15/15 comparisons.
Continued...
Figure 4: Correlation for each of the 50 partitions and 10 environments for days to heading (DTH), for different combinations of models.
Toy examples
#Example 1
#Noise triangle wave function, similar to example 1 in Foresee and Hagan (1997)

#Generating the data
x1=seq(0,0.23,length.out=25)
y1=4*x1+rnorm(25,sd=0.1)
x2=seq(0.25,0.75,length.out=50)
y2=2-4*x2+rnorm(50,sd=0.1)
x3=seq(0.77,1,length.out=25)
y3=4*x3-4+rnorm(25,sd=0.1)
x=c(x1,x2,x3)
y=c(y1,y2,y3)
X=as.matrix(x)

neurons=2
out=brnn(y,X,neurons=neurons)
cat("Message: ",out$reason,"\n")

plot(x,y,xlim=c(0,1),ylim=c(-1.5,1.5),main="Bayesian Regularization for ANN 1-2-1")
Note: type library(brnn) and then demo('Example_1') to run this example in the R console.
Continued...
[Figure: scatter of the noisy triangle-wave data (x in [0, 1], y in [-1.5, 1.5]) with the fitted curves from Matlab and R overlaid.]
Continued...
#2 inputs and 1 output
#The data used in Paciorek and Schervish (2004): a two-input, one-output function
#with Gaussian noise with mean zero and standard deviation 0.25.
data(twoinput)
X=normalize(as.matrix(twoinput[,1:2]))
y=as.vector(twoinput[,3])

neurons=10
out=brnn(y,X,neurons=neurons)
cat("Message: ",out$reason,"\n")

f=function(x1,x2,theta,neurons) predictions.nn(X=cbind(x1,x2),theta,neurons)
x1=seq(min(X[,1]),max(X[,1]),length.out=50)
x2=seq(min(X[,2]),max(X[,2]),length.out=50)  #grid over the second input
z=outer(x1,x2,f,theta=out$theta,neurons=neurons) #evaluating the fitted surface on the grid

transformation_matrix=persp(x1,x2,z,main="Fitted model",
  sub=expression(y==italic(g)~(bold(x))+e),
  col="lightgreen",theta=30,phi=20,r=50,d=0.1,expand=0.5,
  ltheta=90,lphi=180,shade=0.75,ticktype="detailed",nticks=5)
points(trans3d(X[,1],X[,2],f(X[,1],X[,2],theta=out$theta,neurons=neurons),
  transformation_matrix),col="red")
Continued...
[Figure: perspective plot of the fitted surface y = g(x) + e over (x1, x2), with the observed points overlaid in red.]
Application for the wheat dataset

Warning: this analysis can take a while, so we select only some markers. You can select markers based on p-values, for example, or try to reduce the dimensionality of your problem by using the G matrix or principal component scores as inputs.

rm(list=ls())
setwd("/tmp")
library(brnn)
library(BLR)
#Load the wheat dataset
data(wheat)

#Normalize inputs
y=normalize(Y[,1])
X=normalize(X)

p=300

#Fit the model with the FULL DATA, but only some markers
#You can select the markers based on p-values, for example
out=brnn(y=y,X=X[,1:p],neurons=2)
cat("Message: ",out$reason,"\n")

#Obtain predictions
yhat_R=predictions.nn(X[,1:p],out$theta,neurons=2)
plot(y,yhat_R)
Continued...
[Figure: scatter plot of observed phenotypes (y) versus predictions (yhat_R) for the wheat data.]

Notes:
The function predictions.nn obtains the predictions; it takes as arguments the vector of estimated parameters and the number of neurons.
The vector of estimated parameters can be obtained using the function brnn.
The brnn software runs faster in the R build developed by Revolution Analytics in Linux environments.
References
de los Campos, G., H. Naya, D. Gianola, J. Crossa, A. Legarra, E. Manfredi, K. Weigel and J. Cotes. 2009. Predicting quantitative traits with regression models for dense molecular markers and pedigree. Genetics 182: 375-385.

Foresee, F. D., and M. T. Hagan. 1997. Gauss-Newton approximation to Bayesian regularization. Proceedings of the 1997 International Joint Conference on Neural Networks.

Gianola, D., R. Fernando and A. Stella. 2006. Genomic-assisted prediction of genetic values with semi-parametric procedures. Genetics 173: 1761-1776.

Gianola, D., and J. B. C. H. M. van Kaam. 2008. Reproducing kernel Hilbert spaces regression methods for genomic-assisted prediction of quantitative traits. Genetics 178: 2289-2303.
Continued...
Gianola, D., H. Okut, K. Weigel and G. Rosa. 2011. Predicting complex quantitative traits with Bayesian neural networks: a case study with Jersey cows and wheat. BMC Genetics.

MacKay, D. 1995. Probable networks and plausible predictions - a review of practical Bayesian methods. Network: Computation in Neural Systems.