
Statistical analysis methods

Date post: 25-Feb-2016
Page 1: Statistical analysis methods

An International Centre for Mouse Genetics

STATISTICAL ANALYSIS METHODS

Hugh Morgan

Page 2: Statistical analysis methods

Introduction

• Role of statistics
• Current Methods
  • EuroPhenome
    • Numerical Parameters
    • Categorical Parameters
  • MGP
• Problems with these methods and alternatives
• Worked Example
• Tasks

Page 3: Statistical analysis methods

Role of statistics

• To determine the effect of the genomic alteration on the phenotype of the animal
• To distinguish that effect from substantial multi-factorial noise
• To provide an estimate of the confidence in the veracity of the effect

Page 4: Statistical analysis methods

Current Methods

• EuroPhenome
  • Numerical Parameters – Wilcoxon rank-sum test
  • Categorical Parameters – Fisher's Exact or Chi-Squared test
  • p-value threshold: 0.0001 (equivalent to a 4% chance of a false positive in 400 measured parameters)
• Sanger Mouse Portal / MGP
  • Numerical Parameters – Reference Range
  • Categorical Parameters – Fisher's Exact test with absolute change threshold
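The 4% figure for the EuroPhenome threshold can be checked with a line of arithmetic. The sketch below assumes the 400 tests are independent, which the slide does not state; the variable names are illustrative only.

```r
# Per-test significance threshold and number of measured parameters (from the slide)
alpha    <- 0.0001
n_params <- 400

# Expected number of false positives across all parameters: 400 * 0.0001 = 0.04
expected_fp <- n_params * alpha

# Probability of at least one false positive, ASSUMING independent tests
fwer <- 1 - (1 - alpha)^n_params

print(c(expected_false_positives = expected_fp, prob_at_least_one = fwer))
```

Under the independence assumption the probability of at least one false positive is slightly below the simple product, about 0.039, so the slide's 4% is a close upper bound.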

Page 5: Statistical analysis methods

Do them yourself

• All commands are at:
  • http://mrcmousenetwork.har.mrc.ac.uk/r-commands-mrc-mouse-network-training
• Get data:
  • Akt2, Fat mass, View Data, Get as CSV, Save Page
• Install R (if required, google R)
  • akt2Fat=read.csv("akt2Fat.csv")
  • summary(akt2Fat)
• Wilcoxon rank-sum test
  • wilcox.test(Value~Genotype, data = akt2Fat)
  • W = 1, p-value = 6.252e-06
• T Test
  • t.test(Value~Genotype, data = akt2Fat)
  • t = -9.5627, df = 23.909, p-value = 1.212e-09
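If the CSV export is not to hand, the same two tests can be tried on simulated data. This is a minimal sketch: the data frame `akt2_sim`, its group means and its standard deviation are made up for illustration and are not the real Akt2 fat mass values, so the statistics will differ from those on the slide.

```r
# Simulated stand-in for akt2Fat.csv (ASSUMPTION: values are invented)
set.seed(1)
akt2_sim <- data.frame(
  Genotype = factor(rep(c("baseline", "Akt2"), times = c(15, 14))),
  Value    = c(rnorm(15, mean = 10, sd = 1),   # baseline animals
               rnorm(14, mean = 5,  sd = 1))   # mutant animals
)

# Non-parametric comparison of the two genotypes (rank-based)
wt <- wilcox.test(Value ~ Genotype, data = akt2_sim)

# Parametric comparison; t.test() defaults to Welch's unequal-variance test
tt <- t.test(Value ~ Genotype, data = akt2_sim)

print(c(wilcoxon_p = wt$p.value, welch_t_p = tt$p.value))
```

The fractional degrees of freedom reported on the slide (df = 23.909) are characteristic of the Welch correction that `t.test` applies by default.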

Page 17: Statistical analysis methods

Do them yourself

• Get data:
  • Abcd4, Touch escape
• R
  • abcd4Touch=matrix(c(122,9,2,8),2)
• Fisher's Exact Test
  • fisher.test(abcd4Touch)

        Fisher's Exact Test for Count Data

    data:  abcd4Touch
    p-value = 3.052e-07
    alternative hypothesis: true odds ratio is not equal to 1
    95 percent confidence interval:
       8.491575 550.552750
    sample estimates:
    odds ratio
      50.40908
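The same table can be built with labels so that the counts are easier to read, and the result compared against the EuroPhenome threshold. This is a sketch only: the row and column labels are assumptions, since the slide gives the four counts without saying which is which. The test result does not depend on the labelling.

```r
# Counts from the slide; matrix() fills column by column.
# ASSUMPTION: the dimnames below are hypothetical labels, not from the source.
abcd4Touch <- matrix(
  c(122, 9, 2, 8), nrow = 2,
  dimnames = list(Response = c("typical", "atypical"),
                  Genotype = c("baseline", "Abcd4"))
)

ft <- fisher.test(abcd4Touch)

# Compare against the EuroPhenome p-value threshold of 0.0001
significant <- ft$p.value < 0.0001

print(abcd4Touch)
print(ft$p.value)
print(significant)
```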

Page 18: Statistical analysis methods

Sanger Mouse Portal / MGP

• Numerical Parameters – Reference Range
  • Calculate the range of values that encompasses 95% of the baseline dataset
  • Call a line phenodeviant in a parameter if 60% or more of the animals fall outside of that range
• Categorical Parameters – Fisher's Exact test with absolute change threshold
  • Fisher's Exact test gives p-value < 5% AND
  • Absolute change of proportion > 60%
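The two MGP rules above can be sketched as small R functions. This is an illustration under stated assumptions, not the MGP implementation: the function names are hypothetical, the 95% range is taken here as the central 2.5% to 97.5% quantiles of the baseline (the slide does not say how the range is constructed), and the example data are invented.

```r
# Numerical rule: phenodeviant if >= 60% of mutant animals fall outside
# the range covering 95% of baseline values.
# ASSUMPTION: the range is the central quantile interval of the baseline.
reference_range_call <- function(baseline, mutant,
                                 coverage = 0.95, min_fraction = 0.6) {
  tail_p <- (1 - coverage) / 2
  rr <- quantile(baseline, probs = c(tail_p, 1 - tail_p),
                 na.rm = TRUE, names = FALSE)
  outside <- mean(mutant < rr[1] | mutant > rr[2], na.rm = TRUE)
  list(range = rr, fraction_outside = outside,
       phenodeviant = outside >= min_fraction)
}

# Categorical rule: Fisher p < 5% AND absolute change in proportion > 60%.
# ASSUMPTION: counts is 2x2 with rows = (baseline, mutant) and
# columns = (typical, atypical).
categorical_call <- function(counts, alpha = 0.05, min_change = 0.6) {
  p <- fisher.test(counts)$p.value
  prop_atypical <- counts[, 2] / rowSums(counts)
  change <- abs(unname(prop_atypical[2] - prop_atypical[1]))
  list(p_value = p, abs_change = change,
       phenodeviant = (p < alpha) && (change > min_change))
}

# Invented example data
num_res <- reference_range_call(baseline = 1:100, mutant = c(0, 1, 2, 50, 120))
cat_counts <- matrix(c(95, 2, 5, 8), nrow = 2,
                     dimnames = list(c("baseline", "mutant"),
                                     c("typical", "atypical")))
cat_res <- categorical_call(cat_counts)

print(num_res)
print(cat_res)
```

In the numerical example four of the five mutant values lie outside the baseline range, so the line is called phenodeviant; in the categorical example the atypical proportion rises from 5% to 80%, which clears both conditions.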

Page 22: Statistical analysis methods

Problems with these methods and alternatives

• Local structure / Lack of independence
  • Inter-day variance is greater than intra-day variance
  • Two measurements on the same day are likely to be more similar than two measurements on different days
• Cause
  • ?
• Solution
  • Model the structure
  • Linear Mixed Model

Page 23: Statistical analysis methods

Mixed Model

• Model the data as the sum of 2 normal distributions, plus a number of fixed effects
  • Normally distributed
    • Inter-animal difference
    • Inter-day difference
  • Fixed
    • Gender
    • Other parameters such as Weight
    • Genomic alteration (Genotype)
    • Gender / Genotype interaction effect
• Calculate the p-value under the null hypothesis that the Genotype effect is zero
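One way to write the model described above as an equation is given below. The notation is an assumption introduced here for illustration (the slides give no formula): $y_{ij}$ is the measurement for animal $i$ taken on day $j$, $G_i$ and $S_i$ are indicator variables for genotype and gender, $d_j$ is the inter-day term and $\varepsilon_{i}$ the inter-animal term.

```latex
y_{ij} = \beta_0 + \beta_G G_i + \beta_S S_i + \beta_{GS}\, G_i S_i + d_j + \varepsilon_{i},
\qquad d_j \sim \mathcal{N}(0, \sigma_d^2),
\qquad \varepsilon_{i} \sim \mathcal{N}(0, \sigma_a^2)
```

Further fixed covariates such as weight would enter as additional $\beta$ terms. The reported p-value for genotype then corresponds to testing the null hypothesis $\beta_G = 0$.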

Page 24: Statistical analysis methods

Do them yourself

• Get data:
  • Ptk7, Grip-Strength, Forelimb grip strength measurement mean, View Data, Get as CSV, Save File
• R
  • ptk7GS=read.csv("ptk7GS.csv")
  • summary(ptk7GS)

      Centre        Strain        Genotype   Zygosity   Gender      Parameter
     WTSI:29   129/SvEv:29   Akt2    :14       :15    Male:29   Fat mass:29
                             baseline:15    Hom:14

(The summary shown on the slide is that of the Akt2 fat mass data from the earlier exercise, not of ptk7GS.)

Page 25: Statistical analysis methods

Do them yourself

• Linear Model (no batch effect modeled)
  • ptk7GSLM=lm(Value~Genotype + Gender + Genotype*Gender, ptk7GS, na.action="na.omit")
  • summary(ptk7GSLM)

    Coefficients:
                            Estimate Std. Error t value Pr(>|t|)
    (Intercept)               68.777      2.475  27.794  < 2e-16 ***
    GenotypePtk7             -14.134      5.891  -2.399  0.01777 *
    GenderMale                11.454      4.011   2.855  0.00497 **
    GenotypePtk7:GenderMale    1.987      8.966   0.222  0.82496
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

    Residual standard error: 20.7 on 136 degrees of freedom
    Multiple R-squared:  0.1222,  Adjusted R-squared:  0.1028
    F-statistic: 6.311 on 3 and 136 DF,  p-value: 0.0004862

Page 26: Statistical analysis methods

Do them yourself

• Look at Fit
  • ptk7GSLMRes<-residuals(ptk7GSLM)
  • qqnorm(scale(ptk7GSLMRes))
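The residual check can be tried end to end on simulated data. A minimal sketch, assuming an invented data frame `sim` whose effect sizes are made up; `qqline` is added here as a visual reference and is not among the slide's commands.

```r
# Simulated stand-in for ptk7GS (ASSUMPTION: all numbers are invented)
set.seed(7)
n <- 140
sim <- data.frame(
  Genotype = factor(rep(c("baseline", "Ptk7"), times = c(110, 30)),
                    levels = c("baseline", "Ptk7")),
  Gender   = factor(rep(c("Female", "Male"), length.out = n))
)
sim$Value <- 70 + 12 * (sim$Gender == "Male") -
  14 * (sim$Genotype == "Ptk7") + rnorm(n, sd = 20)

# Same model form as on the slide: main effects plus interaction
fit <- lm(Value ~ Genotype * Gender, data = sim)

# Standardise the residuals, then compare them with normal quantiles
res <- as.numeric(scale(residuals(fit)))
qq  <- qqnorm(res, main = "Normal Q-Q plot of scaled residuals")
qqline(res)  # points close to this line suggest roughly normal residuals
```

Systematic curvature away from the line, especially in the tails, would suggest that the normality assumption behind the model's p-values is doubtful.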

Page 27: Statistical analysis methods

Do them yourself

• Mixed Model
  • Excel
    • load ptk7GS.csv
    • =LEFT(H2,(SEARCH("_",H2)-2))
    • Save ptk7GSLitter.csv
  • R
    • library(nlme)
    • ptk7GSLitter=read.csv("ptk7GSLitter.csv")
    • ptk7GSMM=lme(Value~Genotype + Gender + Genotype*Gender, random=~1|Litter, ptk7GSLitter, na.action="na.omit")
    • summary(ptk7GSMM)
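The Excel step can also be done in R. The sketch below mirrors the formula `=LEFT(H2,(SEARCH("_",H2)-2))`, which keeps everything up to two characters before the first underscore. The function name `derive_litter` and the example identifiers are hypothetical; the slides do not show what column H contains or what it is called.

```r
# Hypothetical helper reproducing LEFT(x, SEARCH("_", x) - 2)
derive_litter <- function(id) {
  id  <- as.character(id)
  pos <- as.integer(regexpr("_", id, fixed = TRUE))  # first "_", or -1 if none
  # Keep the characters before the one that precedes the underscore;
  # return NA where there is no underscore (Excel would give an error)
  ifelse(pos > 0, substr(id, 1, pos - 2), NA_character_)
}

# Made-up identifiers, for illustration only
ids    <- c("L0421a_3", "L0421b_7", "L0533a_1")
litter <- derive_litter(ids)
print(litter)
```

Applied to the real export, the result would be stored in a new `Litter` column of the data frame before fitting the mixed model, using whichever column corresponds to column H in the spreadsheet.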

Page 29: Statistical analysis methods

Do them yourself

• Mixed Model
  • R
    • library(nlme)
    • ptk7GSLitter=read.csv("ptk7GSLitter.csv")
    • ptk7GSMM=lme(Value~Genotype + Gender + Genotype*Gender, random=~1|Litter, ptk7GSLitter, na.action="na.omit")
    • summary(ptk7GSMM)

    Linear mixed-effects model fit by REML

    Fixed effects: Value ~ Genotype + Gender + Genotype * Gender
                               Value Std.Error DF   t-value p-value
    (Intercept)             67.02067  3.377184 85 19.845137  0.0000
    GenotypePtk7           -12.05973  7.461470 85 -1.616267  0.1097
    GenderMale              12.59607  4.403984 85  2.860154  0.0053
    GenotypePtk7:GenderMale  1.42342  8.819061 85  0.161403  0.8722
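To see why the genotype p-value changes between the linear model and the mixed model, both can be fitted to simulated data with a shared litter effect. This is a sketch under assumptions: the data frame `sim`, the litter structure (genotype assigned per litter) and all effect sizes are invented, and no true genotype effect is simulated.

```r
library(nlme)  # provides lme(); ships with standard R distributions

# Simulated data (ASSUMPTION: all numbers and the design are invented)
set.seed(42)
n_litters  <- 30
per_litter <- 4
n          <- n_litters * per_litter
litter_id  <- rep(seq_len(n_litters), each = per_litter)

sim <- data.frame(
  Litter   = factor(litter_id),
  Genotype = factor(ifelse(litter_id <= 8, "Ptk7", "baseline"),
                    levels = c("baseline", "Ptk7")),
  Gender   = factor(rep(c("Female", "Male"), length.out = n))
)

# Animals in the same litter share a random shift; genotype has no effect
litter_shift <- rnorm(n_litters, sd = 10)[litter_id]
sim$Value <- 70 + 12 * (sim$Gender == "Male") + litter_shift + rnorm(n, sd = 8)

# Linear model ignoring the litter structure
fit_lm  <- lm(Value ~ Genotype * Gender, data = sim)
# Mixed model with a random intercept per litter
fit_lme <- lme(Value ~ Genotype * Gender, random = ~ 1 | Litter, data = sim)

lm_tab  <- summary(fit_lm)$coefficients
lme_tab <- summary(fit_lme)$tTable
print(lm_tab)
print(lme_tab)
```

Comparing the `GenotypePtk7` rows of the two tables shows how accounting for the grouping alters the standard error and degrees of freedom, which is the same kind of shift seen between the two ptk7GS outputs on the slides.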

Page 30: Statistical analysis methods

Do them yourself

• Mixed Model
  • ptk7GSMMRes<-residuals(ptk7GSMM)
  • qqnorm(scale(ptk7GSMMRes))

