Date post: | 31-Dec-2015 |
eatworms.swmed.edu/[email protected]
Basic Statistics
Combining probabilities
Samples and Populations
Four useful statistics:
  The mean, or average.
  The median, or 50% value.
  Standard deviation.
  Standard Error of the Mean (SEM).
Three distributions:
  The binomial distribution.
  The Poisson distribution.
  The normal distribution.
Four tests:
  The chi-squared goodness-of-fit test.
  The chi-squared test of independence.
  Student's t-test.
  The Mann-Whitney U-test.
Combining probabilities
When you throw a pair of dice, what is the probability of getting 11?
The probability that all of several independent events occur is the product of the individual event probabilities.
The probability that one of several mutually exclusive events occurs is the sum of the individual event probabilities.
When you throw five dice, what is the probability that at least one shows a 6?
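The two dice questions can be checked by applying the sum and product rules directly. The following is a minimal sketch, assuming fair six-sided dice; the variable names are illustrative and not from the slides.

```python
from fractions import Fraction
from itertools import product

# Enumerate all 36 equally likely outcomes of two fair six-sided dice.
faces = range(1, 7)
outcomes = list(product(faces, repeat=2))

# Sum rule: (5, 6) and (6, 5) are mutually exclusive, so their
# probabilities (1/36 each, by the product rule) add.
p_eleven = Fraction(sum(1 for a, b in outcomes if a + b == 11), len(outcomes))

# Product rule: five independent dice all fail to show a 6 with
# probability (5/6)**5, so "at least one 6" is the complement.
p_no_six = Fraction(5, 6) ** 5
p_at_least_one_six = 1 - p_no_six

print(p_eleven)             # 1/18
print(p_at_least_one_six)   # 4651/7776
```

Working with exact fractions keeps the two rules visible: 1/36 + 1/36 for the pair of dice, and 1 - (5/6)^5 for the five dice, which is about 0.6.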
Populations and samples
What proportion of the population is female?
Abstract populations: what does a mouse weigh?
Population characteristics:
  Central tendency: mean, median
  Dispersion: standard deviation
Four sample statistics
Sample mean: $\bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i$
Sample median: the middle value in a sample of odd size, the average of the two middle values in a sample of even size.
Sample standard deviation: $s = \sqrt{\frac{1}{N-1}\sum_{i=1}^{N}(x_i - \bar{x})^2}$
Standard Error of the Mean: $\mathrm{SEM} = s / \sqrt{N}$
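As a rough illustration of the four sample statistics, the sketch below computes them with Python's standard library. The data values are hypothetical mouse weights invented for the example, and the standard deviation is assumed to use the usual N - 1 denominator.

```python
import math
import statistics

# Hypothetical sample: weights of six mice, in grams (made-up values).
sample = [21.3, 19.8, 22.1, 20.4, 23.0, 18.9]

n = len(sample)
mean = statistics.mean(sample)      # sum of values divided by N
median = statistics.median(sample)  # even N: average of the two middle values
sd = statistics.stdev(sample)       # sample standard deviation, N - 1 denominator
sem = sd / math.sqrt(n)             # standard error of the mean

print(f"mean = {mean:.2f}, median = {median:.2f}, SD = {sd:.2f}, SEM = {sem:.2f}")
```

The SEM shrinks as the sample grows, while the standard deviation does not, which is why the two answer different questions.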
Standard deviation and SEM
Use standard deviation to describe how much variation there is in a population.
  Example: income, if you're interested in how much income varies within the US population.
Use SEM to say how accurate your estimate of a population mean is.
  Example: measurement of β-gal activity from a 2-hybrid test.
Sample stats: recommendations
When you report an average, report it as mean ± SEM. Same for error bars in graphs.
In the figure caption or the table heading or somewhere, say explicitly that that's what you're reporting.
Use the median for highly skewed data.
Three distributions
The binomial distribution
  When you count how many of a sample of fixed size have a certain characteristic.
The Poisson distribution
  When you count how many times something happens, and there is no upper limit.
The normal distribution
  When you measure something that doesn't have to be an integer or when you average several continuous measurements.
The binomial distribution
When you count how many of a sample of fixed size have a certain characteristic.
Parameters:
  N: the fixed sample size
  p: the probability that one thing has the characteristic
  q: the probability that it doesn't: (1-p)
Formula: $\Pr(x) = \frac{N!}{x!\,(N-x)!}\, p^x q^{N-x}$
Example: Females in a population, animals having a certain genetic characteristic.
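The binomial probability of seeing exactly x items with the characteristic can be sketched in a few lines. The function name `binomial_pmf` and the example numbers (10 animals, each female with probability 0.5) are assumptions made for illustration.

```python
from math import comb

def binomial_pmf(x: int, n: int, p: float) -> float:
    """Probability that exactly x of n independent items have the characteristic."""
    q = 1.0 - p
    return comb(n, x) * p**x * q**(n - x)

# Hypothetical example: 10 animals, each female with probability 0.5.
probs = [binomial_pmf(x, 10, 0.5) for x in range(11)]

print(probs[5])     # 0.24609375, the chance of exactly 5 females
print(sum(probs))   # the probabilities over x = 0..10 add up to 1
```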
The Poisson distribution
When you count how many times something happens, and there is no (or only a very large) upper limit.
Parameter:
  μ: the population mean
Formula: $\Pr(x) = \frac{e^{-\mu}\,\mu^{x}}{x!}$
Example: Radioactivity counts, positive clones in a library.
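A Poisson probability needs only the population mean. The sketch below assumes a hypothetical mean of 3 positive clones per plate; the function name `poisson_pmf` and that number are illustrative, not from the slides.

```python
from math import exp, factorial

def poisson_pmf(x: int, mu: float) -> float:
    """Probability of counting exactly x events when the population mean is mu."""
    return exp(-mu) * mu**x / factorial(x)

# Hypothetical example: an average of 3 positive clones per plate.
mu = 3.0
p_zero = poisson_pmf(0, mu)      # chance of a plate with no positive clones
p_at_least_one = 1.0 - p_zero    # complement: one or more positives

print(f"Pr(0) = {p_zero:.4f}, Pr(at least 1) = {p_at_least_one:.4f}")
```

Because there is no fixed sample size, the count x can in principle be any non-negative integer, although large counts quickly become very unlikely.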
The normal distribution
When you measure something that doesn't have to be an integer, e.g. weight of a mouse, or velocity of an enzyme reaction, and especially when you average several such continuous measurements.
Parameters:
  μ: the population mean
  σ²: the population variance
Formula: $f(x) = \frac{1}{\sqrt{2\pi\sigma^{2}}}\exp\!\left(-\frac{(x-\mu)^{2}}{2\sigma^{2}}\right)$
Example: Weight, heart rate, enzyme activity
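The normal density can be written out from its two parameters and compared against the standard library. This is a sketch under assumed values: a hypothetical mouse-weight population with mean 25 g and variance 4 g², and a helper name `normal_pdf` chosen for the example.

```python
from math import exp, pi, sqrt
from statistics import NormalDist

def normal_pdf(x: float, mu: float, sigma2: float) -> float:
    """Normal density at x for population mean mu and population variance sigma2."""
    return exp(-((x - mu) ** 2) / (2.0 * sigma2)) / sqrt(2.0 * pi * sigma2)

# Hypothetical population: mouse weights with mean 25 g and variance 4 g^2.
mu, sigma2 = 25.0, 4.0
dist = NormalDist(mu, sqrt(sigma2))   # NormalDist takes the standard deviation

# The hand-written density agrees with the library's.
print(normal_pdf(26.0, mu, sigma2), dist.pdf(26.0))

# Fraction of the population within one standard deviation of the mean.
within_one_sd = dist.cdf(mu + sqrt(sigma2)) - dist.cdf(mu - sqrt(sigma2))
print(f"{within_one_sd:.3f}")   # 0.683
```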
Hypothesis testing
A genetic mapping problem
                    Mom's genotype    Dad's genotype
At SSR:             (/(               (/(
At disease locus:   e/+               e/+
Assume we know that Mom inherited both the ( allele of the SSR and the e mutation from her father, and likewise that Dad inherited ( and e from his father.
Suppose SSR and disease locus are unlinked (the null hypothesis). What is the probability that an epileptic (e/e) child has SSR genotype (/(?
Answer: 1/4
Now suppose that SSR and disease locus are genetically linked. What is the probability that an epileptic (e/e) child has SSR genotype (/(?
Answer: Something less than 1/4
The experiment
Look at the SSR genotype of 40 e/e kids.
If about 1/4 have the SSR genotype in question, the SSR is probably unlinked.
If the proportion with that genotype is much less than 1/4, the SSR is probably linked.
We're going to figure out how to make the decision in advance, before we see the results.
Expected results if unlinked
[Figure: Pr(x) versus x for the binomial distribution, N=40, p=0.25]
Binomial, N=40, p=0.25
 x        Pr(x)          Pr(X ≤ x)      Pr(X ≥ x)
 0        0.0000100566   0.0000100566   1
 1        0.0001340878   0.0001441444   0.9999899434
 2        0.0008715707   0.0010157151   0.9998558556
 3        0.0036799652   0.0046956803   0.9989842849
 4        0.0113465595   0.0160422398   0.9953043197
 5        0.0272317428   0.0432739826   0.9839577602
 6        0.0529506109   0.0962245935   0.9567260174
 7        0.0857295605   0.181954154    0.9037754065
 8        0.1178781457   0.2998322997   0.818045846
 9        0.139707432    0.4395397317   0.7001677003
10        0.1443643464   0.583904078    0.5604602683
11        0.1312403149   0.7151443929   0.416095922
12        0.1057213648   0.8208657577   0.2848556071
13        0.0759025183   0.8967682759   0.1791342423
14        0.048794476    0.945562752    0.1032317241
15        0.0281923639   0.9737551159   0.054437248
16        0.0146835229   0.9884386388   0.0262448841
17        0.0069098931   0.9953485319   0.0115613612
18        0.0029431026   0.9982916345   0.0046514681
19        0.0011359343   0.9994275689   0.0017083655
20        0.000397577    0.9998251459   0.0005724311
21        0.0001262149   0.9999513608   0.0001748541
22        0.0000363346   0.9999876954   0.0000486392
23        0.0000094786   0.999997174    0.0000123046
24        0.000002238    0.999999412    0.000002826
25        0.0000004774   0.9999998895   0.000000588
26        0.0000000918   0.9999999813   0.0000001105
27        0.0000000159   0.9999999972   0.0000000187
28        0.0000000025   0.9999999996   0.0000000028
29        0.0000000003   1              0.0000000004
30 to 40  0              1              0
[Figure: Tail probabilities. Pr(x), upper tail and lower tail versus x; binomial, N=40, p=0.25]
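The probabilities expected under the null hypothesis can be recomputed from the binomial formula with N=40 and p=0.25. The sketch below is one way to do it; the list names `pmf`, `lower` and `upper` are illustrative choices.

```python
from math import comb

# Null hypothesis: each of 40 e/e children independently has the
# SSR genotype in question with probability 1/4.
N, p = 40, 0.25

pmf = [comb(N, x) * p**x * (1 - p) ** (N - x) for x in range(N + 1)]
lower = [sum(pmf[: x + 1]) for x in range(N + 1)]   # Pr(X <= x)
upper = [sum(pmf[x:]) for x in range(N + 1)]        # Pr(X >= x)

# Print the first rows: x, Pr(x), Pr(X <= x), Pr(X >= x).
for x in range(16):
    print(f"{x:2d}  {pmf[x]:.10f}  {lower[x]:.10f}  {upper[x]:.10f}")
```

The most likely count is 10, as expected for 40 children at a probability of 1/4, and counts of 5 or fewer together account for only about 4% of the distribution.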
Is the SSR linked?
We want to know if the SSR is linked to the epilepsy gene.
What would your answer be if:
  10/40 kids had the SSR genotype in question?
  0/40 kids had the SSR genotype in question?
  5/40 kids had the SSR genotype in question?
Need a way to set the cut-off.
What's the probability of a type I error (α) if we cut off at 5?
[Figure: Probability of a type I error. Pr(x = x0) and the cumulative probability up to x0, plotted against the cut-off x0; binomial, N=40, p=0.25]
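The type I error for a given cut-off is the lower tail of the null distribution. The sketch below assumes the decision rule "call the SSR linked if at most 5 of the 40 children have the genotype"; whether the cut-off itself is included is an assumption, and the helper name `binomial_cdf` is illustrative.

```python
from math import comb

def binomial_cdf(k: int, n: int, p: float) -> float:
    """Pr(X <= k) for a binomial count with sample size n and probability p."""
    return sum(comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(k + 1))

N, p_null = 40, 0.25   # null hypothesis: SSR and disease locus unlinked
cutoff = 5             # assumed rule: declare linkage if the count is <= 5

# Type I error: chance of declaring linkage when the loci are really unlinked.
alpha = binomial_cdf(cutoff, N, p_null)
print(f"alpha = {alpha:.4f}")   # alpha = 0.0433

# Neighbouring cut-offs show how quickly alpha changes.
for k in (4, 5, 6):
    print(k, round(binomial_cdf(k, N, p_null), 4))
```

Moving the cut-off from 5 to 6 roughly doubles the type I error, so under this rule 5 is the largest cut-off that keeps α below 0.05.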