Page 1: Multiple testing correction

Multiple testing correction

Prof. William Stafford Noble
Department of Genome Sciences
Department of Computer Science and Engineering
University of Washington
[email protected]

Page 2: Multiple testing correction

Outline

• One-minute response
• Revision
• Multiple testing correction
– Motivation
– Bonferroni adjustment and the E-value
– False discovery rate

• Python

Page 3: Multiple testing correction

One-minute responses
• The class was too fast.
• I did not understand making the multiple alignment from pairwise alignments. Can we do more examples in class tomorrow?
• Python approach worked well.
• Python is hard for me to understand.
• We need extra Python tutorials.
• I did not understand some of the Python operations.
• Everything was clear, especially the Python code.
• Python part was not productive at all; I did not gain anything. Can we go back to the way we used to do the problems?

Page 4: Multiple testing correction

Summary

Building a PSSM involves 5 steps:
1. Count observations
2. Compute pseudocounts
3. Sum counts and pseudocounts
4. Normalize
5. Compute log-odds

Page 5: Multiple testing correction

1. Count observations

Observed residues (one alignment column, one residue per sequence): E Q R G K A F A

Observed counts:
A 2, C 0, D 0, E 1, F 1, G 1, H 0, I 0, K 1, L 0, M 0, P 0, Q 1, R 1, S 0, T 0, V 0, W 0, Y 0

Page 6: Multiple testing correction

2. Compute pseudocounts

• The user specifies a pseudocount weight β.

• β controls how much you trust the data versus your prior knowledge.

• In this case, let β = 2.

Background frequencies:
A 0.085, C 0.019, D 0.054, E 0.065, F 0.040, G 0.072, H 0.023, I 0.058, K 0.056, L 0.096, M 0.024, P 0.053, Q 0.042, R 0.054, S 0.072, T 0.063, V 0.073, W 0.016, Y 0.034

Pseudocounts (β × background frequency, with β = 2):
A 0.170, C 0.038, D 0.108, E 0.130, F 0.080, G 0.144, H 0.046, I 0.116, K 0.112, L 0.192, M 0.048, P 0.106, Q 0.084, R 0.108, S 0.144, T 0.126, V 0.146, W 0.032, Y 0.068

Page 7: Multiple testing correction

3. Sum counts and pseudocounts

Observed counts:
A 2, C 0, D 0, E 1, F 1, G 1, H 0, I 0, K 1, L 0, M 0, P 0, Q 1, R 1, S 0, T 0, V 0, W 0, Y 0

Pseudocounts:
A 0.170, C 0.038, D 0.108, E 0.130, F 0.080, G 0.144, H 0.046, I 0.116, K 0.112, L 0.192, M 0.048, P 0.106, Q 0.084, R 0.108, S 0.144, T 0.126, V 0.146, W 0.032, Y 0.068

Observed counts + pseudocounts:
A 2.170, C 0.038, D 0.108, E 1.130, F 1.080, G 1.144, H 0.046, I 0.116, K 1.112, L 0.192, M 0.048, P 0.106, Q 1.084, R 1.108, S 0.144, T 0.126, V 0.146, W 0.032, Y 0.068

Page 8: Multiple testing correction

4. Normalize

Summed counts (from step 3):
A 2.170, C 0.038, D 0.108, E 1.130, F 1.080, G 1.144, H 0.046, I 0.116, K 1.112, L 0.192, M 0.048, P 0.106, Q 1.084, R 1.108, S 0.144, T 0.126, V 0.146, W 0.032, Y 0.068

Divide each summed count by the total weight: 8 counts + 2 pseudocounts = 10.

Normalized frequencies:
A 0.2170, C 0.0038, D 0.0108, E 0.1130, F 0.1080, G 0.1144, H 0.0046, I 0.0116, K 0.1112, L 0.0192, M 0.0048, P 0.0106, Q 0.1084, R 0.1108, S 0.0144, T 0.0126, V 0.0146, W 0.0032, Y 0.0068

Page 9: Multiple testing correction

5. Compute log-odds

Foreground probabilities (from step 4):
A 0.2170, C 0.0038, D 0.0108, E 0.1130, F 0.1080, G 0.1144, H 0.0046, I 0.0116, K 0.1112, L 0.0192, M 0.0048, P 0.0106, Q 0.1084, R 0.1108, S 0.0144, T 0.0126, V 0.0146, W 0.0032, Y 0.0068

Background probabilities:
A 0.085, C 0.019, D 0.054, E 0.065, F 0.040, G 0.072, H 0.023, I 0.058, K 0.056, L 0.096, M 0.024, P 0.053, Q 0.042, R 0.054, S 0.072, T 0.063, V 0.073, W 0.016, Y 0.034

Log-odds scores, $M_A = \log_2 \frac{\Pr(A \mid \text{foreground})}{\Pr(A \mid \text{background})}$:
A 1.35, C -2.32, D -2.32, E 0.80, F 1.43, G 0.67, H -2.32, I -2.32, K 0.99, L -2.32, M -2.32, P -2.32, Q 1.37, R 1.04, S -2.32, T -2.32, V -2.32, W -2.32, Y -2.32

For example, $M_A = \log_2 \frac{0.2170}{0.085} = \log_2 2.553 = \frac{\log_{10} 2.553}{\log_{10} 2} = \frac{0.40705}{0.30103} = 1.35$.


Page 11: Multiple testing correction

Now you try …

1. Compute counts.
2. Add pseudocounts.
3. Normalize.
4. Compute log-odds.

Sequences (one per line):
ACAT
AGAT
ACAA
AAAT
CAAT

Background: A = 0.26, C = 0.28, G = 0.24, T = 0.22
Pseudocount weight: β = 1


Page 15: Multiple testing correction

Now you try … (solution)

1. Compute counts.
2. Add pseudocounts.
3. Normalize.
4. Compute log-odds.

Sequences:
ACAT
AGAT
ACAA
AAAT
CAAT

Background: A = 0.26, C = 0.28, G = 0.24, T = 0.22
Pseudocount weight: β = 1

Counts:
A 4 2 5 1
C 1 2 0 0
G 0 1 0 0
T 0 0 0 4

Counts + pseudocounts:
A 4.26 2.26 5.26 1.26
C 1.28 2.28 0.28 0.28
G 0.24 1.24 0.24 0.24
T 0.22 0.22 0.22 4.22

Frequencies (first column; each summed count divided by 5 counts + 1 pseudocount = 6):
A 0.71
C 0.21
G 0.04
T 0.04

Log-odds (first column):
A 1.44
C -0.42
G -2.58
T -2.46
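To make the recipe concrete, here is a minimal Python sketch of the same calculation on this exercise (the names sequences, background, and beta are ours; the slide's log-odds values differ slightly in the last digits because they were computed from frequencies already rounded to two decimals):

import math

sequences = ["ACAT", "AGAT", "ACAA", "AAAT", "CAAT"]
background = {"A": 0.26, "C": 0.28, "G": 0.24, "T": 0.22}
beta = 1.0  # pseudocount weight
alphabet = sorted(background)
n_cols = len(sequences[0])

# 1. Count observations in each column.
counts = [{a: sum(seq[i] == a for seq in sequences) for a in alphabet}
          for i in range(n_cols)]

# 2./3. Compute pseudocounts (beta * background) and add them to the counts.
summed = [{a: counts[i][a] + beta * background[a] for a in alphabet}
          for i in range(n_cols)]

# 4. Normalize by the total weight (5 counts + 1 pseudocount = 6).
total = len(sequences) + beta
freqs = [{a: summed[i][a] / total for a in alphabet} for i in range(n_cols)]

# 5. Log-odds against the background, in bits.
log_odds = [{a: math.log2(freqs[i][a] / background[a]) for a in alphabet}
            for i in range(n_cols)]

print({a: round(log_odds[0][a], 2) for a in alphabet})
# First column: {'A': 1.45, 'C': -0.39, 'G': -2.58, 'T': -2.58}
# (the slide's 1.44, -0.42, -2.46 come from rounding the frequencies first)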

Page 17: Multiple testing correction

Multiple testing

• Say that you perform a statistical test with a 0.05 threshold, but you repeat the test on twenty different observations.

• Assume that all of the observations are explainable by the null hypothesis.

• What is the chance that at least one of the observations will receive a p-value less than 0.05?

Page 18: Multiple testing correction

Multiple testing
• Say that you perform a statistical test with a 0.05 threshold, but you repeat the test on twenty different observations. Assuming that all of the observations are explainable by the null hypothesis, what is the chance that at least one of the observations will receive a p-value less than 0.05?

• Pr(making a mistake) = 0.05
• Pr(not making a mistake) = 0.95
• Pr(not making any mistake) = 0.95^20 = 0.358
• Pr(making at least one mistake) = 1 - 0.358 = 0.642

• There is a 64.2% chance of making at least one mistake.

Page 19: Multiple testing correction

Bonferroni correction

• Assume that individual tests are independent.
• Divide the desired p-value threshold by the number of tests performed.
• For the previous example, 0.05 / 20 = 0.0025.
• Pr(making a mistake) = 0.0025
• Pr(not making a mistake) = 0.9975
• Pr(not making any mistake) = 0.9975^20 = 0.9512
• Pr(making at least one mistake) = 1 - 0.9512 = 0.0488
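Both calculations are easy to verify in Python (a quick sketch, plain arithmetic only):

n_tests = 20
alpha = 0.05

# Family-wise error rate with no correction: 1 - 0.95^20.
print(1 - (1 - alpha) ** n_tests)        # 0.6415..., i.e. about 0.642

# With the Bonferroni-corrected per-test threshold alpha / n: 1 - 0.9975^20.
corrected = alpha / n_tests              # 0.0025
print(1 - (1 - corrected) ** n_tests)    # 0.0488...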

Page 20: Multiple testing correction

Sample problem #1

• You have used a local alignment algorithm to search a query sequence against a database containing 10,000 protein sequences.

• You estimate that the p-value of your top-scoring alignment is 2.1 × 10^-5.

• Is this alignment significant at a 95% confidence threshold?

• No, because the Bonferroni-corrected threshold is 0.05 / 10000 = 5 × 10^-6, and 2.1 × 10^-5 is larger than that.

Page 21: Multiple testing correction

Sample problem #2

• Say that you search the non-redundant protein database at NCBI, containing roughly one million sequences.

• You want to use a conservative confidence threshold of 0.001.

• What p-value threshold should you use?
• A Bonferroni correction would suggest using a p-value threshold of 0.001 / 1,000,000 = 0.000000001 = 10^-9.

Page 22: Multiple testing correction

E-values
• The p-value is the probability of observing a given score, assuming the data is generated according to the null hypothesis.

• The E-value is the expected number of times that the given score would appear in a random database of the given size.

• One simple way to compute the E-value is to multiply the p-value by the size of the database.

• Thus, for a p-value of 0.001 and a database of 1,000,000 sequences, the corresponding E-value is 0.001 × 1,000,000 = 1,000.

BLAST actually calculates E-values in a more complex way.
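As a sketch, the simple approximation described above is one multiplication in Python (BLAST's own E-value calculation, as noted, is more involved):

p_value = 0.001
database_size = 1_000_000

# Expected number of equally good matches in a random database of this size.
e_value = p_value * database_size
print(e_value)  # 1000.0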

Page 26: Multiple testing correction

False discovery rate: Motivation

• Scenario #1: You have used PSI-BLAST to identify a new protein homology, and you plan to publish a paper describing this result.

• Scenario #2: You have used PSI-BLAST to discover many potential homologs of a single query protein, and you plan to carry out a wet lab experiment to validate your findings. The experiment can be done in parallel on 96 proteins.

Page 27: Multiple testing correction

Types of errors
• False positive: the algorithm indicates that the sequences are homologs, but actually they are not.
• False negative: the sequences are homologs, but the algorithm indicates that they are not.

• Both types of errors are defined relative to some confidence threshold.

• Typically, researchers are more concerned about false positives.

Page 28: Multiple testing correction

False discovery rate
• The false discovery rate (FDR) is the percentage of target sequences above the threshold that are false positives.
• In the context of sequence database searching, the false discovery rate is the percentage of sequences above the threshold that are not homologous to the query.

Example (from the slide's figure, in which each point is either a homolog or a non-homolog of the query sequence): above the score threshold there are 5 FP and 13 TP; below it, 33 TN and 5 FN.

FDR = FP / (FP + TP) = 5/18 = 27.8%

Page 29: Multiple testing correction

Bonferroni vs. FDR

• Bonferroni controls the family-wise error rate; i.e., the probability of at least one false positive among the sequences that score better than the threshold.

• FDR controls the percentage of false positives among the target sequences that score better than the threshold.

Page 30: Multiple testing correction

Controlling the FDR

• Order the unadjusted p-values p1 ≤ p2 ≤ … ≤ pm.
• To control the FDR at level α, find the largest j, call it j*, for which pj ≤ (jα)/m.
• Reject the null hypothesis for j = 1, …, j*.

(Benjamini & Hochberg, 1995)

Page 31: Multiple testing correction

FDR example

• Choose the largest rank j such that the corresponding p-value is less than or equal to (jα)/m.

• Approximately 5% of the examples above the line are expected to be false positives.

Rank   (jα)/m    p-value
1      0.00005   0.0000008
2      0.00010   0.0000012
3      0.00015   0.0000013
4      0.00020   0.0000056
5      0.00025   0.0000078
6      0.00030   0.0000235
7      0.00035   0.0000945
8      0.00040   0.0002450
9      0.00045   0.0004700
10     0.00050   0.0008900
…
1000   0.05000   1.0000000
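A minimal Python sketch of the procedure, applied to the ten smallest p-values in this table with α = 0.05 and m = 1000 (the function name bh_cutoff_rank is ours):

def bh_cutoff_rank(pvalues, alpha, m=None):
    """Return j*, the largest 1-based rank j with p_j <= j*alpha/m (0 if none)."""
    pvalues = sorted(pvalues)
    if m is None:
        m = len(pvalues)
    j_star = 0
    for j, p in enumerate(pvalues, start=1):
        if p <= j * alpha / m:
            j_star = j
    return j_star

pvalues = [0.0000008, 0.0000012, 0.0000013, 0.0000056, 0.0000078,
           0.0000235, 0.0000945, 0.0002450, 0.0004700, 0.0008900]
print(bh_cutoff_rank(pvalues, alpha=0.05, m=1000))  # 8: reject ranks 1-8

Here ranks 1 through 8 satisfy the rule (rank 9's p-value, 0.0004700, exceeds its 0.00045 threshold), so the null hypothesis is rejected for the eight smallest p-values.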

Page 32: Multiple testing correction

Benjamini-Hochberg test

• Test of 100 uniformly distributed p-values (p-values from non-significant results)
• P-values as blue dots
• Significance threshold for FDR = 0.2 as red line

www.complextrait.org/Powerpoint/ctc2002/KenAffyQTL2002.ppt

Page 33: Multiple testing correction

Benjamini-Hochberg test

• Test of 10 low p-values (significant results) mixed with 90 p-values from non-significant results
• P-values as blue dots
• Significance threshold for FDR = 0.2 as red line
• Eleven cases declared significant

Page 34: Multiple testing correction

Summary
• Selecting a significance threshold requires evaluating the cost of making a mistake.
• Bonferroni correction divides the desired p-value threshold by the number of statistical tests performed.
• The E-value is the expected number of times that the given score would appear in a random database of the given size.
• The false discovery rate is the percentage of false positives among the target sequences that score better than the threshold.
• Use Bonferroni correction when you want to avoid making a single mistake; control the false discovery rate when you can tolerate a certain percentage of mistakes.

Page 35: Multiple testing correction

Sample problem #1

• Given:
– a confidence threshold, and
– a list of p-values
• Return:
– a set of p-values with the specified false discovery rate

./compute-fdr.py 0.05 pvalues.txt
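One possible shape for this script (a sketch, not the official solution; it assumes the input file holds one p-value per line and prints the p-values accepted at the given FDR):

#!/usr/bin/env python
import sys

def main():
    alpha = float(sys.argv[1])                     # e.g. 0.05
    with open(sys.argv[2]) as f:                   # e.g. pvalues.txt
        pvalues = sorted(float(line) for line in f)
    m = len(pvalues)
    # Benjamini-Hochberg: find the largest rank j with p_j <= j*alpha/m.
    j_star = 0
    for j, p in enumerate(pvalues, start=1):
        if p <= j * alpha / m:
            j_star = j
    for p in pvalues[:j_star]:                     # the significant p-values
        print(p)

if __name__ == "__main__":
    main()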

Page 36: Multiple testing correction

Sample problem #2

• Modify your program so that it will work with an arbitrarily large collection of p-values.

• You may assume that the p-values are given in sorted order.

• Read the file twice: once to find out how many p-values there are, and a second time to do the actual calculation.
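Under those assumptions, a sketch of the two-pass version; because every significant p-value has already streamed past by the time j* is known, this version reports the cutoff rank and p-value instead of echoing each accepted value (which would require buffering or a third pass):

import sys

def main():
    alpha = float(sys.argv[1])
    path = sys.argv[2]
    with open(path) as f:                  # pass 1: count the p-values
        m = sum(1 for _ in f)
    j_star, cutoff = 0, None
    with open(path) as f:                  # pass 2: apply the rule line by line
        for j, line in enumerate(f, start=1):
            p = float(line)                # input is sorted, so rank = line number
            if p <= j * alpha / m:
                j_star, cutoff = j, p
    print(j_star, cutoff)                  # rank and p-value at the FDR cutoff

if __name__ == "__main__":
    main()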

