Area Test for Observations Indexed by Time

L. B. Green, Middle Tennessee State University

E. M. Boczko, Vanderbilt University

Outline

Problem
The Null Hypothesis
The Statistic
Determining Significance
Comparison to Other Tests
Extending the Test

The Problem

Four observations of mouse RNA at each of 2, 3, 7, and 21 days after birth.

Test whether the regulation of fatty acid metabolism changes and, if so, when the change happens.

The Problem

Independent observations at each time, represented by:

$$t_i,\ i = 0, 1, \ldots, n; \qquad X_{i,j},\ j = 0, 1, \ldots, k_i$$

A value of zero represents “no change.” Positive values represent an increase, negative values represent a decrease.

The Problem

[Figure: plot over the time points $t_1$, $t_2$, $t_3$, $t_4$.]

The Null Hypothesis

There is no change at any time point.

$H_0$: the $X_{i,j}$ are identically distributed, with mean (or median) of zero.

The Null Hypothesis

If the null hypothesis is true, then the order of the observations is completely due to chance.

The Statistic

Create a piecewise linear function whose value at each time point is the mean (or median) of the observations at that time point.

Calculate the square of the L2 norm of this function.

The Statistic

[Figure: plot over the time points $t_1$, $t_2$, $t_3$, $t_4$.]

$$l := \|f\|_{L^2}^2 = \int_{t_0}^{t_n} f^2(t)\,dt$$
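Because $f$ is linear between consecutive time points, the integral can be evaluated in closed form. The derivation below covers one interval. It writes $m_i$ for the value of $f$ at $t_i$ and $s$ for the fraction of the interval covered; the symbol $s$ is introduced here for illustration only.

```latex
f(t) = m_i + (m_{i+1} - m_i)\,s, \qquad s = \frac{t - t_i}{t_{i+1} - t_i}, \qquad t \in [t_i, t_{i+1}]

\int_{t_i}^{t_{i+1}} f^2(t)\,dt
  = (t_{i+1} - t_i) \int_0^1 \bigl(m_i + (m_{i+1} - m_i)\,s\bigr)^2 \, ds
  = (t_{i+1} - t_i) \left[ m_i^2 + m_i (m_{i+1} - m_i) + \tfrac{1}{3} (m_{i+1} - m_i)^2 \right]
  = \frac{t_{i+1} - t_i}{3} \left( m_i^2 + m_i m_{i+1} + m_{i+1}^2 \right)
```

Summing this term over all intervals gives $l$ as a finite sum.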

The Statistic

$$l = \sum_{i=0}^{n-1} \left( m_i^2 + m_i m_{i+1} + m_{i+1}^2 \right) \frac{t_{i+1} - t_i}{3}$$

$$m_i = \frac{1}{k_i} \sum_{j=0}^{k_i} X_{i,j}$$

Note: The $m_i$ may be medians rather than means.
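The sum above is straightforward to compute. The following is a minimal sketch in plain Python, not the authors' implementation. The function name `area_statistic`, the `center` parameter, and the sample numbers are assumptions made for illustration.

```python
from statistics import mean, median


def area_statistic(times, groups, center=mean):
    """Squared L2 norm of the piecewise linear curve through the per-time centers.

    times  : increasing time points, one per group
    groups : groups[i] is the list of observations taken at times[i]
    center : summary used at each time point (mean by default, median also works)
    """
    if len(times) != len(groups):
        raise ValueError("need one group of observations per time point")
    m = [center(g) for g in groups]
    total = 0.0
    for i in range(len(times) - 1):
        dt = times[i + 1] - times[i]
        # Exact integral of the squared linear segment on [t_i, t_{i+1}].
        total += (m[i] ** 2 + m[i] * m[i + 1] + m[i + 1] ** 2) * dt / 3.0
    return total


# Illustrative numbers only (not the mouse RNA data).
times = [0, 3, 6, 10]
groups = [[0.0, 0.0], [1.0, 3.0], [2.0, 2.0], [-1.0, 1.0]]
print(area_statistic(times, groups))                 # means at each time point
print(area_statistic(times, groups, center=median))  # medians at each time point
```

Passing the summary function as a parameter lets the same code serve both the mean and the median version of the statistic.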

The Statistic

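$$A = \begin{pmatrix}
\frac{1}{n_1} & \frac{1}{n_1} & \frac{1}{n_1} & 0 & 0 & 0 & \cdots & 0 & 0 & 0 \\
0 & 0 & 0 & \frac{1}{n_2} & \frac{1}{n_2} & \frac{1}{n_2} & \cdots & 0 & 0 & 0 \\
\vdots & & & & & & \ddots & & & \vdots \\
0 & 0 & 0 & 0 & 0 & 0 & \cdots & \frac{1}{n_k} & \frac{1}{n_k} & \frac{1}{n_k}
\end{pmatrix}$$

$$L = \begin{pmatrix}
\frac{t_2 - t_1}{3} & \frac{t_2 - t_1}{6} & 0 & \cdots & 0 & 0 \\
\frac{t_2 - t_1}{6} & \frac{t_3 - t_1}{3} & \frac{t_3 - t_2}{6} & \cdots & 0 & 0 \\
0 & \frac{t_3 - t_2}{6} & \frac{t_4 - t_2}{3} & \cdots & 0 & 0 \\
\vdots & & & \ddots & & \vdots \\
0 & 0 & 0 & \cdots & \frac{t_n - t_{n-2}}{3} & \frac{t_n - t_{n-1}}{6} \\
0 & 0 & 0 & \cdots & \frac{t_n - t_{n-1}}{6} & \frac{t_n - t_{n-1}}{3}
\end{pmatrix}$$

$$l(x) = x^T A^T L A x$$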
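The matrix form can be checked numerically. In the sketch below, `x` stacks all observations in time order, `A` averages within each time point, and `L` is the tridiagonal matrix of interval lengths. The helper names (`averaging_matrix`, `gram_matrix`, `quadratic_form`) and the sample numbers are assumptions for illustration, and plain Python lists stand in for a linear algebra library.

```python
def averaging_matrix(counts):
    """A: row i holds 1/n_i in the columns of the n_i observations at time i."""
    total = sum(counts)
    rows, start = [], 0
    for n_i in counts:
        row = [0.0] * total
        for j in range(start, start + n_i):
            row[j] = 1.0 / n_i
        rows.append(row)
        start += n_i
    return rows


def gram_matrix(times):
    """L: tridiagonal, so that m^T L m integrates the squared interpolant."""
    k = len(times)
    L = [[0.0] * k for _ in range(k)]
    for i in range(k - 1):
        dt = times[i + 1] - times[i]
        L[i][i] += dt / 3.0
        L[i + 1][i + 1] += dt / 3.0
        L[i][i + 1] += dt / 6.0
        L[i + 1][i] += dt / 6.0
    return L


def matvec(M, v):
    """Matrix-vector product for list-of-rows matrices."""
    return [sum(a * b for a, b in zip(row, v)) for row in M]


def quadratic_form(times, counts, x):
    """l(x) = x^T A^T L A x, evaluated as (Ax)^T L (Ax)."""
    m = matvec(averaging_matrix(counts), x)  # per-time means
    Lm = matvec(gram_matrix(times), m)
    return sum(a * b for a, b in zip(m, Lm))


# Illustrative numbers only: two observations at each of four time points.
times = [0, 3, 6, 10]
counts = [2, 2, 2, 2]
x = [0.0, 0.0, 1.0, 3.0, 2.0, 2.0, -1.0, 1.0]
print(quadratic_form(times, counts, x))
```

Each interval contributes `dt / 3` to two diagonal entries and `dt / 6` to the two adjacent off-diagonal entries. That is why the interior diagonal entries span two intervals.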

Determining Significance

Bootstrap:

Sample from a distribution (constructed from the data) that does satisfy $H_0$.

Calculate new values of $l$ and compare to the original value.

If $H_0$ is true, the original value should be typical of the new values.

Calculate $\bar{X}$, the mean of all the data.
Calculate $Y_{i,j} = X_{i,j} - \bar{X}$.
Repeat $B$ times:
    Choose a new set of $X^*_{i,j}$ from $\{Y_{i,j}\}$, with replacement.
    Calculate the new value of the test statistic, $l^*$.
Calculate $p = \#\{l^* \ge l\} / B$.
Reject $H_0$ if $p < \alpha$.
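The bootstrap steps above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: the function names, the default `B`, the fixed seed, and the sample data are hypothetical. A resampled statistic counts as exceeding the original when it is greater than or equal to it.

```python
import random
from statistics import mean


def area_statistic(times, groups):
    """Squared L2 norm of the piecewise linear curve through the per-time means."""
    m = [mean(g) for g in groups]
    return sum(
        (m[i] ** 2 + m[i] * m[i + 1] + m[i + 1] ** 2) * (times[i + 1] - times[i]) / 3.0
        for i in range(len(times) - 1)
    )


def bootstrap_p_value(times, groups, B=2000, seed=0):
    """Fraction of resampled statistics that are at least as large as the observed one."""
    rng = random.Random(seed)
    observed = area_statistic(times, groups)
    pooled = [x for g in groups for x in g]
    grand_mean = mean(pooled)
    centered = [x - grand_mean for x in pooled]  # re-centered so the pool satisfies H0
    exceed = 0
    for _ in range(B):
        # Draw with replacement across the whole pool, keeping each group's size.
        resampled = [rng.choices(centered, k=len(g)) for g in groups]
        if area_statistic(times, resampled) >= observed:
            exceed += 1
    return exceed / B


# Illustrative data with a clear upward trend (not the mouse RNA data).
times = [0, 3, 6, 10]
groups = [
    [0.1, -0.1, 0.0, 0.05],
    [1.0, 1.2, 0.9, 1.1],
    [2.0, 2.1, 1.9, 2.2],
    [3.0, 3.1, 2.9, 3.2],
]
p = bootstrap_p_value(times, groups)
print("p-value:", p, "reject at 0.05:", p < 0.05)
```

Fixing the seed makes a run reproducible. In practice `B` trades run time against the resolution of the p-value, which can only take values that are multiples of `1 / B`.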

Determining Significance

Why sample from original data?

The empirical distribution is the closest distribution we have to the true distribution.

Why re-center the data?

We must ensure that the distribution we are sampling from satisfies $H_0$.

$$Y_{i,j} = X_{i,j} - \bar{X}$$


Reject if

$$p = \frac{\#\{l^* \ge l\}}{B} < \alpha$$

If the sample size is large, this p-value is uniformly distributed under $H_0$. So

$$P(p < \alpha) = \alpha$$

Determining Significance

If sample size is small:

$t = (0, 3, 6, 10)$

Four observations per time point.

Other Tests

Multiple t-tests

At each time point, perform a t-test to see if the mean is different from zero.

Combine these results using a Bonferroni correction.

Other Tests

Multiple t-tests

Do not deal with time explicitly.

Have very small samples at each time point.

Assume normality in the data.

Other Tests

ANOVA

Test for difference in means using one-way ANOVA.

Doesn’t explicitly deal with time.

Null hypothesis is that means are the same, not that they are equal to zero.

Assumes normality.

Other Tests

Area test is more powerful than multiple t-tests or ANOVA when applied to simulated data sets.

Simulated using data from distributions with means that increase linearly over time. In this case, power depends on slope of the line.
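A power simulation of this kind might look like the sketch below. The slides do not state the noise distribution, sample sizes, or number of replications, so everything here is an assumption: unit-variance normal noise, four observations per time point, the time grid, `reps`, `B`, and the function names. The small defaults keep the run short and give only a rough estimate.

```python
import random
from statistics import mean


def area_statistic(times, groups):
    """Squared L2 norm of the piecewise linear curve through the per-time means."""
    m = [mean(g) for g in groups]
    return sum(
        (m[i] ** 2 + m[i] * m[i + 1] + m[i + 1] ** 2) * (times[i + 1] - times[i]) / 3.0
        for i in range(len(times) - 1)
    )


def bootstrap_p_value(times, groups, B, rng):
    """Bootstrap p-value using re-centered, pooled resampling."""
    observed = area_statistic(times, groups)
    pooled = [x for g in groups for x in g]
    grand_mean = mean(pooled)
    centered = [x - grand_mean for x in pooled]
    exceed = 0
    for _ in range(B):
        resampled = [rng.choices(centered, k=len(g)) for g in groups]
        if area_statistic(times, resampled) >= observed:
            exceed += 1
    return exceed / B


def estimate_power(slope, times=(0, 3, 6, 10), per_time=4, alpha=0.05,
                   reps=20, B=200, seed=0):
    """Fraction of simulated data sets, with mean slope * t, that are rejected."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        # Assumed model: normal noise with unit variance around a linear mean.
        groups = [[rng.gauss(slope * t, 1.0) for _ in range(per_time)] for t in times]
        if bootstrap_p_value(times, groups, B, rng) < alpha:
            rejections += 1
    return rejections / reps


for slope in (0.0, 0.1, 1.0):
    print("slope", slope, "estimated power", estimate_power(slope))
```

A slope of zero corresponds to the null hypothesis, so its rejection rate estimates the size of the test rather than its power.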

Extending the Test

Use median instead of mean at each time point.

Allows test to be used in cases where the existence of the mean is in doubt.

Extending the Test

Two data sets.

Test to see whether both sets of data come from the same distribution, and there is no change in distribution over time.

$$l := \|f - g\|_{L^2}^2$$

Extending the Test

[Figure: plot over the time points $t_1$, $t_2$, $t_3$, $t_4$.]

Extending the Test

Two data sets. Distribution may change over time.

For example: Comparison to a control data set.

Resample within time points rather than across whole set.

$$l := \|f - g\|_{L^2}^2$$
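One way to resample within time points is sketched below. The slides do not spell out the scheme. Pooling the two data sets at each time point and redrawing both groups from that pool is an assumption made here, as are the function names and the sample data. Since $f - g$ is itself piecewise linear, the same interval formula applies to the differences of the per-time means.

```python
import random
from statistics import mean


def area_between(times, groups_f, groups_g):
    """Squared L2 norm of f - g, where f and g interpolate the per-time means."""
    d = [mean(a) - mean(b) for a, b in zip(groups_f, groups_g)]
    return sum(
        (d[i] ** 2 + d[i] * d[i + 1] + d[i + 1] ** 2) * (times[i + 1] - times[i]) / 3.0
        for i in range(len(times) - 1)
    )


def two_sample_p_value(times, groups_f, groups_g, B=2000, seed=0):
    """Bootstrap p-value, resampling within each time point only."""
    rng = random.Random(seed)
    observed = area_between(times, groups_f, groups_g)
    exceed = 0
    for _ in range(B):
        new_f, new_g = [], []
        for a, b in zip(groups_f, groups_g):
            pool = list(a) + list(b)  # assumed scheme: pool both sets at this time
            new_f.append(rng.choices(pool, k=len(a)))
            new_g.append(rng.choices(pool, k=len(b)))
        if area_between(times, new_f, new_g) >= observed:
            exceed += 1
    return exceed / B


# Illustrative data: the control drifts over time, the treated set is shifted by 2.
times = [2, 3, 7, 21]
control = [
    [0.0, 0.2, -0.1, 0.1],
    [1.0, 1.1, 0.9, 1.2],
    [2.0, 2.2, 1.9, 2.1],
    [3.0, 3.1, 2.9, 3.2],
]
treated = [[x + 2.0 for x in g] for g in control]
print("p-value:", two_sample_p_value(times, treated, control))
```

Because each pool is formed at a single time point, a trend shared by both data sets does not inflate the resampled statistic. Only differences between the two sets count as evidence against the null hypothesis.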

Extending the Test

[Figure: plot over the time points $t_1$, $t_2$, $t_3$, $t_4$.]

Thank You
