Lab 5: Hypothesis Testing and Confidence Intervals


Outline

One sample t-test

Two sample t-test

Paired t-test

One-sample t-test

The hypotheses:

One sided: H0 : μ ≤ μ0 against H1 : μ > μ0 (or H0 : μ ≥ μ0 against H1 : μ < μ0)

Two sided: H0 : μ = μ0 against H1 : μ ≠ μ0

Test statistic
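A minimal sketch of computing the one-sample test statistic by hand, t = (sample mean − μ0) / (s / √n) with n − 1 degrees of freedom, and checking it against t.test(). The vector x and the value mu0 are made-up illustration values, not the biotest.txt data.

```r
# Hypothetical sample; NOT the biotest.txt data
x   <- c(112, 121, 108, 117, 125, 110, 119, 114, 122, 109)
mu0 <- 115                                     # hypothesised mean (illustrative)

n      <- length(x)
t_stat <- (mean(x) - mu0) / (sd(x) / sqrt(n))  # test statistic
df     <- n - 1                                # degrees of freedom

# t.test() reports the same statistic and degrees of freedom
res <- t.test(x, mu = mu0)
all.equal(unname(res$statistic), t_stat)
unname(res$parameter) == df
```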

One sample t-test

Conclusion

Compare the test statistic with the critical value

… Compare the p-value with the level of significance α (e.g. 0.05, 0.1)

Reject H0 if p-value < α (enough evidence)

Cannot reject H0 if p-value > α (not enough evidence)
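The two ways of reaching a conclusion can be sketched side by side for a two-sided test. The data, mu0 and alpha below are illustrative assumptions; qt() and pt() are the quantile and distribution functions of the t distribution in base R.

```r
# Hypothetical sample; NOT the biotest.txt data
x     <- c(112, 121, 108, 117, 125, 110, 119, 114, 122, 109)
mu0   <- 115
alpha <- 0.05

n      <- length(x)
t_stat <- (mean(x) - mu0) / (sd(x) / sqrt(n))

# Approach 1: compare |t| with the two-sided critical value
t_crit         <- qt(1 - alpha / 2, df = n - 1)
reject_by_crit <- abs(t_stat) > t_crit

# Approach 2: compare the two-sided p-value with alpha
p_value     <- 2 * pt(-abs(t_stat), df = n - 1)
reject_by_p <- p_value < alpha

# The two approaches lead to the same decision
reject_by_crit == reject_by_p
```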

Example

Download the biotest.txt data file

Read it into R using the function read.table()

Extract the 1st column and store it as ‘X1’

Store the 2nd column as ‘X2’

Example

> X1 = read.table("biotest.txt")[, 1]

> X2 = read.table("biotest.txt")[, 2]
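An alternative sketch that reads the file only once and then extracts both columns. To keep the example runnable anywhere, it writes a small stand-in file with made-up numbers; the layout (two whitespace-separated columns, no header) is an assumption that matches the defaults of read.table().

```r
# Write a small stand-in file (made-up numbers), so the example is self-contained
path <- tempfile(fileext = ".txt")
writeLines(c("112 118", "121 124", "108 115"), path)

# Read the file once, then pull out the columns
dat <- read.table(path)   # defaults: no header, whitespace-separated
X1  <- dat[, 1]           # first column as a vector
X2  <- dat[, 2]           # second column as a vector

str(dat)                  # check dimensions and column types
```

With the real data, path would simply be "biotest.txt" in the working directory.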

Example

Take ‘X1’ as the sample in this case.

Test H0 : μ = 115 against H1 : μ ≠ 115

at significance level α = 0.05

[R] command

t.test()

Syntax: t.test(x = data, alternative = "less" / "greater" / "two.sided", mu = μ0)

Example 1

> t.test(X1, alternative = "two.sided", mu = 115)

        One Sample t-test

data:  X1
t = 0.1841, df = 9, p-value = 0.858
alternative hypothesis: true mean is not equal to 115
95 percent confidence interval:
 108.2257 122.9743
sample estimates:
mean of x
    115.6

Example 1

The p-value (0.858) is larger than 0.05

Cannot reject H0 at the 0.05 level of significance

Equivalently, μ0 = 115 lies inside the 95% CI (108.2257, 122.9743)
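The link between the two-sided test and the confidence interval can be checked directly from the object returned by t.test(). This is a sketch on made-up data (not the biotest.txt values); conf.int and p.value are components of the returned object.

```r
# Hypothetical sample; NOT the biotest.txt data
x   <- c(112, 121, 108, 117, 125, 110, 119, 114, 122, 109)
mu0 <- 115

res <- t.test(x, alternative = "two.sided", mu = mu0, conf.level = 0.95)
ci  <- res$conf.int                      # lower and upper limits

inside_ci    <- ci[1] <= mu0 && mu0 <= ci[2]
not_rejected <- res$p.value > 0.05

# mu0 inside the 95% CI goes together with p-value > 0.05
inside_ci == not_rejected
```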

Example 2

Test H0 : μ ≤ 108 against H1 : μ > 108

at significance level α = 0.05

Example 2

> t.test(X1, alternative = "greater", mu = 108)

        One Sample t-test

data:  X1
t = 2.3314, df = 9, p-value = 0.02232
alternative hypothesis: true mean is greater than 108
95 percent confidence interval:
 109.6243      Inf
sample estimates:
mean of x
    115.6

Example 2

The p-value (0.02232) is smaller than 0.05

Reject H0 at the 0.05 level of significance

Conclude that the population mean is significantly greater than 108
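For a one-sided test with alternative "greater", the p-value is the upper-tail probability of the t distribution beyond the observed statistic. A minimal sketch on made-up data (not the biotest.txt values):

```r
# Hypothetical sample; NOT the biotest.txt data
x   <- c(112, 121, 108, 117, 125, 110, 119, 114, 122, 109)
mu0 <- 108

n      <- length(x)
t_stat <- (mean(x) - mu0) / (sd(x) / sqrt(n))

# Upper-tail probability P(T > t_stat) with n - 1 degrees of freedom
p_greater <- pt(t_stat, df = n - 1, lower.tail = FALSE)

# Agrees with the p-value reported by t.test()
all.equal(p_greater, t.test(x, alternative = "greater", mu = mu0)$p.value)
```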

Example 2

Statistical significance is not the same as practical significance

Confidence Interval

By default, the function t.test() includes a 95% confidence interval

Question: Can we change the confidence level?

Confidence Interval

e.g. want a 99% confidence interval

> t.test(X1, alternative = "greater", mu = 108, conf.level = 0.99)
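A short sketch of how conf.level changes the interval, again on made-up data. Note that with alternative = "greater" the interval is one-sided, so its upper limit is Inf.

```r
# Hypothetical sample; NOT the biotest.txt data
x <- c(112, 121, 108, 117, 125, 110, 119, 114, 122, 109)

# Two-sided intervals at two confidence levels
ci95 <- t.test(x, mu = 108, conf.level = 0.95)$conf.int
ci99 <- t.test(x, mu = 108, conf.level = 0.99)$conf.int

diff(ci95)                  # width of the 95% interval
diff(ci99)                  # width of the 99% interval (wider)
attr(ci99, "conf.level")    # the level is stored with the interval

# One-sided interval: lower limit only, upper limit is Inf
one_sided <- t.test(x, alternative = "greater", mu = 108,
                    conf.level = 0.99)$conf.int
one_sided
```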

Two-sample t-test

Comparing the population means of two independent samples

Two-sided: H0 : μ1 = μ2 against H1 : μ1 ≠ μ2

One-sided: H0 : μ1 ≤ μ2 against H1 : μ1 > μ2 (or H0 : μ1 ≥ μ2 against H1 : μ1 < μ2)

Example 3

Consider the two samples X1 and X2

Want to test whether there is a significant difference between the mean of X1 and the mean of X2.

Example 3

Two-sided test: H0 : μ1 = μ2 against H1 : μ1 ≠ μ2

at 0.05 level of significance

Assuming equal variance

Example 3

> t.test(X1, X2, alternative = "two.sided", var.equal = TRUE)

        Two Sample t-test

data:  X1 and X2
t = -0.9052, df = 18, p-value = 0.3773
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -15.940831   6.340831
sample estimates:
mean of x mean of y
    115.6     120.4


Example 3

Not assuming equal variance?

> t.test(X1, X2, alternative = "two.sided", var.equal = FALSE)
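A sketch comparing the two versions on made-up data. With var.equal = TRUE the test pools the variances and uses n1 + n2 − 2 degrees of freedom; with var.equal = FALSE (the default of t.test()) R runs the Welch test, whose approximate degrees of freedom are never larger than n1 + n2 − 2.

```r
# Hypothetical samples; NOT the biotest.txt data
x <- c(112, 121, 108, 117, 125, 110, 119, 114, 122, 109)
y <- c(118, 124, 115, 120, 131, 113, 126, 117, 128, 112)

pooled <- t.test(x, y, alternative = "two.sided", var.equal = TRUE)
welch  <- t.test(x, y, alternative = "two.sided", var.equal = FALSE)

pooled$parameter   # df = n1 + n2 - 2
welch$parameter    # Welch-Satterthwaite df, usually not an integer

pooled$method      # which test was run
welch$method

c(pooled = pooled$p.value, welch = welch$p.value)
```

When the two sample variances are similar, the two p-values are usually close; the Welch version is the safer choice when equal variances cannot be assumed.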

Paired t-test

A two-sample problem, but the samples are no longer independent

Example:

Measurements taken twice, at different time points, from the same group of subjects

Blood pressure before and after some treatment

Want to test the difference of the means

Paired t-test

Take the difference of the two measurements for each subject

This reduces to a one-sample problem

The rest is the same as a one-sample t-test

Example 4

Consider again the data X1 and X2, and assume they are paired observations

Test the equality of the means,

i.e. test if the difference in means = 0

H0 : μ1 = μ2 against H1 : μ1 ≠ μ2

at 0.05 level of significance

Example 4

> t.test(X1, X2, alternative = "two.sided", paired = TRUE)

        Paired t-test

data:  X1 and X2
t = -3.3247, df = 9, p-value = 0.008874
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -8.066013 -1.533987
sample estimates:
mean of the differences
                   -4.8


Alternatively…

> t.test(X1 - X2, alternative = "two.sided")

        One Sample t-test

data:  X1 - X2
t = -3.3247, df = 9, p-value = 0.008874
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
 -8.066013 -1.533987
sample estimates:
mean of x
     -4.8


Exactly the same result as the paired t-test!

Final Remarks

Notice that the conclusions from the two-sample t-test and the paired t-test are different, even though we are looking at the same data set.

Always check whether the two samples are independent or not

Using the wrong test leads either to a loss of sensitivity or to an invalid analysis.
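The loss of sensitivity can be illustrated with a small made-up paired data set in which subjects differ a lot from each other but every subject changes by a similar small amount. The names before and after and all the numbers are hypothetical.

```r
# Hypothetical paired measurements on 8 subjects (made-up numbers)
before <- c(100, 110, 120, 130, 140, 150, 160, 170)
after  <- before + c(3, 5, 4, 6, 2, 5, 4, 3)   # small, consistent increase

# Correct analysis: paired t-test on the within-subject differences
p_paired <- t.test(before, after, paired = TRUE)$p.value

# Wrong analysis: treating the two columns as independent samples
p_two <- t.test(before, after, var.equal = TRUE)$p.value

c(paired = p_paired, two_sample = p_two)

# The paired test detects the shift; the two-sample test does not,
# because the large between-subject spread swamps the small difference
p_paired < 0.05
p_two > 0.05
```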
