
Comparisons Between Two Populations

Date post: 04-Jun-2018
Upload: renaisans

Transcript
  • 8/13/2019 Comparisons Between Two Populations

    1/38

    Comparisons Between Two Populations

    Statistics and Probability

    Semester 1, 2013

    Ira M. Anjasmara

    Department of Geomatics Engineering


    Introduction

    Previously, we have covered applications to samples drawn from one population:

    testing means through the z-test ($n \ge 30$) or the t-test ($n < 30$)


    Comparing Means - Large Samples

    The large-sample case occurs when both samples have $n \ge 30$.

    Suppose we have two normally distributed populations with different means and variances: $\mu_1, \sigma_1^2$ and $\mu_2, \sigma_2^2$. The difference between the two sample means is then also normally distributed, centred on the difference in the population means, $\mu_1 - \mu_2$.

    The sampling distribution of interest is that of $\bar{x}_1 - \bar{x}_2$.

    Samples are taken from each population, with $\bar{x}_1, s_1^2$ and $\bar{x}_2, s_2^2$, and both $n_1$ and $n_2 \ge 30$.

    The mean of the difference distribution is:

    $$E[\bar{x}_1 - \bar{x}_2] = \mu_1 - \mu_2$$

    The standard error of the mean of the difference distribution is then:

    $$\sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$


    Hypothesis testing

    The procedure for hypothesis testing of two population means follows the same 8-step procedure as for a normal distribution, except we use the following test statistic:

    $$z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sigma_{\bar{x}_1 - \bar{x}_2}}$$

    Most often, $H_0$ will assume that $\mu_1 = \mu_2$, while $H_a$ will test for differences. Hence, the above test statistic reduces to:

    $$z = \frac{\bar{x}_1 - \bar{x}_2}{\sigma_{\bar{x}_1 - \bar{x}_2}}$$


    Example

    Step 1

    Formulate alternative hypothesis: $H_a: \mu_1 \ne \mu_2$

    i.e., test whether the two theodolites are different.

    Formulate null hypothesis: $H_0: \mu_1 = \mu_2$

    i.e., assume that they give identical readings.

    Step 2 - Determine number of tails

    This is a 2-tailed test, because the null hypothesis has an equality.

    Step 3 - Determine level of significance.

    We're told that the significance level is $\alpha = 0.05$.


    Example

    Step 4 - Determine the critical value of z

    We have a 2-tailed test, so we need to find $z_{\alpha/2} = z_{0.025}$.

    From the standard normal distribution table, we have: $z_{0.025} = z(0.5 - 0.025) = z(0.475) = 1.96$

    Step 5 - Determine the rejection region

    The null hypothesis will be rejected if $\mu_1 \ne \mu_2$, so we have the following situation:

    Since we are testing $\mu_1 \ne \mu_2$, we are at both sides of the normal curve, therefore the rejection regions are $z < -1.96$ and $z > 1.96$.
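The table lookup in Step 4 can also be checked numerically. The following is a minimal sketch using only Python's standard library; the helper name `z_critical` is an illustrative assumption, not something defined in the slides.

```python
from statistics import NormalDist


def z_critical(alpha: float, tails: int = 2) -> float:
    """Upper critical value of the standard normal for a 1- or 2-tailed test."""
    if tails not in (1, 2):
        raise ValueError("tails must be 1 or 2")
    # The upper critical value leaves alpha/tails in the upper tail,
    # so we invert the CDF at 1 - alpha/tails.
    return NormalDist().inv_cdf(1.0 - alpha / tails)


z_crit = z_critical(0.05, tails=2)
print(f"reject H0 if z < {-z_crit:.2f} or z > {z_crit:.2f}")
```

For a 2-tailed test the significance level is split equally between the two tails, which is why the lookup uses 0.025 rather than 0.05.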


    Example

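    Step 6 - Determine the test statistic (z-score) from the sample data:

    $$z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sigma_{\bar{x}_1 - \bar{x}_2}} = \frac{(\bar{x}_1 - \bar{x}_2) - 0}{\sigma_{\bar{x}_1 - \bar{x}_2}} = \frac{16.10 - 15.99}{\sqrt{\frac{0.15^2}{40} + \frac{0.2^2}{40}}} = 2.78$$

    Step 7 - Compare the test statistic against its critical value: $2.78 > 1.96$, so the test statistic falls in the upper rejection region.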
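The arithmetic of Step 6 can be reproduced with a short sketch. The function name `two_sample_z` and its argument order are assumptions made for illustration; the numbers are the ones that appear in the worked formula, with the sample standard deviations standing in for the population values.

```python
from math import sqrt


def two_sample_z(mean1, s1, n1, mean2, s2, n2, delta0=0.0):
    """z statistic for the difference of two means (large samples)."""
    # Standard error of the difference between the two sample means.
    se = sqrt(s1**2 / n1 + s2**2 / n2)
    # delta0 is the difference in population means assumed under H0.
    return ((mean1 - mean2) - delta0) / se


z = two_sample_z(16.10, 0.15, 40, 15.99, 0.20, 40)
print(f"z = {z:.2f}")
```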


    Confidence intervals

    For the distribution of the difference between two populations, the $100(1 - \alpha)\%$ confidence interval is given by:

    $$\mathrm{CI} = (\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\,\sigma_{\bar{x}_1 - \bar{x}_2}$$

    Remember, this shows us that we are $100(1 - \alpha)\%$ confident that the difference between the means lies in the range specified by the CI.

    Notice that with this approach, we don't need to know the values of $\mu$, and we can approximate $\sigma$ by $s$ if necessary. For the given data in the above example:

    $$\mathrm{CI} = (16.10 - 15.99) \pm 1.96\left(\frac{0.15^2}{40} + \frac{0.2^2}{40}\right)^{1/2} = 0.110 \pm 0.077$$

    Therefore, we reject $H_0$ at this level, because $H_0$ says that $\mu_1 - \mu_2 = 0$, whereas we have found that 0 does not lie in the CI range.

    Note that you can only use confidence interval estimation as a replacement for hypothesis testing when you have a 2-tailed test.
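As a rough check of the interval above, the sketch below computes both end points with the standard library. The function name `diff_confidence_interval` is hypothetical, and the sample standard deviations are again used in place of the population values.

```python
from math import sqrt
from statistics import NormalDist


def diff_confidence_interval(mean1, s1, n1, mean2, s2, n2, alpha=0.05):
    """Two-sided confidence interval for the difference of two means."""
    se = sqrt(s1**2 / n1 + s2**2 / n2)
    # z_{alpha/2}: the value leaving alpha/2 in the upper tail.
    half_width = NormalDist().inv_cdf(1.0 - alpha / 2.0) * se
    diff = mean1 - mean2
    return diff - half_width, diff + half_width


low, high = diff_confidence_interval(16.10, 0.15, 40, 15.99, 0.20, 40)
print(f"CI from {low:.3f} to {high:.3f}")
print("0 inside CI:", low <= 0.0 <= high)
```

If 0 lies outside the interval, the 2-tailed test at the same significance level rejects the null hypothesis of equal means.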


    Comparing Means - Small Samples

    Ifn1


    Unequal population variances

    Sometimes the small samples will be drawn from two populations that have different (but unknown) variances, for example:

    comparing instruments from two different manufacturers;

    different operators using the same instruments (though depends on competency).

    In this case we are not allowed to form a pooled variance like we do when the population variances are equal. So, we have to compute the standard error of the mean of the difference distribution through:

    $$s_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$


    Unequal population variances

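    However, we now must use the following formula to calculate the total number of degrees of freedom:

    $$\nu = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{1}{\nu_1}\left(\frac{s_1^2}{n_1}\right)^2 + \frac{1}{\nu_2}\left(\frac{s_2^2}{n_2}\right)^2}$$

    where $\nu_1 = n_1 - 1$ and $\nu_2 = n_2 - 1$, instead of $\nu = \nu_1 + \nu_2$, when determining the critical value of t.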
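The degrees-of-freedom formula is easy to get wrong by hand, so a small sketch may help. The function name `unequal_variance_dof` is an assumption, and the inputs are borrowed from the EDM example that appears later in these slides purely as illustrative numbers; the slides themselves treat that example with equal variances.

```python
def unequal_variance_dof(s1, n1, s2, n2):
    """Effective degrees of freedom when the two variances are not pooled."""
    v1 = s1**2 / n1  # squared standard error contributed by sample 1
    v2 = s2**2 / n2  # squared standard error contributed by sample 2
    nu1, nu2 = n1 - 1, n2 - 1
    return (v1 + v2) ** 2 / (v1**2 / nu1 + v2**2 / nu2)


# Illustrative inputs only: s1 = 0.04 (n1 = 10), s2 = 0.09 (n2 = 32).
nu = unequal_variance_dof(0.04, 10, 0.09, 32)
print(f"effective degrees of freedom: {nu:.2f}")
```

The result is generally not an integer and never exceeds $\nu_1 + \nu_2$; when reading a printed t table it is common to round down to the nearest tabulated value.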


    Hypothesis testing

    When doing hypothesis testing on small samples drawn from two populations, use the following test statistic:

    $$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_{\bar{x}_1 - \bar{x}_2}}$$

    where $s_{\bar{x}_1 - \bar{x}_2}$ is determined through the previous methods, depending on whether the two populations have equal or unequal variances.


    Example of Equal Variances

    The same distance was measured by two EDMs (from the same manufacturer): EDM 1 recorded a mean distance of 100.20 m with $s_1$ = 0.04 m from 10 measurements; EDM 2 recorded a mean distance of 99.94 m with $s_2$ = 0.09 m from 32 measurements. You suspect that EDM 1 has a systematic error of at least 20 cm (i.e., is reading longer by 20 cm). Test this hypothesis at 0.01 significance.

    Step 1

    Formulate alternative hypothesis: $H_a: \mu_1 - \mu_2 > 0.2$

    i.e., test whether EDM 1 has a systematic error of +20 cm.

    Formulate null hypothesis: $H_0: \mu_1 - \mu_2 \le 0.2$

    i.e., assume that EDM 1 reads no more than 20 cm longer than EDM 2.


    Example of Equal Variances

    Step 2 - Determine number of tails

    This is a 1-tailed test, because the null hypothesis has an inequality.

    Step 3 - Determine level of significance and degree of freedom.

    We're told that the significance level is $\alpha = 0.01$.

    Because we have equal population variances, we can use $\nu = \nu_1 + \nu_2 = 9 + 31 = 40$.

    Step 4 - Determine the critical value of t

    We have a 1-tailed test, so we need to find $t_{\nu,\alpha} = t_{40,0.01}$.

    From the t distribution table, we have: $t_{40,0.01} = 2.423$


    Example of Equal Variances

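    Step 6 - Determine the test statistic (t-score) from the sample data:

    First, determine the pooled variance:

    $$s_p^2 = \frac{\nu_1 s_1^2 + \nu_2 s_2^2}{\nu_1 + \nu_2} = \frac{9 \times 0.04^2 + 31 \times 0.09^2}{9 + 31} = 0.00664$$

    Then determine the standard error of the mean:

    $$s_{\bar{x}_1 - \bar{x}_2} = \sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = \sqrt{0.00664\left(\frac{1}{10} + \frac{1}{32}\right)} = 0.0295$$

    Finally, determine the test statistic:

    $$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_{\bar{x}_1 - \bar{x}_2}} = \frac{(100.20 - 99.94) - 0.2}{0.0295} = 2.033$$

    [Note that $\mu_1 - \mu_2 = 0.2$ here.]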
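The three-stage calculation in Step 6 can be sketched as one function. The name `pooled_t` and the returned tuple are illustrative assumptions; the critical value 2.423 is simply the table value quoted in Step 4.

```python
from math import sqrt


def pooled_t(mean1, s1, n1, mean2, s2, n2, delta0=0.0):
    """Pooled variance, standard error and t statistic for two small samples."""
    nu1, nu2 = n1 - 1, n2 - 1
    # Pooled variance: variances weighted by their degrees of freedom.
    sp2 = (nu1 * s1**2 + nu2 * s2**2) / (nu1 + nu2)
    se = sqrt(sp2 * (1.0 / n1 + 1.0 / n2))
    t = ((mean1 - mean2) - delta0) / se
    return sp2, se, t


sp2, se, t = pooled_t(100.20, 0.04, 10, 99.94, 0.09, 32, delta0=0.2)
T_CRIT = 2.423  # t_{40,0.01} as read from the t table in Step 4
print(f"sp2 = {sp2:.5f}, se = {se:.4f}, t = {t:.3f}")
print("t in rejection region:", t > T_CRIT)
```

Passing the hypothesised difference as `delta0` keeps the same function usable for the more common case where the null hypothesis assumes equal means (`delta0=0`).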


    Comparing Variances - F Distribution

    Sometimes we may need to compare the precision resulting from two experiments:

    precision is measured by the standard deviation;

    in fact, as with the $\chi^2$ test, we compare variances.

    If random samples of size $n_1$ and $n_2$ are selected from two normally distributed populations with equal variance, then the ratio:

    $$F = \frac{s_1^2}{s_2^2}$$

    has an F distribution with $\nu_1$ degrees of freedom in the numerator and $\nu_2$ degrees of freedom in the denominator.


    Comparing Variances - F Distribution

    Each specific F distribution depends upon which sample is selected for the numerator of the F-ratio, and which for the denominator; i.e., there is a unique F distribution for every possible combination of values of $\nu_1$ and $\nu_2$.

    The probability density function for the F distribution is:

    $$f(x, \nu_1, \nu_2) = \frac{\Gamma\left(\frac{\nu_1 + \nu_2}{2}\right)}{\Gamma\left(\frac{\nu_1}{2}\right)\Gamma\left(\frac{\nu_2}{2}\right)} \left(\frac{\nu_1 x}{\nu_1 x + \nu_2}\right)^{\nu_1/2} \left(1 - \frac{\nu_1 x}{\nu_1 x + \nu_2}\right)^{\nu_2/2} \frac{1}{x}, \qquad x > 0$$

    where $\Gamma$ is the gamma function (see standard maths texts).

    Different tables are given for different values of $\alpha$. Each table gives a value of F corresponding to the area in the upper tail ($\alpha$), for the degrees of freedom $\nu_N$ in the numerator, and $\nu_D$ in the denominator. The tables for the F distribution look something like the following:
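For readers who prefer code to tables, the density can be evaluated directly. This is a sketch of the standard form of the F density using `math.gamma`; the function name `f_pdf` is an assumption, and for large degrees of freedom a log-gamma formulation would be numerically safer.

```python
from math import gamma


def f_pdf(x, nu1, nu2):
    """Density of the F distribution with nu1 (numerator) and nu2 (denominator) dof."""
    if x <= 0:
        return 0.0  # the density is only defined for x > 0
    u = nu1 * x / (nu1 * x + nu2)
    norm = gamma((nu1 + nu2) / 2) / (gamma(nu1 / 2) * gamma(nu2 / 2))
    return norm * u ** (nu1 / 2) * (1 - u) ** (nu2 / 2) / x


# Swapping the degrees of freedom gives a different curve.
print(f_pdf(1.0, 5, 9), f_pdf(1.0, 9, 5))
```

With $\nu_1 = \nu_2 = 2$ the expression collapses to $1/(1 + x)^2$, which makes a convenient hand check.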


    Table of F distribution

    The numbers in the first column give the degrees of freedom in the denominator; the numbers in the first row give the degrees of freedom in the numerator.

    The numbers in the main body of the table give the F-score corresponding to those particular values of $\alpha$, $\nu_N$ and $\nu_D$, i.e., $F_{\nu_N,\nu_D,\alpha}$.


    F Distribution

    The tables only give the area in the upper tail. If we want to find the F-score corresponding to a small area in the lower tail, we use the important relationship:

    $$F_{\nu_1,\nu_2,1-\alpha} = \frac{1}{F_{\nu_2,\nu_1,\alpha}}$$

    Notice that the numbers of degrees of freedom in the numerator and denominator are interchanged. So:

    $$F_{0.95} = \frac{1}{F_{0.05}} \qquad F_{0.975} = \frac{1}{F_{0.025}} \qquad F_{0.99} = \frac{1}{F_{0.01}}$$

    for any $\nu_1, \nu_2$


    Example

    Calculate $F_{4,20,0.975}$.

    From the previous equation, we see that:

    $$F_{4,20,0.975} = \frac{1}{F_{20,4,0.025}} = \frac{1}{8.56} = 0.117$$


    Hypothesis testing

    The procedure for the hypothesis testing of variances follows the same 8-step procedure as for means testing with a normal distribution, except we use the test statistic:

    $$F = \frac{s_1^2}{s_2^2}$$

    For a 1-tailed test, we always phrase the alternative hypothesis like:

    $$H_a: \sigma^2_{\text{larger}} > \sigma^2_{\text{smaller}}$$

    Furthermore, the observation with the largest variance goes into the numerator, so that

    $$F > 1$$

    This puts the rejection region in the upper tail, so we only ever need to use the upper tail F values.

    For a 2-tailed test, it doesn't matter which way the alternative hypothesis is phrased:

    $H_a: \sigma^2_{\text{larger}} \ne \sigma^2_{\text{smaller}}$ or $H_a: \sigma^2_{\text{smaller}} \ne \sigma^2_{\text{larger}}$

    as long as the observation with the largest variance goes into the numerator.

    As there are two tails, we need to find $F_{\nu_1,\nu_2,\alpha/2}$ and $F_{\nu_1,\nu_2,1-\alpha/2} = 1/F_{\nu_2,\nu_1,\alpha/2}$.


    Example 1

    Which of these two sets of measurements, A or B, is the most precise, at the 0.05 level of significance: $s_A$ = 5.83 from 31 measurements, or $s_B$ = 4.12 from 21 measurements?

    Step 1

    Formulate alternative hypothesis: $H_a: \sigma_A^2 > \sigma_B^2$

    i.e., put the larger variance as population 1;

    or, set A has the larger variability, so is less precise.

    Formulate null hypothesis: $H_0: \sigma_A^2 \le \sigma_B^2$

    i.e., the opposite.

    Step 2 - Determine number of tails

    This is a 1-tailed test, because the null hypothesis has an inequality.


    Example 1

    Step 3 - Determine level of significance and degree of freedom.

    We're told that the significance level is $\alpha = 0.05$.

    $n_A = 31$, $\nu_A = 31 - 1 = 30$ (numerator, because A has the largest variance)

    $n_B = 21$, $\nu_B = 21 - 1 = 20$ (denominator).

    Step 4 - Determine the critical value of F

    We have a 1-tailed test, so we need to find $F_{\nu_A,\nu_B,\alpha} = F_{30,20,0.05} = 2.04$


    Example 1

    Step 5 - Determine the rejection region

    The null hypothesis will be rejected if $\sigma_A^2 > \sigma_B^2$, so we have the following situation:

    Since we are testing $\sigma_A^2 > \sigma_B^2$, we are in the upper tail of the F curve, therefore the rejection region is $F > 2.04$.


    Example 1


    Step 6 - Determine the test statistic (F-score) from the sample data:

    $$F = \frac{s_A^2}{s_B^2} = \frac{5.83^2}{4.12^2} = 2.002$$

    Step 7 - Compare the test statistic against its critical value: $2.002 < 2.04$, so the test statistic does not fall in the rejection region.
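The 1-tailed comparison for this example can be sketched in a few lines. The function name `variance_ratio` is hypothetical, and the critical value 2.04 is the tabulated $F_{30,20,0.05}$ from Step 4 rather than something the code derives.

```python
def variance_ratio(s_larger, s_smaller):
    """F ratio with the larger sample variance in the numerator."""
    if s_larger < s_smaller:
        raise ValueError("put the larger standard deviation first so that F >= 1")
    return s_larger**2 / s_smaller**2


F = variance_ratio(5.83, 4.12)
F_CRIT = 2.04  # F_{30,20,0.05} as read from the table in Step 4
print(f"F = {F:.3f}")
print("F in rejection region:", F > F_CRIT)
```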


    Example 2


    Step 3 - Determine level of significance and degree of freedom.

    We're told that the significance level is $\alpha = 0.05$.

    $n_A = 10$, $\nu_A = 10 - 1 = 9$ (denominator)

    $n_B = 6$, $\nu_B = 6 - 1 = 5$ (numerator, because B has the largest variance).

    Step 4 - Determine the critical value of F

    We have a 2-tailed test, so we need to find

    $$F_{\nu_B,\nu_A,\alpha/2} = F_{5,9,0.025} = 4.48$$

    $$F_{\nu_B,\nu_A,1-\alpha/2} = F_{5,9,0.975} = \frac{1}{F_{9,5,0.025}} = \frac{1}{6.68} = 0.150$$


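    Step 6 - Determine the test statistic (F-score) from the sample data:

    $$F = \frac{s_B^2}{s_A^2} = \frac{0.5^2}{0.42^2} = 1.42$$

    Step 7 - Compare the test statistic against its critical values: $0.150 < 1.42 < 4.48$, so the test statistic lies outside both rejection regions.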
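For the 2-tailed case, the lower critical value comes from the reciprocal relationship with the degrees of freedom swapped. A minimal sketch, assuming a hypothetical helper `two_tailed_f_decision` and taking both upper-tail table values (4.48 and 6.68) from Step 4:

```python
def two_tailed_f_decision(f_stat, upper_crit, upper_crit_swapped):
    """Return (lower critical value, reject?) for a 2-tailed F test.

    upper_crit         : F_{nu1,nu2,alpha/2} from the table
    upper_crit_swapped : F_{nu2,nu1,alpha/2}, i.e. degrees of freedom interchanged
    """
    lower_crit = 1.0 / upper_crit_swapped
    reject = f_stat < lower_crit or f_stat > upper_crit
    return lower_crit, reject


F = 0.5**2 / 0.42**2  # larger variance (sample B) in the numerator
lower, reject = two_tailed_f_decision(F, upper_crit=4.48, upper_crit_swapped=6.68)
print(f"F = {F:.2f}, lower critical value = {lower:.3f}")
print("reject H0:", reject)
```

Because the larger variance is always placed in the numerator, F is at least 1 and only the upper critical value can ever be exceeded; the lower value is computed here for completeness.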


    As for the t and $\chi^2$ distributions, determining P-values for the F distribution requires the use of a computer program.

    On the internet, such a program can be found at:

    davidmlane.com/hyperstat/F_table.html

    Microsoft Excel has the function FDIST to work out P-values for the F distribution, where:

    $$p(F > F_0) = \mathrm{FDIST}(F_0, \nu_N, \nu_D)$$

    for some numerical value $F_0$


    Example


    Calculate the P-value for the following data: $s_A$ = 0.42 from 10 measurements, and $s_B$ = 0.5 from 6 measurements. That is, how likely is an F-ratio at least this large if the two precisions are in fact equal?

    $\nu_N = \nu_B = 5$

    $\nu_D = \nu_A = 9$

    $$F = \frac{s_B^2}{s_A^2} = \frac{0.5^2}{0.42^2} = 1.42$$

    Using Excel (or the website shown above), we find:

    $$p(F \ge 1.42) = \mathrm{FDIST}(1.42, 5, 9) = 0.305$$
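Without Excel, the same upper-tail area can be approximated by integrating the F density numerically. The sketch below uses composite Simpson's rule from the standard library only; the function names, the number of intervals and the restriction to more than 2 numerator degrees of freedom are all assumptions of this sketch, not part of the slides, and a statistics library would normally be preferred.

```python
from math import gamma


def f_pdf(x, nu1, nu2):
    """Density of the F distribution (standard form)."""
    if x <= 0:
        return 0.0
    u = nu1 * x / (nu1 * x + nu2)
    norm = gamma((nu1 + nu2) / 2) / (gamma(nu1 / 2) * gamma(nu2 / 2))
    return norm * u ** (nu1 / 2) * (1 - u) ** (nu2 / 2) / x


def f_upper_tail(f0, nu1, nu2, intervals=2000):
    """Approximate p(F > f0) as 1 minus a Simpson's-rule integral over [0, f0]."""
    if nu1 <= 2:
        # For nu1 <= 2 the density does not vanish at 0; not handled here.
        raise ValueError("this sketch assumes nu1 > 2")
    if intervals % 2:
        intervals += 1  # Simpson's rule needs an even number of intervals
    h = f0 / intervals
    total = f_pdf(0.0, nu1, nu2) + f_pdf(f0, nu1, nu2)
    for i in range(1, intervals):
        total += (4 if i % 2 else 2) * f_pdf(i * h, nu1, nu2)
    return 1.0 - total * h / 3.0


p = f_upper_tail(1.42, 5, 9)
print(f"p(F > 1.42) is approximately {p:.3f}")
```

The result should agree closely with the FDIST value quoted above; small differences in the last digit can arise from rounding F to 1.42 before the lookup.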
