
CHAPTER 9

The Single-Sample t Test and the Paired-Samples t Test

The t Distributions
  Estimating Population Standard Deviation from the Sample
  Calculating Standard Error for the t Statistic
  Using Standard Error to Calculate the t Statistic
The Single-Sample t Test
  The t Table and Degrees of Freedom
  The Six Steps of the Single-Sample t Test
  Calculating a Confidence Interval for a Single-Sample t Test
  Calculating Effect Size for a Single-Sample t Test
The Paired-Samples t Test
  Distributions of Mean Differences
  The Six Steps of the Paired-Samples t Test
  Calculating a Confidence Interval for a Paired-Samples t Test
  Calculating Effect Size for a Paired-Samples t Test

BEFORE YOU GO ON
• You should know the six steps of hypothesis testing (Chapter 7).
• You should know how to determine a confidence interval for a z statistic (Chapter 8).
• You should understand the concept of effect size and know how to calculate Cohen's d for a z test (Chapter 8).


Upon arriving at college, many undergraduate students are faced with what might seem like incessant warnings about weight gain, including the dreaded "freshman 15," the waistline expansions resulting from spring break excess, and the pounds put on over the annual winter holidays. In North America, as in many other parts of the world, the winter holiday season is a time when family food traditions take center stage. These holiday foods usually are readily available, beautifully presented, and high in calories. Popular wisdom suggests that many Americans add 5 to 7 pounds to their body weight over the holiday season. But before/after studies suggest a far more modest increase: a weight gain of just over 1 pound (Hull, Radley, Dinger, & Fields, 2006; Roberts & Mayer, 2000; Yanovski et al., 2000). A 1-pound weight gain over the holidays might not seem so bad, but weight gained over the holidays tends to stay (Yanovski et al., 2000).

The data provide other insights about holiday weight gain. For example, female students at the University of Oklahoma gained a little less than 1 pound, male students a little more than 1 pound, and students who were already overweight gained an average of 2.2 pounds (Hull et al., 2006). The fact that researchers used two groups in their study—students before the holidays and students after the holidays—is significant for this chapter.

The versatility of the t distributions allows us to compare two groups. We can compare one sample to a population when we don't know all of the details about the parameters, and we can compare two samples to each other. There are two ways to compare two samples: we can use a within-groups design (as when the same people are weighed before and after the holidays) or a between-groups design (as when different people are in the pre-holiday sample and the post-holiday sample). Whether we use a within-groups design or a between-groups design to collect the data for two groups, we use a t test. For a within-groups design, we use a paired-samples t test. Because the steps for a paired-samples t test are similar to those for a single-sample t test, we learn about these two tests in this chapter. For a between-groups design, we use an independent-samples t test. Because the calculations for an independent-samples t test are a bit different from the first two types of t tests, we learn about that test in Chapter 10.

The t Distributions
When we compare the average weight of a sample of people before the holidays to the average weight of a sample of people after the holidays, we are concerned about whether the samples are fair representations of the larger populations. The t test, based on the t distributions, tells us how confident we can be that what we have learned from our samples generalizes to the larger populations.


Holiday Weight Gain and Two-Group Studies Two-group studies indicate that the average holiday weight gain by college students is less than many people believe, only about 1 pound.


MASTERING THE CONCEPT 9-1: There are three types of t tests. We use a single-sample t test when we are comparing a sample mean to a population mean but do not know the population standard deviation. We use a paired-samples t test when we are comparing two samples and every participant is in both samples—a within-groups design. We use an independent-samples t test when we are comparing two samples and every participant is in only one sample—a between-groups design.


The t distributions are used when we don't have enough information to use the z distribution. Specifically, we have to use a t distribution when we don't know the population standard deviation or when we compare two samples to one another. As Figure 9-1 demonstrates, there are many t distributions—one for each possible sample size. As the sample size gets smaller, we become less certain about what the population distribution really looks like, and the t distributions become flatter and more spread out. However, as the sample size gets bigger, the t distributions begin to merge with the z distribution because we gain confidence as more and more participants are added to our study.

Estimating Population Standard Deviation from the Sample
Before we can conduct a t test, we have to estimate the standard deviation. To do this, we use the standard deviation of the sample data to estimate the standard deviation of the entire population. Estimating the standard deviation is the only practical difference between conducting a z test with the z distribution and conducting a t test with a t distribution. Here is the standard deviation formula that we have used up until now with a sample:

SD = √(Σ(X − M)²/N)

We need to make a correction to this formula to account for the fact that there is likely to be some level of error when we're estimating the population standard deviation from a sample. Specifically, any given sample is likely to have somewhat less spread than is the entire population. One tiny alteration of this formula leads to the slightly larger standard deviation of the population that we estimate from the standard deviation of the sample. Instead of dividing by N, we divide by (N − 1) to get the mean of the squared deviations. Subtraction is the key. Dividing by a slightly smaller number, (N − 1), instead of by N increases the value of the standard deviation. For example, if the numerator was 90 and the denominator (N) was 10, the answer is 9; if we divide by (N − 1) = (10 − 1) = 9, the answer is 10, a slightly larger value. So the formula for estimating the standard deviation of the population from the standard deviation of the sample is:

s = √(Σ(X − M)²/(N − 1))


MASTERING THE CONCEPT 9-2: We use a t distribution instead of a z distribution when sampling requires us to estimate the population standard deviation from the sample standard deviation or when we compare two samples to one another.

FIGURE 9-1 The Wider and Flatter t Distributions
The figure compares the standard, normal z distribution with t distributions for samples of 30, 8, and 2 individuals. For smaller samples, the t distributions are wider and flatter than the z distribution. As the sample size increases, however, the t distributions approach the shape of the z distribution. In this figure, the t distribution most similar to the z distribution is that for a sample of approximately 30 individuals. This makes sense because a distribution derived from a larger sample size would be more likely to be similar to that of the entire population than one derived from a smaller sample size.



Notice that we call this standard deviation s instead of SD. It still uses Latin rather than Greek letters because it is a statistic (from a sample) rather than a parameter (from a population). From now on, we will calculate the standard deviation in this way (because we will be using the sample standard deviation to estimate the population standard deviation), and we will be calling our standard deviation s.

Let's apply the new formula for standard deviation to an everyday situation that many of us can relate to: multitasking. This formula marks an important step in conducting a t test. Researchers conducted a study in which employees were observed at one of two high-tech companies for over 1,000 hours (Mark, Gonzalez, & Harris, 2005). The employees spent just 11 minutes, on average, on one project before an interruption. Moreover, after each interruption, they needed an average of 25 minutes to get back to the original project! So even though a person who is busy multitasking appears to be productive, maybe the underlying reality is that multitasking actually reduces overall productivity. How can we use a t test to determine the effects of multitasking on productivity?

Suppose you were a manager at one of these firms and decided to reserve a period from 1:00 to 3:00 each afternoon during which employees could not interrupt one another, but they might still be interrupted by phone calls or e-mails from people outside the company. To test your intervention, you observe five employees during these periods and develop a score for each—the time he or she spent on a selected task before being interrupted. Here are your fictional data: 8, 12, 16, 12, and 14 minutes. In this case, we are treating 11 minutes as the population mean, but we do not know the population standard deviation. As a key step in conducting a t test, we need to estimate the standard deviation of the population from the sample.

Multitasking If multitasking reduces productivity in a sample, we can statistically determine the likelihood that multitasking reduces productivity among a much larger population.

EXAMPLE 9.1
To calculate our estimated standard deviation for the population, there are two steps.

STEP 1. Calculate the sample mean.
Even though we are given a population mean (i.e., 11), we use the sample mean to calculate the corrected standard deviation for the sample. The mean for these 5 sample scores is:

M = (8 + 12 + 16 + 12 + 14)/5 = 12.4

STEP 2. Use this sample mean in the corrected formula for the standard deviation.

s = √(Σ(X − M)²/(N − 1))

Remember, the easiest way to calculate the numerator under the square root sign is by first organizing our data into columns, as shown here:


X      X − M    (X − M)²
8      −4.4     19.36
12     −0.4     0.16
16      3.6     12.96
12     −0.4     0.16
14      1.6     2.56

Thus, the numerator is:

Σ(X − M)² = 19.36 + 0.16 + 12.96 + 0.16 + 2.56 = 35.2

And given a sample size of 5, the corrected standard deviation is:

s = √(Σ(X − M)²/(N − 1)) = √(35.2/(5 − 1)) = √8.8 = 2.97

Calculating Standard Error for the t Statistic
After we make the correction, we have an estimate of the standard deviation of the distribution of scores but not an estimate of the spread of a distribution of means, the standard error. As we did with the z distribution, we need to make our spread smaller to reflect the fact that a distribution of means is less variable than a distribution of scores. We do this in exactly the same way that we adjusted for the z distribution. We divide s by √N. The formula for the standard error as estimated from a sample, therefore, is:

sM = s/√N
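As a quick software check on this arithmetic, here is a minimal Python sketch (the variable names are our own, not the book's) that reproduces the sample mean, the corrected standard deviation, and the standard error for the five fictional interruption times.

    import math

    times = [8, 12, 16, 12, 14]            # minutes until interruption (fictional data)
    N = len(times)

    M = sum(times) / N                     # sample mean: 12.4
    ss = sum((x - M) ** 2 for x in times)  # sum of squared deviations: 35.2
    s = math.sqrt(ss / (N - 1))            # corrected standard deviation: about 2.97
    s_M = s / math.sqrt(N)                 # estimated standard error: about 1.33

    print(M, round(s, 2), round(s_M, 2))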

A Simple Correction: N − 1 When estimating variability, subtracting one person from a sample of four makes a big difference. Subtracting one person from a sample of thousands makes only a small difference.



Notice that we have replaced σ with s because we are using the corrected standard deviation from the sample rather than the actual standard deviation from the population.

EXAMPLE 9.2
Here's how we would convert our corrected standard deviation of 2.97 (from the data above on minutes before an interruption) to a standard error. Our sample size was 5, so we divide by the square root of 5:

sM = s/√N = 2.97/√5 = 1.33

So the appropriate standard deviation for the distribution of means—that is, its standard error—is 1.33. Just as the central limit theorem predicts, the standard error for the distribution of sample means is smaller than the standard deviation of sample scores (1.33 < 2.97).

(Note: This step leads to one of the most common mistakes that we see among our students. Because we have implemented a correction when calculating s, students want to implement an extra correction here by dividing by √(N − 1). Do not do this! We still divide by √N in this step. We are making our standard deviation smaller to reflect the size of the sample; there is no need for a further correction to the standard error.)

Using Standard Error to Calculate the t Statistic
Once we know how to estimate the population standard deviation from our sample and then use that to calculate standard error, we have all the tools necessary to conduct a t test. The simplest type of t test is the single-sample t test. We introduce the formula for that t statistic here, and in the next section we go through all six steps for a single-sample t test. The formula to calculate the t statistic for a single-sample t test is identical to that for the z statistic, except that it uses the estimated standard error rather than the actual standard error of the population of means. So, the t statistic indicates the distance of a sample mean from a population mean in terms of the standard error. That distance is expressed numerically as the estimated number of standard errors between the two means. Here is the formula for the t statistic for a distribution of means:

t = (M − μM)/sM

Note that the denominator is the only difference between this formula for the t statistic and the formula used to compute the z statistic for a sample mean. The corrected denominator makes the t statistic smaller and thereby reduces the probability of observing an extreme t statistic. That is, a t statistic is not as extreme as a z statistic; in scientific terms, it's more conservative.

MASTERING THE FORMULA 9-3: The formula for the t statistic is: t = (M − μM)/sM. It only differs from the formula for the z statistic in that we use sM instead of σM, because we're using the sample to estimate standard error rather than using the actual population standard error.

EXAMPLE 9.3
The t statistic for our sample of 5 scores representing minutes until interruptions is:

t = (M − μM)/sM = (12.4 − 11)/1.33 = 1.05

As part of the six steps of hypothesis testing, this t statistic, 1.05, can help us make an inference about whether the communication ban from 1:00 to 3:00 affected the average number of minutes until an interruption.
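Carrying the same fictional numbers forward, a short Python sketch of the t statistic calculation might look like this (the values are taken from the hand calculations above):

    mu = 11      # population mean: 11 minutes of work before an interruption
    M = 12.4     # sample mean of the five fictional scores
    s_M = 1.33   # estimated standard error from the earlier calculation

    t = (M - mu) / s_M
    print(round(t, 2))   # about 1.05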


As with the z distribution, statisticians have developed t tables that include probabilities under any area of the t curve. We provide you with a t table for many different sample sizes in Appendix B. The t table includes only the percentages of most interest to researchers—those indicating the extreme scores that suggest large differences between groups.

The Single-Sample t Test
A before/after comparison of weight change over the holidays is only one of the interesting comparisons we can make using the t statistic. There might also be regional differences in how much people weigh. For example, the t statistic can be used to compare the average weight from a sample of people in a particular region with the national average. To answer that kind of question, we now demonstrate how to conduct a single-sample t test that uses a distribution of means.

A single-sample t test is a hypothesis test in which we compare data from one sample to a population for which we know the mean but not the standard deviation.

CHECK YOUR LEARNING
Reviewing the Concepts
> The t distributions are used when we do not know the population standard deviation or are comparing only two groups.
> The two groups may be a sample and a population, or the two groups may be two samples as part of a within-groups design or a between-groups design.
> Because we do not know the population standard deviation, we must estimate it, and estimating invites the possibility of more error.
> The formula for the t statistic for a single-sample t test is the same as the formula for the z statistic for a distribution of means, except that we use estimated standard error in the denominator rather than the actual standard error for the population.

Clarifying the Concepts
9-1 What is the t statistic?
9-2 Briefly describe the three different t tests.

Calculating the Statistics
9-3 Calculate the standard deviation for a sample (SD) and as an estimate of the population (s) using the following data: 6, 3, 7, 6, 4, 5.
9-4 Calculate the standard error for t for the data given in Check Your Learning 9-3.

Applying the Statistics
9-5 In our discussion of a study on multitasking (Mark et al., 2005), we imagined a follow-up study in which five employees were observed following a communication ban from 1:00 to 3:00. For each of the five employees, one task was selected. Let's now examine the time until work on that task was resumed. The fictional data for the 5 employees were 20, 19, 27, 24, and 18 minutes until work on the given task was resumed. Remember that the original research showed it took 25 minutes on average for an employee to return to a task after being interrupted.
a. What distribution will be used in this situation? Explain your answer.
b. Determine the appropriate mean and standard deviation (or standard error) for this distribution. Show all your work; use symbolic notation and formulas where appropriate.
c. Calculate the t statistic.

Solutions to these Check Your Learning questions can be found in Appendix D.

The t statistic indicates the distance of a sample mean from a population mean in terms of the standard error.

A single-sample t test is a hypothesis test in which we compare data from one sample to a population for which we know the mean but not the standard deviation.


The only thing we need to know to use a single-sample t test is the population mean. We begin with the single-sample t test because understanding it will help us when using the more sophisticated t tests that let us compare two samples.

The t Table and Degrees of Freedom
When we use the t distributions, we use the t table. There are different t distributions for every sample size, so we must take sample size into account when using the t table. However, we do not look up our actual sample size on the table. Rather, we look up degrees of freedom, the number of scores that are free to vary when estimating a population parameter from a sample. The phrase free to vary refers to the number of scores that can take on different values if we know a given parameter.

MASTERING THE FORMULA 9-4: The formula for degrees of freedom for a single-sample t test is: df = N − 1. To calculate degrees of freedom, we subtract 1 from the sample size.

EXAMPLE 9.4
For example, the manager of a baseball team needs to assign nine players to particular spots in the batting order but only has to make eight decisions (N − 1). Why? Because only one option remains after making the first eight decisions. So before the manager makes any decisions, there are N − 1, or 9 − 1 = 8, degrees of freedom. After the first decision, there are N − 1, or 8 − 1 = 7, degrees of freedom, and so on.

As in the baseball example, there is always one score that cannot vary once all of the others have been determined. For example, if we know the mean of four scores is 6 and we know that three of the scores are 2, 4, and 8, then the last score must be 10. So the degrees of freedom is the number of scores in the sample minus 1; there is always one score that cannot vary. Degrees of freedom is written in symbolic notation as df, which is always italicized. The formula for degrees of freedom for a single-sample t test, therefore, is:

df = N − 1

Degrees of freedom is the number of scores that are free to vary when estimating a population parameter from a sample.

TABLE 9-1. Excerpt from the t Table
When conducting hypothesis testing, we use the t table to determine critical values for a given p level, based on the degrees of freedom and whether the test is one- or two-tailed.

         One-Tailed Tests            Two-Tailed Tests
df    0.10     0.05     0.01      0.10     0.05      0.01
1     3.078    6.314    31.821    6.314    12.706    63.657
2     1.886    2.920    6.965     2.920    4.303     9.925
3     1.638    2.353    4.541     2.353    3.182     5.841
4     1.533    2.132    3.747     2.132    2.776     4.604
5     1.476    2.015    3.365     2.015    2.571     4.032

This is one key piece of information to keep in mind as we work with the t table. In the behavioral sciences, the degrees of freedom usually correspond to how many people are in the study or how many observations we make. Table 9-1 is an excerpt from a t table, but an expanded table is included in Appendix B. Consider the relation between degrees of freedom and the cutoff, or critical value, needed to declare statistical significance. In the column corresponding to a one-tailed test at a p level of 0.05 with only 1 degree of freedom, the critical t value is 6.314.


With only 1 degree of freedom, the two means have to be extremely far apart and the standard deviation has to be very small in order to declare that a statistically significant difference exists. But with 2 degrees of freedom (two observations), the critical t value drops to 2.920. With 2 degrees of freedom, the two means don't have to be quite so far apart or the standard deviation so small. That is, it is easier to reach the critical value of 2.920 needed to declare that there is a statistically significant difference. We're more confident with two observations than with just one.

Now notice what happens when we increase the number of observations once again from two observations to three observations. The critical t value needed to declare statistical significance once again decreases, from 2.920 to 2.353. Our level of confidence in our observation increases with each additional observation; at the same time, the critical value decreases, becoming closer and closer to the related cutoff on the z distribution.

The t distributions become closer to the z distribution as sample size increases. When the sample size is large enough, the standard deviation of a sample is more likely to be equal to the standard deviation of the population. At large enough sample sizes, in fact, the t distribution is identical to the z distribution. Most t tables include a sample size of infinity (∞) to indicate a very large sample size (a sample size of infinity itself is, of course, impossible). The t statistics at extreme percentages for very large sample sizes are identical to the z statistics at the very same percentages. Check it out for yourself by comparing the z and t tables in Appendix B. For example, the z statistic for the 95th percentile—a percentage between the mean and the z statistic of 45%—is between 1.64 and 1.65; at a sample size of infinity, the t statistic for the 95th percentile is 1.645.

Let's remind ourselves why the t statistic merges with the z statistic as sample size increases. The underlying principle is easy to understand: more observations lead to greater confidence. Thus, more participants in a study—if they are a representative sample—correspond to increased confidence that we are making an accurate observation. So don't think of the t distributions as completely separate from the z distribution. Rather, think of the z statistic as a single-blade Swiss Army knife and the t statistic as a multiblade Swiss Army knife that still includes the single blade that is the z statistic.

Let's determine the cutoffs, or critical t value(s), for two research studies. For the first study, you may use the excerpt in Table 9-1. The second study requires the full t table in Appendix B.

EXAMPLE 9.5
The study: A researcher collects Stroop reaction times for five participants who have had reduced sleep for three nights. She wants to compare this sample to the known population mean. Her research hypothesis is that the lack of sleep will slow participants down, leading to an increased reaction time. She will use a p level of 0.05 to determine her critical value.

The cutoff(s): This is a one-tailed test because the research hypothesis posits a change in only one direction—an increase in reaction time. There will be only a positive critical t value because we are hypothesizing an increase. There are five participants, so the degrees of freedom is:

df = N − 1 = 5 − 1 = 4

Her stated p level is 0.05. When we look in the t table under one-tailed tests, in the column labeled 0.05 and in the row for a df of 4, we see a critical value of 2.132. This is our critical t value.


MASTERING THE CONCEPT 9-3: As sample size increases, the t distributions more and more closely approximate the z distribution. You can think of the z statistic as a single-blade Swiss Army knife and the t statistic as a multiblade Swiss Army knife that includes the single blade that is the z statistic.


EXAMPLE 9.6
The study: A researcher knows the mean number of calories a rat will consume in half an hour if unlimited food is available. He wonders whether a new food will lead rats to consume a different number of calories—either more or fewer. He studies 38 rats and uses a conservative critical value based on a p level of 0.01.

The cutoff(s): This is a two-tailed test because the research hypothesis allows for change in either direction. There will be both negative and positive critical t values. There are 38 rats, so the degrees of freedom is:

df = N − 1 = 38 − 1 = 37

His stated p level is 0.01. We want to look in the t table under two-tailed tests, in the column for 0.01 and in the row for a df of 37; however, there is no df of 37. In this case, we err on the side of being more conservative and choose the more extreme (i.e., larger) of the two possible critical t values, which always corresponds to the smaller df. Here, we look next to 35, where we see a value of 2.724. Because this is a two-tailed test, we will have critical values of −2.724 and 2.724. Be sure to list both values.
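Printed t tables only list selected p levels and degrees of freedom. When software is available, an inverse CDF (percent point) function returns cutoffs directly; the sketch below assumes SciPy is installed and notes where its exact values differ slightly from the excerpted table.

    from scipy import stats

    # Example 9.5: one-tailed test, p level 0.05, df = 4.
    # The positive critical value sits at the 95th percentile of t with 4 df.
    print(stats.t.ppf(0.95, df=4))     # about 2.132, matching the table

    # Example 9.6: two-tailed test, p level 0.01, df = 37 (0.005 in each tail).
    # SciPy can use df = 37 directly and returns about 2.71; the printed table,
    # which has no row for 37, falls back to the more conservative df = 35
    # value of 2.724. The critical values are the negative and positive of this.
    print(stats.t.ppf(0.995, df=37))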

The Six Steps of the Single-Sample t Test
Now we have all the tools necessary to conduct a single-sample t test. So let's consider a hypothetical study and conduct all six steps of hypothesis testing.

EXAMPLE 9.7
Chapter 4 presented data that included the mean number of sessions attended by clients at a university counseling center. We noted that one study reported a mean of 4.6 sessions (Hatchett, 2003). Let's imagine that the counseling center hoped to increase participation rates by having students sign a contract to attend at least 10 sessions. Five students sign the contract, and these students attend 6, 6, 12, 7, and 8 sessions, respectively. The researchers are interested only in their university, so they treat the mean of 4.6 sessions as a population mean.

Nonparticipation in Therapy Clients missing appointments can be a problem for their therapists. A t test can compare the consequences between those who do and those who do not commit themselves to participating in therapy for a set period.

STEP 1. Identify the populations, distribution, and assumptions.
Population 1: All clients at this counseling center who sign a contract to attend at least 10 sessions. Population 2: All clients at this counseling center who do not sign a contract to attend at least 10 sessions.
The comparison distribution will be a distribution of means. The hypothesis test will be a single-sample t test because we have only one sample and we know the population mean but not the population standard deviation.
This study meets one of the three assumptions and may meet the other two: (1) The dependent variable is scale. (2) We do not know whether the data were randomly selected, however, so we must be cautious with respect to generalizing to other clients at this university who might sign the contract. (3) We do not know whether the population is normally distributed, and there are not at least 30 participants. However, the data from our sample do not suggest a skewed distribution.

STEP 2. State the null and research hypotheses.
Null hypothesis: Clients at this university who sign a contract to attend at least 10 sessions attend the same number of sessions, on average, as clients who do not sign such a contract—H0: μ1 = μ2.


Research hypothesis: Clients at this university who sign a contract to attend at least 10 sessions attend a different number of sessions, on average, from clients who do not sign such a contract—H1: μ1 ≠ μ2.

STEP 3. Determine the characteristics of the comparison distribution.
Summary: μM = 4.6; sM = 1.114
Calculations:

μM = μ = 4.6

M = ΣX/N = (6 + 6 + 12 + 7 + 8)/5 = 7.8

X      X − M    (X − M)²
6      −1.8     3.24
6      −1.8     3.24
12      4.2     17.64
7      −0.8     0.64
8       0.2     0.04

The numerator is the sum of the squares:

Σ(X − M)² = 3.24 + 3.24 + 17.64 + 0.64 + 0.04 = 24.8

s = √(Σ(X − M)²/(N − 1)) = √(24.8/(5 − 1)) = √6.2 = 2.490

sM = s/√N = 2.490/√5 = 1.114

STEP 4. Determine the critical values, or cutoffs.

df = N − 1 = 5 − 1 = 4

For a two-tailed test with a p level of 0.05 and a df of 4, the critical values are −2.776 and 2.776 (as seen in the curve in Figure 9-2).

FIGURE 9-2 Determining Cutoffs for a t Distribution
As with the z distribution, we typically determine critical values in terms of t statistics rather than means of raw scores so that we can easily compare a test statistic to them to determine whether the test statistic is beyond the cutoffs. Here, the cutoffs are −2.776 and 2.776, and they mark off the most extreme 5%, with 2.5% in each tail.


STEP 5. Calculate the test statistic.

t = (M − μM)/sM = (7.8 − 4.6)/1.114 = 2.873

STEP 6. Make a decision.
Reject the null hypothesis; it appears that counseling center clients who sign a contract to attend at least 10 sessions do attend more sessions, on average, than do clients who do not sign such a contract (see Figure 9-3).

FIGURE 9-3 Making a Decision
To decide whether to reject the null hypothesis, we compare our test statistic to our critical t values. In this figure, the test statistic, 2.873, is beyond the cutoff of 2.776, so we can reject the null hypothesis.

After completing our hypothesis test, we want to present the primary statistical information in a report. There is a standard American Psychological Association (APA) format for the presentation of statistics across the behavioral sciences so that the results are easily understood by the reader. You'll notice this format in almost every journal article that reports results of a social science study:

1. Write the symbol for the test statistic (e.g., t).
2. Write the degrees of freedom, in parentheses.
3. Write an equal sign and then the value of the test statistic, typically to two decimal places.
4. Write a comma and then the exact p value associated with the test statistic. (Note that we must use software to get the exact p value. For now, we can just say whether our p value is less than the p level of 0.05.)

In our example, then, we reject the null hypothesis and the statistics would read:

t(4) = 2.87, p < 0.05

The statistic typically follows a statement about the finding, after a comma or in parentheses: for example, "It appears that counseling center clients who sign a contract to attend at least 10 sessions, on average, do attend more sessions than do clients who do not sign such a contract." The report would also include the sample mean and the standard deviation (not the standard error) to two decimal points. The descriptive statistics, typically in parentheses, would read, for our example: (M = 7.80, SD = 2.49). Notice that, due to convention, we use SD instead of s to symbolize the standard deviation.

As with a z test, we could present prep as an alternative to the p value, a change that has been encouraged by the Association for Psychological Science (APS). The t table in Appendix B only includes the p values of .10, .05, and .01, so we cannot use it to determine the actual p value for our test statistic. In the SPSS section of this chapter, we show you how you can use SPSS to determine the specific p value for the test statistic.


Then we can insert the p value into the Excel formula introduced in Chapter 8 to determine prep: =NORMSDIST(NORMSINV(1-P)/(SQRT(2))). This procedure is the same for all three kinds of t tests—single-sample, paired-samples, and independent-samples t tests.
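For readers who prefer Python to SPSS or Excel, here is a hedged sketch of how the same exact p value (and, from it, prep) could be obtained; the function names are SciPy's, and the data are the five session counts from the example.

    from math import sqrt
    from scipy import stats

    sessions = [6, 6, 12, 7, 8]      # sessions attended by the five contract signers
    t_stat, p_value = stats.ttest_1samp(sessions, popmean=4.6)
    print(round(t_stat, 2))          # about 2.87, matching the hand calculation
    print(p_value)                   # exact two-tailed p value, just under .05

    # Mirroring the Excel formula above to convert the exact p value into prep:
    p_rep = stats.norm.cdf(stats.norm.ppf(1 - p_value) / sqrt(2))
    print(p_rep)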

Calculating a Confidence Interval for a Single-Sample t Test
As with a z test, the APA recommends that researchers report confidence intervals and effect sizes, in addition to the results of hypothesis tests, whenever possible. We can calculate a confidence interval with the single-sample t test data. The population mean was 4.6, and we used the sample to estimate the population standard deviation to be 2.490 and the population standard error to be 1.114. The five students in the sample attended a mean of 7.8 sessions.

When we conducted hypothesis testing, we centered our curve around the mean according to the null hypothesis—the population mean of 4.6. We determined critical values based on this mean and compared our sample mean to these cutoffs. We were able to reject the null hypothesis that there was no mean difference between the two groups. The test statistic was beyond the cutoff t statistic. Now we can use the same information to calculate the 95% confidence interval around the sample mean of 7.8.

Step 1: Draw a picture of a t distribution that includes the confidence interval. We draw a normal curve (see Figure 9-4) that has the sample mean, 7.8, at its center (instead of the population mean, 4.6).

Step 2: Indicate the bounds of the confidence interval on the drawing. We draw a vertical line from the mean to the top of the curve. For a 95% confidence interval, we also draw two much smaller vertical lines indicating the middle 95% of the t distribution (2.5% in each tail for a total of 5%). We then write the appropriate percentages under the segments of the curve. The curve is symmetric, so half of the 95% falls above and half falls below the mean. Thus, 47.5% falls on each side of the mean between the mean and the cutoff, and 2.5% falls in each tail.

Step 3: Look up the t statistics that fall at each line marking the middle 95%. For a two-tailed test with a p level of 0.05 and df of 4, the critical values are −2.776 and 2.776. We can now add these t statistics to our curve, as seen in Figure 9-5.

MASTERING THE CONCEPT 9-4: Whenever researchers conduct a hypothesis test, the APA encourages that, if possible, they also calculate a confidence interval and an effect size.

FIGURE 9-4 A 95% Confidence Interval for a Single-Sample t Test, Part I
To begin calculating a confidence interval for a single-sample t test, we place the sample mean, 7.8, at the center of a curve and indicate the percentages within and beyond the confidence interval.

FIGURE 9-5 A 95% Confidence Interval for a Single-Sample t Test, Part II
The next step in calculating a confidence interval for a single-sample t test is to identify the t statistics that indicate each end of the interval. Because the curve is symmetric, the t statistics have the same magnitude—one is negative, −2.776, and one is positive, 2.776. The t statistic at the mean is always 0.


Step 4: Convert the t statistics back into raw means. As we did with the z test, we can use formulas for this conversion, but first we must identify the appropriate mean and standard deviation. There are two important points to remember. First, we center our interval around the sample mean (not the population mean). So we use the sample mean of 7.8 in our calculations. Second, because we have a sample mean (rather than an individual score), we use a distribution of means. So we use the standard error of 1.114 as our measure of spread.

Using this mean and standard error, we can calculate the raw mean at each end of the confidence interval, the lower end and the upper end, and add them to our curve as in Figure 9-6. The formulas are exactly the same as for the z test except that z is replaced by t and σM is replaced by sM.

Mlower = −t(sM) + Msample = −2.776(1.114) + 7.8 = 4.71

Mupper = t(sM) + Msample = 2.776(1.114) + 7.8 = 10.89

Our 95% confidence interval, reported in brackets as is typical, is [4.71, 10.89].

Step 5: Check that the confidence interval makes sense. The sample mean should fall exactly in the middle of the two ends of the interval.

4.71 − 7.8 = −3.09 and 10.89 − 7.8 = 3.09

We have a match. The confidence interval ranges from 3.09 below the sample mean to 3.09 above the sample mean. If we were to sample five students from the same population over and over, the 95% confidence interval would include the population mean 95% of the time. Note that the population mean, 4.6, does not fall within this interval. This means it is not plausible that this sample of students who signed contracts came from the population according to the null hypothesis—students seeking treatment at the counseling center who did not sign a contract. We can conclude that the sample comes from a different population; that is, we can conclude that these students attended more sessions than did the general population. As with the z test, the conclusions from both the single-sample t test and the confidence interval are the same, but the confidence interval gives us more information—an interval estimate, not just a point estimate.
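A minimal sketch of the same confidence-interval arithmetic in Python (assuming SciPy is available for the critical t value) might look like this:

    import math
    from scipy import stats

    M, s, N = 7.8, 2.490, 5
    s_M = s / math.sqrt(N)                  # standard error: about 1.114

    t_crit = stats.t.ppf(0.975, df=N - 1)   # 2.776 for a 95% interval with df = 4
    lower = M - t_crit * s_M                # about 4.71
    upper = M + t_crit * s_M                # about 10.89
    print(round(lower, 2), round(upper, 2))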

MASTERING THE FORMULA 9-5: The formula for the lower bound of a confidence interval for a single-sample t test is Mlower = −t(sM) + Msample. The formula for the upper bound of a confidence interval for a single-sample t test is Mupper = t(sM) + Msample. The only differences from those for a z test are that in each formula z is replaced by t and σM is replaced by sM.

FIGURE 9-6 A 95% Confidence Interval for a Single-Sample t Test, Part III
The final step in calculating a confidence interval for a single-sample t test is to convert the t statistics that indicate each end of the interval into raw means, 4.71 and 10.89.

Calculating Effect Size for a Single-Sample t Test
As with a z test, we can calculate the effect size (Cohen's d) for a single-sample t test. Let's calculate it for the counseling center study. Similar to what we did with the z test, we simply use the formula for the t statistic, substituting s for sM (and μ for μM, even though these means are always the same). This means we use 2.490 instead of 1.114 in the denominator. The Cohen's d is now based on the spread of the distribution of individual scores, rather than the distribution of means.

Cohen's d = (M − μ)/s = (7.8 − 4.6)/2.490 = 1.29

Our effect size, d = 1.29, tells us that our sample mean and the population mean are 1.29 standard deviations apart. According to the conventions we learned in Chapter 8 (0.2 is a small effect; 0.5 is a medium effect; 0.8 is a large effect), this is a large effect. We can add the effect size when we report the statistics as follows: t(4) = 2.87, p < 0.05, d = 1.29.
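Because Cohen's d here is just the mean difference divided by the estimated population standard deviation, the calculation is a one-liner in Python; the values below are carried over from the counseling center example.

    M, mu, s = 7.8, 4.6, 2.490   # sample mean, population mean, estimated population SD
    d = (M - mu) / s
    print(round(d, 2))           # about 1.29, a large effect by Cohen's conventions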


CHECK YOUR LEARNING
Reviewing the Concepts
> A single-sample t test is a hypothesis test in which we compare data from one sample to a population for which we know the mean but not the standard deviation.
> We consider degrees of freedom, or the number of scores that are free to vary, instead of N when we assess estimated test statistics against distributions.
> As sample size increases, our confidence in our estimates improves, degrees of freedom increase, and the critical value for t drops, making it easier to reach statistical significance. In fact, as sample size grows, the t distributions approach the z distribution.
> As with any hypothesis test, we identify the populations and comparison distribution and check the assumptions. We then state the null and research hypotheses. We next determine the characteristics of the comparison distribution, a distribution of means based on the null hypothesis. We must first estimate the standard deviation from our sample; then we must calculate the standard error. We then determine critical values, usually for a two-tailed test with a p level of 0.05. The test statistic is then calculated and compared to these critical values, or cutoffs, to determine whether to reject or fail to reject the null hypothesis.
> We can determine prep for our test statistic as an alternative to the p value.
> We can calculate a confidence interval and an effect size, Cohen's d, for a single-sample t test.

Clarifying the Concepts
9-6 Explain the term degrees of freedom.
9-7 Why does a single-sample t test have more uses than a z test?

Calculating the Statistics
9-8 Compute degrees of freedom for each of the following:
a. An experimenter times how long it takes 35 rats to run through a maze with 8 pathways.
b. Test scores for 14 students are collected and averaged over 4 semesters.
9-9 Identify the critical t value for each of the following tests:
a. A two-tailed test with alpha of 0.05 and 11 degrees of freedom
b. A one-tailed test with alpha of 0.01 and N of 17

Applying the Concepts
9-10 Let's assume that according to university summary statistics, the average student misses 3.7 classes during a semester. Imagine the data you have been working with (6, 3, 7, 6, 4, 5) are the number of classes missed by a group of students. Conduct all six steps of hypothesis testing, assuming a two-tailed test with a p level of 0.05. (Note: The work for step 3 has already been completed in Check Your Learning exercises 9-3 and 9-4.)

Solutions to these Check Your Learning questions can be found in Appendix D.

The Paired-Samples t Test
Researchers found that weight gain over the holidays was far less than once thought. The dreaded "freshman 15" also appears to be an exaggerated myth. It's really less than 4 pounds, on average. One study sampled college students at a northeastern university and compared their weight at the beginning of the fall semester with how much they weighed by November (Holm-Denoma, Joiner, Vohs, & Heatherton, 2008). Male students gained an average of 3.5 pounds and female students gained an average of 4.0 pounds. These types of before/after comparisons can be tested by using the paired-samples t test.


The paired-samples t test is used to compare two means for a within-groups design, a situation in which every participant is in both samples; also called a dependent-samples t test.

The paired-samples t test (also called the dependent-samples t test) is used to compare two means for a within-groups design, a situation in which every participant is in both samples. Although both terms—paired and dependent—are used frequently, we use the term paired in this book because it matches the language of some of the most-used statistical software packages. If an individual in the study participates in both conditions (such as a memory task after ingesting a caffeinated beverage and again after ingesting a noncaffeinated beverage), then her score in one depends on her score in the other. That's when we use the dependent-samples, or paired-samples, t test. Once you understand the single-sample t test, the paired-samples t test is simple. The major difference in the paired-samples t test is that we must create difference scores for every individual.

Distributions of Mean Differences
We already have learned about a distribution of scores and a distribution of means. Now we need to develop a distribution of mean differences so that we can establish a distribution that specifies the null hypothesis. Let's use pre- and post-holiday weight data to demonstrate how to create a distribution of mean differences, the distribution that accompanies a within-groups design.

Imagine that many college students' weights were measured before and after the winter holidays to determine if they changed, and you planned to gather data on a sample of three people. Imagine that you have two cards for each person on which weights are listed—one before the holidays and one after the holidays. So you have many pairs of cards, one for each student. First, you would randomly choose three pairs of cards. For each pair, you'd subtract the first weight from the second weight to calculate a difference score, and then you would calculate the mean of the differences in weights for these three people. Then you would randomly choose another three people from the population of many college students and calculate the mean of their three difference scores. And then you'd do it again, and again, and again. That's really all there is to it, except that we do this procedure many more times (although we are simplifying a bit here; we would actually replace every pair of cards before selecting the next). So there are two samples—college students before the holidays and college students after the holidays—but we're building just one curve of mean differences.

Let's say the first student weighed 140 pounds before the holidays and 144 pounds after the holidays; the difference between weights would be 144 − 140 = 4. Next, we put all the cards back and repeat the process. Let's say that this time we have weights of 126 pounds before the holidays and 124 pounds after the holidays; now the difference would be 124 − 126 = −2. A third student might weigh 168 both before and after the holidays for a difference of 0. We would take the mean of these three difference scores, 0.667. We would then choose three more students and calculate the mean of their difference scores. Eventually, we would have many mean differences—some positive, some negative, and some right at 0—and could plot them on a curve. But this would only be the beginning of what this distribution would look like. If we were to calculate the whole distribution, then we would do this an uncountable number of times. When the authors calculated 30 mean differences for pairs of weights, we got the distribution in Figure 9-7 (we plotted means rounded to whole numbers).

FIGURE 9-7 Creating a Distribution of Mean Differences
This distribution is one of many that could be created by pulling 30 mean differences, the average of three differences between pairs of weights, pulled one at a time from a population of pairs of weights—one pre-holiday and one post-holiday. The population used here was one based on the null hypothesis, that there is no average difference in weight from before the holidays to after the holidays. (The histogram plots frequency against mean weight differences in pounds.)


If no mean difference is found when comparing weights from before and after the holidays, as with the data we used to create Figure 9-7, the distribution would center around 0. According to the null hypothesis, we would expect no difference in average weight from before the holidays to after the holidays.
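The card-drawing procedure described above is easy to mimic with a short simulation. The sketch below uses an invented population of before/after weight pairs constructed so that the null hypothesis of no average change is true; it is meant only to illustrate how a distribution of mean differences, like the one in Figure 9-7, piles up around 0.

    import random

    random.seed(1)

    # Invented population of (before, after) weight pairs in which any individual
    # change is just noise, so the true mean difference is 0 (the null hypothesis).
    population = [(w, w + random.choice([-2, -1, 0, 1, 2])) for w in range(120, 220)]

    mean_differences = []
    for _ in range(30):                       # 30 mean differences, as in Figure 9-7
        pairs = random.sample(population, 3)  # draw three people at a time
        diffs = [after - before for before, after in pairs]
        mean_differences.append(sum(diffs) / 3)

    print(mean_differences)   # values scatter around 0: some positive, some negative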

The Six Steps of the Paired-Samples t Test
In a paired-samples t test, each participant has two scores—one in each condition. When we conduct a paired-samples t test, we write the pairs of scores in two columns, side by side next to the same individual. We then subtract each score in one column from its paired score in the other column to create difference scores. Ideally, a positive difference score indicates an increase, and a negative difference score indicates a decrease. Typically, we subtract the first score from the second so that our difference scores match this logic. Now we implement the steps of the single-sample t test with only minor changes, as discussed in the steps below.

EXAMPLE 9.8
Let's try an example from the social sciences. Computers and software companies often employ social scientists to research ways their products can better benefit users. For example, Microsoft researchers studied how 15 volunteers performed on a set of tasks under two conditions. The researchers compared the volunteers' performance on the tasks while using a 15-inch computer monitor and while using a 42-inch monitor (Czerwinski et al., 2003). The 42-inch monitor, far larger than most of us have ever used, allows the user to have multiple programs in view at the same time.

Here are five participants' fictional data, which reflect the actual means reported by researchers. Note that a smaller number is good—it indicates a faster time. The first person completed the tasks on the small monitor in 122 seconds and on the large monitor in 111 seconds; the second person in 131 and 116; the third in 127 and 113; the fourth in 123 and 119; and the fifth in 132 and 121.

MASTERING THE CONCEPT 9-5: The steps for the paired-samples t test are very similar to those for the single-sample t test. The main difference is that we are comparing the sample mean difference between scores to the mean difference for the population according to the null hypothesis, rather than comparing the sample mean of individual scores to the population mean.

Large Monitors and Productivity Microsoft researchers and cognitive psychologists (Czerwinski et al., 2003) reported a 9% increase in productivity when research volunteers used an extremely large 42-inch display versus a more typical 15-inch display. Every participant used both displays and thus was in both samples. A paired-samples t test is the appropriate hypothesis test for this within-groups design.

STEP 1. Identify the populations, distribution, and assumptions.
The paired-samples t test is like the single-sample t test in that we analyze a single sample of scores. For the single-sample t test, we use individual scores; for the paired-samples t test, we use difference scores. For the paired-samples t test, one population is reflected by each condition, but the comparison distribution is a distribution of mean difference scores (rather than a distribution of means). The comparison distribution is the same as with the single-sample t test; it is based on the null hypothesis that posits no difference. So the mean of the comparison distribution is 0; this indicates a mean difference score of 0. The assumptions are the same as for the single-sample t test.
Summary: Population 1: People performing tasks using a 15-inch monitor. Population 2: People performing tasks using a 42-inch monitor.
The comparison distribution will be a distribution of mean difference scores based on the null hypothesis. The hypothesis test will be a paired-samples t test because we have two samples of scores, and every individual contributes a score to each sample.


This study meets one of the three assumptions and may meet the other two: (1) The dependent variable is time, which is scale. (2) The participants were not randomly selected, however, so we must be cautious with respect to generalizing our findings. (3) We do not know whether the population is normally distributed, and there are not at least 30 participants. However, the data from our sample do not suggest a skewed distribution.

STEP 2. State the null and research hypotheses.
This step is identical to that for the single-sample t test. Remember, hypotheses are always about populations, not about our specific samples.
Summary: Null hypothesis: People who use a 15-inch screen will complete a set of tasks in the same amount of time, on average, as people who use a 42-inch screen—H0: μ1 = μ2. Research hypothesis: People who use a 15-inch screen will complete a set of tasks in a different amount of time, on average, from people who use a 42-inch screen—H1: μ1 ≠ μ2.

STEP 3. Determine the characteristics of the comparison distribution.
This step is similar to that for the single-sample t test in that we determine the appropriate mean and standard error of the comparison distribution—the distribution based on the null hypothesis. In the single-sample t test, there was a comparison mean, and the null hypothesis posited that the sample mean would be the same as that of the comparison distribution. With the paired-samples t test, we have a sample of difference scores. According to the null hypothesis, there is no difference; that is, the mean difference score is 0. So the mean of the comparison distribution is always 0, as long as the null hypothesis posits no difference.

The standard error is calculated exactly as it is calculated for the single-sample t test, only we use the difference scores rather than the scores in each condition. To get the difference scores in the current example, we want to know what happens when we go from the control condition (small screen) to the experimental condition (large screen), so we subtract the first score from the second score. This means that a negative difference indicates a decrease in time when the screen goes from small to large and a positive difference indicates an increase in time. (The test statistic will be the same if we reverse the order in which we subtract, but the sign will change. In some cases, you can think about it as subtracting the "before" score from the "after" score.)

Another helpful strategy is to cross out the original scores once we've created the difference scores so that we remember to use only the difference scores from that point on. If we don't cross out the original scores, it is very easy to use them in our calculations and end up with an incorrect standard error.

Summary: μM = 0; sM = 1.923
Calculations: (Notice that we crossed out the original scores once we created our column of difference scores. We did this to remind ourselves that all remaining calculations involve the difference scores, not the original scores.)

X      Y      Difference   Difference − mean difference   Squared deviation
122    111    −11           0                              0
131    116    −15          −4                             16
127    113    −14          −3                              9
123    119    −4            7                             49
132    121    −11           0                              0


The mean of the difference scores is:

Mdifference = −11

The numerator is the sum of squares, SS:

SS = 0 + 16 + 9 + 49 + 0 = 74

s = √(SS/(N − 1)) = √(74/(5 − 1)) = √18.5 = 4.301

sM = s/√N = 4.301/√5 = 1.923

STEP 4. Determine the critical values, or cutoffs.

This step is the same as that for the single-sample t test. The degrees of freedom is the number of participants (not the number of scores) minus 1.
Summary: df = N − 1 = 5 − 1 = 4
Our critical values, based on a two-tailed test and a p level of 0.05, are −2.776 and 2.776, as seen in the curve in Figure 9-8.

STEP 5. Calculate the test statistic.

This step is identical to that for the single-sample t test.
Summary: t = (M − μM)/sM = (−11 − 0)/1.923 = −5.72

STEP 6. Make a decision.

This step is identical to that for the single-sample t test. If we reject the null hypothesis, we need to examine the means of the two conditions (in this case, MX = 127; MY = 116) so that we know the direction of the effect. Remember, even though the hypotheses are two-tailed, we report the direction of the effect.
Summary: Reject the null hypothesis. It appears that, on average, people perform faster when using a 42-inch monitor than when using a 15-inch monitor (as shown by the curve in Figure 9-9).


FIGURE 9-8
Determining Cutoffs for a Paired-Samples t Test

We typically determine critical values in terms of t statistics rather than means of raw scores so that we can easily compare a test statistic to them to determine whether the test statistic is beyond the cutoffs.

FIGURE 9-9
Making a Decision

To decide whether to reject the null hypothesis, we compare our test statistic to our critical values. In this figure, the test statistic, −5.72, is beyond the cutoff of −2.776, so we can reject the null hypothesis.


The statistics, as reported in a journal article, follow the same APA format as for a single-sample t test. We report the degrees of freedom, the value of the test statistic, and the p value associated with the test statistic (unless we use software, we can only indicate whether the p value is less than or greater than 0.05). In the current example, the statistics would read: t(4) = −5.72, p < 0.05.
(We would also include the means and the standard deviations for the two samples. We calculated the means in step 6 of hypothesis testing, but we would also have to calculate the standard deviations for the two samples to report them. In addition, we could report prep instead of the p value. See the SPSS section of this chapter for details.)
Researchers note that the faster time with the large display might not seem much faster but that, in their research, they have had great difficulty identifying any factors that lead to faster times (Czerwinski et al., 2003). Based on their previous research, therefore, this is an impressive difference.

Calculating a Confidence Interval for a Paired-Samples t Test

As with most hypothesis testing, the APA also encourages the use of confidence intervals and effect sizes when conducting a paired-samples t test. Let's start by determining the confidence interval for the example we've been using. First, let's recap the information we need. The population mean difference according to the null hypothesis was 0, and we used the sample to estimate the population standard deviation to be 4.301 and the standard error to be 1.923. The five participants in the study sample had a mean difference of −11. We will calculate the 95% confidence interval around the sample mean difference of −11.

Step 1: Draw a picture of a t distribution that includes the confidence interval.
We draw a normal curve (see Figure 9-10) that has the sample mean difference, −11, at its center instead of the population mean difference, 0.

Step 2: Indicate the bounds of the confidence interval on the drawing.
As before, 47.5% would fall on each side of the mean between the mean and the cutoff, and 2.5% would fall in each tail.

Step 3: Add the t statistics to the curve, as seen in Figure 9-11.
For a two-tailed test with a p level of 0.05 and 4 df, the critical values are −2.776 and 2.776.

Step 4: Convert the t statistics back into raw mean differences.
As we did with the other confidence intervals, we use the sample mean difference (−11) in our calculations and the standard error (1.923) as our measure of spread. We use the same formulas as for the single-sample t test, recalling that these means and standard errors are calculated from differences between two scores for each participant in the study, rather than an individual score for each participant. We have added these raw mean differences to our curve in Figure 9-12.

Mlower = −t(sM) + Msample = −2.776(1.923) + (−11) = −16.34

Mupper = t(sM) + Msample = 2.776(1.923) + (−11) = −5.66

Our 95% confidence interval, reported in brackets as is typical, is [−16.34, −5.66].


FIGURE 9-10
A 95% Confidence Interval for a Paired-Samples t Test, Part I

We start the confidence interval for a distribution of mean differences by drawing a curve with the sample mean difference, −11, in the center.

FIGURE 9-11
A 95% Confidence Interval for a Paired-Samples t Test, Part II

The next step in calculating a confidence interval for mean differences is identifying the t statistics that indicate each end of the interval. Because the curve is symmetric, the t statistics have the same magnitude—one is negative, −2.776, and one is positive, 2.776.


Step 5: Check that the confidence interval makes sense. The sample mean difference should fall exactly in the middle of the two ends of the interval.

−11 − (−5.66) = −5.34 and −11 − (−16.34) = 5.34

We have a match. The confidence interval ranges from 5.34 below the sample mean difference to 5.34 above the sample mean difference. If we were to sample five people from the same population over and over, the 95% confidence interval would include the population mean 95% of the time. Note that the population mean difference according to the null hypothesis, 0, does not fall within this interval. This means it is not plausible that the difference between those using the 15-inch monitor and those using the 42-inch monitor is 0. We can conclude that, on average, people perform faster when using a 42-inch monitor than when using a 15-inch monitor. As with other hypothesis tests, the conclusions from both the paired-samples t test and the confidence interval are the same, but the confidence interval gives us more information—an interval estimate, not just a point estimate.
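The interval itself is just two lines of arithmetic once the critical value has been read from the t table. A minimal sketch, continuing the Python example above (the variable names are ours):

t_crit = 2.776          # two-tailed critical t for df = 4 at a p level of 0.05
s_m = 1.923             # standard error of the mean difference
m_sample = -11          # sample mean difference

lower = -t_crit * s_m + m_sample    # about -16.34
upper =  t_crit * s_m + m_sample    # about -5.66
print(round(lower, 2), round(upper, 2))   # the 95% confidence interval is [-16.34, -5.66]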

Calculating Effect Size for a Paired-Samples t Test

As with a z test, we can calculate the effect size (Cohen's d) for a paired-samples t test. Let's calculate it for the computer monitor study. Again, we simply use the formula for the t statistic, substituting s for sM (and μ for μM, even though these means are always the same). This means we use 4.301 instead of 1.923 in the denominator. Cohen's d is now based on the spread of the distribution of individual differences between scores, rather than the distribution of mean differences.

Cohen’s

Our effect size, d � �2.56, tells us that our sample mean difference and the pop-ulation mean difference are 2.56 standard deviations apart. This is a large effect. Re-call that the sign has no effect on the size of an effect: �2.56 and 2.56 are equivalenteffect sizes. We can add the effect size when we report the statistics as follows: t(4) ��5.72, p � 0.05, d � �2.56.

dM

s5

25

2 252

( ) ( ).

.l 11 0

4 3012 56
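The only change from the t statistic is the denominator, so the check is a single division; a short sketch under the same assumptions as the listings above:

mean_diff = -11
s = 4.301                     # estimated population standard deviation of the difference scores
d = (mean_diff - 0) / s       # Cohen's d
print(round(d, 2))            # about -2.56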


FIGURE 9-12
A 95% Confidence Interval for a Paired-Samples t Test, Part III

The final step in calculating a confidence interval for mean differences is converting the t statistics that indicate each end of the interval to raw mean differences, −16.34 and −5.66.

MASTERING THE FORMULA

9-7: The formula for the lower bound of a confidence interval for a paired-samples t test is Mlower = −t(sM) + Msample. The formula for the upper bound of a confidence interval for a paired-samples t test is Mupper = t(sM) + Msample. These are the same as for a single-sample t test, but remember that the means and standard errors are calculated from differences between pairs of scores, not individual scores.

CHECK YOUR LEARNING

Reviewing the Concepts
> The paired-samples t test is used when we have data for all participants under two conditions—a within-groups design.

> In the paired-samples t test, we calculate a difference score for every individual. The statistic is calculated on those difference scores.

> We use the same six steps of hypothesis testing that we used with the z test and with the single-sample t test.



> We can calculate prep, a confidence interval, and an effect size (Cohen's d) for a paired-samples t test.

Clarifying the Concepts
9-11 How do we conduct a paired-samples t test?

9-12 Explain what an individual difference score is, as it is used in a paired-samples t test.

Calculating the Statistics
9-13 Below are energy-level data (on a scale of 1 to 7, where 1 = feeling of no energy and 7 = feeling of high energy) for five students before and after lunch. Calculate the mean difference for these people so that loss of energy is a negative value. Assume you are testing the hypothesis that students go into what we call "food comas" after eating, versus lunch giving them added energy.

Before lunch After lunch

6 3

5 2

4 6

5 4

7 5

Applying the Concepts
9-14 Using the energy-level data presented in Check Your Learning 9-13, let's test the hypothesis that students have different energy levels before and after lunch.

a. Perform the six steps of hypothesis testing.

b. Calculate the 95% confidence interval and describe how it results in the same conclusion as the hypothesis test.

c. Calculate and interpret Cohen’s d.

Solutions to these Check Your Learning questions can be found in Appendix D.

REVIEW OF CONCEPTS

The t Distributions
The t distributions are similar to the z distribution, except that we must estimate the standard deviation from the sample. When estimating the standard deviation, we must make a mathematical correction to adjust for the increased likelihood of error. After estimating the standard deviation, the t statistic is calculated like the z statistic for distributions of means. The t distributions can be used to compare the mean of a sample to a population mean when we don't know the population standard deviation (single-sample t test), to compare two samples with a within-groups design (paired-samples t test), and to compare two samples with a between-groups design (independent-samples t test). We learned about the first two t tests in this chapter; the third t test is described in Chapter 10.
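The "mathematical correction" referred to here is simply dividing the sum of squared deviations by N − 1 rather than N before taking the square root. A minimal Python sketch, using the counseling-session scores (6, 6, 12, 7, 8) that appear again in the SPSS section below, shows how the two versions differ:

from math import sqrt

scores = [6, 6, 12, 7, 8]
m = sum(scores) / len(scores)
ss = sum((x - m) ** 2 for x in scores)

sd_of_sample = sqrt(ss / len(scores))          # divides by N: about 2.23
sd_estimate  = sqrt(ss / (len(scores) - 1))    # divides by N - 1: about 2.49

print(round(sd_of_sample, 2), round(sd_estimate, 2))

The larger, N − 1 version is the one used as s in every t formula in this chapter.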

The Single-Sample t Test
Like z tests, single-sample t tests are conducted in the rare cases in which we have one sample that we're comparing to a known population. The difference is that we must know the mean and the standard deviation of the population to conduct a z test, whereas



we only have to know the mean of the population to conduct a single-sample t test. There are many t distributions, one for every possible sample size. We look up the appropriate critical values on the t table based on degrees of freedom, a number calculated from the sample size. In addition to the hypothesis test, we can calculate prep, a confidence interval, and an effect size (Cohen's d) for a single-sample t test.

The Paired-Samples t Test
A paired-samples t test is used when we have two samples and the same participants are in both samples; to conduct the test, we calculate a difference score for every individual. The comparison distribution is a distribution of mean difference scores. We can calculate prep, a confidence interval, and an effect size (Cohen's d) for a paired-samples t test.

SPSS®

The t test is used to compare only two groups. First, let's conduct a single-sample t test using the data on number of counseling sessions attended that we tested earlier in this chapter. The five scores were: 6, 6, 12, 7, and 8.
Select Analyze → Compare Means → One-Sample T Test. Then highlight the dependent variable (sessions) and click the arrow in the center to choose it. Type the population mean to which we're comparing our sample, 4.6, next to "Test Value" and click "OK." The screenshot here shows the data and output. You'll notice that the t statistic, 2.874, is almost identical to the one we calculated, 2.873. The difference is only due to our rounding decisions. Notice that the confidence interval is different from the one we calculated. This is an interval around the difference between the two means, rather than around the mean of our sample.
The p value is under "Sig (2-tailed)." The p value of .045 is less than the chosen p level of .05, an indication that this is a statistically significant finding. We can use this p value in Excel to calculate our prep. When we replace P with .045 in the Excel formula, =NORMSDIST(NORMSINV(1-P)/(SQRT(2))), we get a prep of .8847. If we were to replicate this study with the same sample size drawn from the same population, we could expect an effect in the same direction 88.47% of the time.
For a paired-samples t test, let's use the data from this chapter on performance using a small monitor versus a large monitor. Enter the data in two columns, with each participant having one score in the first column for his or her performance on the small monitor and one score in the second column for his or her performance on the large monitor.
Select Analyze → Compare Means → Paired-Samples T Test. Choose the dependent variable under the first condition (small) by clicking it, then clicking the center arrow. Choose the dependent variable under the second condition (large) by


clicking it, then clicking the center arrow. Then click "OK." The data and output are shown in the screenshot. Notice that the t statistic and confidence interval match ours (5.72) and [−16.34, −5.66] except that the signs are different. This occurs because of the order in which one score was subtracted from the other score—that is, whether the score on the large monitor was subtracted from the score on the small monitor, or vice versa. The outcome is the same in either case. The p value is under "Sig. (2-tailed)" and is .005. We can use this number in Excel to determine the value for prep, .9657.

How It Works

9.1 CONDUCTING A SINGLE-SAMPLE t TEST
In How It Works 7.2, we conducted a z test for data from the Consideration of Future Consequences (CFC) scale (Petrocelli, 2003). How can we conduct all six steps for a single-sample t test for the same data using a p level of 0.05 and a two-tailed test? To start, we'll use the population mean CFC score of 3.51, but we'll pretend that we no longer know the population standard deviation. As before, we wonder whether students who joined a career discussion group might have improved CFC scores, on average, compared with the population. Forty-five students in the social sciences regularly attended these discussion groups and then took the CFC scale. The mean for this group is 3.7. The standard deviation for this sample is 0.52.

Step 1: Population 1: All students in career discussion groups
Population 2: All students who did not participate in career discussion groups

The comparison distribution will be a distribution of means. The hypothesis test will be a single-sample t test because we have only one sample, and we know the population mean but do not know the population standard deviation. This study meets two of the three assumptions and may meet the third. The dependent variable is scale. In addition, there are more than 30 participants in the sample, indicating that the comparison distribution will be normal. The data were not randomly selected, however, so we must be cautious when generalizing.

Step 2: Null hypothesis: Students who participated in career discussion groups had the same CFC scores, on average, as students who did not participate—H0: μ1 = μ2.


Page 25: Nolan Ch09

Research hypothesis: Students who participated in career discussion groups had different CFC scores, on average, from students who did not participate—H1: μ1 ≠ μ2.

Step 3: μM = μ = 3.51; sM = s/√N = 0.52/√45 = 0.078

Step 4: df = N − 1 = 45 − 1 = 44

Our critical values, based on 44 degrees of freedom (because 44 is not in the table, we look up the more conservative degrees of freedom of 40), a p level of 0.05, and a two-tailed test, are −2.021 and 2.021.

Step 5: t = (M − μM)/sM = (3.7 − 3.51)/0.078 = 2.44

Step 6: Reject the null hypothesis. It appears that students who participate in career discussion groups have higher CFC scores, on average, than do students who do not participate.

The statistics, as presented in a journal article, would read:

t(44) = 2.44, p < 0.05
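A sketch of the same calculation in Python, using only the summary statistics given above (the variable names are ours):

from math import sqrt

mu = 3.51     # population mean CFC score
m = 3.7       # sample mean
s = 0.52      # standard deviation estimated from the sample
n = 45

s_m = s / sqrt(n)       # about 0.078
t = (m - mu) / s_m      # about 2.45 at full precision

print(round(s_m, 3), round(t, 2))

At full precision the t statistic is about 2.45; the 2.44 in the text comes from rounding the standard error to 0.078 before dividing. Either way, it is beyond the critical value of 2.021.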

9.2 CONDUCTING A PAIRED-SAMPLES t TEST
Salary Wizard is an online tool that allows you to look up incomes for specific jobs for cities in the United States. We looked up the 25th percentile for income for six jobs in two cities: Boise, Idaho, and Los Angeles, California. The data are below.

                                          Boise          Los Angeles
Executive chef                            $53,047.00     $62,490.00
Genetics counselor                        $49,958.00     $58,850.00
Grants/proposal writer                    $41,974.00     $49,445.00
Librarian                                 $44,366.00     $52,263.00
Public schoolteacher                      $40,470.00     $47,674.00
Social worker (with bachelor's degree)    $36,963.00     $43,542.00

How can we conduct a paired-samples t test to determine whether income in one of these cities differs, on average, from income in the other? We'll use a two-tailed test and a p level of 0.05.

Step 1: Population 1: Job types in Boise, Idaho
Population 2: Job types in Los Angeles, California

The comparison distribution will be a distribution of mean differences. The hypothesis test will be a paired-samples t test because we have two samples, and all participants are in both samples.
This study meets one of the three assumptions and may meet the third. The dependent variable, income, is scale. We do not know whether the population is normally distributed, there are not at least 30 participants, and there is not much variability in the data in our samples, so we should proceed with caution. The data were not randomly selected, so we should be cautious when generalizing beyond this sample of job types.

Step 2: Null hypothesis: Jobs in Boise pay the same, on average, as jobs in Los Angeles—H0: μ1 = μ2. Research hypothesis: Jobs in Boise pay different incomes, on average, from jobs in Los Angeles—H1: μ1 ≠ μ2.

Step 3: μM = μ = 0; sM = 438.830

Boise          Los Angeles    Difference (D)    (D − Mdifference)    (D − Mdifference)²
$53,047.00     $62,490.00     9443              1528.667             2336821.797
$49,958.00     $58,850.00     8892              977.667              955832.763
$41,974.00     $49,445.00     7471              −443.333             196544.149
$44,366.00     $52,263.00     7897              −17.333              300.433
$40,470.00     $47,674.00     7204              −710.333             504572.971
$36,963.00     $43,542.00     6579              −1335.333            1783114.221



Mdifference = 7914.333

SS = Σ(D − Mdifference)² = 5,777,187.333

sdifference = √(SS/(N − 1)) = √(5,777,187.333/(6 − 1)) = 1074.913

sM = sdifference/√N = 1074.913/√6 = 438.830

Step 4: df = N − 1 = 6 − 1 = 5

Our critical values, based on 5 degrees of freedom, a p level of 0.05, and a two-tailed test, are −2.571 and 2.571.

Step 5: t = (Mdifference − μM)/sM = (7914.333 − 0)/438.830 = 18.04

Step 6: Reject the null hypothesis. It appears that jobs in Los Angeles pay more, on average, than do jobs in Boise.

The statistics, as they would be presented in a journal article, are: t(5) = 18.04, p < 0.05
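The same six-step arithmetic, sketched in Python for the salary data (again, only a check on the hand calculations; the names are ours):

from math import sqrt

boise       = [53047, 49958, 41974, 44366, 40470, 36963]
los_angeles = [62490, 58850, 49445, 52263, 47674, 43542]

diffs = [la - b for b, la in zip(boise, los_angeles)]   # Los Angeles minus Boise
n = len(diffs)
mean_diff = sum(diffs) / n                              # about 7914.33
ss = sum((d - mean_diff) ** 2 for d in diffs)           # about 5,777,187.3
s = sqrt(ss / (n - 1))                                  # about 1074.91
s_m = s / sqrt(n)                                       # about 438.83
t = (mean_diff - 0) / s_m                               # about 18.04

print(round(mean_diff, 3), round(s_m, 3), round(t, 2))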


Exercises

Clarifying the Concepts

9.1 When should we use a t distribution?

9.2 Why do we modify the formula for calculating standard deviation when using t tests (and divide by N − 1)?

9.3 How is the calculation of standard error different for a t test than for a z test?

9.4 Explain why the standard error for the distribution of sample means is smaller than the standard deviation of sample scores.

9.5 Define the symbols in the formula for the t statistic: t = (M − μM)/sM.

9.6 Distinguish between within-groups designs and between-groups designs.

9.7 What do we mean when we say we have a distribution of mean differences?

9.8 Why is a two-tailed t test considered more conservative than a one-tailed t test?

9.9 What does the phrase free to vary, referring to a number of scores in a given sample, mean for statisticians?

9.10 How is the t critical value affected by sample size and degrees of freedom?

9.11 Why does the t distribution merge with the z distribution as sample size increases?

9.12 Explain what each part of the following statistic means, as it would be reported in American Psychological Association (APA) format: t(4) = 2.87, p = 0.032.

9.13 What could you report instead of the p value in Exercise 9.12?

9.14 When do we use a paired-samples t test?

9.15 Explain the distinction between the terms independent-samples and paired-samples as they relate to t tests.

Calculating the Statistics

9.16 We use formulas to describe calculations. Find the error in symbolic notation in each of the following formulas. Explain why it is incorrect and provide the correct symbolic notation.

a. z = (X − M)/σ

b. X = z(σ) + μM

c. σM = σ/√(N − 1)

d. t = (M − μM)/σM

9.17 For the following data (93, 97, 91, 88, 103, 94, 97), calculate the standard deviation under both of these conditions:

a. For the sample

b. As an estimate of the population

9.18 For the following data (1.01, 0.99, 1.12, 1.27, 0.82, 1.04), calculate the standard deviation under both of these conditions. (Note: You will have to carry some calculations out to the third decimal place to see the difference in calculations.)

a. For the sample

b. As an estimate of the population



9.19 Calculate the standard error for t for the sample used in Exercise 9.17 using symbolic notation: 93, 97, 91, 88, 103, 94, 97.

9.20 Calculate the standard error for t for the sample used in Exercise 9.18 using symbolic notation: 1.01, 0.99, 1.12, 1.27, 0.82, 1.04.

9.21 Calculate the t statistic for the data presented in Exercise 9.17, assuming μ = 96. Again, the data are 93, 97, 91, 88, 103, 94, 97.

9.22 Calculate the t statistic for the data presented in Exercise 9.18, assuming μ = 0.96. Again, the data are 1.01, 0.99, 1.12, 1.27, 0.82, 1.04.

9.23 Identify the critical t value in each of the following circumstances:

a. A one-tailed test with 73 degrees of freedom at a p level of 0.10

b. A two-tailed test with 108 degrees of freedom at a p level of 0.05

c. A one-tailed test with 38 degrees of freedom at a p level of 0.01

9.24 Calculate degrees of freedom and identify the critical t value in each of the following circumstances:

a. A two-tailed test based on 8 observations at a p level of 0.10

b. A one-tailed test based on 42 observations at a p level of 0.05

c. A two-tailed test based on 89 observations at a p level of 0.01

9.25 Identify t critical values for each of the following tests:

a. A single-sample t test examining scores for 26 participants to see if there is any difference compared to the population, using a p level of 0.05

b. A one-tailed, paired-samples t test performed on scores on the Marital Satisfaction Inventory for 18 couples who went through marriage counseling, using a p level of 0.01

c. A two-tailed, single-sample t test, using a p level of 0.05, with 34 degrees of freedom

9.26 Assume we know the following for a two-tailed, single-sample t test, at a p level of 0.05: μ = 44.3, N = 114, M = 43, s = 5.9.

a. Calculate the t statistic.

b. Calculate a 95% confidence interval.

c. Calculate effect size using Cohen's d.

9.27 Assume we know the following for a two-tailed, single-sample t test: μ = 7, N = 41, M = 8.5, s = 2.1.

a. Calculate the t statistic.

b. Calculate a 99% confidence interval.

c. Calculate effect size using Cohen's d.

9.28 Assume we know the following for a paired-samples t test: N = 32, Mdifference = 1.75, s = 4.0.

a. Calculate the t statistic.

b. Calculate a 95% confidence interval for a two-tailed test.

c. Calculate effect size using Cohen's d.

9.29 Assume we know the following for a paired-samples t test: N = 13, Mdifference = −0.77, s = 1.42.

a. Calculate the t statistic.

b. Calculate a 95% confidence interval for a two-tailed test.

c. Calculate effect size using Cohen's d.

9.30 Each of the following is a p value for a specific t statistic. For each, use Excel to determine prep.

a. 0.022

b. 0.37

c. 0.004

Applying the Concepts

9.31 For each of the problems described below, which are the same as those described in Exercise 9.25, identify what the z critical value would have been if there had been just one sample and we knew the mean and standard deviation of the population:

a. A single-sample t test examining scores for 26 participants to see if there is any difference compared to the population, using a p level of 0.05

b. A one-tailed, single-sample t test performed on scores on the Marital Satisfaction Inventory for 18 people who went through marriage counseling, using a p level of 0.01

c. A two-tailed, single-sample t test, using a p level of 0.05, with 34 degrees of freedom

d. Comparing the t critical values with the z critical values, explain how and why these are different.

9.32 On its Web site, the Princeton Review claims that students who have taken its course improve their GRE scores, on average, by 210 points. (No other information is provided about this statistic.) Treating this average gain as a population mean, a researcher wonders whether the far cheaper technique of practicing for the GRE on one's own using books and CD-ROMs would lead to a different average gain. She randomly selects five students from the pool of students at her university who plan to take the GRE. The students take a practice test before and after two months of self-study. They reported (fictional) gains of 160, 240, 340, 70, and 250 points. (Note that many experts suggest that the results from self-study are similar to those from a structured course if you have the self-discipline to go solo. Regardless of the format, preparation has been convincingly demonstrated to lead to increased scores.)


a. Using symbolic notation and formulas (where appropriate), determine the appropriate mean and standard error for the distribution to which we will compare this sample. Show all steps of your calculations.

b. Using symbolic notation and the formula, calculate the t statistic for this sample.

c. As an interested consumer, what critical questions would you want to ask about the statistic reported by the Princeton Review? List at least three questions.

9.33 The Florida Department of Corrections publishes an online death row fact sheet. It reports the average time on death row prior to execution as 11.72 years but provides no standard deviation. This mean is a parameter, as it is calculated from the entire population of executed prisoners in Florida. Has the time spent on death row changed in recent years? According to the execution list linked to the same Web site, the six prisoners executed in Florida during the years 2003, 2004, and 2005 spent 25.62, 13.09, 8.74, 17.63, 2.80, and 4.42 years on death row, respectively. (All were men, although Aileen Wuornos, the serial killer portrayed by Charlize Theron in the 2003 film Monster, was among the three prisoners executed by the state of Florida in 2002; Wuornos spent 10.69 years on death row.)

a. Using symbolic notation and formulas (where appropriate), determine the appropriate mean and standard error for the distribution of means. Show all steps of your calculations.

b. Using symbolic notation and the formula, calculate the t statistic for time spent on death row for the sample of recently executed prisoners.

c. The specific p value for the t statistic you just calculated is 0.929. Use Excel to determine prep. What does prep tell you about this study?

d. The execution list provides data on all prisoners executed since the death penalty was reinstated in Florida in 1976. Included for each prisoner are the name, race, gender, date of birth, date of offense, date sentenced, date arrived on death row, date of execution, number of warrants, and years on death row. State at least one hypothesis, other than year of execution, that could be examined using a t distribution and the comparison mean of 11.72 years on death row. Be specific about your hypothesis (and if you are truly interested, you can search for the data online).

e. What additional information would you need to calculate a z score for the length of time Aileen Wuornos spent on death row?

9.34 Refer to the information provided in Exercise 9.33 when answering the following:

a. Write hypotheses to address the question "Has the time spent on death row changed in recent years?"

b. Using these data as "recent years" and the mean of 11.72 years as the comparison, answer the question based on your t statistic, using an alpha of 0.05.

9.35 Refer to the information provided in Exercise 9.33 when answering the following:

a. Calculate the confidence interval for this statistic based on the data presented.

b. What conclusion would you make about your hypotheses based on this confidence interval? What can you say about the size of this confidence interval?

9.36 Refer to the information provided in Exercise 9.33 and the work you have done through Exercise 9.35 when answering the following:

a. Calculate the effect size using Cohen's d.

b. Evaluate the size of this effect.

9.37 Many communities worldwide are lamenting the effects of so-called big box retailers (e.g., Wal-Mart) on their local economies, particularly on small, independently owned shops. Do these large stores affect the bottom lines of locally owned retailers? Imagine that you decide to test this premise. You assess earnings at 20 local stores for the month of October, a few months before a big box store opens. You then assess earnings the following October, correcting for inflation.

a. What are the two populations?

b. What would the comparison distribution be? Explain.

c. What hypothesis test would you use? Explain.

d. Check the assumptions for this hypothesis test.

e. What is one flaw in drawing conclusions from this comparison over time?

9.38 For the scenario described in Exercise 9.37 (big box stores and their effect on local retailers), state the null and research hypotheses in both words and symbols.

9.39 Bardwell, Ensign, and Mills (2005) assessed the moods of 60 male U.S. Marines following a month-long training exercise conducted in cold temperatures and at high altitudes. Negative moods, including fatigue and anger, increased substantially during the training and lasted up to three months after the training ended. Mean mood scores were compared to population norms for three groups: college men, adult men, and male psychiatric outpatients. Let's examine anger scores for six men at the end of training; these scores are fictional, but their mean and standard deviation are very close to the actual descriptive statistics for the sample: 14, 12, 13, 12, 14, 15.

a. The population mean anger score for college men is 8.90. Conduct all six steps of a single-sample t test. Be sure to label all six steps. Report the statistics as you would in a journal article.

b. Now calculate the test statistic to compare this sample mean to the population mean anger score for adult men (M = 9.20). You do not have to repeat all the steps from part (a), but conduct step 6 of hypothesis testing and report the statistics as you would in a journal article.

c. Now calculate the test statistic to compare this sample mean to the population mean anger score for male psychiatric outpatients (M = 13.5). Do not repeat all the steps from part (a), but conduct step 6 of hypothesis testing and report the statistics as you would in a journal article.

d. What can we conclude overall about Marines' moods following high-altitude, cold-weather training? Remember, if we fail to reject the null hypothesis, we can only conclude that there is no evidence from this study to support the research hypothesis. We cannot conclude that we have supported the null hypothesis.

9.40 The number of paid days off (i.e., vacation, sick leave) taken by eight employees at a local small business is compared to the national average. You are hired by the business owner, who has been in business for just 18 months, to help her determine what to expect for paid days off. In general, she wants to set some standard for her employees and for herself. Let's assume your search on the Internet for data on paid days off leaves you with the impression that the national average is 15 days. The data for the eight local employees during the last fiscal year are: 10, 11, 8, 14, 13, 12, 12, and 27 days.

a. Write hypotheses for your research.

b. Which type of test would be appropriate to analyze these data in order to answer your question?

c. Before doing any computations, do you have any concerns about this research? Are there any questions you might like to ask about the data you have been given?

9.41 Use the data presented in Exercise 9.40 to help this business owner understand her employees' experience with paid days off in greater detail.

a. Calculate the appropriate t statistic. Show all of your work in detail.

b. Draw a statistical conclusion for this business owner.

c. The p level for the test statistic you calculated in part (a) is 0.454. Using Excel, determine prep.

d. Calculate the confidence interval.

e. Calculate and interpret the effect size.

9.42 Consider all the results you calculated in Exercise 9.41. How would you summarize the situation for this business owner? Identify the limitations of your analyses, and discuss the difficulties of making comparisons between populations and samples. Make reference to the assumptions of the statistical test in your answer.

9.43 After further investigation, you discover that one of the data points, 27 days, was actually the owner's number of paid days off. Redo some of the work for Exercise 9.41, adapting for this new information by deleting that value.

a. Calculate the appropriate t statistic. Show all of your work in detail.

b. Draw a statistical conclusion for this business owner.

c. The p level for the test statistic you calculated in part (a) is now 0.003. Using Excel, determine prep.

d. Calculate and interpret the effect size.

e. Explain what changed in these analyses.

9.44 Is it harder to get into graduate programs in psychology or history? We randomly selected five institutions from among all U.S. institutions with graduate programs. The first number for each is the minimum GPA for applicants to the psychology doctoral program, and the second is for applicants to the history doctoral program. These GPAs were posted on the Web site of the well-known college guide company Peterson's.

Wayne State University: 3.0, 2.75

University of Iowa: 3.0, 3.0

University of Nevada–Reno: 3.0, 2.75

George Washington University: 3.0, 3.0

University of Wyoming: 3.0, 3.0

a. The participants are not people; explain why it is appropriate to use a paired-samples t test for this situation.

b. Conduct all six steps of a paired-samples t test. Be sure to label all six steps.

c. Report the statistics as you would in a journal article.

9.45 Using the data provided in Exercise 9.44, calculate the effect size and explain what this adds to your analysis.

9.46 In Chapter 1, you were given an opportunity to complete the Stroop task, in which color words are printed in the wrong color; for example, the word red might be printed in the color blue. The conflict that arises when we try to read the words, but are distracted by the colors, increases our reaction time and decreases our accuracy. Several researchers have suggested that the Stroop effect can be decreased by hypnosis. Raz (2005) used brain-imaging techniques [i.e., functional magnetic resonance imaging (fMRI)] to demonstrate that posthypnotic suggestion led highly hypnotizable individuals to see Stroop words as nonsense words. Imagine that you are working with Raz and your assignment is to determine if reaction times decrease (remember, a decrease is a good thing; it indicates that participants are faster) when highly hypnotizable individuals receive a posthypnotic suggestion to view the


words as nonsensical. You conduct the experiment on six individuals, once in each condition, and receive the following data; the first number is reaction time in seconds without the posthypnotic suggestion, and the second number is reaction time with the posthypnotic suggestion:

Participant 1: (12.6, 8.5)
Participant 2: (13.8, 9.6)
Participant 3: (11.6, 10.0)
Participant 4: (12.2, 9.2)
Participant 5: (12.1, 8.9)
Participant 6: (13.0, 10.8)

a. What is the independent variable and what are its levels? What is the dependent variable?

b. Conduct all six steps of a paired-samples t test. Be sure to label all six steps.

c. Report the statistics as you would in a journal article.

9.47 Let's consider Exercise 9.46 on the Stroop task and posthypnotic suggestion. When we conduct a one-tailed test instead of a two-tailed test, there are small changes in steps 2 and 4 of hypothesis testing.

a. Conduct step 2 of hypothesis testing—stating the null and research hypotheses in words and in symbols—for a one-tailed test.

b. Conduct step 4 of hypothesis testing—determining the critical value and drawing the curve—for a one-tailed test.

c. Conduct step 6 of hypothesis testing—making a decision—for a one-tailed test.

d. Under which circumstance—a one-tailed or a two-tailed test—is it easier to reject the null hypothesis? Explain.

e. If it becomes easier to reject the null hypothesis under one type of test (one-tailed versus two-tailed), does this mean that there is a bigger mean difference between the samples? Explain.

9.48 When we change the p level that we use as a cutoff, it causes a small change in step 4 of hypothesis testing. Although 0.05 is the most commonly used p level, other levels, such as 0.01, are also often used. Let's consider Exercise 9.46 on the Stroop task and posthypnotic suggestion.

a. Conduct step 4 of hypothesis testing—determining the critical value and drawing the curve—for a p level of 0.01.

b. Conduct step 6 of hypothesis testing—making a decision—for a p level of 0.01.

c. With which p level—0.05 or 0.01—is it easiest to reject the null hypothesis? Explain.

d. If it is easier to reject the null hypothesis with certain p levels, does this mean that there is a bigger mean difference between the samples? Explain.

9.49 Changing the sample size can have an effect on the outcome of a hypothesis test. Consider Exercise 9.46 on the Stroop task and posthypnotic suggestion.

a. Calculate the test statistic using only participants 1–3.

b. Is this test statistic closer to or farther from the cutoff? Does reducing the sample size make it easier or more difficult to reject the null hypothesis? Explain.

Terms
t statistic (p. 206)
single-sample t test (p. 207)
degrees of freedom (p. 208)
paired-samples t test (p. 216)

Formulas

s = √(Σ(X − M)²/(N − 1)) (p. 202)
sM = s/√N (p. 205)
t = (M − μM)/sM (p. 206)
df = N − 1 (p. 208)
Mlower = −t(sM) + Msample (p. 214)
Mupper = t(sM) + Msample (p. 214)
Cohen's d = (M − μ)/s (p. 214)

Symbols
sM (p. 205)
t (p. 206)
df (p. 208)

