Chapter 11 Part I Answers Chapter 12 Notes Chapter 11 Part II Answers.

Post on 31-Dec-2015



Chapter 12: Sampling Surveys

How do we gather data?
• Surveys
• Opinion polls
• Interviews
• Studies
– Observational
• Retrospective (past)
• Prospective (future)
• Experiments

Population
• the entire group of individuals that we want information about

1.

Census
• Gathering data involving the entire population

2.

Why would we not use a census all the time?

1) Not accurate
2) Very expensive
3) Perhaps impossible
4) If using destructive sampling, you would destroy the population
• Breaking strength of soda bottles
• Lifetime of flashlight batteries
• Safety ratings for cars

Look at the U.S. census: it has a huge amount of error in it; plus it takes a long time to compile the data, making the data obsolete by the time we get it!

Suppose you wanted to know the average weight of the white-tail deer population in Texas. Would it be feasible to do a census?

Since taking a census of any population takes time, censuses are VERY costly to do!

3.

Sample
• A part of the population that we actually examine in order to gather information
• Use the sample to generalize to the population

4.

Why Do We Sample Anyway?

HOW GOOD ARE CURRENT INSPECTION SYSTEMS?

Printed below is a story which can be used to demonstrate the effectiveness of 100% inspection. Assume that the letter “G” or “g” is a defective product caused by the Gremlin, and that you are the inspector. Allow yourself about 3 minutes to count all the G’s or g’s. Place your total in the box at the bottom of the story.

Total number found:


Why do we sample?

Take a Census of all the “G” or “g” which appear in the story.

The actual count is…

83

Sampling design
• refers to the method used to choose the sample from the population

5.

Sampling frame
• a list of every individual in the population

6.

Simple Random Sample (SRS)
• consists of n individuals from the population chosen in such a way that
– every individual has an equal chance of being selected

Suppose we were to take an SRS of 50 SHS students. Put each student’s name in a hat, then randomly select 50 names from the hat. Each student has the same chance to be selected!

7.

Simple Random Sample (SRS)
• consists of n individuals from the population chosen in such a way that
– every individual has an equal chance of being selected
– every set of n individuals has an equal chance of being selected

Not only does each student have the same chance to be selected, but every possible group of 50 students has the same chance to be selected! Therefore, it has to be possible for all 50 students to be seniors in order for it to be an SRS!
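The names-in-a-hat procedure can be sketched in Python. This is only an illustrative sketch: the roster of 2000 numbered students and the function name `simple_random_sample` are assumptions made for the example, not part of the notes. `random.sample` draws without replacement, so every set of 50 students is equally likely to be drawn.

```python
import random

# Hypothetical sampling frame: students numbered 1..2000 (assumed roster size).
students = list(range(1, 2001))

def simple_random_sample(frame, n, seed=None):
    """Draw n individuals without replacement; every set of n is equally likely."""
    rng = random.Random(seed)
    return rng.sample(frame, n)

srs = simple_random_sample(students, 50, seed=1)
print(len(srs), len(set(srs)))  # sample size and number of distinct students
```

Because no grade level is forced into the sample, a draw made up entirely of seniors is possible, just very unlikely.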

7.

Jelly Blubber Activity
• Marine biologists have just discovered a new variety of jellyfish called the Jelly Blubber
• We are to study a colony of Jelly Blubbers and determine their average length (measured horizontally in centimeters)

Jelly Blubber Activity

• You will have 5 seconds to choose a Jelly Blubber that you think is of average length and then measure its length and report your results

• Why is this not an appropriate sampling method?

Jelly Blubber Activity

• This time you are to choose 5 Jelly Blubbers which are a representative sample of the colony. Measure each blubber and calculate the mean length.

• Why is this not an appropriate sampling method?

Jelly Blubber Activity

• Now take a Simple Random Sample (SRS) of 5 blubbers by generating 5 random numbers from 1 to 100.

• Measure each of the 5 random blubbers and find the mean length for your SRS.

What are the results of a census of the Jelly Blubber colony?

Average size of all 100 members of the colony is 18.64cm.

Mean 18.64cm; Standard Deviation 13.08cm; Median 13cm; IQR = 23.5cm

Jelly Blubber Activity

• Why is taking an SRS better than just selecting 5 blubbers on your own?



Stratified random sample
• population is divided into homogeneous groups called strata
• SRSs are pulled from each stratum

Homogeneous groups are groups that are alike based upon some characteristic of the group members.

8.

Suppose we were to take a stratified random sample of 50 SHS students. Since students are already divided by grade level, grade level can be our strata. Then randomly select some seniors, juniors, sophomores, and freshmen. How many of each depends on each grade's proportion of the population.

8.

Stratified random sample

If a high school is 20% Senior, 20% Junior, 30% Sophomore, & 30% Freshman, then a 50-student sample should include...

10 Seniors, 10 Juniors, 15 Sophomores, and 15 Freshmen. (Use an SRS within each stratum.)
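The arithmetic behind those counts is each stratum's percentage times the sample size. A minimal sketch, where the function name `proportional_allocation` and the dictionary layout are assumptions chosen for illustration:

```python
def proportional_allocation(sample_size, strata_percents):
    """Split sample_size across strata in proportion to each stratum's share.

    Note: with percentages that do not divide evenly, rounding can make the
    counts add up to slightly more or less than sample_size.
    """
    return {name: round(sample_size * pct / 100)
            for name, pct in strata_percents.items()}

school = {"Senior": 20, "Junior": 20, "Sophomore": 30, "Freshman": 30}
allocation = proportional_allocation(50, school)
print(allocation)  # {'Senior': 10, 'Junior': 10, 'Sophomore': 15, 'Freshman': 15}
```

After computing the counts, an SRS of that size would still be drawn separately within each grade.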

8.

Stratified random sample

ON YOUR OWN: If a high school is 10% Senior, 20% Junior, 40% Sophomore, & 30% Freshman, then a 30-student sample should include...

8.


Systematic random sample
• select sample by following a systematic approach
• randomly select where to begin

Suppose we want to do a systematic random sample of SHS students. Number a list of students. (There are approximately 2000 students. If we want a sample of 50, 2000/50 = 40.) Select a number between 1 and 40 at random. That student will be the first student chosen, then choose every 40th student from there.

Suppose we want to do a systematic random sample of SHS students

- NUMBER a list of students (sampling frame)

- CALCULATE grouping size: 2000 students, need a sample of 50, so 2000/50 = 40

- USE the grouping size: Select a number between 1 and 40 at random. That student will be the first student chosen, then choose every 40th student from there.

Systematic random sample

What if it doesn’t work evenly? Say there are 2011 students.

2011/50 = 40 r. 11

Your starting place will be chosen by randomly selecting a number between 1 & 51 instead of 1 & 40. From there, choose every 40th student from your sampling frame.
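Both cases (2000 and 2011 students) follow the same rule: the grouping size is the whole-number quotient, and the start can be any position that still leaves room for all 50 picks. The sketch below is one way to express that; the function name `systematic_sample` and the seed are assumptions for illustration.

```python
import random

def systematic_sample(population_size, sample_size, seed=None):
    """Return the 1-based list positions chosen by a systematic random sample."""
    k = population_size // sample_size  # grouping size (40 in both examples)
    # Largest start that still leaves room for sample_size picks spaced k apart:
    # 2000 - 49*40 = 40, and 2011 - 49*40 = 51.
    max_start = population_size - (sample_size - 1) * k
    start = random.Random(seed).randint(1, max_start)
    return [start + i * k for i in range(sample_size)]

picks = systematic_sample(2011, 50, seed=3)
print(picks[0], picks[-1], len(picks))
```

With 2011 students the remainder of 11 is what widens the starting range from 1 to 40 out to 1 to 51.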

9.

Systematic random sample

ON YOUR OWN: You want to gather a sample from 1505 students systematically. Your sample size needs to be 30. What do you do?

9.


Cluster Sample
• based upon heterogeneous groups, each of which is representative of the population
• randomly pick a cluster or clusters
• take an SRS of that cluster(s)

Suppose we want to do a cluster sample of SHS students. One way to do this would be to randomly select classrooms during 2nd period. Perform an SRS of the students in those rooms!
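The classroom example can be sketched as two random draws: first rooms, then students within the chosen rooms. The room names, the counts (80 rooms of 25 students, 5 rooms chosen, 10 students per room), and the function name `cluster_sample` are all hypothetical values picked for the illustration.

```python
import random

def cluster_sample(classrooms, n_clusters, n_per_cluster, seed=None):
    """Randomly pick whole classrooms, then take an SRS inside each one."""
    rng = random.Random(seed)
    chosen_rooms = rng.sample(sorted(classrooms), n_clusters)
    return {room: rng.sample(classrooms[room], n_per_cluster)
            for room in chosen_rooms}

# Hypothetical 2nd-period rosters: 80 rooms with 25 students each.
classrooms = {f"Room {r}": [f"R{r}-S{s}" for s in range(1, 26)]
              for r in range(1, 81)}

cluster = cluster_sample(classrooms, n_clusters=5, n_per_cluster=10, seed=2)
for room, chosen in cluster.items():
    print(room, chosen)
```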

10.


Multistage sample

•select successively smaller groups within the population in stages

•SRS used at each stage

To use a multistage approach to sampling SHS students, we could first divide 2nd period classes by level (AP, Honors, Regular, etc.) and randomly select 4 second period classes from each group. Then we could randomly select 5 students from each of those classes. The selection process is done in stages!
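The staged selection described above can be sketched as nested random draws. The three levels, ten classes per level, and 25 students per class are hypothetical figures, and `multistage_sample` is a name invented for this example; only the "4 classes per level, 5 students per class" choice comes from the notes.

```python
import random

def multistage_sample(classes_by_level, classes_per_level, students_per_class, seed=None):
    """Stage 1: pick classes within each level. Stage 2: pick students in each class."""
    rng = random.Random(seed)
    sample = []
    for level, classes in classes_by_level.items():
        for class_name in rng.sample(sorted(classes), classes_per_level):
            sample.extend(rng.sample(classes[class_name], students_per_class))
    return sample

# Hypothetical rosters: 3 levels, 10 classes per level, 25 students per class.
levels = ["AP", "Honors", "Regular"]
classes_by_level = {
    level: {f"{level}-{c}": [f"{level}-{c}-student{s}" for s in range(1, 26)]
            for c in range(1, 11)}
    for level in levels
}

staged = multistage_sample(classes_by_level, classes_per_level=4,
                           students_per_class=5, seed=4)
print(len(staged))  # 3 levels * 4 classes * 5 students = 60
```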

11.

Identify the sampling design

a) The Educational Testing Service (ETS) needed a sample of colleges. ETS first divided all colleges into groups of similar types (small public, small private, etc.). Then they randomly selected 3 colleges from each group.

Stratified random sample

b) A county commissioner wants to survey people in her district to determine their opinions on a particular law up for adoption. She decides to randomly select blocks in her district and then survey all who live on those blocks.

Cluster sampling

c) A local restaurant manager wants to survey customers about the service they receive. Each night the manager randomly chooses a number between 1 & 10. He then gives a survey to that customer, and to every 10th customer after them, to fill out before they leave.

Systematic random sampling

12.

Suppose your population consisted of these 20 people:

1) Aidan     6) Fred      11) Kathy     16) Paul
2) Bob       7) Gloria    12) Lori      17) Shawnie
3) Chico     8) Hannah    13) Matthew   18) Tracy
4) Doug      9) Israel    14) Nan       19) Uncle Sam
5) Edward   10) Jung      15) Opus      20) Vernon

Use the following random digits to select a sample of five from these people.

Row
1    4 5 1 8 0 5 1 3 7 1
2    0 1 5 5 1 8 1 5 7 0
3    8 9 9 3 4 3 5 0 6 3

We will need to use double-digit random numbers, ignoring any number greater than 20.

Start with Row 1 and read across.

45   Ignore.
18   18) Tracy
05   5) Edward
13   13) Matthew
71   Ignore.
01   1) Aidan
55   Ignore.
18   Repeat.
15   15) Opus

Stop when five people are selected. So my sample would consist of:

Aidan, Edward, Matthew, Opus, and Tracy
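The same walk through the digit table can be written out in Python. This sketch assumes the flattened digit line splits into three rows of ten digits each (row labels 1, 2, 3); the function name `select_from_digits` is made up for the example.

```python
ROWS = ["4518051371", "0155181570", "8993435063"]  # rows 1-3 of the digit table
PEOPLE = ["Aidan", "Bob", "Chico", "Doug", "Edward", "Fred", "Gloria",
          "Hannah", "Israel", "Jung", "Kathy", "Lori", "Matthew", "Nan",
          "Opus", "Paul", "Shawnie", "Tracy", "Uncle Sam", "Vernon"]

def select_from_digits(rows, population_size, sample_size):
    """Read two-digit labels across the rows, skipping out-of-range and repeated labels."""
    digits = "".join(rows)
    chosen = []
    for i in range(0, len(digits) - 1, 2):
        label = int(digits[i:i + 2])
        if 1 <= label <= population_size and label not in chosen:
            chosen.append(label)
        if len(chosen) == sample_size:
            break
    return chosen

labels = select_from_digits(ROWS, population_size=20, sample_size=5)
names = sorted(PEOPLE[n - 1] for n in labels)
print(labels)  # [18, 5, 13, 1, 15]
print(names)   # ['Aidan', 'Edward', 'Matthew', 'Opus', 'Tracy']
```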

13.

Bias
• ERROR
• favors certain outcomes

Anything that causes the data to be wrong! It might be attributed to the researchers, the respondent, or the sampling method!

14.

Sources of Bias

• things that can cause bias in your sample

•cannot do anything with bad data

15.

Voluntary response

• People choose to respond
• Usually only people with very strong opinions respond
• Produces biased results

An example would be the surveys in magazines that ask readers to mail in the survey. Other examples are call-in shows, American Idol, etc.

Remember, the respondents select themselves to participate in the survey!

Remember, the way to determine voluntary response is: Self-selection

16.

Convenience sampling

• Ask people who are easy to ask
• Produces biased results

An example would be stopping friendly-looking people in the mall to survey. Another example is the surveys left on tables at restaurants: a convenient method!

The data obtained by a convenience sample will be biased. However, this method is often used for surveys & results reported in newspapers and magazines!

17.

Undercoverage

• some groups of the population are left out of the sampling process

Suppose you take a sample by randomly selecting names from the phone book. Some groups will not have the opportunity of being selected!

People with unlisted phone numbers – usually high-income families

People without phone numbers – usually low-income families

People with ONLY cell phones – usually young adults

18.

Nonresponse
• occurs when an individual chosen for the sample can’t be contacted or refuses to cooperate
• telephone surveys: 70% nonresponse

People are chosen by the researchers, BUT refuse to participate.

NOT self-selected!

This is often confused with voluntary response!

Because of huge telemarketing efforts in the past few years, telephone surveys have a MAJOR problem with nonresponse!

One way to help with the problem of nonresponse is to make follow-up contact with the people who are not home when you first contact them.

19.

Response bias

• occurs when anything in the survey design influences the response
– The interviewer can be a cause
– The survey’s wording
• Wording must be neutral

Suppose we wanted to survey high school students on drug abuse and we used a uniformed police officer to interview each student in our sample. Would we get honest answers?

20.

Source of Bias?

a) Before the presidential election of 1936, which pitted FDR against Republican Alf Landon, the magazine Literary Digest predicted that Landon would win the election by a 3-to-2 margin, based on a survey of 10 million people. George Gallup surveyed only 50,000 people and predicted that Roosevelt would win. The Digest’s survey came from magazine subscribers, car owners, telephone directories, etc.

Undercoverage – since the Digest’s survey came from car owners, etc., the people selected were mostly from high-income families and thus mostly Republican! (other answers are possible)

21.

b) Suppose that you want to estimate the total amount of money spent by students on textbooks each semester at Rice. You collect register receipts for students as they leave the bookstore during lunch one day.

Convenience sampling – easy way to collect data

or

Undercoverage – students who buy books from on-line bookstores are not included.

21.

c) To find the average value of a home in Friendswood, one averages the price of homes that are listed for sale with a realtor.

Undercoverage – leaves out homes that are not for sale or homes that are listed with different realtors. (other answers are possible)

21.

Page 289 #2

A question posted on the Lycos Web site on 18 June 2000 asked visitors to the site to say whether they thought that marijuana should be legally available for medicinal purposes.

Identify the following items (if possible). If you can’t tell, then say so – this often happens when we read about a survey.

a) The population

all U.S. adults

b) The population parameter of interest

proportion that feels marijuana should be legalized for medicinal purposes

c) The sampling frame

none given – potentially all people with access to the web site

d) The sample

those visiting the web site who responded

e) The sampling method, including whether or not randomization was employed

voluntary response (no randomization employed)

f) Any potential sources of bias you can detect and any problems you see in generalizing to the population of interest

voluntary response (no randomization employed)

22.

Random Rectangles


Population Parameter= 7.5

Chapter 11 Part I

5&6. Explain why each of the following simulations fails to model the real situation properly.

a) Use a random integer 0 through 9 to represent the number of heads that appear when 9 coins are tossed.

b) A basketball player takes a foul shot. Look at a random digit, using an odd digit to represent a good shot and an even digit to represent a miss.

c) Use five random digits from 1 through 13 to represent the denominations of the cards in a poker hand.

d) Use random numbers 2 through 12 to represent the sum of the faces when two dice are rolled.

e) Use a random integer 0 through 5 to represent the number of boys in a family of 5 children.

f) Simulate a baseball player’s performance at bat by letting 0 = an out, 1 = a single, 2 = a double, 3 = a triple, and 4 = a home run.

Chapter 11 Part I

9. You’re pretty sure that your candidate for class president has about 55% of the votes in the entire school. But you’re worried that only 100 students will show up to vote. How often will the underdog (the one with 45% support) win? To find out you set up a simulation.

a) Describe how you will simulate a component and its outcomes.

b) Describe how you will simulate a trial.

c) Describe the response variable.

Chapter 11 Part I

10. When drawing five cards randomly from a deck, which is more likely, two pairs or three of a kind? A pair is exactly two of the same denomination. (Don’t count three 8’s as a pair – that’s 3 of a kind. And don’t count 4 of the same kind as two pair – that’s four of a kind, a very special hand.) How could you simulate 5-card hands? Be careful; once you’ve picked the 8 of spades for a hand, you can’t get it again until the next hand.

a) Describe how you will simulate a component and its outcomes.

b) Describe how you will simulate a trial.

c) Describe the response variable.

Chapter 11 Part I

11. Suppose a cereal manufacturer puts pictures of famous athletes on cards in boxes of cereal in the hope of boosting sales. The manufacturer announces that 20% of the boxes contain a picture of Tiger Woods, 30% a picture of Lance Armstrong, and the rest a picture of Serena Williams. Suppose you buy five boxes of cereal. Estimate the probability that you end up with a complete set of the pictures. Your simulation should use at least 10 runs.

A component is… checking one box of cereal for the picture inside.

I’ll look at a one-digit random number.
Let 0-1 represent a box with… Tiger Woods
Let 2-4 represent a box with… Lance Armstrong
Let 5-9 represent a box with… Serena Williams

Each trial consists of… checking 5 boxes, which is represented by 5 digits

The response variable is… whether or not the 5 boxes had at least one of each athlete (a.k.a. “a complete set”)

Conclusion: According to our simulation the probability that you end up with a complete set of pictures after checking 5 boxes is _________. However, it should be noted that only 10 trials were run.

Chapter 11 Part I

#1: TABLE OF RANDOM NUMBERS

78545 49201 05329 14182 10971 90472 44682 39304 19819 55799

Trial #   OUTCOMES               Complete Set?
 1        7-W 8-W 5-W 4-A 5-W    NO
 2        4-A 9-W 2-A 0-T 1-T    YES
 3        0-T 5-W 3-A 2-A 9-W    YES
 4        1-T 4-A 1-T 8-W 2-A    YES
 5        1-T 0-T 9-W 7-W 2-A    YES
 6        9-W 0-T 4-A 7-W 2-A    YES
 7        4-A 4-A 6-W 8-W 2-A    NO
 8        3-A 9-W 3-A 0-T 4-A    YES
 9        1-T 9-W 8-W 1-T 9-W    NO
10        5-W 5-W 7-W 9-W 9-W    NO

60% CHANCE OF COMPLETE SET

Chapter 11 Part I

#2: TABLE OF RANDOM NUMBERS

72749 13347 65030 26128 49067 27904 49953 74674 94617 13317

Trial #   OUTCOMES               Complete Set?
 1        7-W 2-A 7-W 4-A 9-W    NO
 2        1-T 3-A 3-A 4-A 7-W    YES
 3        6-W 5-W 0-T 3-A 0-T    YES
 4        2-A 6-W 1-T 2-A 8-W    YES
 5        4-A 9-W 0-T 6-W 7-W    YES
 6        2-A 7-W 9-W 0-T 4-A    YES
 7        4-A 9-W 9-W 5-W 3-A    NO
 8        7-W 4-A 6-W 7-W 4-A    NO
 9        9-W 4-A 6-W 1-T 7-W    YES
10        1-T 3-A 3-A 1-T 7-W    YES

70% CHANCE OF COMPLETE SET
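The tallying for table #2 can be checked with a short Python sketch that applies the digit assignment from problem 11 (0-1 Tiger Woods, 2-4 Lance Armstrong, 5-9 Serena Williams). The helper names `athlete` and `complete_set` are invented for this illustration.

```python
def athlete(digit):
    """Map one random digit to the athlete it represents (T, A, or W)."""
    if digit <= 1:
        return "T"   # Tiger Woods: digits 0-1 (20%)
    if digit <= 4:
        return "A"   # Lance Armstrong: digits 2-4 (30%)
    return "W"       # Serena Williams: digits 5-9 (50%)

def complete_set(trial_digits):
    """A trial is five digits; it succeeds if all three athletes appear."""
    return {athlete(int(d)) for d in trial_digits} == {"T", "A", "W"}

# Random numbers from table #2, one five-digit group per trial.
TABLE_2 = "72749 13347 65030 26128 49067 27904 49953 74674 94617 13317".split()

results = [complete_set(trial) for trial in TABLE_2]
estimate = sum(results) / len(results)
print(estimate)  # 0.7
```

Ten trials give only a rough estimate, which is why the four tables report values ranging from 40% to 70%.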

Chapter 11 Part I

#3: TABLE OF RANDOM NUMBERS

11071 44430 94664 91294 35163 05494 32882 23904 41340 61185

Trial #   OUTCOMES               Complete Set?
 1        1-T 1-T 0-T 7-W 1-T    NO
 2        4-A 4-A 4-A 3-A 0-T    NO
 3        9-W 4-A 6-W 6-W 4-A    NO
 4        9-W 1-T 2-A 9-W 4-A    YES
 5        3-A 5-W 1-T 6-W 3-A    YES
 6        0-T 5-W 4-A 9-W 4-A    YES
 7        3-A 2-A 8-W 8-W 2-A    NO
 8        2-A 3-A 9-W 0-T 4-A    YES
 9        4-A 1-T 3-A 4-A 0-T    NO
10        6-W 1-T 1-T 8-W 5-W    NO

40% CHANCE OF COMPLETE SET

Chapter 11 Part I

#4: TABLE OF RANDOM NUMBERS

42831 95113 43511 42082 15140 34733 68076 18292 69486 80468

Trial #   OUTCOMES               Complete Set?
 1        4-A 2-A 8-W 3-A 1-T    YES
 2        9-W 5-W 1-T 1-T 3-A    YES
 3        4-A 3-A 5-W 1-T 1-T    YES
 4        4-A 2-A 0-T 8-W 2-A    YES
 5        1-T 5-W 1-T 4-A 0-T    YES
 6        3-A 4-A 7-W 3-A 3-A    NO
 7        6-W 8-W 0-T 7-W 6-W    NO
 8        1-T 8-W 2-A 9-W 2-A    YES
 9        6-W 9-W 4-A 8-W 6-W    NO
10        8-W 0-T 4-A 6-W 8-W    YES

70% CHANCE OF COMPLETE SET

Chapter 11 Part II

12. Suppose a cereal manufacturer puts pictures of famous athletes on cards in boxes of cereal in the hope of boosting sales. The manufacturer announces that 20% of the boxes contain a picture of Tiger Woods, 30% a picture of Lance Armstrong, and the rest a picture of Serena Williams. Suppose you really want the Tiger Woods picture. How many boxes of cereal do you need to buy to be pretty sure of getting at least one? Your simulation should use at least 10 runs.

Chapter 11 Part II

14. A friend of yours got all 6 questions right on a multiple choice quiz, but now claims to have guessed blindly on every question. If each question offered 4 possible answers, do you believe her? Explain, basing your argument on a simulation involving 10 runs. (Make sure that you remember to define your simulation first. That means give the component, outcomes, trial, and response variable first. Then run 10 trials, analyze your response variable, and write your conclusion.) Use the following table for your simulation.

Chapter 11 Part II

19. You are about to take the road test for your driver’s license. You hear that only 34% of candidates pass the test the first time, but the percentage rises to 72% on subsequent retests. Estimate the average number of tests drivers take in order to get a license. Your simulation should use 10 runs.

Chapter 11 Part II

25. Many couples want to have both a boy and a girl. If they decide to continue to have children until they have one child of each gender, what would the average family size be? Assume that boys and girls are equally likely. (Make sure that you remember to define your simulation first. That means give the component, outcomes, trial, and response variable first. Then run 10 trials, analyze your response variable, and write your conclusion.) Use the following table for your simulation.