
Post on 03-Jan-2016


Chapter 5

Modeling Variation with Probability

Calculated Risks...

Our lives are full of false-positives. When the smoke alarm goes off because you've burned something on the stove, that's a false-positive. It's positive because an alarm goes off alerting you to a danger. It's false because your house is not actually burning down.

After September 11th, the Transportation Security Administration (TSA) was established and charged with installing a screening system for airports that would detect weapons and bombs on individuals or in baggage. Since January 1, 2003, TSA has been screening all checked luggage.

Calculated Risks...

The machines for checking baggage are costly, over $1 million per machine. Unfortunately, the technology is not perfect. Shampoo, for example, which has the same density as certain explosives, can be mistaken for explosives and generate a false-positive.

Other items that produce false-positives are certain food items (like cheese or chocolate), books, deodorant sticks, and toothpaste. The machines also flag luggage that has items the scanner can't see through, such as laptops, camera equipment, and cell phones. TSA screeners will hand-search bags that register a positive reading.

Calculated Risks...

• In the context of the problem, what are these and which of the following is most serious?

• True negative

• True positive

• False negative

• False positive

Calculated Risks...

• True negative: machine says no explosives/weapons & there really aren’t any explosives/weapons

• True positive: machine says explosives/weapons & there really are explosives/weapons

• False negative: machine says no explosives/weapons & there really are explosives/weapons

• False positive: machine says explosives/weapons & there really aren’t any explosives/weapons

Calculated Risks...

Which is most serious in the context of this situation?

• True negative: machine says no explosives/weapons & there really aren’t any explosives/weapons

• True positive: machine says explosives/weapons & there really are explosives/weapons

• False negative: machine says no explosives/weapons & there really are explosives/weapons

• False positive: machine says explosives/weapons & there really aren’t any explosives/weapons

Probability...

• Probability calculations are the basis for inference (making decisions about a population based on a sample).

• What we learn in this chapter will help us describe statistics from random samples & randomized comparative experiments later in the course.

1-in-6 game…

As a special promotion for its 20-ounce bottles of soda, a soft drink company printed a message on the inside of each bottle cap. Some of the caps said, “Please try again!” while others said, “You’re a winner!” The company advertised the promotion with the slogan “1 in 6 wins a prize.” The prize is a free 20-ounce bottle of soda, which comes out of the store owner’s profits.

Seven friends each buy one 20-ounce bottle at a local convenience store. The store clerk is surprised when three of them win a prize. The store owner is concerned about losing money from giving away too many free sodas. She wonders if this group of friends is just lucky or if the company’s 1-in-6 claim is inaccurate. In this Activity, you and your classmates will perform a simulation to help answer this question.

1-in-6 game…

For now, let’s assume that the company is telling the truth, and that every 20-ounce bottle of soda it fills has a 1-in-6 chance of getting a cap that says, “You’re a winner!” We can model the status of an individual bottle with a six-sided die: let 1 through 5 represent “Please try again!” and 6 represent “You’re a winner!”

1-in-6 game…

1. Roll your die seven times to imitate the process of the seven friends buying their sodas. How many of them won a prize? Repeat 3 times.

2. Write your three results on the board. Using Minitab, input the data and create a dot plot displaying the number of prize winners you got in Step 1 on the graph.

3. What percent of the time did the friends come away with three or more prizes, just by chance? Does it seem plausible that the company is telling the truth, but that the seven friends just got lucky? Explain.
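The dice activity above can also be run many more times on a computer. The sketch below is only an illustration: the function names, the number of trials, and the fixed seed are assumptions, and Python's pseudo-random generator stands in for the six-sided die. The exact value uses the binomial formula, which comes later in the course, purely as a check on the simulation.

```python
import random
from math import comb

def simulate_three_or_more(trials=10_000, friends=7, seed=1):
    """Estimate P(3 or more winners among 7 bottles) if each wins with chance 1/6."""
    rng = random.Random(seed)  # fixed seed (an assumption) so the run is repeatable
    hits = 0
    for _ in range(trials):
        # One die roll per friend; a 6 stands for "You're a winner!"
        winners = sum(1 for _ in range(friends) if rng.randint(1, 6) == 6)
        if winners >= 3:
            hits += 1
    return hits / trials

def exact_three_or_more(friends=7, p=1 / 6):
    """Exact probability of 3 or more winners (binomial formula, used only as a check)."""
    p_two_or_fewer = sum(
        comb(friends, k) * p**k * (1 - p) ** (friends - k) for k in range(3)
    )
    return 1 - p_two_or_fewer

print(round(exact_three_or_more(), 4))  # 0.0958
print(simulate_three_or_more())         # should land close to the exact value
```

Under the company's claim, three or more winners out of seven happens a bit less than 10% of the time: unusual, but not impossible.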

Whose book is this?

Suppose that four friends (including Ariana Grande) get together to study at a doughnut shop for their next test in high school statistics.

When they leave their table to go get a doughnut, the doughnut shop owner decides to mess with them (you know… because of Ariana’s recent doughnut scandal) and makes a tower using their textbooks. Unfortunately, none of the students wrote their name in their book, so when they leave the doughnut shop, each student takes one of the books at random.

When the students return the books at the end of the year and the clerk scans their barcodes, the students are surprised to learn that none of the four had their own book. How likely is it that none of the four students ended up with the correct book? … simulation time!

On four equally-sized slips of paper, write “Student 1,” “Student 2,” “Student 3,” and “Student 4.” Likewise, on four equally-sized slips of paper, write “Book 1,” “Book 2,” “Book 3,” and “Book 4.” Place the four papers with the student numbers on your desk. Then shuffle the papers with book numbers and randomly place one paper on each “student.” If the book number matches the student number, this represents a student choosing his own book from the tower of textbooks.

Count the number of students who get the correct book. Repeat this process three times. Then write your results on the board. Input the data and create a dot plot in Minitab. How likely is it for none of the students to end up with their own book?

What if we were to do this entire simulation again? Would you expect to get exactly the same results? Why or why not?
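The paper-slip shuffle can be imitated in code. This is a sketch under assumptions: the function names, trial count, and seed are made up for illustration, and `shuffle` plays the role of mixing the book slips. Because there are only 24 ways to hand out four books, the second function simply lists them all as a check.

```python
import random
from itertools import permutations

def simulate_no_match(trials=10_000, students=4, seed=1):
    """Estimate the chance that no student ends up with his or her own book."""
    rng = random.Random(seed)  # fixed seed is an assumption, for repeatability
    none_correct = 0
    for _ in range(trials):
        books = list(range(students))
        rng.shuffle(books)  # book slips dealt at random onto the students
        matches = sum(1 for student, book in enumerate(books) if student == book)
        if matches == 0:
            none_correct += 1
    return none_correct / trials

def exact_no_match(students=4):
    """List every possible way to hand out the books and count those with no match."""
    arrangements = list(permutations(range(students)))
    none_correct = sum(
        1 for arr in arrangements if all(s != b for s, b in enumerate(arr))
    )
    return none_correct / len(arrangements)

print(exact_no_match())     # 0.375 (9 of the 24 arrangements)
print(simulate_no_match())  # should be close to 0.375
```

Changing the seed gives slightly different simulated results each time, which is the point of the question above.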

Investigating Randomness… & More Simulation

• Pretend that you are flipping a fair coin. Without actually flipping a coin, imagine the first toss. Write down the result you see in your mind, heads (H) or tails (T), below.

• Imagine a second coin flip. Write down the result below.

• Keep doing this until you have recorded the results of 25 imaginary flips. Write all 25 of your results in groups of 5 to make them easier to read, like this: HTHTH TTHHT, etc.

Investigating Randomness… & More Simulation…

• A run is a repetition of the same result. In the previous example, there is a run of two tails followed by a run of two heads in the first 10 coin flips. Read through your 25 imagined coin flips that you wrote above and find the longest run (doesn’t matter if it was heads or tails; just your longest run).

• On the board, write the length of the longest run you wrote (within your 25 values). Input into Minitab and create a dot plot of the class’s data.

Investigating Randomness… & More Simulation

• Now, use a random digits table, technology, or a coin to generate a similar list of 25 coin flips. Find the longest run that you have.

• Now let’s create another dot plot with this new data from the class. Plot the length of the longest run you got above.
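If technology is used for the 25 flips, the longest run can be found automatically. The sketch below assumes a class of 30 students (a made-up number) and prints a rough text dot plot; the function names and seed are also assumptions.

```python
import random
from collections import Counter
from itertools import groupby

def longest_run(flips):
    """Length of the longest streak of identical results in a sequence of flips."""
    return max(len(list(group)) for _, group in groupby(flips))

def simulate_longest_runs(students=30, flips=25, seed=1):
    """Each simulated 'student' flips 25 times and reports the longest run."""
    rng = random.Random(seed)
    runs = []
    for _ in range(students):
        sequence = [rng.choice("HT") for _ in range(flips)]
        runs.append(longest_run(sequence))
    return Counter(runs)

# The first ten flips from the earlier example: longest run is 2.
print(longest_run("HTHTHTTHHT"))  # 2

# Text version of a dot plot: one star per simulated student.
for length, count in sorted(simulate_longest_runs().items()):
    print(f"{length:2d} " + "*" * count)
```

Comparing this plot with the plot of imagined flips usually shows that real randomness produces longer runs than people expect.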

Randomness…

• The idea of probability is that randomness is predictable in the long run. Unfortunately, our intuition about randomness tries to tell us that random phenomena should also be predictable in the short run.

• Probability Applet (www.whfreeman.com/tps5e)

Random Phenomenon...

We call an event ‘random’ if individual outcomes are uncertain but there is nonetheless a regular distribution of outcomes in a large number of repetitions.

Big Idea

Chance behavior (random phenomenon) is unpredictable in the short run, but has a regular and predictable pattern in the long run.

Individual outcomes are uncertain; but a regular distribution of outcomes emerges in a large number of repetitions.

The probability of any outcome of a random phenomenon is the proportion of times the outcome would occur in a very long series of repetitions. Probability is a long-term relative frequency (simulations are very helpful).
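The "long-term relative frequency" idea can be watched directly by tracking the proportion of heads as the number of flips grows. This is an illustrative sketch; the function name, the number of flips, and the seed are assumptions.

```python
import random

def running_proportion(flips, seed=1):
    """Proportion of heads after 1, 2, ..., `flips` simulated fair coin flips."""
    rng = random.Random(seed)
    heads = 0
    proportions = []
    for n in range(1, flips + 1):
        if rng.random() < 0.5:  # treat a value below 0.5 as heads
            heads += 1
        proportions.append(heads / n)
    return proportions

props = running_proportion(10_000)
for n in (10, 100, 1_000, 10_000):
    print(n, round(props[n - 1], 3))
```

The early proportions can wander far from 0.5 (short-run unpredictability), while the later ones settle near 0.5 (long-run regularity).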

Probability vs. Odds

• Probability = successes / total

• Odds = successes / failures
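A small sketch of the difference between the two, using the 1-in-6 soda promotion as the example (1 winning cap for every 5 losing caps). The function names are assumptions chosen for illustration.

```python
from fractions import Fraction

def probability(successes, failures):
    """Probability = successes / total, where total = successes + failures."""
    return Fraction(successes, successes + failures)

def odds(successes, failures):
    """Odds = successes / failures."""
    return Fraction(successes, failures)

def odds_from_probability(p):
    """Convert a probability p (less than 1) into odds p / (1 - p)."""
    return p / (1 - p)

print(probability(1, 5))                      # 1/6
print(odds(1, 5))                             # 1/5
print(odds_from_probability(Fraction(1, 6)))  # 1/5
```

So "1 in 6 wins" is a probability of 1/6 but odds of 1 to 5.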

Careful...

• It makes no sense to discuss the probability of an event that has already occurred.

• Meaningless to ask what the probability is of an already-flipped coin being a tail. It’s already been decided.

• Probability: future event

• Statistics: past event

Definition: Simulation is...

• the imitation of chance behavior, based on a model that accurately reflects the phenomenon under consideration.

• Examples include...

Simulation...

• Why would we want to simulate a situation (rather than carry the event out in reality)?

• Discuss with a partner for one minute.

Simulation… model must match situation...

• What model could we use to simulate the probability of a soon-to-be new-born baby being a girl or a boy?

Simulation...

What couldn’t be used as a model to simulate this situation?

Discuss for one minute.

Simulation...

• ... can be an effective tool/method for finding the likelihood of complex results IF you have a trustworthy model.

• If not (if model does not correctly describe the random phenomenon), probabilities derived from model will also be incorrect/worthless.

Simulation Steps...

• State. Ask a question of interest about some chance process

• Plan. Describe how to use a chance device to imitate one repetition of the process. Tell what you will record at the end of each repetition

• Do. Perform many repetitions of the simulation

• Conclude. Use the results of your simulation to answer the question of interest

• Do following simulation if time permits

Simulation: Should I guess?

State – Plan – Do – Conclude

A multiple-choice test is scored as follows: For each question you answer correctly, you get 4 points. For each question you answer incorrectly, you lose 1 point. For simplicity suppose that there are 10 multiple-choice questions with four choices for each question.

Suppose Mr. Deming doesn’t know the answers to any of the questions, and he guesses on each one. Use simulation methods to determine Mr. Deming’s expected score.

Should I guess? (let’s use SPDC)

State – Plan – Do – Conclude

• Probability of guessing correctly is 0.25. Let digits 00 to 24 correspond to a correct solution & digits 25 to 99 correspond to an incorrect solution.

• Teams of two; Simulate 2 or 3 trials; write your results on board; class will create a graphical representation of all our results

• State conclusion.

• Expected (theoretical) score is 2.5
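The Plan step above (two-digit numbers 00 to 24 mean a correct guess, 25 to 99 an incorrect one) can be sketched in code. The function names, number of trials, and seed are assumptions; the scoring (+4 for correct, -1 for incorrect, 10 questions) comes from the problem statement.

```python
import random

def simulate_test_score(rng, questions=10):
    """Score one fully guessed test: +4 for a correct guess, -1 for an incorrect one."""
    score = 0
    for _ in range(questions):
        digits = rng.randint(0, 99)  # a two-digit random number, 00 through 99
        score += 4 if digits <= 24 else -1  # 00-24 represents a correct guess
    return score

def average_score(trials=20_000, seed=1):
    """Average score over many simulated tests."""
    rng = random.Random(seed)
    return sum(simulate_test_score(rng) for _ in range(trials)) / trials

print(average_score())  # should land near the theoretical 2.5
```

Individual simulated scores vary a lot from test to test; only the long-run average settles near 2.5.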

Should I guess?

Note: What if there were five choices, so a guess has a probability of 0.20 of being correct?

On average, students who guess at all ten questions would get two correct for a score of 2 x 4 = 8

But they miss, on average, eight of the ten questions so lose 8 points for these wrong answers.

Final score: 0
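Both expected scores (four choices and five choices) follow from the same arithmetic: the expected gain per question is 4 × P(correct) − 1 × P(incorrect). A short sketch, with a hypothetical function name and exact fractions to avoid rounding:

```python
from fractions import Fraction

def expected_score(questions, choices, reward=4, penalty=1):
    """Expected total score when every question is a pure guess."""
    p_correct = Fraction(1, choices)
    per_question = reward * p_correct - penalty * (1 - p_correct)
    return questions * per_question

print(expected_score(10, 4))  # 5/2
print(expected_score(10, 5))  # 0
```

With four choices guessing gains 2.5 points on average; with five choices the penalty exactly cancels the gain.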

Models …

• ŷ = a + bx, simulations, etc.

• Basis for all probability models:

• Sample space: list of all possible outcomes; can be very simple or very complex

• Event: a subset of sample space

Probability Models

• Accurately counting outcomes is critical in probability

• Example: all possibilities when rolling 1 red die and 1 green die

(1r, 1g), (1r, 2g), (1r, 3g), etc.

or

(1g, 1r), (1g, 2r), (1g, 3r), etc.

• Tossing a penny & a quarter

(hp, hq), (hp, tq), (tp, hq), (tp, tq)

and

(hq, hp), (hq, tp), (tq, hp), (tq, tp)
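Listing a sample space by hand is error-prone, so a short enumeration can serve as a check. The sketch below reads the two listings on the slide as the same outcomes written in a different order (an interpretation, not something the slide states outright); the variable names are illustrative.

```python
from itertools import product

faces = range(1, 7)

# Red die listed first, then green die: (1r, 1g), (1r, 2g), ...
red_first = [(f"{r}r", f"{g}g") for r, g in product(faces, faces)]
# Green die listed first: (1g, 1r), (1g, 2r), ...
green_first = [(f"{g}g", f"{r}r") for g, r in product(faces, faces)]

print(len(red_first))  # 36

# Ignoring the order inside each pair, both listings describe the same outcomes.
same_outcomes = {frozenset(o) for o in red_first} == {frozenset(o) for o in green_first}
print(same_outcomes)  # True

# Penny and quarter: 2 x 2 = 4 outcomes.
coins = [(f"{p}p", f"{q}q") for p, q in product("ht", "ht")]
print(coins)  # [('hp', 'hq'), ('hp', 'tq'), ('tp', 'hq'), ('tp', 'tq')]
```

Writing the green die or the quarter first relabels the outcomes; it does not add new ones, so the counts stay at 36 and 4.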

Probability

• Tree diagrams sometimes helpful tool

• Good graphical technique for listing the entire sample space when it is relatively small (not if you have, say, 2^10 outcomes in the sample space)

• Diagram: Flip a coin then roll a die

or

• Diagram: Roll a die then flip a coin

Sampling with/without Replacement

• Without Replacement: Choosing a card from a deck; keeping that card, then choosing another card

• These are not independent events; caution

• With Replacement: Choosing a card from a deck; putting that card back into the deck, then choosing another card

• Sometimes not possible; but a good general practice
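To see why the two sampling schemes differ, here is a sketch comparing the chance of drawing two aces from a standard 52-card deck each way. The two-aces question and the function name are illustrative assumptions, not part of the slide.

```python
from fractions import Fraction

def two_aces(replace):
    """P(both cards are aces) when drawing two cards from a 52-card deck."""
    first = Fraction(4, 52)
    # With replacement the deck is unchanged; without, one ace and one card are gone.
    second = Fraction(4, 52) if replace else Fraction(3, 51)
    return first * second

print(two_aces(replace=True))   # 1/169
print(two_aces(replace=False))  # 1/221
```

Without replacement the second draw depends on the first, which is why those draws are not independent.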

Probability Rules ...

1. All probabilities are values between 0 & 1 (remember density curves?)

Consider event A: 0 ≤ P(A) ≤ 1

2. Sum of probabilities of all outcomes = 1

S = sample space; P(S) = 1

Momentary detour...

Examples of disjoint/mutually exclusive events include:

• miss a bus; catch a bus

• play chess; sleep

• turn left; turn right

• sit down; stand up

Non-examples of disjoint/mutually exclusive events include:

• listen to music; do homework

• sleep; dream

Mutually Exclusive/Disjoint events are...

• Events that cannot happen simultaneously

• Other examples of mutually exclusive/disjoint events?

Another brief detour… “union”

* The union of any collection of events is the event that at least one of the events in the collection occurs.

* Symbol “U”

* P(A or B or C) = P(A U B U C)

Back to the Probability Rules ...

3. If 2 events have no outcomes in common (disjoint/mutually exclusive) then the probability of one or the other occurring is the sum of their individual probabilities.

P(A or B) = P(A) + P(B)

(Addition Rule for Disjoint Events)

Example: P (rolling a 2 or rolling an odd)

Non-example: P (rolling a 4 or rolling an even)

…more examples

P (A or B or C) = P(A U B U C) = P (A) + P (B) + P (C) only if events are disjoint

A: freshman P(A) = 0.30

B: sophomore P(B) = 0.35

C: junior P(C) = 0.20

D: senior P(D) = 0.15

All disjoint events.

P(B U C) =

P(A U D) =

P (A U B U C U D) =

Probability Rules ...

4. Probability that an event does not occur is one minus the probability that the event will occur (complement rule)

P(A^c) = 1 - P(A)

Example: P (person has brown hair) = 0.53

So, P (person does not have brown hair) = 1 – 0.53 = 0.47

What would A U A^c = ? What would A ∩ A^c = ?

Probability Rules ....

... one more probability rule later... Stay tuned ...

Probability Rules Practice

Distance learning courses are rapidly gaining popularity among college students. The probability of any age group is just the proportion of all distance learners in that age group. Here is the probability model:

Are rules 1 & 2 satisfied above?

Are the above groups mutually exclusive events? Why or why not?

P (18-23 yr & 30-39 yr) =

P (not being in 18-23 yr category) =

P (24-29 yr or 30-39 yr) =

Caution...

Be careful to apply the addition rule only to disjoint/mutually exclusive events

P (queen or heart) = 4/52 + 13/52 (??) .... not disjoint... this probability rule would not be correct in this case

More on this later…

Review/Preview...

Mutually Exclusive/Disjoint

• sleeping; playing chess

• walking; riding a bike

Overlapping Events (not mutually exclusive)

• roll an even; roll a prime

• select 12th grader; select athlete

• choose hard-cover book; choose fiction

What if ...

• What if events are not disjoint/mutually exclusive? i.e., they can occur simultaneously (overlapping events)

• How do we calculate P(A or B)?

Data Collection Time…

Which do you prefer?

                            Texting   Emailing   Social Media   Total
Apple Devices
Other (non-Apple Devices)
Total

General Addition Rule (disjoint or overlapping)

P (A or B) = P (A) + P (B) – P (A and B)

P (A U B) = P (A) + P (B) – P (A∩ B)

Pierced ears, anyone?

Find the probability that a given student:

• has pierced ears

• is a male

• is male and has pierced ears

• is male or has pierced ears

Moral of the story?

Be careful to apply the addition rule for mutually exclusive events only to disjoint/mutually exclusive events

P (queen or heart) = 4/52 + 13/52 .... not disjoint... counted queen of hearts twice

P (queen or heart) = 4/52 + 13/52 – 1/52 = 16/52

(think of a Venn diagram; overlap)
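The queen-or-heart calculation can be checked by counting cards directly. This sketch builds a standard 52-card deck and compares a direct count with the general addition rule; the helper names are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = list(product(ranks, suits))  # 52 (rank, suit) pairs

def prob(event):
    """Probability of an event = matching cards / total cards."""
    return Fraction(sum(1 for card in deck if event(card)), len(deck))

def is_queen(card):
    return card[0] == "Q"

def is_heart(card):
    return card[1] == "hearts"

direct_count = prob(lambda c: is_queen(c) or is_heart(c))
by_rule = prob(is_queen) + prob(is_heart) - prob(lambda c: is_queen(c) and is_heart(c))

print(direct_count)             # 4/13 (that is, 16/52)
print(direct_count == by_rule)  # True
```

Adding 4/52 and 13/52 without the correction would give 17/52, because the queen of hearts is counted twice.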

Venn Diagrams…

(a) Event A and (b) A, B mutually exclusive/disjoint

Venn Diagrams

(a) Intersection of A & B (and)

(b) Union of A, B (or)

Conditional Probability...

Remember... Probability assigned to an event can change if we know that some other event has occurred (“given”)

Conditional Probability...

P (A | B) is read “the probability of A given B”

P (female) =

versus

P (female | 15-17 years) =

Conditional Probability... caution

P (male | 18-24 yr) =

P (18-24 yr | male) =

Formula…

To find the conditional probability P (A | B), use

P (A | B) = P (A ∩ B) / P (B)

The conditional probability P (B | A) is given by

P (B | A) = P (A ∩ B) / P (A)

General Multiplication Rule for Any Two Events

The joint probability that events A and B both happen is P (A ∩ B) = P (A) P (B|A)

P (female and 15-17yr) = 89/16,639

P (A ∩ B) = P (A) P (B|A)

A = female, B = 15-17 years

= (9,321/16,639) x (89/9,321) = 89/16,639 ✓
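The check above can be repeated with exact fractions. The three counts (16,639 people in total, 9,321 females, 89 females aged 15-17) are the ones used on the slide; the variable names are illustrative.

```python
from fractions import Fraction

total = 16_639
female = 9_321
female_15_17 = 89

p_female = Fraction(female, total)                   # P(A)
p_15_17_given_female = Fraction(female_15_17, female)  # P(B | A)

# General multiplication rule: P(A and B) = P(A) * P(B | A)
p_joint = p_female * p_15_17_given_female
print(p_joint == Fraction(89, 16_639))  # True

# Rearranged, the same rule gives the conditional probability formula.
print(p_joint / p_female == p_15_17_given_female)  # True
```

The 9,321 cancels, which is why the product matches the direct count of 89 out of 16,639.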

Tree diagram…

About 27% of adult Internet users are 18 to 29 years old, another 45% are 30 to 49 years old, and the remaining 28% are 50 and over.

The Pew Internet and American Life Project finds that 70% of Internet users aged 18 to 29 have visited a video-sharing site, along with 51% of those aged 30 to 49 and 26% of those 50 or older.
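Each branch of the tree diagram multiplies an age-group probability by the visiting rate for that group. The sketch below computes one natural question, the overall chance that an adult Internet user has visited a video-sharing site, and then works backward to the share of visitors who are 18 to 29. Neither question is stated on the slide, so treat both as assumed examples; the percentages are the ones given above.

```python
from fractions import Fraction

age_share = {
    "18-29": Fraction(27, 100),
    "30-49": Fraction(45, 100),
    "50+": Fraction(28, 100),
}
visit_rate = {  # P(visited a video-sharing site | age group)
    "18-29": Fraction(70, 100),
    "30-49": Fraction(51, 100),
    "50+": Fraction(26, 100),
}

# One branch of the tree per age group: P(age and visited) = P(age) * P(visited | age)
branches = {age: age_share[age] * visit_rate[age] for age in age_share}

p_visit = sum(branches.values())
print(float(p_visit))  # 0.4913

# Working backward along the tree: P(18-29 | visited)
p_young_given_visit = branches["18-29"] / p_visit
print(round(float(p_young_given_visit), 4))  # 0.3847
```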

Review/Preview ....

Two events A & B are independent if knowing that one occurs does not change the probability that the other occurs.

Examples:

- Roll a die twice. What I roll the first time does not change the probability of what I will roll the second time.

- Win at chess; win the lottery

- Student on debate team; student on swim team

So, if events A and B are independent, then P (A|B) = P(A) and likewise P (B|A) = P (B).

A = {The person chosen is male}

B = {The person chosen is 25 – 34 years}

(a) Explain why P(A) = 0.4397.

(b) Find P(B).

(c) Are the events A and B independent?

a) P (A): 7,317/16,639 = .4397

b) P (B): 3,494/16,639 = .2100

c) P (A|B) must equal P (A) for events to be independent

P (A|B) = 1,589/3,494 = .4548 ≠ .4397, so events A and B are not independent

Are these events independent?

Event A: Honors student

Event B: Basketball

P (B) = 1,800/6,000 = 0.3

P (B|A) = 450/1,500 = 0.3

P (A) = 1,500/6,000 = 0.25

P (A|B) = 450/1,800 = 0.25

                 Honors Student   Non-Honors Student   Total
Basketball       450                                   1,800
Non-BB Player
Total            1,500                                 6,000

So remember… Independent Events

Two events A and B that both have positive probability are independent if

P (B|A) = P (B) or P (A|B) = P (A)
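The independence check is the same comparison each time: does the conditional probability equal the unconditional one? The sketch below applies it to the two examples from the slides (honors/basketball, and male/25-34 years), using the counts given there. The function name is an assumption.

```python
from fractions import Fraction

def independent(p_event, p_event_given_other):
    """Two events are independent when conditioning does not change the probability."""
    return p_event == p_event_given_other

# Honors student (A) and basketball (B): 450 both, 1,500 honors, 1,800 basketball, 6,000 total
p_a = Fraction(1_500, 6_000)
p_a_given_b = Fraction(450, 1_800)
print(independent(p_a, p_a_given_b))  # True

# Male (A) and 25-34 years (B): 1,589 both, 7,317 male, 3,494 aged 25-34, 16,639 total
p_male = Fraction(7_317, 16_639)
p_male_given_25_34 = Fraction(1_589, 3_494)
print(independent(p_male, p_male_given_25_34))  # False
```

Exact fractions avoid declaring two probabilities "different" only because of rounding.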

Last Probability Rule ....

If events A & B are independent, then

P (A & B) = P (A) P(B)

(this is the multiplication rule for independent events)

Example: Consider the following probabilities.

P (student has 4.0 GPA) = 0.15

P (student misses bus) = 0.30

If these two events are independent, then P (4.0 GPA & missing bus) = (0.15)(0.30) = 0.045

Caution...

P (heart & 3), drawing without replacement: the two draws are not independent; knowing the outcome of the first pick changes the probabilities for the second pick

Independent is not mutually exclusive/disjoint.

Mutually exclusive/disjoint is not independent.

(remember... mutually exclusive/disjoint events can’t happen at same time; independent events can)
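A single roll of a fair die gives a concrete way to see that the two ideas are different. The events below (even, low, odd) are made-up examples chosen for this sketch, not taken from the slides.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}  # one roll of a fair die

def p(event):
    """Probability of an event when all six faces are equally likely."""
    return Fraction(len(event & sample_space), len(sample_space))

def disjoint(a, b):
    """Disjoint events share no outcomes."""
    return len(a & b) == 0

def independent(a, b):
    """Independent events satisfy P(A and B) = P(A) * P(B)."""
    return p(a & b) == p(a) * p(b)

even, low, odd = {2, 4, 6}, {1, 2}, {1, 3, 5}

print(disjoint(even, low), independent(even, low))  # False True
print(disjoint(even, odd), independent(even, odd))  # True False
```

"Even" and "low" overlap yet are independent, while "even" and "odd" are disjoint and therefore strongly dependent: knowing one occurred tells you the other did not.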

Free stuff...

• If events A and B are independent, then:

– their complements, A^c and B^c, are also independent

– A^c and B are also independent

– A and B^c are also independent

• Also extends to collections of more than 2 events, i.e., independence of events A, B, & C means that no information about any one or any two can change the probability of the remaining events

General Probability Practice...

An automobile manufacturer buys computer chips from a supplier. The supplier sends a shipment containing 5% defective chips. Each chip chosen from this shipment has probability 0.05 of being defective, and each automobile uses 12 chips selected independently. What is the probability that all 12 chips in a car will work properly?

Answer ...

The probability that all 12 chips in the car will work is (1 – 0.05)^12 = (0.95)^12 ≈ 0.540.
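Because the 12 chips are selected independently, the multiplication rule for independent events applies 12 times in a row. A minimal sketch of that arithmetic (the function name is an assumption):

```python
def prob_all_work(p_defective=0.05, chips=12):
    """P(every chip works) when chips fail independently with the same probability."""
    return (1 - p_defective) ** chips

p_all = prob_all_work()
print(f"{p_all:.3f}")      # 0.540
print(f"{1 - p_all:.3f}")  # 0.460, the chance that at least one chip is defective
```

The second line uses the complement rule: "at least one defective" is the opposite of "all work".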

Draw a Venn diagram for...

• Mutually exclusive/disjoint events

• Independent events

• Dependent events

• Overlapping events

Case Closed...

• True negative: machine says no explosives/weapons & there really aren’t any explosives/weapons

• True positive: machine says explosives/weapons & there really are explosives/weapons

• False negative: machine says no explosives/weapons & there really are explosives/weapons

• False positive: machine says explosives/weapons & there really aren’t any explosives/weapons

Case Closed Questions...

• It is said that the occurrence of false-positives in airport screenings has been about 30%. What does that mean?

• The probability that the alarm will sound (incorrectly) when scanning luggage that does not contain explosives, guns, or knives is 0.3

Case-Closed Questions...

• In an FAA test, 40% of explosives planted by government agents made it through security checkpoints; and the occurrence of false-positives in airport screenings has been about 30%.

• Assume that on average 1 suitcase in 10,000 has a bomb in it. Construct a tree diagram to help you find the probability that a suitcase with a bomb would be detected. What’s the probability that a piece of luggage that has a bomb in it would escape detection?

Case Closed...

• Find the probability that a suitcase has no bomb and no alarm is sounded.

• Answer: (9,999/10,000) x (0.70) = 0.69993
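The tree diagram for the screening problem can be written out as arithmetic. The sketch uses the figures from the slides (1 suitcase in 10,000 has a bomb, 40% of explosives get through, 30% false-positive rate) and assumes "made it through" means no alarm sounded. The last calculation, the chance that an alarm really signals a bomb, is an extra question not asked on the slide.

```python
from fractions import Fraction

p_bomb = Fraction(1, 10_000)
p_miss_given_bomb = Fraction(40, 100)    # explosives that made it through
p_alarm_given_clear = Fraction(30, 100)  # false-positive rate

# Branches for a suitcase that does contain a bomb
p_detect_given_bomb = 1 - p_miss_given_bomb
print(float(p_detect_given_bomb))  # 0.6
print(float(p_miss_given_bomb))    # 0.4

# Branch: no bomb AND no alarm
p_clear_and_quiet = (1 - p_bomb) * (1 - p_alarm_given_clear)
print(float(p_clear_and_quiet))  # 0.69993

# Extra (assumed) question: if the alarm sounds, how likely is a bomb?
p_alarm = p_bomb * p_detect_given_bomb + (1 - p_bomb) * p_alarm_given_clear
p_bomb_given_alarm = p_bomb * p_detect_given_bomb / p_alarm
print(p_bomb_given_alarm)  # 2/10001
```

Almost every alarm is a false-positive, simply because bombs are so rare compared with innocent bags.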

Let’s Make a Deal!

• Go to New York Times website and do simulation

• Discuss which strategy is best.

• Textbook Page 228

• Show clip of game show
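If the website is unavailable, the game can be simulated locally. The sketch assumes the standard three-door version of the game (one prize, the host always opens a non-prize door that the contestant did not pick, then offers a switch); the function names, trial count, and seed are also assumptions.

```python
import random

def play(switch, rng):
    """Play one round; return True if the contestant wins the prize."""
    doors = [0, 1, 2]
    prize = rng.choice(doors)
    pick = rng.choice(doors)
    # The host opens a door that is neither the contestant's pick nor the prize.
    opened = rng.choice([d for d in doors if d != pick and d != prize])
    if switch:
        # Switch to the one remaining closed door.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == prize

def win_rate(switch, trials=20_000, seed=1):
    """Proportion of wins over many rounds for one strategy."""
    rng = random.Random(seed)
    return sum(play(switch, rng) for _ in range(trials)) / trials

print("stay:  ", win_rate(switch=False))  # should be near 1/3
print("switch:", win_rate(switch=True))   # should be near 2/3
```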