
Introduction to Probability The problems of data measurement, quantification and interpretation.

Date post: 15-Jan-2016
Upload: ariana-jen
Transcript
Page 1: Introduction to Probability The problems of data measurement, quantification and interpretation.

Introduction to Probability

The problems of data measurement, quantification and interpretation

Page 2:

Is the mere act of quantification Science?

Page 3:

What is probability?

Page 4:

Measuring probability

Page 5:

Event

It is a simple process with a well-recognized beginning and end

Page 6:

Outcome

One of the alternatives through which an event manifests

Page 7:

Sample space

The set formed from all possible outcomes of an event

Page 8:

Trial

• A single complete instance of a process of testing

• Statisticians refer to each trial as an individual replicate, and refer to a set of trials as an experiment

Page 9:

$$P = \frac{\text{number of outcomes}}{\text{number of trials}}$$

By definition, $0.0 \le P \le 1.0$

Page 10:

Probability

• Most statistics textbooks define probability just as we have done: the (expected) frequency with which events occur

Page 11:

• An example of a trial: flipping a coin

• An example of an experiment: flipping a coin several times

• Sample space: {heads, tails}
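The coin example can be sketched in a few lines of Python. This is only an illustration: it assumes a fair coin, and the function name `estimate_probability` and the fixed seed are choices made for the sketch, not part of the slides.

```python
import random

def estimate_probability(n_trials, seed=0):
    """Flip a simulated fair coin n_trials times and return the
    relative frequency of heads (outcomes / trials)."""
    rng = random.Random(seed)              # fixed seed so the run is repeatable
    sample_space = ["heads", "tails"]      # assumption: both outcomes equally likely
    # Each flip is one trial; the whole list is one experiment.
    flips = [rng.choice(sample_space) for _ in range(n_trials)]
    return flips.count("heads") / n_trials

for n in (10, 100, 10_000):
    print(n, estimate_probability(n))
```

With more trials the relative frequency tends to settle near 0.5, which is the sense in which probability is an expected frequency.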

Page 12:

Random and Deterministic processes

• When we say that events are random, stochastic, probabilistic, or due to chance, what we really mean is that their outcomes are determined in part by a complex set of processes that we are unable or unwilling to measure and will instead treat as random

• Other processes, whose strength we measure, manipulate, and model, represent deterministic or mechanistic forces

Page 13:

The mathematics of Probability

• Axiom 1: the sum of the probabilities of outcomes within a single sample space = 1.0

$$\sum_{i=1}^{n} P(A_i) = 1.0$$

• In a properly defined sample space the outcomes are mutually exclusive and exhaustive

Page 14:

The whirligig beetle

These beasts always produce exactly two litters, with between 2 and 4 offspring per litter

Page 15:

The lifetime reproductive success of a beetle can be described as an outcome (a,b) where a represents the number of offspring in the first litter and b the number of offspring in the second litter

Page 16:

The sample space Whirligig Beetle Fitness consists of all possible lifetime reproductive outcomes:

• Fitness = {(2,2),(2,3),(2,4)

(3,2),(3,3),(3,4)

(4,2),(4,3),(4,4)}

P(2,2)=P(2,3)=P(2,4) = … =P(4,4)

1/9+1/9+1/9+1/9+1/9+1/9+1/9+1/9+1/9=1
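A minimal sketch of this sample space in Python, assuming (as the slide does) that the nine outcomes are equally likely. The variable names are illustrative only.

```python
from fractions import Fraction
from itertools import product

litter_sizes = (2, 3, 4)

# Every (first litter, second litter) pair: the sample space "Fitness".
fitness = list(product(litter_sizes, repeat=2))

# Assumption: each outcome has the same probability, 1/9.
p = {outcome: Fraction(1, len(fitness)) for outcome in fitness}

# Axiom 1: the probabilities over the whole sample space sum to 1.
total = sum(p.values())
print(len(fitness), total)
```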

Page 17:

Complex events

• Are composites of simple events in the sample space

• A complex event can be achieved by one of several pathways (OR statement)

• Event A or Event B or Event C, represented by the union of simple events (A ∪ B ∪ C)

Page 18:

Complex events: summing probabilities

• What is the probability that a whirligig beetle produces 6 offspring?

• 6 offspring ={(2,4),(3,3),(4,2)}

[Venn diagram: the complex event "6 offspring" = {(2,4),(3,3),(4,2)} drawn as a subset of the sample space Fitness]

Page 19:

Complex events

• Axiom 2: the probability of a complex event equals the sum of the probabilities of the outcomes that make up that event

• P(6 offspring) = P((2,4) or (3,3) or (4,2)) = P(2,4) + P(3,3) + P(4,2)

= 1/9+1/9+1/9 = 3/9 = 1/3

• For mutually exclusive outcomes A, B, and C: P(A or B or C) = P(A)+P(B)+P(C)
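The summation for the complex event "6 offspring" can be checked with a short Python sketch. It assumes nine equally likely outcomes, as in the whirligig slides, and the names are illustrative.

```python
from fractions import Fraction
from itertools import product

fitness = list(product((2, 3, 4), repeat=2))
p = {outcome: Fraction(1, 9) for outcome in fitness}   # assumption: equally likely

# The complex event collects every outcome whose litters total 6 offspring.
six_offspring = [(a, b) for (a, b) in fitness if a + b == 6]

# Axiom 2: add the probabilities of the outcomes that make up the event.
p_six = sum(p[outcome] for outcome in six_offspring)
print(six_offspring, p_six)
```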

Page 20:

Shared events

• Are multiple simultaneous occurrences of simple events in the sample space

• A shared event requires the simultaneous occurrence of two or more simple events (AND statement)

• Event A and Event B and Event C, represented by the intersection of simple events (A ∩ B ∩ C)

Page 21:

Shared events: multiplying probabilities

• Assume that the number of offspring produced in the second litter is independent of the number produced in the first litter

• Suppose that an individual can produce 2, 3, or 4 offspring in each litter and that the chance of each of these outcomes is 1/3.

• What is the probability of obtaining the pair of litters (2,4)?

• 2,4 offspring ={(2,4)}

Page 22:

Independence

• Two events are independent of one another if the outcome of one event is not affected by the outcome of the other

• If two events are independent of one another, then the probability that both events occur (a shared event) equals the product of their individual probabilities

Page 23:

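If A and B are independent

$$P(A \cap B) = P(A) \times P(B)$$

[Diagram: within the sample space Fitness, the first litter has outcomes (2), (3), (4) and the second litter has outcomes (2), (3), (4)]

1/3 × 1/3 = 1/9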
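As a rough sketch, the product rule for independent litters can be written in Python. The per-litter probabilities of 1/3 come from the slides; the helper name `p_pair` is hypothetical.

```python
from fractions import Fraction

# Probability of each litter size, the same for both litters.
p_litter = {2: Fraction(1, 3), 3: Fraction(1, 3), 4: Fraction(1, 3)}

def p_pair(first, second):
    """P(first litter AND second litter), valid only if the litters are independent."""
    return p_litter[first] * p_litter[second]

# Joint probability of every (first, second) pair.
joint = {(a, b): p_pair(a, b) for a in p_litter for b in p_litter}
print(p_pair(2, 4), sum(joint.values()))
```

Every pair comes out at 1/9 and the nine joint probabilities sum to 1, so the product rule reproduces the equally likely sample space used earlier.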

Page 24:

Milkweeds and Caterpillars

Page 25:

Probability calculations

• Imagine two kinds of milkweed populations: those that evolved secondary chemicals that make them resistant (R) to the herbivore, and those that haven’t (not R)

• Suppose you census a number of milkweed populations and determine that 20% of the populations are resistant to the herbivore

• Thus P(R)=0.20; P(not R)=0.80

Page 26:

Probability calculations

• Similarly, suppose that the probability that the caterpillar (C) occurs in a patch is 0.7

• Then P(C)=0.7; P(not C)=0.3.

• If colonization events are independent of one another, what are the chances of finding either caterpillars, milkweeds, or both in these patches?

• What is the probability that the milkweed will disappear?

Page 27:

Probability calculations

Shared event                    Probability calculation               Milkweed resistant   Caterpillar present
Susceptible & no caterpillar    [1-P(R)]*[1-P(C)] = 0.8*0.3 = 0.24    NO                   NO
Susceptible & caterpillar       [1-P(R)]*[P(C)] = 0.8*0.7 = 0.56      NO                   YES
Resistant & no caterpillar      [P(R)]*[1-P(C)] = 0.2*0.3 = 0.06      YES                  NO
Resistant & caterpillar         [P(R)]*[P(C)] = 0.2*0.7 = 0.14        YES                  YES

Page 28:

Notice

• 0.24+0.56+0.06+0.14=1

• 0.14+0.06=0.20 (probability of resistance)

• 0.56+0.14=0.70 (probability of caterpillar presence)

• 0.56 is the probability that the milkweed will disappear (susceptible milkweed with caterpillar present)
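The four shared events and the checks listed under "Notice" can be reproduced with a small Python sketch. It uses P(R) = 0.20 and P(C) = 0.70 from the slides and assumes independence. Exact fractions are used here only to avoid floating-point rounding; the dictionary keys are illustrative labels.

```python
from fractions import Fraction

p_r = Fraction(20, 100)   # P(R): milkweed population is resistant
p_c = Fraction(70, 100)   # P(C): caterpillar is present

# Independence assumed: each joint probability is a product of marginals.
table = {
    ("susceptible", "no caterpillar"): (1 - p_r) * (1 - p_c),
    ("susceptible", "caterpillar"):    (1 - p_r) * p_c,
    ("resistant",   "no caterpillar"): p_r * (1 - p_c),
    ("resistant",   "caterpillar"):    p_r * p_c,
}

for event, prob in table.items():
    print(event, float(prob))

# The marginals are recovered by summing the matching rows.
p_resistant = sum(v for (m, _), v in table.items() if m == "resistant")
p_caterpillar = sum(v for (_, c), v in table.items() if c == "caterpillar")
```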

Page 29:

Rules for combining sets when events are not independent

• Suppose in our sample space there are two identifiable events, each of which consists of a group of outcomes:

1. whirligig that produces exactly 2 offspring in the first litter (F)

2. whirligig that produces exactly 4 offspring in the second litter (S)

Page 30:

Rules for combining sets when events are not independent

• Fitness={(2,2),(2,3),(2,4)

(3,2),(3,3),(3,4)

(4,2),(4,3),(4,4)}

F = {(2,2),(2,3),(2,4)}

S = {(2,4),(3,4),(4,4)}

$$F \cup S = \{(2,2),(2,3),(2,4),(3,4),(4,4)\}$$

$$F \cap S = \{(2,4)\}$$
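The union and intersection of F and S map directly onto Python's built-in set operations. A minimal sketch, with variable names chosen for illustration:

```python
from itertools import product

fitness = set(product((2, 3, 4), repeat=2))

F = {o for o in fitness if o[0] == 2}   # exactly 2 offspring in the first litter
S = {o for o in fitness if o[1] == 4}   # exactly 4 offspring in the second litter

union = F | S          # outcomes in F or S (or both)
intersection = F & S   # outcomes in both F and S

print(sorted(union))
print(sorted(intersection))
```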

Page 31:

Venn diagram

[Venn diagram: the sample space Fitness with its nine outcomes, the sets F and S, their union F ∪ S, and their intersection F ∩ S = {(2,4)}]

Page 32:

Rules for combining sets when events are not independent

• We can construct a third useful set by considering the set $F^c$, called the complement of F, which is the set of outcomes in the sample space that are not in F

• $F^c$ = {(3,2),(3,3),(3,4),(4,2),(4,3),(4,4)}

• From axioms 1 and 2: $P(F) + P(F^c) = 1$
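As a quick check with the whirligig sets, assuming (as earlier in these slides) that all nine outcomes are equally likely, F has three outcomes and its complement has six:

$$P(F) + P(F^c) = \frac{3}{9} + \frac{6}{9} = 1$$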

Page 33:

Empty set

• The empty set contains no elements and is written as $\emptyset$ (or { })

$$F \cap F^c = \emptyset$$

Page 34:

Calculating probabilities of combined events

$$P(A \cup B) = P(A) + P(B) - P(A \cap B)$$

If:

$$A \cap B = \{\,\}$$

then:

$$P(A \cup B) = P(A) + P(B)$$
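A worked instance of the general rule, using F and S from the whirligig example and again assuming nine equally likely outcomes. F and S share the outcome (2,4), so the intersection term must be subtracted to avoid counting it twice:

$$P(F \cup S) = P(F) + P(S) - P(F \cap S) = \frac{3}{9} + \frac{3}{9} - \frac{1}{9} = \frac{5}{9}$$

This agrees with a direct count: F ∪ S contains five of the nine outcomes.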

Page 35:

How to estimate the probability that a whirligig produces 6 offspring, if the number of offspring produced in the second litter depends on the number of offspring in the first litter?

• Recall that the complex event 6 offspring = {(2,4),(3,3),(4,2)}, so P(6 offspring) = 3/9 (or 1/3)

• If you observed that the first litter was 2 offspring, what is the probability that the whirligig will produce 4 offspring next time?

The answer, 1/3, is correct, but why?

Page 36:

Conditional probabilities

• If we are calculating the probability of a complex event, and we have information about the outcome of that event, we should modify our estimates of the probabilities of other outcomes accordingly. We refer to these updated estimates as conditional probabilities

P(A|B), the probability of event A given event B

Page 37:

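The probability of A is calculated assuming that the event B has already occurred:

$$P(A|B) = \frac{P(A \cap B)}{P(B)}$$

For the whirligig events F and S:

$$P(F \cap S) = 1/9$$

$$P(F) = P(S) = 1/3$$

$$P(S|F) = \frac{P(F \cap S)}{P(F)} = \frac{1/9}{1/3} = 1/3$$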
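The conditional probability P(S|F) can be computed from the sets themselves. The sketch below assumes equally likely outcomes, so that an event's probability is its share of the sample space; `prob` and `conditional` are hypothetical helper names.

```python
from fractions import Fraction
from itertools import product

fitness = set(product((2, 3, 4), repeat=2))
F = {o for o in fitness if o[0] == 2}   # first litter has 2 offspring
S = {o for o in fitness if o[1] == 4}   # second litter has 4 offspring

def prob(event):
    """P(event) under the assumption of equally likely outcomes."""
    return Fraction(len(event), len(fitness))

def conditional(a, b):
    """P(a | b) = P(a and b) / P(b)."""
    return prob(a & b) / prob(b)

print(conditional(S, F))
```

Here P(S|F) equals P(S), which is what independence of the two litters implies.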

Page 38:

Rearranging the formula gives us a general formula for calculating the probability of an intersection:

$$P(A \cap B) = P(A|B) \times P(B) = P(B|A) \times P(A)$$

Note that if two events A and B are independent, then P(A|B)=P(A), so that

$$P(A \cap B) = P(A) \times P(B)$$

Page 39:

• Until now, we have discussed probability using what is known as the frequentist paradigm, in which probabilities are estimated as the relative frequencies of outcomes based on an infinitely large set of trials

• Scientists start assuming NO prior knowledge of the probability of an event, and re-estimate the probability based on a large number of trials

The frequentist paradigm

Page 40:

• In contrast, the Bayesian paradigm builds on the idea that investigators may already have a belief about the probability of an event before the trials are conducted.

• These prior probabilities may be based on previous experience, intuition, or model predictions

• These prior probabilities are then modified by the data from the current trial to yield posterior probabilities.

Bayes’ Theorem

Page 41:

Bayes’ Theorem

$$P(A|B) = \frac{P(B|A)\,P(A)}{P(B|A)\,P(A) + P(B|A^c)\,P(A^c)}$$

The probability of an event or outcome A conditional on another event B can be determined if you know the probability of the event B conditional on the event A and on its complement, and you know the probabilities of A and of the complement of A
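Bayes' Theorem can be sketched as a small Python function. The function name `bayes` and its parameter names are illustrative. The example values come from the whirligig sample space with equally likely outcomes: P(S|F) = 1/3, P(F) = 1/3, and P(S|F^c) = 2/6 = 1/3, since two of the six outcomes in the complement of F have 4 offspring in the second litter.

```python
from fractions import Fraction

def bayes(p_b_given_a, p_a, p_b_given_not_a):
    """Return P(A|B) from P(B|A), P(A), and P(B|A^c)."""
    p_not_a = 1 - p_a                        # P(A^c)
    numerator = p_b_given_a * p_a            # P(B|A) P(A)
    denominator = numerator + p_b_given_not_a * p_not_a
    return numerator / denominator

# A = F (first litter of 2), B = S (second litter of 4).
p_f_given_s = bayes(
    p_b_given_a=Fraction(1, 3),       # P(S|F)
    p_a=Fraction(1, 3),               # P(F)
    p_b_given_not_a=Fraction(1, 3),   # P(S|F^c)
)
print(p_f_given_s)
```

Because the litters are independent in this example, the posterior P(F|S) equals the prior P(F); with dependent events the two would differ.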

Page 42:

An important distinction

• For example, the distinction between:

1. P(C|R), the probability that caterpillars are found given a resistant population of milkweeds. To estimate P(C|R), we would need to examine populations of resistant milkweeds to determine the frequency with which these populations were hosting caterpillars

Page 43:

An important distinction

• and:

2. P(R|C), the probability that milkweeds are resistant given that they are eaten by caterpillars. To estimate P(R|C), we would need to examine caterpillars to determine the frequency with which their host plants are resistant.

Page 44:

Probability is completely contingent on how we define the sample space

• In general, we all have intuitive estimates for probabilities for all kinds of events.

• However, to quantify those guesses, we have to decide on a sample space, take samples, and count the frequency with which certain events occur

Page 45:

Estimating probability by sampling

• We can efficiently estimate the probability of an event by taking a sample of the population of interest

Exercise 1

Part 1, with cards

Page 46:

Estimating probabilities by sampling

1. Using playing cards, identify Kings, Queens, Jacks and Aces as “captures”, and the rest of the cards as “non captures”.

2. What is the probability of “capture”?

3. Shuffle to provide an element of chance in the game.

4. Draw four cards at random and note how many of them are “captures”

5. Repeat this procedure (Steps 3. and 4.) 20 times

6. What is the expected value of the capture probability?

Students will have one week to complete this exercise

Page 47:

Estimating probabilities by sampling

• Do the same exercise, but use only the heart suit

• What is the expected value of the capture probability?

• How different is the result among the games you played?

Page 48:

• Write an algorithm (sequence of instructions) in Excel that simulates the game previously described (be creative)

• Play the game 10 and 20 times

• How different are the results from the games you played? (present the results as histograms)

• What is the expected value of the capture probability?

Exercise 1

Part 2, A model of the game
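The exercise asks for an Excel model. Purely as an illustration of the same logic, here is one possible sketch in Python. It assumes a standard 52-card deck and draws four cards without replacement; all function names, the default of 20 games, and the fixed seed are choices made for this sketch.

```python
import random

RANKS = ("A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K")
SUITS = ("hearts", "diamonds", "clubs", "spades")
CAPTURE_RANKS = {"K", "Q", "J", "A"}

def build_deck(suits=SUITS):
    """Return a list of (rank, suit) cards for the given suits."""
    return [(rank, suit) for suit in suits for rank in RANKS]

def play_game(deck, rng, hand_size=4):
    """Shuffle-and-draw: sample a hand without replacement, count captures."""
    hand = rng.sample(deck, hand_size)
    return sum(1 for rank, _ in hand if rank in CAPTURE_RANKS)

def simulate(deck, n_games=20, seed=0):
    """Play n_games and return the number of captures in each game."""
    rng = random.Random(seed)   # fixed seed so results are repeatable
    return [play_game(deck, rng) for _ in range(n_games)]

deck = build_deck()
# Exact capture probability for a single card: capture cards / all cards.
exact = sum(1 for rank, _ in deck if rank in CAPTURE_RANKS) / len(deck)
captures = simulate(deck)
# Estimated capture probability: captures observed / cards drawn.
estimate = sum(captures) / (4 * len(captures))
print(exact, estimate)
```

Calling `build_deck(("hearts",))` gives the hearts-only deck for the second version of the game, and changing `n_games` allows the 10-game and 20-game runs to be compared.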

Page 49:

Example of Histogram

• The numbers on the horizontal axis, or x-axis, indicate the number of “captures”

• The numbers on the vertical axis, or y-axis, indicate the frequency
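A histogram of this kind can be tallied in a few lines of Python. The sketch below prints a text version turned on its side, with one row per number of captures and one `#` per game. The input list is hypothetical, made up only to show the layout, and `text_histogram` is an illustrative name.

```python
from collections import Counter

def text_histogram(captures, max_captures=4):
    """Return one line per capture count, with a '#' for each game."""
    freq = Counter(captures)   # frequency of each number of captures
    return [f"{k} | {'#' * freq[k]}" for k in range(max_captures + 1)]

# Hypothetical results from 10 games (not real data).
example_games = [0, 1, 1, 2, 1, 0, 3, 2, 1, 1]

for line in text_histogram(example_games):
    print(line)
```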

