Lecture 6 Probability Distributions - Kent · 2012-05-21 · Lecture 6 1 Lecture 6 Probability...

Lecture 6 1

Lecture 6

Probability Distributions

In certain situations, some attribute of the outcome may hold more interest for the experimenter than the outcome itself. For example, a player of the game of craps may be concerned only with throwing a 7 and not with whether the 7 was the result of a 5 and a 2, a 4 and a 3, or a 6 and a 1.

Definition 1. A random variable (r.v.) is a numerical measure of the outcome of a probability experiment, so its value is determined by chance. Random variables are denoted using capital letters such as X, Y, etc.

May 21, 2012

Definition 2. A discrete random variable is a random variable that has either a finite number of possible values or a countable number of possible values. A continuous random variable is a random variable that has an infinite number of possible values that is not countable.

Because the value of a r.v. is determined by chance, there are probabilities assigned to these possible values. A table, graph, or formula containing all the possible values a random variable could take together with the corresponding probabilities forms a probability distribution.

In the case of a discrete probability distribution, the following two conditions must hold:

1) $\sum_x P(X = x) = 1$

2) $0 \le P(X = x) \le 1$

where P(X = x) denotes the probability that the random variable X takes the value x.
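As a small illustration of these two conditions, the sketch below builds the probability distribution of the sum of two fair dice (the craps setting mentioned earlier) and checks both conditions. The variable names are chosen for this example only, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction
from itertools import product

# Enumerate the 36 equally likely outcomes of two fair dice and
# accumulate the probability of each possible sum.
pmf = {}
for d1, d2 in product(range(1, 7), repeat=2):
    total = d1 + d2
    pmf[total] = pmf.get(total, Fraction(0)) + Fraction(1, 36)

# Condition 1: the probabilities add up to 1.
sums_to_one = sum(pmf.values()) == 1

# Condition 2: every probability lies between 0 and 1.
all_in_range = all(0 <= p <= 1 for p in pmf.values())

print("P(X = 7) =", pmf[7])
print("sums to one:", sums_to_one, "| all in [0, 1]:", all_in_range)
```

Note that the 7 can arise from six different outcomes, yet the random variable only records the sum.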

Mean and variance of a discrete random variable

The mean, or the expected value, of a discrete random variable is given by the formula

$$\mu_X = E(X) = \sum_x x \cdot P(X = x)$$

where x is a value of the random variable and P(X = x) is the probability of observing that value. The variance of a discrete r.v. is given by

$$\sigma_X^2 = \sum_x (x - \mu_X)^2 P(X = x)$$

and the standard deviation is the square root of the variance, i.e. $\sigma_X = \sqrt{\sigma_X^2}$.
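The formulas for the mean, variance and standard deviation can be sketched in a few lines. The distribution used here (the number of heads in two tosses of a fair coin) is a hypothetical example, not one taken from the lecture.

```python
from fractions import Fraction
from math import sqrt

# Hypothetical distribution: X = number of heads in two fair coin tosses.
pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

# Mean: sum of x * P(X = x) over all values x.
mean = sum(x * p for x, p in pmf.items())

# Variance: sum of (x - mean)^2 * P(X = x) over all values x.
variance = sum((x - mean) ** 2 * p for x, p in pmf.items())

# Standard deviation: square root of the variance.
std_dev = sqrt(variance)

print("mean =", mean, "variance =", variance, "sd =", round(std_dev, 4))
```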

Binomial distribution

When do we deal with a binomial trial or distribution? An experiment is said to be a binomial experiment if:

1) The experiment is performed a fixed number of times, usually denoted by n. Each repetition is called a trial.

2) The trials are independent (the outcome of one trial does not affect the outcome of any other).

3) For each trial, there are 2 mutually exclusive outcomes: success or failure.

4) The probability of success is fixed for each trial of the experiment. The probability of success is p, while the probability of failure is 1 − p.

5) We say that a r.v. X is binomially distributed if X counts the number of successes in n independent trials of the experiment. So the possible values for X are 0, 1, 2, ..., n.

Mathematicians showed that the probability of obtaining x successes in n independent trials of a binomial experiment where the probability of success is p is given by

$$P(X = x) = {}_nC_x \, p^x (1 - p)^{n - x}, \quad x = 0, 1, 2, \ldots, n$$

They also showed that such a binomial random variable has mean

$$\mu_X = E(X) = np$$

and standard deviation

$$\sigma_X = \sqrt{np(1 - p)}$$
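A minimal sketch of the binomial formulas follows. The values n = 10 and p = 0.3 are hypothetical, and `binomial_pmf` is a helper name introduced only for this example. The mean and standard deviation computed directly from the distribution are compared with the shortcut formulas np and √(np(1 − p)).

```python
from math import comb, sqrt

def binomial_pmf(x, n, p):
    """P(X = x) for a binomial r.v. with n trials and success probability p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 10, 0.3  # hypothetical values for illustration

# Probability of each possible number of successes 0, 1, ..., n.
pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]

# Mean and variance from the general discrete formulas.
mean_from_pmf = sum(x * px for x, px in enumerate(pmf))
var_from_pmf = sum((x - mean_from_pmf) ** 2 * px for x, px in enumerate(pmf))

# Shortcut formulas for the binomial case.
mean_formula = n * p
sd_formula = sqrt(n * p * (1 - p))

print("mean:", round(mean_from_pmf, 6), "vs", mean_formula)
print("sd:  ", round(sqrt(var_from_pmf), 6), "vs", round(sd_formula, 6))
```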

Continuous r.v.’s: Normal Distribution

In the case of continuous r.v.’s, computing probabilities is not that easy because the r.v. takes uncountably many values, so the probability of any single exact value is zero. That is why we look at intervals of values the r.v. might take.

Probability Density Function

A probability density function is a function used to compute probabilities of continuous r.v.’s. It has to satisfy the following two properties:

(1.) The area under the graph of the function over all possible values of the r.v. must equal one.

(2.) The graph of the function must lie on or above the x-axis for all possible values of the r.v.

Property: The probability of observing a value of the r.v. in a certain interval equals the area under the graph of the density function of that r.v., over that interval.
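To make the link between area and probability concrete, here is a rough numerical sketch. The density f(x) = 2x on [0, 1] is a hypothetical example chosen because its areas are easy to check by hand, and the `area` helper is a simple midpoint-rule approximation rather than a library routine.

```python
def density(x):
    """Hypothetical density: f(x) = 2x on [0, 1], and 0 elsewhere."""
    return 2 * x if 0 <= x <= 1 else 0.0

def area(f, a, b, steps=10_000):
    """Approximate the area under f between a and b with the midpoint rule."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width

# The total area under the density should be (approximately) 1.
total_area = area(density, 0, 1)

# P(0.5 <= X <= 1) is the area over that interval; by hand it is 0.75.
prob_interval = area(density, 0.5, 1)

print("total area:", round(total_area, 6))
print("P(0.5 <= X <= 1):", round(prob_interval, 6))
```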

A continuous r.v. is normally distributed or has a normal probability distribution if its relative frequency histogram has the shape of a normal curve (bell-shaped and symmetric).

Area and the normal distribution

If the r.v. X is normally distributed then the area under the normal curve for any range of values of the r.v. X represents either:

1) the proportion of the population with the characteristics described by the range, or

2) the probability that a randomly chosen individual from the population will have the characteristics described by the range.

Finding the area under the density graph of a normally distributed r.v. is not an easy task. It requires a lot of calculus. One way of avoiding this is to use tables that give us these areas (probabilities). But for each µ and σ we would need a new table. How can we avoid this? By transforming every normal r.v. into a single standard one.

Standardizing a normal r.v.

Suppose that the r.v. X is normally distributed with mean µ and standard deviation σ. Then the r.v.

$$Z = \frac{X - \mu}{\sigma}$$

is normally distributed with mean µ = 0 and standard deviation σ = 1. Such an r.v. is said to have the standard normal distribution.
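The effect of standardizing can be checked numerically. The sketch below uses `NormalDist` from Python's standard `statistics` module in place of a printed table; the values µ = 100, σ = 15 and x = 130 are hypothetical. The area to the left of x under the original curve should match the area to the left of the corresponding z under the standard normal curve.

```python
from statistics import NormalDist

mu, sigma = 100, 15   # hypothetical mean and standard deviation
x = 130               # hypothetical value of X

# Standardize: z = (x - mu) / sigma
z = (x - mu) / sigma

# Area to the left of x under N(mu, sigma) ...
p_original = NormalDist(mu, sigma).cdf(x)

# ... equals the area to the left of z under the standard normal N(0, 1).
p_standard = NormalDist(0, 1).cdf(z)

print("z =", z)
print("P(X <= x) =", round(p_original, 4), "| P(Z <= z) =", round(p_standard, 4))
```

This is why one table for the standard normal distribution is enough for every choice of µ and σ.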

Standard Normal Distribution

We saw that the standard normal distribution is a normal distribution with mean 0 and standard deviation 1. Therefore its properties are deduced from the properties of the normal distribution.

Table II gives areas under the standard normal curve to the left of a specified Z-score $z_0$.

The area under the standard normal curve to the right of $z_0$ equals 1 − (area to the left of $z_0$).
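As a sketch of the left/right relationship, the code below uses `NormalDist().cdf` from Python's `statistics` module as a stand-in for Table II (an assumption of this example; the lecture itself uses the printed table). The z-score 1.25 is a hypothetical value.

```python
from statistics import NormalDist

Z = NormalDist()   # standard normal: mean 0, standard deviation 1
z0 = 1.25          # hypothetical z-score

# Area to the left of z0 (what the table lists).
left = Z.cdf(z0)

# Area to the right of z0 is the complement.
right = 1 - left

print("left of z0: ", round(left, 4))
print("right of z0:", round(right, 4))
```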

There are 3 types of areas that we could find, as illustrated below. But we could also find a z-score given an area.
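The sketch below covers two of these tasks: an area between two z-scores, and the reverse problem of finding a z-score from an area. It assumes the three types of areas are left of a z-score, right of a z-score, and between two z-scores, since the illustration is not reproduced here. `NormalDist.inv_cdf` plays the role of reading the table backwards; the numbers are hypothetical.

```python
from statistics import NormalDist

Z = NormalDist()   # standard normal: mean 0, standard deviation 1

# Area between two z-scores: subtract the two left-hand areas.
between = Z.cdf(1.0) - Z.cdf(-1.0)

# Reverse problem: which z-score has area 0.95 to its left?
z_for_area = Z.inv_cdf(0.95)

print("area between -1 and 1:", round(between, 4))
print("z with area 0.95 to the left:", round(z_for_area, 3))
```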

Applications of the Normal Distribution

Sampling distribution; The Central Limit Theorem

Definition 3. The sampling distribution of the mean is the probability distribution of all possible values of the random variable $\bar{x}$ computed from a sample of size n from a population with mean µ and standard deviation σ.

How do we obtain it?

Step 1. Obtain a simple random sample of size n.

Step 2. Compute the sample mean.

Step 3. Assuming that we are sampling from a finite population, repeat Steps 1 and 2 until all simple random samples of size n have been obtained.
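The three steps above can be sketched for a very small population. The population [2, 4, 6, 8] and the sample size n = 2 are hypothetical, and a simple random sample is taken here to mean an unordered selection without replacement, which is what `itertools.combinations` enumerates.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

population = [2, 4, 6, 8]   # hypothetical finite population
n = 2                       # sample size

# Steps 1 and 3: list every possible simple random sample of size n.
samples = list(combinations(population, n))

# Step 2: compute the mean of each sample.
means = [Fraction(sum(s), n) for s in samples]

# Each sample is equally likely, so the probability of a given mean is
# (number of samples with that mean) / (number of samples).
counts = Counter(means)
sampling_dist = {m: Fraction(c, len(samples)) for m, c in sorted(counts.items())}

for m, p in sampling_dist.items():
    print("sample mean", m, "has probability", p)
```

In this small example the mean of the sampling distribution equals the population mean of 5.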

Ex:

How does $\bar{x}$ vary as n increases?

What can we say about this distribution (of the means of samples of size n)?

Sampling distribution of sample mean

Let $\bar{x}$ be the mean of an SRS of size n from a population with mean µ and SD σ. The mean and SD of $\bar{x}$ are

$$\mu_{\bar{x}} = \mu$$

$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$

Moreover, if the population is normally distributed then $\bar{x}$ is normally distributed, $N(\mu, \sigma/\sqrt{n})$, and if the population is not necessarily normally distributed but the sample size is large then $\bar{x}$ is approximately normally distributed, $N(\mu, \sigma/\sqrt{n})$.
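The two formulas for the mean and SD of the sample mean can be checked by brute force on a tiny example. This sketch assumes independent draws, modelled as sampling with replacement from the hypothetical population [2, 4, 6, 8]; that mimics a population much larger than the sample. When sampling without replacement from a population this small, the SD of the sample mean would come out smaller than σ/√n.

```python
from itertools import product
from math import sqrt
from statistics import mean, pstdev

population = [2, 4, 6, 8]   # hypothetical population
n = 2                       # sample size

mu = mean(population)        # population mean
sigma = pstdev(population)   # population standard deviation

# Assumption: draws are independent, so every ordered sample of size n
# taken with replacement is equally likely.
sample_means = [mean(s) for s in product(population, repeat=n)]

mu_xbar = mean(sample_means)        # mean of the sample means
sigma_xbar = pstdev(sample_means)   # SD of the sample means

print("mu =", mu, "| mean of sample means =", mu_xbar)
print("sigma / sqrt(n) =", round(sigma / sqrt(n), 6),
      "| SD of sample means =", round(sigma_xbar, 6))
```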

Normal Approximation for counts and proportions

For an SRS of size n from a large population having population proportion p of successes, if n is large enough that np ≥ 10 and n(1 − p) ≥ 10, the sampling distribution of the statistic X, the number of successes, is approximately normal:

$$X \text{ is approximately } N\left(np, \sqrt{np(1 - p)}\right)$$
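To see how close the approximation is, the sketch below compares an exact binomial probability with its normal approximation. The values n = 100, p = 0.3 and k = 35 are hypothetical and satisfy np ≥ 10 and n(1 − p) ≥ 10. No continuity correction is applied, since the text above does not use one.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 100, 0.3   # hypothetical values; np = 30 and n(1 - p) = 70
k = 35            # we compare P(X <= k) computed two ways

# Exact binomial probability: add up P(X = x) for x = 0, 1, ..., k.
exact = sum(comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(k + 1))

# Normal approximation: X is approximately N(np, sqrt(np(1 - p))).
approx_dist = NormalDist(n * p, sqrt(n * p * (1 - p)))
approx = approx_dist.cdf(k)

print("exact binomial P(X <= k):", round(exact, 4))
print("normal approximation:    ", round(approx, 4))
```

The two values agree to within a few hundredths here; the gap generally shrinks as n grows.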
