Lecture 6
Probability Distributions
In certain situations, some attribute of the
outcome may hold more interest for the
experimenter than the outcome itself. For
example, a player of the game of craps may be
concerned only about throwing a 7 and not whether
the 7 was the result of a 5 and a 2 or a 4 and a 3
or a 6 and a 1.
Definition 1. A random variable (r.v.) is a
numerical measure of the outcome of a
probability experiment, so its value is determined
by chance. Random variables are denoted using
capital letters such as X, Y, etc.
May 21, 2012
Definition 2. A discrete random variable is
a random variable that has either a finite number
of possible values or a countable number of
possible values. A continuous random
variable is a random variable that has an infinite
number of possible values that is not countable.
Because the value of a r.v. is determined by
chance, there are probabilities assigned to these
possible values. A table, graph, or formula
containing all the possible values a random
variable could take together with the
corresponding probabilities forms a probability
distribution.
In the case of a discrete probability
distribution the following two conditions must
hold:
1) $\sum_x P(X = x) = 1$
2) $0 \le P(X = x) \le 1$ for every possible value x
where P(X = x) denotes the probability that the
random variable X takes the value x.
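The two conditions can be checked mechanically. The sketch below builds the distribution of the sum of two fair dice (the craps setting from the start of the lecture) and tests both conditions. The name `is_valid_distribution` is made up for this illustration, and exact fractions are used so that the sum is compared without rounding error.

```python
from fractions import Fraction
from itertools import product

# Distribution of X = sum of two fair six-sided dice.
# Each of the 36 ordered outcomes (d1, d2) has probability 1/36.
pmf = {}
for d1, d2 in product(range(1, 7), repeat=2):
    total = d1 + d2
    pmf[total] = pmf.get(total, Fraction(0)) + Fraction(1, 36)

def is_valid_distribution(pmf):
    """Hypothetical helper: check that every probability lies in [0, 1]
    and that the probabilities sum to 1."""
    in_range = all(0 <= p <= 1 for p in pmf.values())
    sums_to_one = sum(pmf.values()) == 1
    return in_range and sums_to_one

print(is_valid_distribution(pmf))  # True
print(pmf[7])                      # 1/6
```

The value 1/6 for P(X = 7) collects all six ways of throwing a 7, which is exactly the attribute the craps player cares about.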
May 21, 2012
Lecture 6 6
Mean and variance of a discrete random variable
The mean, or the expected value, of a discrete
random variable is given by the formula
$\mu_X = E(X) = \sum_x x \cdot P(X = x)$
where x is a value of the random variable and
P(X = x) is the probability of observing that
value. The variance of a discrete r.v. is given by
$\sigma_X^2 = \sum_x (x - \mu_X)^2 \cdot P(X = x)$
and the standard deviation is the square root of
the variance, i.e. $\sigma_X = \sqrt{\sigma_X^2}$.
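As a worked illustration of these formulas, the sketch below computes the mean, variance and standard deviation for one fair six-sided die. The die and the function names `mean` and `variance` are assumptions chosen for the example; they do not come from the lecture.

```python
from fractions import Fraction
import math

# Hypothetical example: X = face shown by one fair six-sided die.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

def mean(pmf):
    """Expected value: sum of x * P(X = x) over all possible values x."""
    return sum(x * p for x, p in pmf.items())

def variance(pmf):
    """Variance: sum of (x - mu)^2 * P(X = x) over all possible values x."""
    mu = mean(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())

mu = mean(pmf)        # 7/2
var = variance(pmf)   # 35/12
sd = math.sqrt(var)   # square root of the variance, about 1.708

print(mu, var, round(sd, 3))
```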
Binomial distribution
When do we deal with a binomial trial or
distribution? An experiment is said to be a
binomial experiment if:
1) The experiment is performed a fixed number of
times, usually denoted by n. Each repetition is
called a trial.
2) The trials are independent (the outcome of one
trial does not affect the outcome of any other).
3) For each trial, there are 2 mutually exclusive
outcomes: success or failure.
4) The probability of success is the same for each
trial of the experiment. The probability of
success is p, while the probability of failure is
1 − p.
We say that a r.v. X is binomially distributed if
X counts the number of successes in n independent
trials of the experiment. So the possible values
for X are 0, 1, 2, ..., n.
It can be shown that the probability of
obtaining x successes in n independent trials of a
binomial experiment where the probability of
success is p is given by
$P(X = x) = {}_nC_x \, p^x (1 - p)^{n - x}, \quad x = 0, 1, 2, \ldots, n$
where ${}_nC_x$ is the number of ways of choosing
x of the n trials to be the successes.
It can also be shown that such a binomial random
variable has mean
$\mu_X = E(X) = np$
and standard deviation
$\sigma_X = \sqrt{np(1 - p)}$
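The binomial formula and its shortcuts for the mean and standard deviation can be compared numerically. The sketch below uses n = 10 and p = 0.5 (values chosen only for illustration) and checks that the general discrete formulas give the same answers as np and √(np(1 − p)). The function name `binomial_pmf` is hypothetical; `math.comb` plays the role of nCx.

```python
import math

def binomial_pmf(x, n, p):
    """P(X = x) for x successes in n independent trials with success probability p."""
    return math.comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 10, 0.5  # hypothetical values chosen for the example

# Full distribution over the possible values 0, 1, ..., n.
probs = [binomial_pmf(x, n, p) for x in range(n + 1)]

# Mean and variance from the general formulas for a discrete r.v.
mean_from_pmf = sum(x * q for x, q in enumerate(probs))
var_from_pmf = sum((x - mean_from_pmf) ** 2 * q for x, q in enumerate(probs))

print(binomial_pmf(5, n, p))                                   # 0.24609375
print(mean_from_pmf, n * p)                                    # both equal np
print(math.sqrt(var_from_pmf), math.sqrt(n * p * (1 - p)))     # both equal sqrt(np(1-p))
```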
Continuous r.v.’s – Normal
Distribution
In the case of continuous r.v.’s, computing
probabilities is not that easy because the r.v.
takes uncountably many values, so we cannot list
a probability for each one. That is why we look
at intervals of values the r.v. might take.
Probability density function
A probability density function is a function
used to compute probabilities of continuous r.v.’s.
It has to satisfy the following two properties:
(1.) The area under the graph of the function
over all possible values of the r.v. must equal one.
(2.) The graph of the function must lie on or
above the x-axis for all possible values of the r.v.
Property: The probability of observing a value
of the r.v. in a certain interval equals the area
under the graph of the density function of that
r.v., over that interval.
A continuous r.v. is normally distributed or
has a normal probability distribution if its
relative frequency histogram has the shape of a
normal curve (bell-shaped and symmetric).
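The link between area and probability can be illustrated numerically. The sketch below uses the standard formula for the normal density, which the lecture does not write out, together with a simple midpoint rule for the area. The values µ = 100 and σ = 15 and the function names are assumptions made for the example.

```python
import math

def normal_pdf(x, mu, sigma):
    """Standard formula for the normal density (not stated in the lecture)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def area(f, a, b, steps=20_000):
    """Midpoint-rule approximation of the area under f between a and b."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width

mu, sigma = 100.0, 15.0  # hypothetical mean and standard deviation

def density(x):
    return normal_pdf(x, mu, sigma)

# Property (1.): total area is 1 (the tails beyond 10 SDs are negligible).
total = area(density, mu - 10 * sigma, mu + 10 * sigma)

# Probability of an interval = area over that interval.
within_one_sd = area(density, mu - sigma, mu + sigma)

print(round(total, 4), round(within_one_sd, 4))
```

The second number is the probability of landing within one standard deviation of the mean, roughly 0.68.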
Area and the normal distribution
If the r.v. X is normally distributed then the area
under the normal curve for any range of values of
the r.v. X represents either:
1) the proportion of the population with the
characteristics described by the range, or
2) the probability that a randomly chosen
individual from the population will have the
characteristics described by the range.
Finding the area under the density graph of a
normally distributed r.v. is not an easy task. It
requires a lot of calculus. One way of avoiding
this is to use tables that give us these areas
(probabilities). But for each µ and σ we would
need a new table. How can we avoid this? By
transforming each of these r.v.’s into a single
standard one.
Standardizing a normal r.v.
Suppose that the r.v. X is normally distributed
with mean µ and standard deviation σ. Then the
r.v.
$Z = \frac{X - \mu}{\sigma}$
is normally distributed with mean µ = 0 and
standard deviation σ = 1. Such an r.v. is said to
have the standard normal distribution.
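To see why standardizing lets one table serve every µ and σ, the sketch below converts a value x to its z-score and compares the two areas. It uses `statistics.NormalDist` from the Python standard library in place of a printed table. The numbers µ = 100, σ = 15 and x = 130 are hypothetical.

```python
from statistics import NormalDist

mu, sigma = 100.0, 15.0   # hypothetical mean and standard deviation
x = 130.0                 # hypothetical observed value

# Standardize: how many standard deviations x lies from the mean.
z = (x - mu) / sigma      # 2.0

# Area to the left of x under N(mu, sigma) ...
p_x = NormalDist(mu, sigma).cdf(x)
# ... equals the area to the left of z under the standard normal N(0, 1).
p_z = NormalDist(0, 1).cdf(z)

print(z, round(p_x, 4), round(p_z, 4))
```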
Standard Normal Distribution
We saw that the standard normal distribution is a
normal distribution with mean 0 and standard
deviation 1. Therefore its properties are deduced
from the properties of the normal distribution:
Table II gives areas under the standard
normal curve for values to the left of a specified
Z-score z0.
The area under the normal curve to the right of z0
equals 1 − (area to the left of z0).
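The left and right areas can be reproduced without the table. In the sketch below, `NormalDist().cdf` stands in for a table lookup of the area to the left, and z0 = 1.25 is a value chosen only for illustration.

```python
from statistics import NormalDist

Z = NormalDist()      # standard normal: mean 0, standard deviation 1
z0 = 1.25             # hypothetical z-score

left = Z.cdf(z0)      # area to the left of z0, what the table reports
right = 1 - left      # area to the right of z0

print(round(left, 4), round(right, 4))  # 0.8944 0.1056
```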
There are 3 types of areas that we could find as
illustrated below. But we could also find a
z-score given an area.
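The illustration of the three area types is not reproduced here. The sketch below assumes the usual breakdown: area to the left of a z-score, area to the right, and area between two z-scores. It also shows the reverse lookup, from an area back to a z-score, using `NormalDist.inv_cdf`. The z-scores and the area 0.975 are hypothetical inputs.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution

def area_left(z):
    """Area under the standard normal curve to the left of z."""
    return Z.cdf(z)

def area_right(z):
    """Area to the right of z: 1 minus the area to the left."""
    return 1 - Z.cdf(z)

def area_between(z1, z2):
    """Area between z1 and z2 (z1 < z2): difference of two left areas."""
    return Z.cdf(z2) - Z.cdf(z1)

between = area_between(-1.96, 1.96)   # about 0.95

# Reverse problem: which z-score has area 0.975 to its left?
z_for_area = Z.inv_cdf(0.975)         # about 1.96

print(round(between, 4), round(z_for_area, 2))
```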
Applications of the Normal
Distribution
Sampling distribution; The Central
Limit Theorem
Definition 3. The sampling distribution of
the mean is a probability distribution of all
possible values of the random variable x̄
computed from a sample of size n from a
population with mean µ and standard deviation σ.
How do we obtain it?
Step 1. Obtain a simple random sample of size n.
Step 2. Compute the sample mean.
Step 3. Assuming that we are sampling from a
finite population, repeat Steps 1 and 2 until all
simple random samples of size n have been
obtained.
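For a small finite population the three steps can be carried out exhaustively. The sketch below uses a made-up population of five values and samples of size n = 2, drawn without replacement, to list every simple random sample, compute each sample mean, and tabulate the resulting sampling distribution.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

population = [2, 4, 6, 8, 10]  # hypothetical finite population
n = 2                          # sample size

# Steps 1 and 3: every simple random sample of size n (without replacement).
samples = list(combinations(population, n))

# Step 2: the sample mean of each sample.
means = [Fraction(sum(s), n) for s in samples]

# Each sample is equally likely, so P(mean = m) = (count of m) / (number of samples).
counts = Counter(means)
sampling_dist = {m: Fraction(c, len(samples)) for m, c in sorted(counts.items())}

mu = Fraction(sum(population), len(population))               # population mean
mean_of_means = sum(m * p for m, p in sampling_dist.items())  # mean of the sample means

for m, p in sampling_dist.items():
    print(m, p)
print(mu, mean_of_means)  # 6 6
```

In this example the mean of the sampling distribution equals the population mean.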
Ex:
How does x̄ vary as n increases?
What can we say about this distribution (of the
means of samples of size n)?
Sampling distribution of
sample mean
Let x̄ be the mean of an SRS of size n from a
population with mean µ and SD σ. The mean
and SD of x̄ are
$\mu_{\bar{x}} = \mu$
$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$
Moreover, if the population is normally
distributed then x̄ is normally distributed,
$N(\mu, \sigma/\sqrt{n})$, and if the population is
not necessarily normally distributed but the
sample size is large then x̄ is approximately
normally distributed, $N(\mu, \sigma/\sqrt{n})$.
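The formula for the SD of the sample mean can be checked by brute force on a small example. The sketch below assumes independent draws, modelled here as sampling with replacement from a made-up population of five values. Under that assumption the SD of the sample mean equals σ/√n exactly, and the output shows it shrinking as n grows. The function name `sd_of_sample_mean` is hypothetical.

```python
import math
from itertools import product

population = [2, 4, 6, 8, 10]  # hypothetical population

# Population mean and SD (dividing by N, since this is the whole population).
mu = sum(population) / len(population)
sigma = math.sqrt(sum((v - mu) ** 2 for v in population) / len(population))

def sd_of_sample_mean(n):
    """SD of the sample mean over all equally likely ordered samples of size n,
    drawn with replacement (an assumption that makes the draws independent)."""
    means = [sum(s) / n for s in product(population, repeat=n)]
    center = sum(means) / len(means)
    return math.sqrt(sum((m - center) ** 2 for m in means) / len(means))

for n in (1, 2, 4):
    # The two printed values agree: enumerated SD versus sigma / sqrt(n).
    print(n, sd_of_sample_mean(n), sigma / math.sqrt(n))
```

When sampling without replacement from a small population the SD is somewhat smaller than σ/√n, which is one reason results of this kind are usually stated for large populations.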
Normal Approximation for counts
and proportions
For an SRS of size n from a large population
having population proportion p of successes such
that np ≥ 10 and n(1 − p) ≥ 10, the sampling
distribution of the statistic X, the number of
successes, is approximately normal:
X is approximately $N(np, \sqrt{np(1 - p)})$
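To get a feel for how close the approximation is, the sketch below compares an exact binomial probability with the corresponding normal area. The values n = 100, p = 0.3 and the cutoff k = 35 are hypothetical; they satisfy np ≥ 10 and n(1 − p) ≥ 10. No continuity correction is applied, since the lecture does not mention one.

```python
import math
from statistics import NormalDist

n, p = 100, 0.3   # hypothetical values: np = 30 and n(1 - p) = 70
k = 35            # hypothetical cutoff: we want P(X <= 35)

def binomial_cdf(k, n, p):
    """Exact P(X <= k) from the binomial formula."""
    return sum(math.comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(k + 1))

# Normal distribution with the same mean and standard deviation as X.
approx = NormalDist(n * p, math.sqrt(n * p * (1 - p)))

exact = binomial_cdf(k, n, p)
normal = approx.cdf(k)

print(round(exact, 4), round(normal, 4))
```

The two values are close but not identical, since the normal curve only approximates the binomial distribution.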