
Statistics - Probability theory 1

Date post: 26-Jan-2015
Upload: julio-huato
Description:
These slides introduce probability theory.
Transcript
Page 1: Statistics - Probability theory 1

Outline

Probability, sample space, random variable
Probability distribution
Expected value
Variance
Moments
Linear transformations of random variables
Joint distributions

Applied Statistics for Economics, 2. Introduction to Probability Theory

SFC - [email protected]

Spring 2012

SFC - [email protected] Applied Statistics for Economics 2. Introduction to Probability Theory

Page 2: Statistics - Probability theory 1


Topics

Probability, sample space, random variable

Probability distribution

Expected value

Variance

Moments

Linear transformations of random variables

Joint distributions

SFC - [email protected] Applied Statistics for Economics 2. Introduction to Probability Theory

Page 3: Statistics - Probability theory 1


Topics

The main topics in this chapter are:

random variables and probability distributions,
expected values: mean and variance, and
two random variables jointly considered.

Page 4: Statistics - Probability theory 1


Probability

The world in motion is viewed as a set of random processes or random experiments.

Randomness means that, no matter how much our understanding of the world may advance, there is always an element of ignorance or uncertainty in such understanding. In other words: given specific causes, we don't know fully which states of the world will result. Or, given specific states of the world, we don't know fully what specifically caused them.

In other words, we are uncertain or – more plainly said – ignorant about the specific causes or, alternatively, effects involved in these processes.

Statistics is a disciplined approach to using our observations of the world (e.g. in the form of data) to reduce such uncertainty as much as we can.

Page 5: Statistics - Probability theory 1


Probability

Examples of random processes: your meeting the next person, SFC students commuting to school, residents of the U.S. producing new goods in a given year, etc.

Why are they random? Because we are uncertain about the gender or the age of the next person you'll meet, the commuting time of SFC students or the means of transportation they will use, the annual gross domestic product in the U.S. or its composition, etc.

Page 6: Statistics - Probability theory 1


Probability

The mutually exclusive possible results of these experiments are called the outcomes. E.g. the next person you'll meet could be female or male, young or old; SFC students may take a few or many minutes to commute to school; U.S. annual GDP may go up or down by some amount compared to the previous year, etc.

Page 7: Statistics - Probability theory 1


Probability

Probability: the degree of belief that the outcome of an experiment will be a particular one.

How to decide which probability to assign to a particular outcome of an experiment (e.g. that if you meet another person, the gender of such person will be female)? How to make this decision in a well-informed, disciplined, scientific way?[1]

One can only use experience – individual or collective – i.e. history. We may keep record of the gender of the people we meet over time and use the data compiled to inform our belief, or look at records on the gender composition of the local population, etc.

[1] In a sense, the whole purpose of statistics is to determine probabilities or, alternatively, expectations based on probabilities.

Page 8: Statistics - Probability theory 1


Sample space, event

Sample space or population: the set of all the possible outcomes of the experiment. E.g. the sample space of the experiment of flipping a coin once is: S = {H, T}.[2]

Event: a subset of the sample space, i.e. a set of one or more outcomes. E.g. the event (M ≤ 1) that your car will "need one repair at most" includes "no repairs" (M = 0) and "one repair" (M = 1).

[2] We rule out 'freak' possibilities, like the coin landing on its edge.

Page 9: Statistics - Probability theory 1


Random variables

Random variable (r.v.): a numerical summary of a random outcome. For example, G = g, where (e.g.) g is 0 if 'male' and 1 if 'female'. The number of times your car needs repair during a given month: M = m, where m = 0, 1, 2, 3, . . .. The time it takes for SFC students to commute to school: T = t, where t is time in minutes.

Page 10: Statistics - Probability theory 1


Random variables

There are discrete and continuous random variables. Gender, summarized as a 0 if 'male' and 1 if 'female', and the number of car repairs in a month are discrete random variables. The commuting time, if recorded in fractions of an hour – or even fractions of minutes and seconds, etc. – can be regarded as a continuous r.v.

Page 11: Statistics - Probability theory 1


Probability distribution of a discrete r.v.

The probability distribution of a discrete r.v. is a list of all the values of the r.v. and the probabilities associated with each value. By convention, each probability is a number between 0 and 1, where 0 means impossibility and 1 means full certainty; the probabilities over the sample space must add up to 1. E.g. let G = 0, 1 be the r.v. 'gender of the next person you'll meet'. Then:

G    Pr(G = g)
0    0.45
1    0.55

Page 12: Statistics - Probability theory 1


Probability distribution of a discrete r.v.

Using the information in the probability distribution, you can compute the probability of a given event. E.g. the probability that you'll meet 'a male or a female':

Pr(G = 0 or G = 1) = Pr(G = 0) + Pr(G = 1) = 0.45 + 0.55 = 1 = 100%.

In words, we are completely certain that you'll meet either a male or a female the next time you meet a person.

Page 13: Statistics - Probability theory 1


Probability distribution of a discrete r.v.

Admittedly, the previous example is trivial. But consider the probability distribution of your car needing repair(s) in a given month. The r.v. 'number of repairs' needed is denoted as M:

M    Pr(M = m)
0    0.80
1    0.10
2    0.06
3    0.03
4    0.01

What's the probability that the car will need one or two repairs in a month? Answer:

Pr(M = 1 or M = 2) = Pr(M = 1) + Pr(M = 2) = 0.10 + 0.06 = 0.16 = 16%.
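As a minimal sketch (not part of the original slides), the table and the event calculation can be written in Python; the distribution is just the table above stored as a dict:

```python
# Probability distribution of M (number of car repairs), from the table above.
pmf_M = {0: 0.80, 1: 0.10, 2: 0.06, 3: 0.03, 4: 0.01}

def event_prob(pmf, outcomes):
    """Probability of an event = sum of the probabilities of its
    (mutually exclusive) outcomes."""
    return sum(pmf[m] for m in outcomes)

p_12 = event_prob(pmf_M, [1, 2])   # Pr(M = 1 or M = 2) ≈ 0.16
```

Because outcomes are mutually exclusive, the probability of any event is just the sum over the outcomes it contains.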

SFC - [email protected] Applied Statistics for Economics 2. Introduction to Probability Theory

Page 14: Statistics - Probability theory 1


Probability distribution of a discrete r.v.

The cumulative probability distribution (also known as the 'cumulative distribution function' or c.d.f.) gives the probability that the random variable is less than or equal to a particular value. The first two columns of the following table are the same as in the previous table. The last column gives the c.d.f.:

M    Pr(M = m)    Pr(M ≤ m)
0    0.80         0.80
1    0.10         0.90
2    0.06         0.96
3    0.03         0.99
4    0.01         1.00
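Since the c.d.f. is just a running sum of the probabilities, it can be computed in one line; a sketch (not from the slides):

```python
from itertools import accumulate

# pmf of M from the table above; the c.d.f. is the running sum of the pmf.
values = [0, 1, 2, 3, 4]
pmf    = [0.80, 0.10, 0.06, 0.03, 0.01]
cdf    = list(accumulate(pmf))   # Pr(M <= m) for m = 0, 1, 2, 3, 4
```

Note that the c.d.f. is non-decreasing and its last entry is 1, mirroring the table.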

SFC - [email protected] Applied Statistics for Economics 2. Introduction to Probability Theory

Page 15: Statistics - Probability theory 1


Probability distribution of a discrete r.v.

A binary discrete r.v. (e.g. G = 0, 1) is called a Bernoulli r.v. The Bernoulli distribution is:

G = 1 with probability p
G = 0 with probability 1 − p

where p is the probability of the next person being 'female'.

Page 16: Statistics - Probability theory 1


Probability distribution of a continuous r.v.

The cumulative probability distribution of a continuous r.v. is also the probability that the random variable is less than or equal to a particular value.

The probability density function (p.d.f.) of a continuous random variable summarizes the probabilities for each value of the random variable.

The mathematical description of the p.d.f. of a continuous variable requires that you're familiar with calculus. So, we'll skip it for now.

NB: Strictly speaking, the probability that a continuous random variable takes a particular value is zero. We can only speak of the probability of the random variable falling in a range (between two given values).
NB2: The p.d.f. and the c.d.f. show the same information in different formats.

Page 17: Statistics - Probability theory 1


Characteristics of a r.v. distribution

In the practice of statistics, two basic measures are used extensively to characterize the distribution of a r.v.:

the expected value or mean (or average) and
the variance.

Page 18: Statistics - Probability theory 1


Expected value

The expected value of a r.v. X, or E(X), is the average value of the variable over many repeated trials. It is computed as a weighted average of the possible outcomes, where the weights are the probabilities of the outcomes. It is also called the mean of X and denoted by µX. For a discrete r.v.:

E(X) = x1p1 + x2p2 + ··· + xkpk = ∑_{i=1}^{k} xi pi

E.g.: You loan $100 to your friend for a year at 10% interest. There's a 99% chance he'll repay the loan and 1% he won't. What's the expected value of your loan at maturity?

Page 19: Statistics - Probability theory 1


Expected value

E.g.: You loan $100 to your friend for a year at 10% interest. There's a 99% chance he'll repay the loan and 1% he won't. What's the expected value of your loan at maturity?
Answer:

($110 × 0.99) + ($0 × 0.01) = $108.90

E.g.: What's the expected value or average of the number of car repairs per month? See the table above.

Page 20: Statistics - Probability theory 1


Expected value

E.g.: What's the expected value or average of the number of car repairs per month (M)? See the table above.
Answer:

E(M) = (0 × 0.80) + (1 × 0.10) + (2 × 0.06) + (3 × 0.03) + (4 × 0.01) = 0.35

What does that mean?
E.g.: In general, what's the expected value of a Bernoulli r.v. with Pr(G = 1) = p and Pr(G = 0) = 1 − p?
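The weighted-average computation above can be sketched directly in Python (not from the slides):

```python
# Expected value of a discrete r.v.: weighted average of values by probabilities.
def expected_value(values, probs):
    return sum(v * p for v, p in zip(values, probs))

E_M = expected_value([0, 1, 2, 3, 4], [0.80, 0.10, 0.06, 0.03, 0.01])  # ≈ 0.35
```

The answer to "what does that mean?": over many months, the car needs 0.35 repairs per month on average, even though it never needs exactly 0.35 repairs in any single month.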

SFC - [email protected] Applied Statistics for Economics 2. Introduction to Probability Theory

Page 21: Statistics - Probability theory 1


Expected value

E.g.: In general, what's the expected value of a Bernoulli r.v. with Pr(G = 1) = p and Pr(G = 0) = 1 − p?
Answer:

E(G) = (1 × p) + (0 × (1 − p)) = p

Note 1: Think of the operator E(.) as a function that transforms data on a variable by multiplying each value of the variable by its probability and then adding up all the products.
Note 2: The formula for the expected value of a continuous r.v. requires calculus. So we'll skip it for now.

Page 22: Statistics - Probability theory 1


Variance and standard deviation

The variance of a r.v. Y is:

var(Y) = σ²Y = E[(Y − µY)²]

The standard deviation is the positive square root of the variance:

s.d.(Y) = σY = +√(σ²Y)

Basically, the s.d. gives the same information as the variance, but in units that are easier to understand: the units of the standard deviation are the same as those of Y and µY.
What is the intuition behind the variance and/or the standard deviation?

Page 23: Statistics - Probability theory 1


Variance and standard deviation

For a discrete r.v.:

var(Y) = σ²Y = E[(Y − µY)²] = ∑_{i=1}^{k} (yi − µY)² pi

s.d.(Y) = σY = √( ∑_{i=1}^{k} (yi − µY)² pi )

E.g.: What are the var. and s.d. of the number of car repairs per month (M)?

Page 24: Statistics - Probability theory 1


Variance and standard deviation

E.g.: What are the var. and s.d. of the number of car repairs per month (M)?
Answer:

var(M) = [(0 − 0.35)² × 0.80] + [(1 − 0.35)² × 0.10] + [(2 − 0.35)² × 0.06] + [(3 − 0.35)² × 0.03] + [(4 − 0.35)² × 0.01] = 0.6475

s.d.(M) = √0.6475 ≅ 0.80
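A Python sketch of the same computation (not part of the slides), reusing the repair-count distribution:

```python
import math

def mean(values, probs):
    return sum(v * p for v, p in zip(values, probs))

def variance(values, probs):
    """var(Y) = sum of (y - mu)^2 weighted by the probabilities."""
    mu = mean(values, probs)
    return sum((v - mu) ** 2 * p for v, p in zip(values, probs))

vals  = [0, 1, 2, 3, 4]
probs = [0.80, 0.10, 0.06, 0.03, 0.01]
var_M = variance(vals, probs)    # 0.6475
sd_M  = math.sqrt(var_M)         # ≈ 0.80
```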

E.g.: What are the var. and s.d. of a Bernoulli r.v.?

SFC - [email protected] Applied Statistics for Economics 2. Introduction to Probability Theory

Page 25: Statistics - Probability theory 1


Variance

E.g.: What are the var. and s.d. of a Bernoulli r.v.?
Answer:

var(G) = [(0 − p)² × (1 − p)] + [(1 − p)² × p] = p(1 − p)

s.d.(G) = √(p(1 − p))

Note 1: Think of the operator var(.) as a function that transforms data on a variable by taking the distance or difference between each value of the variable and its mean, squaring that difference, multiplying it by the respective probability, and then adding up all the products.
Note 2: Think of the operator s.d.(.) as a function that does all of the above and then takes the positive square root of that sum.

Page 26: Statistics - Probability theory 1


Moments

More formally, in statistics, the characteristics of the distribution of a r.v. are called moments. E(Y) is the first moment, E(Y²) is the second moment, and E(Y^r) is the rth moment of Y. The first moment is the mean and it is a measure of the center of the distribution; the second moment is a measure of its dispersion or spread; and rth moments for r > 2 measure other aspects of the distribution's shape. Clearly, the second moment of the distribution is intimately related to the variance. How?

Page 27: Statistics - Probability theory 1


Moments

Two other measures of the shape of a distribution (using higher moments) are:

Skewness:

Skewness = E[(Y − µY)³] / σ³Y

For a symmetric distribution, the skewness is zero. If the distribution has a long left tail, the skewness is negative. If the distribution has a long right tail, the skewness is positive.

Kurtosis:

Kurtosis = E[(Y − µY)⁴] / σ⁴Y

For a distribution with heavy tails (outliers are likely), the kurtosis is large. For a normal distribution, the kurtosis is 3 (so the excess kurtosis, kurtosis minus 3, is zero).

Page 28: Statistics - Probability theory 1


Mean of a linear function of a r.v.

Consider the income tax schedule:

Y = 2,000 + 0.8X

where X is pre-tax earnings and Y is after-tax earnings. What is the marginal tax rate?
Suppose an individual's next-year pre-tax earnings are a r.v. with mean µX and variance σ²X. Since her pre-tax earnings are random, her after-tax earnings are random as well, with the following mean:

E(Y) = µY = 2,000 + 0.8µX

Why? Remember that the operator E(Y) means "multiply each value of Y by its probability and add up the results."

Page 29: Statistics - Probability theory 1


Variance of a linear function of a r.v.

In turn, the variance of Y is:

var(Y) = σ²Y = E[(Y − µY)²].

Since Y = 2,000 + 0.8X, then (Y − µY) = (2,000 + 0.8X) − (2,000 + 0.8µX) = 0.8(X − µX). Therefore:

E[(Y − µY)²] = E{[0.8(X − µX)]²} = 0.64 E[(X − µX)²].

That is: σ²Y = 0.64σ²X. And taking the positive square root of that number:

σY = 0.8σX
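The scaling result can be checked numerically. The pre-tax distribution below is made up purely for illustration (the slides do not specify one); only the tax schedule Y = 2,000 + 0.8X comes from the text:

```python
import math

# Hypothetical discrete distribution of pre-tax earnings X (illustrative only).
xs = [30000.0, 50000.0, 80000.0]
ps = [0.5, 0.3, 0.2]
a, b = 2000.0, 0.8            # the tax schedule Y = 2,000 + 0.8X

mu_X = sum(x * p for x, p in zip(xs, ps))
sd_X = math.sqrt(sum((x - mu_X) ** 2 * p for x, p in zip(xs, ps)))

ys   = [a + b * x for x in xs]           # after-tax earnings outcome by outcome
mu_Y = sum(y * p for y, p in zip(ys, ps))
sd_Y = math.sqrt(sum((y - mu_Y) ** 2 * p for y, p in zip(ys, ps)))
# mu_Y == a + b * mu_X  and  sd_Y == b * sd_X, as derived above.
```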

SFC - [email protected] Applied Statistics for Economics 2. Introduction to Probability Theory

Page 30: Statistics - Probability theory 1


Mean and var. of a linear function of a r.v.

More generally, if X and Y are r.v.'s related by Y = a + bX, then:

µY = a + bµX

σ²Y = b²σ²X

σY = |b|σX

Page 31: Statistics - Probability theory 1


Two random variables

We now deal with the distribution of two random variables considered together.
The joint probability distribution of two random variables X and Y is the probability that the random variables take certain values at once, Pr(X = x, Y = y).
The marginal probability distribution of a random variable Y is its own probability distribution, considered in the context of its relationship with (an)other variable(s).

Page 32: Statistics - Probability theory 1


Multi-variate distributions

The following table shows relative frequencies (probabilities):

Joint distribution of weather conditions and commuting times

                        Rain (X = 0)   No rain (X = 1)   Total
Long commute (Y = 0)    0.15           0.07              0.22
Short commute (Y = 1)   0.15           0.63              0.78
Total                   0.30           0.70              1.00

The cells show the joint probabilities. The marginal probabilities (the marginal distribution) of Y can be calculated from the joint distribution of X and Y. If X can take l different values x1, . . . , xl, then:

Pr(Y = y) = ∑_{i=1}^{l} Pr(X = xi, Y = y)

Page 33: Statistics - Probability theory 1


Conditional distribution

The conditional probability that Y takes the value y when X is known to take the value x is written Pr(Y = y | X = x). The conditional distribution of Y given X = x is:

Pr(Y = y | X = x) = Pr(X = x, Y = y) / Pr(X = x)

SFC - [email protected] Applied Statistics for Economics 2. Introduction to Probability Theory

Page 34: Statistics - Probability theory 1


Conditional mean

Consider the following table:

Joint and conditional distribution of M and A

                    M = 0   M = 1   M = 2   M = 3   M = 4   Total
Joint distribution
Old car (A = 0)     0.35    0.065   0.05    0.025   0.01    0.50
New car (A = 1)     0.45    0.035   0.01    0.005   0.00    0.50
Total               0.80    0.10    0.06    0.03    0.01    1.00
Conditional distribution
Pr(M | A = 0)       0.70    0.13    0.10    0.05    0.02    1.00
Pr(M | A = 1)       0.90    0.07    0.02    0.01    0.00    1.00

The conditional expectation of Y given X (or conditional mean of Y given X) is the mean of the conditional distribution of Y given X:

E(Y | X = x) = ∑_{i=1}^{k} yi Pr(Y = yi | X = x).

Based on the table, the expected number of car repairs given that the car is old is E(M | A = 0) = (0 × 0.70) + (1 × 0.13) + (2 × 0.10) + (3 × 0.05) + (4 × 0.02) = 0.56. The expected number of car repairs given that the car is new is E(M | A = 1) = 0.14.
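Both conditional means can be verified with a short sketch (not from the slides), using the conditional rows of the table:

```python
# Conditional distributions of M (repairs) given car age A, from the table above.
pr_M_given_old = [0.70, 0.13, 0.10, 0.05, 0.02]   # Pr(M = m | A = 0)
pr_M_given_new = [0.90, 0.07, 0.02, 0.01, 0.00]   # Pr(M = m | A = 1)

def cond_mean(probs):
    """E(M | A = a): weighted average of m = 0..4 under the conditional pmf."""
    return sum(m * p for m, p in enumerate(probs))

E_old = cond_mean(pr_M_given_old)   # 0.56
E_new = cond_mean(pr_M_given_new)   # 0.14
```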

SFC - [email protected] Applied Statistics for Economics 2. Introduction to Probability Theory

Page 35: Statistics - Probability theory 1


Law of iterated expectations

The mean height of adults is the weighted average of the mean height of men and the mean height of women, weighted by the proportions of men and women. More generally:

E(Y) = ∑_{i=1}^{l} E(Y | X = xi) Pr(X = xi).

In other terms:

E(Y) = E[E(Y | X)].

This is called the law of iterated expectations. If E(Y | X) = 0, then E(Y) = 0.
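The law can be checked on the car-repair example: averaging the conditional means of M over the distribution of A recovers the unconditional mean. A sketch (not from the slides):

```python
# E(M) = E(M | A = 0) Pr(A = 0) + E(M | A = 1) Pr(A = 1)
cond_means = {0: 0.56, 1: 0.14}   # E(M | A = a) from the table above
pr_A       = {0: 0.50, 1: 0.50}   # marginal distribution of A

E_M = sum(cond_means[a] * pr_A[a] for a in pr_A)   # ≈ 0.35
```

This matches E(M) = 0.35 computed earlier from the unconditional distribution of M.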

SFC - [email protected] Applied Statistics for Economics 2. Introduction to Probability Theory

Page 36: Statistics - Probability theory 1


Conditional variance

The variance of Y conditional on X is the variance of the conditional distribution of Y given X:

var(Y | X = x) = ∑_{i=1}^{k} [yi − E(Y | X = x)]² Pr(Y = yi | X = x).

Example.

SFC - [email protected] Applied Statistics for Economics 2. Introduction to Probability Theory

Page 37: Statistics - Probability theory 1


Independence

Two r.v.'s X and Y are independently distributed (i.e. independent) if knowing the value of one of them gives no information about the other, that is, if the conditional distribution of Y given X equals the marginal distribution of Y. Formally, X and Y are independent if, for all values x and y,

Pr(Y = y | X = x) = Pr(Y = y), or equivalently

Pr(X = x, Y = y) = Pr(X = x) Pr(Y = y)

In other words, the joint distribution of X and Y is the product of their marginal distributions.

Page 38: Statistics - Probability theory 1


Covariance

The covariance between two r.v.’s X and Y measures the extent towhich they move together. The covariance is the expected value ofthe product of the deviations of the variables from their expectedvalues. The first equation below is the general formula of thecovariance. The second equation is specific to discrete r.v.’s and itassumes that X can take on l values and Y can take on k values:

cov(X ,Y ) = σXY = E [(X − µX )(Y − µY )]

cov(X ,Y ) =k∑

i=1

l∑j=1

(xj − µX )(yi − µY ) Pr(X = xj ,Y = yi ).

Note that −∞ < σXY < +∞. How do you interpret the covarianceformula?
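On interpretation: a sketch (not from the slides) computing the covariance for the weather/commute table makes the sign concrete:

```python
# Covariance of rain (X) and commute length (Y) from the joint table above.
joint = {(0, 0): 0.15, (1, 0): 0.07, (0, 1): 0.15, (1, 1): 0.63}

mu_X = sum(x * p for (x, y), p in joint.items())   # 0.70
mu_Y = sum(y * p for (x, y), p in joint.items())   # 0.78

cov_XY = sum((x - mu_X) * (y - mu_Y) * p for (x, y), p in joint.items())
# cov_XY ≈ 0.084 > 0: 'no rain' (X = 1) tends to occur together with
# 'short commute' (Y = 1), so deviations from the means have the same sign.
```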

SFC - [email protected] Applied Statistics for Economics 2. Introduction to Probability Theory

Page 39: Statistics - Probability theory 1


Correlation

The problem with the covariance is that it is not bounded. Its size depends on the units of X and Y and is, thus, hard to interpret. The correlation between X and Y is another measure of their covariation. But, unlike the covariance, the correlation eliminates the 'units' problem. Its formula is:

corr(X, Y) = cov(X, Y) / √(var(X) var(Y)) = σXY / (σX σY)

Note that −1 ≤ corr(X, Y) ≤ 1.

Page 40: Statistics - Probability theory 1


Correlation and conditional mean

If E(Y | X = x) = E(Y) = µY, then X and Y are uncorrelated. That is,

cov(X, Y) = 0 and corr(X, Y) = 0.

This follows from the law of iterated expectations.

Page 41: Statistics - Probability theory 1


Mean and variance of sums of r.v.’s

The mean of the sum of two r.v.’s, X and Y , is the sum of their means:

E (X + Y ) = E (X ) + E (Y ) = µX + µY

The variance of the sum of X and Y is the sum of their variance plustwice their covariance:

var(X + Y ) = var(X ) + var(Y ) + 2cov(X ,Y ) = σ2X + σ2

Y + 2σXY

If X and Y are independent, the covariance is zero and the variance oftheir sum is the sum of their variances:

var(X + Y ) = var(X ) + var(Y ) = σ2X + σ2

Y

Why?
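As to "why": expanding (X + Y − µX − µY)² produces the two squared-deviation terms plus a cross term, which is where 2cov(X, Y) comes from. A sketch (not from the slides) verifying this on the weather/commute table:

```python
# Check var(X + Y) = var(X) + var(Y) + 2 cov(X, Y) on the joint table above.
joint = {(0, 0): 0.15, (1, 0): 0.07, (0, 1): 0.15, (1, 1): 0.63}

def E(f):
    """Expectation of f(X, Y) under the joint distribution."""
    return sum(f(x, y) * p for (x, y), p in joint.items())

mu_X, mu_Y = E(lambda x, y: x), E(lambda x, y: y)
var_X = E(lambda x, y: (x - mu_X) ** 2)
var_Y = E(lambda x, y: (y - mu_Y) ** 2)
cov   = E(lambda x, y: (x - mu_X) * (y - mu_Y))

var_sum = E(lambda x, y: (x + y - mu_X - mu_Y) ** 2)   # var(X + Y) directly
# var_sum equals var_X + var_Y + 2 * cov; the cross products
# (X − µX)(Y − µY) vanish from the total only when cov = 0.
```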

SFC - [email protected] Applied Statistics for Economics 2. Introduction to Probability Theory

Page 42: Statistics - Probability theory 1


Sums of r.v.’s

Let X, Y, and V be r.v.'s and a, b, and c be constants. These facts follow from the definitions of mean, variance, covariance, and correlation:

E(a + bX + cY) = a + bµX + cµY

var(a + bY) = b²σ²Y

var(aX + bY) = a²σ²X + 2abσXY + b²σ²Y

E(Y²) = σ²Y + µ²Y

cov(a + bX + cV, Y) = bσXY + cσVY

E(XY) = σXY + µX µY

|σXY| ≤ √(σ²X σ²Y)

Can you prove them?

