Econ 3790: Business and Economics Statistics

Instructor: Yogesh Uppal
Email: [email protected]

Lecture Slides 3 Measures of Variability

Measures of Distribution Shape, Relative Location, and Detecting Outliers

Introduction to probabilities

Coefficient of Variation

The coefficient of variation indicates how large the standard deviation is in relation to the mean.

The coefficient of variation is computed as follows:

$\left( \frac{s}{\bar{x}} \times 100 \right)\%$ ← for a sample

$\left( \frac{\sigma}{\mu} \times 100 \right)\%$ ← for a population

Coefficient of Variation (CV)

CV is used in comparing variability of distributions with different means.

A value of CV > 100% implies a data set with high variance. A value of CV < 100% implies a data set with low variance.
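As a rough illustration of comparing variability across distributions with different means, the sketch below computes the CV for two small data sets. The function name `coefficient_of_variation` and both rent lists are hypothetical and are not taken from the slides.

```python
import statistics

def coefficient_of_variation(data, population=False):
    """Return the coefficient of variation as a percentage."""
    mean = statistics.mean(data)
    # The sample standard deviation divides by n - 1; the population one by n.
    sd = statistics.pstdev(data) if population else statistics.stdev(data)
    return sd / mean * 100

# Hypothetical monthly rents (in dollars) for two neighborhoods.
rents_a = [425, 430, 435, 445, 450]
rents_b = [900, 1100, 1500, 2100, 2500]

cv_a = coefficient_of_variation(rents_a)
cv_b = coefficient_of_variation(rents_b)
print(f"CV of A: {cv_a:.1f}%   CV of B: {cv_b:.1f}%")
```

Because the CV is unit-free, the two neighborhoods can be compared even though their mean rents differ.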

Measures of Distribution Shape, Relative Location, and Detecting Outliers

• Distribution Shape
• z-Scores
• Detecting Outliers

Distribution Shape: Skewness

An important measure of the shape of a distribution is called skewness.

The formula for computing skewness for a data set is somewhat complex.
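The slides do not state the formula. One common choice, assumed here, is the adjusted sample skewness n / ((n - 1)(n - 2)) × Σ((x_i - x̄) / s)³, where s is the sample standard deviation. The sketch below uses that assumed formula; the function name and the three data sets are hypothetical.

```python
import statistics

def sample_skewness(data):
    """Adjusted sample skewness (an assumed formula, not given in the slides)."""
    n = len(data)
    if n < 3:
        raise ValueError("need at least three observations")
    mean = statistics.mean(data)
    s = statistics.stdev(data)  # sample standard deviation (divides by n - 1)
    cubed = sum(((x - mean) / s) ** 3 for x in data)
    return n / ((n - 1) * (n - 2)) * cubed

# Hypothetical data sets illustrating the sign of skewness.
symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 2, 2, 3, 10]
left_skewed = [-10, -3, -2, -2, -1, -1]

for name, data in [("symmetric", symmetric),
                   ("right skewed", right_skewed),
                   ("left skewed", left_skewed)]:
    print(f"{name}: skewness = {sample_skewness(data):.2f}")
```

A long right tail gives a positive value, a long left tail a negative value, and a symmetric data set a value of zero.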

Distribution Shape: Skewness

Symmetric (not skewed)
• Skewness is zero.
• Mean and median are equal.

[Figure: relative frequency histogram of a symmetric distribution; vertical axis: Relative Frequency, 0 to .35; Skewness = 0]

Distribution Shape: Skewness

Moderately Skewed Left
• Skewness is negative.
• Mean will usually be less than the median.

[Figure: relative frequency histogram of a moderately left-skewed distribution; vertical axis: Relative Frequency, 0 to .35; Skewness = -.31]

Distribution Shape: Skewness

Moderately Skewed Right
• Skewness is positive.
• Mean will usually be more than the median.

[Figure: relative frequency histogram of a moderately right-skewed distribution; vertical axis: Relative Frequency, 0 to .35; Skewness = .31]

Distribution Shape: Skewness

Highly Skewed Right
• Skewness is positive.
• Mean will usually be more than the median.

[Figure: relative frequency histogram of a highly right-skewed distribution; vertical axis: Relative Frequency, 0 to .35; Skewness = 1.25]

z-Scores

The z-score is often called the standardized score.

It denotes the number of standard deviations a data value is from the mean.

$z_i = \frac{x_i - \bar{x}}{s}$

z-Scores

• A data value less than the sample mean will have a z-score less than zero.
• A data value greater than the sample mean will have a z-score greater than zero.
• A data value equal to the sample mean will have a z-score of zero.

An observation’s z-score is a measure of the relative location of the observation in a data set.

Detecting Outliers

An outlier is an unusually small or unusually large value in a data set.

A data value with a z-score less than -3 or greater than +3 might be considered an outlier.
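The z-score rule can be sketched in a few lines. The helper names `z_scores` and `flag_outliers` and the data below are hypothetical; the sample mean and sample standard deviation are used, as in the z-score definition.

```python
import statistics

def z_scores(data):
    """Return (x_i - mean) / s for every value, with s the sample std. dev."""
    mean = statistics.mean(data)
    s = statistics.stdev(data)
    return [(x - mean) / s for x in data]

def flag_outliers(data, threshold=3.0):
    """Return the values whose z-score is below -threshold or above +threshold."""
    return [x for x, z in zip(data, z_scores(data)) if abs(z) > threshold]

# Hypothetical data: 19 typical values and one unusually large value.
values = [50] * 10 + [52] * 9 + [120]

print(flag_outliers(values))
```

A flagged value is only a candidate outlier. It should be checked for a recording error before it is removed or kept.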

Introduction to Probability

Some basic definitions and relationships of probability

Some Definitions

Experiment: a process that generates well-defined outcomes. For example, tossing a coin, rolling a die, or playing blackjack.

Sample Space: the set of all experimental outcomes. For example, the sample space for the experiment of tossing a coin is:

S = {Head, Tail}

Or for rolling a die:

S = {1, 2, 3, 4, 5, 6}

Definitions (Cont’d)

Event: a collection of outcomes or sample points. For example, if our experiment is rolling a die, we can call the occurrence of a number greater than 3 an event ‘A’.

Basic Rules of Probability

1. Probability of any outcome can never be negative or greater than 1.

2. The sum of the probabilities of all the possible outcomes of an experiment is 1.

Probability as a Numerical Measure of the Likelihood of Occurrence

[Figure: probability scale from 0 to 1, with increasing likelihood of occurrence from left to right]

• Probability near 0: the event is very unlikely to occur.
• Probability of .5: the occurrence of the event is just as likely as it is unlikely.
• Probability near 1: the event is almost certain to occur.

Example: Bradley Investments

Bradley has invested in a stock named Markley Oil.

Bradley has determined that the possible outcomes of his investment three months from now are as follows.

Investment Gain or Loss (in $000)
10
5
0
-20

Example: Bradley Investments

Experiment: Investing in stocks

Sample Space: S = {10, 5, 0, -20}

Event: Making a positive profit (let's call it ‘A’)
A = {10, 5}

What is the event for not making a loss?

Assigning Probabilities

• Classical Method: assigning probabilities based on the assumption of equally likely outcomes
• Relative Frequency Method: assigning probabilities based on experimentation or historical data
• Subjective Method: assigning probabilities based on judgment

Classical Method

Assigning probabilities based on the assumption of equally likely outcomes

If an experiment has n possible outcomes, this method would assign a probability of 1/n to each outcome.

Example

Experiment: Rolling a die

Sample Space: S = {1, 2, 3, 4, 5, 6}

Probabilities: Each sample point has a 1/6 chance of occurring

Example

Experiment: Tossing a coin

Sample Space: S = {H, T}

Probabilities: Each sample point has a 1/2 chance of occurring
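Under the classical method, the probability of an event is the number of outcomes in the event divided by n. A minimal sketch, assuming equally likely outcomes; the helper name `classical_probability` is hypothetical:

```python
from fractions import Fraction

def classical_probability(event, sample_space):
    """P(event) when every outcome in sample_space is equally likely."""
    favorable = set(event) & set(sample_space)  # ignore outcomes outside S
    return Fraction(len(favorable), len(sample_space))

die = {1, 2, 3, 4, 5, 6}
coin = {"H", "T"}

p_one_face = classical_probability({4}, die)        # a single sample point
p_heads = classical_probability({"H"}, coin)
p_greater_than_3 = classical_probability({4, 5, 6}, die)

print(p_one_face, p_heads, p_greater_than_3)
```

`Fraction` keeps the results exact, so 3/6 is reported in lowest terms as 1/2.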

Relative Frequency Method

Assigning probabilities based on experimentation or historical data

Example: Lucas Tool Rental

Lucas Tool Rental would like to assign probabilities to the number of car polishers it rents each day. Office records show the following frequencies of daily rentals for the last 40 days.

Relative Frequency Method

Number of Polishers Rented   Number of Days
0                            4
1                            6
2                            18
3                            10
4                            2

Example: Lucas Tool Rental

Each probability assignment is given by dividing the frequency (number of days) by the total frequency (total number of days).

Relative Frequency Method

Number of Polishers Rented   Number of Days   Probability
0                            4                .10  (= 4/40)
1                            6                .15
2                            18               .45
3                            10               .25
4                            2                .05
Total                        40               1.00
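The same division can be sketched in code using the Lucas Tool Rental frequencies. The variable names are illustrative only.

```python
from fractions import Fraction

# Number of polishers rented -> number of days (from the office records).
days_by_rentals = {0: 4, 1: 6, 2: 18, 3: 10, 4: 2}

total_days = sum(days_by_rentals.values())

# Relative frequency = frequency / total frequency.
probabilities = {
    rentals: Fraction(days, total_days)
    for rentals, days in days_by_rentals.items()
}

for rentals, p in probabilities.items():
    print(f"{rentals} polishers: {float(p):.2f}")
```

The probabilities sum to 1 by construction, which matches the second basic rule of probability.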

Example: Favorite Party

Party    Value   Votes   Relative Freq.
Rep      1       5       0.24
Dem      2       14      0.67
Greens   3       0       0.00
None     4       2       0.09
Total            21      1.00

Subjective Method

• When economic conditions and a company’s circumstances change rapidly, it might be inappropriate to assign probabilities based solely on historical data.
• We can use any data available as well as our experience and intuition, but ultimately a probability value should express our degree of belief that the experimental outcome will occur.
• The best probability estimates often are obtained by combining the estimates from the classical or relative frequency approach with the subjective estimate.

Some Basic Relationships of Probability

• Complement of an Event
• Intersection of Two Events
• Mutually Exclusive Events
• Union of Two Events

Complement of an Event

Complement of an event A is the event consisting of all outcomes or sample points that are not in A and is denoted by Ac.

[Venn diagram: event A and its complement Ac within sample space S]

Example: Rolling a Die

Event A: Getting a number greater than or equal to 3
A = {3, 4, 5, 6}
Ac = {1, 2}

Event B: Getting a number greater than 1, but less than 5
B = {???}
Bc = {???}

Intersection of two events

The intersection of two events A and B is the event consisting of all sample points that are in both A and B, and is denoted by A ∩ B.

[Venn diagram: events A and B, showing the intersection of A and B]

Union of two events

The union of two events A and B is the event consisting of all sample points that are in A or B or both, and is denoted by A ∪ B.

[Venn diagram: union of A and B]

Example: Rolling a Die (Cont’d)

A ∩ B = {3, 4}
A ∪ B = {2, 3, 4, 5, 6}

Let's find the following probabilities:

P(A) = Number of outcomes in A / Total number of outcomes = 4/6 = 2/3
P(B) = ?
P(A ∩ B) = ?
P(A ∪ B) = ?

Addition Law

According to the Addition Law, the probability of the event A or B or both can also be written as

$P(A \cup B) = P(A) + P(B) - P(A \cap B)$

In our rolling the die example,
P(A ∪ B) = 2/3 + 1/2 – 1/3 = 5/6
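The die example maps directly onto Python sets, which gives a quick way to check the Addition Law against a direct count. This is a sketch assuming a fair die; the helper `prob` is hypothetical.

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}                 # sample space for one roll of a die
A = {x for x in S if x >= 3}           # greater than or equal to 3
B = {x for x in S if 1 < x < 5}        # greater than 1, but less than 5

def prob(event):
    """Classical probability: outcomes in the event / total outcomes."""
    return Fraction(len(event), len(S))

complement_A = S - A                   # all sample points not in A
intersection = A & B                   # sample points in both A and B
union = A | B                          # sample points in A or B or both

p_union_direct = prob(union)
p_union_addition = prob(A) + prob(B) - prob(intersection)

print(p_union_direct, p_union_addition)
```

Subtracting P(A ∩ B) removes the outcomes 3 and 4, which would otherwise be counted twice.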

Mutually Exclusive Events

Two events are said to be mutually exclusive if, when one event occurs, the other cannot occur.

Equivalently, they do not have any sample points in common.


When Events A and B are mutually exclusive, P(A ∩ B) = 0.

The Addition Law for mutually exclusive events is

$P(A \cup B) = P(A) + P(B)$

There’s no need to include the term $- P(A \cap B)$.

Example: Mutually Exclusive Events

Suppose C is an event of getting a number less than 3 on one roll of a die.

C = {1, 2}
A = {3, 4, 5, 6}
P(A ∩ C) = 0

Events A and C are mutually exclusive.

Conditional Probability

The probability of an event (let's say A) given that another event (let's say B) has occurred is called the conditional probability of A.

It is denoted by P(A | B).

It can be computed using the following formula:

$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$

Rolling the Die Example

P(A ∩ B) = 1/3, P(A) = 2/3, P(B) = 1/2

P(A | B) = P(A ∩ B) / P(B) = (1/3)/(1/2) = 2/3

P(B | A) = P(A ∩ B) / P(A) = (1/3)/(2/3) = 1/2
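A minimal sketch of the conditional probability formula using the die example values. The function name `conditional` is hypothetical, and the guard against a zero-probability condition is an added assumption.

```python
from fractions import Fraction

def conditional(p_joint, p_condition):
    """P(X | Y) = P(X ∩ Y) / P(Y); undefined when P(Y) = 0."""
    if p_condition == 0:
        raise ValueError("cannot condition on an event with probability 0")
    return p_joint / p_condition

p_a = Fraction(2, 3)        # P(A)
p_b = Fraction(1, 2)        # P(B)
p_a_and_b = Fraction(1, 3)  # P(A ∩ B)

p_a_given_b = conditional(p_a_and_b, p_b)
p_b_given_a = conditional(p_a_and_b, p_a)

print(p_a_given_b, p_b_given_a)
```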

Multiplication Law

The multiplication law provides the way to calculate the probability of intersection of two events and is written as follows:

$P(A \cap B) = P(B) \times P(A \mid B)$

Independent Events

If the probability of an event A is not changed or affected by the existence of another event B, then A and B are independent events.

A and B are independent iff

$P(A \mid B) = P(A)$

OR

$P(B \mid A) = P(B)$

Multiplication Law for Independent Events

In case of independent events, the Multiplication Law is written as

$P(A \cap B) = P(A) P(B)$


So there are two ways of checking whether two events are independent or not:

1. Conditional Probability Method:

P(A | B) = 2/3 = P(A)
P(B | A) = 1/2 = P(B)

A and B are independent.

2. The second way is using the Multiplication Law for independent events.

P(A ∩ B) = 1/3
P(A) = 2/3
P(B) = 1/2

P(A) · P(B) = 1/3

Since P(A ∩ B) = P(A) · P(B), A and B are independent events.
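Both checks can be written as small functions. The sketch below applies them to events A and B from the die example and, for contrast, to the mutually exclusive events A and C, where P(C) = 2/6 = 1/3 follows from C = {1, 2}. The function names are hypothetical.

```python
from fractions import Fraction

def independent_by_conditional(p_a, p_b, p_joint):
    """Method 1: independent if P(A | B) = P(A). Requires P(B) > 0."""
    return p_joint / p_b == p_a

def independent_by_multiplication(p_a, p_b, p_joint):
    """Method 2: independent if P(A ∩ B) = P(A) P(B)."""
    return p_joint == p_a * p_b

# Events A and B from the die example.
p_a, p_b, p_a_and_b = Fraction(2, 3), Fraction(1, 2), Fraction(1, 3)

# Events A and C from the mutually exclusive example: P(A ∩ C) = 0.
p_c, p_a_and_c = Fraction(1, 3), Fraction(0)

print(independent_by_conditional(p_a, p_b, p_a_and_b),
      independent_by_multiplication(p_a, p_b, p_a_and_b))
print(independent_by_conditional(p_a, p_c, p_a_and_c),
      independent_by_multiplication(p_a, p_c, p_a_and_c))
```

The second pair shows that mutually exclusive events with nonzero probabilities are not independent: knowing that C occurred rules out A.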

Education and Income Data

Highest Grade Completed   Annual Income
                          <$25k    $25k-50k   >$50k    Total
Not HS Grad               19638    4949       1048     25635
HS Grad                   34785    25924      10721    71430
Bachelor’s                10081    13680      17458    41219
Total                     64504    44553      29227    138284

There are two experiments here:

1. Highest Grade Completed.
S1 = {not HS grad, HS grad, Bachelor’s}

2. Annual Income.
S2 = {<$25K, $25K-50K, >$50K}

What does each cell represent in the above crosstab?


Education and Income Data

Highest Grade Completed   Annual Income
                          <$25k                  $25k-50k               >$50k                  Total
Not HS Grad               19638/138284 = 0.14    4949/138284 = 0.04     1048/138284 = 0.01     25635/138284 = 0.19
HS Grad                   34785/138284 = 0.25    25924/138284 = 0.19    10721/138284 = 0.08    71430/138284 = 0.52
Bachelor’s                10081/138284 = 0.07    13680/138284 = 0.10    17458/138284 = 0.13    41219/138284 = 0.30
Total                     64504/138284 = 0.47    44553/138284 = 0.32    29227/138284 = 0.21    138284/138284 = 1.00

P(Bachelor’s) = P(Bachelor’s and <$25K) + P(Bachelor’s and $25K-50K) + P(Bachelor’s and >$50K)
= 0.07 + 0.10 + 0.13 = 0.30

P(>$50K) = P(Not HS and >$50K) + P(HS grad and >$50K) + P(Bachelor’s and >$50K)
= 0.01 + 0.08 + 0.13 ≈ 0.21 (the rounded cell values sum to 0.22; the unrounded total, 29227/138284, rounds to 0.21)
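The joint and marginal probabilities can also be computed directly from the counts, which avoids the small drift that comes from adding rounded cell values. A sketch using the crosstab counts; the dictionary layout and variable names are illustrative only.

```python
# Counts from the education and income crosstab.
counts = {
    "Not HS Grad": {"<$25k": 19638, "$25k-50k": 4949, ">$50k": 1048},
    "HS Grad": {"<$25k": 34785, "$25k-50k": 25924, ">$50k": 10721},
    "Bachelor's": {"<$25k": 10081, "$25k-50k": 13680, ">$50k": 17458},
}

total = sum(sum(row.values()) for row in counts.values())

# Joint probability of each (education, income) cell.
joint = {
    edu: {inc: n / total for inc, n in row.items()}
    for edu, row in counts.items()
}

# Marginal probabilities: sum the joint probabilities across a row or a column.
marginal_edu = {edu: sum(row.values()) for edu, row in joint.items()}
income_levels = list(next(iter(counts.values())))
marginal_income = {
    inc: sum(joint[edu][inc] for edu in joint) for inc in income_levels
}

print(f"P(Bachelor's) = {marginal_edu[chr(66) + 'achelor' + chr(39) + 's']:.2f}")
print(f"P(>$50K) = {marginal_income['>$50k']:.2f}")
```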



Let's define an event A as the event of making >$50K.

A = {>$50K}
P(A) = 0.21

Let's define another event B as the event of having a HS degree.

B = {HS Grad}
P(B) = 0.52

Rules of Probability

A and B is an event of having an income >$50K and being a HS graduate:

P(A and B) = 0.08

A or B is an event of having an income >$50K or being a HS graduate or both:

P(A or B) = P(A) + P(B) – P(A and B) = 0.21 + 0.52 – 0.08 = 0.65

Event of making >$50K given the event of being a HS graduate:

P(A | B)= P(A and B) / P(B) =0.08/ 0.52 = 0.15

Are A and B independent?

1. P(A | B) = 0.15 ≠ P(A) = 0.21
2. P(A and B) = 0.08 ≠ P(A) × P(B) = 0.21 × 0.52 = 0.11

→ A and B are not independent.
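These rules can be checked against the raw counts. The sketch below uses exact fractions, so the independence test does not depend on rounding; the variable names are illustrative only.

```python
from fractions import Fraction

TOTAL = 138284
n_a = 29227        # income > $50K (column total)
n_b = 71430        # HS Grad (row total)
n_a_and_b = 10721  # HS Grad and income > $50K (single cell)

p_a = Fraction(n_a, TOTAL)
p_b = Fraction(n_b, TOTAL)
p_a_and_b = Fraction(n_a_and_b, TOTAL)

p_a_or_b = p_a + p_b - p_a_and_b       # Addition Law
p_a_given_b = p_a_and_b / p_b          # conditional probability
independent = p_a_and_b == p_a * p_b   # Multiplication Law check

print(f"P(A or B) = {float(p_a_or_b):.2f}")
print(f"P(A | B)  = {float(p_a_given_b):.2f}")
print(f"Independent? {independent}")
```

With sample data, exact equality is a strict test. In practice, a sizable gap between P(A | B) and P(A), as here, is what signals dependence.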

Are Annual Income and Highest Grade Completed independent?



The probability of any event is the sum of probabilities of its sample points.

E.g., let's define an event C as the event of having at least a HS degree.

C = {HS Grad, Bachelor’s}
P(C) = P(HS Grad) + P(Bachelor’s) = 0.52 + 0.30 = 0.82