+ All Categories
Home > Documents > MBA Statistics 51-651-00

MBA Statistics 51-651-00

Date post: 25-Feb-2016
Category:
Upload: darva
View: 34 times
Download: 1 times
Share this document with a friend
Description:
MBA Statistics 51-651-00. http://www.hec.ca/sites/cours/51-651-02/. What is statistics?. - PowerPoint PPT Presentation
Popular Tags:
67
MBA Statistics 51-651-00 http://www. hec .ca/sites/ cours /51-651-02/
Transcript
Page 2: MBA Statistics 51-651-00

2

What is statistics?"I like to think of statistics as the science of learning from data... Statistics is essential for the proper running of government, central to decision making in industry, and a core component of modern educational curricula at all levels."

Jon Kettenring ASA President, 1997

Page 3: MBA Statistics 51-651-00

3

What is statistics? American Heritage Dictionary® defines

statistics as: "The mathematics of the collection, organization,

and interpretation of numerical data, especially the analysis of population characteristics by inference from sampling.« 

The Merriam-Webster’s Collegiate Dictionary® definition is: "A branch of mathematics dealing with

the collection, analysis, interpretation, and presentation of masses of numerical data."

Page 4: MBA Statistics 51-651-00

4

Course syllabus Variation. Sampling and estimation. Decision making from statistical

inference. Qualitative data analysis. Simple and multiple linear regression. Forecasting. Statistical process control. Revision.

Page 5: MBA Statistics 51-651-00

5

EVALUATION Teamwork: 40%

Final exam: 60%

Page 6: MBA Statistics 51-651-00

COURSE # 1

Variation, sampling and estimation.

Page 7: MBA Statistics 51-651-00

7

Variation"The central problem in management

and in leadership ... is failure to understand the information in variation"

W. Edwards Deming

Page 8: MBA Statistics 51-651-00

8

Variation"Management takes a major step

forward when they stop asking you to explain random variation"

F. Timothy Fuller

Page 9: MBA Statistics 51-651-00

9

Variation"Failure to understand variation is a

central problem of management"

Lloyd S Nelson

Page 10: MBA Statistics 51-651-00

10

Airport ImmigrationOfficer Passengers

processed Alan 9 Barbara 10 Colin 4 Dave 8 Enid 6 Frank 14

Page 11: MBA Statistics 51-651-00

11

Airport Immigration Management expected their officers to

process 10 passengers during this period The immigration services manager, in

reviewing these figures, was concerned about the performance of

Colin thinking how best to reward Frank

Page 12: MBA Statistics 51-651-00

12

Debt Recovery When the amount of recovered debt is

much lower than the target recovery level of 80%, the General Manager visits all the District Offices in New Zealand to remind managers of the importance of customers paying on time

What do you think of the GM policy? What would you do?

Page 13: MBA Statistics 51-651-00

13

Budget Deviations Budget deviations measure the

difference between the amount budgeted and the actual amount, expressed as a percentage of the budgeted amount.

The aim is to have a zero deviation. Most of the variation lies between -3% and 4%.

Page 14: MBA Statistics 51-651-00

Illustration of variation

Excel program:beads.xls (Deming)

The red balls are associated with defective products.

Five times a day, 5 technicians select a sample of 50 beads and counts the number of defectives (red).

Page 15: MBA Statistics 51-651-00

15

Beads History – 17 July 2000

0

5

10

15

20

Num

ber o

f red

bea

ds

Page 16: MBA Statistics 51-651-00

16

Beads History - 9 March 2000

0

5

10

15

20

Num

ber o

f red

bea

ds

Page 17: MBA Statistics 51-651-00

17

Beads History - 8 March 2001

0

5

10

15

20

Num

ber o

f red

bea

ds

Page 18: MBA Statistics 51-651-00

18

Beads History - 5 March 1999

0

5

10

15

20

Num

ber o

f red

bea

ds

Page 19: MBA Statistics 51-651-00

19

Beads History - 19 July 1996

0

5

10

15

20

Num

ber o

f red

bea

ds

Page 20: MBA Statistics 51-651-00

20

Beads History - 8 March 1996

0

5

10

15

20

Num

ber o

f red

bea

ds

Page 21: MBA Statistics 51-651-00

21

Beads History - 10 March 1995

0

5

10

15

20

Num

ber o

f red

bea

ds

Page 22: MBA Statistics 51-651-00

22

Beads History - 6 March 1998

0

5

10

15

20

Num

ber o

f red

bea

ds

Page 23: MBA Statistics 51-651-00

23

Beads History: 27 Experiments

0

5

10

15

20

25

Num

ber o

f red

bea

ds

Page 24: MBA Statistics 51-651-00

24

Beads Averages

0

5

10

15

1 3 5 7 9 11 13 15 17 19 21 23 25 27

Num

ber o

f red

bea

ds

Page 25: MBA Statistics 51-651-00

25

Discussion What is the main difference between

the graph of the 25x27=675 draws and the graph of the 27 averages?

Page 26: MBA Statistics 51-651-00

26

Two approaches in management Fire-fighting

Scientific

Page 27: MBA Statistics 51-651-00

27

Fire-fighting approach

Problem Solution

Page 28: MBA Statistics 51-651-00

28

Scientific Approach

SolutionProblem

Cause

Page 29: MBA Statistics 51-651-00

29

Scientific Approach Making decisions based on data

rather than hunches. Looking for the root causes of

problems rather than reacting to superficial symptoms.

Seeking permanent solutions rather than quick fixes.

Page 30: MBA Statistics 51-651-00

30

The Need for Data To understand the process To determine priorities To establish relationships To monitor the process To eliminate causes of variation

Page 31: MBA Statistics 51-651-00

31

The steps of statistical analysis involve:1. Planning the collection of

information2. Collecting information3. Evaluating information4. Drawing conclusions

Page 32: MBA Statistics 51-651-00

32

Surveys: Collect information from a carefully

specified sample and extend the results to an entire population.

Sample surveys might be used to: Determine which political candidate is

more popular Discover what foods teenagers prefer

for breakfast Estimate the number of potential clients

Page 33: MBA Statistics 51-651-00

33

Sampling Definitions

Population

Sample

Parameter

Statistic

choose

calculate

estimate

Page 34: MBA Statistics 51-651-00

34

Government Operations: Conduct experiments to aid in the

development of public policy and social programs.

Such experiments include: consumer prices; fluctuations in the economy; employment patterns; population trends.

Page 35: MBA Statistics 51-651-00

35

Scientific Research: Statistical sciences are used to enhance

the validity of inferences in: radiocarbon dating to estimate the risk of

earthquakes; clinical trials to investigate the

effectiveness of new treatments; field experiments to evaluate irrigation

methods; measurements of water quality; psychological tests to study how we reach

the everyday decisions in our lives.

Page 36: MBA Statistics 51-651-00

36

Business and Industry: predict the demand for products and

services; check the quality of items

manufactured in a facility; manage investment portfolios; forecast how much risk activities

entail, and calculate fair and competitive insurance rates.

Page 37: MBA Statistics 51-651-00

37

Sampling Our knowledge, our attitudes and

our actions are mainly based on samples.

For example, a person’s opinion of an institution or a company which makes thousands of transactions every day is often determined by only one or two meetings with this institution.

Page 38: MBA Statistics 51-651-00

38

Census vs Sample Census = reality (True or false?!) The information needed is available

for all individuals of the study population.

Sample = estimation of the reality The information needed is only

available for a subset of the individuals of the study population.

Page 39: MBA Statistics 51-651-00

39

Advantages of a sample Reduced costs Accrued speed Offers more possibilities

in some cases it may be impossible to have a census (ex: quality control)

Perhaps more precise!Cases where highly qualified personal are necessary for collecting data

Page 40: MBA Statistics 51-651-00

40

Probabilistic vs non probabilistic samples

s im ple random system atic stra tified cluster(area)

Probabilist

on a vo luntary basis purposive

Non probabilist

Sam pling m ethods

Page 41: MBA Statistics 51-651-00

41

Sampling errors Random error

• different samples will produce different estimates of the study population characteristics

Systematic error - bias• non probabilistic sample• probabilistic sample with a high rate of

non respondents• biased instrument of measure

Page 42: MBA Statistics 51-651-00

42

TV Show Poll - March 1998 Should Hamilton be renamed

Waikato City? 4400 dialled the 0900 number 73% were against the change

What type of sample was taken? What conclusions would you draw?

Page 43: MBA Statistics 51-651-00

43

Bias vs variability Bias is a systematic error, in the

same direction, of successive estimations of a parameter.

Large variability means that repeated values of estimations are scattered; the results of successive sampling cannot be reproduced.

(see …)

Page 44: MBA Statistics 51-651-00

44

a) large bias, low variability b) low bias, high variability

c) large bias, high variability d) low bias, low variability

Page 45: MBA Statistics 51-651-00

45

Bias due to non-response

Bias is often caused by non-response in surveys. For example, suppose that the population is

divided in two groups : respondents (60%) and non-respondents (40%).

Within respondents, 65% are in favour of a project et within non-respondents, 20% are in favour.

The real proportion in the population in favour of the project is p = 47% , while a survey will give an estimation of p at about 65% 47%. The bias is 18%.

Page 46: MBA Statistics 51-651-00

46

How do we make a simple random sample drawing?

1. We need a list. Each element of the population is assigned a number from 1 to N.

2. We use a computer program to select n numbers as randomly as possible (ex: Excel, MINITAB, SAS, C).

3. The corresponding elements form the sample.

Page 47: MBA Statistics 51-651-00

48

Notes : The results obtained depend on the sample

taken. If the samples are taken according to codes of

practice, the results should all be similar. For a simple random draw, each individual of

the population is as likely to be selected at each draw.

For a simple random draw, there are many different possible samples. All possible samples of the same size have the same chance of being selected.

Page 48: MBA Statistics 51-651-00

49

Opinion polls The results obtained in a

probabilistic sample will be used to generalize the entire population.

But the fact of using a sample necessarily induces a margin of error that we will try to control.

We will distinguish two types of data: qualitative and quantitative.

Page 49: MBA Statistics 51-651-00

50

Types of data Qualitative (measurement scale: nominal or ordinal)

(parameter: %)Examples:

• sex (F, M)• political party (PLQ, PQ, ADQ)• preferred brand (Coke, Pepsi, Homemade brand, …)• satisfaction level (Likert scale from 1 to 5)

Quantitative (measurement scale: interval or ratio) (parameter: mean)Examples:

• age• income• temperature (in degrees Celsius)

Page 50: MBA Statistics 51-651-00

51

Case study

Data in credit.xls represent the credit balance and the total income of 100 randomly chosen families in Quebec.

What is the mean credit balance for a family in Quebec? What is the precision (margin of error) of your estimate?

What about a Canadian family? Assuming that 2 500 000 families use at least one

credit card regularly, what is the total debt of families in Quebec? What is the precision of the estimate?

Page 51: MBA Statistics 51-651-00

52

Confidence intervals

To estimate the proportion p of individuals with a given characteristic among the population,

or to estimate the mean of a given quantitative variable,

one uses a confidence interval at the (1- ) level.

Page 52: MBA Statistics 51-651-00

53

Confidence intervals (continued)

It consists of constructing an interval of values which enables one to affirm, with a certain level of confidence (in general: 90%, 95% or 99%), that the true value of the parameter for the population, is included in this interval.

Illustration: Confidence interval applet

Page 53: MBA Statistics 51-651-00

54

Confidence interval for estimating a proportion p Example: In a sample of 125 college

students who were questioned on their intentions to vote in the next election, 45 answered positively.

Estimate, in a specific way, the proportion of the entire student population of this institution who intend to vote at the next elections.

Page 54: MBA Statistics 51-651-00

55

Confidence interval for estimating a proportion (continued) Excel program: proportion1.xls

In general, if the sample size n is large enough, the (1 - ) confidence interval to estimate the true proportion p of the studied characteristic in the population is given by:

; - 1 ] z Z z- P[

as,such is on table,distributi normal in the read , z

/2/2

/2

. 1.96 z then 5%, if /2

Page 55: MBA Statistics 51-651-00

56

Example (continued) : Consequently, a confidence interval of 95%

certainty for the proportion of the entire student population of this institution who intend to vote at the next election is given by:

Page 56: MBA Statistics 51-651-00

57

Example (continued) : How would we report the results of

this survey in the student newspaper of this college?

36% of the students of this college intend to exercise their voting rights at the next student election. The margin of error is 8.4% with a 95% degree of confidence (or with 95% certainty or 19 times out of 20).

Page 57: MBA Statistics 51-651-00

58

Notes:This is an approximate formula and applies only for large samples.If all possible random samples of size n are taken and their 95% confidence interval calculated, 95% of them will include the true proportion p of the population, and thus 5% will not include it.

The quantity is called the margin of error and is used to establish the 95% confidence level (19 times out of 20).

Page 58: MBA Statistics 51-651-00

59

Margin of error at the 95% level

p (%)100 300 500 1000 3000 10000

10 5,9 3,4 2,6 1,9 1,1 0,620 7,8 4,5 3,5 2,5 1,4 0,830 9,0 5,2 4,0 2,8 1,6 0,940 9,6 5,5 4,3 3,0 1,8 1,050 9,8 5,7 4,4 3,1 1,8 1,060 9,6 5,5 4,3 3,0 1,8 1,070 9,0 5,2 4,0 2,8 1,6 0,980 7,8 4,5 3,5 2,5 1,4 0,890 5,9 3,4 2,6 1,9 1,1 0,6

size of the sample n

sample size n and the value of p at the 95% level

Page 59: MBA Statistics 51-651-00

60

Margin of error at the 90% level

p (%)100 300 500 1000 3000 10000

10 4,9 2,8 2,2 1,6 0,9 0,520 6,6 3,8 2,9 2,1 1,2 0,730 7,5 4,4 3,4 2,4 1,4 0,840 8,1 4,7 3,6 2,5 1,5 0,850 8,2 4,7 3,7 2,6 1,5 0,860 8,1 4,7 3,6 2,5 1,5 0,870 7,5 4,4 3,4 2,4 1,4 0,880 6,6 3,8 2,9 2,1 1,2 0,790 4,9 2,8 2,2 1,6 0,9 0,5

sample size n

Margin of error or precision (in %) according to the sample size n and the value of p at the 90% level

Page 60: MBA Statistics 51-651-00

61

Calculation of size n to ensure a maximum margin of error If we want to estimate the

proportion p at a (1-) confidence level with a maximum margin of error e, then we have the following relation for the calculation of the sample size n :

Page 61: MBA Statistics 51-651-00

62

Discussion Take a look at the following survey:

Survey on California recall election

In the light of the recent results, what can you say about the survey.

Page 62: MBA Statistics 51-651-00

63

Confidence interval for estimating the mean

In a general way, if the sample size n is large enough, the (1 - ) confidence interval for estimating the true mean of the population, is given by:

. - 1 ] t T t- P[

as,such and , on tabledistributiStudent in the read , t

1-n /2,1-n /2,

1-n /2,

Page 63: MBA Statistics 51-651-00

64

Notes: This is an approximate formula that applies

for small samples when the characteristic is has a normal distribution or from a large sample (n ≥ 30).

When n is very large (n ≥ 100), the values of t/2, n-1 and z/2 coincide.

Excel: Tools/Data Analysis/Descriptivre Statistics or mean-1.xls

Page 64: MBA Statistics 51-651-00

65

Notes: (continued)Interpretation of a 95% confidence interval for the mean of a characteristic in the population:

If all the random samples of size n are taken and their confidence intervals calculated, 95% of them will include the true mean of the population, and thus 5% will not include it.Recall the confidence interval applet.

Page 65: MBA Statistics 51-651-00

66

Confidence interval for :Example

In order to know the weekly average cost of the grocery basket for a family of 4 people residing in Sherbrooke, we take a sample of 50 of these families and we note the amount of their grocery for this week. We obtain an average amount of 155$ with a standard deviation estimate of 15$.

Page 66: MBA Statistics 51-651-00

67

Example (continued) :Estimate the current average cost of the grocery basket for a family of 4 people residing in Sherbrooke using a 95% confidence interval:

By stating that the current average cost of the grocery basket of a family of 4 people residing at Sherbrooke is included in the interval [150.74$; 159.26$], I am 95% certain to be right. My prediction will be true 95% of time.

Page 67: MBA Statistics 51-651-00

68

Example A company wants to commercialize a new software to get

rid of junk mail. The potential market is 800 000 consumers.

  Before starting selling the product, the company realized

a survey from a random sample of 40 families. Six families were interested in buying the new software.

  The net gain is 3$ per software and there are fixed costs

of 50 000$.   What is the decision?


Recommended