Mb40 Assignment Set 2003 Version

ASSIGNMENT SET 1

Master of Business Administration-MBA Semester 1

SUBJECT - MB0040 – STATISTICS FOR MANAGEMENT

1. (a) ‘Statistics is the backbone of decision-making’. Comment.

(b) Give plural meaning of the word Statistics?

Ans. (a) Due to advanced communication network, rapid changes in consumer behavior, varied

expectations of variety of consumers and new market openings, modern managers have a difficult task

of making quick and appropriate decisions. Therefore, there is a need for them to depend more upon

quantitative techniques like mathematical models, statistics, operations research and econometrics.

Decision making is a key part of our day-to-day life. Even when we wish to purchase a television, we like

to know the price, quality, durability, and maintainability of various brands and models before buying

one. As you can see, in this scenario we are collecting data and making an optimum decision. In other

words, we are using statistics. Again, suppose a company wishes to introduce a new product, it has to

collect data on market potential, consumer likings, availability of raw materials, feasibility of producing

the product. Hence, data collection is the back-bone of any decision making process. Many organizations

find themselves data-rich but poor in drawing information from it. Therefore, it is important to develop

the ability to extract meaningful information from raw data to make better decisions. Statistics plays an

important role in this respect. Statistics is broadly divided into two main categories:

Descriptive statistics: descriptive statistics is used to present a general description of data, summarized

quantitatively. This is mostly useful in clinical research, when communicating the results of

experiments.

Inferential statistics: inferential statistics is used to make valid inferences from the data which are

helpful in effective decision making for managers or professionals. Statistical methods such as

estimation, prediction and hypothesis testing belong to inferential statistics. The researchers make

deductions or conclusions from the collected data sample regarding the characteristics of large

population from which the samples are taken.

(b) Statistics is usually and loosely defined as:

1. A collection of numerical data that measure something.

2. The science of recording, organizing, analyzing and reporting quantitative information.

Professor A.L.Bowley gave several definitions of Statistics. He defined statistics as:

1. The science of counting

2. The science of averages

3. The science of measurement of social phenomenon, regarded as a whole in all its

manifestations.

4. A subject not confined to any one science

According to Horace Secrist, ‘statistics may be defined as the aggregate of facts affected to a marked

extent by multiplicity of causes, numerically expressed, enumerated or estimated according to a

reasonable standard of accuracy, collected in a systematic manner, for a predetermined purpose and

placed in relation to each other.’ This definition is both comprehensive and exhaustive.

Prof Boddington, on the other hand, defined statistics as ‘the science of estimates and probabilities’.

This definition is also not complete.

According to Croxton and Cowden, ‘Statistics is the science of collection, presentation, analysis and

interpretation of numerical data from logical analysis’. Thus statistics can be defined in different ways.

2. (a). In a bivariate data on ‘x’ and ‘y’, variance of ‘x’ = 49, variance of ‘y’ = 9 and

covariance (x,y) = -17.5. Find coefficient of correlation between ‘x’ and ‘y’.

(b). Enumerate the factors which should be kept in mind for proper planning.

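Ans. (a) We know that the coefficient of correlation is

$r = \frac{\mathrm{Cov}(x, y)}{\sigma_x \, \sigma_y}$

Given that $\mathrm{Cov}(x, y) = -17.5$,

$\sigma_x = \sqrt{49} = 7$ and $\sigma_y = \sqrt{9} = 3$.

Therefore $r = \frac{-17.5}{7 \times 3} = -0.833$. Hence there is a high negative correlation between ‘x’ and ‘y’.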
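
As a quick check of the arithmetic, the calculation can be sketched in Python. This is only an illustration using the figures given in the question; the function name `correlation_from_moments` is an assumption made for this sketch and is not part of the original answer.

```python
import math


def correlation_from_moments(cov_xy: float, var_x: float, var_y: float) -> float:
    """Coefficient of correlation from the covariance and the two variances."""
    # r = Cov(x, y) / (sigma_x * sigma_y), where sigma is the square root of the variance
    return cov_xy / (math.sqrt(var_x) * math.sqrt(var_y))


# Figures from the question: Var(x) = 49, Var(y) = 9, Cov(x, y) = -17.5
r = correlation_from_moments(cov_xy=-17.5, var_x=49.0, var_y=9.0)
print(round(r, 3))
```

The sign of r always follows the sign of the covariance, because the two standard deviations in the denominator are positive.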

(b) The relevance and accuracy of data obtained in a survey depends upon the care exercised in

planning. A properly planned investigation can lead to the best results with the least cost and time. The following

factors must be kept in mind while planning:

1. Nature of the problem to be investigated should be clearly defined in an unambiguous manner.

2. Objectives of investigation should be stated at the outset. Objectives could be to :

Obtain certain estimates

Establish a theory

Verify an existing statement

Find relationship between characteristics

3. The scope of investigation has to be made clear. The scope of investigation refers to the area to be

covered, the identification of units to be studied, the nature of characteristics to be observed, the accuracy of

measurements, the analytical methods, and the time, cost and other resources required.

4. Whether to use data collected from primary or secondary source should be determined in advance.

5. The organization of investigation is the final step in the process. It encompasses the determination of

the number of investigators required, their training, supervision work needed, funds required.

3. The percentage sugar content of Tobacco in two samples was represented in table

11.11. Test whether their population variances are same.

Table 1. Percentage sugar content of Tobacco in two samples

Sample A 2.4 2.7 2.6 2.1 2.5

Sample B 2.7 3.0 2.8 3.1 2.2 3.6

Ans. Number of samples k = 2

Total number of observations N = 5 + 6 = 11

Degrees of freedom in the numerator = k – 1 = 1

Degrees of freedom in the denominator = N – k = 11 – 2 = 9

T = sum of all observations = 2.4+2.7+2.6+2.1+2.5+2.7+3.0+2.8+3.1+2.2+3.6 = 29.7

Correction factor = T²/N = 882.09/11 = 80.19

SST = sum of squares of all observations – T²/N

= 82.01 – 80.19

= 1.82

SSC = sum over the samples of (sample total)²/(sample size) – T²/N

= (12.3)²/5 + (17.4)²/6 – 80.19

= 30.258 + 50.46 – 80.19

= 0.528

Sum of squares of the error SSE = SST – SSC = 1.82 – 0.528

= 1.292

Variance between samples MSC = SSC/(k – 1) = 0.528/1 = 0.528

Variance within the samples MSE = SSE/(N – k) = 1.292/9 = 0.1436

Therefore, F = MSC/MSE = 0.528/0.1436 = 3.68

The table value of F at the 5% level of significance for DF (1, 9) is 5.12.

Calculated F < F tab, hence H0 is accepted.

Therefore there is no significant difference between the two populations at the 5% level. This analysis compares the sample means. A direct test of equal variances uses the ratio of the sample variances, F = 0.216/0.053 = 4.08 with (5, 4) degrees of freedom, which is also below the 5% table value of 6.26, so the hypothesis of equal population variances is likewise accepted.
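
As a cross-check on the arithmetic for the two tobacco samples, the sketch below computes both the one-way analysis of variance F statistic and the ratio of the two sample variances. It is a minimal illustration using only the Python standard library; the function names are assumptions made for this sketch, and the table values of F still have to be read from an F table.

```python
from statistics import mean, variance

sample_a = [2.4, 2.7, 2.6, 2.1, 2.5]
sample_b = [2.7, 3.0, 2.8, 3.1, 2.2, 3.6]


def one_way_anova_f(groups):
    """F = (between-sample variance) / (within-sample variance)."""
    all_values = [x for g in groups for x in g]
    k, n_total = len(groups), len(all_values)
    grand_mean = mean(all_values)
    # Sum of squares between samples and within samples
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


def variance_ratio_f(a, b):
    """Larger sample variance divided by the smaller (n - 1 in the denominator)."""
    var_a, var_b = variance(a), variance(b)
    return max(var_a, var_b) / min(var_a, var_b)


anova_f = one_way_anova_f([sample_a, sample_b])
ratio_f = variance_ratio_f(sample_a, sample_b)
print(round(anova_f, 2), round(ratio_f, 2))
```

A sum of squares can never be negative, so a negative value at any stage of the hand calculation signals an arithmetic slip.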

4. a. Explain the characteristics of business forecasting.

b. Differentiate between prediction, projection and forecasting.

Ans. (a) Business forecasting refers to the analysis of past and present economic conditions with the

object of drawing inferences about probable future business conditions. The Characteristics of business

forecasting are as follows:

1. Based on past and present conditions – Business forecasting is based on past and present

economic condition of the business. To forecast the future, various data, information and facts

concerning the past and present economic condition of the business are analyzed.

2. Based on mathematical and statistical methods – the process of forecasting includes the

use of statistical and mathematical methods. By using these methods, the actual trend which

may take place in future can be forecasted.

3. Period – Forecasts can be made for the long term, short term, medium term or any

specific period.

4. Estimation of future – Business forecasting estimates the probable future

economic conditions.

5. Scope – the forecasting can be physical as well as financial.

(b) A prediction is an estimate based solely on past data of the series under investigation. It is purely

mechanical extrapolation.

A projection is a prediction where the extrapolated values are subject to certain numerical assumptions.

A forecast is an estimate which relates the series in which we are interested to external factors.

5. What are the components of time series? Bring out the significance of moving

average in analyzing a time series and point out its limitations.

Ans. The behavior of a time series over periods of time is called the movement of the time series. The

time series is classified into the following four components:

1. Long term trend or secular trend

2. Seasonal variations

3. Cyclic variations

4. Random variations

Long term trend or secular trend – This refers to the smooth or regular long term growth

or decline of the series. This movement can be characterized by a trend curve. If this curve is a

straight line, then it is called a trend line. If the variable is increasing over a long period of time,

then it is called an upward trend. If the variable is decreasing over a long period of time, then it

is called a downward trend. If the variable moves up or down along a straight line then it is called

a linear trend, otherwise it is called a non-linear trend.

Seasonal variations – Variations in a time series that are periodic in nature and occur

regularly over short periods of time during a year are called seasonal variations. By definition,

these variations are precise and can be forecasted. The following are examples of seasonal

variations in a time series: (1) the prices of vegetables drop after the rainy season or in the winter

months and go up during summer, every year; (2) the prices of cooking oil fall after the

harvesting of oil seeds and go up after some time.

Cyclic variations – the long-term oscillations that represent consistent rises and declines

in the values of the variable are called cyclic variations. Since these are long-term oscillations in

the time series, the period of oscillation is usually greater than one year. The oscillations are

about a trend curve or a trend line. The period of one cycle is the time-distance between two

successive peaks or two successive troughs.

Random variations – random variations are called irregular movements. Movements that

occur usually in brief periods of time, without any pattern and which are unpredictable in nature

are called irregular movements. These movements do not have any regular period or time of

occurrences. Examples include the effects of national strikes, floods and earthquakes. It is very

difficult to study the behavior of such a time series.

(b) Moving average method is one of the methods of measuring trend. This method is used for

smoothing the time series. That is, it smoothes the fluctuations of the data by the method of moving

averages. It is one of the simple methods to measure trends. This method is objective in the sense that

anybody working on a problem with this method will get the same results. This method is used for

determining seasonal, cyclic and irregular variations besides the trend values. It is flexible enough to add

more figures to the data because the entire calculations are not changed. If the period of moving

averages coincides with the period of cyclic fluctuations in the data, such fluctuations are automatically

eliminated.

Limitations –

There is no functional relationship between the values and time. Thus, this method is not helpful in

forecasting and predicting values on the basis of time. No trend values are obtained for some years at the

beginning and some at the end. For example, for a five-yearly moving average, there will be no trend

values for the first two years and the last two years. In the case of a non-linear trend, the values obtained by

this method are biased in one direction or the other. Selecting the period of the moving average is a

difficult task. Hence, great care has to be taken in period selection, particularly when there is no

business cycle during that time.

6. a. List down various measures of central tendency and explain the difference

between them?

b. What is a confidence interval, and why is it useful? What is a confidence level?

Ans. (a) Graphical representation is a good way to represent summarized data. However, graphs provide us

only an overview and thus may not be used for further analysis. Hence, we use summary statistics like

computing averages to analyze the data. Mass data, which is collected, classified, tabulated and

presented systematically, is analyzed further to bring its size to a single representative figure. This single

figure is the measure which can be found at central part of the range of all values. It is the one which

represents the entire data set. Hence, this is called the measure of central tendency. The tendency of

data to cluster around a figure which is in central location is known as central tendency. Measure of

central tendency or average of first order describes the concentration of large numbers around a

particular value. It is a single value which represents all units.

Measures of Central tendency include

arithmetic mean

median

mode

geometric mean

harmonic mean

Arithmetic mean – Arithmetic mean is defined as the sum of all values divided by number of values.

Median – Median of a set of values is the value which is the middle most value when they are arranged

in the ascending order of magnitude.

Mode – Mode is the value which has the highest frequency and is denoted by Z

Geometric mean – Geometric mean of a series of ‘n’ positive numbers is given by:

1. in case of discrete series without frequency,

$GM = (x_1 x_2 \cdots x_n)^{1/n}$

2. in case of discrete series with frequency,

$GM = (x_1^{f_1} x_2^{f_2} \cdots x_n^{f_n})^{1/N}$

where $N = f_1 + f_2 + \cdots + f_n$

3. in case of continuous series,

$GM = (x_1^{f_1} x_2^{f_2} \cdots x_n^{f_n})^{1/N}$

where $N = f_1 + f_2 + \cdots + f_n$ and $x_1, x_2, \ldots, x_n$ are the mid points of class intervals.

Harmonic mean – if $x_1, x_2, \ldots, x_n$ are ‘n’ values for discrete series without frequency, then their

harmonic mean HM is

$HM = \frac{n}{\frac{1}{x_1} + \frac{1}{x_2} + \cdots + \frac{1}{x_n}}$
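
To make the three means concrete, here is a minimal Python sketch for a discrete series without frequency. The sample values 2, 4 and 8 and the function names are assumptions chosen only for illustration; they do not come from the assignment.

```python
import math


def arithmetic_mean(values):
    """Sum of all values divided by the number of values."""
    return sum(values) / len(values)


def geometric_mean(values):
    """n-th root of the product, computed through logarithms."""
    if any(v <= 0 for v in values):
        raise ValueError("geometric mean needs strictly positive values")
    return math.exp(sum(math.log(v) for v in values) / len(values))


def harmonic_mean(values):
    """Number of values divided by the sum of their reciprocals."""
    if any(v == 0 for v in values):
        raise ValueError("harmonic mean is undefined when a value is zero")
    return len(values) / sum(1.0 / v for v in values)


values = [2.0, 4.0, 8.0]  # hypothetical data
am = arithmetic_mean(values)
gm = geometric_mean(values)
hm = harmonic_mean(values)
print(round(am, 3), round(gm, 3), round(hm, 3))
```

For positive values that are not all equal, the harmonic mean is smaller than the geometric mean, which in turn is smaller than the arithmetic mean.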

All these measures of central tendency can be differentiated according to their properties:

Arithmetic mean:

It is simple to calculate and easy to understand

It is affected by extreme values

It is based on all values

It cannot be determined for distributions with open-end class intervals

It cannot be graphically located

It is capable of further algebraic treatment

Median:

It can be easily understood and computed

It is not based on all values

It is not affected by extreme values

It can be determined graphically

It is not capable of further algebraic treatment

It can be calculated for distributions with open-end classes.

Mode:

It is not based on all values.

It is not affected by extreme values

It is not capable of further mathematical treatment.

It can be calculated for distributions with open end classes

It can be located graphically

Geometric mean:

It cannot be calculated when there are both positive and negative values in the series or

when one or more observations are zero

Compared to arithmetic mean it is more difficult to compute and interpret

Harmonic mean:

It is difficult to understand and calculate

It cannot be computed when one or more values is zero.

(b) A confidence interval (CI) is a particular kind of interval estimate of a population parameter and is

used to indicate the reliability of an estimate. It is an observed interval (i.e. it is calculated from the

observations), in principle different from sample to sample, that frequently includes the parameter of

interest, if the experiment is repeated. How frequently the observed interval contains the parameter is

determined by the confidence level or confidence coefficient.

A confidence interval with a particular confidence level is intended to give the assurance that, if the

statistical model is correct, then taken over all the data that might have been obtained, the procedure

for constructing the interval would deliver a confidence interval that included the true value of the

parameter the proportion of the time set by the confidence level. More specifically, the meaning of the

term "confidence level" is that, if confidence intervals are constructed across many separate data

analyses of repeated (and possibly different) experiments, the proportion of such intervals that contain

the true value of the parameter will approximately match the confidence level; this is guaranteed by the

reasoning underlying the construction of confidence intervals.
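
The repeated-sampling meaning of the confidence level can be illustrated with a small simulation. This is a sketch under stated assumptions, not part of the original answer: the population is taken to be normal with a known standard deviation, the interval is the usual z-interval (mean ± 1.96 σ/√n for a 95% level), and the true mean of 50, σ of 10, sample size of 25 and number of trials are all hypothetical.

```python
import math
import random


def z_interval(sample, sigma, z=1.96):
    """95% confidence interval for the mean when sigma is known."""
    n = len(sample)
    sample_mean = sum(sample) / n
    half_width = z * sigma / math.sqrt(n)
    return sample_mean - half_width, sample_mean + half_width


def coverage(true_mean=50.0, sigma=10.0, n=25, trials=2000, seed=1):
    """Proportion of simulated intervals that contain the true mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        low, high = z_interval(sample, sigma)
        if low <= true_mean <= high:
            hits += 1
    return hits / trials


observed_coverage = coverage()
print(observed_coverage)
```

The printed proportion should lie close to 0.95, which is what a 95% confidence level promises over many repeated samples.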

http://en.wikipedia.org/wiki/Interval_estimation

http://en.wikipedia.org/wiki/Population_parameter

ASSIGNMENT SET 2

Master of Business Administration-MBA Semester 1

SUBJECT - MB0040 – STATISTICS FOR MANAGEMENT

1. (a) What are the characteristics of a good measure of central tendency?

(b) What are the uses of averages?

Ans. (a) The characteristics of a good measure of central tendency are as follows:

It should be simple to calculate and easy to understand

It should be based on all values

It should not be affected by extreme values

It should not be affected by sampling fluctuation

It should be rigidly defined

It should be capable of further algebraic treatment

(b) The uses of the various averages are as follows:

Arithmetic mean is used when:

An in-depth study of the variable is needed

The variable is continuous and additive in nature

The data are in the interval or ratio scale

The distribution is symmetrical

Median is used when:

The variable is discrete

There exist abnormal values

The distribution is skewed

The extreme values are missing

The characteristics studied are qualitative

The data are on the ordinal scale

Mode is used when:

The variable is discrete

There exist abnormal values

The distribution is skewed

The extreme values are missing

The characteristics studied are qualitative

Geometric mean is used when:

The rate of growth, ratios and percentages are to be studied

The variable is of multiplicative nature

Harmonic mean is used when:

The study is related to speed, time

Average of rates which produce equal effects has to be found.

2. Calculate the 3 yearly and 5 yearly averages of the data in table below.

Table 1: Production data from 1988 to 1997

Year 1988 1989 1990 1991 1992 1993 1994 1995 1996 1997

Production (in Lakh ton)

15 18 16 22 19 24 20 28 22 30

Ans. The 3-yearly moving averages are calculated as below:

Year   Production Y    3-yearly       3-yearly moving   Short-term
       (lakh tonnes)   moving total   average Ye        fluctuation (Y - Ye)
1988   15              -              -                 -
1989   18              49             16.33             1.67
1990   16              56             18.67             -2.67
1991   22              57             19.00             3.00
1992   19              65             21.67             -2.67
1993   24              63             21.00             3.00
1994   20              72             24.00             -4.00
1995   28              70             23.33             4.67
1996   22              80             26.67             -4.67
1997   30              -              -                 -

The 5-yearly moving averages are calculated as given in the table below:

Year   Production Y    5-yearly       5-yearly moving   Short-term
       (lakh tonnes)   moving total   average Ye        fluctuation (Y - Ye)
1988   15              -              -                 -
1989   18              -              -                 -
1990   16              90             18.0              -2.0
1991   22              99             19.8              2.2
1992   19              101            20.2              -1.2
1993   24              113            22.6              1.4
1994   20              113            22.6              -2.6
1995   28              124            24.8              3.2
1996   22              -              -                 -
1997   30              -              -                 -
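
The moving averages for the production data can be recomputed with a short Python sketch. It is a minimal illustration that handles odd periods only (so that each average falls on a year without further centring); the function name `centred_moving_average` is an assumption made for this sketch.

```python
def centred_moving_average(values, period):
    """Moving average of an odd period, aligned with the middle year.

    Years at either end that have no complete window get None.
    """
    if period % 2 == 0:
        raise ValueError("this sketch handles odd periods only")
    half = period // 2
    averages = [None] * len(values)
    for i in range(half, len(values) - half):
        window = values[i - half:i + half + 1]
        averages[i] = sum(window) / period
    return averages


years = list(range(1988, 1998))
production = [15, 18, 16, 22, 19, 24, 20, 28, 22, 30]  # from the question

ma3 = centred_moving_average(production, 3)
ma5 = centred_moving_average(production, 5)
for year, y, a3, a5 in zip(years, production, ma3, ma5):
    print(year, y, a3 if a3 is None else round(a3, 2), a5)
```

The 3-yearly average has no value for the first and last year, and the 5-yearly average has none for the first two and last two years, which is the limitation of the method regarding missing trend values at the ends of the series.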

3. What is meant by secular trend? Discuss any two methods of isolating trend values

in a time series.

(b)What is seasonal variation of a time series? Describe the various methods you know

to evaluate it and examine their relative merits.

Ans. Secular trend or Long term trend:

This refers to the smooth or regular long term growth or decline of the series. This movement can be

characterized by a trend curve. If this curve is a straight line, then it is called a trend line. If the variable is

increasing over a long period of time, then it is called an upward trend. If the variable is decreasing over

a long period of time, then it is called a downward trend. If the variable moves upward or downwards

along a straight line then the trend is called a linear trend, otherwise it is called a non-linear trend.

The Methods of measuring trend are as follows:

1. Free hand or graphic methods

2. Semi averages method

3. Moving average method

4. Method of least squares

1. Free hand or graphic methods:- this is the simplest method of drawing a trend curve. We plot the

values of the variable against time on a graph paper and join these points. The trend line is then fitted by

inspecting the graph of the time series. Fitting a trend line by this method is arbitrary. The trend line is

drawn such that the numbers of fluctuations on either side are approximately the same. The trend line

should be a smooth curve.

2. Semi-average method:- the methods of fitting a linear trend with the help of semi average method

are as follows:

I. When the number of years is even, the data of the time series are divided into two equal parts.

The items in each part are totalled and the total is divided by the number of items to obtain the

arithmetic means of the two parts. Each average is then centred in the period of time from which it

has been computed and plotted on the graph paper. A straight line is drawn passing through these

points. This is the required trend line.

II. When the number of years is odd, then the value of the middle year is omitted to divide the time

series into two equal parts. Then the procedure described in ‘I’ is followed.

A trend value for any future year may be predicted by multiplying the periodic increment by the number

of years into the future that is desired and adding the result to the last trend value listed in the series.
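
The semi-average procedure can be sketched in Python as follows. This is only an illustration: it borrows the production figures for 1988 to 1997 from question 2 as sample data, and the function name `semi_average_trend` is an assumption made for this sketch.

```python
from statistics import mean


def semi_average_trend(years, values):
    """Fit a linear trend through the two semi-averages.

    Returns the yearly increment and a function giving the trend value
    for any year. With an odd number of years the middle year is omitted.
    """
    n = len(values)
    half = n // 2
    # Each semi-average is centred on the middle of its own period
    x1, y1 = mean(years[:half]), mean(values[:half])
    x2, y2 = mean(years[n - half:]), mean(values[n - half:])
    slope = (y2 - y1) / (x2 - x1)
    return slope, lambda year: y1 + slope * (year - x1)


years = list(range(1988, 1998))
production = [15, 18, 16, 22, 19, 24, 20, 28, 22, 30]

slope, trend = semi_average_trend(years, production)
print(round(slope, 2), round(trend(1997), 2), round(trend(1998), 2))
```

Here the two semi-averages are 18 (centred on 1990) and 24.8 (centred on 1995), so the trend rises by 1.36 per year; the value for a future year is the last trend value plus the yearly increment for each year ahead.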

(b) Variations in a time series that are periodic in nature and occur regularly over short periods of time

during a year are called seasonal variations. By definition, these variations are precise and can be

forecasted.

The following are the methods of measuring seasonal variations:

Seasonal variation index or seasonal average method

Seasonal variation through moving averages

Chain or link relative method

Ratio to trend method

Seasonal average method: In the seasonal average method, the steps followed are described below.

1. The time series is arranged by years and months or quarters.

2. The totals for each month or quarter over all the years are obtained.

3. The average for each month or quarter is obtained. The average may be the mean or the median. In

general, we take the mean unless specified otherwise.

4. Taking the average of the monthly or quarterly averages as equal to 100, the seasonal index for each month or

quarter is calculated by the following formula:

Seasonal index for a month (or quarter) = [monthly (or quarterly) average for the month (or quarter)

/ average of the monthly (or quarterly) averages] x 100

Seasonal variation through moving averages: this is also known as the percentage of moving average

method. The steps involved in the computation of seasonal indices by this method are described below.

1. The moving averages of the data are computed. If the data are monthly, 12-monthly moving

averages are computed; if they are quarterly, 4-quarterly moving averages are computed. In both cases, the

periods of the moving averages are even. Hence, these moving averages have to be centred.

2. Under the additive model, the corresponding moving average is deducted from each original value

to find the short-time fluctuations, which are given as: Y - T = S + C + I

3. In a separate table, the monthly short-time fluctuations are added for each month over all

the years and their average is obtained. These averages are known as the seasonal variations for each

month or quarter.

4. To isolate the irregular variations, the mean for the respective month or quarter is

deducted from the short-time fluctuations.
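
The seasonal average method can be sketched in Python as follows. The quarterly figures are hypothetical and chosen only for illustration, and the function name `seasonal_indices` is an assumption made for this sketch.

```python
def seasonal_indices(data_by_year):
    """Seasonal index for each quarter by the seasonal average method."""
    rows = list(data_by_year.values())
    n_seasons = len(rows[0])
    # Average of each quarter over all the years
    quarter_means = [
        sum(row[q] for row in rows) / len(rows) for q in range(n_seasons)
    ]
    # Average of the quarterly averages is taken as 100
    general_mean = sum(quarter_means) / n_seasons
    return [100.0 * m / general_mean for m in quarter_means]


# Hypothetical quarterly sales for three years (Q1, Q2, Q3, Q4)
sales = {
    2001: [30, 40, 36, 34],
    2002: [34, 52, 50, 44],
    2003: [40, 58, 54, 48],
}

indices = seasonal_indices(sales)
print([round(i, 1) for i in indices])
```

Because the general average is set equal to 100, the four quarterly indices always add up to 400 (and twelve monthly indices to 1200), which is a convenient check on the working.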

Chain or link relative method

The steps involved in the chain or link relative method are described below.

1. Each quarterly or monthly value is divided by the preceding quarterly or monthly value and the

result is multiplied by 100. These percentages are known as link relatives of the seasonal values.

2. The mean of the link relatives for each season is computed over all the years. The median can also be

taken instead of the mean of the link relatives.

3. These average link relatives are converted into chain relatives. The chain relative of the first season is taken as

100.

4. A second chain relative for the first season is computed on the basis of the chain relative of the last season.

5. This chain relative may or may not be 100. When it differs from 100, the difference is due to secular trend. If it is 100, go

to step 7; if it is not 100, carry out step 6 and then go to step 7.

6. Compute the difference ‘d’ between the new chain relative of the first season obtained in step 4 and the chain

relative assumed as 100. ‘d’ is divided by the number of seasons, the resulting figure is multiplied by

1, 2, 3, and the products are deducted respectively from the chain relatives of the 2nd, 3rd and 4th quarters. These

are called corrected chain relatives.

7. The seasonal indices are obtained when the corrected chain relatives are expressed as percentages

of their average.

Ratio to trend method: The steps to determine seasonal indices by this method are as described below.

1. Determine the trend values by the method of least squares.

2. To find the ratios to trend, divide the original data by the corresponding trend values and multiply these

ratios by 100.

3. Calculate the arithmetic mean of the trend ratios obtained in step 2.

4. Finally, all the trend ratios are converted into seasonal indices. For this, add all the averages

obtained in step 3 and find their general average. Seasonal indices are calculated by using the following

formula:

Seasonal index = [quarterly average / general average] x 100

4. The probability that a contractor will get an electrical job is 0.8, that he will get a plumbing job is 0.6, and that he will get both is 0.48. What is the probability that he gets at least one? Are the events of getting the electrical job and the plumbing job independent?

Ans. The probability that a contractor will get an electrical job = 0.8

P(A) = 0.8

The probability that the contractor will get a plumbing job = 0.6

P(B)= 0.6

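The probability that he will get both = 0.48

Therefore, P(A ∩ B) = 0.48

P(A ∪ B) = P(A) + P(B) – P(A ∩ B)

= 0.8 + 0.6 – 0.48

= 0.92

Hence the probability that the contractor will get at least one job is 0.92.

Since P(A) x P(B) = 0.8 x 0.6 = 0.48 = P(A ∩ B), the events of getting the electrical job and the plumbing job are independent.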
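
Both parts of the question can be checked with a few lines of Python. This is a minimal sketch using the probabilities given in the question; the variable names are chosen for illustration, and `math.isclose` is used because decimal fractions are not stored exactly in floating point.

```python
import math

p_electrical = 0.8   # P(A)
p_plumbing = 0.6     # P(B)
p_both = 0.48        # P(A and B)

# Addition rule: P(A or B) = P(A) + P(B) - P(A and B)
p_at_least_one = p_electrical + p_plumbing - p_both

# Two events are independent when P(A and B) = P(A) * P(B)
independent = math.isclose(p_electrical * p_plumbing, p_both)

print(round(p_at_least_one, 2), independent)
```

The product P(A) x P(B) equals the given joint probability of 0.48, so the two events satisfy the condition for independence.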

5. (a) Discuss the errors that arise in statistical survey

(b) What is quota sampling and when do we use it?

Ans. (a) The term error denotes the difference between a population value and its estimate provided by a sampling technique. The term is therefore not used in its ordinary sense in statistics. The main types of error are described below.

Sampling errors: sample results are bound to differ from population results, since a sample is only a small portion of the population. Sampling error is also known as inherent error and cannot be avoided, and it is not worthwhile to try to eliminate it completely. These errors may be due to the following factors:

Faulty selection of sample

Substitution of units to be studied

Faulty demarcation of sampling units

Error due to bias in estimation

However, the sampling errors follow random or chance variations and tend to cancel out each other on averaging.

Non-sampling errors: these are attributed to factors that can be controlled and eliminated by suitable actions. It is worthwhile to eliminate these errors. They are due to the following factors:

Faulty planning, faulty definitions

Defective methods of interviewing

Personal bias of investigator

Lack of trained and qualified investigators

Respondent’s failure to answer

Improper coverage

Compiling errors

Publication errors

Biased errors: these arise in both the census and the sampling method. They occur due to personal bias of the investigator and the instruments used for measuring. They are also due to faulty collection of data, respondent’s bias and bias due to non-response. Biased errors have a tendency to grow with sample size. Therefore, they are also known as cumulative errors. The magnitude of biased errors is directly proportional to the sample size.

(b) Quota sampling: It is a type of judgment sampling. Under this design, quotas are set up according to some specified characteristic such as age groups or income groups. From each group a specified number of units are sampled according to the quota allotted to the group. Within the group the selection of sample units depends on personal judgment. It has a risk of personal prejudice and bias entering the process. This method is often used in public opinion studies.

6. (a) Why do we use a chi-square test?

(b) Why do we use analysis of variance?

Ans. (a) Chi-square is a statistical test commonly used to compare observed data with data we would expect to obtain according to a specific hypothesis. For example, if, according to Mendel's laws, you expected 10 of 20 offspring from a cross to be male and the actual observed number was 8 males, then you might want to know about the "goodness of fit" between the observed and expected. Were the deviations (differences between observed and expected) the result of chance, or were they due to other factors? How much deviation can occur before you, the investigator, must conclude that something other than chance is at work, causing the observed to differ from the expected? The chi-square test is always testing what scientists call the null hypothesis, which states that there is no significant difference between the expected and observed result.
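
For the offspring example, the chi-square statistic can be worked out with a short Python sketch. It assumes the usual goodness-of-fit formula, the sum of (observed - expected)²/expected over the categories, and takes the 5% table value for 1 degree of freedom as 3.841; the function name is chosen only for illustration.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


observed = [8, 12]    # males, females actually observed out of 20
expected = [10, 10]   # males, females expected under the hypothesis

statistic = chi_square_statistic(observed, expected)
critical_value = 3.841  # chi-square table, 5% level, 1 degree of freedom

print(round(statistic, 2), statistic < critical_value)
```

The statistic is (8 - 10)²/10 + (12 - 10)²/10 = 0.8, which is well below the table value, so in this example the deviation from the expected numbers can be attributed to chance and the null hypothesis is not rejected.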

(b) When we have more than two populations, we use the analysis of variance to evaluate the mean differences among them. Analysis of variance (ANOVA) enables us to test for the significance of the differences among more than two sample means. Using analysis of variance, we can make inferences about whether our samples are drawn from populations having the same mean or not.

Analysis of variance is useful in such situations as comparing the mileage achieved by five different brands of gasoline, testing which of four different training methods produces the fastest learning record, or comparing the first-year earnings of the graduates of half a dozen different business schools. In each of these cases, we would compare the means of more than two samples. Hence, the concept of analysis of variance (ANOVA) is used in most fields, such as agriculture, medicine, finance, banking, insurance and education.

Date post:	22-Sep-2014
Upload:	ajesh-mani