Date post: | 22-Sep-2014 |
Upload: | ajesh-mani |
ASSIGNMENT SET 1
Master of Business Administration-MBA Semester 1
SUBJECT - MB0040 – STATISTICS FOR MANAGEMENT
1. (a) ‘Statistics is the backbone of decision-making’. Comment.
(b) Give plural meaning of the word Statistics?
Ans. (a) Due to advanced communication networks, rapid changes in consumer behavior, the varied
expectations of a variety of consumers and new market openings, modern managers have the difficult task
of making quick and appropriate decisions. Therefore, there is a need for them to depend more upon
quantitative techniques like mathematical models, statistics, operations research and econometrics.
Decision making is a key part of our day-to-day life. Even when we wish to purchase a television, we like
to know the price, quality, durability, and maintainability of various brands and models before buying
one. As you can see, in this scenario we are collecting data and making an optimum decision. In other
words, we are using statistics. Again, suppose a company wishes to introduce a new product, it has to
collect data on market potential, consumer likings, availability of raw materials, feasibility of producing
the product. Hence, data collection is the back-bone of any decision making process. Many organizations
find themselves data-rich but poor in drawing information from it. Therefore, it is important to develop
the ability to extract meaningful information from raw data to make better decisions. Statistics plays an
important role in this respect. Statistics is broadly divided into two main categories:
Descriptive statistics: descriptive statistics is used to present a general description of data which is summarized
quantitatively. This is mostly useful in clinical research, when communicating the results of
experiments.
Inferential statistics: inferential statistics is used to make valid inferences from the data which are
helpful in effective decision making for managers or professionals. Statistical methods such as
estimation, prediction and hypothesis testing belong to inferential statistics. The researchers make
deductions or conclusions from the collected data sample regarding the characteristics of large
population from which the samples are taken.
(b) Statistics is usually and loosely defined as:
1. A collection of numerical data that measure something.
2. The science of recording, organizing, analyzing and reporting quantitative information.
Professor A.L.Bowley gave several definitions of Statistics. He defined statistics as:
1. The science of counting
2. The science of averages
3. The science of measurement of social phenomenon, regarded as a whole in all its
manifestations.
4. A subject not confined to any one science
According to Horace Secrist, “Statistics may be defined as the aggregate of facts affected to a marked
extent by multiplicity of causes, numerically expressed, enumerated or estimated according to a
reasonable standard of accuracy, collected in a systematic manner, for a predetermined purpose and
placed in relation to each other.” This definition is both comprehensive and exhaustive.
Prof Boddington, on the other hand, defined statistics as ‘the science of estimates and probabilities’.
This definition, however, is not complete.
According to Croxton and Cowden, ‘Statistics is the science of collection, presentation, analysis and
interpretation of numerical data from logical analysis’. Thus statistics can be defined in different ways.
2. (a). In a bivariate data on ‘x’ and ‘y’, variance of ‘x’ = 49, variance of ‘y’ = 9 and
covariance (x,y) = -17.5. Find coefficient of correlation between ‘x’ and ‘y’.
(b). Enumerate the factors which should be kept in mind for proper planning.
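Ans. (a) We know that
Coefficient of correlation $r = \dfrac{\mathrm{Cov}(x, y)}{\sigma_x \, \sigma_y}$
Given that $\mathrm{Cov}(x, y) = -17.5$,
$\sigma_x = \sqrt{49} = 7$ and $\sigma_y = \sqrt{9} = 3$.
Therefore $r = \dfrac{-17.5}{7 \times 3} = -0.833$. Hence there is a high negative correlation.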
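The arithmetic can be checked with a few lines of Python. This is only a sketch: the function name `correlation_from_moments` is made up for illustration, and the inputs are the variance and covariance figures stated in the question.

```python
import math

def correlation_from_moments(cov_xy, var_x, var_y):
    """Pearson coefficient of correlation from a covariance and two variances."""
    # Standard deviations are the square roots of the variances.
    return cov_xy / (math.sqrt(var_x) * math.sqrt(var_y))

# Figures from the question: Var(x) = 49, Var(y) = 9, Cov(x, y) = -17.5
r = correlation_from_moments(-17.5, 49, 9)
print(round(r, 3))  # -0.833
```

The sign of r always follows the sign of the covariance, since standard deviations are never negative.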
(b) The relevance and accuracy of data obtained in a survey depends upon the care exercised in
planning. A properly planned investigation can lead to the best results with the least cost and time. The following
factors must be kept in mind while planning:
1. Nature of the problem to be investigated should be clearly defined in an unambiguous manner.
2. Objectives of investigation should be stated at the outset. Objectives could be to :
Obtain certain estimates
Establish a theory
Verify an existing statement
Find relationship between characteristics
3. The scope of investigation has to be made clear. The scope of investigation refers to the area to be
covered, the identification of units to be studied, the nature of characteristics to be observed, the accuracy of
measurements, the analytical methods, and the time, cost and other resources required.
4. Whether to use data collected from primary or secondary source should be determined in advance.
5. The organization of investigation is the final step in the process. It encompasses the determination of
the number of investigators required, their training, the supervision needed and the funds required.
3. The percentage sugar content of Tobacco in two samples was represented in table
11.11. Test whether their population variances are same.
Table 1. Percentage sugar content of Tobacco in two samples
Sample A 2.4 2.7 2.6 2.1 2.5
Sample B 2.7 3.0 2.8 3.1 2.2 3.6
Ans. To test whether the two population variances are the same, we use the variance ratio (F) test.
H0: the population variances are equal. H1: the population variances are not equal.
Sample A: n1 = 5, sum of observations = 2.4+2.7+2.6+2.1+2.5 = 12.3, mean = 12.3 / 5 = 2.46
Sum of squared deviations from the mean = 0.0036+0.0576+0.0196+0.1296+0.0016 = 0.212
Variance of sample A = 0.212 / (n1 – 1) = 0.212 / 4 = 0.053
Sample B: n2 = 6, sum of observations = 2.7+3.0+2.8+3.1+2.2+3.6 = 17.4, mean = 17.4 / 6 = 2.9
Sum of squared deviations from the mean = 0.04+0.01+0.01+0.04+0.49+0.49 = 1.08
Variance of sample B = 1.08 / (n2 – 1) = 1.08 / 5 = 0.216
F = larger variance / smaller variance = 0.216 / 0.053 = 4.08
Degree of freedom in the numerator = n2 – 1 = 5
Degree of freedom in the denominator = n1 – 1 = 4
The table value of F tab at 5 % level of significance for DF (5,4) is 6.26
Calculated F < F tab, hence H0 is accepted.
Therefore there is no significant difference between the population variances.
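A direct check on the equality of the two variances is the variance ratio (F) test, which can be sketched in Python as below. The sample values are those in the table of the question. The critical value 6.26 is the 5 % table value for DF (5,4); it is hard-coded here as an assumption that the larger variance belongs to Sample B (which the output confirms).

```python
from statistics import variance

sample_a = [2.4, 2.7, 2.6, 2.1, 2.5]
sample_b = [2.7, 3.0, 2.8, 3.1, 2.2, 3.6]

# statistics.variance uses the divisor n - 1 (sample variance).
var_a = variance(sample_a)
var_b = variance(sample_b)

# The larger variance goes in the numerator so that F >= 1.
f_ratio = max(var_a, var_b) / min(var_a, var_b)

F_CRITICAL_5_4 = 6.26  # table value at the 5 % level for DF (5, 4)
reject_h0 = f_ratio > F_CRITICAL_5_4

print(round(var_a, 3), round(var_b, 3), round(f_ratio, 2), reject_h0)
```

With these figures the ratio stays below the table value, so the hypothesis of equal population variances is not rejected.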
4. a. Explain the characteristics of business forecasting.
b. Differentiate between prediction, projection and forecasting.
Ans. (a) Business forecasting refers to the analysis of past and present economic conditions with the
object of drawing inferences about probable future business conditions. The Characteristics of business
forecasting are as follows:
1. Based on past and present conditions – Business forecasting is based on the past and present
economic condition of the business. To forecast the future, various data, information and facts
concerning the economic condition of the business in the past and present are analyzed.
2. Based on mathematical and statistical methods – The process of forecasting includes the
use of statistical and mathematical methods. By using these methods, the actual trend which
may take place in future can be forecasted.
3. Period – The forecast can be made for the long term, short term, medium term or any
specific period.
4. Estimation of future – Business forecasting aims to forecast the future regarding
probable economic conditions.
5. Scope – The forecast can be physical as well as financial.
(b) A prediction is an estimate based solely on past data of the series under investigation. It is purely
mechanical extrapolation.
A projection is a prediction where the extrapolated values are subject to certain numerical assumptions.
A forecast is an estimate which relates the series in which we are interested to external factors.
5. What are the components of time series? Bring out the significance of moving
average in analyzing a time series and point out its limitations.
Ans. The behavior of a time series over periods of time is called the movement of the time series. The
time series is classified into the following four components:
1. Long term trend or secular trend
2. Seasonal variations
3. Cyclic variations
4. Random variations
Long term trend or secular trend – This refers to the smooth or regular long term growth
or decline of the series. This movement can be characterized by a trend curve. If this curve is a
straight line, then it is called a trend line. If the variable is increasing over a long period of time,
then it is called an upward trend. If the variable is decreasing over a long period of time, then it
is called a downward trend. If the variable moves up or down along a straight line then it is called
a linear trend, otherwise it is called a non-linear trend.
Seasonal variations – Variations in a time series that are periodic in nature and occur
regularly over short periods of time during a year are called seasonal variations. By definition,
these variations are precise and can be forecasted. The following are examples of seasonal
variations in a time series:
1. The prices of vegetables drop after the rainy season or in the winter
months and go up during summer, every year.
2. The prices of cooking oil fall after the
harvesting of oil seeds and go up after some time.
Cyclic variations – The long-term oscillations that represent consistent rises and declines
in the values of the variable are called cyclic variations. Since these are long-term oscillations in
the time series, the period of oscillation is usually greater than one year. The oscillations are
about a trend curve or a trend line. The period of one cycle is the time-distance between two
successive peaks or two successive troughs.
Random variations – random variations are called irregular movements. Movements that
occur usually in brief periods of time, without any pattern and which are unpredictable in nature
are called irregular movements. These movements do not have any regular period or time of
occurrences. Examples include the effects of national strikes, floods, earthquakes and so on. It is very
difficult to study the behavior of such a time series.
(b) Moving average method is one of the methods of measuring trend. This method is used for
smoothing the time series. That is, it smoothes the fluctuations of the data by the method of moving
averages. It is one of the simple methods to measure trends. This method is objective in the sense that
anybody working on a problem with this method will get the same results. This method is used for
determining seasonal, cyclic and irregular variations besides the trend values. It is flexible enough to add
more figures to the data because the entire calculations are not changed. If the period of moving
averages coincides with the period of cyclic fluctuations in the data, such fluctuations are automatically
eliminated.
Limitations –
1. There is no functional relationship between the values and the time. Thus, this method is not helpful in
forecasting and predicting the values on the basis of time.
2. There are no trend values for some years at the beginning and some at the end. For example, for a five yearly moving average, there will be no trend
values for the first two years and the last two years.
3. In case of a non-linear trend, the values obtained by
this method are biased in one or the other direction.
4. The period selection of moving averages is a
difficult task. Hence, great care has to be taken in period selection, particularly when there is no
business cycle during that time.
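Two of the points above (a cycle is eliminated when its period matches the period of the moving average, and trend values are lost at the ends) can be illustrated with a small sketch. The series below is hypothetical: a straight-line trend 10 + 2t plus a made-up three-period cycle whose values sum to zero.

```python
def moving_average(values, period):
    """Simple moving averages over every run of `period` consecutive values."""
    return [sum(values[i:i + period]) / period
            for i in range(len(values) - period + 1)]

# Hypothetical data: linear trend plus a cycle of period 3 that sums to zero.
cycle = [3, -1, -2]
series = [10 + 2 * t + cycle[t % 3] for t in range(9)]

trend = moving_average(series, 3)

print(series)  # [13, 11, 12, 19, 17, 18, 25, 23, 24]
print(trend)   # [12.0, 14.0, 16.0, 18.0, 20.0, 22.0, 24.0]
```

The smoothed values fall exactly on the line 10 + 2t for t = 1 to 7, so the cycle has vanished, but the series has lost one value at each end.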
6. a. List down various measures of central tendency and explain the difference
between them?
b. What is a confidence interval, and why it is useful? What is a confidence level?
Ans. Graphical representation is a good way to represent summarized data. However, graphs provide us
only an overview and thus may not be used for further analysis. Hence, we use summary statistics like
computing averages to analyze the data. Mass data, which is collected, classified, tabulated and
presented systematically, is analyzed further to bring its size to a single representative figure. This single
figure is the measure which can be found at central part of the range of all values. It is the one which
represents the entire data set. Hence, this is called the measure of central tendency. The tendency of
data to cluster around a figure which is in central location is known as central tendency. Measure of
central tendency or average of first order describes the concentration of large numbers around a
particular value. It is a single value which represents all units.
Measures of Central tendency include
arithmetic mean
median
mode
geometric mean
harmonic mean
Arithmetic mean – Arithmetic mean is defined as the sum of all values divided by number of values.
Median – Median of a set of values is the value which is the middle most value when they are arranged
in the ascending order of magnitude.
Mode – Mode is the value which has the highest frequency and is denoted by Z
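Geometric mean – Geometric mean of a series of “n” positive numbers is given by:
1. in case of discrete series without frequency,
$GM = (x_1 \cdot x_2 \cdots x_n)^{1/n}$
2. in case of discrete series with frequency,
$GM = (x_1^{f_1} \cdot x_2^{f_2} \cdots x_n^{f_n})^{1/N}$
where $N = f_1 + f_2 + \dots + f_n$
3. in case of continuous series,
$GM = (x_1^{f_1} \cdot x_2^{f_2} \cdots x_n^{f_n})^{1/N}$
where $N = f_1 + f_2 + \dots + f_n$ and $x_1, x_2, \dots, x_n$ are the mid points of class intervals.
Harmonic mean – if $x_1, x_2, \dots, x_n$ are “n” values for discrete series without frequency, then their
harmonic mean HM is
$HM = \dfrac{n}{\frac{1}{x_1} + \frac{1}{x_2} + \dots + \frac{1}{x_n}}$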
All these measures of central tendency can be differentiated according to their properties:
Arithmetic mean:
It is simple to calculate and easy to understand.
It is affected by extreme values
It is based on all values
It cannot be determined for distributions with open-end class intervals
It cannot be graphically located
It is capable of further algebraic treatment
Median:
It can be easily understood and computed
It is not based on all values
It is not affected by extreme values
It can be determined graphically
It is not capable of further algebraic treatment
It can be calculated for distributions with open-end classes.
Mode:
It is not based on all values.
It is not affected by extreme values
It is not capable of further mathematical treatment.
It can be calculated for distributions with open end classes
It can be located graphically
Geometric mean:
It cannot be calculated when there are both positive and negative values in the series or when one or
more observations have value zero
Compared to arithmetic mean it is more difficult to compute and interpret
Harmonic mean:
It is difficult to understand and calculate
It cannot be computed when one or more values is zero.
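The five measures can be compared on one small data set using Python's standard `statistics` module. The data values below are hypothetical and chosen only so that the results are easy to verify by hand.

```python
from statistics import mean, median, mode, geometric_mean, harmonic_mean

values = [2, 4, 4, 8]  # hypothetical data, not taken from the assignment

am = mean(values)            # (2 + 4 + 4 + 8) / 4 = 4.5
med = median(values)         # middle of the ordered values = 4
z = mode(values)             # most frequent value = 4
gm = geometric_mean(values)  # (2 * 4 * 4 * 8) ** (1/4) = 4
hm = harmonic_mean(values)   # 4 / (1/2 + 1/4 + 1/4 + 1/8) = 3.56 approx.

print(am, med, z, round(gm, 2), round(hm, 2))
```

For positive values that are not all equal, the arithmetic mean exceeds the geometric mean, which in turn exceeds the harmonic mean, as the output shows.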
(b) A confidence interval (CI) is a particular kind of interval estimate of a population parameter and is
used to indicate the reliability of an estimate. It is an observed interval (i.e. it is calculated from the
observations), in principle different from sample to sample, that frequently includes the parameter of
interest, if the experiment is repeated. How frequently the observed interval contains the parameter is
determined by the confidence level or confidence coefficient.
A confidence interval with a particular confidence level is intended to give the assurance that, if the
statistical model is correct, then taken over all the data that might have been obtained, the procedure
for constructing the interval would deliver a confidence interval that included the true value of the
parameter the proportion of the time set by the confidence level. More specifically, the meaning of the
term "confidence level" is that, if confidence intervals are constructed across many separate data
analyses of repeated (and possibly different) experiments, the proportion of such intervals that contain
the true value of the parameter will approximately match the confidence level; this is guaranteed by the
reasoning underlying the construction of confidence intervals.
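As a rough illustration, an interval estimate for a population mean can be sketched as below. This is a simplified sketch under stated assumptions: it uses the normal approximation (reasonable only for fairly large samples), the function name `mean_confidence_interval` is made up for illustration, and the sample is hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev

def mean_confidence_interval(sample, level=0.95):
    """Normal-approximation confidence interval for a population mean."""
    # Two-sided critical value, about 1.96 for a 95 % confidence level.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    centre = mean(sample)
    margin = z * stdev(sample) / math.sqrt(len(sample))
    return centre - margin, centre + margin

# Hypothetical sample of 35 observations with mean 53.
sample = [50 + (i % 7) for i in range(35)]

low95, high95 = mean_confidence_interval(sample, 0.95)
low99, high99 = mean_confidence_interval(sample, 0.99)
print(round(low95, 2), round(high95, 2))
print(round(low99, 2), round(high99, 2))
```

A higher confidence level gives a wider interval from the same data: more assurance of covering the parameter is paid for with less precision.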
ASSIGNMENT SET 2
Master of Business Administration-MBA Semester 1
SUBJECT - MB0040 – STATISTICS FOR MANAGEMENT
1. (a) What are the characteristics of a good measure of central tendency?
(b) What are the uses of averages?
Ans. (a) The characteristics of a good measure of central tendency are as follows:
It should be simple to calculate and easy to understand
It should be based on all values
It should not be affected by extreme values
It should not be affected by sampling fluctuation
It should be rigidly defined
It should be capable of further algebraic treatment
(b) The uses of the various averages are as follows:
Arithmetic mean is used when:
In depth study of the variable is needed
The variable is continuous and additive in nature
The data are in the interval or ratio scale
When the distribution is symmetrical
Median is used when
The variable is discrete
There exist abnormal values
The distribution is skewed
The extreme values are missing
The characteristics studied are qualitative
The data are on the ordinal scale
Mode is used when:
The variable is discrete
There exist abnormal values
The distribution is skewed
The extreme values are missing
The characteristics studied are qualitative
Geometric mean is used when:
The rate of growth, ratios and percentages are to be studied
The variable is of multiplicative nature
Harmonic mean is used when:
The study is related to speed, time
Average of rates which produce equal effects has to be found.
2. Calculate the 3 yearly and 5 yearly averages of the data in table below.
Table 1: Production data from 1988 to 1997
Year                        1988  1989  1990  1991  1992  1993  1994  1995  1996  1997
Production (in lakh tons)     15    18    16    22    19    24    20    28    22    30
Ans. 3 yearly moving averages are calculated as below:
Year   Production Y   3-yearly       3-yearly moving   Short term fluctuation
       (lakh tons)    moving total   average Ye        (Y - Ye)
1988   15             -              -                 -
1989   18             49             16.33             1.67
1990   16             56             18.67             -2.67
1991   22             57             19.00             3.00
1992   19             65             21.67             -2.67
1993   24             63             21.00             3.00
1994   20             72             24.00             -4.00
1995   28             70             23.33             4.67
1996   22             80             26.67             -4.67
1997   30             -              -                 -
5 yearly moving averages are calculated as given in the table below:
Year   Production Y   5-yearly       5-yearly moving   Short term fluctuation
       (lakh tons)    moving total   average Ye        (Y - Ye)
1988   15             -              -                 -
1989   18             -              -                 -
1990   16             90             18.0              -2.0
1991   22             99             19.8              2.2
1992   19             101            20.2              -1.2
1993   24             113            22.6              1.4
1994   20             113            22.6              -2.6
1995   28             124            24.8              3.2
1996   22             -              -                 -
1997   30             -              -                 -
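The moving averages for the production data in Table 1 can be recomputed with a short Python sketch. The function name `centred_moving_averages` is made up for illustration, and the function assumes an odd period so that each average sits against the middle year of its window.

```python
years = list(range(1988, 1998))
production = [15, 18, 16, 22, 19, 24, 20, 28, 22, 30]  # lakh tons, from Table 1

def centred_moving_averages(years, values, period):
    """Moving averages for an odd period, keyed by the middle year of each window."""
    half = period // 2
    result = {}
    for i in range(half, len(values) - half):
        window = values[i - half:i + half + 1]
        result[years[i]] = sum(window) / period
    return result

ma3 = centred_moving_averages(years, production, 3)
ma5 = centred_moving_averages(years, production, 5)

for year in years:
    # Years without a full window have no moving average.
    print(year, ma3.get(year), ma5.get(year))
```

The 3 yearly average is the 3 yearly total divided by 3 (for 1989, 49 / 3 = 16.33), and the short term fluctuation is the original value minus that average.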
3. What is meant by secular trend? Discuss any two methods of isolating trend values
in a time series.
(b)What is seasonal variation of a time series? Describe the various methods you know
to evaluate it and examine their relative merits.
Ans. Secular trend or Long term trend:
This refers to the smooth or regular long term growth or decline of the series. This movement can be
characterized by a trend curve. If this curve is a straight line, then it is called a trend line. If the variable is
increasing over a long period of time, then it is called an upward trend. If the variable is decreasing over
a long period of time, then it is called a downward trend. If the variable moves upward or downwards
along a straight line then the trend is called a linear trend, otherwise it is called a non-linear trend.
The Methods of measuring trend are as follows:
1. Free hand or graphic methods
2. Semi averages method
3. Moving average method
4. Method of least squares
1. Free hand or graphic methods:- this is the simplest method of drawing a trend curve. We plot the
values of the variable against time on a graph paper and join these points. The trend line is then fitted by
inspecting the graph of the time series. Fitting a trend line by this method is arbitrary. The trend line is
drawn such that the numbers of fluctuations on either side are approximately the same. The trend line
should be a smooth curve.
2. Semi-average method:- the methods of fitting a linear trend with the help of semi average method
are as follows:
I. When the number of years is even, the data of the time series is divided into two equal parts.
The items in each part are totalled and the total is divided by the number of items to obtain the
arithmetic means of the two parts. Each average is then centred in the period of time from which it
has been computed and plotted on the graph paper. A straight line is drawn passing through these
points. This is the required trend line.
II. When the number of years is odd, the value of the middle year is omitted to divide the time
series into two equal parts. Then the procedure described in ‘I’ is followed.
A trend value of any future year may be predicted by multiplying the periodic increment by the number
of years into the future that is desired and adding the result to the last trend value listed in the series.
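As an illustration of the semi-average method for an even number of years, the sketch below reuses the 1988 to 1997 production figures from question 2. It covers only the even case, and the variable and function names are chosen for illustration.

```python
years = list(range(1988, 1998))
production = [15, 18, 16, 22, 19, 24, 20, 28, 22, 30]  # lakh tons

half = len(years) // 2  # even number of years: two equal parts

# Arithmetic mean of each part, centred on the middle of its period.
first_mean = sum(production[:half]) / half    # 18.0, centred on 1990
second_mean = sum(production[half:]) / half   # 24.8, centred on 1995
first_centre = sum(years[:half]) / half
second_centre = sum(years[half:]) / half

# Periodic (annual) increment of the trend line through the two points.
increment = (second_mean - first_mean) / (second_centre - first_centre)

def trend(year):
    """Trend value read off the semi-average line."""
    return first_mean + increment * (year - first_centre)

print(round(increment, 2))    # 1.36
print(round(trend(1997), 2))  # 27.52
print(round(trend(1999), 2))  # 30.24
```

The value for 1999 equals the last trend value in the series (27.52 for 1997) plus two annual increments of 1.36.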
(b) Variations in a time series that are periodic in nature and occur regularly over short periods of time
during a year are called seasonal variations. By definition, these variations are precise and can be
forecasted.
The following are the methods of measuring seasonal variations:
Seasonal variation index or seasonal average method
Seasonal variation through moving averages
Chain or link relative method
Ratio to trend method
Seasonal average method: In the seasonal average method, the steps followed are described below.
1. The time series is arranged by years and months or quarters.
2. Totals of each month or quarter over all the years are obtained.
3. The average for each month or quarter is obtained. The average may be mean or median. In
general, we take mean if not specified otherwise.
4. Taking the average of monthly or quarterly averages equal to 100, the seasonal index for each month or
quarter is calculated by the following formula:
Seasonal index for a month (or quarter) = [monthly (or quarterly) average for the month (or quarter)
/ average of monthly (or quarterly) averages] x 100
Seasonal variation through moving averages: It is also known as the percentage of moving average
method. The steps involved in the computation of seasonal indices by this method are described below.
1. The moving averages of the data are computed. If the data are monthly then 12 monthly moving
averages, if they are quarterly, then 4 quarterly moving averages will be computed. In both the cases,
time periods of moving averages are even. Hence, these moving averages are to be centred.
2. Under the additive model, from each original value, the corresponding moving average is deducted
to find out short time fluctuations, which is given as: Y - T = S + C + I
3. By preparing a separate table, monthly short time fluctuations are added for each month over all
the years and their average is obtained. These averages are known as seasonal variations for each
month or quarter.
4. If we want to isolate and measure irregular variations, the mean of the respective month or quarter is
deducted from the short time fluctuations.
Chain or link relative method
The steps involved in the chain or link relative method are described below.
1. Each quarterly or monthly value is divided by the preceding quarterly or monthly value and the
result is multiplied by 100. These percentages are known as link relatives of the seasonal values.
2. The mean of the link relatives for each season is computed over all the years. Median can also be
taken instead of mean of the link relatives.
3. These average link relatives are converted into chain relatives. The chain relative of the first season is taken as
100.
4. A second chain relative for the first season is computed on the basis of the chain relative for the last season.
This chain relative may or may not be 100. It is not equal to 100 due to secular trend. If it is 100, go
to step 6; if it is not 100, go to step 5 and then go to step 6.
5. Compute the difference ‘d’ between the new chain relative for the first season obtained in step 4 and the chain
relative assumed as 100. ‘d’ is divided by the number of seasons and the resulting figure is multiplied by
1, 2, 3 and the products are deducted respectively from the chain relatives of the 2nd, 3rd, and 4th quarters. These
are called corrected chain relatives.
6. The seasonal indices are obtained when the corrected chain relatives are expressed as percentages
of their relative averages.
Ratio to trend method: The steps to determine seasonal indices by this method are as described below.
1. Determine the trend values by the method of least squares.
2. To find ratio to trend, divide the original data by the corresponding trend values and multiply these
ratios by 100
3. Calculate the arithmetic mean of the trend ratios obtained in step 2 for each quarter.
4. Finally all the trend ratios will be converted into seasonal indices. For this, add all averages
obtained in step 3 and find their general average. Seasonal indices are calculated by using the following
formula:
Seasonal indices = [quarterly averages / general average] x 100
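The simplest of these methods, the seasonal average method, can be sketched in Python as follows. The quarterly figures are hypothetical (three years of made-up sales), and the mean is used as the average.

```python
# Hypothetical quarterly sales: one row per year, columns are Q1 to Q4.
quarterly_sales = [
    [30, 40, 36, 34],
    [34, 52, 50, 44],
    [40, 58, 54, 48],
]

# Average of each quarter over all the years.
quarter_means = [sum(column) / len(column) for column in zip(*quarterly_sales)]

# Average of the quarterly averages, which is set equal to 100.
grand_mean = sum(quarter_means) / len(quarter_means)

# Seasonal index = quarterly average / average of quarterly averages x 100
seasonal_indices = [100 * m / grand_mean for m in quarter_means]

print([round(s, 2) for s in seasonal_indices])  # [80.0, 115.38, 107.69, 96.92]
```

The four indices add up to 400, that is, they average 100. An index of 80 for the first quarter means that quarter typically runs 20 % below the yearly average.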
4. The probability that a contractor will get an electrical job is 0.8, that he will get a plumbing job is 0.6, and that he will get both is 0.48. What is the probability that he will get at least one? Are the probabilities of getting an electrical job and a plumbing job independent?
Ans. The probability that a contractor will get an electrical job = 0.8
P(A) = 0.8
The probability that the contractor will get a plumbing job = 0.6
P(B)= 0.6
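The probability that he will get both = 0.48
Therefore, P(A ∩ B) = 0.48
P(A ∪ B) = P(A) + P(B) – P(A ∩ B)
= 0.8 + 0.6 – 0.48
= 0.92
Hence the probability that the contractor will get at least one job is 0.92.
Also, P(A) x P(B) = 0.8 x 0.6 = 0.48 = P(A ∩ B). Hence the events of getting an electrical job and getting a plumbing job are independent.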
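Both parts of the question can be checked in a few lines of Python. Two events are independent when the probability of both occurring equals the product of their individual probabilities. The variable names are illustrative, and `math.isclose` is used to avoid floating-point rounding problems in the comparison.

```python
import math

p_a = 0.8       # P(A): electrical job
p_b = 0.6       # P(B): plumbing job
p_both = 0.48   # P(A and B)

# Addition rule: P(A or B) = P(A) + P(B) - P(A and B)
p_at_least_one = p_a + p_b - p_both

# Independence check: P(A and B) == P(A) * P(B)
independent = math.isclose(p_both, p_a * p_b)

print(round(p_at_least_one, 2), independent)  # 0.92 True
```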
5. (a) Discuss the errors that arise in statistical survey
(b) What is quota sampling and when do we use it?
Ans. (a) The term error denotes the difference between the population value and its estimate provided by a sampling technique. In statistics, therefore, the term is not used in its ordinary sense. There are four types of errors.
Sampling errors: The sample results are bound to differ from population results, since a sample is only a small portion of the population. Sampling error is also known as inherent error and cannot be avoided. It is not worthwhile to try to eliminate these errors completely. They may be due to the following factors:
Faulty selection of sample
Substitution of units to be studied
Faulty demarcation of sampling units
Error due to bias in estimation
However, the sampling errors follow random or chance variations and tend to cancel out each other on averaging.
Non-sampling errors: These are attributed to factors that can be controlled and eliminated by suitable actions. It is worthwhile to eliminate these errors. They are due to the following factors:
Faulty planning, faulty definitions
Defective methods of interviewing
Personal bias of investigator
Lack of trained and qualified investigators
Respondent’s failure to answer
Improper coverage
Compiling errors
Publication errors
Biased errors: These arise in both census and sampling methods. These errors occur due to personal bias of the investigator and the instruments used for measuring. They are also due to faulty collection of data, respondent’s bias and bias due to non-response. Biased errors have a tendency to grow with sample size. Therefore, they are also known as cumulative errors. The magnitude of biased errors is directly proportional to the sample size.
(b) Quota sampling: It is a type of judgment sampling. Under this design, quotas are set up according to some specified characteristic such as age groups or income groups. From each group a specified number of units are sampled according to the quota allotted to the group. Within the group the selection of sample units depends on personal judgment. It has a risk of personal prejudice and bias entering the process. This method is often used in public opinion studies.
6. (a) Why do we use a chi-square test?
(b) Why do we use analysis of variance?
Ans. (a) Chi-square is a statistical test commonly used to compare observed data with data we would expect to obtain according to a specific hypothesis. For example, if, according to Mendel's laws, you expected 10 of 20 offspring from a cross to be male and the actual observed number was 8 males, then you might want to know about the "goodness of fit" between the observed and expected. Were the deviations (differences between observed and expected) the result of chance, or were they due to other factors? How much deviation can occur before you, the investigator, must conclude that something other than chance is at work, causing the observed to differ from the expected? The chi-square test is always testing what scientists call the null hypothesis, which states that there is no significant difference between the expected and observed result.
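For the offspring example, the chi-square statistic can be worked out with a short sketch. It assumes the 20 offspring split into 8 males and therefore 12 females, against an expected 10 of each. The critical value 3.841 is the standard table value at the 5 % level for 1 degree of freedom, and the function name is illustrative.

```python
def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [8, 12]   # males, females (12 females follows from 20 - 8)
expected = [10, 10]  # expected under the hypothesis

chi2 = chi_square_statistic(observed, expected)

CRITICAL_5PCT_DF1 = 3.841  # table value, 5 % level, 1 degree of freedom
significant = chi2 > CRITICAL_5PCT_DF1

print(round(chi2, 2), significant)  # 0.8 False
```

Since 0.8 is well below 3.841, the deviation of 8 from the expected 10 can reasonably be attributed to chance, and the null hypothesis is not rejected.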
(b) When we have more than two populations, we use the analysis of variance to evaluate the differences between their means. Analysis of variance (ANOVA) enables us to test for the significance of the differences among more than two sample means. Using analysis of variance, we will be able to make inferences about whether our samples are drawn from populations having the same mean or not.
Analysis of variance is useful in such situations as comparing the mileage achieved by five different brands of gasoline, testing which of four different training methods produce the fastest learning record, or comparing the first-year earnings of the graduates of half a dozen different business schools. In each of these cases, we would compare the means of more than two samples. Hence, in most of the fields, such as agriculture, medical, finance, banking, insurance, education, the concept of analysis of variance (ANOVA) is used.
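A one-way analysis of variance for such a comparison can be sketched as below. The mileage figures for three brands are hypothetical, the function name is illustrative, and the critical value 5.14 is the 5 % table value of F for DF (2,6).

```python
def one_way_anova_f(groups):
    """F ratio = variance between samples / variance within samples."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(g) / len(g) for g in groups]

    # Sum of squares between samples and within samples.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    ss_within = sum((v - m) ** 2
                    for g, m in zip(groups, group_means) for v in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

# Hypothetical mileage readings for three brands of gasoline.
mileage = [[20, 22, 24], [28, 30, 32], [24, 26, 28]]

f_value = one_way_anova_f(mileage)
F_CRITICAL_2_6 = 5.14  # table value, 5 % level, DF (2, 6)
print(f_value, f_value > F_CRITICAL_2_6)  # 12.0 True
```

Here the calculated F exceeds the table value, so the hypothesis that all three brands give the same mean mileage would be rejected for this made-up data.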