MB0040 Statistics for Management Sem 1 Aug Spring Assignment

Fall 2011- August drive
MBA SEMESTER 1
MB0040 STATISTICS FOR MANAGEMENT- 4 Credits (Book ID: B1129)
Assignment Set- 1 (60 Marks)
Note: Each question carries 10 Marks. Answer all the questions.

1. (a) Statistics is the backbone of decision-making. Comment. [5 marks]
(b) Give the plural meaning of the word Statistics. [5 marks]

2. a. In a bivariate data on x and y, variance of x = 49, variance of y = 9 and covariance (x,y) = -17.5. Find coefficient of correlation between x and y. [5 marks]
b. Enumerate the factors which should be kept in mind for proper planning. [5 marks]

3. The percentage sugar content of Tobacco in two samples is given in Table 1. Test whether their population variances are same. [10 marks]

Table 1. Percentage sugar content of Tobacco in two samples

Sample A    Sample B
2.4         2.7
2.7         3.0
2.6         2.8
2.1         3.1
2.5         2.2
            3.6

4. a. Explain the characteristics of business forecasting. [5 marks]
b. Differentiate between prediction, projection and forecasting. [5 marks]

5. What are the components of time series? Bring out the significance of moving average in analysing a time series and point out its limitations. [10 marks]

6. a. List down various measures of central tendency and explain the difference between them. [5 marks]
b. What is a confidence interval, and why is it useful? What is a confidence level? [5 marks]

Q1. (a) Statistics is the backbone of decision-making. Comment. [5 marks]
(b) Give the plural meaning of the word Statistics. [5 marks]

(a). Due to advanced communication networks, rapid changes in consumer behaviour, the varied expectations of different consumers and new market openings, modern managers have the difficult task of making quick and appropriate decisions. Therefore, they need to depend more upon quantitative techniques like mathematical models, statistics, operations research and econometrics.

Decision making is a key part of our day-to-day life. Even when we wish to purchase a television, we like to know the price, quality, durability and maintainability of various brands and models before buying one. In this scenario we are collecting data and making an optimum decision. In other words, we are using Statistics. Again, suppose a company wishes to introduce a new product: it has to collect data on market potential, consumer likings, availability of raw materials and the feasibility of producing the product. Hence, data collection is the backbone of any decision-making process.

Many organisations find themselves data-rich but poor in drawing information from it. Therefore, it is important to develop the ability to extract meaningful information from raw data to make better decisions. Statistics plays an important role in this aspect.

Statistics is broadly divided into two main categories: descriptive statistics and inferential statistics. The figure below illustrates the two categories.


Figure: Divisions in Statistics

Descriptive Statistics: Descriptive statistics is used to present a general description of data, summarised quantitatively. This is mostly useful in clinical research, when communicating the results of experiments.

Inferential Statistics: Inferential statistics is used to make valid inferences from the data which are helpful in effective decision making for managers or professionals. Statistical methods such as estimation, prediction and hypothesis testing belong to inferential statistics. Researchers draw conclusions from the collected data samples regarding the characteristics of the large population from which the samples are taken.

So, we can say Statistics is the backbone of decision-making.

(b) Give the plural meaning of the word Statistics.

The word statistics is used as the plural of the word statistic, which refers to a numerical quantity like mean, median, variance etc., calculated from sample values. For example, if we select 15 students from a class of 80 students, measure their heights and find the average height, this average would be a statistic.

In the plural sense, the word statistics refers to numerical facts and figures collected in a systematic manner with a definite purpose in any field of study. In this sense, statistics are also aggregates of facts which are expressed in numerical form. Examples are statistics on industrial production, or statistics on the population growth of a country in different years.

Q2. a. In a bivariate data on x and y, variance of x = 49, variance of y = 9 and covariance (x,y) = -17.5. Find coefficient of correlation between x and y. [5 marks]
b. Enumerate the factors which should be kept in mind for proper planning. [5 marks]

a. In a bivariate data on x and y, variance of x = 49, variance of y = 9 and covariance (x,y) = -17.5. Find coefficient of correlation between x and y.

We know that the coefficient of correlation is

$r = \frac{\mathrm{Cov}(x, y)}{\sigma_x \sigma_y}$

Given that Cov(x, y) = -17.5, $\sigma_x = \sqrt{49} = 7$ and $\sigma_y = \sqrt{9} = 3$,

$r = \frac{-17.5}{7 \times 3} = -0.83$

which means that there is a high negative correlation between x and y.

b. Enumerate the factors which should be kept in mind for proper planning.

The relevance and accuracy of data obtained in a survey depend upon the care exercised in planning. A properly planned investigation can lead to the best results with the least cost and time. Steps involved in the planning stage.
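The arithmetic in part (a) can be checked with a few lines of Python. This is only a minimal sketch: the function name `correlation_from_moments` is made up for illustration, and the inputs are the three figures given in the question.

```python
import math


def correlation_from_moments(cov_xy, var_x, var_y):
    """Coefficient of correlation from the covariance and the two variances."""
    if var_x <= 0 or var_y <= 0:
        raise ValueError("variances must be positive")
    # r = Cov(x, y) / (sigma_x * sigma_y)
    return cov_xy / (math.sqrt(var_x) * math.sqrt(var_y))


r = correlation_from_moments(cov_xy=-17.5, var_x=49, var_y=9)
print(f"r = {r:.2f}")
```

The sign of r follows the sign of the covariance, which is why the result is negative here.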


Q3. The percentage sugar content of Tobacco in two samples is given in Table 1. Test whether their population variances are same. [10 marks]

Table 1. Percentage sugar content of Tobacco in two samples

Sample A    Sample B
2.4         2.7
2.7         3.0
2.6         2.8
2.1         3.1
2.5         2.2
            3.6

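Required values to calculate the variance of Sample A (deviations taken from 2.5)

X        d = X - 2.5    d²
2.4      -0.1           0.01
2.7       0.2           0.04
2.6       0.1           0.01
2.1      -0.4           0.16
2.5       0             0
Total    -0.2           0.22

Required values to calculate the variance of Sample B (deviations taken from 3)

X        d = X - 3      d²
2.7      -0.3           0.09
3.0       0             0
2.8      -0.2           0.04
3.1       0.1           0.01
2.2      -0.8           0.64
3.6       0.6           0.36
Total    -0.6           1.14

$S_A^2 = \frac{1}{n_A - 1}\left[\sum d^2 - \frac{(\sum d)^2}{n_A}\right] = \frac{1}{4}\left[0.22 - \frac{(-0.2)^2}{5}\right] = 0.053$

$S_B^2 = \frac{1}{n_B - 1}\left[\sum d^2 - \frac{(\sum d)^2}{n_B}\right] = \frac{1}{5}\left[1.14 - \frac{(-0.6)^2}{6}\right] = 0.216$

$F = \frac{S_B^2}{S_A^2} = \frac{0.216}{0.053} = 4.08$

The tabulated value of F for (5, 4) degrees of freedom at the 5% level is 6.26. The calculated value is smaller, so the difference is not significant, and the hypothesis that the two population variances are the same is not rejected.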
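The variance-ratio calculation for the two samples can be reproduced with Python's standard `statistics` module. The sketch below is illustrative only: the helper `variance_ratio` is a made-up name, and the critical value 6.26 is typed in by hand as an assumed 5% table value for (5, 4) degrees of freedom rather than computed.

```python
from statistics import variance

# Percentage sugar content of the two tobacco samples (Table 1).
sample_a = [2.4, 2.7, 2.6, 2.1, 2.5]
sample_b = [2.7, 3.0, 2.8, 3.1, 2.2, 3.6]


def variance_ratio(first, second):
    """Return (F, df_numerator, df_denominator) with the larger variance on top."""
    var_first, var_second = variance(first), variance(second)  # divide by n - 1
    if var_first >= var_second:
        return var_first / var_second, len(first) - 1, len(second) - 1
    return var_second / var_first, len(second) - 1, len(first) - 1


f_ratio, df_num, df_den = variance_ratio(sample_a, sample_b)

# Assumption: 5% table value of F for (5, 4) degrees of freedom, entered by hand.
F_CRITICAL_5_PERCENT = 6.26

variances_differ = f_ratio > F_CRITICAL_5_PERCENT
print(f"F = {f_ratio:.2f} with ({df_num}, {df_den}) degrees of freedom")
print("significant" if variances_differ else "not significant")
```

Putting the larger sample variance in the numerator keeps F at or above 1, so only the upper tail of the F table is needed.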

Q4. (a) Explain the characteristics of business forecasting. [5 marks]
(b) Differentiate between prediction, projection and forecasting. [5 marks]

(a) Characteristics of Business Forecasting

• Based on past and present conditions: Business forecasting is based on the past and present economic condition of the business. To forecast the future, various data, information and facts concerning the economic condition of the business, past and present, are analysed.
• Based on mathematical and statistical methods: The process of forecasting includes the use of statistical and mathematical methods. By using these methods, the actual trend which may take place in future can be forecast.
• Period: The forecast can be made for the long term, short term, medium term or any specific term.
• Estimation of future: Business forecasting aims to forecast the future regarding probable economic conditions.
• Scope: The forecast can be physical as well as financial.

Steps in Forecasting: The forecasting of business fluctuations consists of the following steps:

• Understanding why changes in the past have occurred: One of the basic principles of statistical forecasting is that the forecaster should use the data on past performance. The current rate and changes in the rate constitute the basis of forecasting. Once they are known, various mathematical techniques can develop projections from them. If an attempt is made to forecast business fluctuations without understanding why past changes have taken place, the forecast will be purely mechanical, based solely upon the application of mathematical formulae and subject to serious error.
• Determining which phases of business activity must be measured: After it is known why business fluctuations have occurred, it is necessary to measure certain phases of business activity in order to predict what changes will probably follow the present level of activity.
• Selecting and compiling data to be used as measuring devices: There is an interdependent relationship between the selection of statistical data and the determination of why business fluctuations occur. Statistical data cannot be collected and analysed in an intelligent manner unless there is a sufficient understanding of business fluctuations. It is important that reasons for business fluctuations be stated in such a manner that it is possible to secure data that are related to the reasons.
• Analysing the data: Lastly, the data are analysed in the light of the understanding of why changes occur. For example, if it is reasoned that a certain combination of forces will result in a given change, the statistical part of the problem is to measure these forces, from the data available, to draw conclusions on the future course of action. The methods of drawing conclusions may be called forecasting techniques.

Methods of Business Forecasting: Almost all businessmen make forecasts about the business conditions related to their business. In recent years scientific methods of forecasting have been developed. The base of scientific forecasting is statistics. To handle the increasing variety of managerial forecasting problems, several forecasting techniques have been developed in recent years. Forecasting techniques vary from simple expert guesses to complex analysis of mass data. Each technique has its special use, and care must be taken to select the correct technique for a particular situation. Before applying a method of forecasting, the following questions should be answered:

• What is the purpose of the forecast, and how is it to be used?
• What are the dynamics and components of the system for which the forecast will be made?
• How important is the past in estimating the future?


(b). Differentiate between prediction, projection and forecasting

Business forecasting has always been one component of running an enterprise. However, forecasting traditionally was based less on concrete and comprehensive data than on face-to-face meetings and common sense. In recent years, business forecasting has developed into a much more scientific endeavour, with a host of theories, methods, and techniques designed for forecasting certain types of data. The development of information technologies and the Internet propelled this development into overdrive, as companies adopted such technologies not only into their business practices but into their forecasting schemes as well. In the 2000s, projecting the optimal levels of goods to buy or products to produce involved sophisticated software and electronic networks that incorporate mounds of data and advanced mathematical algorithms tailored to a company's particular market conditions and line of business.

Business forecasting involves a wide range of tools, including simple electronic spreadsheets, enterprise resource planning (ERP) and electronic data interchange (EDI) networks, advanced supply chain management systems, and other Web-enabled technologies. The practice attempts to pinpoint key factors in business production and extrapolate from given data sets to produce accurate projections for future costs, revenues, and opportunities. This normally is done with an eye toward adjusting current and near-future business practices to take maximum advantage of expectations.

In the Internet age, the field of business forecasting was propelled by three interrelated phenomena. First, the Internet provided a new series of tools to aid the science of business forecasting. Second, business forecasting had to take the Internet itself into account in trying to construct viable models and make predictions. Finally, the Internet fostered vastly accelerated transformations in all areas of business that made the job of business forecasters that much more exacting.

By the 2000s, as the Internet and its myriad functions highlighted the central importance of information in economic activity, more and more companies came to recognize the value, and often the necessity, of business forecasting techniques and systems. Business forecasting is indeed big business, with companies investing tremendous resources in systems, time, and employees aimed at bringing useful projections into the planning process. According to a survey by the Hudson, Ohio-based Answer Think Consulting Group, which specializes in studies of business planning, the average U.S. company spends more than 25,000 person-days on business forecasting and related activities for every billion dollars of revenue. Companies have a vast array of business forecasting systems and software from which to choose, but choosing the correct one for their particular needs requires a good deal of investigation. According to the Journal of Business Forecasting Methods & Systems, any forecasting system


Q5. What are the components of time series? Bring out the significance of moving average in analysing a time series and point out its limitations.

Components of Time Series

The behaviour of a time series over periods of time is called the movement of the time series. The time series is classified into the following four components:
i) Long term trend or secular trend
ii) Seasonal variations
iii) Cyclic variations
iv) Random variations

Method of moving averages

The moving averages method is used for smoothing the time series. That is, it smooths the fluctuations of the data by the method of moving averages.

When period of moving average is odd

To determine the trend by this method, the procedure is described in the figure below.

[Figure: Procedure for determining the trend when moving average is odd]

By plotting these trend values (if desired) you can obtain the trend curve, with the help of which you can determine whether the trend is increasing or decreasing. If needed, you can also compute short-term fluctuations by subtracting the trend values from the actual values.

When period of moving average is even

When the period of moving average is even (such as 4 years), we compute the moving averages by using the steps described in the figure below.

[Figure: Procedure for determining the trend when moving average is even]

Merits and demerits of moving averages method

Merits
• This is a simple method.
• This method is objective in the sense that anybody working on a problem with this method will get the same results.
• This method is used for determining seasonal, cyclic and irregular variations besides the trend values.
• This method is flexible enough to add more figures to the data because the entire calculations are not changed.
• If the period of moving averages coincides with the period of cyclic fluctuations in the data, such fluctuations are automatically eliminated.

Demerits
• There is no functional relationship between the values and the time. Thus, this method is not helpful in forecasting and predicting the values on the basis of time.
• There are no trend values for some years in the beginning and some in the end. For example, for a 5 yearly moving average, there will be no trend values for the first two years and the last two years.
• In case of nonlinear trend, the values obtained by this method are biased in one or the other direction.
• The period selection of moving average is a difficult task. Hence, great care has to be taken in period selection, particularly when there is no business cycle during that time.
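The odd-period and even-period procedures can be sketched in Python as follows. The function names are made up for illustration. The series is the production data (in lakh tons, 1988 to 1997) from Question 2 of Assignment Set 2 in this document, read from the flattened table on the assumption that the ten figures follow the ten years in order.

```python
def moving_average(values, period):
    """Plain moving averages over consecutive windows of `period` values."""
    if period < 1 or period > len(values):
        raise ValueError("period must be between 1 and the series length")
    return [sum(values[i:i + period]) / period
            for i in range(len(values) - period + 1)]


def trend_values(values, period):
    """Trend by moving averages.

    Odd period: each average already sits against the middle year.
    Even period: each average falls between two years, so adjacent
    averages are averaged once more to centre them on a year.
    """
    averages = moving_average(values, period)
    if period % 2 == 1:
        return averages
    return [(a + b) / 2 for a, b in zip(averages, averages[1:])]


years = list(range(1988, 1998))
production = [15, 18, 16, 22, 19, 24, 20, 28, 22, 30]

for period in (3, 4, 5):
    trend = trend_values(production, period)
    first = period // 2  # number of years lost at the start of the series
    print(f"{period}-yearly trend:")
    for year, value in zip(years[first:], trend):
        print(f"  {year}: {value:.2f}")
```

The output makes the second demerit visible: the 5 yearly trend starts in 1990 and stops in 1995, so the first two and last two years have no trend value.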

Q6. a. List down various measures of central tendency and explain the difference between them.

Measures of Central Tendency

Several different measures of central tendency are defined below.

1. Arithmetic Mean

The arithmetic mean is the most common measure of central tendency. It is simply the sum of the numbers divided by the number of numbers. The symbol μ is used for the mean of a population. The symbol M is used for the mean of a sample. The formula for μ is shown below:

$\mu = \frac{\sum X}{N}$

where ΣX is the sum of all the numbers in the population and N is the number of numbers in the population. The formula for M is the same, with the sum and the count taken over the sample. As an example, the mean of the numbers 1, 2, 3, 6 and 8 is 20/5 = 4, regardless of whether the numbers constitute the entire population or just a sample from the population.

Table 1 (Number of touchdown passes) shows the number of touchdown (TD) passes thrown by each of the 31 teams in the National Football League in the 2000 season. The mean number of touchdown passes thrown is 20.4516, as shown below.

$\mu = \frac{\sum X}{N} = \frac{634}{31} = 20.4516$

Table 1: Number of touchdown passes

37    22    19    14
33    22    19    14
33    22    18    14
32    21    18    12
29    21    18    12
28    21    18     9
28    20    16     6
23    20    15

Although the arithmetic mean is not the only "mean" (there is also a geometric mean), it is by far the most commonly used. Therefore, if the term "mean" is used without specifying whether it is the arithmetic mean, the geometric mean, or some other mean, it is assumed to refer to the arithmetic mean.

2. Median

The median is also a frequently used measure of central tendency. The median is the midpoint of a distribution: the same number of scores is above the median as below it. For the data in Table 1 (Number of touchdown passes), there are 31 scores. The 16th highest score (which equals 20) is the median because there are 15 scores below the 16th score and 15 scores above the 16th score. The median can also be thought of as the 50th percentile.

Let's return to the made-up example of the quiz on which you made a three, discussed previously in the module Introduction to Central Tendency and shown in Table 2.

Table 2: Three possible datasets for the 5-point make-up quiz

Student    Dataset 1    Dataset 2    Dataset 3
You        3            3            3
Jhon       3            4            2
Maria      3            4            2
Jasmir     3            4            2
Sam        3            5            1

For Dataset 1, the median is three, the same as your score. For Dataset 2, the median is 4. Therefore, your score is below the median. This means you are in the lower half of the class. Finally, for Dataset 3, the median is 2. For this dataset, your score is above the median and therefore in the upper half of the distribution.

3. Mode

The mode is the most frequently occurring value. For the data in Table 1 (Number of touchdown passes), the mode is 18 since more teams (4) had 18 touchdown passes than any other number of touchdown passes. With continuous data such as response time measured to many decimals, the frequency of each value is one since no two scores will be exactly the same (see discussion of continuous variables). Therefore the mode of continuous data is normally computed from a grouped frequency distribution. Table 3 shows a grouped frequency distribution for the target response time data. Since the interval with the highest frequency is 600-700, the mode is the middle of that interval (650).

Table 3: Grouped frequency distribution

Range        Frequency
500-600      3
600-700      6
700-800      5
800-900      5
900-1000     0
1000-1100    1
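The three measures quoted for the touchdown data can be verified with Python's standard `statistics` module. This is a minimal sketch using the 31 values from Table 1 and the grouped frequencies from Table 3. The variable names are illustrative, and taking the mode of grouped data as the midpoint of the most frequent interval simply follows the rule stated in the text above.

```python
from statistics import mean, median, multimode

# Number of touchdown passes for the 31 teams (Table 1).
td_passes = [37, 33, 33, 32, 29, 28, 28, 23,
             22, 22, 22, 21, 21, 21, 20, 20,
             19, 19, 18, 18, 18, 18, 16, 15,
             14, 14, 14, 12, 12, 9, 6]

td_mean = mean(td_passes)        # sum of the values divided by their count
td_median = median(td_passes)    # middle (16th) value of the sorted scores
td_modes = multimode(td_passes)  # every value tied for the highest frequency

print(f"mean = {td_mean:.4f}, median = {td_median}, mode(s) = {td_modes}")

# Grouped frequency distribution of response times (Table 3).
grouped = {(500, 600): 3, (600, 700): 6, (700, 800): 5,
           (800, 900): 5, (900, 1000): 0, (1000, 1100): 1}

# Mode of grouped data, taken here as the midpoint of the most frequent interval.
low, high = max(grouped, key=grouped.get)
grouped_mode = (low + high) / 2
print(f"grouped mode = {grouped_mode}")
```

`multimode` is used instead of `mode` so that a tie between values would show up as a list with more than one entry.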

Proportions and Percentages

When the focus is on the degree to which a population possesses a particular attribute, the measure of interest is a percentage or a proportion. A proportion refers to the fraction of the total that possesses a certain attribute. For example, we might ask what proportion of women in our sample weigh less than 135 pounds. Since 3 women weigh less than 135 pounds, the proportion would be 3/5 or 0.60. A percentage is another way of expressing a proportion. A percentage is equal to the proportion times 100. In our example of the five women, the percent of the total who weigh less than 135 pounds would be 100 * (3/5) or 60 percent.

Notation

Of the various measures, the mean and the proportion are most important. The notation used to describe these measures appears below:

X: Refers to a population mean.
x: Refers to a sample mean.
P: The proportion of elements in the population that has a particular attribute.
p: The proportion of elements in the sample that has a particular attribute.
Q: The proportion of elements in the population that does not have a specified attribute. Note that Q = 1 - P.
q: The proportion of elements in the sample that does not have a specified attribute. Note that q = 1 - p.

Q6 b. What is a confidence interval, and why is it useful? What is a confidence level?

Confidence Intervals

In statistics, a confidence interval (CI) is a particular kind of interval estimate of a population parameter and is used to indicate the reliability of an estimate. It is an observed interval (i.e. it is calculated from the observations), in principle different from sample to sample, that frequently includes the parameter of interest if the experiment is repeated. How frequently the observed interval contains the parameter is determined by the confidence level or confidence coefficient.

A confidence interval with a particular confidence level is intended to give the assurance that, if the statistical model is correct, then taken over all the data that might have been obtained, the procedure for constructing the interval would deliver a confidence interval that included the true value of the parameter the proportion of the time set by the confidence level. More specifically, the meaning of the term "confidence level" is that, if confidence intervals are constructed across many separate data analyses of repeated (and possibly different) experiments, the proportion of such intervals that contain the true value of the parameter will approximately match the confidence level; this is guaranteed by the reasoning underlying the construction of confidence intervals.

A confidence interval does not predict that the true value of the parameter has a particular probability of being in the confidence interval given the data actually obtained. (An interval intended to have such a property, called a credible interval, can be estimated using Bayesian methods; but such methods bring with them their own distinct strengths and weaknesses.)

The confidence level sets the boundaries of a confidence interval; it is conventionally set at 95% to coincide with the 5% convention of statistical significance in hypothesis testing. In some studies narrower (e.g. 90%) or wider (e.g. 99%) confidence intervals will be required. This depends upon the nature of your study. You should consult a statistician before using CIs other than 95%.

You will hear the terms confidence interval and confidence limit used. The confidence interval is the range Q-X to Q+Y, where Q is the value that is central to the study question, Q-X is the lower confidence limit and Q+Y is the upper confidence limit.

Familiarize yourself with alternative CI interpretations:

Common: A 95% CI is the interval that you are 95% certain contains the true population value as it might be estimated from a much larger study. The value in question can be a mean, a difference between two means, a proportion etc. The CI is usually, but not necessarily, symmetrical about this value.

Pure Bayesian: The Bayesian concept of a credible interval is sometimes put forward as a more practical concept than the confidence interval. For a 95% credible interval, the value of interest (e.g. size of treatment effect) lies with a 95% probability in the interval. This interval is then open to subjective moulding of interpretation. Furthermore, the credible interval can only correspond exactly to the confidence interval if the prior probability is so-called "uninformative".

Pure frequentist: Most pure frequentists say that it is not possible to make probability statements, such as the CI interpretation, about the study values of interest in hypothesis tests.

Neymanian: A 95% CI is the interval which will contain the true value on 95% of occasions if a study were repeated many times using samples from the same population. Neyman originated the concept of CI as follows: if we test a large number of different null hypotheses at one critical level, say 5%, then we can collect all of the null hypotheses that are not rejected into one set. This set usually forms a continuous interval that can be derived mathematically, and Neyman described the limits of this set as confidence limits that bound a confidence interval. If the critical level (probability of incorrectly rejecting the null hypothesis) is 5% then the interval is 95%. Any values of the treatment effect that lie outside the confidence interval are regarded as "unreasonable" in terms of hypothesis testing at the critical level.
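The repeated-sampling (Neymanian) reading of a 95% confidence level can be illustrated with a small simulation. The sketch below rests on stated assumptions: the population is normal with a known standard deviation, so a z-interval applies, and the population mean, standard deviation, sample size and number of trials are all made-up figures chosen only for the demonstration.

```python
import random
from statistics import NormalDist, mean


def z_interval(sample, sigma, level=0.95):
    """Confidence interval for a mean when the population sigma is known."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% level
    centre = mean(sample)
    half_width = z * sigma / len(sample) ** 0.5
    return centre - half_width, centre + half_width


def coverage(true_mean, sigma, n, trials, level=0.95, seed=0):
    """Share of simulated intervals that contain the true mean."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        low, high = z_interval(sample, sigma, level)
        hits += low <= true_mean <= high
    return hits / trials


# Hypothetical population: mean 2.5, standard deviation 0.25, samples of 20.
rate = coverage(true_mean=2.5, sigma=0.25, n=20, trials=4000)
print(f"share of 95% intervals containing the true mean: {rate:.3f}")
```

The share should come out close to 0.95. Any single interval either contains the true mean or does not; the 95% describes the procedure over many repetitions.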


MBA SEMESTER 1
MB0040 STATISTICS FOR MANAGEMENT- 4 Credits (Book ID: B1129)
Assignment Set- 2 (60 Marks)
Note: Each question carries 10 Marks. Answer all the questions.

1. (a) What are the characteristics of a good measure of central tendency? [5 marks]
(b) What are the uses of averages? [5 marks]

2. Calculate the 3 yearly and 5 yearly averages of the data in the table below. [10 marks]

Table 1: Production data from 1988 to 1997

Year    Production (in Lakh ton)
1988    15
1989    18
1990    16
1991    22
1992    19
1993    24
1994    20
1995    28
1996    22
1997    30

3. (a) What is meant by secular trend? Discuss any two methods of isolating trend values in a time series. [5 marks]
(b) What is seasonal variation of a time series? Describe the various methods you know to evaluate it and examine their relative merits. [5 marks]

4. The probability that a contractor will get an electrical job is 0.8, that he will get a plumbing job is 0.6, and that he will get both is 0.48. What is the probability that he gets at least one? Are the events of getting an electrical job and a plumbing job independent? [10 marks]

5. (a) Discuss the errors that arise in statistical survey. [5 marks]
(b) What is quota sampling and when do we use it? [5 marks]

6. (a) Why do we use a chi-square test? [5 marks]
(b) Why do we use analysis of variance? [5 marks]


Q1. (a) What are the characteristics of a good measure of central tendency? [5 marks]
(b) What are the uses of averages? [5 marks]

(a) Characteristics of a Good Average

(i) It should be rigidly defined. If an average is left to the estimation of an observer and if it is not a definite and fixed value, it cannot be representative of a series. The bias of the investigator in such cases would considerably affect the value of the average. If the average is rigidly defined, this instability in its value disappears and it is always a definite figure.

(ii) It should be based on all the observations of the series. If some of the items of the series are not taken into account in its calculation, the average cannot be said to be a representative one. As we shall see later on, there are some averages which do not take into account all the values of a group, and to this extent they are not satisfactory averages.

(iii) It should be capable of further algebraic treatment. If an average does not possess this quality, its use is bound to be very limited. It will not be possible to calculate, say, the combined average of two or more series from their individual averages; further, it will not be possible to study the average relationship of various parts of a variable if it is expressed as the sum of two or more variables. Many other similar studies would not be possible if the average is not capable of further algebraic treatment.

(iv) It should be easy to calculate and simple to follow. If the calculation of the average involves tedious mathematical processes, it will not be readily understood and its use will be confined only to a limited number of persons. It can never be a popular average. As such, one of the qualities of a good average is that it should not be too abstract or mathematical and there should be no difficulty in its calculation. Further, the properties of the average should be such that they can be easily understood by persons of ordinary intelligence.

(v) It should not be affected by fluctuations of sampling. If two independent sample studies are made in any particular field, the averages thus obtained should not materially differ from each other. No doubt, when two separate enquiries are made, there is bound to be a difference in the average values calculated, but in some cases this difference would be great while in others comparatively less. Those averages in which this difference, which is technically called "fluctuation of sampling", is less are considered better than those in which the difference is more.

One more thing to be remembered about averages is that the items whose average is being calculated should form a homogeneous group. It is absurd to talk about the average of a man's height and his weight. If the data from which an average is being calculated are not homogeneous, misleading conclusions are likely to be drawn. To find out the average production of cotton cloth per mill, if big and small mills are not separated the average would be unrepresentative. Similarly, to study the wage level in the cotton mill industry of India, separate averages should be calculated for the male and female workers. Again, adult workers should be studied separately from the juvenile group. Thus we see that, as far as possible, the data from which an average is calculated should be a homogeneous lot. Homogeneity can be achieved either by selecting only like items or by dividing the heterogeneous data into a number of homogeneous groups.

(b) What are the uses of averages?

The use or application of a particular average depends upon the purpose of the investigation. Some of the uses of different averages are as follows:

Arithmetic Mean: The arithmetic mean is considered an ideal average. It is frequently used in all aspects of life. It possesses many mathematical properties, and due to this it is of immense utility in further statistical analysis. In economic analysis the arithmetic mean is used extensively to calculate average production, average wage, average cost, per capita income, exports, imports, consumption, prices, etc. When different items of a series have different relative importance, the weighted arithmetic mean is used.

Geometric Mean: The geometric mean is important in a series having items of wide dispersion. It is used in the construction of index numbers. The averages of proportions, percentages and compound rates are computed by the geometric mean. The growth of population is measured by it, as population increases in geometric progression.

Harmonic Mean: The harmonic mean is applied in problems where small items must get more relative importance than the large ones. It is useful in cases where time, speed, values given in quantities, rates and prices are involved. But in practice, it has little applicability.

Median and Partition Values: The median and partition values are positional measures of central tendency. They are mainly used in qualitative cases like honesty, intelligence, ability, etc. In distributions which are positively skewed, the median is a more suitable average. These are also suitable for problems of distribution of income, wealth, investment, etc.

Mode: The mode is also a positional average. Its applicability to daily problems is increasing. The mode is used to calculate the 'modal size of a collar', 'modal size of shoe', or 'modal size of readymade garments' etc. It is also used in the sciences of Biology and Meteorology, and in Business and Industry.
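A short Python sketch shows why the choice of average depends on the purpose. All of the figures below are hypothetical and chosen only to illustrate the cases named above: speeds for the harmonic mean, compound growth for the geometric mean, and a skewed wage list for the median.

```python
from statistics import geometric_mean, harmonic_mean, mean, median

# Hypothetical: the same distance driven once at 40 km/h and once at 60 km/h.
speeds_kmh = [40, 60]
average_speed = harmonic_mean(speeds_kmh)      # correct average speed for the trip
naive_speed = mean(speeds_kmh)                 # overstates it

# Hypothetical: output grows by 10% in one year and 20% in the next.
growth_factors = [1.10, 1.20]
average_growth = geometric_mean(growth_factors)  # compound yearly growth factor

# Hypothetical daily wages with one very large value (positively skewed).
wages = [200, 220, 240, 260, 1000]
wage_mean = mean(wages)      # pulled upward by the extreme value
wage_median = median(wages)  # stays at the typical wage

print(f"average speed: harmonic {average_speed:.1f}, arithmetic {naive_speed:.1f}")
print(f"average yearly growth: {(average_growth - 1) * 100:.2f}%")
print(f"wages: mean {wage_mean}, median {wage_median}")
```

In the wage example the median stays at 240 while the mean rises to 384, which is the sense in which the median suits positively skewed distributions such as income.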


Date post:	13-Jul-2015
Upload:	salim-kottayil