Date post: 06-Apr-2018
Upload: satish-patil
8/3/2019 3. MB0040 Mba1 Stats
http://slidepdf.com/reader/full/3-mb0040-mba1-stats 1/23
Fall 2010
Submitted By: Satish Patil Roll Number: 521053391 Learning Center: 01736
CANDIDATE NAME: Satish Patil
ROLL NUMBER: 521053391
LEARNING CENTER: 01736
COURSE: Master of Business Administration
SEMESTER: I
SUBJECT NAME: MB0040 – STATISTICS FOR MANAGEMENT
ASSIGNMENT NO: Set-1
DATE OF SUBMISSION AT THE LEARNING CENTRE: 10 Dec 2010
FACULTY SIGNATURE:
MBA SEMESTER 1 MB0040 – STATISTICS FOR MANAGEMENT- 4 Credits
(Book ID: B1129)
Assignment Set- 1 (60 Marks)
Note: Each question carries 10 Marks. Answer all the questions
1. Why is it necessary to summarise data? Explain the approaches available to
summarise data distributions.
Answer:
Graphical representation is a good way to present summarized data. However, graphs
provide only an overview and thus may not be used for further analysis. Hence, we use
summary statistics, such as averages, to analyze the data. Mass data, which is
collected, classified, tabulated and presented systematically, is analyzed further to reduce it
to a single representative figure. This single figure is the measure which can be found at the central
part of the range of all values, and it is the one which represents the entire data set. Hence, it is
called a measure of central tendency.
In other words, the tendency of data to cluster around a figure in a central location is
known as central tendency. A measure of central tendency, or average of the first order, describes the
concentration of a large number of values around a particular value. It is a single value which represents all units.
Statistical Averages: The commonly used statistical averages are the arithmetic mean, the geometric
mean and the harmonic mean.
Arithmetic mean: The arithmetic mean is defined as the sum of all values divided by the number of values, and is
denoted by X̄.
Before we study how to compute the arithmetic mean, we have to be familiar with the terms
discrete data, frequency and frequency distribution, which are used in this unit.
If the number of values is finite, the data is said to be discrete. The number of
occurrences of each value of the data set is called the frequency of that value. A systematic
presentation of the values taken by a variable, together with the corresponding frequencies, is called a
frequency distribution of the variable.
Median: The median of a set of values is the middlemost value when the values are
arranged in ascending order of magnitude. The median is denoted by M.
Mode: The mode is the value which has the highest frequency and is denoted by Z.
The modal value is most useful for business people. For example, shoe and readymade-garment
manufacturers would like to know the modal size of people in order to plan their operations. For
discrete data, with or without frequency, it is the value corresponding to the highest frequency.
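These three averages can be computed with Python's standard statistics module; the shoe-size figures below are hypothetical.

```python
import statistics

# Hypothetical shoe sizes for a sample of ten customers
sizes = [7, 8, 8, 9, 8, 10, 7, 8, 9, 8]

mean = statistics.mean(sizes)      # arithmetic mean: sum of values / number of values
median = statistics.median(sizes)  # middlemost value of the sorted data
mode = statistics.mode(sizes)      # most frequently occurring value

print(mean, median, mode)  # 8.2 8.0 8
```

Here the mode (size 8) is the figure a shoe or garment manufacturer would plan its operations around.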
Appropriate Situations for the use of Various Averages
1. Arithmetic mean is used when:
a. In-depth study of the variable is needed
b. The variable is continuous and additive in nature
c. The data are on the interval or ratio scale
d. The distribution is symmetrical
2. Median is used when:
a. The variable is discrete
b. There exist abnormal values
c. The distribution is skewed
d. The extreme values are missing
e. The characteristics studied are qualitative
f. The data are on the ordinal scale
3. Mode is used when:
a. The variable is discrete
b. There exist abnormal values
c. The distribution is skewed
d. The extreme values are missing
e. The characteristics studied are qualitative
4. Geometric mean is used when:
a. Rates of growth, ratios and percentages are to be studied
b. The variable is of a multiplicative nature
5. Harmonic mean is used when:
a. The study relates to speed or time
b. An average of rates which produce equal effects has to be found
Positional Averages
The median is the mid-value of a series of data. It divides the distribution into two equal portions. Similarly, we can divide a given distribution into four, ten, hundred or any other number of equal portions.
2. Explain the purpose of tabular presentation of statistical data. Draft a form of
tabulation to show the distribution of population according to i) community by
age, ii) literacy, iii) sex, and iv) marital status.
Answer:
Tabulation is an orderly arrangement of data in columns and rows systematically in a
tabular form. It is the logical listing of related quantitative data in vertical columns and
horizontal rows. The presentation of data in tables should be simple, systematic and
unambiguous.
The purpose of tabular presentation of statistical data is to:
- Simplify complex data: Tabulation simplifies complex data by presenting it systematically in columns and rows in a condensed form, avoiding the unnecessary detail found in a narrative form.
- Highlight important characteristics: Because the unnecessary detail of a narrative form is avoided, the important characteristics of the data stand out.
- Present data in minimum space: Tabulation achieves economy of space in presenting the data. The textual matter is presented neatly in a short form without sacrificing the utility of the data.
- Facilitate comparison: Data presented in tabular form are helpful for a comparative study, as the relationships among the various items can be easily understood.
- Bring out trends and tendencies: Tabulation depicts the data and their significance directly in figures, which cannot be grasped when the same data are in narrative form.
- Facilitate further analysis: Tabulation is analytical in nature and hence helps in further analysis.
A draft form of tabulation is shown below.
Marital Status   Sex      Educated                             Non-Educated
                          Below 20 Yrs  20-40 Yrs  Above 40 Yrs  Below 20 Yrs  20-40 Yrs  Above 40 Yrs
Married          Male
                 Female
Unmarried        Male
                 Female
3. Give a brief note on the measures of central tendency together with their merits
and demerits. Which is the best measure of central tendency and why?
Answer:
Condensation of data is necessary for a proper statistical analysis. A large mass of
figures is not only confusing to the mind but also difficult to analyze. After a
thorough scrutiny of the collected data, classification, which is the process of arranging data
into different homogeneous classes according to resemblances and similarities, is carried
out first. Then tabulation of the data is resorted to. Besides removing complexity,
the classification and tabulation of the collected data render condensation
and comparison possible. An average is defined as a value which represents the whole
mass of data. It is a typical or central value summarizing the whole data. It is also called
a measure of central tendency because the individual values in the data show
some tendency to centre about this average. It will be located between the minimum
and the maximum of the values in the data.
There are five types of averages:
1. Arithmetic Mean
2. Median
3. Mode
4. Geometric Mean
5. Harmonic Mean
Arithmetic Mean: The arithmetic mean, or simply the mean, is the best known, most easily understood and
most frequently used average in statistical analysis. It is defined as the sum of all
the values in the data divided by the number of values.
Median: The median is another widely known and frequently used average. It is defined as
the most central, or middlemost, value of the data given in the form of an array. By
an array, we mean an arrangement of the data in either ascending or descending
order of magnitude. In the case of ungrouped data, one has to form an array first and
then locate the middlemost value, which is the median. For ungrouped data the median
is fixed by using:
Median = the [(n + 1)/2]th value in the array.
Mode: The word mode seems to have been derived from the French 'à la mode', which means
'that which is in fashion'. It is defined as the value in the data which occurs most
frequently. In other words, it is the most frequently occurring value in the data. For
ungrouped data we form the array and then fix the mode as the value which occurs
most frequently. If all the values are distinct from each other, the mode cannot be fixed.
A frequency distribution with just one highest frequency is called unimodal; one with
two highest frequencies is called bimodal. For a unimodal grouped distribution, the mode is found by
using the formula:
Mode = l + c × f2 / (f1 + f2)
where l is the lower limit of the modal class, c is its class interval, f1 is the frequency
preceding the highest frequency and f2 is the frequency succeeding the highest
frequency.
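The grouped-data mode formula quoted above can be sketched in Python; the class boundaries and frequencies below are hypothetical.

```python
# Sketch of the text's grouped-data mode formula: Mode = l + c*f2/(f1 + f2),
# where l is the lower limit of the modal class, c its class width,
# f1 the frequency preceding and f2 the frequency succeeding the modal class.
# Hypothetical unimodal frequency distribution (modal class not at either end):
classes = [(0, 10), (10, 20), (20, 30), (30, 40)]
freqs = [5, 8, 12, 7]

i = freqs.index(max(freqs))        # index of the modal class
l = classes[i][0]                  # lower limit of the modal class
c = classes[i][1] - classes[i][0]  # class width
f1 = freqs[i - 1]                  # frequency preceding the modal class
f2 = freqs[i + 1]                  # frequency succeeding the modal class

mode = l + c * f2 / (f1 + f2)
print(round(mode, 2))  # 24.67
```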
Relative merits and demerits of Mean, Median and Mode
Mean: The mean is the most commonly and frequently used average. It is a simple
average, understandable even to a layman. It is based on all the values in a given data set.
It is easy to calculate and is basic to the calculation of further statistical measures
such as dispersion and correlation. Of all the averages, it is the most stable one. However, it has
some demerits. It gives undue weightage to extreme values; in other words, it is greatly
influenced by extreme values. Moreover, it cannot be calculated for data with open-ended
classes at the extremes, and it cannot be fixed graphically, unlike the median or the
mode. It is the most useful average when the analysis is made with full
reference to the nature of the individual values of the data. In spite of a few shortcomings,
it is the most satisfactory average.
Median: The median is another well-known and widely used average. It has a well-defined
formula and is easily understood. It is advantageously used as a representative value of
factors or qualities which cannot be measured. Unlike the mean, the median can be
located graphically, and it is also possible to find the median for data with open-ended
classes at the extremes. However, it is not amenable to further algebraic treatment, it is an
average not based on all the values of the given data, and it is not as stable as the mean. It
has only a limited use in practice.
Mode: It is a useful measure of central tendency, as a representative of the majority of
values in the data. It is a practical average, easily understood by even laymen. Its
calculations are not difficult. It can be ascertained even for data with open-ended
classes at the extremes. It can be located by graphical means using a frequency curve.
However, the mode is not based on all the values in the data, and it becomes less useful when the
data distribution is not unimodal. Of all the averages, it is the most unstable average.
4. Machines are used to pack sugar into packets supposedly containing 1.20 kg
each. On testing a large number of packets over a long period of time, it was
found that the mean weight of the packets was 1.24 kg and the standard
deviation was 0.04 kg. A particular machine is selected to check the total
weight of each of the 25 packets filled consecutively by the machine. Calculate
the limits within which the weight of the packets should lie, assuming that the machine has not been classified as faulty.
Answer:
Mean weight of the packets, µ = 1.24 kg
• Standard deviation, σ = 0.04 kg
• Sample size, n = 25
• Standard error, SE = σ/√n = 0.04/√25 = 0.04/5 = 0.008 kg
• Taking 3-SE limits (covering about 99.7% of sample means), the mean weight of the 25 packets should lie between µ − 3SE and µ + 3SE:
• Lower limit = 1.24 − 3(0.008) = 1.216 kg
• Upper limit = 1.24 + 3(0.008) = 1.264 kg
If the observed mean weight falls outside these limits, the machine should be classified as faulty.
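One way to compute such 3-SE limits in Python, taking the long-run mean of 1.24 kg as the centre of the limits (a modelling choice):

```python
import math

mu = 1.24     # long-run mean weight of the packets (kg)
sigma = 0.04  # long-run standard deviation (kg)
n = 25        # packets in the sample

se = sigma / math.sqrt(n)  # standard error of the sample mean
lower = mu - 3 * se        # 3-sigma (~99.7%) control limits
upper = mu + 3 * se

print(round(lower, 3), round(upper, 3))  # 1.216 1.264
```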
5. A packaging device is set to fill detergent powder packets with a mean weight of
5 kg. The standard deviation is known to be 0.01 kg. These are known to drift
upwards over a period of time due to machine fault, which is not tolerable. A random sample of 100 packets is taken and weighed. This sample has a mean
weight of 5.03 kg and a standard deviation of 0.21 kg. Can we conclude that
the mean weight produced by the machine has increased? Use a 5% level of
significance.
Answer:
Set mean weight under the null hypothesis, µ0 = 5 kg
Sample size, n = 100
Sample mean weight, x̄ = 5.03 kg
Sample standard deviation, s = 0.21 kg
H0: µ = 5 kg (no upward drift); H1: µ > 5 kg, a one-tailed test at the 5% level, for which the critical value is Z = 1.645. Since drift is suspected, the sample standard deviation is used in place of the set value of 0.01 kg.
Z = (x̄ − µ0) / (s/√n) = (5.03 − 5) / (0.21/√100) = 0.03 / 0.021 = 1.43
Since 1.43 < 1.645, we fail to reject H0. At the 5% level of significance, the sample does not provide sufficient evidence to conclude that the mean weight produced by the machine has increased.
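A sketch of this one-tailed Z-test in Python, using the sample standard deviation (a modelling choice, since the set value of 0.01 kg predates the suspected drift):

```python
import math

mu0 = 5.0    # set mean weight under H0 (kg)
xbar = 5.03  # sample mean (kg)
s = 0.21     # sample standard deviation (kg)
n = 100      # sample size

z = (xbar - mu0) / (s / math.sqrt(n))  # test statistic
z_crit = 1.645                         # one-tailed critical value at the 5% level

print(round(z, 2), z > z_crit)  # 1.43 False -> cannot conclude an increase
```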
6. Find the probability that at most 5 defective bolts will be found in a box of 200
bolts, if it is known that 2 per cent of such bolts are expected to be defective.
(You may take the distribution to be Poisson; e^−4 = 0.0183.)
Answer:
Poisson distribution
A Poisson random variable is the number of successes that result from a Poisson
experiment.
The probability distribution of a Poisson random variable is called a Poisson
distribution.
Given the mean number of successes (µ) that occur in a specified region, we can
compute the Poisson probability based on the following formula:
Poisson Formula. Suppose we conduct a Poisson experiment, in which the average
number of successes within a given region is µ. Then, the Poisson probability is:
P(x; µ) = (e^−µ)(µ^x) / x!
Where x is the actual number of successes that result from the experiment and e is
approximately equal to 2.71828.
The Poisson distribution has the following properties:
The mean of the distribution is equal to µ
The variance is also equal to µ
Here, n = 200 and p = 0.02, so µ = np = 200 × 0.02 = 4.
P(X ≤ 5) = e^−4 × Σ 4^x / x!, summed over x = 0 to 5
= 0.0183 × (1 + 4 + 8 + 32/3 + 32/3 + 128/15)
= 0.0183 × 42.87
≈ 0.7845
Thus, the probability that at most 5 defective bolts will be found in a box of 200 bolts
is approximately 0.78.
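The Poisson sum can be checked with a few lines of Python; using the exact value of e^−4 rather than the rounded 0.0183 shifts the fourth decimal slightly.

```python
import math

lam = 200 * 0.02  # lambda = n*p = 4 expected defectives per box

# P(X <= 5) = e^-lambda * sum(lambda^x / x! for x = 0..5)
p_at_most_5 = math.exp(-lam) * sum(lam**x / math.factorial(x) for x in range(6))
print(round(p_at_most_5, 4))  # 0.7851
```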
CANDIDATE NAME: Satish Patil
ROLL NUMBER: 521053391
LEARNING CENTER: 01736
COURSE: Master of Business Administration
SEMESTER: I
SUBJECT NAME: MB0040 – STATISTICS FOR MANAGEMENT
ASSIGNMENT NO: Set-2
DATE OF SUBMISSION AT THE LEARNING CENTRE: 10 Dec 2010
FACULTY SIGNATURE:
MBA SEMESTER 1
MB0040 – STATISTICS FOR MANAGEMENT- 4 Credits
(Book ID: B1129)
Assignment Set- 2 (60 Marks)
Note: Each question carries 10 Marks. Answer all the questions
1. What do you mean by Statistical Survey? Differentiate between
“Questionnaire” and “Schedule”.
Answer:
A statistical survey is a scientific process of collection and analysis of numerical data. Statistical surveys are used to collect numerical information about units in a population. Surveys involve asking
questions of individuals, and surveys of human populations are common in the government, health,
social science and marketing sectors.
Stages of a Statistical Survey
Statistical surveys are carried out in two stages: planning and execution.
1) Planning a Statistical Survey: The relevance and accuracy of data obtained in a survey
depend upon the care exercised in planning. A properly planned investigation can lead
to the best results with the least cost and time.
A. The nature of the problem to be investigated should be clearly defined in an unambiguous
manner.
B. The objective of the investigation should be stated at the outset. Objectives could be to:
➢ Obtain certain estimates
➢ Establish a theory
➢ Verify an existing statement
➢ Find relationships between characteristics
C. The scope of the investigation has to be made clear. The scope refers to the area
to be covered, the identification of units to be studied, the nature of the characteristics to be observed,
the accuracy of measurements, the analytical methods, and the time, cost and other resources required.
D. Whether to use data collected from a primary or a secondary source should be determined in
advance.
E. The organization of the investigation is the final step in the process. It encompasses
determining the number of investigators required, their training, the supervision needed and the
funds required.
2) Execution of a Statistical Survey: Control methods should be adopted at every stage of carrying out the investigation to check the accuracy, coverage, methods of measurement, analysis and interpretation. The collected data should be edited, classified, tabulated and presented in diagrams and graphs, and then carefully and systematically analyzed and interpreted.
Difference between "Questionnaire" and "Schedule":
A questionnaire is a list of pre-designed and systematically arranged questions pertaining to
the subject of enquiry. It is meant for the respondents themselves, i.e., the population group from whom the data are
collected. Since a questionnaire is intended for the common man, it must be prepared with due care so that the necessary data can be easily collected. A schedule, on the other
hand, is only a list of items on which the data collector gathers data; it is meant for the investigator rather than the respondent. Therefore, unlike a questionnaire, a schedule
need not be complete. Even the questions in a schedule may be written in incomplete
sentences, since it depends solely upon the enquirer how he asks the questions.
2. The table shows the expenditure of a family on food, clothing,
education, rent and other items.
Items      Expenditure (Rs)
Food       4300
Clothing   1200
Education  700
Rent       2000
Others     600
Depict the data shown in the table using a pie chart.
Answer:
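The pie chart itself is a figure, but its slice angles follow directly from the table: each item's angle is its share of the total expenditure times 360 degrees. A sketch:

```python
expenditure = {"Food": 4300, "Clothing": 1200, "Education": 700,
               "Rent": 2000, "Others": 600}
total = sum(expenditure.values())  # 8800

# Each item's slice angle = (expenditure / total) * 360 degrees
angles = {item: round(amount / total * 360, 1)
          for item, amount in expenditure.items()}
print(angles)
```

Food takes the largest slice (about 175.9 degrees, nearly half the circle), followed by Rent (about 81.8 degrees).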
3. Average weight of 100 screws in box ‘A’ is 10.4 gms. It is mixed with 150
screws of box ‘B’. Average weight of mixed screws is 10.9 gms. Find the
average weight of screws of box ‘B’.
Answer:
Average weight of box A screws = 10.4 g (100 screws)
Average weight of the mixed screws = 10.9 g (100 + 150 = 250 screws)
Total weight of the mixed screws = 250 × 10.9 = 2725 g
Total weight of box A screws = 100 × 10.4 = 1040 g
Total weight of box B screws = 2725 − 1040 = 1685 g
Average weight of box B screws = 1685 / 150 ≈ 11.23 g
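The arithmetic can be verified with a short script: the total weight of the mixed screws, minus the total weight of box A's screws, divided by the 150 screws from box B.

```python
n_a, mean_a = 100, 10.4      # box A: count and average weight (g)
n_mix, mean_mix = 250, 10.9  # mixed box: 100 + 150 screws

n_b = n_mix - n_a  # 150 screws came from box B

# Total weight of the mix minus total weight of A gives total weight of B
mean_b = (n_mix * mean_mix - n_a * mean_a) / n_b
print(round(mean_b, 2))  # 11.23
```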
4. (a) Discuss the rules of “Probability”.
(b) What is meant by “Conditional Probability”?
Answer:
Managers very often come across situations where they have to decide between implementing course of action A, course of action B or course of action C. Sometimes they have to take decisions regarding implementing both A and B.
• Addition rule
The addition rule of probability states that:
i) If 'A' and 'B' are any two events, then the probability of the occurrence of either 'A' or 'B' is
given by:
P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
ii) If 'A' and 'B' are two mutually exclusive events, then the probability of occurrence of either
A or B is given by:
P(A ∪ B) = P(A) + P(B)
iii) If A, B and C are any three events, then the probability of occurrence of either A or B or C
is given by:
P(A ∪ B ∪ C) = P(A) + P(B) + P(C) − P(A ∩ B) − P(B ∩ C) − P(A ∩ C) + P(A ∩ B ∩ C)
In terms of Venn diagrams: from figure 5.4 we can calculate the probability of occurrence of either event 'A' or event 'B', given that events 'A' and 'B' are dependent; from figure 5.5, the probability of occurrence of either 'A' or 'B', given that events 'A' and 'B' are independent; and from figure 5.6, the probability of occurrence of either 'A' or 'B' or 'C', given that events 'A', 'B' and 'C' are dependent.
iv) If A1, A2, A3, …, An are 'n' mutually exclusive and exhaustive events, then the
probability of occurrence of at least one of them is given by:
P(A1 ∪ A2 ∪ … ∪ An) = P(A1) + P(A2) + … + P(An) = 1
• Multiplication rule
If 'A' and 'B' are two independent events, then the probability of occurrence of 'A' and 'B' is given by:
P(A ∩ B) = P(A) × P(B)
Conditional Probability
Sometimes we wish to know the probability that the price of a particular petroleum product will rise, given that the finance minister has increased the petrol price. Such probabilities are known as conditional probabilities.
Thus the conditional probability of occurrence of an event 'A', given that the event 'B' has
already occurred, is denoted by P(A | B). Here, 'A' and 'B' are dependent events. Therefore, we have the following rules.
If 'A' and 'B' are dependent events, then the probability of occurrence of 'A and B' is given by:
P(A ∩ B) = P(B) × P(A | B)
It follows that:
P(A | B) = P(A ∩ B) / P(B), provided P(B) > 0
For any bivariate distribution, there exist two marginal distributions and 'm + n' conditional distributions, where 'm' and 'n' are the numbers of classifications/characteristics studied on the two variables.
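A small numeric sketch of the rule P(A | B) = P(A ∩ B) / P(B); the counts below are hypothetical.

```python
# Hypothetical counts illustrating P(A | B) = P(A and B) / P(B):
# A = "the product's price rises", B = "the petrol price was increased"
n_total = 200  # observed months (hypothetical)
n_b = 80       # months in which the petrol price was increased
n_a_and_b = 60 # months in which both events occurred

p_b = n_b / n_total              # P(B) = 0.4
p_a_and_b = n_a_and_b / n_total  # P(A and B) = 0.3
p_a_given_b = p_a_and_b / p_b    # conditional probability of A given B

print(round(p_a_given_b, 2))  # 0.75
```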
5. (a) What is meant by “Hypothesis Testing”? Give Examples
(b) Differentiate between “Type-I” and “Type-II” Errors
Answer:
Hypothesis Testing: The Basics
Say I hand you a coin. How would you tell if it’s fair? If you flipped it 100 times and it came up
heads 51 times, what would you say? What if it came up heads 5 times, instead?
In the first case you’d be inclined to say the coin was fair and in the second case you’d be
inclined to say it was biased towards tails. How certain are you? Or, even more specifically, how
likely is it actually that the coin is fair in each case?
Hypothesis Testing
Questions like the ones above fall into a domain called hypothesis testing. Hypothesis testing is a
way of systematically quantifying how certain you are of the result of a statistical experiment.
In the coin example the "experiment" was flipping the coin 100 times. There are two questions
you can ask. One, assuming the coin is fair, how likely is it that you'd observe the results you
did? Two, what is the likelihood that the coin is fair given the results you observed?
Of course, an experiment can be much more complex than coin flipping. Any situation where you're taking a random sample of a population and measuring something about it is an
experiment, and for our purposes this includes A/B testing.
Let's focus on the coin flip example to understand the basics.
The Null Hypothesis
The most common type of hypothesis testing involves a null hypothesis. The null hypothesis,
denoted H0, is a statement about the world which can plausibly account for the data you observe.
Don’t read anything into the fact that it’s called the “null” hypothesis — it’s just the hypothesis
we’re trying to test.
For example, “the coin is fair” is an example of a null hypothesis, as is “the coin is biased.” The
important part is that the null hypothesis be able to be expressed in simple, mathematical terms.
We’ll see how to express these statements mathematically in just a bit.
The main goal of hypothesis testing is to tell us whether we have enough evidence to reject the
null hypothesis. In our case we want to know whether the coin is biased or not, so our null
hypothesis should be “the coin is fair.” If we get enough evidence that contradicts this
hypothesis, say, by flipping it 100 times and having it come up heads only once, then we can safely reject it.
All of this is perfectly quantifiable, of course. What constitutes “enough” and “safely” are all a
matter of statistics.
The Statistics, Intuitively
So, we have a coin. Our null hypothesis is that this coin is fair. We flip it 100 times and it comes
up heads 51 times. Do we know whether the coin is biased or not?
Our gut might say the coin is fair, or at least probably fair, but we can't say for sure. The expected number of heads is 50 and 51 is quite close. But what if we flipped the coin 100,000
times and it came up heads 51,000 times? We see 51% heads both times, but in the second
instance the coin is more likely to be biased.
Lack of evidence to the contrary is not evidence that the null hypothesis is true. Rather, it means
that we don’t have sufficient evidence to conclude that the null hypothesis is false. The coin
might actually have a 51% bias towards heads, after all.
If instead we saw 1 head for 100 flips that would be another story. Intuitively we know that the
chance of seeing this if the null hypothesis were true is so small that we would be comfortable
rejecting the null hypothesis and declaring the coin to (probably) be biased.
Let’s quantify our intuition.
The Coin Flip
Formally, the flip of a coin can be represented by a Bernoulli trial. A Bernoulli trial is a random
variable X such that
P(X = 1) = p and P(X = 0) = 1 − p.
That is, X takes on the value 1 (representing heads) with probability p, and 0 (representing tails)
with probability 1 − p.
Now, let's say we have 100 coin flips. Let Xi represent the i-th coin flip. Then the random variable
X = X1 + X2 + … + X100
represents the run of 100 coin flips.
The Statistics, Mathematically
Say you have a set of observations O and a null hypothesis H0. In the above coin example we
were trying to calculate
P(O | H0),
i.e., the probability that we observed what we did given the null hypothesis. If that probability is sufficiently small, we're confident concluding the null hypothesis is false.
We can use whatever level of confidence we want before rejecting the null hypothesis, but most
people choose 90%, 95%, or 99%. For example, if we choose a 95% confidence level, we reject
the null hypothesis if
P(O | H0) < 0.05.
The Central Limit Theorem is the main piece of math here. Briefly, the Central Limit Theorem
says that the sum (or average) of a large number of independent, identically distributed random variables
is approximately normally distributed.
Remember our random variables from before? If we let
p = (X1 + X2 + … + X100) / 100,
then p is the proportion of heads in our sample of 100 coin flips. In our case it is equal to 0.51, or 51%.
But by the central limit theorem we also know that p approximates a normal distribution. This
means we can estimate the standard deviation of p as
σp = √(p(1 − p) / n).
Wrapping It Up
Our null hypothesis is that the coin is fair. Mathematically, we're saying
H0: p = 0.50.
Here's the normal curve: [figure omitted]
A 95% level of confidence means we reject the null hypothesis if p falls outside 95% of the area
of the normal curve. Looking at that chart, we see that this corresponds to approximately 1.96
standard deviations.
The so-called "z-score" tells us how many standard deviations away from the mean our sample
is, and it's calculated as
z = (p − 0.50) / σp.
The numerator is "p − 0.50" because our null hypothesis is that p = 0.50. This measures how far
the sample proportion, p, diverges from the expected mean of a fair coin, 0.50.
Example:
Let's say we flipped three coins 100 times each and got the following data.
Data for 100 Flips of Each Coin
Coin     Flips   Pct. Heads   Z-score
Coin 1   100     51%          0.20
Coin 2   100     60%          2.04
Coin 3   100     75%          5.77
Using a 95% confidence level we’d conclude that Coin 2 and Coin 3 are biased using the
techniques we’ve developed so far. Coin 2 is 2.04 standard deviations from the mean and Coin 3
is 5.77 standard deviations.
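The z-scores in the table can be reproduced with a small function that, like the table, uses the sample proportion in the standard-deviation estimate:

```python
import math

def z_score(heads, flips, p0=0.5):
    """Z-score of an observed head count against H0: p = p0,
    estimating the standard deviation from the sample proportion."""
    p = heads / flips
    sd = math.sqrt(p * (1 - p) / flips)  # estimated SD of the sample proportion
    return (p - p0) / sd

print(round(z_score(51, 100), 2))  # 0.2
print(round(z_score(60, 100), 2))  # 2.04
print(round(z_score(75, 100), 2))  # 5.77
```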
When your test statistic meets the 95% confidence threshold we call it statistically significant.
This means there’s only a 5% chance of observing what you did assuming the null hypothesis
was true. Phrased another way, there’s only a 5% chance that your observation is due to random
variation.
B. Statistical error: Type I and Type II
Statisticians speak of two significant sorts of statistical error. The context is that there is a "null
hypothesis" which corresponds to a presumed default "state of nature", e.g., that an individual is
free of disease, that an accused is innocent. Corresponding to the null hypothesis is an
"alternative hypothesis" which corresponds to the opposite situation, that is, that the individual
has the disease, that the accused is guilty. The goal is to determine accurately if the null
hypothesis can be discarded in favor of the alternative. A test of some sort is conducted and data
are obtained. The result of the test may be negative (that is, it does not indicate disease, guilt).
On the other hand, it may be positive (that is, it may indicate disease, guilt). If the result of the
test does not correspond with the actual state of nature, then an error has occurred, but if the
result of the test corresponds with the actual state of nature, then a correct decision has been
made. There are two kinds of error, classified as "type I error" and "type II error," depending
upon which hypothesis has incorrectly been identified as the true state of nature.
Type I error
Type I error, also known as an "error of the first kind", an α error, or a "false positive": the error
of rejecting a null hypothesis when it is actually true. Plainly speaking, it occurs when we are
observing a difference when in truth there is none, thus indicating a test of poor specificity. An
example of this would be if a test shows that a woman is pregnant when in reality she is not, or
telling a patient he is sick when in fact he is not. Type I error can be viewed as the error of
excessive credulity.
In other words, a Type I error indicates that a positive finding is false.
Type II error
Type II error, also known as an "error of the second kind", a β error, or a "false negative": the
error of failing to reject a null hypothesis when in fact we should have rejected the null
hypothesis. In other words, this is the error of failing to observe a difference when in truth there
is one, thus indicating a test of poor sensitivity. An example of this would be if a test shows that
a woman is not pregnant, when in reality, she is. Type II error can be viewed as the error of
excessive scepticism.
6. From the following table, calculate Laspeyres' index number, Paasche's
index number, Fisher's price index number and Dorbish & Bowley's
index number, taking 2008 as the base year.
Commodity   2008 Price (Rs/kg)   2008 Quantity (kg)   2009 Price (Rs/kg)   2009 Quantity (kg)
A           6                    50                   10                   56
B           2                    100                  2                    120
C           4                    60                   6                    60
D           10                   30                   12                   24
E           8                    40                   12                   36
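As a sketch of the required computation, the four index numbers follow from the table: Laspeyres weights prices by base-year quantities, Paasche by current-year quantities, Fisher is the geometric mean of the two, and Dorbish & Bowley is their arithmetic mean.

```python
# (p0, q0, p1, q1): 2008 price & quantity, 2009 price & quantity per commodity
data = {"A": (6, 50, 10, 56), "B": (2, 100, 2, 120), "C": (4, 60, 6, 60),
        "D": (10, 30, 12, 24), "E": (8, 40, 12, 36)}

sum_p0q0 = sum(p0 * q0 for p0, q0, p1, q1 in data.values())  # 1360
sum_p1q0 = sum(p1 * q0 for p0, q0, p1, q1 in data.values())  # 1900
sum_p0q1 = sum(p0 * q1 for p0, q0, p1, q1 in data.values())  # 1344
sum_p1q1 = sum(p1 * q1 for p0, q0, p1, q1 in data.values())  # 1880

laspeyres = sum_p1q0 / sum_p0q0 * 100       # base-year quantities as weights
paasche = sum_p1q1 / sum_p0q1 * 100         # current-year quantities as weights
fisher = (laspeyres * paasche) ** 0.5       # geometric mean of the two
dorbish_bowley = (laspeyres + paasche) / 2  # arithmetic mean of the two

print(round(laspeyres, 2), round(paasche, 2),
      round(fisher, 2), round(dorbish_bowley, 2))  # 139.71 139.88 139.79 139.79
```

All four indices agree closely here, indicating a price rise of roughly 40% over the base year.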