LECTURE
CENTRAL TENDENCIES & DISPERSION
POSTGRADUATE METHODOLOGY COURSE
Measures of Central Tendency
Summarize the entire data set in a single value (measurement).
Measures of central tendency include:
– Mode
– Median
– Mean
– Trimmed Mean
– Skewness
Mode
The measurement that occurs most often (with the highest frequency).
Commonly used as a measure of popularity.
There can be more than 1 mode.
Not influenced by extreme measurements.
Applicable to both qualitative and quantitative data.
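The properties of the mode listed above can be checked with a small sketch. The data below is hypothetical, and `statistics.multimode` from the Python standard library is used because it returns every value tied for the highest frequency.

```python
from statistics import multimode

# Hypothetical survey answers (qualitative data) with two equally popular values.
colours = ["red", "blue", "red", "green", "blue"]

# Hypothetical scores (quantitative data) with one extreme measurement.
scores = [2, 3, 3, 4, 3, 100]

# multimode returns all values that share the highest frequency.
colour_modes = multimode(colours)
score_modes = multimode(scores)

print(colour_modes)  # two modes: the data is bimodal
print(score_modes)   # the extreme value 100 does not affect the mode
```

The second list shows why the mode is described as robust: the single value 100 changes nothing about which score is most frequent.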
Median
The middle value when the measurements are arranged from lowest to highest.
50% of the measurements lie above it and 50% fall below it.
Often used to measure the midpoint of a large set of measurements.
There is only 1 median.
Not influenced by extreme measurements.
Applicable to quantitative data (more generally, any data that can be ranked).
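A minimal sketch of the median on hypothetical data, using `statistics.median`. For an even number of measurements this function averages the two middle values, which is one common convention.

```python
from statistics import median

# Hypothetical measurements; median() sorts them internally.
values = [7, 1, 5, 3, 9]
with_outlier = [7, 1, 5, 3, 900]

median_plain = median(values)          # middle value of 1, 3, 5, 7, 9
median_outlier = median(with_outlier)  # replacing 9 by 900 leaves the middle value unchanged

# With an even count, the two middle values (3 and 5) are averaged.
median_even = median([1, 3, 5, 7])

print(median_plain, median_outlier, median_even)
```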
Mean
The sum of the measurements divided by the total number of measurements, better known as the average.
There is only 1 mean.
Its value is influenced by extreme measurements.
Applicable to quantitative data only.
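To illustrate how strongly one extreme measurement can move the mean, here is a short sketch on hypothetical data. The mean is computed directly from its definition (sum divided by count).

```python
# Hypothetical measurements clustered around 5.
values = [4, 5, 6, 5]

# Mean = sum of the measurements / number of measurements.
mean_plain = sum(values) / len(values)

# Adding a single extreme measurement pulls the mean far from the bulk of the data.
with_outlier = values + [80]
mean_outlier = sum(with_outlier) / len(with_outlier)

print(mean_plain, mean_outlier)
```

Four of the five measurements lie between 4 and 6, yet the mean of the second list is 20.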
Trimmed Mean
The mean is influenced by extreme values (outliers).
To reduce the effect of outliers, which distort the mean, a variation of the mean is introduced.
The trimmed mean drops the highest and lowest extreme values and averages the rest.
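The trimming step can be sketched as follows. The helper `trimmed_mean` and its parameter `k` (how many values to drop from each end) are hypothetical names chosen for illustration; the slides do not specify how many values to trim.

```python
from statistics import mean

def trimmed_mean(values, k=1):
    """Drop the k lowest and k highest values, then average the rest."""
    if k < 0 or 2 * k >= len(values):
        raise ValueError("k must be non-negative and leave at least one value")
    ordered = sorted(values)
    return mean(ordered[k:len(ordered) - k])

# Hypothetical data with one outlier.
data = [4, 5, 6, 5, 80]

plain = mean(data)            # distorted by the value 80
trimmed = trimmed_mean(data)  # averages 5, 5, 6 after dropping 4 and 80

print(plain, trimmed)
```

Statistical packages often express the trim as a percentage of the data rather than a count; the count-based version here is only the simplest form of the idea.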
Skewness
The relationship between the mode, median, mean and trimmed mean is reflected in the skewness of the data.
Skewness measures how asymmetrically the data is distributed.
Zero skewness
– symmetrical (Mode = Median = Mean)
Positive skewness
– skewed to the right (typically Mode < Median < Mean)
Negative skewness
– skewed to the left (typically Mode > Median > Mean)
Skewness
[Figure: three histograms showing negatively (left) skewed, symmetrical, and positively (right) skewed distributions]
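The slides do not give a formula for skewness. One common choice, assumed here, is the moment coefficient of skewness: the third central moment divided by the second central moment raised to the power 1.5. The sketch below applies it to three hypothetical data sets.

```python
def skewness(values):
    """Moment coefficient of skewness (an assumed definition, not from the slides)."""
    n = len(values)
    m = sum(values) / n
    m2 = sum((x - m) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - m) ** 3 for x in values) / n  # third central moment
    if m2 == 0:
        raise ValueError("skewness is undefined when all values are equal")
    return m3 / m2 ** 1.5

symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 2, 2, 3, 10]    # long tail towards large values
left_skewed = [1, 8, 9, 9, 10, 10]    # mirror image: long tail towards small values

skew_sym = skewness(symmetric)
skew_right = skewness(right_skewed)
skew_left = skewness(left_skewed)

print(skew_sym, skew_right, skew_left)
```

The sign of the result matches the direction of the longer tail: zero for the symmetric set, positive for the right-skewed set, negative for the left-skewed set.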
Measures of Variability / Dispersion
It is not sufficient to describe a data set using only measures of central tendency.
We also need to determine how dispersed (spread out) the data is.
Measures of variability/spread include:
– Range
– Percentile / Quartile
– Deviation / Standard Deviation (Malay: sisihan piawai)
– Variance
– Coefficient of variation
Range
The difference between the largest and the smallest measurement of the set.
It is easy to compute but very sensitive to outliers.
It does not give much information about the pattern of variability.
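Both limitations of the range can be seen in a short sketch on hypothetical data. `value_range` is an illustrative helper name.

```python
def value_range(values):
    """Largest measurement minus smallest measurement."""
    return max(values) - min(values)

# Two hypothetical data sets with different patterns of variability.
clustered = [1, 5, 5, 5, 9]  # most values sit at the centre
spread = [1, 2, 5, 8, 9]     # values are spread across the interval

range_clustered = value_range(clustered)
range_spread = value_range(spread)

# One extreme measurement changes the range dramatically.
range_outlier = value_range([1, 5, 5, 5, 90])

print(range_clustered, range_spread, range_outlier)
```

The first two sets have the same range although their values are distributed differently, and a single outlier raises the range from 8 to 89.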
Percentile / Quartile
The pth percentile of a set of n measurements arranged in order of magnitude is the value that has at most p% of the measurements below it and at most (100 – p)% above it.
Example: the 60th percentile has 60% of the data below it and 40% above it.
The percentiles of interest are the 25th, 50th and 75th percentiles, often called the lower quartile, median and upper quartile.
Interquartile range – the difference between the upper and lower quartiles.
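A sketch of quartiles, the interquartile range and the 60th percentile on hypothetical data, using `statistics.quantiles` (Python 3.8+). Several conventions exist for interpolating percentiles; the `"inclusive"` method is an assumption made here, and other software may report slightly different values.

```python
from statistics import quantiles

# Hypothetical measurements, already in order of magnitude.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# n=4 yields the three cut points: lower quartile, median, upper quartile.
q1, q2, q3 = quantiles(data, n=4, method="inclusive")

# Interquartile range: upper quartile minus lower quartile.
iqr = q3 - q1

# n=100 yields 99 percentile cut points; index 59 is the 60th percentile.
p60 = quantiles(data, n=100, method="inclusive")[59]

print(q1, q2, q3, iqr, p60)
```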
Variance and Standard Deviation
The variance of a set of n measurements y1, y2, …, yn with mean ȳ is the sum of the squared deviations from the mean divided by n – 1.
The standard deviation of a set of measurements is defined to be the positive square root of the variance.
Both measure how spread out the data is from the mean.
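The definition above (squared deviations divided by n – 1) can be written out directly and compared with `statistics.variance` and `statistics.stdev`, which use the same n – 1 divisor. The data is hypothetical.

```python
from math import sqrt
from statistics import stdev, variance

# Hypothetical measurements with mean 5.
data = [2, 4, 4, 4, 5, 5, 7, 9]

n = len(data)
y_bar = sum(data) / n

# Sum of squared deviations from the mean, divided by n - 1.
manual_variance = sum((y - y_bar) ** 2 for y in data) / (n - 1)

# Standard deviation: positive square root of the variance.
manual_sd = sqrt(manual_variance)

# The standard library functions follow the same definition.
library_variance = variance(data)
library_sd = stdev(data)

print(manual_variance, manual_sd)
```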
Coefficient of Variation
Measures the variability of the values in a population relative to the magnitude of the population mean.
CV = Standard Deviation / |Mean|
The CV is a unit-free number, so it is useful when comparing the variation of different sets of data.
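The unit-free property can be illustrated by computing the CV of the same hypothetical heights in centimetres and in metres. `coefficient_of_variation` is an illustrative helper, and the sample standard deviation (n – 1 divisor) is assumed.

```python
from statistics import mean, stdev

def coefficient_of_variation(values):
    """Standard deviation divided by the absolute value of the mean."""
    m = mean(values)
    if m == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return stdev(values) / abs(m)

# The same hypothetical heights expressed in two different units.
heights_cm = [150, 160, 170, 180, 190]
heights_m = [h / 100 for h in heights_cm]

cv_cm = coefficient_of_variation(heights_cm)
cv_m = coefficient_of_variation(heights_m)

print(cv_cm, cv_m)
```

The standard deviations differ by a factor of 100, but the two CV values agree (up to floating-point rounding) because the units cancel in the ratio.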
Boxplot
Top line – Maximum
2nd line – Upper Quartile
3rd line – Median
4th line – Lower Quartile
5th line – Minimum
[Figure: boxplot of Item 5 NF, N = 312; vertical axis from 0 to 8]
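The five lines of a boxplot correspond to the five-number summary, which can be sketched as below. The data is hypothetical (it is not the data behind the figure), `five_number_summary` is an illustrative helper, and the `"inclusive"` quartile method is an assumption; plotting software may use a different quartile convention or draw outliers separately.

```python
from statistics import quantiles

def five_number_summary(values):
    """Return (minimum, lower quartile, median, upper quartile, maximum)."""
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    return min(values), q1, q2, q3, max(values)

# Hypothetical measurements.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9]
summary = five_number_summary(data)

# Print in the order the lines appear on the plot, from top to bottom.
labels = ["Minimum", "Lower Quartile", "Median", "Upper Quartile", "Maximum"]
for label, value in reversed(list(zip(labels, summary))):
    print(f"{label}: {value}")
```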