Measures of Central Tendency
Measures of Location
Measures of Dispersion
Dr. Lisa Grace S. Bersales
Summary Statistics
• Numerical measures that are used
to describe certain characteristics of
data: what the typical value is ("on
the average"), how different the data
are from each other ("how
variable"), and which data values mark
the lower and upper ends when data
are ordered from lowest to highest.
• Measures of Central Tendency
• Measures of Location
• Measures of Dispersion
Measure of Central Tendency
• Any single value which is used to
identify the center of the data or the
typical value.
• Often referred to as the average.
The Mean
Sum of all values of the observations divided by the number of observations in the data set.
Characteristics of The Mean
• The most familiar measure of central tendency.
• Employs all available data in the computation.
• Strongly influenced by extreme values.
• May not be an actual observation in the data set.
• Can be applied to data measured on at least the interval level.
Example
The Top 10 Provinces in Terms of Per Capita Income Growth (1985-2003)
Province Per Capita Income Growth
Camiguin 5.21
Antique 4.18
Capiz 4.14
Batanes 4.09
Samar (western) 4.02
Nueva Vizcaya 3.58
Catanduanes 3.51
Northern Samar 3.20
South Cotabato 3.16
Cebu 3.09
Mean Growth 3.82
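The mean growth in the table can be checked with a few lines of Python. This is a minimal sketch: the variable names are illustrative, and the outlier value of 30.0 is hypothetical, added only to show how one extreme value pulls the mean.

```python
# Per capita income growth of the top 10 provinces, copied from the table above.
growth = [5.21, 4.18, 4.14, 4.09, 4.02, 3.58, 3.51, 3.20, 3.16, 3.09]

# Mean = sum of all observations divided by the number of observations.
mean_growth = sum(growth) / len(growth)
print(round(mean_growth, 2))  # 3.82

# Hypothetical illustration (not real data): replace the smallest value
# with an extreme figure and recompute the mean.
with_outlier = growth[:-1] + [30.0]
mean_with_outlier = sum(with_outlier) / len(with_outlier)
print(mean_with_outlier > mean_growth)  # True
```

Note that 3.82 is not the growth of any single province, which illustrates that the mean may not be an actual observation in the data set.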
The Median
• A value that divides an ordered data
set (array) into two equal parts.
• A value below which half of the data
fall.
Example
Raw Data
Province Growth
Abra 2.87
Agusan del Norte 2.19
Agusan del Sur 1.75
Aklan 0.32
Albay 2.40
Antique 4.18
Aurora 3.07
Basilan 1.55
Bataan 2.68
Batanes 4.09
Data in Array
Province Growth
Aklan 0.32
Basilan 1.55
Agusan del Sur 1.75
Agusan del Norte 2.19
Albay 2.40
Bataan 2.68
Abra 2.87
Aurora 3.07
Batanes 4.09
Antique 4.18
\[ M_d = \frac{X_{(5)} + X_{(6)}}{2} = \frac{2.40 + 2.68}{2} = 2.54 \]
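The same computation can be sketched in Python: sort the raw data into an array, then average the two middle values because the number of observations is even. The variable names are illustrative; `statistics.median` from the standard library is used only as a cross-check.

```python
import statistics

# Raw growth figures for the 10 provinces, in the order listed in the example.
growth = [2.87, 2.19, 1.75, 0.32, 2.40, 4.18, 3.07, 1.55, 2.68, 4.09]

ordered = sorted(growth)  # the "array": data ordered from lowest to highest
n = len(ordered)
middle = n // 2

if n % 2 == 0:
    # Even n: average the two middle values, X(5) and X(6) when n = 10.
    median = (ordered[middle - 1] + ordered[middle]) / 2
else:
    # Odd n: the single middle value.
    median = ordered[middle]

print(round(median, 2))  # 2.54

# Cross-check against the standard library.
library_median = statistics.median(growth)
```

Only the middle positions enter the calculation, so changing the largest value to something far more extreme would leave the median unchanged.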
Characteristics of the Median
• A positional measure.
• Not influenced by extreme values.
• May not be an actual value in the data
set.
• Can be applied to data that are
measured on at least the ordinal level.
The Mode
• The value that occurs with the greatest frequency.
• The easiest to interpret.
• Not affected by extreme values.
• Does not always exist and may not be unique.
• May be applied to nominal level data.
• The mode of 1, 2, 3, 3, 4, 5 is 3.
• The modes of 1, 2, 3, 3, 4, 4 are 3 and 4.
• There is no mode for 1, 2, 3, 4, 5.
• The mode for red, red, blue, yellow is red.
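The four cases above can be reproduced with a small helper. This is a sketch under one stated assumption: the data set is treated as having no mode when every value occurs exactly once, which matches the third example. The function name `modes` is hypothetical, not from the slides.

```python
from collections import Counter


def modes(values):
    """Return the most frequent value(s), or [] if no value repeats."""
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:
        # Assumption: every value occurs once, so no mode is reported.
        return []
    # Counter keeps first-seen order, so ties come back in data order.
    return [value for value, count in counts.items() if count == top]


print(modes([1, 2, 3, 3, 4, 5]))                # [3]
print(modes([1, 2, 3, 3, 4, 4]))                # [3, 4]
print(modes([1, 2, 3, 4, 5]))                   # []
print(modes(["red", "red", "blue", "yellow"]))  # ['red']
```

The last call works on labels rather than numbers, which is why the mode can be applied to nominal level data.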
Measures of Location
• Numbers below which a specified
amount or percentage of data lie.
• Oftentimes used to find the position of
a specific piece of data in relation to
the entire data set.
Percentiles
• 99 values (P1, P2, …, P99) that divide
an ordered data set into 100 equal
parts.
• The ith percentile, Pi, is a value below
which i% of the data lie.
Deciles
• 9 values (denoted by D1, D2,…,D9)
that divide an ordered data set into
10 equal parts.
• The ith decile, Di, is a value below
which (10 × i)% of the data lie.
Quartiles
• 3 values (Q1, Q2, Q3) that divide an
ordered data set into 4 equal parts.
• The ith quartile, Qi, is a value below
which (25 × i)% of the data lie.
• The Median is equal to the 2nd quartile.
Special Percentiles
• Median = 50th percentile (the value
below which half of the data values
fall)
• First Quartile = 25th percentile (the
value below which one-fourth of the
data values fall)
• Second Quartile = Median = 50th
percentile
• Third Quartile = 75th percentile (the value below which three-fourths of the data values fall)
Example of Percentile: The 30th percentile
of family income based on the 2003 Family
Income and Expenditure Survey is about
57,370 pesos per year. This means that
30% of Filipino families have an annual
income below 57,370 pesos.
Relationship of Median, Quartiles,
Deciles, and Percentiles
• Min          Md          Max
• Min    Q1    Q2    Q3    Max
• Min          D5          Max
• Min          P50         Max
The median, Q2, D5, and P50 all mark the same point in the ordered data.
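The relationship above can be checked numerically. The sketch below uses `statistics.quantiles` from the Python standard library (Python 3.8 or later) on the 10 provincial growth figures from the median example. The slides do not say which interpolation rule to use for quantiles, so the library's default method is an assumption here; other software, including Excel, may return slightly different quartiles for the same data.

```python
import math
import statistics

# Growth figures for the 10 provinces used in the median example.
growth = [2.87, 2.19, 1.75, 0.32, 2.40, 4.18, 3.07, 1.55, 2.68, 4.09]

# Cut points dividing the ordered data into 4, 10, and 100 equal parts.
q1, q2, q3 = statistics.quantiles(growth, n=4)      # Q1, Q2, Q3
deciles = statistics.quantiles(growth, n=10)        # D1 ... D9
percentiles = statistics.quantiles(growth, n=100)   # P1 ... P99

median = statistics.median(growth)
d5 = deciles[4]        # 5th decile (list index starts at 0)
p50 = percentiles[49]  # 50th percentile

# Md, Q2, D5, and P50 should agree up to floating-point rounding.
same_value = all(math.isclose(v, median) for v in (q2, d5, p50))
print(same_value)  # True
```

The same indexing gives the other special percentiles: `percentiles[24]` corresponds to Q1 and `percentiles[74]` to Q3.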
Measures of Dispersion
• Measures of dispersion indicate the
extent to which individual items in a
series are scattered about an
average.
• Used as a measure of reliability of the
average value.
General Classifications of
Measures of Dispersion
• Measures of Absolute Dispersion
  • used to describe the variability of a data
  set
• Measures of Relative Dispersion
  • used to compare two or more data sets
  with different means and different units
  of measurement
Variance and Standard Deviation
• The variance and standard deviation are
measures of dispersion of data with respect
to the mean.
The standard deviation is defined as the
positive square root of the variance.
The standard deviation is often referred to as a
measure of "volatility."
• If there is a large amount of variation in the
data set, the data values will be far from the
mean. In this case, the standard deviation will
be large.
• If, on the other hand, there is only a small
amount of variation in the data set, the data
values will be close to the mean. Hence, the
standard deviation will be small.
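A short Python sketch makes the definition concrete. It assumes the sample form of the variance (sum of squared deviations from the mean divided by n - 1), which is consistent with the "Sample Variance" label in the Excel output later in these slides; the formula itself is not shown in the text, so treat this choice as an assumption. The data are the top 10 provincial growth figures from the mean example.

```python
import math
import statistics

# Per capita income growth of the top 10 provinces (from the mean example).
growth = [5.21, 4.18, 4.14, 4.09, 4.02, 3.58, 3.51, 3.20, 3.16, 3.09]

n = len(growth)
mean = sum(growth) / n

# Sample variance: squared deviations from the mean, summed, divided by n - 1.
variance = sum((x - mean) ** 2 for x in growth) / (n - 1)

# Standard deviation: positive square root of the variance.
std_dev = math.sqrt(variance)

print(round(variance, 2), round(std_dev, 2))  # 0.42 0.65
```

The standard library functions `statistics.variance` and `statistics.stdev` use the same n - 1 divisor and can serve as a cross-check.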
Characteristics of the Standard
Deviation
• Just like the mean, it is affected by
the value of every observation.
• It may be distorted by a few extreme
values.
• It is never negative; it equals zero only
when all values are identical.
Measures of Relative Dispersion
Measures of Relative Dispersion are
unitless and are used to compare the
scatter of one distribution with the
scatter of another distribution.
Coefficient of Variation
• A commonly used measure of relative
dispersion.
• The coefficient of variation utilizes two
measures: the mean and the standard
deviation.
• It is expressed as a percentage, removing
the unit of measurement and thus allowing
comparison of two or more data sets.
The coefficient of variation is the ratio of the
standard deviation to the mean, expressed as a
percentage.
Summary Statistics Using Excel
We can compute the summary
statistics by clicking
Tools \ Data Analysis \ Descriptive
Statistics
Examples
Given, for 74 countries:
• Average Gross Domestic Product (GDP)
growth rate (per capita) from 1975 to
2000
• Life expectancy at birth (average from
1975 to 2000)
In the Excel file, we get the following results:
Statistic            Ave. GDP Growth Rate per capita    Ave. Life Expectancy
                     (1975-2000)                        (1975-2000)
Mean                 5.09                               62.92
Median               5.16                               64.70
Mode                 3.59                               #N/A
Standard Deviation   1.86                               11.82
Sample Variance      3.47                               139.77
Range                8.68                               42.15
Minimum              1.22                               36.00
Maximum              9.90                               78.15
Sum                  376.82                             4655.84
Count                74                                 74
cv                   37%                                19%
Ave. GDP Growth Rate per capita: 1st quartile 3.78, 50th percentile 5.16
Ave. Life Expectancy: 3rd quartile 74.41, 90th percentile 76.41
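The cv and Range figures in the Excel output can be reproduced from the reported mean, standard deviation, minimum, and maximum. The sketch below is illustrative only; the dictionary layout and the function name `coefficient_of_variation` are assumptions, not part of the original Excel file.

```python
# Figures copied from the Excel output above (rounded to two decimals there).
summary = {
    "GDP growth rate": {"mean": 5.09, "sd": 1.86, "min": 1.22, "max": 9.90},
    "Life expectancy": {"mean": 62.92, "sd": 11.82, "min": 36.00, "max": 78.15},
}


def coefficient_of_variation(mean, sd):
    """Ratio of the standard deviation to the mean, as a percentage."""
    return sd / mean * 100


cv = {name: coefficient_of_variation(s["mean"], s["sd"])
      for name, s in summary.items()}
value_range = {name: s["max"] - s["min"] for name, s in summary.items()}

for name in summary:
    print(f"{name}: cv = {cv[name]:.0f}%, range = {value_range[name]:.2f}")
# GDP growth rate: cv = 37%, range = 8.68
# Life expectancy: cv = 19%, range = 42.15
```

Although life expectancy has the larger standard deviation (11.82 versus 1.86), its coefficient of variation is smaller, so relative to its mean it is the less variable of the two series.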