
Date post: 21-Aug-2020
Page 1: Measures and indexes of variability - Stanford Universitystatweb.stanford.edu/~sabatti/Stat48/Variability... · 2019. 10. 10. · Measures and indexes of variability Author: Stats48n

Measures and indexes of variability

Stats48n


Measures of spread/dispersion/variability

- A measure of center needs to be complemented by a measure of spread around this center.
- The definitions of averages that we explored naturally lend themselves to measures of variability.
- Variance: average squared distance from the mean

$$V(x_1, \ldots, x_n) = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2$$

- Standard deviation:

$$\sqrt{\frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2}$$

- Note that R actually divides by $n - 1$ rather than $n$. This is because when $x_1, \ldots, x_n$ are a sample from a larger population of possible values, dividing by $n - 1$ gives a “better” (unbiased) estimator of the population variance.
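To make the two conventions concrete, here is a minimal Python sketch that computes the variance with both divisors and cross-checks against the standard `statistics` module. The function names `variance_n` and `variance_n_minus_1` and the data values are illustrative assumptions, not part of the course material.

```python
import math
import statistics

def variance_n(xs):
    """Variance as defined above: average squared distance from the mean (divide by n)."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / len(xs)

def variance_n_minus_1(xs):
    """Same sum of squares, divided by n - 1 (the sample-variance convention)."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / (len(xs) - 1)

data = [1, 2, 3, 10, 15, 15, 30, 50]  # illustrative values

# statistics.pvariance divides by n; statistics.variance divides by n - 1.
print(variance_n(data), statistics.pvariance(data))
print(variance_n_minus_1(data), statistics.variance(data))
print("standard deviation (divide by n):", math.sqrt(variance_n(data)))
```

The two versions differ only by the factor $n/(n-1)$, so the gap shrinks as $n$ grows.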


A note: data with frequencies

- Often data is summarized so that we have counts of occurrences of the same values: we have a set $v_1, \ldots, v_m$ of possible values, with their frequencies $f_i$

  value      $v_1$  $v_2$  $\cdots$  $v_m$
  frequency  $f_1$  $f_2$  $\cdots$  $f_m$

- Calculating averages and standard deviations has to adapt to this different set-up

$$\bar{v} = \frac{1}{\sum_{i=1}^{m} f_i} \sum_{i=1}^{m} v_i f_i$$

$$\text{Variance} = \frac{1}{\sum_{i=1}^{m} f_i} \sum_{i=1}^{m} (v_i - \bar{v})^2 f_i$$
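The frequency-weighted formulas can be sketched in Python as follows. The helper names `freq_mean` and `freq_variance` and the small value/frequency table are hypothetical, chosen only to show that the result agrees with expanding the table back into raw data.

```python
import math

def freq_mean(values, freqs):
    """Mean of data given as distinct values v_i with frequencies f_i."""
    return sum(v * f for v, f in zip(values, freqs)) / sum(freqs)

def freq_variance(values, freqs):
    """Variance (divide by the total count) of data given as values and frequencies."""
    mean = freq_mean(values, freqs)
    return sum(f * (v - mean) ** 2 for v, f in zip(values, freqs)) / sum(freqs)

values = [1, 2, 5]  # illustrative distinct values
freqs = [3, 1, 2]   # how many times each value occurs

# Expanding the table into raw observations gives the same answers.
expanded = [v for v, f in zip(values, freqs) for _ in range(f)]
raw_mean = sum(expanded) / len(expanded)
raw_variance = sum((x - raw_mean) ** 2 for x in expanded) / len(expanded)

print(freq_mean(values, freqs), raw_mean)
print(freq_variance(values, freqs), raw_variance)
```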


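A note: the maximal variance of $x_1, \ldots, x_n$

- Generally speaking, the variance of a dataset can be arbitrarily large
- Let's consider some restrictions that make the notion of a maximal variance meaningful
  - $x_i \geq 0 \;\; \forall i$
  - fix the total sum of values $\sum_{i=1}^{n} x_i = n\bar{x}$

$$
\begin{aligned}
\sum_{i=1}^{n} (x_i - \bar{x})^2 &= \sum_{i=1}^{n} (x_i^2 + \bar{x}^2 - 2 x_i \bar{x}) \\
&= \sum_{i=1}^{n} x_i^2 + \sum_{i=1}^{n} \bar{x}^2 - 2\bar{x} \sum_{i=1}^{n} x_i \\
&= \sum_{i=1}^{n} x_i^2 + n\bar{x}^2 - 2\bar{x}(n\bar{x}) \\
&= \sum_{i=1}^{n} x_i^2 - n\bar{x}^2
\end{aligned}
$$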


A note: the maximal variance of x1, . . . , xn

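So, $V(x_1, \ldots, x_n) = \left(\sum_{i=1}^{n} x_i^2 - n\bar{x}^2\right)/n$. Now, since the $x_i$ are nonnegative,

$$\sum_{i=1}^{n} x_i^2 - n\bar{x}^2 \leq \left(\sum_{i=1}^{n} x_i\right)^2 - n\bar{x}^2 = n^2\bar{x}^2 - n\bar{x}^2 = \bar{x}^2 n(n-1)$$

which means that

$$V(x_1, \ldots, x_n) \leq \bar{x}^2 (n-1)$$

- Can we imagine a set of values of $x_1, \ldots, x_n$ for which the variance is actually equal to this max?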
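One candidate answer can be checked numerically: put the whole total $n\bar{x}$ on a single value and set the rest to 0, so that $\sum x_i^2 = (\sum x_i)^2$. The sketch below compares the variance with the bound $\bar{x}^2(n-1)$ for such a dataset and for a more spread-out one with the same total. The function names and the data are illustrative assumptions.

```python
import math

def variance(xs):
    """Variance with divisor n, as in the definition used in these notes."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / n

def variance_bound(xs):
    """Upper bound mean^2 * (n - 1), valid for nonnegative values with this mean."""
    n = len(xs)
    mean = sum(xs) / n
    return mean ** 2 * (n - 1)

spread_out = [1, 2, 3, 10, 15, 15, 30, 50]  # illustrative values, total 126
concentrated = [0] * 7 + [126]              # same total, all on one value

for xs in (spread_out, concentrated):
    print(variance(xs), "<=", variance_bound(xs))
```

Both datasets share the same mean and hence the same bound; only the concentrated one reaches it.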


Index of concentration

The opposite of spread-out is “concentrated.”

Let's consider variables like the one we just talked about, that is, with only nonnegative values. One such variable might be the income of households in a nation.

It is interesting to study how “concentrated” such income is. One can imagine that the total income of a nation is the total amount of a resource that one could distribute.


Income inequality in the media


Income inequality in politics


How can we measure “income inequality”?

- Suppose we have a population of $n$ individuals, with incomes $x_1, \ldots, x_n$.
- $n\bar{x}$ is the total income in the population (with $\bar{x} = \sum_{i=1}^{n} x_i / n$)
- What would be the values of $x_1, \ldots, x_n$ in the case of maximal “income equality”?
- What would be the values of $x_1, \ldots, x_n$ in the case of maximal “income inequality”?
- How are we going to judge cases in the middle?
  - Any known measure?
  - Any measure we can come up with given what we already know?


A graphical display for income distribution

We take the values $x_1, \ldots, x_n$ and order them

$$x_{(1)} \leq x_{(2)} \leq \cdots \leq x_{(n)}$$

For simplicity, we are going to drop the parentheses from the index notation, and just remember that $x_1$ is the smallest value.

We now calculate two quantities:

$$F_i = \frac{i}{n} \qquad Q_i = \frac{\sum_{j=1}^{i} x_j}{\sum_{j=1}^{n} x_j}$$

- $F_1$ is the fraction of the population that corresponds to the bottom earner; $F_2$ is the fraction of the population that corresponds to the two bottom earners, etc.
- $Q_1$ is the fraction of the national income earned by the bottom earner; $Q_2$ is the fraction of the national income earned by the two bottom earners, etc.
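As a minimal sketch of these two quantities, the Python snippet below sorts a list of incomes and returns the pairs $(F_i, Q_i)$. The function name `cumulative_shares` is a hypothetical helper; the income values are the ones used in the plotted example in these slides.

```python
from itertools import accumulate

def cumulative_shares(incomes):
    """Return (F, Q): cumulative population shares and cumulative income shares."""
    xs = sorted(incomes)  # x_1 <= x_2 <= ... <= x_n
    n = len(xs)
    total = sum(xs)
    F = [i / n for i in range(1, n + 1)]     # F_i = i / n
    Q = [c / total for c in accumulate(xs)]  # Q_i = (x_1 + ... + x_i) / total
    return F, Q

incomes = [1, 2, 3, 10, 15, 15, 30, 50]
F, Q = cumulative_shares(incomes)
for f, q in zip(F, Q):
    print(f"F = {f:.3f}   Q = {q:.3f}")
```

Plotting `Q` against `F` gives the graphical display discussed in the following slides.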


A graphical display for income distribution

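- Let's think about the relation between $F_i$ and $Q_i$ in the case of perfect income equality
- In general, $Q_i \leq F_i$. To see this, let's look at their definition, multiply by $\sum_{j=1}^{n} x_j$, and divide by $i$:

$$Q_i \leq F_i \iff \frac{\sum_{j=1}^{i} x_j}{\sum_{j=1}^{n} x_j} \leq \frac{i}{n} \iff \frac{\sum_{j=1}^{i} x_j}{i} \leq \frac{\sum_{j=1}^{n} x_j}{n}$$

and remember that the $x_i$ are increasing: the average of the $i$ smallest values cannot exceed the average of all $n$ values.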
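The inequality $Q_i \leq F_i$, and the equality case, can be checked numerically. The sketch below uses randomly generated incomes (an assumption made purely for illustration) and a hypothetical helper `shares`; a small tolerance guards against floating-point rounding.

```python
import random
from itertools import accumulate

def shares(incomes):
    """Cumulative population shares F and cumulative income shares Q."""
    xs = sorted(incomes)
    n, total = len(xs), sum(xs)
    F = [i / n for i in range(1, n + 1)]
    Q = [c / total for c in accumulate(xs)]
    return F, Q

rng = random.Random(0)  # fixed seed so the run is reproducible
incomes = [rng.uniform(0, 100) for _ in range(50)]

F, Q = shares(incomes)
holds = all(q <= f + 1e-12 for f, q in zip(F, Q))
print("Q_i <= F_i for every i:", holds)

# Perfect equality: everyone earns the same amount, so Q_i equals F_i.
equal_F, equal_Q = shares([7] * 5)
print(list(zip(equal_F, equal_Q)))
```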


A graphical display for income distribution

Income values = 1, 2, 3, 10, 15, 15, 30, 50

[Figure: $Q$ (vertical axis) plotted against $F$ (horizontal axis), both from 0 to 1.]


A graphical display for income distribution

[Figure: two panels, “Perfect equality” and “Maximal inequality”, each plotting $Q$ (vertical axis) against $F$ (horizontal axis), both from 0 to 1.]

How could we use this to construct an Index?


An idea for the index

[Figure: $Q$ (vertical axis) plotted against $F$ (horizontal axis), both from 0 to 1.]


From area to index

- Index varies between 0 and 1
- Area in between curves: $A = 1/2 \,-$ area under bottom curve
- Area under bottom curve: sum of areas of trapezoids. Thus, with $F_0 = Q_0 = 0$,

$$A = \frac{1}{2} - \sum_{i=1}^{n} \frac{(F_i - F_{i-1})(Q_i + Q_{i-1})}{2}$$

- Gini's index:

$$G = \frac{A}{1/2} = 1 - \sum_{i=1}^{n} (F_i - F_{i-1})(Q_i + Q_{i-1})$$
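The trapezoid formula translates directly into code. The following is a minimal Python sketch, assuming $F_0 = Q_0 = 0$; the function name `gini` and the test datasets are illustrative, with the first dataset taken from the plotted example in these slides.

```python
def gini(incomes):
    """Gini index G = 1 - sum_i (F_i - F_{i-1}) (Q_i + Q_{i-1}), with F_0 = Q_0 = 0."""
    xs = sorted(incomes)
    n = len(xs)
    total = sum(xs)
    trapezoid_sum = 0.0
    f_prev = q_prev = 0.0
    running = 0
    for i, x in enumerate(xs, start=1):
        running += x
        f, q = i / n, running / total
        trapezoid_sum += (f - f_prev) * (q + q_prev)
        f_prev, q_prev = f, q
    return 1 - trapezoid_sum

print(gini([1, 2, 3, 10, 15, 15, 30, 50]))  # the example income values
print(gini([5, 5, 5, 5]))                   # perfect equality
print(gini([0, 0, 0, 20]))                  # one individual holds everything
```

With a finite number $n$ of individuals, the maximal-inequality case evaluates to $(n-1)/n$ under this formula, which approaches 1 as $n$ grows.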


How do things change if we have repetition?

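- Data in the form

  value  $x_1 \leq x_2 \leq \cdots \leq x_k$
  count  $n_1 \quad n_2 \quad \cdots \quad n_k$

  with $\sum_j n_j = n$
- Define

$$F_i = \frac{\sum_{j=1}^{i} n_j}{n} \qquad Q_i = \frac{\sum_{j=1}^{i} n_j x_j}{\sum_{j=1}^{k} n_j x_j}$$

- Everything else stays the same.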
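A sketch of the grouped version, under the same trapezoid formula with $F_0 = Q_0 = 0$: the function name `gini_grouped` and the example table are assumptions for illustration. Passing a count of 1 for every observation recovers the ungrouped computation, which gives a quick consistency check.

```python
def gini_grouped(values, counts):
    """Gini index for data given as distinct values x_j with counts n_j."""
    pairs = sorted(zip(values, counts))  # order by value
    n = sum(c for _, c in pairs)
    total = sum(v * c for v, c in pairs)
    trapezoid_sum = 0.0
    f_prev = q_prev = 0.0
    cum_count = cum_income = 0
    for v, c in pairs:
        cum_count += c
        cum_income += v * c
        f, q = cum_count / n, cum_income / total
        trapezoid_sum += (f - f_prev) * (q + q_prev)
        f_prev, q_prev = f, q
    return 1 - trapezoid_sum

# Illustrative table: the value 15 occurs twice, all others once.
values = [1, 2, 3, 10, 15, 30, 50]
counts = [1, 1, 1, 1, 2, 1, 1]
grouped = gini_grouped(values, counts)

# Same data listed one observation at a time.
raw = [1, 2, 3, 10, 15, 15, 30, 50]
ungrouped = gini_grouped(raw, [1] * len(raw))
print(grouped, ungrouped)
```

The two results agree because repeated values lie on a straight segment of the curve, so merging them into one trapezoid does not change the area.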


Income distribution in USA 2015

Current Population Survey, Income Data

[Figure: proportion in income bracket (vertical axis, 0.00 to 0.08) against average income in bracket (horizontal axis, 0 to 4e+05), with series for All, Whites, Blacks, Asians, and Hispanics.]


Revisiting the Income data

[Figure: Gini index (horizontal axis, 0.47 to 0.50) by race (vertical axis): All, Asian, Black, Hispanic, White.]


Gini index for other data


Something to note

We can calculate the following summary of “mutual variability”

$$\Delta = \sum_{i=1}^{k} \sum_{j=1}^{k} |x_i - x_j| \, \frac{n_i}{n} \, \frac{n_j}{n}$$

And one can show that

$$G = \frac{\Delta}{2\bar{x}}$$
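The identity can be checked numerically on a small example. The sketch below computes $\Delta$ from its double sum and compares $\Delta / (2\bar{x})$ with the trapezoid-based Gini index; it is a numerical check on one dataset, not a proof. The function names and the data table are illustrative assumptions.

```python
import math

def gini_grouped(values, counts):
    """Trapezoid-based Gini index for values x_j with counts n_j (F_0 = Q_0 = 0)."""
    pairs = sorted(zip(values, counts))
    n = sum(c for _, c in pairs)
    total = sum(v * c for v, c in pairs)
    trapezoid_sum = 0.0
    f_prev = q_prev = 0.0
    cum_count = cum_income = 0
    for v, c in pairs:
        cum_count += c
        cum_income += v * c
        f, q = cum_count / n, cum_income / total
        trapezoid_sum += (f - f_prev) * (q + q_prev)
        f_prev, q_prev = f, q
    return 1 - trapezoid_sum

def mean_difference(values, counts):
    """Delta: sum over all pairs of |x_i - x_j| weighted by (n_i / n)(n_j / n)."""
    n = sum(counts)
    return sum(abs(xi - xj) * (ni / n) * (nj / n)
               for xi, ni in zip(values, counts)
               for xj, nj in zip(values, counts))

values = [1, 2, 3, 10, 15, 30, 50]  # illustrative distinct values
counts = [1, 1, 1, 1, 2, 1, 1]      # illustrative counts
mean = sum(v * c for v, c in zip(values, counts)) / sum(counts)

delta = mean_difference(values, counts)
print(delta / (2 * mean), gini_grouped(values, counts))
```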

