
Lecture 2 - Probability Distributions

Date post: 03-Apr-2018
Upload: ze-chen

Transcript
  • 7/29/2019 Lecture 2 - Probability Distributions

    Statistical inference: probability distributions and confidence intervals


    We are now familiar with descriptive statistics, but the main use of statistical methods is not description but prediction

    o i.e. we collect samples mostly to predict characteristics of the whole population

    The key instrument of extrapolation from sample to population is the analysis of probability distributions:

    o By assuming that our variables have a certain distribution (normal, uniform, etc.), we can use samples to infer population properties

    In the following, we'll examine the concept and uses of statistical distributions


    The most utilised statistical distribution is the normal distribution (the bell curve)

    o also the most infamous, due to certain misuses

    o http://crab.rutgers.edu/~goertzel/normalcurve.htm

    However, there is nothing intrinsically wrong with using probability distributions

    o well, anything in the wrong hands (from a bread knife to a fundamental law of nature proposed by a pacifist) may become a weapon


    The first reason for the popularity of the normal curve is descriptive; i.e. we use it to model the distribution of certain traits that look bell-shaped

    What traits are bell-shaped? Typically, traits that are optimised or established by biological or social processes, and thus have a tendency to occur at an expected/established/optimal value

    o classic example: biological traits under natural selection

    o but the only reason Darwin applied the principle of optimisation to nature is that this was a current concept in Victorian society; human societies also define optimal values of certain features, with deviations (in both directions) being less common


    The normal curve can describe features of optimised traits

    o the mean value should be the most likely or most frequently observed

    o the further from the mean, the less likely a value should be

    o the sum of all cases (or probabilities) should be 100% (that's the whole sample)

    What kind of curve/distribution produces this pattern?

    Let's try an exponential:

    o > x <- ...; y <- ...; plot(x,y) # what does it look like?

    Try others:

    o > y <- ...; y <- ...; y <- ...
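The right-hand sides of the assignments on this slide did not survive in the transcript, so the sketch below is only one plausible way to run the experiment. The grid of x values and the three candidate curves (`y_exp`, `y_decay`, `y_bell`) are assumptions chosen for illustration, not the original commands.

```r
# Grid of x values centred on zero (an assumed range)
x <- seq(-3, 3, by = 0.1)

# Candidate curves (assumptions, not the slide's lost commands)
y_exp   <- exp(x)      # keeps growing: no peak at the centre
y_decay <- exp(-x)     # keeps falling: no peak at the centre either
y_bell  <- exp(-x^2)   # peaks at x = 0 and falls off symmetrically

plot(x, y_bell, type = "l")   # what does it look like?

# Location of the highest point of each curve
x[which.max(y_exp)]    # right-hand edge of the grid
x[which.max(y_bell)]   # centre of the grid (zero, up to rounding)
```

Only the last candidate has the three properties listed above: a most likely central value, values that become less likely further away, and a finite area that can be rescaled to 100%.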


    The normal distribution is just a modified version of our exponential

    The curve

    $N(0,1) = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}$

    is the standard normal distribution with

    mean=0

    sd=1

    sum of frequencies=1

    Distribution N(0, 1) is possibly the most used in statistical analyses

    It says that, for example:

    the probability of being well above average (more than 3 standard deviations above the mean) is only 0.1%

    the probability of being at least one standard deviation below average (-1 sd) is 0.1+2.1+13.6=15.8% (i.e. everything below -1)

    [Figure: standard normal curve, x-axis in standard deviations from -3 to +3]
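The percentages quoted for the standard normal curve can be checked numerically. The sketch below writes the density out by hand (the function name `std_normal` is made up for this example) and compares it with R's built-in `dnorm` and `pnorm`.

```r
# Standard normal density written out by hand (illustrative helper)
std_normal <- function(x) exp(-x^2 / 2) / sqrt(2 * pi)

# It agrees with R's built-in density function
all.equal(std_normal(1.5), dnorm(1.5))

# The total area under the curve is 1 (the "sum of frequencies")
integrate(std_normal, -Inf, Inf)$value

# Areas of the three bands to the left of -1, as proportions
bands <- c(
  below_minus3     = pnorm(-3),
  minus3_to_minus2 = pnorm(-2) - pnorm(-3),
  minus2_to_minus1 = pnorm(-1) - pnorm(-2)
)
round(100 * bands, 1)   # 0.1, 2.1 and 13.6 percent
sum(bands)              # same as pnorm(-1), about 0.159
```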


    However, real traits (body height, income, schooling years, number of social media accounts) may have a normal distribution (bell shape), but rarely with mean=0 and standard deviation=1

    That is not a problem: we can standardise variables, i.e. transform them so that everything you measure has mean=0 and sd=1

    How is this done? With z-scores


    1) We take variable x and subtract the mean from each case

    o If mean height is 180 cm, someone 170 cm tall now measures 170-180=-10

    2) We take all residuals (case minus mean) and divide by the standard deviation of the sample

    o If sd=10 and mean is 180 cm, then someone measuring 170 cm deviates -10 cm/10 cm = -1, i.e. one standard deviation below the mean

    In summary, standardisation or calculation of z-scores is simply converting any measurements into standard deviation units using

    $z = \frac{x - \text{mean}}{\text{sd}}$

    i.e., if your height, or age, or income, etc. are average, then on the z-scale all those things measure zero
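The two steps can be sketched in R as follows. The `height` vector holds made-up values for illustration only; any numeric variable would do.

```r
# Hypothetical heights in cm (illustrative values, not real data)
height <- c(160, 170, 175, 180, 185, 190, 200)

# Step 1: subtract the mean from each case
residuals <- height - mean(height)

# Step 2: divide the residuals by the sample standard deviation
z <- residuals / sd(height)

mean(z)   # zero, up to rounding error
sd(z)     # one
```

R's built-in `scale(height)` performs the same centring and scaling by default, returning the result as a one-column matrix.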


    So: if in a population

    o mean height = 180 cm

    o standard deviation = 10 cm

    and you are 170 cm, then

    o you measure 10 cm below the average

    o you measure z = (170 - 180)/10 = -1

    This means that the probability of being shorter than 170 cm in this population is

    o 0.1 + 2.1 + 13.6 = 15.8%

    The reason for standardising is clear: it is the theoretical step that allows the application of the normal distribution to many quantifiable aspects of reality


    We are interested in intervals of the normal curve, not points

    Why? What does it mean to ask "what is the probability of being a millionaire in the UK?" (or their frequency)

    o it does not mean the probability of having exactly 1 million (that's a point)

    o it means everyone having over 1 million (and that's an interval)

    Cumulative probability is the probability of an interval of values

    [Figure panels: a lower interval; an upper interval]


    It is easy to estimate the cumulative probability of being smaller than a value

    o You provide the individual value, the mean, and the sd, and R calculates the z-score and the probability of the interval defined by that value

    Command pnorm(test value, mean, sd) calculates cumulative probability from left to right, i.e. from -∞ to a value x

    o (that's the blue area)

    Example: if your height is 170 cm (and average is 180 cm, sd=10 cm), then

    o > pnorm(170,180,10)

    o [1] 0.1586553

    [Figure: normal curve with a lower interval shaded in blue]
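To see that `pnorm` is doing the standardisation internally, the sketch below computes the same probability twice: once from the raw value with the mean and sd supplied, and once from the z-score on the standard normal curve. The variable names are illustrative.

```r
pop_mean <- 180   # population mean height in cm
pop_sd   <- 10    # population standard deviation in cm
x        <- 170   # test value in cm

# Route 1: let pnorm standardise for us
p_raw <- pnorm(x, mean = pop_mean, sd = pop_sd)

# Route 2: compute the z-score by hand, then use N(0, 1)
z     <- (x - pop_mean) / pop_sd   # -1
p_std <- pnorm(z)

all.equal(p_raw, p_std)   # TRUE: both are about 0.159
```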


    pnorm can estimate upper intervals too (i.e. the probability of being over a given value):

    Example:

    o what is the probability of being at least 190 cm (i.e. taller than 190 cm) in the same population?

    Probability of being smaller than 190 cm is

    > pnorm(190,180,10)

    [1] 0.8413447

    i.e. 0.841=84.1%

    Therefore the probability of being over 190 cm is the rest, i.e.

    > 1-pnorm(190,180,10)

    [1] 0.1586553

    i.e. the probability of being taller than 190 cm is 1 (100%) minus the probability of being smaller than 190 cm

    [Figure: an upper interval]
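As an alternative to subtracting from 1, `pnorm` accepts a `lower.tail` argument; setting it to `FALSE` returns the upper interval directly. A short sketch comparing the two routes:

```r
# Upper interval by subtraction, as in the example above
p_upper_by_subtraction <- 1 - pnorm(190, mean = 180, sd = 10)

# Upper interval requested directly from pnorm
p_upper_direct <- pnorm(190, mean = 180, sd = 10, lower.tail = FALSE)

all.equal(p_upper_by_subtraction, p_upper_direct)   # TRUE
```

Both give about 0.159; the direct form is also numerically safer for values very far out in the tail, where `1 - pnorm(...)` loses precision.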


    Important: we can combine the two things to calculate the probability of extreme values (i.e. too large or too small)

    So what is the probability of being shorter than 170 cm OR taller than 190 cm, with N(180, 10)?

    > 1 - pnorm(190, 180, 10) + pnorm(170, 180, 10)

    (check why)
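One way to "check why" is to compute the two tails separately and add them: the events "shorter than 170 cm" and "taller than 190 cm" cannot both happen, so their probabilities simply sum. The variable names below are illustrative.

```r
p_short <- pnorm(170, 180, 10)       # lower tail: shorter than 170 cm
p_tall  <- 1 - pnorm(190, 180, 10)   # upper tail: taller than 190 cm

# The two tails do not overlap, so the probabilities add
p_extreme <- p_short + p_tall
p_extreme                            # about 0.317

# 170 and 190 are equally far from the mean, so the tails are equal
all.equal(p_extreme, 2 * p_short)    # TRUE
```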


    Now the most important case (we'll see why):

    What about the probability of not being extreme, i.e. of being between 170 cm and 190 cm? (This means less than 10 cm off the average of 180 cm)

    o > pnorm(190, 180, 10) - pnorm(170, 180, 10)
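The subtraction can be wrapped in a small helper for any interval. The function `prob_between` is a hypothetical name introduced here for illustration; it is not part of R.

```r
# Probability of falling between lower and upper under N(mean, sd)
# (hypothetical helper, not a built-in function)
prob_between <- function(lower, upper, mean = 0, sd = 1) {
  pnorm(upper, mean, sd) - pnorm(lower, mean, sd)
}

p_middle <- prob_between(170, 190, mean = 180, sd = 10)
p_middle   # about 0.683

# "Not extreme" and "extreme" together cover the whole population
p_extreme <- pnorm(170, 180, 10) + (1 - pnorm(190, 180, 10))
all.equal(p_middle + p_extreme, 1)   # TRUE
```

Because 170 cm and 190 cm sit at z = -1 and z = +1, this is the familiar result that roughly 68% of a normal population lies within one standard deviation of the mean.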


    Take the estimates of years at school by country (from the HDR2011 database); this is the variable schoolingyears:

    How can we estimate the proportion of countries with children having

    a) less than 3 years of schooling?

    b) less than 5 years of schooling?

    c) at least 7 years of schooling?

    Hints:

    - You need to use function pnorm

    - To use pnorm you need the test value, the mean and the standard deviation of variable schoolingyears
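A possible outline of the workflow is sketched below. The HDR2011 values are not reproduced in this transcript, so `schoolingyears` is filled with simulated placeholder numbers (arbitrary mean and sd); replace that line with the real variable before reading anything into the results.

```r
# PLACEHOLDER data: simulated values, NOT the HDR2011 estimates
set.seed(1)
schoolingyears <- rnorm(180, mean = 7, sd = 3)

# The two ingredients pnorm needs besides the test value
# (na.rm = TRUE guards against missing values in real data)
m <- mean(schoolingyears, na.rm = TRUE)
s <- sd(schoolingyears, na.rm = TRUE)

p_less_3     <- pnorm(3, m, s)       # a) less than 3 years
p_less_5     <- pnorm(5, m, s)       # b) less than 5 years
p_at_least_7 <- 1 - pnorm(7, m, s)   # c) at least 7 years

c(p_less_3, p_less_5, p_at_least_7)
```

These estimates rely on the assumption that schooling years are approximately normally distributed across countries; a histogram of the real variable is a quick way to judge whether that is reasonable.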

