DISTRIBUSI Multivariate Normal

Date post: 01-Jun-2018
Upload: dyonisius-h-s-jewaru

Transcript
  • 8/9/2019 DISTRIBUSI Multivariate Normal

    1/81

    MULTIVARIATE NORMAL DISTRIBUTION

    Session 5, multivariate analysis

  • 8/9/2019 DISTRIBUSI Multivariate Normal

    2/81

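    Multivariate Normal Distribution

    •  $X$ is a random vector that follows a p-variate normal distribution with mean vector $\mu$ and variance-covariance matrix $\Sigma$; its joint pdf can be written as

      $$f(x) = \frac{1}{(2\pi)^{p/2}\,|\Sigma|^{1/2}} \exp\left\{ -\tfrac{1}{2}\,(x - \mu)'\,\Sigma^{-1}\,(x - \mu) \right\}$$

      with $x = (x_1, \ldots, x_p)'$, $\mu$ a $p \times 1$ vector and $\Sigma$ a $p \times p$ positive definite matrix.

    •  Notation: $X \sim N_p(\mu, \Sigma)$ (p-variate normal)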
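The density above can be evaluated directly for p = 2. The following is a minimal sketch using only the Python standard library; the function name `bivariate_normal_pdf` and the example inputs are illustrative assumptions, not part of the slides.

```python
import math

def bivariate_normal_pdf(x, mu, sigma):
    """Density of N_2(mu, sigma) at x; sigma is a 2x2 nested list."""
    (s11, s12), (s21, s22) = sigma
    det = s11 * s22 - s12 * s21            # |Sigma|, must be > 0
    d1, d2 = x[0] - mu[0], x[1] - mu[1]    # components of x - mu
    # Quadratic form (x - mu)' Sigma^{-1} (x - mu), using the closed-form 2x2 inverse.
    quad = (s22 * d1 * d1 - (s12 + s21) * d1 * d2 + s11 * d2 * d2) / det
    p = 2
    return math.exp(-0.5 * quad) / ((2 * math.pi) ** (p / 2) * math.sqrt(det))

# Hypothetical check: the standard bivariate normal peaks at 1 / (2 * pi).
peak = bivariate_normal_pdf([0.0, 0.0], [0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
```

For larger p one would normally rely on a linear algebra library instead of a hand-written inverse.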

  • 8/9/2019 DISTRIBUSI Multivariate Normal

    3/81

    Multivariate Normal Distribution

    • n = 1: univariate normal

    • n = 2: bivariate normal

  • 8/9/2019 DISTRIBUSI Multivariate Normal

    4/81

    $N_2$ (bivariate normal)

  • 8/9/2019 DISTRIBUSI Multivariate Normal

    5/81

    Surface plots of the bivariate normal distribution

  • 8/9/2019 DISTRIBUSI Multivariate Normal

    6/81

    Contours of the Bivariate Normal Distribution

    All points (x, y) that share the same value of f(x, y) form a contour. In p-dimensional space, a contour is the set of all values x such that

      $$(x - \mu)'\,\Sigma^{-1}\,(x - \mu) = c^2$$

    [Figure: bivariate normal response surface f(X1, X2) over the (X1, X2) plane, with its contours]

  • 8/9/2019 DISTRIBUSI Multivariate Normal

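    7/81

    The contours

      $$(x - \mu)'\,\Sigma^{-1}\,(x - \mu) = c^2$$

    for a constant $c$ form an ellipsoid centered at $\mu$ with axes

      $$\pm c\sqrt{\lambda_i}\, e_i, \quad\text{where}\quad \Sigma e_i = \lambda_i e_i \ \text{ for } i = 1, \ldots, p.$$

    [Figure: a contour of f(X1, X2) in the (X1, X2) plane, centered at $\mu = (\mu_1, \mu_2)'$, with half-axes $\pm c\sqrt{\lambda_1}\, e_1$ and $\pm c\sqrt{\lambda_2}\, e_2$]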

  • 8/9/2019 DISTRIBUSI Multivariate Normal

    8/81

    Contour plots of the bivariate normal distribution

  • 8/9/2019 DISTRIBUSI Multivariate Normal

    9/81

    Scatter plots of data from the bivariate normal distribution

  • 8/9/2019 DISTRIBUSI Multivariate Normal

    10/81

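    The general form of the contours of a bivariate normal probability distribution with equal variances ($\sigma_{11} = \sigma_{22}$) can be derived as follows.

    First find the eigenvalues of $\Sigma$:

      $$|\Sigma - \lambda I| = 0 \quad\text{or}\quad 0 = \begin{vmatrix} \sigma_{11} - \lambda & \sigma_{12} \\ \sigma_{12} & \sigma_{11} - \lambda \end{vmatrix} = (\sigma_{11} - \lambda)^2 - \sigma_{12}^2 = (\lambda - \sigma_{11} - \sigma_{12})(\lambda - \sigma_{11} + \sigma_{12})$$

    so $\lambda_1 = \sigma_{11} + \sigma_{12}$ and $\lambda_2 = \sigma_{11} - \sigma_{12}$.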

  • 8/9/2019 DISTRIBUSI Multivariate Normal

    11/81

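    Then find the (normalized) eigenvectors of $\Sigma$:

      $$\Sigma e_i = \lambda_i e_i \quad\text{or}\quad \begin{bmatrix} \sigma_{11} & \sigma_{12} \\ \sigma_{12} & \sigma_{11} \end{bmatrix} \begin{bmatrix} e_1 \\ e_2 \end{bmatrix} = \lambda_1 \begin{bmatrix} e_1 \\ e_2 \end{bmatrix}$$

    or

      $$\sigma_{11} e_1 + \sigma_{12} e_2 = (\sigma_{11} + \sigma_{12})\, e_1$$
      $$\sigma_{12} e_1 + \sigma_{11} e_2 = (\sigma_{11} + \sigma_{12})\, e_2$$

    which implies $e_1 = e_2$, so the normalized first eigenvector is $(1/\sqrt{2},\ 1/\sqrt{2})'$,

    and $\lambda_2 = \sigma_{11} - \sigma_{12}$ similarly leads to the second eigenvector $(1/\sqrt{2},\ -1/\sqrt{2})'$.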
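The eigenvalue and eigenvector results for the equal-variance case can be checked numerically. This is a small sketch with hypothetical values for $\sigma_{11}$ and $\sigma_{12}$ (they are assumptions chosen only for illustration); it verifies $\Sigma e = \lambda e$ for both pairs.

```python
import math

s11, s12 = 2.0, 0.8                      # hypothetical sigma_11 (= sigma_22) and sigma_12
sigma = [[s11, s12], [s12, s11]]
r = 1.0 / math.sqrt(2.0)

# Eigenpairs claimed by the derivation: lambda = s11 +/- s12, e = (1, +/-1) / sqrt(2).
pairs = [(s11 + s12, (r, r)), (s11 - s12, (r, -r))]

def matvec(m, v):
    """Product of a 2x2 matrix and a 2-vector."""
    return tuple(m[i][0] * v[0] + m[i][1] * v[1] for i in range(2))

# Largest absolute entry of Sigma e - lambda e for each pair (should be ~0).
residuals = []
for lam, e in pairs:
    se = matvec(sigma, e)
    residuals.append(max(abs(se[i] - lam * e[i]) for i in range(2)))
```

With a positive `s12` the first pair has the larger eigenvalue, which matches the discussion of the 45° line that follows in the slides.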

  • 8/9/2019 DISTRIBUSI Multivariate Normal

    12/81

      - For a positive covariance $\sigma_{12}$, $\lambda_1 = \sigma_{11} + \sigma_{12}$ is the largest (first) eigenvalue, and the corresponding eigenvector lies along the 45° line through the centroid $\mu$.

    [Figure: contour $(x - \mu)'\,\Sigma^{-1}\,(x - \mu) = c^2$ for constant $c$ in the (X1, X2) plane, with half-axes $c\sqrt{\sigma_{11} + \sigma_{12}}$ and $c\sqrt{\sigma_{11} - \sigma_{12}}$]

    What happens if the covariance is negative? Why?

  • 8/9/2019 DISTRIBUSI Multivariate Normal

    13/81

      - For a negative covariance $\sigma_{12}$, $\lambda_2 = \sigma_{11} - \sigma_{12}$ is the largest eigenvalue. This second eigenvalue and its associated eigenvector lie at right angles to the 45° line running through the centroid $\mu$.

    [Figure: contour $(x - \mu)'\,\Sigma^{-1}\,(x - \mu) = c^2$ for constant $c$ in the (X1, X2) plane, with half-axes $c\sqrt{\sigma_{11} - \sigma_{12}}$ and $c\sqrt{\sigma_{11} + \sigma_{12}}$]

    What do you suppose happens when the covariance is zero? Why?

  • 8/9/2019 DISTRIBUSI Multivariate Normal

    14/81

  • 8/9/2019 DISTRIBUSI Multivariate Normal

    15/81

      - For a covariance $\sigma_{12}$ of zero the two eigenvalues are equal ($\lambda_1 = \lambda_2 = \sigma_{11}$), so the contours are circles. The eigenvectors can still be taken as before: one runs along the 45° line through the centroid $\mu$ and the other is perpendicular to it.

    [Figure: circular contour $(x - \mu)'\,\Sigma^{-1}\,(x - \mu) = c^2$ for constant $c$ in the (X1, X2) plane, with half-axes $c\sqrt{\sigma_{11} + \sigma_{12}}$ and $c\sqrt{\sigma_{11} - \sigma_{12}}$ of equal length]

  • 8/9/2019 DISTRIBUSI Multivariate Normal

    16/81

  • 8/9/2019 DISTRIBUSI Multivariate Normal

    17/81

  • 8/9/2019 DISTRIBUSI Multivariate Normal

    18/81

      2. The density

      $$f(x) = \frac{1}{(2\pi)^{p/2}\,|\Sigma|^{1/2}}\, e^{-(x - \mu)'\,\Sigma^{-1}\,(x - \mu)/2}$$

    is symmetric along its constant density contours and is centered at $\mu$, i.e., the mean is equal to the median!

      3. Conditional distributions of the components of $X$ are (multivariate) normal.

  • 8/9/2019 DISTRIBUSI Multivariate Normal

    19/81

    D. Some Important Results Regarding the Multivariate Normal Distribution

      1. If $X \sim N_p(\mu, \Sigma)$, then any linear combination

      $$a'X = \sum_{i=1}^{p} a_i X_i \sim N(a'\mu,\ a'\Sigma a)$$

    Furthermore, if $a'X \sim N(a'\mu,\ a'\Sigma a)$ for every $a$, then $X \sim N_p(\mu, \Sigma)$.
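As a worked illustration of result 1, the mean $a'\mu$ and variance $a'\Sigma a$ of a linear combination can be computed directly. The helper name and the numbers below are hypothetical, chosen only so the arithmetic is easy to follow.

```python
def linear_combination_params(a, mu, sigma):
    """Return (a'mu, a'Sigma a), the mean and variance of a'X when X ~ N_p(mu, Sigma)."""
    p = len(a)
    mean = sum(a[i] * mu[i] for i in range(p))
    var = sum(a[i] * sigma[i][k] * a[k] for i in range(p) for k in range(p))
    return mean, var

# Hypothetical example: the difference X1 - X2 of two correlated components.
mean, var = linear_combination_params(
    [1.0, -1.0],                   # a
    [3.0, 1.0],                    # mu
    [[4.0, 1.0], [1.0, 2.0]],      # Sigma
)
# mean = 3 - 1 = 2 and var = 4 - 2 * 1 + 2 = 4, so X1 - X2 ~ N(2, 4) here.
```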

  • 8/9/2019 DISTRIBUSI Multivariate Normal

    20/81

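      2. If $X \sim N_p(\mu, \Sigma)$, then any set of $q$ linear combinations

      $$AX = \begin{bmatrix} \sum_{i=1}^{p} a_{1i} X_i \\ \sum_{i=1}^{p} a_{2i} X_i \\ \vdots \\ \sum_{i=1}^{p} a_{qi} X_i \end{bmatrix} \sim N_q(A\mu,\ A\Sigma A')$$

    Furthermore, if $d$ is a conformable vector of constants, then $X + d \sim N_p(\mu + d,\ \Sigma)$.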

  • 8/9/2019 DISTRIBUSI Multivariate Normal

    21/81

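      3. If $X \sim N_p(\mu, \Sigma)$, then all subsets of $X$ are (multivariate) normally distributed, i.e., for any partition

      $$X_{(p \times 1)} = \begin{bmatrix} X_1 \\ X_2 \end{bmatrix} \begin{matrix} (q \times 1) \\ ((p - q) \times 1) \end{matrix}, \quad \mu_{(p \times 1)} = \begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix}, \quad \Sigma_{(p \times p)} = \begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{bmatrix}$$

    with $\Sigma_{11}$ of dimension $q \times q$ and $\Sigma_{22}$ of dimension $(p - q) \times (p - q)$,

    then $X_1 \sim N_q(\mu_1, \Sigma_{11})$ and $X_2 \sim N_{p-q}(\mu_2, \Sigma_{22})$.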

  • 8/9/2019 DISTRIBUSI Multivariate Normal

    22/81

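      4. If $X_1 \sim N_{q_1}(\mu_1, \Sigma_{11})$ and $X_2 \sim N_{q_2}(\mu_2, \Sigma_{22})$ are independent, then

    $\mathrm{Cov}(X_1, X_2) = \Sigma_{12} = 0$

    and if

      $$\begin{bmatrix} X_1 \\ X_2 \end{bmatrix} \sim N_{q_1 + q_2}\left( \begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix},\ \begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{bmatrix} \right)$$

    then $X_1$ and $X_2$ are independent if and only if $\Sigma_{12} = 0$,

    and if $X_1 \sim N_{q_1}(\mu_1, \Sigma_{11})$ and $X_2 \sim N_{q_2}(\mu_2, \Sigma_{22})$ and are independent, then

      $$\begin{bmatrix} X_1 \\ X_2 \end{bmatrix} \sim N_{q_1 + q_2}\left( \begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix},\ \begin{bmatrix} \Sigma_{11} & 0 \\ 0 & \Sigma_{22} \end{bmatrix} \right)$$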

  • 8/9/2019 DISTRIBUSI Multivariate Normal

    23/81

      5. If $X \sim N_p(\mu, \Sigma)$ and $|\Sigma| > 0$, then

      $$(x - \mu)'\,\Sigma^{-1}\,(x - \mu) \sim \chi^2_p$$

      and the $N_p(\mu, \Sigma)$ distribution assigns probability $1 - \alpha$ to the solid ellipsoid

      $$\left\{ x : (x - \mu)'\,\Sigma^{-1}\,(x - \mu) \le \chi^2_p(\alpha) \right\}$$
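Result 5 gives a simple membership test for the probability ellipsoid. The sketch below handles only p = 2, where the upper-$\alpha$ point of the chi-square distribution with 2 degrees of freedom has the closed form $-2 \ln \alpha$ (a known property of that distribution, not something stated in the slides). The function name and inputs are illustrative assumptions.

```python
import math

def in_probability_ellipse(x, mu, sigma, alpha):
    """True if x lies in the ellipse to which N_2(mu, sigma) assigns probability 1 - alpha."""
    (s11, s12), (s21, s22) = sigma
    det = s11 * s22 - s12 * s21
    d1, d2 = x[0] - mu[0], x[1] - mu[1]
    # Squared generalized distance (x - mu)' Sigma^{-1} (x - mu).
    quad = (s22 * d1 * d1 - (s12 + s21) * d1 * d2 + s11 * d2 * d2) / det
    cutoff = -2.0 * math.log(alpha)   # chi-square upper-alpha point for 2 df only
    return quad <= cutoff

identity = [[1.0, 0.0], [0.0, 1.0]]
inside = in_probability_ellipse([1.0, 1.0], [0.0, 0.0], identity, 0.10)   # distance 2
outside = in_probability_ellipse([3.0, 0.0], [0.0, 0.0], identity, 0.10)  # distance 9
```

For p > 2 the cutoff would have to come from a chi-square table or a statistics library.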

  • 8/9/2019 DISTRIBUSI Multivariate Normal

    24/81

      6.

  • 8/9/2019 DISTRIBUSI Multivariate Normal

    25/81

    E. Sampling From a Multivariate Normal Distribution and Maximum

  • 8/9/2019 DISTRIBUSI Multivariate Normal

    26/81

    Maximum

  • 8/9/2019 DISTRIBUSI Multivariate Normal

    27/81

    For a k x k symmetric matrix $A$ and a k x 1 vector $x$:

      - $x'Ax = \mathrm{tr}(x'Ax) = \mathrm{tr}(Axx')$

      - $\mathrm{tr}(A) = \sum_{i=1}^{k} \lambda_i$, where $\lambda_i$, $i = 1, \ldots, k$, are the eigenvalues of $A$

    These two results can be used to simplify the joint density of $n$ mutually independent random observations $X_j$, each having distribution $N_p(\mu, \Sigma)$. We first rewrite

      $$(x_j - \mu)'\,\Sigma^{-1}\,(x_j - \mu) = \mathrm{tr}\left[ (x_j - \mu)'\,\Sigma^{-1}\,(x_j - \mu) \right] = \mathrm{tr}\left[ \Sigma^{-1}\,(x_j - \mu)(x_j - \mu)' \right]$$

  • 8/9/2019 DISTRIBUSI Multivariate Normal

    28/81

    Then we rewrite

      $$\sum_{j=1}^{n} (x_j - \mu)'\,\Sigma^{-1}\,(x_j - \mu) = \sum_{j=1}^{n} \mathrm{tr}\left[ (x_j - \mu)'\,\Sigma^{-1}\,(x_j - \mu) \right]$$
      $$= \sum_{j=1}^{n} \mathrm{tr}\left[ \Sigma^{-1}\,(x_j - \mu)(x_j - \mu)' \right]$$
      $$= \mathrm{tr}\left[ \Sigma^{-1} \sum_{j=1}^{n} (x_j - \mu)(x_j - \mu)' \right]$$

    since the trace of the sum of matrices is equal to the sum of their individual traces.

  • 8/9/2019 DISTRIBUSI Multivariate Normal

    29/81

    We can further state that

      $$\sum_{j=1}^{n} (x_j - \mu)(x_j - \mu)' = \sum_{j=1}^{n} (x_j - \bar{x} + \bar{x} - \mu)(x_j - \bar{x} + \bar{x} - \mu)'$$
      $$= \sum_{j=1}^{n} (x_j - \bar{x})(x_j - \bar{x})' + \sum_{j=1}^{n} (\bar{x} - \mu)(\bar{x} - \mu)'$$
      $$= \sum_{j=1}^{n} (x_j - \bar{x})(x_j - \bar{x})' + n\,(\bar{x} - \mu)(\bar{x} - \mu)'$$

    because the crossproduct terms

      $$\sum_{j=1}^{n} (x_j - \bar{x})(\bar{x} - \mu)' \quad\text{and}\quad \sum_{j=1}^{n} (\bar{x} - \mu)(x_j - \bar{x})'$$

    are both matrices of zeros.
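The decomposition of $\sum_j (x_j - \mu)(x_j - \mu)'$ into a within-sample part and a term in $\bar{x} - \mu$ can be confirmed numerically. The data set and the value of $\mu$ below are hypothetical; the sketch compares both sides entry by entry.

```python
def outer_sum(rows, center):
    """Sum over rows of (x - center)(x - center)' as a nested list."""
    p = len(center)
    total = [[0.0] * p for _ in range(p)]
    for x in rows:
        d = [x[i] - center[i] for i in range(p)]
        for i in range(p):
            for k in range(p):
                total[i][k] += d[i] * d[k]
    return total

data = [[1.0, 2.0], [3.0, 0.0], [4.0, 5.0], [0.0, 1.0]]   # hypothetical sample, n = 4, p = 2
mu = [1.0, 1.0]                                           # hypothetical population mean
n = len(data)
xbar = [sum(col) / n for col in zip(*data)]

lhs = outer_sum(data, mu)                 # sum of (x_j - mu)(x_j - mu)'
about_mean = outer_sum(data, xbar)        # sum of (x_j - xbar)(x_j - xbar)'
rhs = [[about_mean[i][k] + n * (xbar[i] - mu[i]) * (xbar[k] - mu[k])
        for k in range(2)] for i in range(2)]

# Largest entrywise gap between the two sides (should be ~0).
max_gap = max(abs(lhs[i][k] - rhs[i][k]) for i in range(2) for k in range(2))
```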

  • 8/9/2019 DISTRIBUSI Multivariate Normal

    30/81

    Substitution of these two results yields an alternative expression of the joint density of a random sample from a p-dimensional population.

    Substitution of the observed values $x_1, \ldots, x_n$ into the joint density yields the likelihood function for the corresponding sample $X$, which is often denoted as $L(\mu, \Sigma)$.

  • 8/9/2019 DISTRIBUSI Multivariate Normal

    31/81

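    So for observed values $x_1, \ldots, x_n$ that comprise a random sample $X$ drawn from a p-dimensional normally distributed population, the likelihood function is

      $$L(\mu, \Sigma) = \frac{1}{(2\pi)^{np/2}\,|\Sigma|^{n/2}} \exp\left\{ -\tfrac{1}{2}\, \mathrm{tr}\left[ \Sigma^{-1} \left( \sum_{j=1}^{n} (x_j - \bar{x})(x_j - \bar{x})' + n\,(\bar{x} - \mu)(\bar{x} - \mu)' \right) \right] \right\}$$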

  • 8/9/2019 DISTRIBUSI Multivariate Normal

    32/81

    Finally, note that we can express the exponent of the likelihood function in many ways. One alternate expression will be particularly convenient:

      $$\mathrm{tr}\left[ \Sigma^{-1} \left( \sum_{j=1}^{n} (x_j - \bar{x})(x_j - \bar{x})' + n\,(\bar{x} - \mu)(\bar{x} - \mu)' \right) \right]$$
      $$= \mathrm{tr}\left[ \Sigma^{-1} \sum_{j=1}^{n} (x_j - \bar{x})(x_j - \bar{x})' \right] + n\,\mathrm{tr}\left[ \Sigma^{-1} (\bar{x} - \mu)(\bar{x} - \mu)' \right]$$
      $$= \mathrm{tr}\left[ \Sigma^{-1} \sum_{j=1}^{n} (x_j - \bar{x})(x_j - \bar{x})' \right] + n\,(\bar{x} - \mu)'\,\Sigma^{-1}\,(\bar{x} - \mu)$$

  • 8/9/2019 DISTRIBUSI Multivariate Normal

    33/81

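    which, by another substitution, yields the likelihood function

      $$L(\mu, \Sigma) = \frac{1}{(2\pi)^{np/2}\,|\Sigma|^{n/2}} \exp\left\{ -\tfrac{1}{2}\, \mathrm{tr}\left[ \Sigma^{-1} \sum_{j=1}^{n} (x_j - \bar{x})(x_j - \bar{x})' \right] - \tfrac{n}{2}\,(\bar{x} - \mu)'\,\Sigma^{-1}\,(\bar{x} - \mu) \right\}$$

    Again, keep in mind that we are pursuing estimates of $\mu$ and $\Sigma$ that maximize the likelihood function.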

  • 8/9/2019 DISTRIBUSI Multivariate Normal

    34/81

    This result will also be helpful in deriving the maximum likelihood estimates of $\mu$ and $\Sigma$.

    For a p x p symmetric positive definite matrix $B$ and scalar $b > 0$, it follows that

      $$\frac{1}{|\Sigma|^{b}}\, e^{-\mathrm{tr}(\Sigma^{-1} B)/2} \le \frac{1}{|B|^{b}}\, (2b)^{pb}\, e^{-bp}$$

    for all positive definite $\Sigma$ of dimension p x p, with equality holding only for

      $$\Sigma = \frac{1}{2b}\, B$$

  • 8/9/2019 DISTRIBUSI Multivariate Normal

    35/81

    Now we are ready for maximum likelihood estimation of $\mu$ and $\Sigma$.

    For a random sample $X_1, \ldots, X_n$ from a normal population with mean $\mu$ and covariance $\Sigma$, the maximum likelihood estimators $\hat{\mu}$ and $\hat{\Sigma}$ of $\mu$ and $\Sigma$ are

      $$\hat{\mu} = \bar{X}, \qquad \hat{\Sigma} = \frac{1}{n} \sum_{j=1}^{n} (X_j - \bar{X})(X_j - \bar{X})' = \frac{n - 1}{n}\, S$$

    Their observed values for observed data $x_1, \ldots, x_n$,

      $$\bar{x} \quad\text{and}\quad \frac{1}{n} \sum_{j=1}^{n} (x_j - \bar{x})(x_j - \bar{x})'$$

    are the maximum likelihood estimates of $\mu$ and $\Sigma$.
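The estimators above are straightforward to compute. The following sketch (with a hypothetical four-point sample and an illustrative function name) returns $\hat{\mu}$, $\hat{\Sigma}$ with divisor $n$, and the sample covariance $S$ with divisor $n - 1$, so the relation $\hat{\Sigma} = \frac{n-1}{n} S$ can be inspected.

```python
def mle_normal(rows):
    """Return (mu_hat, sigma_hat, s) for a sample given as a list of p-vectors."""
    n, p = len(rows), len(rows[0])
    xbar = [sum(col) / n for col in zip(*rows)]
    # Sums of squares and cross products about the sample mean.
    w = [[sum((x[i] - xbar[i]) * (x[k] - xbar[k]) for x in rows)
          for k in range(p)] for i in range(p)]
    sigma_hat = [[w[i][k] / n for k in range(p)] for i in range(p)]    # MLE, divisor n
    s = [[w[i][k] / (n - 1) for k in range(p)] for i in range(p)]      # unbiased S, divisor n - 1
    return xbar, sigma_hat, s

data = [[1.0, 2.0], [3.0, 0.0], [4.0, 5.0], [0.0, 1.0]]   # hypothetical sample
mu_hat, sigma_hat, s = mle_normal(data)
```

The difference between the two divisors matters mainly for small n.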

  • 8/9/2019 DISTRIBUSI Multivariate Normal

    36/81

    Note that the maximum of the likelihood is achieved at

      $$L(\hat{\mu}, \hat{\Sigma}) = \frac{1}{(2\pi)^{np/2}\,|\hat{\Sigma}|^{n/2}}\, e^{-np/2}$$

    and since

      $$|\hat{\Sigma}| = \left( \frac{n - 1}{n} \right)^{p} |S|$$

    we have that

      $$L(\hat{\mu}, \hat{\Sigma}) = \left[ (2\pi)^{-np/2}\, e^{-np/2} \left( \frac{n - 1}{n} \right)^{-np/2} \right] |S|^{-n/2} = \text{constant} \times (\text{generalized variance})^{-n/2}$$

  • 8/9/2019 DISTRIBUSI Multivariate Normal

    37/81

    It can be shown that maximum likelihood estimators (or M

  • 8/9/2019 DISTRIBUSI Multivariate Normal

    38/81

    It can also be shown that

      $\bar{x}$ and $(n - 1) S$ (or $S$)

    are sufficient for the multivariate normal joint density

      $$f(x) = \frac{1}{(2\pi)^{np/2}\,|\Sigma|^{n/2}}\, e^{-\mathrm{tr}\left[ \Sigma^{-1} \left( \sum_{j=1}^{n} (x_j - \bar{x})(x_j - \bar{x})' + n\,(\bar{x} - \mu)(\bar{x} - \mu)' \right) \right]/2}$$
      $$= (2\pi)^{-np/2}\, |\Sigma|^{-n/2} \exp\left\{ -\tfrac{1}{2}\, \mathrm{tr}\left[ \Sigma^{-1} \left( \sum_{j=1}^{n} (x_j - \bar{x})(x_j - \bar{x})' + n\,(\bar{x} - \mu)(\bar{x} - \mu)' \right) \right] \right\}$$

    i.e., the density depends on the entire set of observations $x_1, \ldots, x_n$ only through $\bar{x}$ and $(n - 1) S$ (or $S$).

    Thus, we refer to $\bar{X}$ and $S$ as the sufficient statistics for the multivariate normal distribution.

    Sufficient statistics contain all information necessary to evaluate a particular density for a given sample.

  • 8/9/2019 DISTRIBUSI Multivariate Normal

    39/81

    F. The Sampling Distributions of $\bar{X}$ and $S$

    The assumption that $X_1, \ldots, X_n$ constitute a random sample with mean $\mu$ and covariance $\Sigma$ completely determines the sampling distributions of $\bar{X}$ and $S$.

    For a univariate normal distribution, $\bar{X}$ is normal with

      mean $\mu$ and variance $\dfrac{\sigma^2}{n} = \dfrac{\text{population variance}}{\text{sample size}}$

    Analogously, for the multivariate ($p \ge 2$) case (i.e., $X$ is normal with mean $\mu$ and covariance $\Sigma$), $\bar{X}$ is normal with

      mean $\mu$ and covariance matrix $\dfrac{1}{n}\,\Sigma$

  • 8/9/2019 DISTRIBUSI Multivariate Normal

    40/81

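    Similarly, for a random sample $X_1, \ldots, X_n$ from a univariate normal distribution with mean $\mu$ and variance $\sigma^2$,

      $$(n - 1)\, s^2 = \sum_{j=1}^{n} (X_j - \bar{X})^2 \sim \sigma^2 \chi^2_{n-1}, \ \text{the distribution of } \sum_{j=1}^{n-1} Z_j^2$$

    where $Z_j \sim N(0, \sigma^2)$, $j = 1, \ldots, n - 1$, independently.

    Analogously, for the multivariate ($p \ge 2$) case (i.e., $X$ is normal with mean $\mu$ and covariance $\Sigma$), $(n - 1) S$ is Wishart distributed with $n - 1$ degrees of freedom (denoted $W_{n-1}(\cdot \mid \Sigma)$), where

      $W_m(\cdot \mid \Sigma)$ = Wishart distribution with $m$ degrees of freedom
      = distribution of $\sum_{j=1}^{m} Z_j Z_j'$

    where the $Z_j \sim N_p(0, \Sigma)$ are independent.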

  • 8/9/2019 DISTRIBUSI Multivariate Normal

    41/81

    Some important properties of the Wishart distribution:

      - The Wishart distribution exists only if $n > p$

      - If $A_1 \sim W_{m_1}(A_1 \mid \Sigma)$ independently of $A_2 \sim W_{m_2}(A_2 \mid \Sigma)$ (common covariance matrix $\Sigma$), then

        $$A_1 + A_2 \sim W_{m_1 + m_2}(A_1 + A_2 \mid \Sigma)$$

      - and

        $$C A_1 C' \sim W_{m_1}(C A_1 C' \mid C \Sigma C')$$

  • 8/9/2019 DISTRIBUSI Multivariate Normal

    42/81

      - When it exists, the Wishart distribution has a density of

      $$w_{n-1}(A \mid \Sigma) = \frac{|A|^{(n - p - 2)/2}\, e^{-\mathrm{tr}(A \Sigma^{-1})/2}}{2^{p(n-1)/2}\, \pi^{p(p-1)/4}\, |\Sigma|^{(n-1)/2} \prod_{i=1}^{p} \Gamma\left( \tfrac{1}{2}(n - i) \right)}$$

    for a symmetric positive definite matrix $A$.

  • 8/9/2019 DISTRIBUSI Multivariate Normal

    43/81

    F.

  • 8/9/2019 DISTRIBUSI Multivariate Normal

    44/81

      - The

  • 8/9/2019 DISTRIBUSI Multivariate Normal

    45/81

    Multivariate implications of the

  • 8/9/2019 DISTRIBUSI Multivariate Normal

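    46/81

    These statements are sometimes written as

      $$P\left[ -\varepsilon < \bar{X} - \mu < \varepsilon \right] \to 1 \ \text{ as } n \to \infty$$

    and

      $$P\left[ -\varepsilon < S - \Sigma < \varepsilon \right] \to 1 \ \text{ as } n \to \infty$$

    or similarly

      $$P\left[ -\varepsilon < S_n - \Sigma < \varepsilon \right] \to 1 \ \text{ as } n \to \infty$$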

  • 8/9/2019 DISTRIBUSI Multivariate Normal

    47/81

      - These results can be used to support the (Multivariate) Central

  • 8/9/2019 DISTRIBUSI Multivariate Normal

    48/81

    Because the sample covariance matrix $S$ (or $S_n$) converges to the population covariance matrix $\Sigma$ so quickly (i.e., at relatively small values of $n - p$), we often substitute the sample covariance for the population covariance with little concern for the ramifications, so we have

      $$\bar{X} \ \text{approximately} \sim N_p\left( \mu,\ \tfrac{1}{n}\, S \right)$$

    for n large relative to p.

    This can be restated as

      $$\sqrt{n}\,(\bar{X} - \mu) \ \text{approximately} \sim N_p(0,\ S)$$

    again for n large relative to p.

  • 8/9/2019 DISTRIBUSI Multivariate Normal

    49/81

    One final important result due to the C

  • 8/9/2019 DISTRIBUSI Multivariate Normal

    50/81

    G. Assessing the Assumption of Normality

    There are two general circumstances in multivariate statistics under which the assumption of multivariate normality is crucial:

      - the technique to be used relies directly on the raw observations $X_j$

      - the technique to be used relies directly on the sample mean vector $\bar{X}$ (including those which rely on distances of the form $n(\bar{X} - \mu)'\,S^{-1}\,(\bar{X} - \mu)$)

    In either of these situations, the quality of inferences to be made depends on how closely the true parent population resembles the assumed multivariate normal form!

  • 8/9/2019 DISTRIBUSI Multivariate Normal

    51/81

    Based on the properties of the Multivariate Normal Distribution, we know

      - all linear combinations of the individual normal components are normal

      - the contours of the multivariate normal density are concentric ellipsoids

    These facts suggest investigation of the following questions (in one or two dimensions):

      - Do the marginal distributions of the elements of $X$ appear normal? What about a few linear combinations?

      - Do the bivariate scatterplots appear ellipsoidal?

      - Are there any unusual looking observations (outliers)?

  • 8/9/2019 DISTRIBUSI Multivariate Normal

    52/81

    Tools frequently used for assessing univariate normality include

      - the empirical rule

        $$P(\mu - 1\sigma \le x \le \mu + 1\sigma) \approx 0.68$$
        $$P(\mu - 2\sigma \le x \le \mu + 2\sigma) \approx 0.95$$
        $$P(\mu - 3\sigma \le x \le \mu + 3\sigma) \approx 0.997$$

      - dot plots (for small sample sets) and histograms or stem-and-leaf plots (for larger samples)

      - goodness-of-fit tests such as the Chi-Square GOF Test and the Kolmogorov-Smirnov Test

      - the test developed by Shapiro and Wilk [1965] called the Shapiro-Wilk test

      - Q-Q plots (of the sample quantiles against the expected quantile for each observation given normality)

  • 8/9/2019 DISTRIBUSI Multivariate Normal

    53/81

    Example: suppose we had the following fifteen (ordered) sample observations on some random variable $X$:

    Ordered Observations x(j)

    1.43
    1.62
    2.46
    2.48
    2.97
    4.03
    4.47
    5.76
    6.61
    6.68
    6.79
    7.46
    7.88
    8.92
    9.42

    Do these data support the assertion that they were drawn from a normal parent population?

  • 8/9/2019 DISTRIBUSI Multivariate Normal

    54/81

    In order to assess normality by the empirical rule, we need to compute the generalized distance from the centroid (convert the data to a standard normal random variable). For our data we have

      $\bar{x} = \ldots$

    so the corresponding standard normal values for our data are

    [Table of standardized values not captured in the transcript]

    Nine of the observations (or 60%) lie within one standard deviation of the mean, and all fifteen of the observations lie within two standard deviations of the mean. Does this support the assertion that they were drawn from a normal parent population?

  • 8/9/2019 DISTRIBUSI Multivariate Normal

    55/81

    A simple dot plot could look like this:

    [Figure: dot plot of the fifteen observations on an axis running from -1 to 11]

    This doesn't seem to tell us much (of course, fifteen data points isn't much to go on).

    How about a histogram?

    [Figure: histogram; x-axis: Classes (0-2, 2-4, 4-6, 6-8, 8-10); y-axis: Absolute Frequency (0 to 6)]

    This doesn't seem to tell us much either!

    We could use SAS to calculate the Shapiro-Wilk test statistic

  • 8/9/2019 DISTRIBUSI Multivariate Normal

    56/81

    and corresponding p-value:

    DATA stuff;
    INPUT x;
    LABEL x='Observed Values of X';
    CARDS;
    ...
    ;

  • 8/9/2019 DISTRIBUSI Multivariate Normal

    57/81

      Test                  --Statistic---    -----p Value------
      Shapiro-Wilk          W     0.935851    Pr < W      0.3331
      Kolmogorov-Smirnov    D     0.159493    Pr > D     >0.1500
      Cramer-von Mises      W-Sq  0.058767    Pr > W-Sq  >0.2500
      Anderson-Darling      A-Sq  0.36615     Pr > A-Sq  >0.2500

      Stem Leaf     #
         9 4        1
         8 9        1
         7 59       2
         6 678      3
         5 8        1
         4 05       2
         3 0        1
         2 55       2
         1 46       2

    [Figure: boxplot alongside the stem-and-leaf display]

    [Figure: normal probability plot; vertical axis from 1.5 to 9.5, horizontal axis from -2 to +2]

    Or a Q-Q plot

  • 8/9/2019 DISTRIBUSI Multivariate Normal

    58/81

      - put the observed values in ascending order; call these the $x_{(j)}$

      - calculate the continuity corrected cumulative probability level $(j - 0.5)/n$ for the sample data

      - find the standard normal quantiles (values of the N(0,1) distribution) that have a cumulative probability of level $(j - 0.5)/n$; call these the $q_{(j)}$, i.e., find $z$ such that

        $$P\left[ Z \le q_{(j)} \right] = \int_{-\infty}^{q_{(j)}} \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}\, dz = p_{(j)} = \frac{j - \tfrac{1}{2}}{n}$$

      - plot the pairs $(q_{(j)}, x_{(j)})$. If the points lie on or near a straight line, the observations support the contention that they could have been drawn from a normal parent population.
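The probability levels and standard normal quantiles in these steps can be generated with Python's standard library. This is a sketch, not the software used in the slides (which refer to SAS); the function name is an assumption, while `statistics.NormalDist.inv_cdf` is the standard-library inverse normal CDF.

```python
from statistics import NormalDist

def qq_quantiles(n):
    """Return the levels (j - 0.5)/n and the matching N(0, 1) quantiles q_(j), j = 1..n."""
    std_normal = NormalDist()                       # mean 0, standard deviation 1
    levels = [(j - 0.5) / n for j in range(1, n + 1)]
    quantiles = [std_normal.inv_cdf(p) for p in levels]
    return levels, quantiles

levels, q = qq_quantiles(15)    # n = 15 as in the running example
```

Plotting the ordered observations against `q` then gives the Q-Q plot; the quantiles depend only on n, not on the data.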

  • 8/9/2019 DISTRIBUSI Multivariate Normal

    59/81

    The results of calculations for the Q-Q plot look like this:

    Ordered          Adjusted             Standard Normal
    Observations     Probability Level    Quantiles
    x(j)             (j - 0.5)/n          q(j)

    1.43             0.033                -1.834
    1.62             0.100                -1.282
    2.46             0.167                -0.967
    2.48             0.233                -0.728
    2.97             0.300                -0.524
    4.03             0.367                -0.341
    4.47             0.433                -0.168
    5.76             0.500                 0.000
    6.61             0.567                 0.168
    6.68             0.633                 0.341
    6.79             0.700                 0.524
    7.46             0.767                 0.728
    7.88             0.833                 0.967
    8.92             0.900                 1.282
    9.42             0.967                 1.834

  • 8/9/2019 DISTRIBUSI Multivariate Normal

    60/81

    …and the resulting Q-Q plot looks like this:

    [Figure: Q-Q Plot; x-axis: Standard Normal Quantiles q(j); y-axis: Observed Values x(j)]

    There don't appear to be great departures from the straight line drawn through the points, but it doesn't fit terribly well, either…

  • 8/9/2019 DISTRIBUSI Multivariate Normal

    61/81

  • 8/9/2019 DISTRIBUSI Multivariate Normal

    62/81

    For our previous example, the intermediate calculations are given in the table below:

    x(j) - x̄    (x(j) - x̄)²    q(j) - q̄    (q(j) - q̄)²    (x(j) - x̄)(q(j) - q̄)

    -3.83        14.697         -1.834       3.363          7.031
    -3.65        13.314         -1.282       1.642          4.676
    -2.80         7.855         -0.967       0.935          2.711
    -2.79         7.777         -0.728       0.530          2.030
    -2.30         5.270         -0.524       0.275          1.204
    -1.23         1.514         -0.341       0.116          0.419
    -0.80         0.637         -0.168       0.028          0.134
     0.49         0.244          0.000       0.000          0.000
     1.35         1.810          0.168       0.028          0.226
     1.41         2.001          0.341       0.116          0.482
     1.52         2.315          0.524       0.275          0.798
     2.19         4.808          0.728       0.530          1.596
     2.62         6.860          0.967       0.935          2.534
     3.66        13.387          1.282       1.642          4.689
     4.15        17.235          1.834       3.363          7.614

     0.00        99.724          0.000      13.781         36.143   (column totals)

  • 8/9/2019 DISTRIBUSI Multivariate Normal

    63/81

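    Evaluation of Pearson's correlation coefficient between $q_{(j)}$ and $x_{(j)}$ yields

      $$r_Q = \frac{\sum_{j=1}^{n} (x_{(j)} - \bar{x})(q_{(j)} - \bar{q})}{\sqrt{\sum_{j=1}^{n} (x_{(j)} - \bar{x})^2}\ \sqrt{\sum_{j=1}^{n} (q_{(j)} - \bar{q})^2}} = \frac{36.143}{\sqrt{99.724}\ \sqrt{13.781}} \approx 0.975$$

    The sample size is n = 15, so critical points for the test of normality are 0.9503 at $\alpha$ = 0.10, 0.9389 at $\alpha$ = 0.05, and 0.9126 at $\alpha$ = 0.01. Since $r_Q$ exceeds all three critical points, we do not reject the hypothesis of normality at any of these significance levels.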
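The arithmetic for $r_Q$ can be reproduced from the three column totals of the intermediate-calculations table (99.724, 13.781 and 36.143) and compared with the critical points quoted above. The variable names are illustrative; the test rejects normality when $r_Q$ falls below the critical point.

```python
import math

sum_sq_x = 99.724      # total of (x_(j) - xbar)^2 from the table
sum_sq_q = 13.781      # total of (q_(j) - qbar)^2 from the table
sum_cross = 36.143     # total of (x_(j) - xbar)(q_(j) - qbar) from the table

r_q = sum_cross / math.sqrt(sum_sq_x * sum_sq_q)

# Critical points for n = 15 as quoted in the text, keyed by significance level.
critical = {0.10: 0.9503, 0.05: 0.9389, 0.01: 0.9126}
rejected = {alpha: r_q < point for alpha, point in critical.items()}
```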

  • 8/9/2019 DISTRIBUSI Multivariate Normal

    64/81

    When addressing the issue of multivariate normality, these tools aid in assessment of normality for the univariate marginal distributions. However, we should also consider bivariate marginal distributions (each of which should be normal if the overall joint distribution is multivariate normal).

    The methods most commonly used for assessing bivariate normality are

      - scatter plots

      - Chi-Square plots

  • 8/9/2019 DISTRIBUSI Multivariate Normal

    65/81

    Example: suppose we had the following fifteen (ordered) sample observations on some random variables $X_1$ and $X_2$:

    x_j1      x_j2

    1.43      -0.69
    1.62      -5.00
    2.46      -1.13
    2.48      -5.20
    2.97      -6.39
    4.03       2.87
    4.47      -7.88
    5.76      -3.97
    6.61       2.32
    6.68      -3.24
    6.79      -3.56
    7.46       1.61
    7.88      -1.87
    8.92      -6.60
    9.42      -7.64

    Do these data support the assertion that they were drawn from a bivariate normal parent population?

  • 8/9/2019 DISTRIBUSI Multivariate Normal

    66/81

    The scatter plot of pairs $(x_1, x_2)$ supports the assertion that these data were drawn from a bivariate normal distribution (and that they have little or no correlation).

    [Figure: Scatter Plot of X2 against X1]

  • 8/9/2019 DISTRIBUSI Multivariate Normal

    67/81

    To create a Chi-Square plot, we will need to calculate the squared generalized distance from the centroid for each observation $x_j$:

      $$d_j^2 = (x_j - \bar{x})'\,S^{-1}\,(x_j - \bar{x}), \quad j = 1, \ldots, n$$

    For our bivariate data we have

      $\bar{x} = \ldots$, $S = \ldots$, $S^{-1} = \ldots$
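A minimal sketch of the squared generalized distance computation for bivariate data follows. It uses a hypothetical four-point sample rather than the slide data, and the function name is an assumption. A handy sanity check, which holds for any sample when S uses divisor n - 1, is that the distances sum to (n - 1)p.

```python
def squared_distances(rows):
    """Squared generalized distances d_j^2 = (x_j - xbar)' S^{-1} (x_j - xbar) for 2-d data."""
    n = len(rows)
    m1 = sum(x[0] for x in rows) / n
    m2 = sum(x[1] for x in rows) / n
    # Sample covariance matrix S (divisor n - 1).
    s11 = sum((x[0] - m1) ** 2 for x in rows) / (n - 1)
    s22 = sum((x[1] - m2) ** 2 for x in rows) / (n - 1)
    s12 = sum((x[0] - m1) * (x[1] - m2) for x in rows) / (n - 1)
    det = s11 * s22 - s12 ** 2
    out = []
    for x in rows:
        d1, d2 = x[0] - m1, x[1] - m2
        # Quadratic form with the closed-form inverse of the 2x2 matrix S.
        out.append((s22 * d1 * d1 - 2.0 * s12 * d1 * d2 + s11 * d2 * d2) / det)
    return out

data = [[1.0, 2.0], [3.0, 0.0], [4.0, 5.0], [0.0, 1.0]]   # hypothetical sample
d2 = squared_distances(data)
```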

  • 8/9/2019 DISTRIBUSI Multivariate Normal

    68/81

    …so the squared generalized distances from the centroid are

    x_j1      x_j2      d²_j

    1.43      -0.69     2.400
    1.62      -5.00     2.279
    2.46      -1.13     1.336
    2.48      -5.20     1.548
    2.97      -6.39     1.739
    4.03       2.87     2.976
    4.47      -7.88     2.005
    5.76      -3.97     0.090
    6.61       2.32     2.737
    6.68      -3.24     0.281
    6.79      -3.56     0.333
    7.46       1.61     2.622
    7.88      -1.87     1.138
    8.92      -6.60     2.686
    9.42      -7.64     3.819

    If we order the observations relative to their squared generalized distances:

    x_j1      x_j2      d²_(j)

    5.76      -3.97     0.090
    6.68      -3.24     0.281
    6.79      -3.56     0.333
    7.88      -1.87     1.138
    2.46      -1.13     1.336
    2.48      -5.20     1.548
    2.97      -6.39     1.739
    4.47      -7.88     2.005
    1.62      -5.00     2.279
    1.43      -0.69     2.400
    7.46       1.61     2.622
    8.92      -6.60     2.686
    6.61       2.32     2.737
    4.03       2.87     2.976
    9.42      -7.64     3.819

  • 8/9/2019 DISTRIBUSI Multivariate Normal

    69/81

    We then find the corresponding percentile $q_{c,p}[(j - 0.5)/n]$ of the Chi-Square distribution with p degrees of freedom.

    Now we create a scatterplot of the pairs $\left( d^2_{(j)},\ q_{c,p}[(j - 0.5)/n] \right)$. If these points lie on a straight line, the data support the assertion that they were drawn from a bivariate normal parent population.
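For the bivariate case (p = 2) the chi-square percentiles have a closed form, because the chi-square distribution with 2 degrees of freedom has CDF $1 - e^{-q/2}$, so the percentile at level prob is $-2 \ln(1 - \text{prob})$. The sketch below relies on that fact (it is not taken from the slides) and uses an illustrative function name; other values of p would need a table or a statistics library.

```python
import math

def chi2_quantile_2df(prob):
    """Quantile of the chi-square distribution with 2 degrees of freedom."""
    return -2.0 * math.log(1.0 - prob)

n = 15                                              # sample size of the running example
levels = [(j - 0.5) / n for j in range(1, n + 1)]
q = [chi2_quantile_2df(p) for p in levels]          # percentiles paired with the ordered d^2 values
```

The middle value, at level 0.50, is about 1.386, the same cutoff used later for the "roughly half the distances" check.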

    x_j1      x_j2      d²_(j)    (j - 0.5)/n    q_c,2[(j - 0.5)/n]

    5.76      -3.97     0.090     0.033          0.068
    6.68      -3.24     0.281     0.100          0.211
    6.79      -3.56     0.333     0.167          0.365
    7.88      -1.87     1.138     0.233          0.531
    2.46      -1.13     1.336     0.300          0.713
    2.48      -5.20     1.548     0.367          0.914
    2.97      -6.39     1.739     0.433          1.136
    4.47      -7.88     2.005     0.500          1.386
    1.62      -5.00     2.279     0.567          1.672
    1.43      -0.69     2.400     0.633          2.007
    7.46       1.61     2.622     0.700          2.408
    8.92      -6.60     2.686     0.767          2.911
    6.61       2.32     2.737     0.833          3.584
    4.03       2.87     2.976     0.900          4.605
    9.42      -7.64     3.819     0.967          6.802

    These data don't seem to support the assertion that they were drawn from a bivariate normal parent population…

  • 8/9/2019 DISTRIBUSI Multivariate Normal

    70/81

    [Figure: Chi-Square Plot; x-axis: q_c,2[(j - 0.5)/n]; y-axis: d²_(j); annotation: possible outliers!]

  • 8/9/2019 DISTRIBUSI Multivariate Normal

    71/81

    Some suggest also looking to see if roughly half the squared distances $d_j^2$ are less than or equal to $q_{c,p}(0.50)$ (i.e., lie within the ellipsoid containing 50% of all potential p-dimensional observations).

    For our example, 7 of our fifteen observations (about 46.67% of all observations) are less than $q_{c,p}(0.50) = 1.386$ standardized units from the centroid (i.e., lie within the ellipsoid containing 50% of all potential p-dimensional observations).

    Note that the Chi-Square plot can easily be extended to p > 2 dimensions.

    Note also that some researchers also calculate the correlation between $d^2_{(j)}$ and $q_{c,p}[(j - 0.5)/n]$. For our example this is 0.895.

  • 8/9/2019 DISTRIBUSI Multivariate Normal

    72/81

    H. Outlier Detection

    Detecting outliers (extreme or unusual observations) in p > 2 dimensions is very tricky. Consider the following situation:

    [Figure: 90% confidence ellipsoid for (X1, X2), shown together with the 90% confidence interval for X1 and the 90% confidence interval for X2]

    A strategy for multivariate outlier detection:

  • 8/9/2019 DISTRIBUSI Multivariate Normal

    73/81


  • 8/9/2019 DISTRIBUSI Multivariate Normal

    74/81

    Here are calculated standardized values ($z_{ji}$'s) and squared generalized distances ($d_j^2$'s) for our previous data:

    x_j1      z_j1       x_j2      z_j2       d²_j

    5.76       0.185     -3.97     -0.250     0.090
    6.68       0.530     -3.24     -0.043     0.281
    6.79       0.570     -3.56     -0.131     0.333
    7.88       0.981     -1.87      0.347     1.138
    2.46      -1.050     -1.13      0.556     1.336
    2.48      -1.045     -5.20     -0.599     1.548
    2.97      -0.860     -6.39     -0.936     1.739
    4.47      -0.299     -7.88     -1.359     2.005
    1.62      -1.367     -5.00     -0.541     2.279
    1.43      -1.436     -0.69      0.681     2.400
    7.46       0.822      1.61      1.333     2.622
    8.92       1.371     -6.60     -0.994     2.686
    6.61       0.504      2.32      1.536     2.737
    4.03      -0.461      2.87      1.691     2.976
    9.42       1.556     -7.64     -1.291     3.819

    (Callout on the slide: "This one looks a little unusual in p = 2 dimensions.")

  • 8/9/2019 DISTRIBUSI Multivariate Normal

    75/81

    I. Transformations to Near Normality

    Transformations to make nonnormal data approximately normal are usually suggested by

      - theory

      - the raw data

    Some common transformations include

    Original Scale           Transformed Scale

    Counts y                 $\sqrt{y}$

    Proportions $\hat{p}$    $\mathrm{logit}(\hat{p}) = \frac{1}{2} \log\left( \frac{\hat{p}}{1 - \hat{p}} \right)$

    Correlations r           Fisher's $z(r) = \frac{1}{2} \log\left( \frac{1 + r}{1 - r} \right)$
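The three transformations in the table translate directly into code. This sketch follows the table's definitions, including the factor of 1/2 in the logit; the function names are illustrative assumptions, and inputs must lie in the valid ranges (y >= 0, 0 < p_hat < 1, -1 < r < 1).

```python
import math

def sqrt_count(y):
    """Square-root transformation for counts."""
    return math.sqrt(y)

def half_logit(p_hat):
    """logit(p_hat) = (1/2) log(p_hat / (1 - p_hat)), as defined in the table."""
    return 0.5 * math.log(p_hat / (1.0 - p_hat))

def fisher_z(r):
    """Fisher's z(r) = (1/2) log((1 + r) / (1 - r)) for a correlation r."""
    return 0.5 * math.log((1.0 + r) / (1.0 - r))

# Hypothetical inputs, one per row of the table.
examples = (sqrt_count(9.0), half_logit(0.5), fisher_z(0.5))
```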

  • 8/9/2019 DISTRIBUSI Multivariate Normal

    76/81

    For continuous random variables, an appropriate transformation can usually be found among the family of power transformations. Box and Cox [1964] suggest an approach to finding an appropriate transformation from this family.

    Box and Cox consider the slightly modified family of power transformations

      $$x^{(\lambda)} = \begin{cases} \dfrac{x^{\lambda} - 1}{\lambda} & \lambda \ne 0 \\ \ln x & \lambda = 0 \end{cases}$$

  • 8/9/2019 DISTRIBUSI Multivariate Normal

    77/81

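    For observations $x_1, \ldots, x_n$, the Box-Cox choice of appropriate power $\lambda$ for the normalizing transformation is that which maximizes

      $$\ell(\lambda) = -\frac{n}{2} \ln\left[ \frac{1}{n} \sum_{j=1}^{n} \left( x_j^{(\lambda)} - \overline{x^{(\lambda)}} \right)^2 \right] + (\lambda - 1) \sum_{j=1}^{n} \ln x_j$$

    where

      $$x_j^{(\lambda)} = \begin{cases} \dfrac{x_j^{\lambda} - 1}{\lambda} & \lambda \ne 0 \\ \ln x_j & \lambda = 0 \end{cases}$$

    and

      $$\overline{x^{(\lambda)}} = \frac{1}{n} \sum_{j=1}^{n} x_j^{(\lambda)} = \frac{1}{n} \sum_{j=1}^{n} \frac{x_j^{\lambda} - 1}{\lambda}$$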
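The Box-Cox criterion can be evaluated over a grid of candidate powers. The following is a minimal sketch under stated assumptions: the data are hypothetical positive values, the grid runs over [-2, 2] in steps of 0.1, and the function names are illustrative. It implements $\ell(\lambda)$ exactly as written above.

```python
import math

def box_cox(x, lam):
    """Box-Cox power transformation of a single positive value x."""
    return math.log(x) if lam == 0 else (x ** lam - 1.0) / lam

def box_cox_loglik(xs, lam):
    """The criterion l(lambda) for positive observations xs."""
    n = len(xs)
    t = [box_cox(x, lam) for x in xs]
    mean_t = sum(t) / n
    var_t = sum((v - mean_t) ** 2 for v in t) / n        # divisor n, as in the formula
    return -0.5 * n * math.log(var_t) + (lam - 1.0) * sum(math.log(x) for x in xs)

xs = [0.5, 1.0, 1.5, 2.0, 4.0, 8.0]                      # hypothetical right-skewed data
grid = [k / 10 for k in range(-20, 21)]                  # candidate powers from -2.0 to 2.0
best_lam = max(grid, key=lambda lam: box_cox_loglik(xs, lam))
```

In practice one would plot `box_cox_loglik` over the grid and pick a convenient power near the maximizer rather than the raw grid optimum.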

  • 8/9/2019 DISTRIBUSI Multivariate Normal

    78/81

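    We then evaluate $\ell(\lambda)$ at many points on a short interval (say [-1, 1] or [-2, 2]), plot the pairs $(\lambda, \ell(\lambda))$ and look for a maximum point.

    Often a logical value of $\lambda$ near $\hat{\lambda}$ is chosen.

    [Figure: $\ell(\lambda)$ plotted against $\lambda$, with the maximum $\ell(\hat{\lambda})$ attained at $\hat{\lambda}$]

    Unfortunately, $\ell(\lambda)$ is very volatile as $\lambda$ changes (which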

  • 8/9/2019 DISTRIBUSI Multivariate Normal

    79/81

    creates some other analytic problems to overcome). Thus we consider another transformation to avoid this additional problem:

      $$x_j^{(\lambda)*} = \begin{cases} \dfrac{x_j^{\lambda} - 1}{\lambda\, \dot{x}^{\lambda - 1}} & \text{for } \lambda \ne 0 \\ \dot{x}\, \ln x_j & \text{for } \lambda = 0 \end{cases}$$

    where

      $$\dot{x} = \left( \prod_{i=1}^{n} x_i \right)^{1/n}$$

    is the geometric mean of the responses and is frequently calculated as the antilog of

      $$\ln \dot{x} = n^{-1} \sum_{i=1}^{n} \ln x_i$$

  • 8/9/2019 DISTRIBUSI Multivariate Normal

    80/81

    The $\lambda$ that results in minimum variance of this transformed variable also maximizes our previous criterion

      $$\ell(\lambda) = -\frac{n}{2} \ln\left[ \frac{1}{n} \sum_{j=1}^{n} \left( x_j^{(\lambda)} - \overline{x^{(\lambda)}} \right)^2 \right] + (\lambda - 1) \sum_{j=1}^{n} \ln x_j$$

    and $\dot{x}^{\lambda - 1}$ is the (1/n)th power, that is, the nth root, of the appropriate Jacobian of the transformation (which converts the responses $x_j$ into $x_j^{(\lambda)}$'s).

    From this point forward proceed substituting the $x_j^{(\lambda)}$'s for the $x_j$'s in the previous analysis.

  • 8/9/2019 DISTRIBUSI Multivariate Normal

    81/81

    Note that

      - the value of $\lambda$ generated by the Box-Cox transformation is only optimal in a mathematical sense; use something close that has some meaning.

      - an approximate confidence interval for $\lambda$ can be found

      - other means for estimating $\lambda$ exist

      - if we are dealing with a response variable, transformations are often used to 'stabilize' the variance

      - for a p-dimensional sample, transformations are considered independently for each of the p variables

      - while the Box-Cox methodology may help convert each marginal distribution to near normality, it does not guarantee the resulting transformed set of p variables will have a multivariate normal distribution.

