
Sufficient statistics. The Poisson and the exponential can be summarized by $(n, \bar{y})$.

Date post: 11-Jan-2016
Upload: bluma
Page 1

Sufficient statistics.

The Poisson and the exponential can be summarized by $(n, \bar{y})$.

So too can the normal with known variance.

Consider a statistic $S(Y)$.

Suppose that the conditional distribution of $Y$ given $S$ does not depend on $\theta$; then $S$ is a sufficient statistic for $\theta$ based on $Y$.

This occurs iff the density of $Y$ factors into a function of $s(y)$ and $\theta$ and a function of $y$ that doesn't depend on $\theta$.

More Chapter 4

Page 2

Example. Exponential

$Y \sim \mathrm{IExp}(\theta)$ (independent exponentials)

$E(Y) = \theta$, $\mathrm{Var}(Y) = \theta^2$

Data $y_1, \dots, y_n$

$L(\theta) = \prod_j \theta^{-1} \exp(-y_j/\theta)$

$l(\theta) = -n \log\theta - \sum_j y_j/\theta$

$\bar{y} = \sum_j y_j / n$ is sufficient
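The sufficiency claim can be checked numerically. The sketch below is illustrative only: the two samples are hypothetical (not from the slides) and were chosen to share the same $n$ and $\bar{y}$, so their exponential log-likelihoods should coincide for every $\theta$.

```python
import math

def exp_loglik(theta, ys):
    """Log-likelihood of independent exponential data with mean theta."""
    return sum(-math.log(theta) - y / theta for y in ys)

def exp_loglik_summary(theta, n, ybar):
    """The same log-likelihood computed from the summary (n, ybar) alone."""
    return -n * math.log(theta) - n * ybar / theta

# Two hypothetical samples with the same size (4) and the same mean (3.0).
sample_a = [1.0, 2.0, 3.0, 6.0]
sample_b = [2.5, 2.5, 3.5, 3.5]

for theta in (1.0, 3.0, 10.0):
    # All three numbers on each line agree: the data enter only via (n, ybar).
    print(theta,
          exp_loglik(theta, sample_a),
          exp_loglik(theta, sample_b),
          exp_loglik_summary(theta, 4, 3.0))
```

Any other pair of samples with equal size and mean would serve equally well.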

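Page 3

\[ l(\theta) = -n\log\theta - n\bar{y}/\theta \]

\[ U(\theta) = \partial l/\partial\theta = -n/\theta + n\bar{y}/\theta^2 \]

Likelihood equation: $U(\hat\theta) = 0$

m.l.e. $\hat\theta = \bar{y}$

\[ \partial^2 l(\theta)/\partial\theta^2 = n/\theta^2 - 2n\bar{y}/\theta^3 \]

\[ \partial^2 l(\hat\theta)/\partial\theta^2 = -n/\bar{y}^2 < 0 \]

so $\hat\theta$ is a maximum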

Page 4

Observed Information

\[ J(\theta) = -\partial^2 l/\partial\theta^2 = -n/\theta^2 + 2n\bar{y}/\theta^3 \]

Fisher Information

\[ I(\theta_0) = E\{J(\theta_0)\} = n/\theta_0^2 \]

\[ E\{U(\theta_0)\} = E\{-n/\theta_0 + n\bar{Y}/\theta_0^2\} = 0 \]

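Page 5

\[ \mathrm{Var}\{U(\theta_0)\} = I(\theta_0) = n/\theta_0^2 \]

\[ E\{J(\theta_0)\} = n/\theta_0^2 \]

\[ J(\theta_0)/I(\theta_0) = (-n/\theta_0^2 + 2n\bar{y}/\theta_0^3)(n/\theta_0^2)^{-1} = 2\bar{y}/\theta_0 - 1 \to 1 \quad \text{in prob} \]

\[ I(\theta_0)^{-1/2} U(\theta_0) = n^{1/2}(\bar{y} - \theta_0)/\theta_0 \to Z \quad \text{in dist} \]

\[ \hat\theta \;\dot\sim\; N(\theta_0, I(\theta_0)^{-1}) = N(\theta_0, \theta_0^2/n) \]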

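Page 6

Approximate $100(1-2\alpha)\%$ CI for $\theta_0$:

\[ \hat\theta \pm z_\alpha J(\hat\theta)^{-1/2} \]

Could insert $I(\hat\theta)$.

Here $I(\hat\theta) = J(\hat\theta) = n/\bar{y}^2$.

Example. spring data

$\bar{y} = 168.3$ kcycles, $\hat\theta = 168.3$

$I(\hat\theta) = J(\hat\theta) = n/\bar{y}^2 = 10/168.3^2 = .000353$

$J(\hat\theta)^{1/2} = .0188$, so the interval is $168.30 \pm 1.96/.0188 = (64.0, 272.6)$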
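As a quick check of the arithmetic, the normal-approximation interval can be computed from the summary $(n, \bar{y}) = (10, 168.3)$ quoted for the spring data. This is a minimal sketch; the function name `exp_wald_ci` is hypothetical, and the multiplier 1.96 corresponds to a 95% interval.

```python
import math

def exp_wald_ci(n, ybar, z=1.96):
    """Interval theta_hat +/- z * J(theta_hat)^(-1/2) for the exponential mean."""
    theta_hat = ybar                 # m.l.e. of the exponential mean
    info = n / ybar ** 2             # J(theta_hat) = I(theta_hat) = n / ybar^2
    half_width = z / math.sqrt(info)
    return theta_hat - half_width, theta_hat + half_width

lower, upper = exp_wald_ci(10, 168.3)
print(round(lower, 1), round(upper, 1))  # roughly 64 and 273 kcycles
```

The half-width is $z\,\bar{y}/\sqrt{n}$, so the interval narrows like $n^{-1/2}$.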

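Page 7

Weibull.

\[ f(y;\theta,\alpha) = \frac{\alpha}{\theta}\left(\frac{y}{\theta}\right)^{\alpha-1} \exp\{-(y/\theta)^\alpha\}, \qquad y, \theta, \alpha > 0 \]

Exponential if $\alpha = 1$

\[ L(\theta,\alpha) = \prod_{j=1}^n f(y_j;\theta,\alpha) = \prod_{j=1}^n \frac{\alpha}{\theta}\left(\frac{y_j}{\theta}\right)^{\alpha-1} \exp\{-(y_j/\theta)^\alpha\} \]

\[ l(\theta,\alpha) = n\log\alpha - n\alpha\log\theta + (\alpha-1)\sum_j \log y_j - \sum_j (y_j/\theta)^\alpha \]

\[ U(\theta,\alpha) = \begin{pmatrix} -n\alpha/\theta + (\alpha/\theta)\sum_j (y_j/\theta)^\alpha \\ n/\alpha + \sum_j \log(y_j/\theta) - \sum_j (y_j/\theta)^\alpha \log(y_j/\theta) \end{pmatrix} \]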

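Page 8

Note.

For fixed $\alpha$: $\hat\theta_\alpha = (n^{-1}\sum_j y_j^\alpha)^{1/\alpha}$

\[ l_{\text{profile}}(\alpha) = \max_\theta l(\theta,\alpha) = l(\hat\theta_\alpha, \alpha) \]

1-D max problem

Expected information

\[ I(\theta,\alpha) = n \begin{pmatrix} (\alpha/\theta)^2 & -\psi(2)/\theta \\ -\psi(2)/\theta & \{1 + \psi'(2) + \psi(2)^2\}/\alpha^2 \end{pmatrix}, \qquad \psi(z) = d\log\Gamma(z)/dz \]

Want $I$ large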
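The reduction to a one-dimensional maximization can be sketched as follows. The sample `ys` is hypothetical, the search bounds are arbitrary, and golden-section search is one convenient choice here (not necessarily what the lecture used); it assumes the profile log-likelihood is unimodal on the search interval.

```python
import math

def theta_hat(alpha, ys):
    """Maximizer of l(theta, alpha) over theta for fixed alpha."""
    return (sum(y ** alpha for y in ys) / len(ys)) ** (1.0 / alpha)

def weibull_loglik(theta, alpha, ys):
    """Weibull log-likelihood with scale theta and shape alpha."""
    n = len(ys)
    return (n * math.log(alpha) - n * alpha * math.log(theta)
            + (alpha - 1.0) * sum(math.log(y) for y in ys)
            - sum((y / theta) ** alpha for y in ys))

def profile_loglik(alpha, ys):
    """Profile log-likelihood: l(theta_hat(alpha), alpha)."""
    return weibull_loglik(theta_hat(alpha, ys), alpha, ys)

def maximise_profile(ys, lo=0.05, hi=50.0, tol=1e-8):
    """Golden-section search for the alpha maximizing the profile (assumed unimodal)."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c, d = b - g * (b - a), a + g * (b - a)
        if profile_loglik(c, ys) > profile_loglik(d, ys):
            b = d   # maximum lies in [a, d]
        else:
            a = c   # maximum lies in [c, b]
    alpha = 0.5 * (a + b)
    return theta_hat(alpha, ys), alpha

ys = [0.8, 1.1, 1.9, 2.4, 3.0, 3.3, 4.1]   # hypothetical data
theta_mle, alpha_mle = maximise_profile(ys)
print(theta_mle, alpha_mle)
```

Only $\alpha$ is searched numerically; $\theta$ is recovered in closed form at the end.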

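Page 9

Gamma.

\[ f(y;\lambda,\kappa) = \frac{\lambda^\kappa y^{\kappa-1}}{\Gamma(\kappa)} \exp(-\lambda y), \qquad y, \lambda, \kappa > 0 \]

\[ L(\lambda,\kappa) = \prod_{j=1}^n \frac{\lambda^\kappa y_j^{\kappa-1}}{\Gamma(\kappa)} \exp(-\lambda y_j) \]

\[ l(\lambda,\kappa) = n\kappa\log\lambda - n\log\Gamma(\kappa) + (\kappa-1)\sum_j \log y_j - \lambda\sum_j y_j \]

$S = (\sum_j y_j, \sum_j \log y_j)$ sufficient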

Page 10

Example. Bernoulli

$\Pr\{Y = 1\} = 1 - \Pr\{Y = 0\} = \pi$, $0 < \pi < 1$

$L(\pi) = \prod_i \pi^{y_i} (1 - \pi)^{1-y_i}$

$= \pi^r (1 - \pi)^{n-r}$

$l(\pi) = r\log\pi + (n-r)\log(1-\pi)$

$r = \sum_j y_j$

$R = \sum_j Y_j$ is sufficient for $\pi$, as is $R/n$

$L(\pi)$ factors into a function of $r$ and a constant

Page 11

Score vector

$[\sum_j y_j/\pi - (n - \sum_j y_j)/(1-\pi)]$

Observed information

$[\sum_j y_j/\pi^2 + (n - \sum_j y_j)/(1-\pi)^2]$

M.l.e.

$\hat\pi = \sum_j y_j / n$

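Page 12

Cauchy.

$Y \sim \mathrm{ICau}(\theta)$ (independent Cauchy)

$f(y;\theta) = 1/\{\pi(1+(y-\theta)^2)\}$

$E|Y| = \infty$, $\mathrm{Var}(Y) = \infty$

$L(\theta) = \prod_j 1/\{\pi(1+(y_j-\theta)^2)\}$

Many local maxima

$l(\theta) = -\sum_j \log(1+(y_j-\theta)^2)$ (up to a constant)

$J(\theta) = 2\sum_j (1-(y_j-\theta)^2)/(1+(y_j-\theta)^2)^2$, $I(\theta) = n/2$

$y_{(1)}, \dots, y_{(n)}$ is sufficient

$Z_J = J(\hat\theta)^{1/2}(\hat\theta - \theta_0)$, $Z_I = I(\hat\theta)^{1/2}(\hat\theta - \theta_0)$

$Z_J$ is closer to N(0,1)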
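The "many local maxima" remark is easy to see on a grid. The sketch below uses a hypothetical, deliberately spread-out sample and a simple grid scan; both are illustrative assumptions rather than anything from the lecture.

```python
import math

def cauchy_loglik(theta, ys):
    """Cauchy log-likelihood, dropping the constant -n*log(pi)."""
    return -sum(math.log(1.0 + (y - theta) ** 2) for y in ys)

def local_maxima(ys, lo, hi, steps=4000):
    """Grid points whose log-likelihood exceeds both neighbours."""
    grid = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    vals = [cauchy_loglik(t, ys) for t in grid]
    return [grid[i] for i in range(1, steps)
            if vals[i] > vals[i - 1] and vals[i] > vals[i + 1]]

ys = [-20.0, -1.0, 0.0, 1.5, 30.0]        # hypothetical data with two outliers
peaks = local_maxima(ys, -40.0, 40.0)
best = max(peaks, key=lambda t: cauchy_loglik(t, ys))
print(peaks)   # more than one local maximum
print(best)    # the global maximum sits near the central cluster
```

An iterative maximizer started near an outlying observation can stall at a minor peak, which is why a grid scan or several starting values is prudent for this model.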

Page 13

Page 14

Uniform.

$f(u;\theta) = 1/\theta$, $0 < u < \theta$

= 0 otherwise

$L(\theta) = 1/\theta^n$, $0 < y_1, \dots, y_n < \theta$

= 0 otherwise

$\hat\theta = \max(y_j) = y_{(n)}$, where $y_{(1)} \le \dots \le y_{(n)}$ are the ordered values

$dl(\hat\theta)/d\theta = -n/\hat\theta \ne 0$

$-d^2 l(\hat\theta)/d\theta^2 = -n/\hat\theta^2 < 0$

Page 15

$l(\theta)$ becomes increasingly spiky

$E\,u(\theta) = -\theta^{-1}$, $i(\theta) = -\theta^{-2}$

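\[ \Pr\{\hat\theta \le a\} = \begin{cases} 0, & a < 0 \\ (a/\theta)^n, & 0 \le a \le \theta \\ 1, & a > \theta \end{cases} \]

$n(\theta - \hat\theta) \to$ Exponential (mean $\theta$) in distribution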
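The exact distribution of the sample maximum can be checked by simulation. This is a rough sketch: the values of `theta`, `n`, `a`, the number of replications and the seed are all arbitrary illustrative choices.

```python
import random

def simulate_max_cdf(theta, n, a, reps=20000, seed=0):
    """Monte Carlo estimate of Pr{max(Y_1..Y_n) <= a} for Y_j ~ Uniform(0, theta)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        theta_hat = max(rng.uniform(0.0, theta) for _ in range(n))
        if theta_hat <= a:
            hits += 1
    return hits / reps

theta, n, a = 2.0, 5, 1.8
empirical = simulate_max_cdf(theta, n, a)
exact = (a / theta) ** n          # (a/theta)^n for 0 <= a <= theta
print(empirical, exact)           # the two should agree to about two decimals
```

Because $\hat\theta$ never exceeds $\theta$, its error is one-sided and shrinks at rate $1/n$ rather than $1/\sqrt{n}$.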

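Page 16

Logistic regression. Challenger data

Independent binomials $R_j \sim \mathrm{Bin}(m_j, \pi_j)$

\[ \pi_j = \exp\{\beta_0 + \beta_1 x_{1j}\}/(1 + \exp\{\beta_0 + \beta_1 x_{1j}\}) \]

\[ L(\beta_0,\beta_1) = \Pr(R = r; \beta_0, \beta_1) = \prod_j \frac{m_j!}{r_j!(m_j-r_j)!}\, \pi_j^{r_j}(1-\pi_j)^{m_j-r_j} = \prod_j \frac{m_j!}{r_j!(m_j-r_j)!}\, \frac{\exp\{r_j(\beta_0+\beta_1 x_{1j})\}}{(1+\exp\{\beta_0+\beta_1 x_{1j}\})^{m_j}} \]

Sufficient statistic

\[ S = \left(\sum_j R_j,\ \sum_j R_j x_{1j}\right) \]

Confidence region

\[ \{\beta : (\hat\beta - \beta)^T J(\hat\beta_0, \hat\beta_1)(\hat\beta - \beta) \le c_2(1-2\alpha)\} \]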

Page 17

Page 18

Likelihood ratio.

Model includes $\theta$, $\dim(\theta) = p$

true (unknown) value $\theta_0$

Likelihood ratio statistic

\[ W(\theta_0) = 2\{l(\hat\theta) - l(\theta_0)\} \]

\[ W(\theta_0) \to \chi^2_p \quad \text{in distribution as } I(\theta_0) \to \infty \]

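Page 19

Justification.

Multinormal result

If $Y \sim N_p(\mu, \Sigma)$ then $(Y-\mu)^T \Sigma^{-1} (Y-\mu) \sim \chi^2_p$

\[ W(\theta_0) = 2\{l(\hat\theta) - l(\theta_0)\} \]

\[ \approx 2\{l(\hat\theta) - l(\hat\theta) - (\theta_0-\hat\theta)^T l'(\hat\theta) - \tfrac{1}{2}(\theta_0-\hat\theta)^T l''(\hat\theta)(\theta_0-\hat\theta)\} \]

\[ = (\hat\theta-\theta_0)^T J(\hat\theta)(\hat\theta-\theta_0) \]

\[ \approx (\hat\theta-\theta_0)^T I(\theta_0)(\hat\theta-\theta_0) \]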

Page 20

Uses.

$\Pr[W(\theta_0) \le c_p(1-2\alpha)] \approx 1-2\alpha$, where $c_p(\cdot)$ is the $\chi^2_p$ quantile

Approx $100(1-2\alpha)\%$ confidence region

\[ \{\theta : W(\theta) \le c_p(1-2\alpha)\} = \{\theta : l(\theta) \ge l(\hat\theta) - \tfrac{1}{2} c_p(1-2\alpha)\} \]

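Page 21

Example. exponential

\[ l(\theta) = -n\log\theta - n\bar{y}/\theta \]

\[ l(\hat\theta) = -n\log\bar{y} - n \]

\[ W(\theta_0) = 2\{l(\hat\theta) - l(\theta_0)\} = 2n\{\bar{y}/\theta_0 - 1 - \log(\bar{y}/\theta_0)\} \]

$p = 1$, $c_1(0.95) = 3.84$

\[ \{\theta : 2n\{\bar{y}/\theta - 1 - \log(\bar{y}/\theta)\} \le 3.84\} \]

Spring data: $96 < \theta < 335$

vs. asymp normal approx $64 < \theta < 273$ kcycles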
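The likelihood-ratio interval has no closed form, but its endpoints are the two roots of $W(\theta) = 3.84$ on either side of $\hat\theta = \bar{y}$. A minimal sketch using bisection with the spring-data summary $n = 10$, $\bar{y} = 168.3$; the bracketing values 1.0 and 5000.0 and the helper names are arbitrary assumptions.

```python
import math

def lr_stat(theta, n, ybar):
    """W(theta) = 2n{ybar/theta - 1 - log(ybar/theta)} for the exponential model."""
    r = ybar / theta
    return 2.0 * n * (r - 1.0 - math.log(r))

def bisect(f, a, b, tol=1e-8):
    """Root of f in [a, b], assuming f(a) and f(b) have opposite signs."""
    fa = f(a)
    while b - a > tol:
        m = 0.5 * (a + b)
        fm = f(m)
        if (fa < 0) == (fm < 0):
            a, fa = m, fm
        else:
            b = m
    return 0.5 * (a + b)

n, ybar, crit = 10, 168.3, 3.84

def excess(theta):
    return lr_stat(theta, n, ybar) - crit

lower = bisect(excess, 1.0, ybar)      # W falls to 0 as theta rises to ybar
upper = bisect(excess, ybar, 5000.0)   # W rises again beyond ybar
print(round(lower, 1), round(upper, 1))
```

The endpoints come out close to the 96 and 335 kcycles quoted for the spring data, and the interval is visibly asymmetric about $\hat\theta$, unlike the normal-approximation interval.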

Page 22

Prob-value/P-value. See (7.28)

Choose T whose large values cast doubt on $H_0$

$\Pr_0(T \ge t_{\text{obs}})$

Example. Spring data

Exponential, $E(Y) = \theta$

$H_0$: $\theta = 100$?

$\bar{y} = \hat\theta = 168.3$, $n = 10$

$I(\hat\theta) = J(\hat\theta) = n/\bar{y}^2$

\[ W(\theta_0) = 2n\{\bar{y}/\theta_0 - 1 - \log(\bar{y}/\theta_0)\} \]

$W(100) = 3.248$

P-value $= \Pr_0(\chi^2_1 > 3.248) = \Pr(|Z| > 1.802) = 2(.0358) = .071$
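For one degree of freedom the chi-squared tail probability reduces to a normal tail, $\Pr(\chi^2_1 > w) = \Pr(|Z| > \sqrt{w})$, which the standard library's `math.erfc` evaluates directly. A small sketch reproducing the spring-data test of $\theta_0 = 100$; the function names are hypothetical.

```python
import math

def chi2_1_tail(w):
    """Pr(chi^2_1 > w) = Pr(|Z| > sqrt(w)) = erfc(sqrt(w / 2)) for Z ~ N(0, 1)."""
    return math.erfc(math.sqrt(w / 2.0))

def lr_stat_exponential(theta0, n, ybar):
    """Likelihood ratio statistic W(theta0) for the exponential model."""
    r = ybar / theta0
    return 2.0 * n * (r - 1.0 - math.log(r))

w = lr_stat_exponential(100.0, 10, 168.3)
p_value = chi2_1_tail(w)
print(round(w, 3), round(math.sqrt(w), 3), round(p_value, 3))
```

The printed values should be close to $W = 3.248$, $|z| = 1.802$ and a P-value of about .071.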

Page 23

Nesting

$\psi$: p by 1 parameter of interest

$\lambda$: q by 1 nuisance parameter

Model with params $(\psi_0, \lambda)$ nested within $(\psi, \lambda)$

Second model reduces to first when $\psi = \psi_0$

Note.

\[ l(\hat\psi, \hat\lambda) \ge l(\psi_0, \hat\lambda_0) \]

Page 24

Example. Weibull

params $(\theta, \alpha)$

exponential when $\alpha = 1$

How to examine $H_0$: $\alpha = 1$?

\[ W_p(\psi_0) = 2[l(\hat\psi, \hat\lambda) - l(\psi_0, \hat\lambda_0)] \to \chi^2_p \quad \text{in distribution}, \quad p = 1 \]

Page 25

Spring failure times. Weibull

$(\hat\theta, \hat\alpha) = (181, 6)$

$L(181, 6) = $ 6.227E-22, $l(181, 6) = -48.75$

$(\hat\theta_1, 1) = (168, 1)$

$L(168, 1) = $ 2.2749E-27, $l(168, 1) = -61.26$

$2[l(181, 6) - l(168, 1)] = 25.02$

P-value $= P(|Z| > 5.00) = 2$(2.867E-07) $=$ 5.73E-07

Page 26

Challenger data. Logistic regression

temperature $x_1$, pressure $x_2$

$\pi(\beta_0, \beta_1, \beta_2) = \exp\{\eta\}/(1+\exp\{\eta\})$

$\eta = \beta_0 + \beta_1 x_1 + \beta_2 x_2$ linear predictor

loglike $l(\beta_0, \beta_1, \beta_2) =$

$\beta_0 \sum_j r_j + \beta_1 \sum_j r_j x_{1j} + \beta_2 \sum_j r_j x_{2j} - m \sum_j \log(1+\exp\{\eta_j\})$

Does pressure matter?

$\max_{\beta_0,\beta_1,\beta_2} l(\beta_0, \beta_1, \beta_2) = -15.05$

$\max_{\beta_0,\beta_1} l(\beta_0, \beta_1, 0) = -15.82$

$W = 2 \times .77 = 1.54$

P-value: $\Pr_0(\chi^2_1 > 1.54) = \Pr(|Z| > 1.24) = 2(.107) = .214$

Page 27

Model fit.

Are labor times Weibull?

Nest its model in a more general one

Generalized gamma.

\[ f(y;\lambda,\alpha,\kappa) = \frac{\alpha\lambda^\kappa y^{\alpha\kappa-1}}{\Gamma(\kappa)} \exp(-\lambda y^\alpha), \qquad y, \lambda, \alpha, \kappa > 0 \]

Gamma for $\alpha = 1$

Weibull for $\kappa = 1$

Exponential for $\alpha = \kappa = 1$

Page 28

Likelihood results.

max log likelihood:

generalized gamma -250.65

gamma -251.12

Weibull -251.17

gamma vs. generalized gamma

2 log like diff:

2(-250.65+251.12) = .94

P-value $\Pr_0(\chi^2_1 > .94)$

= Pr(|Z| > .969)

= 2(.166) = .332

Page 29

Chi-squared statistics. Pearson's chi-squared

categories 1,...,k

count of cases in category i: $Y_i$

Pr(case in i) = $\pi_i$, $0 < \pi_i < 1$, $\sum_1^k \pi_i = 1$

$E(Y_i) = n\pi_i$

$\mathrm{var}(Y_i) = \pi_i(1 - \pi_i)n$

$\mathrm{cov}(Y_i, Y_j) = -\pi_i\pi_j n$, $i \ne j$

E.g. k=2 case: $\mathrm{cov}(Y, n-Y) = -\mathrm{var}(Y) = -n\pi_1\pi_2$

Parameter space $\{(\pi_1, \dots, \pi_k) : \sum_1^k \pi_i = 1,\ 0 < \pi_1, \dots, \pi_k < 1\}$

dimension k-1

Page 30

Reduced dimension possible?

model $\pi_i(\theta)$, $\dim(\theta) = p$

log like general model:

$\sum_1^{k-1} y_i \log\pi_i + y_k \log[1 - \pi_1 - \dots - \pi_{k-1}]$, $\sum_1^k y_i = n$

$\hat\pi_i = Y_i/n$

log like restricted model:

$l(\theta) = \sum_1^{k-1} y_i \log\pi_i(\theta) + y_k \log[1 - \pi_1(\theta) - \dots - \pi_{k-1}(\theta)]$

Page 31

likelihood ratio statistic:

\[ W = 2\sum_{i=1}^k y_i \log\{\hat\pi_i/\pi_i(\hat\theta)\} \sim \chi^2_{k-1-p} \]

if restricted model true

The statistic is sometimes written

$W = 2\sum O_i \log(O_i/E_i)$

$\approx \sum (O_i - E_i)^2/E_i$

where $O_i = y_i$, $E_i = n\pi_i(\hat\theta)$

Page 32

Pearson's chi-squared.

\[ P = \sum_{i=1}^k \{y_i - n\pi_i(\hat\theta)\}^2/\{n\pi_i(\hat\theta)\} \sim \chi^2_{k-1-p} \]

recommendation: $n\pi_i(\hat\theta) \ge 5$

Page 33

Example. Birth data. Poisson?

Daily arrivals, $n = 92$, $\hat\theta = 12.9$

Split into k=13 categories

[0,7.5), [7.5,8.5),...[18.5,24] hours

O(bserved) 6 3 3 8 ...

E(xpected) 5.23 4.37 6.26 8.08 ...

P = 4.39

P-value $\Pr(\chi^2_{11} > 4.39) = .96$
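The expected counts follow from the fitted Poisson probabilities of each cell. The sketch below assumes the 13 cells are the counts {0,...,7}, {8}, ..., {18} and {19 and above}, with the last cell treated as an open upper tail; that reading of the category boundaries is an assumption. Only the first four observed counts survive in the transcript, so the Pearson function is shown but not applied to the full data.

```python
import math

def poisson_pmf(k, mu):
    """Pr(Y = k) for Y ~ Poisson(mu), computed on the log scale."""
    return math.exp(-mu + k * math.log(mu) - math.lgamma(k + 1))

def expected_counts(n, mu, first_upper, last_lower):
    """Expected cell counts for cells {0..first_upper}, singletons, {last_lower, ...}."""
    probs = [sum(poisson_pmf(k, mu) for k in range(first_upper + 1))]
    probs += [poisson_pmf(k, mu) for k in range(first_upper + 1, last_lower)]
    probs.append(1.0 - sum(probs))          # open upper tail
    return [n * p for p in probs]

def pearson_statistic(observed, expected):
    """P = sum (O_i - E_i)^2 / E_i over all cells."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

E = expected_counts(92, 12.9, 7, 19)
print(len(E), [round(e, 2) for e in E[:4]])
```

The first four expected counts land within rounding of the 5.23, 4.37, 6.26, 8.08 shown on the slide.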

Page 34

Two way contingency table.

r rows and c columns

n individuals

Blood groups A, B, AB, O

A, B antigens - substance causing body to produce antibodies

group  count  model I                 model II
A      179    $\alpha(1-\beta)$       $\lambda_A^2 + 2\lambda_A\lambda_O$
B      35     $(1-\alpha)\beta$       $\lambda_B^2 + 2\lambda_B\lambda_O$
AB     6      $\alpha\beta$           $2\lambda_A\lambda_B$
O      202    $(1-\alpha)(1-\beta)$   $\lambda_O^2$

$\lambda_O = 1 - \lambda_A - \lambda_B$

Page 35

Page 36

Question. Rows and columns independent?

\[ W = 2\sum_{i,j} y_{ij} \log\{n y_{ij}/(y_{i\cdot}\, y_{\cdot j})\} \]

with $y_{i\cdot} = \sum_j y_{ij}$

$\sim \chi^2_{k-1-p} = \chi^2_{(r-1)(c-1)}$

with $k = rc$, $p = (r-1)+(c-1)$

\[ P = \sum_{i,j} (y_{ij} - y_{i\cdot}\, y_{\cdot j}/n)^2/(y_{i\cdot}\, y_{\cdot j}/n) \]

$\sim \chi^2_{(r-1)(c-1)}$
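Both statistics are straightforward to compute for any $r \times c$ table. As an illustration, the sketch below arranges the four blood-group counts from this deck (A 179, B 35, AB 6, O 202) as a $2 \times 2$ table by presence or absence of each antigen; that arrangement is an assumption about how the independence model is meant to be read.

```python
import math

def independence_stats(table):
    """Likelihood ratio W, Pearson P and degrees of freedom for row/column independence."""
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    w = p = 0.0
    for i, row in enumerate(table):
        for j, y in enumerate(row):
            e = row_tot[i] * col_tot[j] / n      # fitted count y_i. * y_.j / n
            if y > 0:                            # 0 * log(0) is taken as 0
                w += 2.0 * y * math.log(y / e)
            p += (y - e) ** 2 / e
    df = (len(table) - 1) * (len(table[0]) - 1)
    return w, p, df

# Rows: A antigen present / absent; columns: B antigen present / absent.
table = [[6, 179],     # AB, A
         [35, 202]]    # B,  O
W, P, df = independence_stats(table)
print(round(W, 2), round(P, 2), df)
```

Under that arrangement the output is close to the Model 1 values W = 17.66 and P = 15.73 on one degree of freedom reported in this deck.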

Page 37

Model 1

W = 17.66

$\Pr(\chi^2_1 > 17.66)$ = Pr(|Z| > 4.202) = 2.646E-05

P = 15.73

$\Pr(\chi^2_1 > 15.73)$ = Pr(|Z| > 3.966) = 7.309E-05

k-1-p = 4-1-2 = 1

Model 2

W = 3.17

Pr(|Z| > 1.780) = .075

P = 2.82

Pr(|Z| > 1.679) = .093

Page 38

Incorrect model.

True model $g(y)$, fit $f(y;\theta)$

$\hat\theta$ maximizes $l(\theta) = \sum_j \log f(y_j;\theta)$

$\theta_g$ minimizes

\[ D(f_\theta, g) = \int \log\{g(y)/f(y;\theta)\}\, g(y)\, dy \]

Kullback-Leibler discrepancy

\[ l(\hat\theta)/n \to \int \log f(y;\theta_g)\, g(y)\, dy \quad \text{in probability} \]

$D(f_\theta, g) \ge 0$, with $D = 0$ when $f = g$

$\theta_g$: "least bad" value

Page 39

Example 1. Quadratic, fit linear

Page 40

Example 2. True lognormal, but fit exponential

Lognormal: $Y = \exp\{\sigma Z\}$, $Z \sim N(0,1)$

loglike: $l(\theta) = -n\log\theta - n\bar{Y}/\theta$

$\hat\theta = \bar{Y}$, $E_g(Y) = \exp\{\sigma^2/2\}$, $\mathrm{var}_g(\bar{Y}) = e^{\sigma^2}(e^{\sigma^2} - 1)/n$

Minimizing $\int \log\{g(y)/f(y;\theta)\}\, g(y)\, dy$ gives $\theta_g = \exp\{\sigma^2/2\}$
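The minimization can be written out in a few lines. This is a sketch of the standard argument rather than a transcription of the slide; it assumes the fitted exponential density $f(y;\theta) = \theta^{-1} e^{-y/\theta}$ and, for the final equality, the lognormal form $Y = \exp\{\sigma Z\}$ with $Z$ standard normal.

```latex
\begin{align*}
D(f_\theta, g) &= \int \log\{g(y)/f(y;\theta)\}\, g(y)\, dy \\
               &= \int g(y)\log g(y)\, dy + \log\theta + E_g(Y)/\theta, \\
\frac{\partial D}{\partial\theta} &= \frac{1}{\theta} - \frac{E_g(Y)}{\theta^2} = 0
  \quad\Longrightarrow\quad \theta_g = E_g(Y) = \exp\{\sigma^2/2\}.
\end{align*}
```

The first term does not involve $\theta$, so the "least bad" exponential simply matches the true mean, which is also the limit of $\hat\theta = \bar{Y}$.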

Page 41

Large sample distribution.

\[ \hat\theta \;\dot\sim\; N_p(\theta_g,\ I(\theta_g)^{-1} K(\theta_g) I(\theta_g)^{-1}) \]

mle result if $g(y) = f(y;\theta)$

Page 42

Model selection.

Various models:

non-nested

Ockham's razor.

Prefer the simplest model

Page 43

Formal criteria.

\[ \mathrm{AIC} = 2\{-l(\hat\theta) + p\} \]

\[ \mathrm{BIC} = 2\{-l(\hat\theta) + \tfrac{1}{2}\, p\log n\} \]

Look for minimum

Page 44

Example. Spring failure

Model  p   AIC     BIC
M1     12  744.8*  769.9*
M2     7   771.8   786.5
M3     2   827.8   831.2
M4     2   925.1   929.3

6 stress levels

M1: Weibull - unconnected $\theta$, $\alpha$ at each stress level
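The two criteria differ only in the penalty, so one column of the table can be recovered from the other once $n$ is known. The sketch below assumes $n = 60$ observations (6 stress levels with 10 springs each); that sample size is not stated on the slide and is an assumption here. Under it, the BIC values for M1, M2 and M4 are reproduced to within rounding, while the M3 entry comes out about 0.8 higher than tabulated.

```python
import math

def aic(loglik, p):
    """AIC = 2{-l(theta_hat) + p}."""
    return 2.0 * (-loglik + p)

def bic(loglik, p, n):
    """BIC = 2{-l(theta_hat) + (1/2) p log n}."""
    return 2.0 * (-loglik + 0.5 * p * math.log(n))

# Values copied from the table: model -> (p, AIC, BIC).
table = {"M1": (12, 744.8, 769.9),
         "M2": (7, 771.8, 786.5),
         "M3": (2, 827.8, 831.2),
         "M4": (2, 925.1, 929.3)}

n = 60  # assumed sample size: 6 stress levels x 10 springs
recomputed = {}
for name, (p, aic_value, bic_value) in table.items():
    loglik = p - aic_value / 2.0          # invert AIC to recover l(theta_hat)
    recomputed[name] = bic(loglik, p, n)
    print(name, round(loglik, 2), round(recomputed[name], 1), bic_value)
```

Since BIC - AIC = p(log n - 2), BIC penalizes each extra parameter more heavily than AIC whenever n exceeds about 7.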

