Mary Immaculate College Research Seminar September 3rd 2013 Anomalies of the Maximum Likelihood...


Mary Immaculate College Research Seminar
September 3rd 2013

Anomalies of the Maximum Likelihood Estimator

Diarmuid O’Driscoll
Mary Immaculate College
Limerick

Donald E Ramirez
University of Virginia
Charlottesville

Least Squares Regression Line

• Least Squares Regression Line in the Measurement Error Model has wide interest with many applications

• Studied in depth by Carroll et al. (2006) and Fuller (1987)

• OLS(Y|X) assumes that measurement of X is error free

Minimizing Oblique Errors

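From $(x_i, y_i)$ to the fitted line $y = h(x) = \beta_0 + \beta_1 x$, we set $v_i = y_i - \beta_0 - \beta_1 x_i$.

The sum of the squares of the oblique lengths from $(x_i, y_i)$ to $\bigl(h^{-1}(y_i) + \lambda(x_i - h^{-1}(y_i)),\; y_i + \lambda(h(x_i) - y_i)\bigr)$ is

$$SSE_o(\beta_0, \beta_1, \lambda) = \lambda^2 \sum v_i^2 + (1-\lambda)^2 \sum v_i^2 / \beta_1^2 .$$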
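As a check on the expression for $SSE_o$, the horizontal and vertical components of each oblique displacement can be written out from the definitions of $h$ and $v_i$. This is a short worked derivation, not material from the slides; the symbols $\Delta x_i$ and $\Delta y_i$ are introduced here only for illustration.

```latex
\begin{aligned}
h^{-1}(y_i) &= (y_i - \beta_0)/\beta_1, \qquad
x_i - h^{-1}(y_i) = -v_i/\beta_1, \qquad
h(x_i) - y_i = -v_i, \\
\Delta x_i &= x_i - \bigl[h^{-1}(y_i) + \lambda\,(x_i - h^{-1}(y_i))\bigr]
            = -(1-\lambda)\, v_i/\beta_1, \\
\Delta y_i &= y_i - \bigl[y_i + \lambda\,(h(x_i) - y_i)\bigr]
            = \lambda\, v_i, \\
SSE_o &= \sum_i \bigl(\Delta x_i^2 + \Delta y_i^2\bigr)
       = \Bigl[\lambda^2 + (1-\lambda)^2/\beta_1^2\Bigr] \sum_i v_i^2 .
\end{aligned}
```

Setting $\lambda = 1$ leaves only the vertical component and $\lambda = 0$ only the horizontal one.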

Standardized Weighted Model

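(1) $$SSE_o(\beta_0, \beta_1, \lambda) = \lambda^2 \sum v_i^2 / s_{yy} + (1-\lambda)^2 \sum v_i^2 / (\beta_1^2 s_{xx})$$

Quartic

Minimizing $SSE_o$ in (1) produces the quartic

(2) $$\lambda^2 (s_{xx}/s_{yy})^{1.5}\, \beta_1^4 - \lambda^2 \rho\, (s_{xx}/s_{yy})\, \beta_1^3 + (1-\lambda)^2 \rho\, \beta_1 - (1-\lambda)^2 (s_{yy}/s_{xx})^{0.5} = 0$$

where $\rho = s_{xy}/\sqrt{s_{xx} s_{yy}}$ is the sample correlation coefficient.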
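A minimal numerical sketch of this step, assuming the standardized quartic has coefficients $\lambda^2 (s_{xx}/s_{yy})^{1.5}$, $-\lambda^2\rho\,(s_{xx}/s_{yy})$, $0$, $(1-\lambda)^2\rho$, $-(1-\lambda)^2 (s_{yy}/s_{xx})^{0.5}$ in descending powers of $\beta_1$, with $\rho$ the sample correlation. The function name `oblique_slope` and the root-selection rule (take the real root whose sign matches $\rho$) are illustrative choices, not taken from the slides.

```python
import numpy as np

def oblique_slope(x, y, lam):
    """Slope minimizing the standardized oblique SSE for a given lam in [0, 1]."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    sxx = np.var(x)                    # second moments (divide by n)
    syy = np.var(y)
    rho = np.corrcoef(x, y)[0, 1]      # sample correlation
    r = sxx / syy
    # Quartic coefficients in beta_1, highest power first (assumed form).
    coeffs = [
        lam**2 * r**1.5,
        -lam**2 * rho * r,
        0.0,
        (1 - lam)**2 * rho,
        -(1 - lam)**2 * r**-0.5,
    ]
    roots = np.roots(coeffs)
    real = roots[np.abs(roots.imag) < 1e-9].real
    # Keep the real root whose sign agrees with the correlation.
    candidates = real[np.sign(real) == np.sign(rho)]
    return float(candidates[0])
```

Under these assumptions, `lam = 1` returns the ordinary least squares slope $s_{xy}/s_{xx}$, `lam = 0` returns $s_{yy}/s_{xy}$, and `lam = 0.5` returns the geometric mean slope $\sqrt{s_{yy}/s_{xx}}$.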

Measurement Error Model

X and Y are random variables with respective finite variances $\sigma_X^2$ and $\sigma_Y^2$, finite fourth moments, and have the linear functional relationship $Y = \beta_0 + \beta_1 X$.

The observed data $\{(x_i, y_i),\ 1 \le i \le n\}$ are subject to error by

$x_i = X_i + \delta_i$; $y_i = Y_i + \tau_i$

It is also assumed that

Specific values of lambda

With λ = 1 we recover the minimum squared vertical errors, $\beta_1^{ver}$.

With λ = 0 we recover the minimum squared horizontal errors, $\beta_1^{hor}$.

The geometric mean estimator $\beta_1^{gm} = \sqrt{s_{yy}/s_{xx}}$ has the oblique parameter λ = 0.5.

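Bias

The expected value for $\beta_1^{ver}$ is attenuated towards zero by the attenuating factor $\sigma_X^2/(\sigma_X^2 + \sigma_\delta^2)$.

The expected value for $\beta_1^{hor}$ is amplified towards infinity by the amplifying factor $(\sigma_Y^2 + \sigma_\tau^2)/\sigma_Y^2$.

$\beta_1^{gm}$ is biased unless $\sigma_\tau^2/\sigma_\delta^2 = \sigma_Y^2/\sigma_X^2$.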
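To give a feel for the size of these factors, the sketch below evaluates them for X uniform on (0, 20), whose variance is $20^2/12$. The error variances 9 and 1 are illustrative values picked from the simulation grid {1, 4, 9}, and the function names are hypothetical.

```python
def attenuation_factor(var_x, var_delta):
    """Multiplier on the true slope in the expectation of the vertical slope."""
    return var_x / (var_x + var_delta)

def amplification_factor(var_y, var_tau):
    """Multiplier on the true slope in the expectation of the horizontal slope."""
    return (var_y + var_tau) / var_y

var_x = 20.0**2 / 12.0       # variance of Uniform(0, 20)
beta1 = 1.0                  # assumed true slope
var_y = beta1**2 * var_x     # variance of Y = beta1 * X

print(attenuation_factor(var_x, 9.0))    # below 1: slope pulled towards zero
print(amplification_factor(var_y, 1.0))  # above 1: slope pushed away from zero
```

With these inputs the vertical slope is attenuated by roughly a fifth, while the horizontal slope is inflated by about three percent.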

The Maximum Likelihood Estimator

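If the ratio of the error variances $\kappa = \sigma_\tau^2/\sigma_\delta^2$ is assumed finite, the maximum likelihood estimator for the slope is

(5) $$\beta_1^{mle} = \frac{(s_{yy} - \kappa s_{xx}) + \sqrt{(s_{yy} - \kappa s_{xx})^2 + 4\kappa\rho^2 s_{xx} s_{yy}}}{2\rho\sqrt{s_{xx} s_{yy}}}$$

with $\rho$ the sample correlation coefficient.

If κ = 1, then the MLE (often called the Deming Regression estimator) is equivalent to the perpendicular estimator, $\beta_1^{per}$, first introduced by Adcock (1878).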
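The slope formula (5) translates directly into a few lines of NumPy. The sketch below assumes second moments computed with divisor n and writes the cross term as $s_{xy} = \rho\sqrt{s_{xx}s_{yy}}$, so that the numerator is $(s_{yy} - \kappa s_{xx}) + \sqrt{(s_{yy} - \kappa s_{xx})^2 + 4\kappa s_{xy}^2}$ and the denominator is $2 s_{xy}$. The function name `mle_slope` is hypothetical.

```python
import numpy as np

def mle_slope(x, y, kappa):
    """MLE (Deming) slope for an assumed error-variance ratio kappa."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    sxx = np.var(x)
    syy = np.var(y)
    sxy = np.mean((x - x.mean()) * (y - y.mean()))  # rho * sqrt(sxx * syy)
    diff = syy - kappa * sxx
    return (diff + np.sqrt(diff**2 + 4.0 * kappa * sxy**2)) / (2.0 * sxy)
```

For data lying exactly on a line the function returns that line's slope for any `kappa`, and with `kappa = 1` it agrees with the slope of the first principal axis of the data, which is the perpendicular fit.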

Bias of the MLE for an incorrect choice of $\kappa$

• In practice, the researcher estimates $\kappa$ by $\tilde{\kappa}$ with error.

• With and ,

the bias, : , in terms of is

(6)

Monte Carlo Simulation

X has a Uniform Distribution over the interval (0, 20).

$Y = \beta_1 X$ for varying values of the true slope $\beta_1$.

Both X and Y are subjected to errors with $(\sigma_\delta^2, \sigma_\tau^2) \in \{1, 4, 9\} \times \{1, 4, 9\}$.

Sample size n is set to 50.

The number of repetitions R is set to 5000.
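The simulation design described here can be sketched as follows. Several details are assumptions made for the sketch rather than facts from the slides: the errors are drawn as independent normals, percentage bias is taken to be $100 \times (\text{mean estimate} - \beta_1)/\beta_1$, and the function names and the seed are hypothetical.

```python
import numpy as np

def mle_slope(sxx, syy, sxy, kappa):
    """MLE (Deming) slope for an assumed error-variance ratio kappa."""
    diff = syy - kappa * sxx
    return (diff + np.sqrt(diff**2 + 4.0 * kappa * sxy**2)) / (2.0 * sxy)

def percent_bias_mle(kappa_assumed, var_delta, var_tau,
                     beta1=1.0, n=50, reps=5000, seed=0):
    """Monte Carlo percentage bias of the MLE slope under an assumed ratio."""
    rng = np.random.default_rng(seed)
    slopes = np.empty(reps)
    for r in range(reps):
        X = rng.uniform(0.0, 20.0, n)                          # true regressor
        x = X + rng.normal(0.0, np.sqrt(var_delta), n)         # observed x
        y = beta1 * X + rng.normal(0.0, np.sqrt(var_tau), n)   # observed y
        u, w = x - x.mean(), y - y.mean()
        slopes[r] = mle_slope(np.mean(u * u), np.mean(w * w),
                              np.mean(u * w), kappa_assumed)
    return 100.0 * (slopes.mean() - beta1) / beta1
```

For example, `percent_bias_mle(1.0, 4.0, 4.0)` uses the correct ratio and should return a value near zero, whereas `percent_bias_mle(9.0, 9.0, 1.0)` assumes a ratio far above the true one and should return a clearly negative bias.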

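Table 1A

Percentage bias of the MLE estimator for assumed ratios $\tilde{\kappa}$ (rows) and varying true values of $\kappa$ (columns); X is UD(0,20), $\beta_1 = 1$, $\beta_0 = 0$, R = 5000, n = 50

| $\{\tilde{\kappa}, \kappa\}$ | 1:9 | 1:4 | 4:9 | 1:1 | 4:4 | 9:9 | 9:4 | 4:1 | 9:1 |
|---|---|---|---|---|---|---|---|---|---|
| 1:9 | 0.75 | 1.58 | 8.41 | 2.63 | 10.14 | 21.27 | 24.35 | 10.76 | 24.82 |
| 1:4 | -2.25 | 0.79 | 4.51 | 2.03 | 7.73 | 17.39 | 20.61 | 9.32 | 22.1 |
| 4:9 | -5.42 | -1.53 | 0.30 | 1.38 | 5.08 | 11.36 | 16.34 | 7.72 | 18.95 |
| 1:1 | -10.7 | -4.32 | -6.87 | 0.23 | 0.35 | 0.55 | 8.37 | 4.79 | 12.95 |
| 9:4 | -15.2 | -6.94 | -13.1 | -0.91 | -4.16 | -9.23 | 0.49 | 1.83 | 6.71 |
| 4:1 | -17.6 | -8.31 | -16.2 | -1.55 | -6.53 | -13.9 | -3.66 | 0.21 | 3.26 |
| 9:1 | -19.3 | -9.51 | -18.7 | -2.13 | -8.58 | -17.7 | -7.15 | -1.26 | 0.17 |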

Series expansion of :

The series expansion, :, of the bias may be written in terms of ε as

(7)

Since

Equation (7) shows that : is not alone dependent on the magnitude of but is also dependent on the magnitude of .

Table 1B; :; : with .

bias(:κ)

3 9 0.333 −6.0 −0.0298 −0.0296 −0.0281

5 9 0.555 −4.0 −0.0201 −0.0198 −0.0202

2 4 0.500 −2.0 −0.0100 −0.0100 −0.0103

1 3 0.333 −2.0 −0.0089 −0.0099 −0.0112

1 2 0.500 −1.0 −0.0048 −0.0050 −0.0052

2 1 2.000 1.0 0.0047 0.0050 0.0048

3 1 3.000 2.0 0.0103 0.0101 0.0088

4 2 2.000 2.0 0.0107 0.0101 0.0097

9 5 1.800 4.0 0.0204 0.0202 0.0197

9 3 3.000 6.0 0.0318 0.0304 0.0286

Second and Fourth Moments

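From Gillard and Iles (2009), second moment equations are

$s_{xx} = \sigma_X^2 + \sigma_\delta^2$; $s_{yy} = \beta_1^2 \sigma_X^2 + \sigma_\tau^2$; $s_{xy} = \beta_1 \sigma_X^2$

and fourth moment equations are

$s_{xxxy} = \beta_1 \mu_{X4} + 3\beta_1 \sigma_X^2 \sigma_\delta^2$; $s_{xyyy} = \beta_1^3 \mu_{X4} + 3\beta_1 \sigma_X^2 \sigma_\tau^2$.

These equations yield the estimators

$\tilde{\sigma}_\delta^2 = s_{xx} - s_{xy}/\beta_1$; $\tilde{\sigma}_\tau^2 = s_{yy} - \beta_1 s_{xy}$.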

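Moment Estimating Equations

The Frisch hyperbola of Van Montfort (1989) is

(3) $$(s_{xx} - \tilde{\sigma}_\delta^2)(s_{yy} - \tilde{\sigma}_\tau^2) = s_{xy}^2$$

and the fourth order moment equation is

(4) $$(s_{xxxy} - 3 s_{xy} \tilde{\sigma}_\delta^2)\, s_{xy}^2 = (s_{xx} - \tilde{\sigma}_\delta^2)^2\, (s_{xyyy} - 3 s_{xy} \tilde{\sigma}_\tau^2)$$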
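One way to solve the two moment estimating equations numerically is to parameterize by the slope. Setting $\tilde{\sigma}_\delta^2 = s_{xx} - s_{xy}/\beta_1$ and $\tilde{\sigma}_\tau^2 = s_{yy} - \beta_1 s_{xy}$ satisfies the hyperbola $(s_{xx} - \tilde{\sigma}_\delta^2)(s_{yy} - \tilde{\sigma}_\tau^2) = s_{xy}^2$ automatically, and substituting into the fourth order equation $(s_{xxxy} - 3 s_{xy}\tilde{\sigma}_\delta^2)\, s_{xy}^2 = (s_{xx} - \tilde{\sigma}_\delta^2)^2 (s_{xyyy} - 3 s_{xy}\tilde{\sigma}_\tau^2)$ leaves $\beta_1^2 = (s_{xyyy} - 3 s_{xy} s_{yy})/(s_{xxxy} - 3 s_{xy} s_{xx})$. The sketch below follows that route. It is an illustration under these assumptions, not necessarily how the authors computed their solutions, and the function name `moment_kappa` is hypothetical.

```python
import numpy as np

def moment_kappa(x, y):
    """Return (beta1, var_delta, var_tau, kappa) from 2nd and 4th sample moments.

    No safeguards: in small samples the ratio below can be negative, and the
    variance estimates can fall outside their admissible range.
    """
    u = np.asarray(x, dtype=float) - np.mean(x)
    w = np.asarray(y, dtype=float) - np.mean(y)
    sxx, syy, sxy = np.mean(u * u), np.mean(w * w), np.mean(u * w)
    sxxxy, sxyyy = np.mean(u**3 * w), np.mean(u * w**3)
    # Slope squared after eliminating the error variances from both equations.
    beta_sq = (sxyyy - 3.0 * sxy * syy) / (sxxxy - 3.0 * sxy * sxx)
    beta1 = np.sign(sxy) * np.sqrt(beta_sq)
    var_delta = sxx - sxy / beta1      # estimate of the x error variance
    var_tau = syy - beta1 * sxy        # estimate of the y error variance
    return beta1, var_delta, var_tau, var_tau / var_delta
```

The last returned value is the implied estimate of the error-variance ratio. Fourth moments are noisy, so this estimate varies considerably at sample sizes such as n = 100.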

True Ratio: Solution:  

True Ratio: Solution:   0.14962

Two new oblique estimators

Using the solutions from equations (3) and (4) as estimates for $\kappa$ in $\beta_1^{mle}$, we introduce the first of our estimators, $\beta_1^{kap}$.

The invertible function from $[0, \infty]$ to $[0, 1]$ defined by $\lambda = c/(1+c)$, $c = \kappa\, s_{xx}/s_{yy}$, creates our second estimator $\beta_1^{lam}$.
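The map between the error-variance ratio and the oblique parameter is simple enough to sketch directly, assuming it has the form $\lambda = c/(1+c)$ with $c = \kappa\, s_{xx}/s_{yy}$. Both function names are hypothetical, and the sketch covers only finite $\kappa$ and $\lambda < 1$; the endpoints $\kappa = \infty$ and $\lambda = 1$ would need separate handling.

```python
def kappa_to_lambda(kappa, sxx, syy):
    """Oblique parameter implied by an error-variance ratio kappa (finite)."""
    c = kappa * sxx / syy
    return c / (1.0 + c)

def lambda_to_kappa(lam, sxx, syy):
    """Inverse map, valid for 0 <= lam < 1."""
    return (lam / (1.0 - lam)) * syy / sxx
```

When $s_{xx} = s_{yy}$ the ratio $\kappa = 1$ maps to $\lambda = 0.5$, and $\kappa = 0$ always maps to $\lambda = 0$, the horizontal case.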

Monte Carlo Simulation

We consider the six estimators $\{\beta_1^{ver}, \beta_1^{gm}, \beta_1^{hor}, \beta_1^{per}, \beta_1^{kap}, \beta_1^{lam}\}$.

X has a Uniform Distribution over the interval (0, 20).

$Y = \beta_1 X$ for varying values of the true slope $\beta_1$.

Both X and Y are subjected to errors with $(\sigma_\delta^2, \sigma_\tau^2) \in \{1, 4, 9\} \times \{1, 4, 9\}$.

Sample size n is set to 100.

The number of repetitions R is set to 1000.

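Table 2

X is UD(0,20), $\beta_1 = 1$, $\beta_0 = 0$, R = 1000, n = 100, $\sigma_\tau = 1$, $\sigma_\delta = 3$

| Estimator | MSE × 10³ | %Bias | λ | Angle (degrees) |
|---|---|---|---|---|
| $\beta_1^{ver}$ | 46.569 | -21.189 | 1 | 51.76 |
| $\beta_1^{gm}$ | 11.897 | -9.947 | 0.500 | 95.99 |
| $\beta_1^{hor}$ | 4.402 | 2.9572 | 0 | 134.17 |
| $\beta_1^{per}$ | 15.130 | -11.246 | 0.556 | 89.93 |
| $\beta_1^{kap}$ | 4.625 | -1.382 | 0.169 | 118.37 |
| $\beta_1^{lam}$ | 4.442 | -0.029 | 0.237 | 123.49 |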

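Table 3

X is UD(0,20), $\beta_1 = 1.25$, $\beta_0 = 0$, R = 1000, n = 100, $\sigma_\tau = 1$, $\sigma_\delta = 3$

| Estimator | MSE × 10³ | %Bias | λ | Angle (degrees) |
|---|---|---|---|---|
| $\beta_1^{ver}$ | 70.809 | -20.929 | 1 | 45.33 |
| $\beta_1^{gm}$ | 18.425 | -10.036 | 0.500 | 83.29 |
| $\beta_1^{hor}$ | 5.708 | 2.413 | 0 | 127.99 |
| $\beta_1^{per}$ | 15.081 | -8.546 | 0.434 | 89.90 |
| $\beta_1^{kap}$ | 6.304 | -1.180 | 0.171 | 114.70 |
| $\beta_1^{lam}$ | 5.847 | 0.092 | 0.145 | 116.62 |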

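Table 4

X is UD(0,20), $\beta_1 = 1$, $\beta_0 = 0$, R = 1000, n = 100, $\sigma_\tau = 2$, $\sigma_\delta = 2$

| Estimator | MSE × 10³ | %Bias | λ | Angle (degrees) |
|---|---|---|---|---|
| $\beta_1^{ver}$ | 13.403 | -10.688 | 1 | 48.23 |
| $\beta_1^{gm}$ | 2.117 | 0.0989 | 0.500 | 89.94 |
| $\beta_1^{hor}$ | 18.146 | 12.232 | 0 | 131.70 |
| $\beta_1^{per}$ | 2.672 | 0.126 | 0.500 | 89.92 |
| $\beta_1^{kap}$ | 4.432 | 0.295 | 0.495 | 90.38 |
| $\beta_1^{lam}$ | 5.962 | 0.425 | 0.497 | 90.14 |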

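Table 5

X is UD(0,20), $\beta_1 = 0.75$, $\beta_0 = 0$, R = 1000, n = 100, $\sigma_\tau = 2$, $\sigma_\delta = 2$

| Estimator | MSE × 10³ | %Bias | λ | Angle (degrees) |
|---|---|---|---|---|
| $\beta_1^{ver}$ | 7.791 | -10.518 | 1 | 56.13 |
| $\beta_1^{gm}$ | 2.603 | 4.196 | 0.500 | 103.99 |
| $\beta_1^{hor}$ | 28.487 | 21.417 | 0 | 137.68 |
| $\beta_1^{per}$ | 2.041 | 0.169 | 0.640 | 89.96 |
| $\beta_1^{kap}$ | 4.233 | 0.725 | 0.590 | 95.55 |
| $\beta_1^{lam}$ | 5.402 | -0.029 | 0.615 | 92.97 |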

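Table 6

Effective $\tilde{\kappa}$ average; X is UD(0,20), $\beta_1 = 1$, $\beta_0 = 0$, R = 1000, n = 100

| | $\sigma_\tau^2 = 1$ | $\sigma_\tau^2 = 4$ | $\sigma_\tau^2 = 9$ |
|---|---|---|---|
| $\sigma_\delta^2 = 1$ | 1.1781 | 3.3975 | 6.1251 |
| $\sigma_\delta^2 = 4$ | 0.3185 | 0.9169 | 1.9514 |
| $\sigma_\delta^2 = 9$ | 0.1701 | 0.4090 | 1.1658 |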

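Table 7

($\sigma^2$, $\sigma^2$) “bumps”; X is UD(0,20), $\beta_1 = 1$, $\beta_0 = 0$, R = 1000, n = 100

| | $\sigma_\tau^2 = 1$ | $\sigma_\tau^2 = 4$ | $\sigma_\tau^2 = 9$ |
|---|---|---|---|
| $\sigma_\delta^2 = 1$ | (60, 58) | (199, 0) | (284, 0) |
| $\sigma_\delta^2 = 4$ | (2, 195) | (24, 9) | (89, 0) |
| $\sigma_\delta^2 = 9$ | (0, 286) | (2, 75) | (21, 21) |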

Bibliography

• Adcock, R. J. (1878). 'A problem in least-squares', The Analyst, 5: 53-54.

• Al-Nasser, A. (2012). 'On using the maximum entropy median for fitting the un-replicated functional model between the unemployment rate and the human development index in the Arab states', J. Appl. Sci., 12: 326-335.

• Carroll, R. J., Ruppert, D., Stefanski, L. A., Crainiceanu, C. M. (2006). Measurement Error in Nonlinear Models - A Modern Perspective, Boca Raton: Chapman & Hall/CRC, Second Edition.

• Deming, W. E. (1943). Statistical Adjustment of Data, New York: Wiley.

• Fuller, W. A. (1987). Measurement Error Models, New York: Wiley.

• Gillard, J., Iles, T. (2009). 'Methods of fitting straight lines where both variables are subject to measurement error', Current Clinical Pharmacology, 4: 164-171.

• O'Driscoll, D., Ramirez, D. (2011). 'Geometric View of Measurement Errors', Communications in Statistics-Simulation and Computation, 40: 1373-1382.

• Van Montfort, K., Mooijaart, A., and de Leeuw, J. (1987). 'Regression with errors in variables', Statist Neerlandica, 41: 223-239.

• Yang, L. (1999). 'Recent advances on determining the number of real roots of parametric polynomials', J. Symbolic Computation, 28: 225-242.