
The written Master’s Examination written Master’s Examination ... Hence the Rao-Cramer lower...

Date post: 18-Mar-2018
The written Master’s Examination Option Statistics and Probability SPRING 2009 Full points may be obtained for correct answers to 8 questions. Each numbered question (which may have several parts) is worth the same number of points. All answers will be graded, but the score for the examination will be the sum of the scores of your best 8 solutions. Use separate answer sheets for each question. DO NOT PUT YOUR NAME ON YOUR ANSWER SHEETS. When you have finished, insert all your answer sheets into the envelope provided, then seal and print your name on it. Any student whose answers need clarification may be required to submit to an oral examination.


MS Exam, Option Probability and Statistics, SPRING 2009

1. (Stat 401)

Let U and V be two independent, standard normal random variables.

(a) Find the joint distribution of $U+V$ and $U-V$.

(b) Find the distribution of $\dfrac{U+V}{\sqrt{(U-V)^2}}$.

2. (Stat 411)

Let X1, …, Xn be independent random variables with Xi distributed as N(βai,σ2), for i=1,…,n, where a1,..,an are known (non-random) real numbers at least one of which is non-zero and σ2 is known.

(i) Derive the maximum likelihood estimator (mle) of β. (ii) Is the mle unbiased? Justify your answer. (iii) Find the Rao-Cramer lower bound for unbiased estimators of β. (iv) Is the mle of β efficient (minimum variance unbiased)? Justify your answer.

3. (Stat 411)

Consider one random variable X that has a binomial distribution with $n = 4$ and $p = \theta$.

[1] Find the most powerful test of size $\alpha \le 1/16$ based on X only for $H_0: \theta = 0.5$ vs. $H_1: \theta = 0.75$.
[2] Calculate the power of your test.
[3] Justify that your test is the most powerful one of size $1/16$.

4. (Stat 416)

A chemist wishes to test the effect of four chemical agents on the strength of a particular type of cloth. Because there might be variability from one bolt to another, the chemist decides to use a randomized block design, with the bolts of cloth considered as blocks. She selects five bolts and applies all four chemicals in random order to each bolt. The resulting tensile strengths follow.

Bolt            1     2     3     4     5
Chemical A     66    47    60    60    55
Chemical B     68    48    69    70    55
Chemical C     66    56    71    68    58
Chemical D     60    47    64    63    51

Describe the basic model of the Friedman test, give its test statistic S, and compute S. Explain also why the Friedman test is here more appropriate than the Kruskal-Wallis test.
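As a rough numerical companion to this question, the sketch below ranks the four chemicals within each bolt (tied values share their average rank) and evaluates the usual Friedman statistic $S = \frac{12}{bk(k+1)}\sum_j R_j^2 - 3b(k+1)$ with $b = 5$ blocks and $k = 4$ treatments. No tie correction is applied, which is a simplifying assumption, and the helper name `midranks` is purely illustrative.

```python
def midranks(values):
    """Rank values from 1 (smallest) upward; tied values get their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = (pos + end) / 2 + 1  # average of the 1-based positions pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = avg
        pos = end + 1
    return ranks

# Tensile strengths from the table: one list per chemical, one entry per bolt.
strength = {
    "A": [66, 47, 60, 60, 55],
    "B": [68, 48, 69, 70, 55],
    "C": [66, 56, 71, 68, 58],
    "D": [60, 47, 64, 63, 51],
}
chemicals = list(strength)
b, k = 5, len(chemicals)  # blocks (bolts) and treatments (chemicals)

# Rank within each bolt, then add up the ranks of each chemical.
rank_sums = [0.0] * k
for bolt in range(b):
    r = midranks([strength[c][bolt] for c in chemicals])
    for j in range(k):
        rank_sums[j] += r[j]

S = 12 / (b * k * (k + 1)) * sum(R * R for R in rank_sums) - 3 * b * (k + 1)
print(dict(zip(chemicals, rank_sums)), round(S, 2))
```

Under the null hypothesis of no chemical effect, S is usually referred to a chi-square distribution with k - 1 = 3 degrees of freedom. Ranking within bolts is what removes the bolt-to-bolt variability.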


5. (Stat 431)

A client has a finite population of 6 units and has funds to survey only 3 units for the purpose of estimating the mean and its related standard deviation. There are two sampling plans to be considered by two survey practitioners.

Sampling Plan by Survey Practitioner 1: A simple random sampling plan of size 3 without replacement, SRS (6, 3). Recall that this is a uniform sampling plan with all 20 samples of size 3 in its support.

Sampling Plan by Survey Practitioner 2: A controlled sampling plan with only 14 samples:
A- The following six samples are excluded from being surveyed, that is, they have zero probability of selection: {1,2,3}, {1,4,5}, {1,5,6}, {2,3,5}, {2,4,6}, {3,4,6}.
B- The following six samples are assigned probability 2/20 of selection each: {1,2,5}, {1,3,5}, {1,4,6}, {2,3,4}, {2,3,6}, {4,5,6}.
C- The remaining 8 samples are assigned probability 1/20 of selection each.

Remark: Note that the total probability over the chosen 14 samples is 1, and that this is not a uniform sampling plan since six of the samples are assigned twice as much probability as the remaining 8 samples.

Suppose both practitioners implemented their survey plans and the sample {1, 2, 5} is selected by both of them, and the related survey data are: Y1 = 120, Y2 = 96, and Y5 = 123.
1- What are the HT estimators and their values of the population mean under both sampling plans?
2- What are the standard deviations of your estimator under both sampling plans?
3- If you were the consulting statistician, which of the above two sampling plans would you recommend to your client and why?

6. (Stat 461)

The number of customers entering a store on a given day is Poisson distributed with mean $\lambda = 10$. The amount of money spent by a customer is uniformly distributed over $(0, 110)$. Find the mean and the variance of the amount of money that the store takes in on a given day.
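A minimal sketch of the compound-Poisson calculation this question asks for. It assumes the spending interval reads as (0, 110) and that the amounts are i.i.d. and independent of the number of customers. For $S = X_1 + \dots + X_N$ with $N \sim$ Poisson($\lambda$), the standard identities are $E[S] = \lambda E[X]$ and $Var(S) = \lambda E[X^2]$. The function name is illustrative only.

```python
from fractions import Fraction

def compound_poisson_uniform(lam, b):
    """Mean and variance of S = X_1 + ... + X_N with N ~ Poisson(lam) and
    X_i ~ Uniform(0, b), i.i.d. and independent of N (assumed)."""
    mean_x = Fraction(b, 2)               # E[X]
    second_moment_x = Fraction(b * b, 3)  # E[X^2] = Var(X) + E[X]^2 = b^2/12 + b^2/4
    return lam * mean_x, lam * second_moment_x

# lam = 10 is given; b = 110 is the interval endpoint as read from the problem.
mean_s, var_s = compound_poisson_uniform(10, 110)
print(mean_s, float(var_s))
```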


7. (Stat 471)

Home Depot stocks wooden planks of length 19 feet. Home builders buy such planks and cut them according to their needs. A home builder needs 12 planks of length 7 feet, 20 planks of length 5 feet, and 32 planks of length 4 feet. Suppose only the following 3 cutting patterns are used, where each pattern lists the number of 7-foot, 5-foot and 4-foot pieces cut from one plank:

$$\begin{pmatrix} 7\ \text{feet} \\ 5\ \text{feet} \\ 4\ \text{feet} \end{pmatrix} \text{ with cutting patterns } \begin{pmatrix} 0 \\ 3 \\ 1 \end{pmatrix},\ \begin{pmatrix} 1 \\ 0 \\ 3 \end{pmatrix},\ \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}.$$

(a). How many 19 feet planks are needed to meet the requirements using only these cutting patterns? What is the total length of the wasted portion?
(b). Use the revised simplex method to check whether there is a potential cutting pattern that would reduce the number of planks needed.
(c). Determine which pattern has to be replaced by the entry of the new cutting pattern.
(d). How many 19 feet planks are needed to meet the requirements using only the new cutting patterns after discontinuing one of the old patterns?

8. (Stat 471)

Show that $x_1 = 5/26$, $x_2 = 5/2$, $x_3 = 27/26$ is optimal to the linear programming problem:

$$\begin{aligned}
\max\ & 9x_1 + 14x_2 + 7x_3 \\
\text{such that}\ & 2x_1 + x_2 + 3x_3 \le 6 \\
& 5x_1 + 4x_2 + x_3 \le 12 \\
& 2x_2 \le 5 \\
& x_1, x_2, x_3 \ \text{unrestricted.}
\end{aligned}$$

9. (Stat 473)

On a linear forest path of unit length covered with bushes, Joe secretly chooses a location x to hide and Bob chooses a location y to hide. The payoff to Joe from Bob is $|x-y| - |x-y|^2$, $0 \le x, y \le 1$. Show that the strategy which chooses a random point y in $[0,1]$ with density $f(y) \equiv 1$, $0 \le y \le 1$, gives an expectation $1/6$ for all $x$. Find the value of the game.


10. (Stat 481)

A dairy company would like to investigate if the milk is contaminated. In order to investigate a possible shipment (batch) effect, the company selects 5 shipments at random. After processing each batch, 6 cartons of milk are selected at random and are stored for several days. Then the square root of the bacteria counts are recorded and denoted by Yij, where i=1,...,5 and j=1,...,6. (a). If the shipment effect is denoted as τ, write down ANOVA model and specify required distribution of random components in the model. Please state the hypotheses. (b). Complete the following table and conclude given significance level 0.01.

Sources     df    SS        MS    F
Shipment          803.3
Error
Total             1360.2

[Given: F(0.01, 5, 30) = 3.70, F(0.01, 5, 25) = 3.85, F(0.01, 4, 25) = 4.177 ] (c). Estimate the variance component(s).

11. (Stat 481)

In linear regression analysis, the leave-one-out cross validation (LOOCV) procedure is often used to estimate the prediction accuracy of the fitted regression model. Let $(X, Y) = \{(x_i, y_i);\ i = 1, \ldots, n\}$ be a dataset with $x_i = (1, x_{i1}, \ldots, x_{ip}) \in R^{p+1}$ and $y_i \in R$, and let $(X_{(-i)}, Y_{(-i)})$ be the modified dataset without the i-th row. A set of prediction values

$$\hat{y}_{(-i)} = x_i \left(X_{(-i)}^T X_{(-i)}\right)^{-1} X_{(-i)}^T Y_{(-i)}$$

are obtained and the LOOCV estimate is defined as

$$LOOCV = n^{-1} \sum_{i=1}^{n} e_{(-i)}^2 \quad \text{with} \quad e_{(-i)} = y_i - \hat{y}_{(-i)}.$$

(a) Show that

$$\left(X_{(-i)}^T X_{(-i)}\right)^{-1} = (X^T X)^{-1} + \frac{(X^T X)^{-1} x_i^T x_i (X^T X)^{-1}}{1 - x_i (X^T X)^{-1} x_i^T}.$$

Hint: $(A - zz^T)^{-1} = A^{-1} + \dfrac{A^{-1} z z^T A^{-1}}{1 - z^T A^{-1} z}$, assuming that $A - zz^T$ is invertible.

(b) Show that $e_{(-i)} = \dfrac{e_i}{1 - h_{ii}}$; $i = 1, \ldots, n$, where $H = X (X^T X)^{-1} X^T$, $h_{ii}$ is the i-th diagonal element of $H$, and $e = (e_1, \ldots, e_n)^T = (I - H) Y$.


Statistics 401&481 – MS Exam (Junhui Wang) Spring Semester 2009

1. Let U and V be two independent, standard normal random variables.

(a) Find the joint distribution of $U+V$ and $U-V$.

(b) Find the distribution of $\dfrac{U+V}{\sqrt{(U-V)^2}}$.

[Solution] 

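(a) Since $\begin{pmatrix} U \\ V \end{pmatrix} \sim N\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, I_2 \right)$, we have

$$\begin{pmatrix} U+V \\ U-V \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} \begin{pmatrix} U \\ V \end{pmatrix} \sim N\left( \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} \begin{pmatrix} 0 \\ 0 \end{pmatrix},\ \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} I_2 \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}^T \right) = N\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix},\ \begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix} \right).$$

(b) From (a), $U+V \sim N(0, 2)$, $U-V \sim N(0, 2)$, and thus

$$\frac{1}{\sqrt{2}}(U+V) \sim N(0, 1), \quad \text{and} \quad \frac{1}{2}(U-V)^2 \sim \chi^2(1).$$

From (a), $U+V$ and $U-V$ are independent, which implies that $\frac{1}{\sqrt{2}}(U+V)$ and $\frac{1}{2}(U-V)^2$ are independent as well. Therefore, we have

$$\frac{U+V}{\sqrt{(U-V)^2}} = \frac{\frac{1}{\sqrt{2}}(U+V)}{\sqrt{\frac{1}{2}(U-V)^2}} \sim t(1)$$

based on the definition of the t distribution.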


Stat 411, Estimation problem. Spring 2009

Let $X_1, \ldots, X_n$ be independent random variables with $X_i$ distributed as $N(\beta a_i, \sigma^2)$, for $i = 1, \ldots, n$, where $a_1, \ldots, a_n$ are known (non-random) real numbers at least one of which is non-zero and $\sigma^2$ is known.
(i) Derive the maximum likelihood estimator (mle) of $\beta$.
(ii) Is the mle unbiased? Justify your answer.
(iii) Find the Rao-Cramer lower bound for unbiased estimators of $\beta$.
(iv) Is the mle of $\beta$ efficient (minimum variance unbiased)? Justify your answer.

Solution:

(i) The likelihood is

$$L(\beta) = \frac{1}{(2\pi)^{n/2} \sigma^n}\, e^{-\frac{1}{2\sigma^2} \sum (x_i - \beta a_i)^2}.$$

The mle $\hat{\beta}$ minimizes

$$h(\beta) = \sum_{i=1}^{n} (x_i - \beta a_i)^2.$$

$h'(\beta) = 0$ has the solution $\hat{\beta} = \dfrac{\sum a_i x_i}{\sum a_i^2}$. Also, $h''(\hat{\beta}) > 0$. Hence $\hat{\beta}$ is the mle of $\beta$.

(ii) $E(\hat{\beta}) = \dfrac{\sum a_i \beta a_i}{\sum a_i^2} = \beta$ for all $\beta$; hence $\hat{\beta}$ is unbiased.

(iii) The information in the sample $X_1, \ldots, X_n$ is

$$I(\beta) = -E\left[ \frac{\partial^2 \ln L(\beta)}{\partial \beta^2} \right] = \frac{\sum a_i^2}{\sigma^2}.$$

Hence the Rao-Cramer lower bound for the variance of an unbiased estimator of $\beta$ is

$$\frac{\sigma^2}{\sum a_i^2}.$$

(iv) $V(\hat{\beta}) = \dfrac{\sigma^2}{\sum a_i^2}$, the Rao-Cramer lower bound. Hence $\hat{\beta}$ is minimum variance unbiased.
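The intermediate steps behind (iii) and (iv) are not written out above. One way to fill them in, using only the model as stated, is the following sketch:

```latex
\begin{aligned}
\ln L(\beta) &= -\tfrac{n}{2}\ln(2\pi) - n\ln\sigma - \frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i - \beta a_i)^2, \\
\frac{\partial \ln L(\beta)}{\partial \beta} &= \frac{1}{\sigma^2}\sum_{i=1}^{n} a_i (x_i - \beta a_i), \\
\frac{\partial^2 \ln L(\beta)}{\partial \beta^2} &= -\frac{1}{\sigma^2}\sum_{i=1}^{n} a_i^2
\quad\Longrightarrow\quad I(\beta) = \frac{\sum a_i^2}{\sigma^2}, \\
V(\hat\beta) &= V\!\left(\frac{\sum a_i X_i}{\sum a_i^2}\right)
= \frac{\sum a_i^2\, V(X_i)}{\left(\sum a_i^2\right)^2}
= \frac{\sigma^2}{\sum a_i^2} = \frac{1}{I(\beta)}.
\end{aligned}
```

The variance step uses the independence of the $X_i$. The second derivative does not depend on the data, so taking the expectation in $I(\beta)$ changes nothing.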


[Stat 411, chap. 8, 9] (Jie Yang)

Consider one random variable X that has a binomial distribution with $n = 4$ and $p = \theta$.
[1] Find the most powerful test of size $\alpha \le 1/16$ for $H_0: \theta = 0.5$ vs. $H_1: \theta = 0.75$.
[2] Calculate the power of your test.
[3] Justify that your test is the most powerful one of size $1/16$.

[Solution]

[1] Since X follows Binomial$(4, \theta)$, the likelihood is

$$L(\theta; x) = f(x; \theta) = \binom{4}{x} \theta^x (1-\theta)^{4-x}, \quad x = 0, 1, 2, 3, 4.$$

Therefore,

$$L(0.5; x) = \binom{4}{x} \cdot \frac{1}{2^4}, \quad L(0.75; x) = \binom{4}{x} \cdot \frac{3^x}{4^4}, \quad \frac{L(0.5; x)}{L(0.75; x)} = \frac{16}{3^x}.$$

x                        0       1        2        3         4
L(0.5;x)/L(0.75;x)       16      16/3     16/9     16/27     16/81
Pr(X = x | H0)           1/16    4/16     6/16     4/16      1/16
Pr(X = x | H1)           1/256   12/256   54/256   108/256   81/256

By the Neyman-Pearson theorem, the test with rejection region $C = \{x : L(0.5; x)/L(0.75; x) \le k\}$ is the most powerful one of size $\Pr(X \in C \mid H_0)$. Let $k = 16/81$; then $C = \{x : x = 4\}$ leads to the most powerful test of size $1/16$.

[2] The power is $\Pr(X = 4 \mid H_1) = 81/256$.

[3] The test is guaranteed by the Neyman-Pearson theorem to be the most powerful test. An alternative way to justify this: another test with rejection region $\{x : x = 0\}$ is of size $1/16$ too. The corresponding power is only $1/256$.
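The likelihood-ratio table and the size and power of the test can be reproduced with exact fractions. This is only an illustrative check of the numbers in the solution; the helper `binom_pmf` is not part of the original.

```python
from fractions import Fraction
from math import comb

def binom_pmf(x, n, theta):
    """Binomial(n, theta) probability of observing x successes."""
    return comb(n, x) * theta**x * (1 - theta)**(n - x)

n = 4
theta0, theta1 = Fraction(1, 2), Fraction(3, 4)

# Likelihood ratio L(0.5; x) / L(0.75; x) for each possible outcome.
ratio = {x: binom_pmf(x, n, theta0) / binom_pmf(x, n, theta1) for x in range(n + 1)}

# Neyman-Pearson rejection region {x : ratio <= k} with k = 16/81.
k = Fraction(16, 81)
region = [x for x in range(n + 1) if ratio[x] <= k]

size = sum(binom_pmf(x, n, theta0) for x in region)   # Pr(X in C | H0)
power = sum(binom_pmf(x, n, theta1) for x in region)  # Pr(X in C | H1)
print(region, size, power)
```

Because the ratio is strictly decreasing in x, any cutoff k below 16/27 and at least 16/81 selects the same region.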


Solution

Statistics 431-MS Exam Spring Semester 2009

A client has a finite population of 6 units and has funds to survey only 3 units for the purpose of estimating the mean and its related standard deviation. There are two sampling plans to be considered by two survey practitioners. Sampling Plan by Survey Practitioner 1: Is a simple random sampling plan of size 3 without replacement, SRS (6, 3). Recall that this is a uniform sampling plan with all 20 samples of size 3 in its support. Sampling Plan by Survey Practitioner 2: Is a controlled sampling plan with only 14 samples: A- The following six samples are excluded from being surveyed, that is they have zero probability of selection. {1,2,3 }, {1,4,5}, {1,5,6}, {2,3,5}, {2,4,6}, {3,4,6}. B- The following six samples are assigned probability of 2/20 of selection each: {1,2,5}, {1,3,5}, {1,4,6}, {2,3,4}, {2,3,6}, {4,5,6}. C- The remaining 8 samples are assigned probability of 1/20 of selection each. Remark: Note that the total probability over the chosen 14 samples is 1 and the survey is not a uniform sampling plan since six of the samples are assigned twice as much probability as the remaining 8 samples. Suppose both practitioners implemented their survey plans and the sample {1, 2, 5} is selected by both of them and the related survey data are: Y1= 120, Y2 = 96, and Y5 = 123. 1- What are the HT estimators and their values of the population mean under both

sampling plans?

Answer: We observe that the first-order inclusion probabilities for both sampling designs are constant (3/6) for all 6 units. Thus, the HT estimator of the population mean under both sampling plans is simply the sample mean: (120+96+123)/3 = 113.

2- What are the standard deviations of your estimator under both sampling plans?

Answer: We observe also that the second-order inclusion probabilities for both sampling plans are constant, 3(3-1)/[6(6-1)] = 1/5, for all 15 distinct pairs. Thus, the design-unbiased estimator of the variance of the HT estimator is simply [(sample variance)/3] [1 - 3/6]. Compute the sample variance of the three observations, plug it into this expression, and take its positive square root.
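The two observations the answers rely on, constant first-order and second-order inclusion probabilities under both plans, can be checked by enumerating all 20 samples. The sketch below does that and then evaluates the estimate and its standard deviation from the observed data. The names `plans`, `inclusion` and `plan2_prob` are illustrative only.

```python
from fractions import Fraction
from itertools import combinations

units = range(1, 7)
excluded = [{1, 2, 3}, {1, 4, 5}, {1, 5, 6}, {2, 3, 5}, {2, 4, 6}, {3, 4, 6}]
doubled = [{1, 2, 5}, {1, 3, 5}, {1, 4, 6}, {2, 3, 4}, {2, 3, 6}, {4, 5, 6}]

def plan2_prob(sample):
    """Selection probability of a sample under the controlled plan."""
    s = set(sample)
    if s in excluded:
        return Fraction(0)
    if s in doubled:
        return Fraction(2, 20)
    return Fraction(1, 20)

plans = {
    "SRS(6,3)": {s: Fraction(1, 20) for s in combinations(units, 3)},
    "controlled": {s: plan2_prob(s) for s in combinations(units, 3)},
}

def inclusion(plan, members):
    """Probability that every unit in `members` lands in the selected sample."""
    return sum(p for s, p in plan.items() if set(members) <= set(s))

# Distinct values taken by the first- and second-order inclusion probabilities.
first_order = {name: {inclusion(plan, (i,)) for i in units} for name, plan in plans.items()}
second_order = {name: {inclusion(plan, pair) for pair in combinations(units, 2)}
                for name, plan in plans.items()}

# Estimates from the observed sample {1, 2, 5}.
y = [120, 96, 123]
n, N = 3, 6
ht_mean = Fraction(sum(y), n)
s2 = sum((v - ht_mean) ** 2 for v in y) / (n - 1)  # sample variance
var_hat = s2 / n * (1 - Fraction(n, N))            # [(sample variance)/3][1 - 3/6]
sd_hat = float(var_hat) ** 0.5
print(first_order, second_order, ht_mean, var_hat, round(sd_hat, 3))
```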


3- If you were the consulting statistician, which of the above two sampling plans would you recommend to your client and why?

Answer: If there is no need or reason to exclude the samples that survey design 2 has excluded from the support of the sampling plan, then the first survey plan, which is SRS (6, 3), is preferred due to its statistical optimality (see the sampling book by Hedayat and Sinha). In addition, the statistician does not have to justify the exclusion of those samples, as survey design 2 would require. However, if the samples that received zero probability of selection under survey design 2 are to be excluded from the survey, then clearly the second survey design should be recommended.


Solutions to OR Problems in Masters Exam Spring 2009

T.E.S. Raghavan [1]

April 7, 2009

[1] Department of Mathematics, Statistics and Computer Science, 851, South Morgan Street #517, University of Illinois at Chicago, Chicago, IL 60607-7045 Email: [email protected]

Solution to Problem 7: Stat 471

From 19 feet boards we want 12 planks 7 feet long, 20 planks 5 feet long and 32 planks 4 feet long, using only the cutting patterns

$$\begin{pmatrix} 7\ \text{feet} \\ 5\ \text{feet} \\ 4\ \text{feet} \end{pmatrix} \text{ using patterns } \begin{pmatrix} 0 \\ 3 \\ 1 \end{pmatrix},\ \begin{pmatrix} 1 \\ 0 \\ 3 \end{pmatrix},\ \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}.$$

Say we need $x_1, x_2, x_3$ planks cut in the respective patterns. The columns of the basis matrix $B$ are the three patterns. Solving

$$\begin{pmatrix} 0 & 1 & 1 \\ 3 & 0 & 1 \\ 1 & 3 & 1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 12 \\ 20 \\ 32 \end{pmatrix}$$

gives $x_1 = 36/7$, $x_2 = 52/7$, $x_3 = 32/7$, and rounding up to the next integer we get $\bar{x}_1 = 6$, $\bar{x}_2 = 8$, $\bar{x}_3 = 5$, a total of 19 planks. Since the fractional sum $120/7$ is itself > 17, we want to see whether we can manage with 18 planks. To check this we have to find a strict improvement, and this must correspond to a new cutting pattern brought in and one of the above patterns eliminated. By the revised simplex method we solve $uB = c_B$, where $u$ is a row vector and $c_B$ is the row vector $[1, 1, 1]$. We get $u = [u_1, u_2, u_3] = \left[\tfrac{4}{7}, \tfrac{2}{7}, \tfrac{1}{7}\right]$. We now look for a cutting pattern $(a_1, a_2, a_3)^T$ such that

$$\tfrac{4}{7} a_1 + \tfrac{2}{7} a_2 + \tfrac{1}{7} a_3 > 1$$
$$7a_1 + 5a_2 + 4a_3 \le 19$$

That is

$$4a_1 + 2a_2 + a_3 > 7$$
$$7a_1 + 5a_2 + 4a_3 \le 19$$

The knapsack algorithm gives weights for $a_1, a_2, a_3$ as $\tfrac{4}{7}, \tfrac{2}{5}, \tfrac{1}{4}$ respectively. Thus choosing $a_1 = 2$, $a_2 = 1$, $a_3 = 0$ gives the new cutting pattern. Using the revised simplex method we determine which of the current patterns has to be dropped. To do this we need to solve $Bd = a$, that is,

$$\begin{pmatrix} 0 & 1 & 1 \\ 3 & 0 & 1 \\ 1 & 3 & 1 \end{pmatrix} \begin{pmatrix} d_1 \\ d_2 \\ d_3 \end{pmatrix} = \begin{pmatrix} 2 \\ 1 \\ 0 \end{pmatrix}.$$

We get

$$d_1 = -\tfrac{4}{7}, \quad d_2 = -\tfrac{5}{7}, \quad d_3 = \tfrac{19}{7}.$$

The new solution clearly has to scrap the last cutting pattern as $d_3$ alone is positive. Thus the new cutting patterns are

$$\begin{pmatrix} 7\ \text{feet} \\ 5\ \text{feet} \\ 4\ \text{feet} \end{pmatrix} \text{ using patterns } \begin{pmatrix} 0 \\ 3 \\ 1 \end{pmatrix},\ \begin{pmatrix} 1 \\ 0 \\ 3 \end{pmatrix},\ \begin{pmatrix} 2 \\ 1 \\ 0 \end{pmatrix}.$$

The new cuts, taken as integers near the fractional solution, are $\bar{x}_1 = 6$, $\bar{x}_2 = 9$, $\bar{x}_3 = 3$. This meets the demands and we waste (19)(18) − (12)(7) − (20)(5) − (32)(4) = 30 feet. Previously we wasted (19)(19) − (12)(7) − (20)(5) − (32)(4) = 49 feet.
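The three linear systems in this solution (the plank counts $x$, the multipliers $u$, and the direction $d$) can be checked in exact rational arithmetic. The sketch below assumes the basis matrix $B$ has the three original cutting patterns as its columns. The small `solve` helper is an illustrative Gauss-Jordan routine, not a revised simplex implementation.

```python
from fractions import Fraction

def solve(A, b):
    """Solve A x = b exactly by Gauss-Jordan elimination over the rationals."""
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        # Pick a row at or below the diagonal with a nonzero entry in this column.
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return [M[r][n] for r in range(n)]

# Columns of B are the patterns (0,3,1), (1,0,3), (1,1,1); rows are 7 ft, 5 ft, 4 ft.
B = [[0, 1, 1],
     [3, 0, 1],
     [1, 3, 1]]
demand = [12, 20, 32]

x = solve(B, demand)                                   # fractional plank counts
u = solve([list(col) for col in zip(*B)], [1, 1, 1])   # u B = c_B, solved as B^T u^T = c_B^T
new_pattern = [2, 1, 0]
pattern_value = sum(ui * ai for ui, ai in zip(u, new_pattern))  # above 1 means improvement
d = solve(B, new_pattern)                              # B d = a

print(x, u, pattern_value, d)
```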


Solution to Problem 8, stat 471

The LP problem

max 9x1 + 14x2 + 7x3

such that

2x1 + x2 + 3x3 ≤ 6

5x1 + 4x2 + x3 ≤ 12

0x1 + 2x2 + 0x3 ≤ 5

x1, x2, x3 unrestricted

has as its dual

min 6y1 + 12y2 + 5y3

such that

2y1 + 5y2 + 0y3 = 9

y1 + 4y2 + 2y3 = 14

3y1 + y2 + 0y3 = 7

y1, y2, y3 ≥ 0

The dual has the feasible solution $y_1 = 2$, $y_2 = 1$, $y_3 = 4$. Since $x_1 = \tfrac{5}{26}$, $x_2 = \tfrac{5}{2}$, $x_3 = \tfrac{27}{26}$ is feasible for the primal and gives the value of the objective function as 44, and so does the dual solution, the two are optimal for the two problems by the duality theorem.
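The feasibility and objective values quoted in this argument are easy to verify with exact fractions. The following is a plain arithmetic check of the stated primal and dual points, not an LP solver.

```python
from fractions import Fraction as F

c = [9, 14, 7]            # primal objective coefficients
A = [[2, 1, 3],
     [5, 4, 1],
     [0, 2, 0]]           # primal constraint rows (A x <= b)
b = [6, 12, 5]

x = [F(5, 26), F(5, 2), F(27, 26)]   # claimed primal optimum
y = [2, 1, 4]                        # claimed dual optimum

# Primal feasibility: A x <= b (x is unrestricted in sign).
primal_lhs = [sum(a * xj for a, xj in zip(row, x)) for row in A]
primal_feasible = all(lhs <= rhs for lhs, rhs in zip(primal_lhs, b))

# Dual feasibility: A^T y = c with y >= 0.
dual_lhs = [sum(A[i][j] * y[i] for i in range(3)) for j in range(3)]
dual_feasible = dual_lhs == c and all(yi >= 0 for yi in y)

primal_value = sum(cj * xj for cj, xj in zip(c, x))
dual_value = sum(bi * yi for bi, yi in zip(b, y))
print(primal_lhs, dual_lhs, primal_value, dual_value)
```

All three primal constraints hold with equality at this point, which is consistent with every dual variable being strictly positive.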

Solution to Problem 9, Stat 473

Given the payoff function

$$K(x, y) = |x - y| - |x - y|^2, \quad 0 \le x, y \le 1,$$

the expected payoff to player I using x while player II chooses a y at random using density $f(y) \equiv 1$, $0 \le y \le 1$, is given by

$$\int_0^1 K(x, y)\,dy = \int_0^1 |x - y|\,dy - \int_0^1 |x - y|^2\,dy = \int_0^x (x - y)\,dy + \int_x^1 (y - x)\,dy - \int_0^1 (x^2 - 2xy + y^2)\,dy$$

$$= \left[ xy - \frac{y^2}{2} \right]_0^x + \left[ \frac{y^2}{2} - xy \right]_x^1 - \left[ x^2 y - x y^2 + \frac{y^3}{3} \right]_0^1$$

$$= x^2 - \frac{x^2}{2} + \frac{1}{2} - x - \frac{x^2}{2} + x^2 - \left[ x^2 - x + \frac{1}{3} \right] = \frac{1}{6}.$$

By the symmetry of the payoff function, the same uniform strategy is optimal for player I, giving an expectation $\equiv \tfrac{1}{6}$ for all $0 \le y \le 1$. Thus the value of the game is $\tfrac{1}{6}$.
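As a quick numerical sanity check of the integral above, the sketch below approximates the expected payoff against the uniform strategy by a midpoint rule at a few arbitrarily chosen values of x. The step count is an arbitrary choice.

```python
def expected_payoff(x, steps=20000):
    """Midpoint-rule approximation of the integral of K(x, y) over y in [0, 1],
    where K(x, y) = |x - y| - |x - y|^2."""
    h = 1.0 / steps
    total = 0.0
    for j in range(steps):
        y = (j + 0.5) * h
        dist = abs(x - y)
        total += dist - dist * dist
    return total * h

# The expectation should not depend on x.
values = [expected_payoff(x) for x in (0.0, 0.25, 0.5, 0.8, 1.0)]
print(values)
```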


STAT 481 -Spring 2009 (Jing Wang)

A dairy company would like to investigate if the milk is contaminated. In order to investigate a possible shipment (batch) effect, the company selects 5 shipments at random. After processing each batch, 6 cartons of milk are selected at random and are stored for several days. Then the square root of the bacteria counts are recorded and denoted by $Y_{ij}$, where i = 1, ..., 5 and j = 1, ..., 6.

(a). If the shipment effect is denoted as τ, write down ANOVA model and specify distribution of random components in the model. Please state the hypotheses.

Solution: The shipment (batch) effect is a random effect. Hence a random-effect ANOVA model can be written as

$$Y_{ij} = \mu + \tau_i + \varepsilon_{ij},$$

where i = 1, ..., 5, j = 1, ..., 6. In addition, $\tau_i$ and $\varepsilon_{ij}$ are independently distributed with normal distributions $\tau_i \sim N(0, \sigma_\tau^2)$ and $\varepsilon_{ij} \sim N(0, \sigma^2)$, respectively, for any i and j.

Investigating whether the batch effect is significant is equivalent to testing

$$H_0: \sigma_\tau^2 = 0 \quad \text{against} \quad H_1: \sigma_\tau^2 > 0.$$

(b). Complete the following table and conclude based on 0.01 significance level.

Solution:

Sources     df    SS        MS       F
Shipment    4     803.0     200.8    9.01
Error       25    557.2     22.3
Total       29    1360.2

Since the observed statistic $F_o = 9.01 > F(0.01, 4, 25) = 4.177$, the test is significant at level α = 0.01, i.e. there is considerable variation among the shipments.

(c). Estimate the variance component(s).

Solution: The error variance can be estimated by the MSE directly,

$$\hat{\sigma}^2 = MS_{Error} = 22.3.$$

The variance component for the random effect is estimated as follows (the sample size for each batch is the same):

$$\hat{\sigma}_\tau^2 = \frac{1}{n} (MS_{Treatment} - MS_{Error}) = \frac{1}{6} (200.8 - 22.3) = 29.75.$$
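The table entries and variance-component estimates can be recomputed from the two sums of squares. The sketch below uses the shipment sum of squares 803.0 from the completed table and the critical value F(0.01, 4, 25) = 4.177 given in the exam. Variable names are illustrative.

```python
a, n = 5, 6              # number of shipments, cartons per shipment
ss_total = 1360.2
ss_shipment = 803.0      # value appearing in the completed table

# Degrees of freedom for the one-way random-effects layout.
df_shipment = a - 1
df_error = a * (n - 1)
df_total = a * n - 1

ss_error = ss_total - ss_shipment
ms_shipment = ss_shipment / df_shipment
ms_error = ss_error / df_error
f_stat = ms_shipment / ms_error

# Variance components: error variance and between-shipment variance.
sigma2_hat = ms_error
sigma2_tau_hat = (ms_shipment - ms_error) / n

reject = f_stat > 4.177  # critical value F(0.01, 4, 25) as given in the exam
print(df_shipment, df_error, df_total)
print(round(ms_shipment, 2), round(ms_error, 2), round(f_stat, 2))
print(round(sigma2_hat, 2), round(sigma2_tau_hat, 2), reject)
```

With unrounded mean squares the shipment variance component comes out near 29.74. The 29.75 in the solution follows from using the rounded values 200.8 and 22.3.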


10. In linear regression analysis, the leave-one-out cross validation (LOOCV) procedure is often used to estimate the prediction accuracy of the fitted regression model. Let $(X, Y) = \{(x_i, y_i);\ i = 1, \ldots, n\}$ be a dataset with $x_i = (1, x_{i1}, \ldots, x_{ip}) \in R^{p+1}$ and $y_i \in R$, and let $(X_{(-i)}, Y_{(-i)})$ be the modified dataset without the i-th row. A set of prediction values

$$\hat{y}_{(-i)} = x_i \left(X_{(-i)}^T X_{(-i)}\right)^{-1} X_{(-i)}^T Y_{(-i)}$$

are obtained and the LOOCV estimate is defined as

$$LOOCV = n^{-1} \sum_{i=1}^{n} e_{(-i)}^2 = n^{-1} \sum_{i=1}^{n} \left(y_i - \hat{y}_{(-i)}\right)^2.$$

(a) Show that

$$\left(X_{(-i)}^T X_{(-i)}\right)^{-1} = (X^T X)^{-1} + \frac{(X^T X)^{-1} x_i^T x_i (X^T X)^{-1}}{1 - x_i (X^T X)^{-1} x_i^T}.$$

Hint: $(A - zz^T)^{-1} = A^{-1} + \dfrac{A^{-1} z z^T A^{-1}}{1 - z^T A^{-1} z}$, assuming that $A - zz^T$ is invertible.

(b) Show that $e_{(-i)} = \dfrac{e_i}{1 - h_{ii}}$; $i = 1, \ldots, n$, where $H = X (X^T X)^{-1} X^T$, $h_{ii}$ is the i-th diagonal element of $H$, and $e = (e_1, \ldots, e_n)^T = (I - H) Y$.

[Solution]

(a) The equality follows immediately from the fact that $X_{(-i)}^T X_{(-i)} = X^T X - x_i^T x_i$ and the formula in the Hint (with $A = X^T X$ and $z = x_i^T$).

(b) Note that $X_{(-i)}^T Y_{(-i)} = X^T Y - x_i^T y_i$ and $h_{ii} = x_i (X^T X)^{-1} x_i^T$; we thus have

$$\begin{aligned}
e_{(-i)} &= y_i - \hat{y}_{(-i)} = y_i - x_i \left(X_{(-i)}^T X_{(-i)}\right)^{-1} X_{(-i)}^T Y_{(-i)} \\
&= y_i - x_i \left(X^T X - x_i^T x_i\right)^{-1} \left(X^T Y - x_i^T y_i\right) \\
&= y_i - x_i \left( (X^T X)^{-1} + \frac{(X^T X)^{-1} x_i^T x_i (X^T X)^{-1}}{1 - h_{ii}} \right) \left(X^T Y - x_i^T y_i\right) \\
&= y_i - x_i \left( (X^T X)^{-1} X^T Y - (X^T X)^{-1} x_i^T y_i + \frac{(X^T X)^{-1} x_i^T x_i (X^T X)^{-1} X^T Y}{1 - h_{ii}} - \frac{(X^T X)^{-1} x_i^T x_i (X^T X)^{-1} x_i^T y_i}{1 - h_{ii}} \right) \\
&= y_i - \left( x_i \hat{\beta} - h_{ii} y_i + \frac{h_{ii}}{1 - h_{ii}} x_i \hat{\beta} - \frac{h_{ii}^2}{1 - h_{ii}} y_i \right) \\
&= \frac{y_i}{1 - h_{ii}} - \frac{x_i \hat{\beta}}{1 - h_{ii}} = \frac{e_i}{1 - h_{ii}},
\end{aligned}$$

where $\hat{\beta} = (X^T X)^{-1} X^T Y$.
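The identity in (b) can be checked on a small example. The sketch below restricts to simple linear regression (rows $x_i = (1, x_i)$, so $X^T X$ is 2 by 2) and uses exact rational arithmetic, so the brute-force leave-one-out residuals and the shortcut $e_i/(1 - h_{ii})$ can be compared for exact equality. The data values are made up for illustration.

```python
from fractions import Fraction as F

def fit_line(xs, ys):
    """Least-squares intercept and slope for y = b0 + b1 * x (2x2 normal equations)."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    det = n * sxx - sx * sx
    b1 = (n * sxy - sx * sy) / det
    b0 = (sy - b1 * sx) / n
    return b0, b1

def leverage(xs, i):
    """h_ii = x_i (X^T X)^{-1} x_i^T for design rows x_i = (1, xs[i])."""
    n = len(xs)
    sx = sum(xs)
    sxx = sum(x * x for x in xs)
    det = n * sxx - sx * sx
    x = xs[i]
    # (X^T X)^{-1} = [[sxx, -sx], [-sx, n]] / det
    return (sxx - 2 * sx * x + n * x * x) / det

# Hypothetical data, chosen only to exercise the identity.
xs = [F(0), F(1), F(2), F(4), F(7)]
ys = [F(1), F(3), F(2), F(6), F(9)]

b0, b1 = fit_line(xs, ys)
shortcut, brute = [], []
for i in range(len(xs)):
    e_i = ys[i] - (b0 + b1 * xs[i])                   # ordinary residual
    shortcut.append(e_i / (1 - leverage(xs, i)))      # e_i / (1 - h_ii)
    c0, c1 = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
    brute.append(ys[i] - (c0 + c1 * xs[i]))           # refit without row i

loocv = sum(e * e for e in brute) / len(brute)
print(shortcut == brute, float(loocv))
```

The practical point of (b) is visible here: the shortcut needs a single fit on the full data, whereas the brute-force route refits the model n times.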

