
Solutions to Final1

Date post: 24-Feb-2018
Upload: joe-bloe

Transcript
  • 7/25/2019 Solutions to Final1


    Sample final exam questions.

    Important: When a question asks for a numerical expression, your answer should be an expression involving only numbers and operations on numbers. You do not have to simplify such an expression, but variables should not appear. For example, this is a numerical expression:

    $$891^2 + \frac{7.85 + 67.3/9}{147.21 + 93.5}$$

    and this is not:

    $$891x + \frac{7.85 + 67.3/9b}{147.21 + 93.5/\lambda}.$$

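    Problems (1) - (19) concern a random sample from the Gamma distribution, where one of the Gamma parameters is known. As a reminder, the Gamma distribution has pdf of the form

    $$f(x \mid \alpha, \lambda) = \begin{cases} \dfrac{\lambda^{\alpha}}{\Gamma(\alpha)}\, x^{\alpha-1} e^{-\lambda x} & \text{for } x > 0 \\ 0 & \text{otherwise.} \end{cases}$$

    This distribution has moments $\mu_1 = \alpha/\lambda$ and $\mu_2 = \alpha(\alpha+1)/\lambda^2$.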

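    Assume we have iid observations $X_1, \ldots, X_n$ whose distribution is Gamma with $\alpha$ known to be 2, and with $\lambda$ unknown, so that the pdf is of the form

    $$f(x \mid \lambda) = \begin{cases} \lambda^2 x e^{-\lambda x} & \text{for } x > 0 \\ 0 & \text{otherwise} \end{cases}$$

    where $\lambda > 0$ is unknown. It will help to know that $E[X_i] = 2/\lambda$ and $E[X_i^2] = 6/\lambda^2$.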
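The two stated moments can be checked against the general Gamma moment formulas $\mu_1 = \alpha/\lambda$ and $\mu_2 = \alpha(\alpha+1)/\lambda^2$ by substituting $\alpha = 2$. A short worked check, using only quantities given in the problem statement:

```latex
\[
E[X_i] = \mu_1 = \left.\frac{\alpha}{\lambda}\right|_{\alpha=2} = \frac{2}{\lambda},
\qquad
E[X_i^2] = \mu_2 = \left.\frac{\alpha(\alpha+1)}{\lambda^2}\right|_{\alpha=2}
         = \frac{2\cdot 3}{\lambda^2} = \frac{6}{\lambda^2}.
\]
```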

    (1) What is the expected value of $\bar X$ in terms of $\lambda$?

    (2) What is $\mathrm{Var}(X_i)$ in terms of $\lambda$?


    (3) What is $\mathrm{Var}(\bar X)$ in terms of $\lambda$ and $n$?

    (4) Give the delta method approximation to the bias of $\hat\theta$ = log( 2X ) as an estimator of the quantity $\theta = \log(2/\lambda)$.

    (5) Give the delta method approximation to the variance of the estimator $\hat\theta$ from the previous question.

    (6) Let $B$ denote your bias approximation in (4) and let $V$ denote your variance approximation in (5), and assume these are correct. Give an approximation to the mean squared error of $\hat\theta$ in terms of $B$ and $V$.

    (7) Which term in (6) is the more dominant term as $n$ goes to infinity? Circle one: the bias term / the variance term


    (8) Give a method of moments estimator of $\lambda$.

    (9) Write down the log-likelihood function.

    (10) What is the maximum likelihood estimator of $\lambda$?


    (11) Calculate the Fisher information $I(\lambda) = \mathrm{Var}\!\left(\frac{\partial}{\partial\lambda} \log f(X \mid \lambda)\right)$ for this situation.

    (12) Assume you have correctly calculated the Fisher information $I(\lambda)$ in (11). What is the limiting distribution of $\sqrt{n}\,(\hat\lambda - \lambda)$ as $n \to \infty$? Be specific about all parameters of the distribution. Just giving the name of the distribution will not suffice.

    (13) Give the form of an approximate 95% confidence interval for $\lambda$ based on the approximate distribution in (12). (Make sure that every term in your confidence interval is either known or estimated from the data.)

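    (14) Suppose $\lambda$ has prior distribution with pdf of the form

    $$\pi(\lambda \mid \beta) = \begin{cases} \beta e^{-\beta\lambda} & \text{for } \lambda > 0 \\ 0 & \text{otherwise} \end{cases}$$

    where $\beta$ is given. What is the posterior distribution of $\lambda$ given $X_1, \ldots, X_n$? If you can, be specific about the name of the distribution and its parameters.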


    (15) Suppose we wish to test the hypothesis $H_0: \lambda = 1$ vs. $H_A: \lambda \neq 1$. Write down the generalized likelihood ratio test statistic $\Lambda$.

    (16) In (15) do we reject $H_0$ for sufficiently small, or sufficiently large values of the test statistic $\Lambda$? Circle one: sufficiently small / sufficiently large

    (17) Under $H_0$, assuming $n$ large, what is the approximate distribution of $-2\log\Lambda$ in (15)?

    (18) In (17) what is the approximate expected value of $-2\log\Lambda$? (Give a number!)

    (19) Suppose the observed value of $-2\log\Lambda$ from the data is 5, and assume that under $H_0$ the approximate distribution of $-2\log\Lambda$ in (17) is continuous with pdf denoted by $f$. Describe a p-value for the test in (15) as an integral involving $f$. (You do not need to have correctly identified this distribution in (17) to answer this.)
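For the likelihood questions in this block, a small simulation can serve as a sanity check on a hand derivation. The sketch below assumes the pdf $\lambda^2 x e^{-\lambda x}$ from the problem statement, so the log-likelihood is $2n\log\lambda + \sum\log x_i - \lambda\sum x_i$ and its stationary point is $2n/\sum x_i$. The function names, the seed, and the value `true_lambda = 3.0` are illustrative choices, not part of the exam.

```python
import math
import random

def log_likelihood(lam, xs):
    """Log-likelihood of lam for iid draws with pdf lam^2 * x * exp(-lam * x)."""
    n = len(xs)
    return 2 * n * math.log(lam) + sum(math.log(x) for x in xs) - lam * sum(xs)

def mle_lambda(xs):
    """Stationary point of the log-likelihood: 2n/lam - sum(x) = 0."""
    return 2 * len(xs) / sum(xs)

random.seed(0)                 # fixed seed so the run is repeatable
true_lambda = 3.0              # hypothetical value chosen for the demo
# random.gammavariate takes (shape, scale), and scale = 1 / rate
sample = [random.gammavariate(2.0, 1.0 / true_lambda) for _ in range(20000)]

lam_hat = mle_lambda(sample)
print(f"lambda_hat = {lam_hat:.3f}")

# The closed form should beat nearby values on the log-likelihood
for nearby in (0.9 * lam_hat, 1.1 * lam_hat):
    assert log_likelihood(lam_hat, sample) > log_likelihood(nearby, sample)
```

With a sample this large the estimate should land close to the value used to generate the data, which is what the large-sample theory in (12) predicts.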


    Problems (20) - (29) concern the following situation. We have two independent samples $X_1, \ldots, X_m$ and $Y_1, \ldots, Y_n$. The $X_i$ are iid $N(\mu_X, \sigma^2)$ and the $Y_i$ are iid $N(\mu_Y, \sigma^2)$ with parameters $\mu_X$, $\mu_Y$, and $\sigma$ unknown.

    (20) What is the expected value of $\bar X - \bar Y$ in terms of the unknown parameters?

    (21) Are $\bar X$ and $\bar Y$ independent? Circle one: Yes/No

    (22) What is the variance of $\bar X - \bar Y$ in terms of the unknown parameters?

    (23) Does $\bar X - \bar Y$ have a normal distribution? Circle one: Yes/No

    (24) What is the distribution of $\sum_{i=1}^{m} (X_i - \bar X)^2 / \sigma^2$?

    (25) Are $\sum_{i=1}^{m} (X_i - \bar X)^2$ and $\sum_{i=1}^{n} (Y_i - \bar Y)^2$ independent? Circle one: Yes/No

    (26) What is the distribution of

    $$\frac{\sum_{i=1}^{m} (X_i - \bar X)^2 + \sum_{i=1}^{n} (Y_i - \bar Y)^2}{\sigma^2}\,?$$


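    (27) Are $\bar X - \bar Y$ and $\sum_{i=1}^{m} (X_i - \bar X)^2 + \sum_{i=1}^{n} (Y_i - \bar Y)^2$ independent? Circle one: Yes/No

    (28) Assuming $\mu_X = \mu_Y$, what is the distribution of the quantity

    $$\frac{\bar X - \bar Y}{s_P \sqrt{1/m + 1/n}}$$

    where

    $$s_P = \sqrt{\frac{\sum_{i=1}^{m} (X_i - \bar X)^2 + \sum_{i=1}^{n} (Y_i - \bar Y)^2}{m + n - 2}}\,?$$

    Be precise!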

    (29) Suppose $m = 3$ and $n = 4$ and when we order the $X_i$ and $Y_j$ in the combined sample we find that the ordering is as follows:

    $$X_3 < X_1 < Y_3 < X_2 < Y_1 < Y_4 < Y_2$$

    Give an exact p-value for testing $H_0: \mu_X = \mu_Y$ vs. $H_A: \mu_X < \mu_Y$ using the Wilcoxon-Mann-Whitney rank sum test.
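With samples this small, the exact null distribution of the rank sum in (29) can be enumerated directly. The sketch below is one way to do that by brute force: it takes the ordering stated in the problem, treats every choice of 3 ranks out of 7 as equally likely under $H_0$, and counts the choices whose rank sum is at most the observed one (small sums for the $X$ sample being the direction of $H_A$). Variable names are illustrative.

```python
from fractions import Fraction
from itertools import combinations

# Combined ordering from the problem, smallest to largest
ordering = ["X3", "X1", "Y3", "X2", "Y1", "Y4", "Y2"]

# Ranks (1-based) held by the X observations, and their sum
x_ranks = [rank for rank, label in enumerate(ordering, start=1)
           if label.startswith("X")]
observed = sum(x_ranks)

# Under H0 every set of m = 3 ranks out of m + n = 7 is equally likely
all_sums = [sum(c) for c in
            combinations(range(1, len(ordering) + 1), len(x_ranks))]

# H_A: mu_X < mu_Y, so small X rank sums count as evidence against H0
p_value = Fraction(sum(s <= observed for s in all_sums), len(all_sums))
print(observed, p_value)  # 7 2/35
```

Only the rank sets {1, 2, 3} and {1, 2, 4} have a sum of 7 or less, which is where the numerator 2 comes from.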


    Problems (30) - (35) concern the following situation. A study is conducted in order to understand differences in responses to a certain drug used to reduce the size of liver tumors in rats. Genotypes at two distinct genetic loci on different chromosomes are thought to be relevant to response. The genotype of a rat is either aa, aA, or AA at the first locus, and the genotype is bb, bB, or BB at the second. A study is conducted in which 2 rats having liver tumors are sampled from the population of rats with each possible genotype combination and the percentage reduction ($Y$) in tumor size as a result of treatment is determined for each.

    Here are the data (left panel) and the summary of the corresponding ANOVA table (with some entries missing) obtained when using the R commands:

    [data panel and R commands not recoverable from the transcript]

                Df Sum Sq Mean Sq F value   Pr(>F)
    geno1        2   2197    1098   55.05 8.96e-06
    geno2        2   4230    2115  106.01 5.55e-07
    geno1:geno2  4    100      25    1.25    0.357
    Residuals         180

    If the genotypes are numbered 1, 2, 3 for aa, aA, and AA, and for bb, bB, BB, let $Y_{ijk}$ denote the reduction obtained for the $k$-th rat drawn from the population with genotype $i$ at the first locus, and $j$ at the second locus.

    (30) Write down a numerical expression (i.e. an expression involving only numbers and operations on them) for

    $$\sum_{i=1}^{3} \sum_{j=1}^{3} \sum_{k=1}^{2} (Y_{ijk} - \bar Y)^2.$$


    (31) Assuming that the data follow the standard two-way ANOVA model with interactions

    $$Y_{ijk} = \mu + \alpha_i + \beta_j + \delta_{ij} + e_{ijk}, \quad \text{for } i = 1, 2, 3,\; j = 1, 2, 3,\; k = 1, 2,$$

    where $\mu$, $\alpha_i$, $\beta_j$, and $\delta_{ij}$ are unknown constants, what constraint do we typically assume the constants $\alpha_1$, $\alpha_2$, and $\alpha_3$ satisfy?

    (32) In the ANOVA table what are the missing values of Residual Df and Residual Mean Sq?

    Residual Df =

    Residual Mean Sq =

    (33) Give an unbiased estimator of $\sigma^2$.

    (34) Use the ANOVA table to give a p-value for testing the null hypothesis of additivity of the effects of the two genes vs. the alternative hypothesis of non-additivity.

    (35) Suppose instead we fit an ANOVA model without the $\delta_{ij}$ terms, i.e. assume

    $$Y_{ijk} = \mu + \alpha_i + \beta_j + e_{ijk} \quad \text{for } i = 1, 2, 3,\; j = 1, 2, 3,\; k = 1, 2.$$

    How many degrees of freedom would we have for estimating $\sigma^2$? (Your answer should be a number.)
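The degrees-of-freedom bookkeeping behind (30), (32), and (35) can be sketched as plain arithmetic on the printed ANOVA table. The snippet below assumes the balanced design described in the problem (3 x 3 genotype combinations, 2 rats each) and reads the lone Residuals entry, 180, as the residual sum of squares. The dictionary layout and variable names are illustrative.

```python
# Entries read off the ANOVA table: (Df, Sum Sq); None marks a missing entry
table = {
    "geno1": (2, 2197),
    "geno2": (2, 4230),
    "geno1:geno2": (4, 100),
    "Residuals": (None, 180),
}

n_obs = 3 * 3 * 2                                  # 9 genotype cells, 2 rats each
total_ss = sum(ss for _, ss in table.values())     # total SS about the grand mean
total_df = n_obs - 1

model_df = sum(df for df, _ in table.values() if df is not None)
resid_df = total_df - model_df
resid_ms = table["Residuals"][1] / resid_df

# Dropping the interaction pools its Df into the residual
additive_resid_df = resid_df + table["geno1:geno2"][0]

print(total_ss, resid_df, resid_ms, additive_resid_df)  # 6707 9 20.0 13
```

As a cross-check, the interaction mean square of 25 divided by this residual mean square gives 1.25, matching the F value printed in the table.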


    Questions (36) - (47) concern the following situation. Experiments are carried out to determine the concentration of a certain chemical additive that will produce the strongest possible material when that material is treated with the additive. In each of 11 experimental trials, the concentration of the additive is varied and the resulting material strength is determined. The data are as in the following table:

    conc   strength
     5.0     5.74
     5.5     8.30
     6.0     7.47
     6.5    10.14
     7.0     9.46
     7.5     9.77
     8.0     8.77
     8.5     8.75
     9.0     5.23
     9.5     4.18
    10.0     1.78

    An additional column of squared concentrations called conc.sq is added to the dataset, and then a linear model of the form

    $$\text{Strength}_i = \beta_0 + \beta_1\,\text{conc}_i + \beta_2\,\text{conc.sq}_i + e_i$$

    is fitted to the data, with the usual assumptions (in particular the $e_i$ are independent with $e_i \sim N(0, \sigma^2)$), and the output is as given in the following table:

    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept) -36.6280     5.6265  -6.510 0.000186 ***
    conc         13.1614     ______   8.510 2.79e-05 ***
    conc.sq      -0.9334     0.1027  -9.092 1.72e-05 ***
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error: 0.7518 on 8 degrees of freedom
    Multiple R-squared: 0.936, Adjusted R-squared: 0.92
    F-statistic: 58.5 on 2 and 8 DF, p-value: 1.678e-05

    (36) Use the table to give a (numerical) unbiased estimate of $\beta_1$.


    (37) Give a p-value for testing the null hypothesis that the relationship between concentration and expected strength is actually a linear one, vs. the alternative that the relationship is nonlinear.

    (38) Give a numerical prediction of the strength obtained when the concentration is 5.

    (39) Give a numerical expression for the residual obtained when the concentration is 5.

    (40) Give a numerical expression for the sum of squared residuals.

    (41) Give a numerical value for the square of the correlation coefficient between the observed strengths and the fitted values.

    (42) Give a numerical estimate of the concentration that gives rise to the highest value of expected strength.

    (43) Is the answer in (42) unbiased? Circle one (YES/NO)


    (44) What is the numerical value for the conc Std. Error that is missing from the table summarizing the model fit?

    (45) True or False. The expected value of $e_i^2$ is $\sigma^2$.

    (46) True or False. The expected value of the square of the $i$-th residual $\hat e_i^2$ is $\sigma^2$.

    (47) Write down the design/model matrix for fitting the model.
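Several of the regression questions reduce to arithmetic on numbers already printed in the model summary. The sketch below re-uses only those printed values (coefficient estimates, the conc t value, the residual standard error and its degrees of freedom) plus the observed strength 5.74 at concentration 5. It relies on the standard identities t = estimate / std. error and SSE = Df x s^2. Variable names are illustrative.

```python
# Values copied from the regression output and the data table
b0, b1, b2 = -36.6280, 13.1614, -0.9334   # coefficient estimates
t_conc = 8.510                             # t value reported for conc
resid_se, resid_df = 0.7518, 8             # residual standard error and its Df

def predict(conc):
    """Fitted strength from the quadratic model at a given concentration."""
    return b0 + b1 * conc + b2 * conc ** 2

pred_at_5 = predict(5)             # question (38)
resid_at_5 = 5.74 - pred_at_5      # question (39): observed minus fitted
sse = resid_df * resid_se ** 2     # question (40): SSE = Df * s^2
conc_peak = -b1 / (2 * b2)         # question (42): vertex of the fitted parabola
se_conc = b1 / t_conc              # question (44): t = estimate / std. error

print(round(pred_at_5, 3), round(resid_at_5, 3), round(sse, 3),
      round(conc_peak, 3), round(se_conc, 3))
```

Because the printed estimates are rounded, the results are only as accurate as the table itself. The peak concentration is a nonlinear function of the coefficient estimates, which is the point question (43) probes.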

    (48) Generally speaking, for estimating $\sigma^2$ using an estimator $\hat\sigma^2$ with $\nu\hat\sigma^2/\sigma^2 \sim \chi^2_\nu$ for some number of degrees of freedom $\nu$, is it better to have less or more degrees of freedom? Circle one answer: more/less

    (49) True or False. If 5 confidence intervals are constructed for 5 different parameters, and each confidence interval has 99% coverage probability, then the chance that all of the intervals contain their respective true values is at least 95%.

    (50) True or False. Based on an iid sample $X_1, \ldots, X_n$ from the $N(\mu, 1)$ distribution with $\mu$ unknown, consider the level $\alpha$ test that rejects $H_0: \mu = 0$ vs. $H_A: \mu > 0$ provided $\bar X > z(\alpha)/\sqrt{n}$. The power of this test does not depend on the true value of $\mu$.

