Date post: 06-May-2018
STAT 211 Section 506, Spring 2006

Exam I-Form A

Last Name : First Name: Student ID :

DO NOT OPEN THIS EXAM UNTIL YOU ARE INSTRUCTED TO DO SO

• If there is no correct answer or if multiple answers are correct, select the best answer.

• Hand in both the exam form and the scantron.

• Make sure your name is on both the exam form and the scantron.

• Make sure you mark the appropriate letter for your exam form on your scantron.

• Each question is worth 5 points for a total of 100 points possible.

• If you are caught cheating or helping someone cheat on this exam, you both will receive a grade of ZERO on the exam. The work you submit must be your own.

Good Luck!


1. The inferential branch of statistics deals with:

A. Tables and graphs.

B. Generalization from a sample to a population.

C. Deductive statistics.

D. Discrete and continuous variables.

E. Univariate data sets.

Answer: B. Statistics has two branches: descriptive statistics and inferential statistics. The former refers to techniques for summarizing and describing data, and the latter refers to techniques for generalizing from a sample to a population.

The following is a histogram of the number of hits per nine-inning baseball game.

2. According to the above histogram, given a random nine-inning baseball game, approximately what is the proportion of games having 9 or 10 hits?

A. 10%

B. 12%

C. 22%

D. 80%

E. 40%

Answer: C. Proportion is the same as relative frequency. 9 hits has a relative frequency a little above 10%, and 10 hits has a relative frequency around 10%. So the best approximation is 22%.

3. How would you describe the above histogram?

A. Bimodal and symmetric

B. Unimodal and positively skewed

C. Bimodal and positively skewed

D. Unimodal and negatively skewed

E. Bimodal and negatively skewed

Answer: B. A histogram with a single peak is called unimodal; one with two peaks is called bimodal. A histogram is positively skewed if its longer tail extends toward larger values, and negatively skewed if its longer tail extends toward smaller values.

How does the speed of a runner vary over the course of a marathon (42.195 km)? Consider determining both the time to run the first 5 km and the time to run between the 35 km and 40 km points, and then subtracting the former time from the latter time. A positive value of this difference corresponds to a runner slowing down towards the end of the race. The histogram below is based on such times.

4. For the above histogram:

A. The mean is about the same as the median.

B. The mean is larger than the median.

C. The median is larger than the mean.

D. The range is smaller than the fourth spread.

E. The range is smaller than the interquartile range.

Answer: B. When a histogram is symmetric, the median equals the mean. Here the histogram is positively skewed, so the mean is larger than the median due to the effect of large observations in the data.

A company utilizes two different machines to manufacture parts of a certain type. During a single shift, a sample of n = 20 parts produced by each machine is obtained, and the value of a particular critical dimension for each part is determined. The comparative boxplot is shown below.

[Figure: comparative boxplot titled "Boxplot of Machine 1, Machine 2"; vertical axis labeled "Data", with ticks from 50 to 85.]

5. Which of the following statements best conforms to the above boxplots?

A. Machine 1 is more consistent.

B. Both machines are about equally consistent.

C. Machine 2 has a greater median critical dimension.

D. Machine 2 is more consistent.

E. Machine 2 produces the parts with the greatest critical dimension.

Answer: D. A machine with less variability in the values of the critical dimension is more consistent.

The following is a stem-and-leaf display of a simulated sample from a population distributed as lognormal(µ=0, σ=1). (The values are rounded to one decimal place.)

    The decimal point is at the |

    0 | 1355699
    1 | 28
    2 |
    3 |
    4 |
    5 |
    6 | 2

The next three problems are based on the above stem-and-leaf display.

6. Calculate the sample mean and sample standard deviation of the values in the simulated sample above.

A. x̄ = 0, s = 1.00

B. x̄ = 1.65, s = 2.16

C. x̄ = 1.65, s = 4.67

D. x̄ = 1.30, s = 1.69


E. x̄ = 1.30, s = 1.79

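Answer: E. Note that one can recover the original data from a stem-and-leaf display. Here we read off the data as 0.1, 0.3, 0.5, 0.5, 0.6, 0.9, 0.9, 1.2, 1.8, 6.2. These sum to 13.0, so x̄ = 13.0/10 = 1.30, and s = √(28.8/9) ≈ 1.79, where 28.8 is the sum of squared deviations from x̄.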
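As a quick check of the arithmetic, the following Python sketch recomputes both statistics from the ten values read off the display. It uses only the standard library, and the variable names are illustrative rather than part of the exam.

```python
import statistics

# Data read off the stem-and-leaf display (stem = ones digit, leaf = tenths digit).
data = [0.1, 0.3, 0.5, 0.5, 0.6, 0.9, 0.9, 1.2, 1.8, 6.2]

mean = statistics.mean(data)   # sample mean
sd = statistics.stdev(data)    # sample standard deviation (divides by n - 1)

print(f"mean = {mean:.2f}, s = {sd:.2f}")
```

Note that `statistics.stdev` uses the n − 1 divisor, which is what distinguishes option E (1.79) from option D (1.69, the n divisor).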

7. For the above simulated data, compute the 10% trimmed mean.

A. 1.65

B. 0.75

C. 1.30

D. 0.84

E. 0.77

Answer: D. The 10% trimmed mean is the average of what is left over after removing 10% of the observations from either end of the ordered sequence. Here we remove 1 observation from either end to get 0.3, 0.5, 0.5, 0.6, 0.9, 0.9, 1.2, 1.8, whose average is 6.7/8 ≈ 0.84.
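The trimming step can be sketched in a few lines of Python. The helper `trimmed_mean` is a hypothetical name introduced here, and the sketch assumes the trimming proportion times the sample size is a whole number, as it is for 10% of 10 observations; other cases would need interpolation, which is not handled.

```python
def trimmed_mean(values, proportion):
    """Average after dropping `proportion` of the observations from each end."""
    ordered = sorted(values)
    k = int(len(ordered) * proportion)   # observations dropped from each end
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)

data = [0.1, 0.3, 0.5, 0.5, 0.6, 0.9, 0.9, 1.2, 1.8, 6.2]
result = trimmed_mean(data, 0.10)
print(round(result, 2))
```

Dropping the single extreme value 6.2 pulls the average well below the untrimmed mean of 1.30, which illustrates how sensitive the ordinary mean is to the outlier.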

8. If we randomly select another observation from the same population, what is the probability that the new observation will fall in the observed range? (Hint: use the relationship of the lognormal distribution to the normal distribution.)

A. 0.05

B. 0.95

C. 0.75

D. 0.50

E. Need more information to determine.

Answer: B. The observed range of the data is (0.1, 6.2), so we compute P(0.1 < X < 6.2). Since X is distributed as lognormal(µ=0, σ=1), log X is standard normal, and P(0.1 < X < 6.2) = P(log 0.1 < log X < log 6.2) = P(−2.30 < Z < 1.82) ≈ 0.95.
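The same probability can be checked without a table. This is a minimal sketch using `statistics.NormalDist` from the Python standard library; because it does not round the logarithms to two decimals, the result (about 0.955) differs slightly in the third decimal from the table-based value.

```python
from math import log
from statistics import NormalDist

# log X is standard normal because mu = 0 and sigma = 1 on the log scale.
z = NormalDist()
low, high = 0.1, 6.2   # observed range of the sample

prob = z.cdf(log(high)) - z.cdf(log(low))
print(f"{prob:.3f}")
```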

9. Suppose X is some random variable having mean 0 and standard deviation 10. Define a new random variable

Y = −1 ∗ (X + 10)

Calculate the mean and the standard deviation of Y .

A. µy = −10, σy = 10

B. µy = 0, σy = 100

C. µy = −10, σy = −10

D. µy = 0, σy = 10

E. Can’t say—we have to know the probability density function of Y .

Answer: A. E[Y] = E[−1 ∗ (X + 10)] = −E[X + 10] = −E[X] − 10 = −10. Var(Y) = Var[−1 ∗ (X + 10)] = (−1)² Var[X + 10] = Var[X] = 10². Thus σy = 10.
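The rules used here hold for any linear transformation Y = aX + b: the mean becomes aµ + b and the standard deviation becomes |a|σ. A small sketch, with a hypothetical helper name, applies them to Y = −X − 10 (so a = −1, b = −10).

```python
def linear_transform_moments(mu_x, sigma_x, a, b):
    """Mean and standard deviation of Y = a * X + b."""
    mu_y = a * mu_x + b
    sigma_y = abs(a) * sigma_x   # a standard deviation is never negative
    return mu_y, sigma_y

mu_y, sigma_y = linear_transform_moments(mu_x=0, sigma_x=10, a=-1, b=-10)
print(mu_y, sigma_y)
```

The absolute value is what rules out option C: multiplying by −1 flips the sign of the mean but not of the standard deviation.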

10. A particular airline has 10 A.M. flights from Chicago to NY, Atlanta, and LA. Let A denote the event that the NY flight is full and define events B and C analogously for the other two flights. Suppose P(A)=0.6, P(B)=0.5, P(C)=0.4, and the three events are independent. What is the probability that only the NY flight is full?

A. 0.12

B. 0.88

C. 0.18

D. 0.10

E. 0.60

Answer: C. The event "only the NY flight is full" = A ∩ B^c ∩ C^c, where B^c and C^c denote the complements of B and C, respectively. Then the desired probability = P(A ∩ B^c ∩ C^c) = P(A)P(B^c)P(C^c) = 0.6 · (1 − 0.5) · (1 − 0.4) = 0.18. Here P(A ∩ B^c ∩ C^c) factors because the three events A, B, and C are independent.


11. One percent of all individuals in a certain population are carriers of a particular disease. A diagnostic test for this disease has a 90% detection rate for carriers and a 5% detection rate for noncarriers. Suppose the test is applied independently to two different blood samples from the same randomly selected individual. What is the probability that both tests yield a positive result? (Hint: The two events "The randomly selected person is a carrier" and "The randomly selected person is a noncarrier" are mutually exclusive and exhaustive events.)

A. 0.01

B. 0.89

C. 0.90

D. 0.95

E. 0.06

Answer: A. Let A = "The randomly selected person is a carrier"; then A^c = "The randomly selected person is a noncarrier". Let B = "both tests yield a positive result". By the law of total probability, P(B) = P(B|A)P(A) + P(B|A^c)P(A^c). Because the two tests are applied independently, P(B|A) = 0.9² and P(B|A^c) = 0.05², so P(B) = 0.9² ∗ 0.01 + 0.05² ∗ (1 − 0.01) ≈ 0.01.
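The law of total probability used in this solution translates directly into code. The sketch below uses illustrative variable names and assumes, as the problem states, that the two tests are independent given the person's carrier status.

```python
p_carrier = 0.01                 # P(A)
p_pos_given_carrier = 0.90       # detection rate for carriers
p_pos_given_noncarrier = 0.05    # detection rate for noncarriers

# Two independent tests on the same person: square each conditional rate.
p_both_given_carrier = p_pos_given_carrier ** 2
p_both_given_noncarrier = p_pos_given_noncarrier ** 2

# Condition on carrier status, then average over the two cases.
p_both = (p_both_given_carrier * p_carrier
          + p_both_given_noncarrier * (1 - p_carrier))
print(round(p_both, 4))
```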

12. When circuit boards used in the manufacture of compact disc players are tested, the long-run percentage of defectives is 5%. Among 25 randomly selected boards, what is the expected value of the number of defective boards?

A. 0.05

B. 0.5

C. 1.19

D. 1.25

E. 12.5

Answer: D. The number of defective boards among the 25 randomly selected boards is a random variable with a binomial distribution with n = 25 and p = 0.05. The expected number of defective boards is np = 25 ∗ 0.05 = 1.25.

13. A geologist has collected 10 specimens of basaltic rock and 10 specimens of granite. The geologist instructs a lab assistant to randomly select 15 of the specimens for analysis. What is the distribution of the number of granite specimens selected for analysis?

A. Binomial

B. Hypergeometric

C. Poisson

D. Negative binomial

E. Geometric

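Answer: B. The 15 specimens are drawn without replacement from a finite population of 20 specimens, 10 of which are granite, so the number of granite specimens selected is hypergeometric. Refer to the definition of the hypergeometric distribution in the textbook.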
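To make the distribution concrete, the following sketch tabulates the hypergeometric pmf for this setup (population N = 20, M = 10 granite, sample size n = 15) using `math.comb`. The function name and parameter letters are illustrative choices, not notation taken from the exam.

```python
from math import comb

def hypergeom_pmf(x, n, M, N):
    """P(X = x): x successes in a sample of n drawn without replacement
    from a population of N items containing M successes."""
    return comb(M, x) * comb(N - M, n - x) / comb(N, n)

n, M, N = 15, 10, 20
# At least 5 granite specimens must be chosen, since only 10 are basalt.
support = range(max(0, n - (N - M)), min(n, M) + 1)
pmf = {x: hypergeom_pmf(x, n, M, N) for x in support}

total = sum(pmf.values())
mean = sum(x * p for x, p in pmf.items())
for x, p in pmf.items():
    print(x, round(p, 4))
```

The probabilities sum to 1 and the mean equals nM/N = 7.5, as the hypergeometric formulas predict.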

14. An insurance company offers its policyholders a number of different premium payment options. For a randomly selected policyholder, let X = the number of months between successive payments. The CDF of X is as follows.

\[
F(x) =
\begin{cases}
0 & \text{if } x < 1 \\
0.36 & \text{if } 1 \le x < 3 \\
0.47 & \text{if } 3 \le x < 4 \\
0.51 & \text{if } 4 \le x < 6 \\
0.91 & \text{if } 6 \le x < 12 \\
1 & \text{if } x \ge 12
\end{cases}
\]

Calculate P (3 ≤ X ≤ 6)

A. 0.55

B. 0.51

C. 0.15


D. 0.11

E. 0.47

Answer: A. X takes only the values 1, 3, 4, 6, and 12, so P(X < 3) = F(2) = 0.36 and P(3 ≤ X ≤ 6) = F(6) − F(2) = 0.91 − 0.36 = 0.55.
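An equivalent route is to recover the pmf from the jumps of the CDF and add up the probabilities of the values between 3 and 6. The sketch below does this in Python; the list `cdf_jumps` is an illustrative encoding of the CDF given in the problem.

```python
# (value, F(value)) at each jump point of the CDF.
cdf_jumps = [(1, 0.36), (3, 0.47), (4, 0.51), (6, 0.91), (12, 1.00)]

# The pmf at each value is the size of the jump in F at that value.
pmf = {}
previous = 0.0
for x, cumulative in cdf_jumps:
    pmf[x] = cumulative - previous
    previous = cumulative

prob = sum(p for x, p in pmf.items() if 3 <= x <= 6)
print(round(prob, 2))
```

Summing the jumps at 3, 4, and 6 (0.11 + 0.04 + 0.40) gives the same 0.55 and makes clear why the jump at 3 must be included.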

15. Let X denote the amount of time for which a book on a 2-hour reserve at a college library is checked out by a randomly selected student. The CDF of checkout duration X is

\[
F(x) =
\begin{cases}
0 & \text{if } x < 0 \\
\dfrac{x^2}{4} & \text{if } 0 \le x < 2 \\
1 & \text{if } x \ge 2
\end{cases}
\]

If the borrower is charged an amount h(X) = 3X², compute the expected charge E(h(X)).

A. 18.0

B. 2.0

C. 6.0

D. 2.6

E. 4.0

Answer: C. First compute the pdf of X: f(x) = F′(x) = x/2 if 0 ≤ x < 2 (and 0 otherwise). By definition,

\[
E(h(X)) = E(3X^2) = \int_0^2 3x^2 \cdot f(x)\,dx = \int_0^2 3x^2 \cdot \frac{x}{2}\,dx = 6.
\]
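The integral can also be approximated numerically as a sanity check. The sketch below uses a simple midpoint rule written from scratch; the function names and the step count are arbitrary illustrative choices.

```python
def pdf(x):
    """pdf of checkout duration: f(x) = x / 2 on [0, 2), zero elsewhere."""
    return x / 2 if 0 <= x < 2 else 0.0

def charge(x):
    """Charge h(x) = 3 x^2."""
    return 3 * x ** 2

def expected_value(h, f, a, b, steps=10_000):
    """Approximate the integral of h(x) * f(x) over [a, b] by the midpoint rule."""
    width = (b - a) / steps
    midpoints = (a + (i + 0.5) * width for i in range(steps))
    return sum(h(x) * f(x) for x in midpoints) * width

expected_charge = expected_value(charge, pdf, 0, 2)
print(round(expected_charge, 3))
```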

16. Find the 75th percentile of the distribution of checkout duration in the previous problem.

A. 1

B. 1.41

C. 1.73

D. 2

E. Cannot determine.

Answer: C. Let x_{0.75} denote the 75th percentile of the distribution. Then we solve the following equation for x_{0.75}:

F(x_{0.75}) = x_{0.75}²/4 = 0.75,

which gives us x_{0.75} = √3 ≈ 1.73.
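Because F(x) = x²/4 can be inverted in closed form on [0, 2), any percentile of this distribution follows from one line of algebra. The sketch below wraps that inversion in a hypothetical helper named `percentile`.

```python
from math import sqrt

def percentile(p):
    """Invert F(x) = x^2 / 4 on [0, 2): solve x^2 / 4 = p for x."""
    if not 0 <= p <= 1:
        raise ValueError("p must lie in [0, 1]")
    return sqrt(4 * p)

x_75 = percentile(0.75)
print(round(x_75, 2))
```

The same helper gives the median as `percentile(0.5)`, which is √2 ≈ 1.41 (option B, a distractor for this question).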

17. Find the interquartile range (75th percentile − 25th percentile) for the standard normal distribution using Table A.3. Interpolate where appropriate.

A. 1.350

B. 0.675

C. 0.773

D. 1.546

E. 1.196

Answer: A. The 75th percentile is 0.674, and by symmetry of the standard normal density function, the 25th percentile is −0.674. Thus the interquartile range is 1.348 ≈ 1.35.
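For readers without the table at hand, the quartiles can be obtained from the inverse CDF in Python's standard library. This is only a cross-check of the table lookup; small differences in the third decimal relative to the interpolated value 1.348 are expected.

```python
from statistics import NormalDist

std_normal = NormalDist()          # mean 0, standard deviation 1
q25 = std_normal.inv_cdf(0.25)     # 25th percentile
q75 = std_normal.inv_cdf(0.75)     # 75th percentile
iqr = q75 - q25

print(round(q75, 3), round(iqr, 3))
```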

18. Suppose the force acting on a column that helps to support a building is normally distributed with mean 15.0 kips and standard deviation 1.25 kips. What is the probability that the force differs from 15.0 kips by at most 2 standard deviations?

A. 0.954

B. 0.977

C. 0.046


D. 0.023

E. 1

Answer: A. Let X denote the force acting on the column. Since X is Normal(µ = 15, σ = 1.25), Z = (X − 15)/1.25 is Normal(0, 1). Thus

P(|X − 15| < 2 ∗ 1.25) = P(|Z| < 2) = Φ(2) − Φ(−2) ≈ 0.954
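The standardization step implies that the answer depends only on the number of standard deviations, not on µ or σ. The sketch below, with a hypothetical helper `prob_within`, illustrates this by computing the probability once with the column-force parameters and once for the standard normal.

```python
from statistics import NormalDist

def prob_within(k, mu=0.0, sigma=1.0):
    """P(|X - mu| < k * sigma) for X ~ Normal(mu, sigma)."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)

p_force = prob_within(2, mu=15.0, sigma=1.25)   # the column-force problem
p_standard = prob_within(2)                     # same k, standard normal

print(round(p_force, 3), round(p_standard, 3))
```

Calling `prob_within(1.5)` gives the probability of falling within 1.5 standard deviations in the same way, for any normal distribution.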

19. Let X have a binomial distribution with parameters n = 25 and p = 0.5. Calculate P(15 ≤ X ≤ 20) using the exact distribution from Appendix Table A.1 and using the normal approximation with continuity correction. Report the absolute value of the approximation error.

A. 0.0430

B. 0.0540

C. 0.0962

D. 0.0008

E. 0.0616

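Answer: D. X is distributed as Binomial(n = 25, p = 0.5). From Table A.1, P(15 ≤ X ≤ 20) = P(X ≤ 20) − P(X ≤ 14) = 1.000 − 0.788 = 0.212. Since Binomial(n = 25, p = 0.5) can be approximated by the normal distribution with mean µ = np = 12.5 and variance σ² = np(1 − p) = 6.25, we apply the continuity correction and approximate P(15 ≤ X ≤ 20) by

\[
P(15 - 0.5 \le X \le 20 + 0.5)
= P\left(\frac{14.5 - 12.5}{\sqrt{6.25}} \le \frac{X - 12.5}{\sqrt{6.25}} \le \frac{20.5 - 12.5}{\sqrt{6.25}}\right)
= \Phi(3.2) - \Phi(0.8) \approx 0.211.
\]

With the table values Φ(3.2) − Φ(0.8) = 0.9993 − 0.7881 = 0.2112, the absolute approximation error is |0.212 − 0.2112| = 0.0008.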
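The comparison between the exact binomial probability and its normal approximation can be reproduced in a few lines. This sketch computes both without table rounding, so the error it reports (roughly 0.0006) is slightly smaller than the 0.0008 obtained from three-decimal table values; the variable names are illustrative.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 25, 0.5

# Exact binomial probability P(15 <= X <= 20), summed term by term.
exact = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(15, 21))

# Normal approximation with continuity correction: widen the range by 0.5.
mu = n * p
sigma = sqrt(n * p * (1 - p))
normal = NormalDist(mu, sigma)
approx = normal.cdf(20.5) - normal.cdf(14.5)

error = abs(exact - approx)
print(f"exact = {exact:.4f}, approx = {approx:.4f}, error = {error:.4f}")
```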

20. Let X be any random variable that is normally distributed. What is the probability that X is within 1.5 SDs of its mean?

A. 0.0668

B. 0.9332

C. 0.8664

D. 0.1336

E. Cannot determine without knowing the mean and the variance.

Answer: C. Let X ∼ N(µ, σ); then Z = (X − µ)/σ ∼ N(0, 1), and

P(|X − µ| < 1.5σ) = P(|Z| < 1.5) = Φ(1.5) − Φ(−1.5) = 0.8664.
