Lecture 1 Stat Review

Transcript
  • 7/24/2019 Lecture 1 Stat Review

    1/97

    1

    0.

    Introduction

    What is econometrics?

Econometrics is the application of statistics and economic theory to data in order to test economic hypotheses.

    Economic theory describes relationships between economic variables.

    For example, the law of demand tells us that as prices go down, the quantity demanded will go up.

    However, as the owner of a firm or as a policymaker, we are often interested in the magnitude of the relationship between two variables.

    For example, if cigarette taxes increase, the quantity demanded falls. By how much? What will be the impact on tax revenues?

    To answer these questions, we need to know something about the empirical relationship between cigarette prices and cigarette demand.


    We could ask a variety of other questions:

1) What is the impact of education on earnings?

    2) How much do increases in government transfers (e.g., TANF) reduce work effort?

    3) What is the effect of an increased police force on the amount of crime committed in a city?

    Econometrics is also useful for forecasting.

    1) Firms forecast revenues and costs.

    2) Governments forecast consumer spending and unemployment rates.


    Does econometrics always give the right answer?

Earnings = β₀ + β₁ · (Years of Education)

    β₁ is the statistical relationship between years of education and earnings. One more year of education will increase earnings by β₁.

    However, we only have observational data to estimate this statistical relationship, or correlation.

    We typically will not be analyzing a randomized experiment.

    Does one more year of school really cause earnings to increase?

    Or, do more able people, who would have earned more anyway, get more education?

    We will have to rely both on economic theory as well as our understanding of econometric theory to interpret our findings.


    Is econometrics the same as program evaluation?

Program evaluation undertakes an examination of a program (or policy) through the study of the program's goals, processes, and outcomes.

    For example, an evaluation of the Pittsburgh Promise program, which provides scholarships and other college-related needs to graduates of Pittsburgh Public Schools, would likely include a study of whether the program increased the educational attainment of city school graduates.

    Such an evaluation would implement statistical and econometric methodologies as part of the study.

    While economists would find the results of this evaluation very useful, they would be interested in knowing whether this program also informs us about the relationship between educational attainment and outcomes of interest to economists such as wages, crime, intergenerational outcomes, etc.


Example

    In 1973, the Indonesian government decided that it was important to provide equity across the country's provinces.

    Indonesia undertook a massive school building program in which over 61,000 primary schools were built within the next six years.

    The intent of the program was to target new schools in areas where enrollments were previously low, which was likely due, in part, to the long distances students had to travel to attend school.

    Between 1973 and 1978, the school enrollment rates of 7-to-12-year-old Indonesians rose from 69 percent to 83 percent.

    From the perspective of whether or not the program increased education levels in Indonesia, it appears to have been successful.


From an economist's viewpoint, this program can be used to ask a question of great interest, such as: does increased education raise wages?

    Duflo (2001) uses the Indonesian school building program to answer precisely this question.

    The idea is that this program is effectively an experiment in that it raised education levels in some parts of Indonesia but not in others.

    In terms of an experiment, children who reside in areas where school building increased are the treatment group, while those in areas where no new schools were built are the control group.

    She is able to study whether the increase in education causes an increase in the wages of those affected by the program.


In addition, we can use economic theory to think about how the program might impact those who were not directly affected by it.

    An increase in the supply of educated workers will shift the labor supply curve and therefore lead to a new, lower equilibrium wage, which will indirectly affect those born before the school building program.

    Duflo (2004) examines the impact of the school building program on those born before the program took effect in their province.

    She finds that the increase in educated workers due to the program reduces the wages of workers in older age cohorts by 4 to 10 percent.

    By thinking through the economic theory for how an increased supply of workers will affect the economy overall, we can find implications for how those who do not participate in a program may be affected.


The goal of this course is to impart a basic understanding of econometric theory in order to be able to interpret the findings from studies that implement econometric methodologies.

    As mentioned earlier, not all studies will use true experiments or natural experiments to estimate the impact of a program or policy.

    As such, we will require a number of assumptions to be maintained in order for these observational studies to have a causal interpretation.

    Therefore, it is very important to understand the theory behind the methods that we will learn, the assumptions that they require, under what circumstances these assumptions are violated, and what, if anything, we can learn when the assumptions are incorrect.

    The empirical examples in class and the empirical exercises in homework assignments aim to link the theory you learn in class with applications that illustrate these important issues.


    I. Statistical Review

For this course, we will assume that everyone understands basic probability and statistics.

    However, we will spend the first two or three classes reviewing these concepts, for two reasons.

    First, we want to be certain that everyone has seen the same topics presented in a similar manner before moving on to econometrics.

    Second, many of the statistical concepts you have previously seen will be applied and extended in econometrics.

    By reviewing these concepts, it will be much easier to see the parallels between what you already know and how they are applied.


Appendix B Fundamentals of Probability

    Section B.1 Random Variables and Their Probability Distributions

    A random variable is a variable whose value is determined by the outcome of an experiment.

    A discrete random variable takes on a finite or a countably infinite number of values.

    Examples

    Tossing a coin, rolling a pair of dice, drawing a card

    A discrete random variable, X, is described by its probability density function (pdf), denoted f(x), which is a list of all of the values the random variable can take on and the associated probabilities:

    f(x_j) = P(X = x_j), j = 1, 2, ..., k

    where x_j can be one of the k possible values.
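As a quick illustration (in Python, not part of the original slides), a discrete pdf can be represented as a mapping from each value to its probability; the fair-die example here is hypothetical:

```python
# A minimal sketch: the pdf of a discrete random variable stored as a
# dictionary mapping each value x_j to P(X = x_j).
# Hypothetical example: X = outcome of one roll of a fair six-sided die.
from fractions import Fraction

pdf = {x: Fraction(1, 6) for x in range(1, 7)}

# A valid pdf assigns a non-negative probability to every value...
assert all(p >= 0 for p in pdf.values())
# ...and the probabilities sum to one.
assert sum(pdf.values()) == 1
```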


A continuous random variable has a sample space that contains an uncountably infinite number of outcomes.

    Examples

    Temperature, height, and an amount of time.

    However, the probability that a continuous random variable takes on any particular value exactly is zero.

    Thus, for continuous random variables, we work with the cumulative distribution function (cdf), which is written as

    F(x) = P(X ≤ x) = ∫_{−∞}^{x} f(t) dt

    where f(t) is the continuous pdf.


Section B.2 Joint and Conditional Distributions, and Independence

    Let X and Y be discrete random variables.

    The joint distribution of X and Y is fully described by their joint probability density function,

    f(x, y) = P(X = x, Y = y)

    The random variables X and Y are independent if and only if their joint pdf can be written as

    f(x, y) = f_X(x) f_Y(y),

    where f_X and f_Y are the marginal pdfs for X and Y, respectively.

    We will not examine the joint pdf of continuous random variables this semester, which is why it is not discussed here.


In economics, we are often interested in the pdf of one random variable given a particular value of another random variable.

    The conditional pdf of Y given X is defined as

    f_{Y|X}(y|x) = f(x, y) / f_X(x)

    Notice that f_{Y|X}(y|x) is only defined if f_X(x) > 0.

    When both random variables are discrete, we can write

    f_{Y|X}(y|x) = P(Y = y | X = x),

    which is read as the probability that Y = y given that X = x.

    When X and Y are independent, knowing the value of X provides no information about Y, and vice versa, so that

    f_{Y|X}(y|x) = f_Y(y) and f_{X|Y}(x|y) = f_X(x)


Section B.3 Features of Probability Distribution Functions

    Expected Value

    The expected value, or mean, of a probability distribution function that takes on k discrete values x_1, x_2, ..., x_k is

    E(X) = Σ_{j=1}^{k} x_j f(x_j)

    where f(x) is the pdf of X.

    If X is a continuous random variable, then

    E(X) = ∫_{−∞}^{∞} x f(x) dx

    We write E(X) = μ, or sometimes μ_X, and refer to μ as the population mean.
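The discrete formula above can be checked with a short Python snippet (not part of the original slides); the fair-die pdf is a hypothetical example:

```python
# Sketch: E(X) = sum over j of x_j * f(x_j) for a discrete random variable.
# Hypothetical example: one roll of a fair six-sided die.
from fractions import Fraction

pdf = {x: Fraction(1, 6) for x in range(1, 7)}

# Population mean: probability-weighted sum of the values
mu = sum(x * p for x, p in pdf.items())
assert mu == Fraction(7, 2)  # E(X) = 3.5 for a fair die
```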


We can also compute the expected value of a function, g(X), of the random variable X.

    If X is a discrete random variable, then the expected value of the random variable g(X) is

    E[g(X)] = Σ_{j=1}^{k} g(x_j) f(x_j)

    If X is a continuous random variable, then the expected value of the random variable g(X) is

    E[g(X)] = ∫_{−∞}^{∞} g(x) f(x) dx


Properties of the Expected Value

    1) For any constant c, E(c) = c.

    2) For any constants a and b, E(aX + b) = aE(X) + b.

    3) If a_1, a_2, ..., a_n are constants and X_1, X_2, ..., X_n are random variables, then

    E(a_1 X_1 + a_2 X_2 + ... + a_n X_n) = a_1 E(X_1) + a_2 E(X_2) + ... + a_n E(X_n)

    Alternatively, we can write this expression as

    E(Σ_{i=1}^{n} a_i X_i) = Σ_{i=1}^{n} a_i E(X_i)


Variance

    The variance measures the dispersion of a pdf.

    The variance is the expected value of the squared difference between a value of X and the mean of the distribution:

    Var(X) = E[(X − μ)²]

    We can apply the formulas for the expected value of a function of X to compute the variance.

    For example, the variance of a discrete random variable is

    Var(X) = Σ_{j=1}^{k} (x_j − μ)² f(x_j)


Properties of the Variance

    1) If c is a constant, then Var(c) = 0.

    2) If a and b are constants, then Var(aX + b) = a² Var(X).

    One issue with using the variance is that its units are the square of the units of the random variable.

    For example, if the random variable X is measured in feet, then Var(X) is measured in feet squared.

    In some instances it is useful to work with the positive square root of the variance, which is known as the standard deviation and is denoted sd(X) = σ.
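The discrete variance formula and property 2) can be verified numerically (a Python sketch, not from the slides; the fair-die pdf is a hypothetical example):

```python
# Sketch: Var(X) = E[(X - mu)^2] for a discrete random variable, plus a
# check of the property Var(aX + b) = a^2 Var(X).
# Hypothetical example: one roll of a fair six-sided die.
from fractions import Fraction

pdf = {x: Fraction(1, 6) for x in range(1, 7)}
mu = sum(x * p for x, p in pdf.items())

# Variance: probability-weighted average squared deviation from the mean
var_x = sum((x - mu) ** 2 * p for x, p in pdf.items())
assert var_x == Fraction(35, 12)

# Property 2: the additive constant b shifts the mean but not the spread
a, b = 3, 10
pdf_y = {a * x + b: p for x, p in pdf.items()}      # Y = aX + b
mu_y = sum(y * p for y, p in pdf_y.items())
var_y = sum((y - mu_y) ** 2 * p for y, p in pdf_y.items())
assert var_y == a ** 2 * var_x
```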


Section B.4 Features of Joint and Conditional Distributions

    Covariance

    The covariance is a measure of how much two random variables move together (co-vary):

    Cov(X, Y) = E[(X − μ_X)(Y − μ_Y)]

    Notice that if Y tends to be above its mean when X is above its mean, then Cov(X, Y) > 0.

    Similarly, if Y tends to be below its mean when X is above its mean, or vice versa, then Cov(X, Y) < 0.


Correlation Coefficient

    The correlation coefficient offers an advantage over the covariance since it is on a rather intuitive scale:

    Corr(X, Y) = Cov(X, Y) / [sd(X) · sd(Y)]

    Notice that Corr(X, Y) will have the same sign as Cov(X, Y). In addition, −1 ≤ Corr(X, Y) ≤ 1. Whereas Cov(X, Y) can take on any real value, Corr(X, Y) allows us to scale the degree to which two variables co-vary.

    Corr(X, Y) = +1 means X and Y are perfectly positively correlated.

    Corr(X, Y) = −1 means X and Y are perfectly negatively correlated.
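The covariance and correlation formulas can be sketched in Python (not part of the original slides; the paired data below are made up for illustration):

```python
# Sketch: population covariance and correlation for a small, hypothetical
# set of paired observations on X and Y.
import math

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0, 5.0]

n = len(x)
mx = sum(x) / n
my = sum(y) / n

# Covariance: average cross-product of deviations from the means
cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / n

# Standard deviations of X and Y
sx = math.sqrt(sum((xi - mx) ** 2 for xi in x) / n)
sy = math.sqrt(sum((yi - my) ** 2 for yi in y) / n)

# Correlation rescales the covariance onto [-1, 1]
corr = cov / (sx * sy)
assert -1.0 <= corr <= 1.0
assert (corr > 0) == (cov > 0)   # same sign as the covariance
```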


Conditional Expectation

    While the covariance and correlation treat the relationship between X and Y symmetrically, in many instances we will be interested in explaining one variable in terms of another variable.

    For example, we may be interested in knowing whether earnings depend upon an individual's level of education.

    One set of statistics we might compute is the expected amount of earnings for people conditional on their levels of education.

    The conditional expectation of a discrete random variable Y given X = x, where Y takes on m different values y_1, y_2, ..., y_m, is

    E(Y | X = x) = Σ_{j=1}^{m} y_j f_{Y|X}(y_j | x)


5) E(Y | X) = E[E(Y | X, Z) | X]

    This property is a more general version of the law of iterated expectations.

    6) If E(Y | X) = E(Y), then Cov(X, Y) = 0 (and Corr(X, Y) = 0). Moreover, every function of X is uncorrelated with Y.

    Note that the converse of this last property is not true; if Cov(X, Y) = 0, then it is possible that E(Y | X) depends on X.

    Combining these last two properties, notice that if U and X are random variables where E(U | X) = 0, then

    i. E(U) = 0, since E(U) = E[E(U | X)] = E[0] = 0

    ii. Cov(U, X) = 0, i.e., U and X are uncorrelated, since E(U | X) = 0.


4) Any linear combination of independent, identically distributed normal random variables has a normal distribution.

    This last property has implications for the average of independent, identically distributed normal random variables.

    If Y_1, Y_2, ..., Y_n are independent random variables, each of which is distributed N(μ, σ²), then the average of the random variables,

    Ȳ = (1/n) Σ_{i=1}^{n} Y_i,

    is normally distributed.

    Furthermore, E(Ȳ) = μ and Var(Ȳ) = σ²/n, so that Ȳ ~ N(μ, σ²/n).
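A simulation sketch (in Python, not part of the original slides; the parameter values μ = 5, σ = 2, n = 25 are chosen arbitrarily) illustrates that the average of n i.i.d. normal draws has mean μ and variance σ²/n:

```python
# Sketch: simulate many sample averages of n i.i.d. N(mu, sigma^2) draws
# and check that their mean is near mu and their variance near sigma^2/n.
import random
import statistics

random.seed(0)
mu, sigma, n = 5.0, 2.0, 25

averages = [
    statistics.fmean(random.gauss(mu, sigma) for _ in range(n))
    for _ in range(20_000)
]

# E(Ybar) should be close to mu = 5
assert abs(statistics.fmean(averages) - mu) < 0.05
# Var(Ybar) should be close to sigma^2 / n = 0.16
assert abs(statistics.pvariance(averages) - sigma**2 / n) < 0.02
```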


The Chi-Square Distribution

    Let X = Σ_{i=1}^{n} Z_i², where Z_1, Z_2, ..., Z_n are independent standard normal random variables.

    Then X follows the chi-square distribution with n degrees of freedom (which is a special case of the gamma distribution), which we write as X ~ χ²_n.

    Degrees of freedom generally refers to the number of independent pieces of information used to create a random variable.

    We will use the abbreviation d.f. to refer to degrees of freedom.

    If a random variable X is distributed χ²_n, then it has an expected value of n and a variance of 2n.
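The mean-n, variance-2n claim can be checked by simulation (a Python sketch, not from the slides; n = 5 is an arbitrary choice):

```python
# Sketch: build chi-square draws with n d.f. as sums of n squared independent
# standard normals, then check that the mean is near n and variance near 2n.
import random
import statistics

random.seed(1)
n = 5

draws = [
    sum(random.gauss(0.0, 1.0) ** 2 for _ in range(n))
    for _ in range(50_000)
]

assert abs(statistics.fmean(draws) - n) < 0.1        # E(X) = n
assert abs(statistics.pvariance(draws) - 2 * n) < 0.5  # Var(X) = 2n
```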


The F Distribution

    Suppose U and V are independent chi-square random variables with n and m degrees of freedom, respectively.

    A random variable of the form

    F = (V/m) / (U/n)

    is said to have an F distribution with m and n degrees of freedom.

    We will use the notation F ~ F_{m,n} to denote an F random variable with m and n degrees of freedom.


The t Distribution

    Let Z be a standard normal random variable. Let X be a chi-square random variable, independent of Z, which has n degrees of freedom.

    The Student's t ratio with n degrees of freedom is denoted

    T = Z / √(X/n)

    The t distribution with n degrees of freedom has an expected value of zero and a variance of n/(n − 2) for n > 2.


The standard normal distribution and the t distribution have similar shapes.

    Both have an expected value of zero, and the variance of the t distribution, n/(n − 2), converges to 1 as n → ∞.

[Figure: probability density functions of the Z ratio and of the t ratios with 4 and 10 d.f.]


Sampling

    In many instances, we will be interested in knowing the value of one or more population parameters.

    For example, if we want to know about the degree of income inequality in society, we would be curious to know about the expected value and variance of the population income distribution.

    If we have a census, then we would be able to learn the true characteristics of the income distribution.

    However, interviewing everyone in the population is a very costly exercise in terms of both time and money.


Random Sampling

    Instead, we will observe a sample of the population and use the sample to generate our best guess as to what the true characteristics of the population distribution actually are.

    Suppose that Y is a random variable with a probability density function f(y; θ), where θ is an unknown parameter.

    A random sample from f(y; θ) is n observations, Y_1, Y_2, ..., Y_n, that are drawn independently from the pdf f(y; θ).

    We sometimes refer to the random sample Y_1, Y_2, ..., Y_n as independent, identically distributed (i.i.d.) random variables.


Section C.2 Finite Sample Properties of Estimators

    We now turn to estimators of population parameters and note that there are two types of properties of these estimators.

    The first set of properties is finite sample properties, which are sometimes referred to as small sample properties.

    The latter title is somewhat misleading since it refers to samples of any size, whether the number of observations is small or large.

    The second set of properties is asymptotic properties, which refer to the behavior of estimators as the sample size approaches infinity.


Estimators and Estimates

    Any function of a random sample whose objective is to approximate a parameter is called an estimator.

    Example

    Suppose that Y_1, Y_2, ..., Y_n is a random sample from a population with a mean of μ.

    The sample average, Ȳ = (1/n) Σ_{i=1}^{n} Y_i, is an estimator of the unknown population mean, μ.

    After we collect the actual data, y_1, y_2, ..., y_n, and we compute the estimator by using the values that we measure in the sample, the resulting value is known as an estimate.


We define the bias of an estimator W of a parameter θ as

    Bias(W) = E(W) − θ

    Example

    For the sample average, Ȳ, we have already seen that E(Ȳ) = μ. Therefore, we can compute the bias of Ȳ:

    Bias(Ȳ) = E(Ȳ) − μ = μ − μ = 0

    The bias of the sample average is 0, which means, as we have already seen, that Ȳ is unbiased.
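Unbiasedness can be illustrated by simulation (a Python sketch, not from the slides; μ = 10, σ = 3, n = 8 are arbitrary choices): averaging the estimator over many repeated samples should recover μ up to simulation noise.

```python
# Sketch: the sample average is unbiased, so the average of Ybar across
# many repeated samples should be very close to the true mean mu.
import random
import statistics

random.seed(2)
mu, sigma, n = 10.0, 3.0, 8

estimates = [
    statistics.fmean(random.gauss(mu, sigma) for _ in range(n))
    for _ in range(40_000)
]

# Bias(Ybar) = E(Ybar) - mu = 0, up to simulation noise
bias = statistics.fmean(estimates) - mu
assert abs(bias) < 0.05
```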


Example

    We can compute the sampling variance of the sample average, Ȳ:

    Var(Ȳ) = Var((1/n) Σ_{i=1}^{n} Y_i) = (1/n²) Σ_{i=1}^{n} Var(Y_i) = (1/n²) · nσ² = σ²/n

    Notice that the sampling variance of Ȳ gets smaller as the sample size, n, gets larger.


The following graph compares the sampling variance of two estimators, where the estimator with the smaller sampling variance is shown with the solid red line.

    Notice that for the given interval (θ − ε, θ + ε), the estimator with the smaller variance has more probability in this range.


Example

    As we have seen, the sample average, Ȳ = (1/n) Σ_{i=1}^{n} Y_i, is an unbiased estimator for μ and has a sampling variance of σ²/n.

    The alternative estimator using only the first observation of the random sample, W = Y_1, is also an unbiased estimator for μ.

    The variance of the alternative estimator is Var(Y_1) = σ².

    Therefore, Ȳ is more efficient than Y_1, since σ²/n < σ² for n > 1.
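The efficiency comparison can be simulated (a Python sketch, not from the slides; the standard normal population and n = 10 are arbitrary choices):

```python
# Sketch: both Ybar and Y1 are unbiased for mu, but Ybar has sampling
# variance sigma^2/n while Y1 has variance sigma^2, so Ybar is more efficient.
import random
import statistics

random.seed(3)
mu, sigma, n = 0.0, 1.0, 10

ybar_draws, y1_draws = [], []
for _ in range(30_000):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    ybar_draws.append(statistics.fmean(sample))  # sample average
    y1_draws.append(sample[0])                   # first observation only

var_ybar = statistics.pvariance(ybar_draws)  # near sigma^2 / n = 0.1
var_y1 = statistics.pvariance(y1_draws)      # near sigma^2 = 1.0
assert var_ybar < var_y1
assert abs(var_ybar - sigma**2 / n) < 0.02
```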


Section C.3 Asymptotic or Large Sample Properties of Estimators

    Another useful set of properties of estimators is their asymptotic, or large sample, properties.

    One useful reason for investigating the asymptotic properties of estimators is that we can examine the performance of an estimator as the sample size grows, which gives us another way to choose between estimators.

    Another useful reason for examining asymptotic properties is that determining the sampling distribution in finite samples is rather difficult for some estimators.

    However, in many cases, it is easier to determine the asymptotic sampling distribution and to use it as an approximation in order to draw inferences.


Consistency

    One useful property of an estimator is that, as the sample grows infinitely large, the estimator converges to the true parameter.

    Formally, if W_n is an estimator of θ with a sample size n, then W_n is a consistent estimator of θ if, for every ε > 0,

    lim_{n→∞} P(|W_n − θ| ≤ ε) = 1

    If W_n is not consistent for θ, then we say it is inconsistent.

    In addition, if W_n is consistent, then we say that θ is the probability limit of W_n, which is written as plim(W_n) = θ.


A useful illustration of consistency is the sample average, Ȳ = (1/n) Σ_{i=1}^{n} Y_i, from a population with mean μ and variance σ².

    We have already seen that Ȳ is unbiased for μ and, in addition, we saw that Var(Ȳ) = σ²/n.

    Notice that as n → ∞, Var(Ȳ) → 0. Therefore, Ȳ is a consistent estimator of μ.

    Thus, if Y_1, Y_2, ..., Y_n are independent and identically distributed random variables with mean μ, then

    plim(Ȳ) = μ,

    which is known as the law of large numbers.
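The law of large numbers can be illustrated by simulation (a Python sketch, not from the slides; μ = 2, σ = 5 and the sample sizes are arbitrary choices):

```python
# Sketch: consistency of the sample average. As n grows, Ybar concentrates
# around mu, so its typical deviation |Ybar - mu| shrinks.
import random
import statistics

random.seed(4)
mu, sigma = 2.0, 5.0

def ybar(n):
    """Sample average of n i.i.d. N(mu, sigma^2) draws."""
    return statistics.fmean(random.gauss(mu, sigma) for _ in range(n))

# Average absolute deviation from mu across repeated samples
small_n_err = statistics.fmean(abs(ybar(10) - mu) for _ in range(500))
large_n_err = statistics.fmean(abs(ybar(2_000) - mu) for _ in range(500))

# The deviation shrinks as n grows, consistent with plim(Ybar) = mu
assert large_n_err < small_n_err
```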


A (biased) alternative estimator for the population mean μ is

    W = (1/(n − 1)) Σ_{i=1}^{n} Y_i

    Notice that E(W) = (n/(n − 1)) μ, so that as n → ∞, E(W) → μ.

    In addition, we can show that Var(W) = (n/(n − 1)²) σ².

    Notice that as n → ∞, Var(W) → 0.

    It can be shown that W is a consistent (but biased) estimator of μ.


Asymptotic Normality

    In order to draw inferences, we need to know not only the estimator, but also information about the sampling distribution of the estimator.

    Many econometric estimators are approximated by the normal distribution as the sample size gets large.

    Let {Z_n : n = 1, 2, ...} be a sequence of random variables such that for all numbers z,

    P(Z_n ≤ z) → Φ(z) as n → ∞,

    where Φ(z) is the standard normal cdf.

    Then Z_n is said to have an asymptotic standard normal distribution, which we write as Z_n ~ᵃ N(0, 1), where the a stands for either asymptotically or approximately.


Section C.5 Interval Estimation and Confidence Intervals

    While estimation of a population parameter generally yields a single number as an estimate, that overlooks the fact that there is uncertainty about the true parameter.

    Example

    The sample average yields a point estimate, ȳ, of the true population average, μ. However, simply reporting this point estimate ignores the fact that Ȳ has a sampling distribution.

    We can instead generate an interval estimate, which is a range in which the true parameter lies.


Example

    Suppose that Y_1, Y_2, ..., Y_n are independent random variables, each of which is distributed N(μ, σ²).

    We have already seen that

    Z = (Ȳ − μ) / (σ/√n) ~ N(0, 1)

    As we have seen, the sample average, Ȳ, is an unbiased point estimate for the population mean, μ.

    How can we use this information to create an interval estimate for the true population mean, μ?


Z = (Ȳ − μ) / (σ/√n) ~ N(0, 1)

    We can create an interval that has a 95% probability of containing the population mean, μ.

    We call such an interval a 95% confidence interval.

    In general, we can create a 100(1 − α)% confidence interval by choosing a level of significance, α.

    The smaller the value of α that we choose, the higher our level of confidence.

    However, to increase our confidence, we will need a larger interval.


For example, using textbook Appendix Table G.1, we see that the probability that a standard normal random variable falls between −1.96 and +1.96 is 0.95, or

    P(−1.96 ≤ Z ≤ 1.96) = 0.95

    [Figure: standard normal pdf with the area between −z_0.025 and +z_0.025 shaded.]


The cdf for the standard normal distribution shown below is similar to Appendix Table G.1 for −3.1 ≤ z ≤ −1.8.

    Z     0      0.01   0.02   0.03   0.04   0.05   0.06   0.07   0.08   0.09
    -3.0  0.0013 0.0013 0.0013 0.0012 0.0012 0.0011 0.0011 0.0011 0.0010 0.0010
    -2.9  0.0019 0.0018 0.0018 0.0017 0.0016 0.0016 0.0015 0.0015 0.0014 0.0014
    -2.8  0.0026 0.0025 0.0024 0.0023 0.0023 0.0022 0.0021 0.0021 0.0020 0.0019
    -2.7  0.0035 0.0034 0.0033 0.0032 0.0031 0.0030 0.0029 0.0028 0.0027 0.0026
    -2.6  0.0047 0.0045 0.0044 0.0043 0.0041 0.0040 0.0039 0.0038 0.0037 0.0036
    -2.5  0.0062 0.0060 0.0059 0.0057 0.0055 0.0054 0.0052 0.0051 0.0049 0.0048
    -2.4  0.0082 0.0080 0.0078 0.0075 0.0073 0.0071 0.0069 0.0068 0.0066 0.0064
    -2.3  0.0107 0.0104 0.0102 0.0099 0.0096 0.0094 0.0091 0.0089 0.0087 0.0084
    -2.2  0.0139 0.0136 0.0132 0.0129 0.0125 0.0122 0.0119 0.0116 0.0113 0.0110
    -2.1  0.0179 0.0174 0.0170 0.0166 0.0162 0.0158 0.0154 0.0150 0.0146 0.0143
    -2.0  0.0228 0.0222 0.0217 0.0212 0.0207 0.0202 0.0197 0.0192 0.0188 0.0183
    -1.9  0.0287 0.0281 0.0274 0.0268 0.0262 0.0256 0.0250 0.0244 0.0239 0.0233
    -1.8  0.0359 0.0351 0.0344 0.0336 0.0329 0.0322 0.0314 0.0307 0.0301 0.0294

    The value of Z is found by combining the row, which gives the integer and tenths digits, with the column, which gives the hundredths digit.

    As the table shows, Φ(−1.96) = 0.025.


The cdf for the standard normal distribution shown below is similar to Appendix Table G.1 for 1.8 ≤ z ≤ 3.1.

    Z    0      0.01   0.02   0.03   0.04   0.05   0.06   0.07   0.08   0.09
    1.8  0.9641 0.9649 0.9656 0.9664 0.9671 0.9678 0.9686 0.9693 0.9699 0.9706
    1.9  0.9713 0.9719 0.9726 0.9732 0.9738 0.9744 0.9750 0.9756 0.9761 0.9767
    2.0  0.9772 0.9778 0.9783 0.9788 0.9793 0.9798 0.9803 0.9808 0.9812 0.9817
    2.1  0.9821 0.9826 0.9830 0.9834 0.9838 0.9842 0.9846 0.9850 0.9854 0.9857
    2.2  0.9861 0.9864 0.9868 0.9871 0.9875 0.9878 0.9881 0.9884 0.9887 0.9890
    2.3  0.9893 0.9896 0.9898 0.9901 0.9904 0.9906 0.9909 0.9911 0.9913 0.9916
    2.4  0.9918 0.9920 0.9922 0.9925 0.9927 0.9929 0.9931 0.9932 0.9934 0.9936
    2.5  0.9938 0.9940 0.9941 0.9943 0.9945 0.9946 0.9948 0.9949 0.9951 0.9952
    2.6  0.9953 0.9955 0.9956 0.9957 0.9959 0.9960 0.9961 0.9962 0.9963 0.9964
    2.7  0.9965 0.9966 0.9967 0.9968 0.9969 0.9970 0.9971 0.9972 0.9973 0.9974
    2.8  0.9974 0.9975 0.9976 0.9977 0.9977 0.9978 0.9979 0.9979 0.9980 0.9981
    2.9  0.9981 0.9982 0.9982 0.9983 0.9984 0.9984 0.9985 0.9985 0.9986 0.9986
    3.0  0.9987 0.9987 0.9987 0.9988 0.9988 0.9989 0.9989 0.9989 0.9990 0.9990

    As the table shows, Φ(1.96) = 0.975. Therefore,

    P(−1.96 ≤ Z ≤ 1.96) = Φ(1.96) − Φ(−1.96) = 0.975 − 0.025 = 0.95


P(−1.96 ≤ Z ≤ 1.96) = 0.95

    Since (Ȳ − μ)/(σ/√n) is a standard normal random variable, the probability is 0.95 that it falls between −1.96 and +1.96, or

    P(−1.96 ≤ (Ȳ − μ)/(σ/√n) ≤ 1.96) = 0.95

    We can then rewrite the expression inside P(·) to find the 95% confidence interval for μ:

    P(−1.96 · σ/√n ≤ Ȳ − μ ≤ 1.96 · σ/√n) = 0.95

    P(Ȳ − 1.96 · σ/√n ≤ μ ≤ Ȳ + 1.96 · σ/√n) = 0.95


Example

    The height of white females who registered to vote in Allegheny County, PA during the 1960s is normally distributed with a variance of 6.25 (in square inches).

    If a random sample of 9 women is selected and the sample average height is 65.5 inches, construct a 95% confidence interval for the true average height, μ.

    Noting that σ = √6.25 = 2.5, the 95% confidence interval for μ is

    Ȳ − 1.96 · σ/√n ≤ μ ≤ Ȳ + 1.96 · σ/√n

    65.5 − 1.96 · (2.5/√9) ≤ μ ≤ 65.5 + 1.96 · (2.5/√9)

    63.87 ≤ μ ≤ 67.13
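The interval arithmetic above can be reproduced in a few lines of Python (not part of the original slides):

```python
# Sketch: 95% confidence interval for the mean with known variance,
# using the numbers from the height example on the slide.
import math

ybar, sigma2, n = 65.5, 6.25, 9
sigma = math.sqrt(sigma2)          # 2.5
z = 1.96                           # critical value from the normal table

half_width = z * sigma / math.sqrt(n)
lower, upper = ybar - half_width, ybar + half_width

assert round(lower, 2) == 63.87
assert round(upper, 2) == 67.13
```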


Confidence Intervals for the Mean from a Normally Distributed Population

    Assuming that we have a random sample from a normally distributed population and that we know the variance of the distribution, we can use the approach on the preceding slides to construct a confidence interval for μ.

    In situations in which we have much prior experience with the items being sampled, such as a manufacturer of a product who has detailed knowledge of the weight of their product, we may know the variance of the distribution.

    However, in many instances we will not know the variance of the distribution, so we cannot use the above methods.


In order to construct a confidence interval for a random sample that is drawn from a normal distribution with unknown variance, we must first estimate the variance.

    The sample variance for a sample of size n is computed as

    S² = (1/(n − 1)) Σ_{i=1}^{n} (Y_i − Ȳ)²

    It can be shown that S² is an unbiased estimator of the true population variance, σ².

    If the random sample is drawn from a normal distribution, then the ratio (n − 1)S²/σ² follows the chi-square distribution with n − 1 degrees of freedom.


It can also be shown that if Y_1, Y_2, ..., Y_n is a random sample from the normal distribution with mean μ and variance σ², then the t ratio

    T = (Ȳ − μ) / (S/√n)

    has a Student t distribution with n − 1 degrees of freedom, where S is the square root of the sample variance, S².

    We can create a 100(1 − α)% confidence interval for μ using an approach similar to the method used earlier for Z.

    Thus, for the 95% confidence interval, we must find the appropriate values such that

    P(−t_{0.025, n−1} ≤ T ≤ t_{0.025, n−1}) = 0.95


Example

    Returning to the height example of white females who registered to vote in Allegheny County, PA during the 1960s, where height is normally distributed.

    Suppose that for the random sample of 9 women, we compute the sample average height Ȳ = 65.5.

    However, we do not know the variance of height in the population, σ², but are able to compute the sample variance of height, S² = 8.5.

    Construct a 95% confidence interval for the true average height, μ, among white female registered voters.


P(−t_{0.025, n−1} ≤ T ≤ t_{0.025, n−1}) = 0.95

    First, notice that with n − 1 = 8 d.f., t_{0.025, 8} = 2.306.

    df   0.10   0.05   0.025  0.01   0.005
    8    1.397  1.860  2.306  2.896  3.355

    Inserting the appropriate values into the expression below yields the 95% confidence interval for μ:

    Ȳ − t_{0.025, 8} · S/√n ≤ μ ≤ Ȳ + t_{0.025, 8} · S/√n

    65.5 − 2.306 · √(8.5/9) ≤ μ ≤ 65.5 + 2.306 · √(8.5/9)

    63.26 ≤ μ ≤ 67.74
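As with the known-variance case, the t-interval arithmetic can be reproduced in Python (not part of the original slides):

```python
# Sketch: 95% confidence interval for the mean with estimated variance,
# using the numbers from the height example on the slide.
import math

ybar, s2, n = 65.5, 8.5, 9
t_crit = 2.306                     # t_{0.025} with n - 1 = 8 d.f.

half_width = t_crit * math.sqrt(s2) / math.sqrt(n)
lower, upper = ybar - half_width, ybar + half_width

assert round(lower, 2) == 63.26
assert round(upper, 2) == 67.74
```

Note that the t-based interval is wider than the z-based interval from the known-variance example, reflecting the extra uncertainty from estimating σ².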


    75

    Section C.6 Hypothesis Testing

    Suppose that we have limited information about a populationparameter.

    We may develop an idea or a hypothesis about the trueparameter value.

    If we are able to randomly sample from our population, wecould then test our hypothesis.

When testing our hypothesis, we call our hypothesis the null hypothesis (H₀).

    Example


In our height example, we may formulate a null hypothesis that the true mean height of white female registered voters in Allegheny County is 63 inches.

    We would write H₀: μ = 63.

    We test the null hypothesis against an alternative hypothesis (H₁), of which there are multiple options.

    H₁: μ > 63 and H₁: μ < 63 are called one-sided alternative hypotheses.

    H₁: μ ≠ 63 is a two-sided alternative hypothesis.


To test the null hypothesis against one of these alternative hypotheses, we need to develop a decision rule.

Once we have done so, we then use this decision rule to decide if we will reject our null hypothesis or if we fail to reject it.

    There are two approaches to forming our decision rules forhypothesis testing:

1) using confidence intervals

2) using the test of significance


    We are now ready to test our hypothesis.


We previously found for our example that the 95% confidence interval for μ is

63.26 ≤ μ ≤ 67.74

The 95% confidence interval does not contain our null hypothesis value of μ = 63.

We therefore reject our null hypothesis at the 95% level of confidence.

In hypothesis testing, the confidence interval is also called the acceptance region.

The area outside the region is the critical region, and the limits of the regions are the critical values.


Examining the t distribution table with n − 1 = 8 d.f.,


df     0.10    0.05    0.025   0.01    0.005
8      1.397   1.860   2.306   2.896   3.355

we see that the critical value is t.05,8 = 1.860.

The decision rule is to reject H₀ if t > 1.860.

We form the t ratio as before, and we previously found the t ratio to be 2.57.

The difference is now the critical value we use in our decision rule, which is 1.860 as opposed to 2.306.

Since the t ratio is still greater than our critical value, we reject the null hypothesis.
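The one-sided test of significance can be reproduced in a short Python sketch (not part of the slides); the names are mine, and the critical value is the one read off the t table.

```python
import math

# One-sided test of H0: mu = 63 against H1: mu > 63 at the 5% level,
# using the slide's numbers: n = 9, xbar = 65.5, s = sqrt(8.5).
n, xbar, mu0 = 9, 65.5, 63.0
s = math.sqrt(8.5)                  # sample standard deviation, ~2.92
t_ratio = (xbar - mu0) / (s / math.sqrt(n))
print(round(t_ratio, 2))            # 2.57
print(t_ratio > 1.860)              # True, so we reject H0
```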


    How can we think about Type I and Type II errors?


In a trial, the null hypothesis is that the defendant is not guilty, while the alternative hypothesis is that the defendant is guilty.

    A Type I error is to reject the null hypothesis when it is true.

In a trial, that would mean finding a defendant guilty when she is really not guilty.

    A Type II error is to accept the null when it is false.

In a trial, that would mean finding a defendant not guilty when she is really guilty.

Most jurists prefer to reduce the number of not guilty persons who are sent to prison (i.e., reduce the number of Type I errors).

    P-values

Our hypothesis testing proceeds by finding a critical value and then testing whether the sample average lies within the confidence interval or whether the t statistic exceeds a threshold.

Another approach is to ask the question: how likely is it that we would observe the sample mean, X̄, that we find in our sample if the population mean is really μ?

In our height example, we would ask how likely it is that we would observe X̄ = 65.5 in our sample if the true population mean is μ = 63 (the null hypothesis).

We proceed as before when we were using the test of significance approach:

t = (65.5 − 63) / (2.92/√9) = 2.57

Using the t distribution table with n − 1 = 8 d.f.,


df     0.10    0.05    0.025   0.01    0.005
8      1.397   1.860   2.306   2.896   3.355

we see that the estimated t ratio of 2.57 falls between 2.306, which corresponds to a probability of p = 0.05 (in both tails), and 2.896, which corresponds to a probability of p = 0.02.

Thus, the probability that we would observe X̄ = 65.5 under the null hypothesis that μ = 63 is between 0.05 and 0.02.

    We call this probability the p-value.

    Why concern ourselves with p-values?

Instead of arbitrarily choosing a level of significance as we did before, we are now reporting the probability that we would find the sample mean if the null hypothesis is in fact true.
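The table-based bracketing of the p-value can be sketched in Python. The dictionary below simply hard-codes the 8 d.f. row of the t table (one-tail areas as keys), since the slides read the bounds off the printed table; the variable names are mine.

```python
# Bracket the two-sided p-value for t = 2.57 with 8 d.f., reading the
# bounds off the t table row (keys are one-tail areas).
one_tail = {0.10: 1.397, 0.05: 1.860, 0.025: 2.306, 0.01: 2.896, 0.005: 3.355}
t_ratio = 2.57
lo_area = min(a for a, c in one_tail.items() if c < t_ratio)  # 0.025 (c = 2.306)
hi_area = max(a for a, c in one_tail.items() if c > t_ratio)  # 0.01  (c = 2.896)
# doubling the one-tail areas gives the two-sided bounds
print(f"{2 * hi_area} < p < {2 * lo_area}")  # 0.02 < p < 0.05
```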

    Testing the Equality of Two Population Means

Suppose that individuals can belong to one of two populations and we are interested in knowing if the means of the two populations are the same.

Observations from both populations are normally distributed, with X ~ N(μ_X, σ²_X) and Y ~ N(μ_Y, σ²_Y).

The null hypothesis is

H₀: μ_X = μ_Y

which we can also write as

H₀: μ_X − μ_Y = 0

The alternative hypothesis is

H₁: μ_X ≠ μ_Y or H₁: μ_X − μ_Y ≠ 0


From the sample of size m from the first population we can calculate the sample mean X̄ and the sample variance s²_X. From the second sample, with a sample size n, we can determine Ȳ and s²_Y.

The test statistic that we use is a t ratio, which is

t = [(X̄ − Ȳ) − (μ_X − μ_Y)] / √(s²_X/m + s²_Y/n)

The degrees of freedom subscript is intentionally left off since, as discussed above, we are assuming that we have a large sample, such that this statistic has a standard normal distribution.
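The large-sample t ratio just described can be written as a small helper. This is a sketch, with the function name and arguments my own; under H₀, the hypothesized difference delta0 is 0.

```python
import math

def two_sample_t(xbar, s2_x, m, ybar, s2_y, n, delta0=0.0):
    """t ratio for testing H0: mu_X - mu_Y = delta0 (large samples)."""
    # standard error of the difference in sample means
    se = math.sqrt(s2_x / m + s2_y / n)
    return ((xbar - ybar) - delta0) / se

# sanity check with made-up numbers: equal variances, means 1 apart
print(round(two_sample_t(1.0, 1.0, 100, 0.0, 1.0, 100), 2))  # 7.07
```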


Notice the similarity between the t ratio for the test of equality of two means and the t ratio that we used previously to test a hypothesis about a population mean.

The difference is that the sample mean and the population mean from the initial t ratio are replaced by the differences in the sample means and the population means, while the variance of the sample mean is replaced by the variance of the difference in means.

    Example

We can test whether mean height differs between male and female voters in Allegheny County.

We draw a new sample of 36 male and 36 female voters, where the means of the sample are

    summarize height sex

    Variable | Obs Mean Std. Dev. Min Max

    -------------+--------------------------------------------------------

    height | 72 67.29861 3.850791 59.5 75

    sex | 0

Notice that the variable sex appears to be missing.

However, since it is a character variable, not a numeric variable, we must use the tabulate command.

    tabulate sex

sex |      Freq.     Percent        Cum.
------------+-----------------------------------

    F | 36 50.00 50.00

    M | 36 50.00 100.00

    ------------+-----------------------------------

    Total | 72 100.00

To test the null hypothesis that the mean heights of men and women are equal, we need to compute the means and standard deviations separately for men and women.

    summarize height if sex=="F"

    Variable | Obs Mean Std. Dev. Min Max

    -------------+--------------------------------------------------------

    height | 36 64.52778 2.850926 59.5 72

    summarize height if sex=="M"

    Variable | Obs Mean Std. Dev. Min Max

    -------------+--------------------------------------------------------

    height | 36 70.06944 2.481799 64 75

The bysort command yields the same result.

    bysort sex: summarize height

    -----------------------------------------------------------------------------------

    -> sex = F

    Variable | Obs Mean Std. Dev. Min Max

    -------------+--------------------------------------------------------

    height | 36 64.52778 2.850926 59.5 72

    -----------------------------------------------------------------------------------

    -> sex = M

    Variable | Obs Mean Std. Dev. Min Max

    -------------+--------------------------------------------------------

    height | 36 70.06944 2.481799 64 75


The sample statistics are, for women, X̄ = 64.5, s_X = 2.85, and s²_X = 8.12, and for men, Ȳ = 70.1, s_Y = 2.48, and s²_Y = 6.15.

Since our null hypothesis is H₀: μ_X − μ_Y = 0 while our alternative hypothesis is H₁: μ_X − μ_Y ≠ 0, we will perform a two-tailed test.

At the α = 0.05 level of significance, the critical t value is 1.96, so our decision rule is to reject H₀ if |t| > 1.96.
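Plugging the sample statistics above into the two-sample t ratio gives a quick sketch of the decision (variable names are mine; the statistics are the rounded values reported above).

```python
import math

# Two-sided test of equal mean heights at the 5% level, using the
# rounded sample statistics from the slide.
xbar, s2_x, m = 64.5, 8.12, 36   # women
ybar, s2_y, n = 70.1, 6.15, 36   # men
t_ratio = (xbar - ybar) / math.sqrt(s2_x / m + s2_y / n)
print(round(t_ratio, 2))         # -8.89
print(abs(t_ratio) > 1.96)       # True, so we reject H0
```

Since |t| is far above 1.96, we reject the null hypothesis that mean heights are equal.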


