8.3.1 When sigma Is Known: The One-Sample z...

Post on 10-Apr-2020


8.3.1 When sigma Is Known: The One-Sample z Interval for a Population Mean

To calculate a 95% confidence interval for μ, we use our familiar formula:

statistic ± (critical value) · (standard deviation of statistic)

For a sample mean, this is x-bar ± z* · σ/√n. The critical value, z* = 1.96, tells us how many standardized units we need to go out to catch the middle 95% of the sampling distribution. We call such an interval a one-sample z interval for a population mean. Whenever the conditions for inference (Random, Normal, Independent) are satisfied and the population standard deviation σ is known, we can use this method to construct a confidence interval for μ.

This method isn’t very useful in practice, however. In most real-world settings, if we don’t know the population mean μ, then we don’t know the population standard deviation σ either. But we can use the one-sample z interval for a population mean to estimate the sample size needed to achieve a specified margin of error. The process mimics what we did for a population proportion in Section 8.2.
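The z interval described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `z_critical` and `z_interval` are made up for this sketch, and the summary numbers (x-bar = 50, σ = 8, n = 64) are hypothetical, not taken from the text.

```python
from math import sqrt
from statistics import NormalDist


def z_critical(confidence):
    """z* that captures the middle `confidence` of the standard Normal curve."""
    return NormalDist().inv_cdf((1 + confidence) / 2)


def z_interval(x_bar, sigma, n, confidence=0.95):
    """One-sample z interval: x_bar +/- z* * sigma / sqrt(n)."""
    margin = z_critical(confidence) * sigma / sqrt(n)
    return x_bar - margin, x_bar + margin


# Hypothetical numbers, not from the text: x_bar = 50, sigma = 8, n = 64.
low, high = z_interval(50.0, 8.0, 64)
print(round(z_critical(0.95), 2))      # 1.96
print(round(low, 2), round(high, 2))   # 48.04 51.96
```

Because σ/√n = 8/8 = 1 in this made-up case, the margin of error equals z* itself, which makes the result easy to check by hand.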

8.3.2 Choosing the Sample Size

A wise user of statistics never plans data collection without planning the inference at the same time. You can arrange to have both high confidence and a small margin of error by taking enough observations. The margin of error ME of the confidence interval for the population mean μ is:

ME = z* · σ/√n

To determine the sample size for a desired margin of error ME, substitute the value of z* for your desired confidence level. Use a reasonable estimate for the population standard deviation σ from a similar study that was done in the past or from a small-scale pilot study. Then set the expression for ME less than or equal to the specified margin of error and solve for n. Here is a summary of this strategy.

Example – How Many Monkeys
Determining sample size from a margin of error

Researchers would like to estimate the mean cholesterol level μ of a particular variety of monkey that is often used in laboratory experiments. They would like their estimate to be within 1 milligram per deciliter (mg/dl) of the true value of μ at a 95% confidence level. A previous study involving this variety of monkey suggests that the standard deviation of cholesterol level is about 5 mg/dl. Obtaining monkeys is time-consuming and expensive, so the researchers want to know the minimum number of monkeys they will need to generate a satisfactory estimate.

For 95% confidence, z* = 1.96. We will use σ = 5 as our best guess for the standard deviation of the monkeys’ cholesterol level. Set the expression for the margin of error to be at most 1 and solve for n:

1.96 · 5/√n ≤ 1, so √n ≥ (1.96)(5) = 9.8, and n ≥ 9.8² = 96.04

Because 96 monkeys would give a slightly larger margin of error than desired, the researchers would need 97 monkeys to estimate the cholesterol levels to their satisfaction. (On learning the cost of getting this many monkeys, the researchers might want to consider studying rats instead!)

Taking observations costs time and money. The required sample size may be impossibly expensive. Notice that it is the size of the sample that determines the margin of error. The size of the population does not influence the sample size we need. This is true as long as the population is much larger than the sample.

CHECK YOUR UNDERSTANDING

1. To assess the accuracy of a laboratory scale, a standard weight known to weigh 10 grams is weighed repeatedly. The scale readings are Normally distributed with unknown mean (this mean is 10 grams if the scale has no bias). In previous studies, the standard deviation of the scale readings has been about 0.0002 gram. How many measurements must be averaged to get a margin of error of 0.0001 with 98% confidence? Show your work.
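The sample-size strategy from the monkey example (solve z* · σ/√n ≤ ME for n, then round up) can be sketched as follows. The function name `sample_size_for_mean` is hypothetical. Passing `z_star=1.96` mirrors the rounded table value used in the example; leaving it out computes z* from the Normal distribution instead.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_mean(margin, sigma_guess, confidence=0.95, z_star=None):
    """Smallest n with z* * sigma_guess / sqrt(n) <= margin."""
    if z_star is None:
        # No table value supplied: compute z* from the standard Normal curve.
        z_star = NormalDist().inv_cdf((1 + confidence) / 2)
    # n >= (z* * sigma / ME)^2, rounded up because n must be a whole number.
    return ceil((z_star * sigma_guess / margin) ** 2)


# Monkey example: ME = 1 mg/dl, sigma guessed as 5 mg/dl, z* = 1.96.
n_monkeys = sample_size_for_mean(margin=1.0, sigma_guess=5.0, z_star=1.96)
print(n_monkeys)  # 97
```

Rounding up rather than to the nearest integer matters here: 96.04 rounds to 96, but 96 monkeys would leave the margin of error slightly above 1 mg/dl.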

8.3.3 When sigma Is Unknown: The t Distributions

When the sampling distribution of x-bar is close to Normal, we can find probabilities involving x-bar by standardizing:

z = (x-bar − μ) / (σ/√n)

Recall that the sampling distribution of x-bar has mean μ and standard deviation σ/√n, as shown in figure (a) below. What are the shape, center, and spread of the sampling distribution of the new statistic z? From what we learned in Chapter 6, subtracting the constant μ from the values of the random variable x-bar shifts the distribution left by μ units, making the mean 0. This transformation doesn’t affect the shape or spread of the distribution. Dividing by the constant σ/√n keeps the mean at 0, makes the standard deviation 1, and leaves the shape unchanged. As shown in figure (b), z has the standard Normal distribution N(0, 1). Therefore, we can use the z-table or a calculator to find the related probability involving z. That’s how we have gotten the critical values for our confidence intervals so far.

When we don’t know σ, we estimate it using the sample standard deviation s_x. What happens now when we standardize?

t = (x-bar − μ) / (s_x/√n)

This statistic has a distribution that is new to us, called a t distribution. It has a different shape than the standard Normal curve: still symmetric with a single peak at 0, but with much more area in the tails.

The statistic t has the same interpretation as any standardized statistic: it says how far x-bar is from its mean μ in standard deviation units. There is a different t distribution for each sample size. We specify a particular t distribution by giving its degrees of freedom (df). When we perform inference about a population mean μ using a t distribution, the appropriate degrees of freedom are found by subtracting 1 from the sample size n, making df = n − 1. We will write the t distribution with n − 1 degrees of freedom as t_{n−1} for short.

The figure below compares the density curves of the standard Normal distribution and the t distributions with 2 and 9 degrees of freedom. The figure illustrates these facts about the t distributions:

• The density curves of the t distributions are similar in shape to the standard Normal curve. They are symmetric about 0, single-peaked, and bell-shaped.

• The spread of the t distributions is a bit greater than that of the standard Normal distribution. The t distributions in the figure have more probability in the tails and less in the center than does the standard Normal. This is true because substituting the estimate s_x for the fixed parameter σ introduces more variation into the statistic.

• As the degrees of freedom increase, the t density curve approaches the standard Normal curve ever more closely. This happens because s_x estimates σ more accurately as the sample size increases. So using s_x in place of σ causes little extra variation when the sample is large.
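The three facts above can be checked numerically by evaluating the density curves directly. The sketch below uses the standard formula for the t density; the function names `normal_pdf` and `t_pdf` are made up for this illustration, and df = 1000 is an arbitrary stand-in for "many degrees of freedom".

```python
import math


def normal_pdf(x):
    """Height of the standard Normal density at x."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)


def t_pdf(x, df):
    """Height of the t density with `df` degrees of freedom at x."""
    # Work on the log scale so large df does not overflow the gamma function.
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))


# Compare heights at the center (x = 0) and out in the tail (x = 3).
for x in (0.0, 3.0):
    print(f"x = {x}: t2 = {t_pdf(x, 2):.4f}, t9 = {t_pdf(x, 9):.4f}, "
          f"t1000 = {t_pdf(x, 1000):.4f}, Normal = {normal_pdf(x):.4f}")
```

At x = 0 the Normal curve is tallest and the t curve with 2 degrees of freedom is lowest. At x = 3 the order reverses, which is the "more probability in the tails" pattern. With 1000 degrees of freedom the t heights are nearly identical to the Normal ones.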

The t-table gives critical values t* for the t distributions. Each row in the table contains critical values for the t distribution whose degrees of freedom appear at the left of the row. For convenience, several of the more common confidence levels C (in percents) are given at the bottom of the table. By looking down any column, you can check that the t critical values approach the Normal critical values z* as the degrees of freedom increase.

Example – Finding t*
Using the t-table

Suppose you want to construct a 95% confidence interval for the mean μ of a Normal population based on an SRS of size n = 12. What critical value t* should you use?

Using the t-table, we consult the row corresponding to df = n − 1 = 11. We move across that row to the entry that is directly above the 95% confidence level at the bottom of the table. The desired critical value is t* = 2.201.

In the previous example, notice that the corresponding standard Normal critical value for 95% confidence is z* = 1.96. We have to go out farther than 1.96 standard deviations to capture the central 95% of the t distribution with 11 degrees of freedom.

As with the standard Normal table, technology often makes the t-table unnecessary.

Learn Inverse t on the calculator
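Without a table or calculator, t* can also be approximated numerically. The sketch below is one possible approach, not the method used in the text: it integrates the t density with Simpson's rule and then bisects to find the point that captures the requested central area. All function names and the `steps_per_unit` setting are assumptions of this sketch.

```python
import math


def t_pdf(x, df):
    """Height of the t density with `df` degrees of freedom at x."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))


def t_central_area(x, df, steps_per_unit=400):
    """Area under the t density between -x and x (Simpson's rule)."""
    if x <= 0:
        return 0.0
    n = max(2, 2 * math.ceil(x * steps_per_unit / 2))  # even number of slices
    h = x / n
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    # Integral from 0 to x is h/3 * total; double it by symmetry about 0.
    return 2 * total * h / 3


def t_critical(confidence, df):
    """t* such that the middle `confidence` of the t curve lies in [-t*, t*]."""
    lo, hi = 0.0, 1.0
    while t_central_area(hi, df) < confidence:  # widen until t* is bracketed
        lo, hi = hi, 2 * hi
    for _ in range(50):                         # bisection
        mid = (lo + hi) / 2
        if t_central_area(mid, df) < confidence:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


print(round(t_critical(0.95, 11), 3))    # 2.201, matching the table row df = 11
print(round(t_critical(0.95, 1000), 3))  # close to z* = 1.96
```

The second call illustrates the point made about the table: as the degrees of freedom grow, t* approaches the Normal critical value.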

CHECK YOUR UNDERSTANDING

Use the t-table to find the critical value t* that you would use for a confidence interval for a population mean µ in each of the following situations. If possible, check your answer with technology.

(a) A 98% confidence interval based on n = 22 observations.

(b) A 90% confidence interval from an SRS of 10 observations.

(c) A 95% confidence interval from a sample of size 7.

8.3.4 Constructing a Confidence Interval for μ

When the conditions for inference are satisfied, the sampling distribution of x-bar has roughly a Normal distribution with mean μ and standard deviation σ/√n. Because we don’t know σ, we estimate it by the sample standard deviation s_x. We then estimate the standard deviation of the sampling distribution by s_x/√n. This value is called the standard error of the sample mean x-bar, or just the standard error of the mean.

As with proportions, some books refer to the standard deviation of the sampling distribution of x-bar as the “standard error” and what we call the standard error of the mean as the “estimated standard error.” The standard error of the mean is often abbreviated SEM.

Standard error of the sample mean - The standard error of the sample mean x-bar is s_x/√n, where s_x is the sample standard deviation. It describes how far x-bar will be from μ, on average, in repeated SRSs of size n.

To construct a confidence interval for μ, replace the standard deviation σ/√n of x-bar by its standard error s_x/√n in the formula for the one-sample z interval for a population mean. Use critical values from the t distribution with n − 1 degrees of freedom in place of the z critical values. That is,

x-bar ± t* · s_x/√n

This one-sample t interval for a population mean is similar in both reasoning and computational detail to the one-sample z interval for a population proportion of Section 8.2. So we will now pay more attention to questions about using these methods in practice.
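As a minimal sketch of the computation, the code below builds a one-sample t interval from raw data. The twelve readings are invented for illustration, and the function name `t_interval` is hypothetical. The critical value t* = 2.201 is the table value for 95% confidence with df = 11, as found in the earlier example.

```python
from math import sqrt
from statistics import mean, stdev


def t_interval(sample, t_star):
    """One-sample t interval: x_bar +/- t* * s_x / sqrt(n)."""
    n = len(sample)
    x_bar = mean(sample)
    s_x = stdev(sample)              # sample standard deviation (divides by n - 1)
    standard_error = s_x / sqrt(n)   # standard error of the mean
    margin = t_star * standard_error
    return x_bar - margin, x_bar + margin, standard_error


# Hypothetical sample of n = 12 measurements (not from the text).
readings = [9.8, 10.4, 10.1, 9.6, 10.9, 10.2, 9.9, 10.5, 10.0, 10.3, 9.7, 10.6]
low, high, se = t_interval(readings, t_star=2.201)
print(f"standard error = {se:.4f}")
print(f"95% CI for the mean: ({low:.3f}, {high:.3f})")
```

The critical value is passed in rather than computed so that the sketch follows the table-based workflow of this section. The degrees of freedom must match the sample size (df = n − 1).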

As before, we have to verify three important conditions before we estimate a population mean. When we do inference in practice, verifying the conditions is often a bit more complicated.

The following example shows you how to construct a confidence interval for a population mean when σ is unknown. By now, you should recognize the four-step process. Since you are expected to include these four steps whenever you perform inference, we will stop saying “follow the four-step process” in examples and exercises. We will also limit our use of the icon to examples from this point forward.

Example – Video Screen Tension
Constructing a confidence interval for μ

A manufacturer of high-resolution video terminals must control the tension on the mesh of fine wires that lies behind the surface of the viewing screen. Too much tension will tear the mesh, and too little will allow wrinkles. The tension is measured by an electrical device with output readings in millivolts (mV). Some variation is inherent in the production process. Here are the tension readings from a random sample of 20 screens from a single day’s production:

Construct and interpret a 90% confidence interval for the mean tension μ of all the screens produced on this day.

State:

Plan:

Do:

Conclude:

Now that we’ve calculated our first confidence interval for a population mean μ, it’s time to make a simple observation. Inference for proportions uses z; inference for means uses t. That’s one reason why distinguishing categorical from quantitative variables is so important.

Example – Auto Pollution
A one-sample t interval for μ

Environmentalists, government officials, and vehicle manufacturers are all interested in studying the auto exhaust emissions produced by motor vehicles. The major pollutants in auto exhaust from gasoline engines are hydrocarbons, carbon monoxide, and nitrogen oxides (NOX). Researchers collected data on the NOX levels (in grams/mile) for a random sample of 40 light-duty engines of the same type. The mean NOX reading was 1.2675 and the standard deviation was 0.3332.

(a) Construct and interpret a 95% confidence interval for the mean amount of NOX emitted by light-duty engines of this type.

State:

Plan:

Do:

Conclude:

(b) The Environmental Protection Agency (EPA) sets a limit of 1.0 gram/mile for NOX emissions. Are you convinced that this type of engine has a mean NOX level of 1.0 or less? Use your interval from (a) to support your answer.

CHECK YOUR UNDERSTANDING

Biologists studying the healing of skin wounds measured the rate at which new cells closed a cut made in the skin of an anesthetized newt. Here are data from a random sample of 18 newts, measured in micrometers (millionths of a meter) per hour:

We want to estimate the mean healing rate µ with a 95% confidence interval.

1. Define the parameter of interest.

2. What inference method will you use? Check that the conditions for using this procedure are met.

3. Construct a 95% confidence interval for µ. Show your method.

4. Interpret your interval in context.

8.3.5 Using t Procedures Wisely

The stated confidence level of a one-sample t interval for μ is exactly correct when the population distribution is exactly Normal. No population of real data is exactly Normal. The usefulness of the t procedures in practice therefore depends on how strongly they are affected by lack of Normality. Procedures that are not strongly affected when a condition for using them is violated are called robust.

Robust procedures - An inference procedure is called robust if the probability calculations involved in that procedure remain fairly accurate when a condition for using the procedure is violated.

For confidence intervals, “robust” means that the stated confidence level is still pretty accurate. That is, if we use the procedure to calculate many 95% confidence intervals, about 95% of those intervals would capture the population mean μ. If the procedure isn’t robust, then the actual capture rate might be very different from 95%.

If outliers are present in the sample, then the population may not be Normal. The t procedures are not robust against outliers, because x-bar and s_x are not resistant to outliers.

Example – More Auto Pollution
t procedures not robust against outliers

Environmentalists, government officials, and vehicle manufacturers are all interested in studying the auto exhaust emissions produced by motor vehicles. The major pollutants in auto exhaust from gasoline engines are hydrocarbons, carbon monoxide, and nitrogen oxides (NOX). Researchers collected data on the NOX levels (in grams/mile) for a random sample of 40 light-duty engines of the same type. The mean NOX reading was 1.2675 and the standard deviation was 0.3332.

We constructed a confidence interval for the mean level of NOX emitted by a specific type of light-duty car engine. The original random sample actually included 41 engines, but one of them recorded an unusually high amount (2.94 grams/mile) of NOX. Upon further inspection, this engine had a mechanical defect. So the researchers decided to remove this value from the data set. The Minitab computer output below gives some numerical summaries for NOX emissions in the original sample.

Descriptive Statistics: NOX

The confidence interval based on this sample of 41 engines would be (using df = 40 from the t-table)

Our new confidence interval is wider and is centered at a higher value than our original interval of 1.1599 to 1.3751.
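The effect of the single outlier can be sketched from the numbers quoted in this example. Two assumptions are worth flagging. First, the original interval of 1.1599 to 1.3751 is reproduced by t* = 2.042, the table value for df = 30, which is assumed here to be the row used in place of df = 39. Second, the 41-engine mean and standard deviation are rebuilt from the rounded 40-engine summaries plus the value 2.94, so they only approximate the Minitab output. The function names are hypothetical.

```python
from math import sqrt


def t_interval_from_summary(x_bar, s_x, n, t_star):
    """One-sample t interval from summary statistics."""
    margin = t_star * s_x / sqrt(n)
    return x_bar - margin, x_bar + margin


def add_one_value(x_bar, s_x, n, new_value):
    """Mean, sample SD, and size after one more observation joins the sample."""
    new_n = n + 1
    new_mean = (n * x_bar + new_value) / new_n
    # The sum of squared deviations grows by n/(n+1) * (new_value - x_bar)^2.
    sum_sq = (n - 1) * s_x ** 2 + n * (new_value - x_bar) ** 2 / new_n
    return new_mean, sqrt(sum_sq / (new_n - 1)), new_n


# 40 engines, outlier removed; t* = 2.042 (table row df = 30, an assumption).
original = t_interval_from_summary(1.2675, 0.3332, 40, t_star=2.042)
print(round(original[0], 4), round(original[1], 4))  # 1.1599 1.3751

# Put the defective engine (2.94 grams/mile) back in; t* = 2.021 for df = 40.
mean41, sd41, n41 = add_one_value(1.2675, 0.3332, 40, 2.94)
with_outlier = t_interval_from_summary(mean41, sd41, n41, t_star=2.021)
print(f"mean = {mean41:.4f}, s_x = {sd41:.4f}")
print(f"interval with outlier: ({with_outlier[0]:.4f}, {with_outlier[1]:.4f})")
```

One extra observation out of 41 raises both x-bar and s_x, so the interval moves upward and widens even though the sample size increased.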

Fortunately, the t procedures are quite robust against non-Normality of the population except when outliers or strong skewness are present. Larger samples improve the accuracy of critical values from the t distributions when the population is not Normal. This is true for two reasons:

1. The sampling distribution of the sample mean x-bar from a large sample is close to Normal (that’s the central limit theorem). Normality of the individual observations is of little concern when the sample size is large.

2. As the sample size n grows, the sample standard deviation s_x will be an accurate estimate of σ whether or not the population has a Normal distribution.

Always make a plot to check for skewness and outliers before you use the t procedures for small samples. For most purposes, you can safely use the one-sample t procedures when n ≥ 15 unless an outlier or strong skewness is present. Except in the case of small samples, the condition that the data come from a random sample or randomized experiment is more important than the condition that the population distribution is Normal. Here are practical guidelines for the Normal condition when performing inference about a population mean.
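The idea of a capture rate can be explored by simulation. The sketch below is an illustration under assumptions chosen for this purpose, none of which come from the text: samples of size n = 15, a standard Normal population versus a strongly right-skewed exponential population with mean 1, 10,000 repetitions, and a fixed random seed. The critical value 2.145 is the table value for 95% confidence with df = 14.

```python
import random
from math import sqrt


def covers(sample, true_mean, t_star):
    """True if the one-sample t interval from `sample` captures `true_mean`."""
    n = len(sample)
    x_bar = sum(sample) / n
    s_x = sqrt(sum((x - x_bar) ** 2 for x in sample) / (n - 1))
    margin = t_star * s_x / sqrt(n)
    return x_bar - margin <= true_mean <= x_bar + margin


def capture_rate(draw, true_mean, n, t_star, reps, rng):
    """Fraction of `reps` simulated intervals that capture the true mean."""
    hits = sum(covers([draw(rng) for _ in range(n)], true_mean, t_star)
               for _ in range(reps))
    return hits / reps


rng = random.Random(2020)   # fixed seed so the run is repeatable
T_STAR_DF14 = 2.145         # table value: 95% confidence, df = 14

normal_rate = capture_rate(lambda r: r.gauss(0.0, 1.0), 0.0,
                           15, T_STAR_DF14, 10_000, rng)
skewed_rate = capture_rate(lambda r: r.expovariate(1.0), 1.0,
                           15, T_STAR_DF14, 10_000, rng)
print(f"Normal population:      {normal_rate:.3f}")
print(f"Exponential population: {skewed_rate:.3f}")
```

The Normal population should give a capture rate very close to the stated 95%, while the strongly skewed population falls noticeably short at this sample size. Raising n in the sketch (with the matching t*) shows the gap shrinking.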

If your sample data would give a biased estimate for some reason, then you shouldn’t bother computing a t interval. Or if the data you have are the entire population of interest, then there’s no need to perform inference (because you would know the true parameter value).

Example – People, Trees, and Flowers
Can we use t?

Determine whether we can safely use a one-sample t interval to estimate the population mean in each of the following settings.

(a) The figure below is a histogram of the percent of each state’s residents who are at least 65 years of age.

(b) The figure below is a stemplot of the force required to pull apart 20 pieces of Douglas fir.

(c) The figure below is a stemplot of the lengths of 23 specimens of the red variety of the tropical flower Heliconia.

Your calculator will compute a one-sample t interval for a population mean from sample data or summary statistics.

Learn One-sample t intervals for μ on the calculator