
Year 13 Mathematics IAS 3 - Nulake 3.9 Sample.pdf · 2015-11-30 · Bivariate Data • Achievement...


Year 13 Mathematics

Bivariate Data

Robert Lakeland & Carl Nugent

NuLake Ltd, Innovative Publisher of Mathematics Texts

IAS 3.9

Contents

• Achievement Standard (page 2)
• Bivariate Data (page 3)
• Scatter Plots (page 4)
• Relationships (page 7)
• Correlation Coefficient (page 12)
• Regression Analysis (page 24)
• Outliers (page 35)
• Residuals (page 40)
• Residual Plots (page 41)
• Causation (page 46)
• Non-Linear Regression (page 47)
• Bivariate Investigation (page 52)
• Practice Internal Assessment (page 54)
• Answers (page 57)

IAS 3.9 – Year 13 Mathematics and Statistics – Published by NuLake Ltd New Zealand © Robert Lakeland & Carl Nugent


This achievement standard involves students investigating bivariate measurement data.

◆ This achievement standard is derived from Level 8 of The New Zealand Curriculum and is related to the achievement objectives
  Carry out investigations of phenomena, using the statistical enquiry cycle
  ❖ using existing data sets
  ❖ finding, using, and assessing appropriate models (including linear regression for bivariate data), seeking explanations, and making predictions
  ❖ using informed contextual knowledge and statistical inference
  ❖ communicating findings and evaluating all stages of the cycle
  in the Statistics strand of the Mathematics and Statistics Learning Area.

◆ Investigate bivariate measurement data involves showing evidence of using each component of the statistical enquiry cycle.

◆ Investigate bivariate measurement data, with justification involves linking components of the statistical enquiry cycle to the context, and referring to evidence such as statistics, data values, trends, or features of visual displays in support of statements made.

◆ Investigate bivariate measurement data, with statistical insight involves integrating statistical and contextual knowledge throughout the investigation process, and may include reflecting about the process; considering other relevant variables; evaluating the adequacy of any models, or showing a deeper understanding of the models.

◆ Using the statistical enquiry cycle to investigate bivariate measurement data involves:
  ❖ posing an appropriate relationship question using a given multivariate data set
  ❖ selecting and using appropriate display(s)
  ❖ identifying features in the data
  ❖ finding an appropriate model
  ❖ describing the nature and strength of the relationship and relating this to the context
  ❖ using the model to make a prediction
  ❖ communicating findings in a conclusion.

Measurement data can either be discrete or continuous in nature. In regression analysis the y-variable, or response variable, must be a continuous variable. The x-variable or explanatory variable can be either a discrete or continuous variable. The relationship may be non-linear.

◆ Use and interpretation of R² is not expected at this level.

Achievement
• Investigate bivariate measurement data.

Achievement with Merit
• Investigate bivariate measurement data, with justification.

Achievement with Excellence
• Investigate bivariate measurement data, with statistical insight.

NCEA 3 Internal Achievement Standard 3.9 – Bivariate Data


Bivariate Data

Introduction

In statistics we are often interested in identifying relationships involving more than one variable. In this booklet we study Bivariate Data which deals with the study of two quantitative variables (pairs of variables) and identifying relationships between them. We aim to identify the relationship between the variables, graphically with the use of a scatter plot and quantitatively by looking at correlation (the degree to which two or more quantities are linearly associated). In addition we use regression to enable us to make quantitative predictions (interpolation) of one variable from the other.

Much of the emphasis in this Achievement Standard is on the visual interpretation of scatter plots as well as linking statistical knowledge to the context of the question and using appropriate reasoning and reflection. To this end we are suggesting that students have access to iNZight, “a simple data analysis system which encourages exploring what data is saying without the distractions of driving complex software” (iNZight website). iNZight can be downloaded from http://www.stat.auckland.ac.nz/~wild/iNZight/dlw.html and can be installed on either a Mac or Windows computer.

All data sets used in this booklet can be downloaded from our website at www.nulake.co.nz under the ‘Downloads’ link and imported directly into iNZight or Excel.


Relationships

When we draw a scatter plot the shape of the plot is usually defined as linear or non-linear (curved). A correlation exists between two variables when one of them is related or can be influenced by the other in some way. The strength of a relationship can be identified visually, on a scatter plot, by how tightly clustered or spread out the plotted points are, and the direction of the relationship can be identified by the general slope of the pattern of the data. Study the scatter plots below and note the visual description of shape, strength and direction of the relationship between x and y.

[Figure: ten example scatter plots of y against x, captioned as follows.]
• linear, strong positive relationship
• linear, moderate positive relationship
• linear, weak positive relationship
• linear, strong negative relationship
• linear, moderate negative relationship
• linear, weak negative relationship
• non-linear, strong positive relationship
• non-linear, moderate negative relationship
• no relationship
• strong relationship with outliers

An outlier is one or more points that do not follow the trend.

Inspecting a Scatter Plot

When we are given a scatter plot and asked to comment on the relationship between the variables our first approach is a visual one. The first step is to make sure we understand the precise meaning of the variables being used as well as the units applied to the variables. Sometimes it may be necessary to research the meaning of the variables so that you have a better understanding prior to investigating a possible relationship.

Next look at the degree of scatter of the points. Are the points clumped together or spread out? Are the points in a number of distinct clusters or is there just a single cluster of points with one or two points at the extremes of the scatter plot? Are there any unusual observations that go against the general trend of the scatter plot? Are these points still realistic, i.e. can they be explained? What is the variation of scatter in the scatter plot? Are the points close to each other or are there significant gaps between them? Do the points follow a discernible pattern or trend, e.g. following a straight line?

Correlation

When determining whether a relationship exists between two variables we also want to be able to measure the strength and direction of a relationship quantitatively, and to do this for linear relationships we can calculate the (linear) correlation coefficient (r). This is explained in detail on page 12.


Example

A heater is turned on in a cold room and the temperature (˚C) of the room is recorded every five minutes. The results are given below. Find the least squares regression line using iNZight.

Time (x)    ˚C (y)
0           0.3
5           1.8
10          3.7
15          5.4
20          8.1
25          10.2
30          11.4
35          13.8
40          15.1

Import the data set ‘Example’ into iNZight and then draw a scatter plot of time versus temperature.

Add a linear trend line (regression line) to the scatter plot by clicking on the button ‘Add to Plot’ and then on the radio button ‘Add trend curves’. Click on the check box ‘linear’ followed by ‘Show Changes’ and ‘Done’.

To find the regression line click on the ‘Get Summary’ button. Both the correlation coefficient and the regression line are displayed in the window.

The regression line is written in terms of the variable names rather than x and y, i.e. Degrees.C (y) = 0.38467 * Time.mins (x) + 0.06
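
As a cross-check on the iNZight summary, the least squares line for the heater data can also be worked out directly from the standard formulas. The following is a minimal sketch in plain Python, not part of iNZight; the function name least_squares_line is an illustrative choice, and the data values are those in the table above.

```python
# Heater example: time in minutes (x) and room temperature in degrees C (y).
times = [0, 5, 10, 15, 20, 25, 30, 35, 40]
temps = [0.3, 1.8, 3.7, 5.4, 8.1, 10.2, 11.4, 13.8, 15.1]


def least_squares_line(xs, ys):
    """Return (gradient, intercept) of the least squares regression line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sum of squared x deviations.
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    gradient = s_xy / s_xx
    intercept = mean_y - gradient * mean_x
    return gradient, intercept


gradient, intercept = least_squares_line(times, temps)
print(f"Degrees.C = {gradient:.5f} * Time.mins + {intercept:.2f}")
```

Rounded as in the iNZight output, the gradient and intercept should agree with the line quoted in the example.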


Achievement – Answer the following questions.

48. Download the data set called Question48.csv from under the ‘Downloads’ link on our website (www.nulake.co.nz). Import the file into iNZight.

a) Using iNZight produce a scatter plot of x versus y. Visually describe your scatter plot and identify any features that stand out, e.g. strength (degree of scatter), groupings (clusters), unusual observations, and the variation of scatter etc.

   Comment on the linear correlation coefficient for your scatter plot and find the least squares regression line.

   Predict the value of y when x = 5.0.

b) Import the file Question48b.csv which is the same data set as above with some additional points included. Draw a scatter plot of the data. Is there any point that you could consider a possible outlier? If so identify the point.

c) What effect does your chosen point in b) have on the linear correlation coefficient and regression line?

d) Would your chosen point in part b) be better identified as an influential point rather than an outlier or both? Justify your answer.

e) Import the file Question48e.csv which is the same data set as above with some additional points included. Draw a scatter plot of the data. Is there any point that you could consider a possible outlier? If so identify the point.

f) What effect does your chosen point in e) have on the linear correlation coefficient and regression line?

g) Would your chosen point in part e) be better identified as an influential point or an outlier or both or neither? Justify your answer.


Example

Open the data set Question16 in Excel, which comprises the mean lifespan (in years), metabolic rate (cm³/g/h), gestation period (days) and brain weight (g) of 27 mammals. By either using the Analysis ToolPak, which can be installed using ‘Add-Ins’ for Excel 2010, or StatPlus mac LE, which is a separate program that can be downloaded from www.analystsoft.com/en/products/statplusmacle/ for Mac, produce a plot of the residuals for gestation period versus lifespan of the 27 mammals and comment on what the plot tells us about using a linear model for the association between these two variables.

If you are using StatPlus mac LE on a Mac begin by booting the application StatPlus. When StatPlus boots it will also boot Excel. Open a new worksheet in Excel.

Using Excel open the data set Question16.csv. Copy the two columns C and D, i.e. Lifespan and Gestation Period, from the Question16.csv spreadsheet into columns A and B of your blank worksheet. Now close the Question16.csv spreadsheet.

In the StatPlus application choose the menu option ‘Statistics’, then ‘Regression’, then ‘Multiple Linear Regression’. The following window appears.

Click on the button to the right of the ‘Dependent variable’ field. Remember the dependent variable is the response or y variable. The Excel window will be activated so you can select the cells representing the dependent variable of our data pair, i.e. ‘Lifespan’. When you return to StatPlus [Workbook1]Sheet1!$A$1:$A$28 should appear. Click on the button to the right of the ‘Independent variable’ field. The independent variable is the explanatory or x variable. The Excel window will be activated so you can select the cells representing the independent variable of our data pair, that is, ‘Gestation Period’. When you return to StatPlus [Workbook1]Sheet1!$B$1:$B$28 should appear. Now click the check boxes ‘Plot residuals vs fitted’ and ‘Plot line fit’ followed by ‘OK’.

Generated are the residuals, a plot of the residuals and a scatter plot in an Excel spreadsheet. The spreadsheet includes a large amount of information including the least squares regression line (Lifespan = 9.44087 + 0.10247 * Gestation period), the linear correlation coefficient (r = 0.69011) as well as a table of residuals (see below and the next page).

The predicted y values are those generated by substituting the appropriate gestation period into the least squares regression line. For observation 1 the gestation period is 14. Substituting this into Lifespan = 9.44087 + 0.10247 * Gestation period gives us a predicted value, i.e. Lifespan (y) of 10.8754. The observed point for a gestation period of 14 days is a lifespan of 50 years so the difference between 50 and 10.8754, i.e. 39.1246 is our residual.
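
The residual calculation for observation 1 can be reproduced in a few lines. This is a small illustrative sketch in Python rather than anything produced by StatPlus or Excel; the function names are hypothetical, and the coefficients are the rounded values quoted in the example.

```python
def predicted_lifespan(gestation_days):
    """Predicted lifespan (years) from the regression line quoted in the example."""
    return 9.44087 + 0.10247 * gestation_days


def residual(observed, predicted):
    """Residual = observed value minus predicted value."""
    return observed - predicted


# Observation 1: gestation period 14 days, observed lifespan 50 years.
predicted = predicted_lifespan(14)
resid = residual(50, predicted)
print(f"predicted = {predicted:.4f}, residual = {resid:.4f}")
```

Because the published coefficients are rounded, the final decimal place may differ slightly from the spreadsheet values of 10.8754 and 39.1246. A positive residual means the observed point lies above the regression line.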

The instructions given here are for StatPlus mac LE. See the following page for instructions on using the Analysis ToolPak.

[Figure: StatPlus regression output, with callouts marking the correlation coefficient and the regression equation.]


Non-Linear Regression cont...

An alternative to fitting a logarithmic model could be to investigate using a negative parabola, especially if you are using iNZight where you have limited options (see below).

The equation for the quadratic trend is Strength = 1016.18 Day – 19.35 Day² + 12520.5 kPa, which gives a strength of 14475 kPa for 2 days curing and a strength of 20747 kPa for 10 days curing. Comparing the three trends (linear, logarithmic and quadratic), the linear trend will become less reliable in the long term (>25 days) as it will continue to increase at the same rate over time. The quadratic will also become unreliable because it will begin to decrease once the maximum point of the quadratic is reached. In the long term the logarithmic model is likely to be the best fit as it keeps increasing but at an ever slower rate. Another option is to adopt a piecewise function, which is a combination of the different functions over different subsets of x.
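
The two quoted predictions, and the point at which the quadratic trend starts to decrease, can be checked numerically. The sketch below is illustrative Python only; the function name quadratic_strength is an assumption, and the coefficients are those of the quadratic trend quoted above.

```python
def quadratic_strength(day):
    """Strength (kPa) from the quadratic trend quoted in the text."""
    return 1016.18 * day - 19.35 * day ** 2 + 12520.5


# Predictions quoted in the text for 2 and 10 days of curing.
for day in (2, 10):
    print(day, round(quadratic_strength(day)))

# A parabola a*x^2 + b*x + c turns at x = -b / (2a); here a = -19.35, b = 1016.18.
turning_day = 1016.18 / (2 * 19.35)
print(f"turning point at about day {turning_day:.1f}")
```

The turning point falls a little after day 26, which is consistent with the remark that the quadratic model becomes unreliable beyond about 25 days.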

iNZight at present only allows quadratic and cubic models for non-linear functions. If you require exponential, logarithmic or power models the authors suggest you use Excel. You can start from the scatter plots produced by the Analysis ToolPak or StatPlus mac LE.


Page 10 Q16. cont...

g) There appears to be a moderate positive linear association between the two variables. Mammals with a longer gestation period have a longer lifespan. Two unusual observations are the echidna, which has a gestation period of only 14 days but a long lifespan of 50 years, and man, with a gestation period of 267 days and a very long lifespan of 86 years. Most points fall within the gestation range of 0 to <200 days and lifespan of 0 to 20 years.

h) Metabolic rate (explanatory) and gestation period (response). Investigating how metabolic rate predicts gestation period, not the other way round.

i)

Page 11 Q16. cont...

j) They have a low gestation period.

k) Not an obvious linear one although greater metabolic rate does result in a lower gestation period. More likely a negative non-linear relationship.

l) There appears to be a weak negative non-linear association between the two variables. Mammals with a greater metabolic rate have a shorter gestation period. Most points are clustered in the metabolic range 0 to <1.0 and gestation period of 0 to <300 days. Extreme points are those of the Asian elephant, little brown bat and E. American mole, although they would still fit a non-linear model.

m)

n) Gestation period is a reasonable predictor of lifespan although man and the E. American mole go against the trend. A linear model could be suitable for these two variables. Metabolic rate versus lifespan would be better suited to a non-linear model. Again the E. American mole goes against the trend. A suitable non-linear model would probably be a good predictor of lifespan. With brain weight versus lifespan a linear model is affected by the Asian elephant and E. American mole.

   Overall gestation looks like the better predictor from a linear perspective but metabolic rate from a non-linear perspective. Brain weight is not as good as the other two.

Page 16

17. r = –0.912

         x      y       xy       x²      y²
         5      40.1    200.5    25      1608.01
         15     32.2    483      225     1036.84
         18     35.1    631.8    324     1232.01
         20     34.3    686      400     1176.49
         25     23.6    590      625     556.96
         30     26.9    807      900     723.61
         38     24.1    915.8    1444    580.81
         50     20.0    1000     2500    400
Total    201    236.3   5314.1   6443    7314.73

18. r = 0.975

         x      y       xy       x²      y²
         1      3.1     3.1      1       9.61
         2      4.4     8.8      4       19.36
         3      7.2     21.6     9       51.84
         4      6.6     26.4     16      43.56
         5      15      75       25      225
         6      14.1    84.6     36      198.81
         9      20.3    182.7    81      412.09
         10     25.3    253      100     640.09
Total    40     96.0    655.2    272     1600.36

19. r = 0.920

Page 17 (Answers may vary)

20. 1. Constant difference.

21. Close to –1. The older a car the less its value.

22. Somewhat negative. Less well educated people smoke.

23. Close to 1. People with big feet are generally taller.

24. Somewhat negative. More cigarettes smoked the lower the level of fitness.

25. –1. The more you spend the less you save.

26. Close to 1. Tall men often choose tall wives.

27. Somewhat positive. More land area greater price.
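
The column totals in the two working tables are what the correlation coefficient is calculated from. As a hedged sketch (the function name is an illustrative choice, and the formula used is the standard computational form r = (nΣxy – ΣxΣy) / √[(nΣx² – (Σx)²)(nΣy² – (Σy)²)], which the sample pages shown here do not state explicitly), the totals can be fed into a short Python function:

```python
from math import sqrt


def correlation_from_sums(n, sum_x, sum_y, sum_xy, sum_x2, sum_y2):
    """Linear correlation coefficient r from the column totals of a working table."""
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


# Column totals from the two working tables (8 data pairs in each).
r_first = correlation_from_sums(8, 201, 236.3, 5314.1, 6443, 7314.73)
r_second = correlation_from_sums(8, 40, 96.0, 655.2, 272, 1600.36)
print(round(r_first, 3), round(r_second, 3))
```

To three decimal places the results are –0.912 and 0.975, matching the values of r given for questions 17 and 18.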
