
The storm of the century! Promoting student enthusiasm for applied statistics

Lee Fawcett
School of Mathematics & Statistics, Newcastle University, Newcastle upon Tyne, UK
e-mail: [email protected]

Keith Newman
School of Mathematics & Statistics, Newcastle University, Newcastle upon Tyne, UK

Summary

This article describes a hands-on activity that has been used with students aged 12–18 years to promote the study of Statistics. We believe there is evidence to suggest an increase in student enthusiasm for Statistics at school, within the Mathematics curriculum, but also within other subjects such as Geography. We also believe that the use of such activities has resulted in some students giving more serious thought to studying Statistics at University. The activity described here is supported with a web-based application to allow younger or less experienced students to engage with the material.

Keywords: Teaching statistics; extreme values; applied statistics; student motivation; Shiny.

BACKGROUND AND MOTIVATION

It is our experience that new undergraduate Mathematicians/Statisticians often have a rather dim view of Statistics, and it is not until they study it at University that they begin to appreciate the very practical, hands-on nature of the subject. At Newcastle University, students are offered courses in Clinical Trials, Survival Analysis, Environmental Extremes and Financial Modelling, to name but a few; they often remark that when they take such courses they finally see the relevance of Statistics and can see its place in the real world. Through the first author’s role on the outreach and recruitment team at Newcastle University and associated visits to local schools, it has also become apparent that students often do not see the relevance of Statistics to other subjects, such as Geography, Biology and Psychology.

These school visits have helped to shed some light on the rather depressing scenario that such a practical subject, used in most areas of science and so having many exciting applications, is seen by many students as dry; shown below are comments taken directly from a questionnaire completed by 14–18 year olds on the subject of Statistics, distributed by the first author during outreach visits to local schools over the last 3 years:

“Boring boring boring. Wish this wasn’t part of my Maths course at school”

“All we seem to do is flip coins and roll fair six-sided dice.”

“Who cares about the chances of pulling a green sock from a drawer? Loads of rubbish examples are used and they’re boring”

“Spent 3 whole classes on frequency density in histograms. That’s as exciting as it gets”

“There was a question about John being late for school. How did they know [the probability of] this was 0.35?”

Some of these quotes might correspond to what Taleb (2007) refers to as the Ludic Fallacy, in which naive statistical assumptions underpin the modelling of complex scenarios. Students have often made comments about how unrealistic their study of Probability and Statistics is, and it is our belief that this could have a negative impact on their overall opinion of the subject and its place in the real world (and hence other subjects studied).

The classroom activities described in this article aim to dispel such concerns by bringing to life parts of the Probability and Statistics curriculum followed in schools in the UK. The activities use real-life data on annual maximum wave heights (AMWH) taken from hourly records at a location in the Gulf of Mexico, and the practical aim is very clear: to use Statistics to quantify the likelihood of extreme sea levels and hence better prepare for life-threatening flood events. The activities have been used for outreach and engagement purposes with students ranging from 12 year olds to 18 year olds considering studying Mathematics at University. The Storm of the Century!, as the overall activity is often advertised, has always been well-received by students, and teachers have often requested permission to use the materials in class to engage students with their Statistics curriculum. In some cases, teachers of other subjects, including Geography, have also asked to use these activities with their students.

Original Article

© 2016 Teaching Statistics Trust, 39, 1, pp 2–13

In this paper we describe The Storm of the Century! activities and attempt to assess their success as tools for promoting student enthusiasm for applied Statistics. We encourage teachers to use the resources we have developed in their own classes to help engage students with the following topics: relative frequency probability, interpreting probability, basic statistical modelling and extrapolation, transposition of formulae, the Normal distribution and non-standard probability models. All materials are available to view at a dedicated webpage: www.mas.ncl.ac.uk/~nlf8/outreach.

A web-based application has also been developed to enable students and teachers to interact with the material without having to get embroiled too deeply in the mathematics. For this, we have used the Shiny add-on package for the (open source) R software environment for statistical computing; see Chang et al. (2015). The application itself is hosted online and can be accessed with any modern web browser. It can also be used without an internet connection, subject to the installation of R. Our supporting webpage (link above) includes full access details for the application, as well as installation details for the software should running the application locally be required. Readers are invited to take a look at this and provide any comments or feedback.

THE STORM OF THE CENTURY! ACTIVITY

In this section, we describe the five main parts of The Storm of the Century! activity, plus an extension to the Normal distribution for more experienced teachers and students. A two-page handout accompanies some slides for an interactive presentation that typically takes between 60 and 90 minutes to complete, although this depends on the level of participation and the amount of assistance the students require. We recommend trained classroom assistants if attempting some of the more challenging parts with younger students. Parts 1–3 should be manageable with students from 12 years of age; part 4 requires some careful thought about the practical interpretation of probabilities; part 5 might require more confidence with algebra. An optional extra, probably only to be used with A level students (or equivalent), requires use of the Normal distribution (part 6). The full presentation and handout are available for readers to view on our webpage.

Part 1 - Motivation

After a five minute icebreaker, students are immediately told about the links between the statistical study of extremes and scientists who need to use such statistical methods: we talk about hydrologists, seismologists and oceanographers. We mention that the study of extremes, rather than averages, is a very specialized area of Statistics and one most students will not encounter until University Statistics courses. However, we explain that it is very important to these scientists, as extreme observations on variables such as rainfall, wind speeds and seismic activity (for example) are more likely to result in disasters such as floods and major earthquakes than are observations close to the average. In this activity the data we use are, by construction, extreme observations (see Table 1). Although the model we present for these data (see Part 3) originates from a rather niche area of Statistics, it is the probabilities that this model generates which provide the main focus of the activity. For older students who have studied the Normal distribution, we mention that we will be making use of the mean and standard deviation of our extreme observations later.

Table 1. Annual maximum wave heights (feet) taken from hourly observations at Shell Beach, Louisiana, 1955–2004

 8.5    8.9*   9.1*   8.9*   8.4    9.7*   9.1*
 9.6*   8.7    9.3*   9.6*   9.3*   8.7    9.0*
 8.8*   8.9*   8.9*  12.2*   7.8    7.7    8.3
 8.1    7.3    6.8    6.7    7.3    7.6    8.2
 8.6    9.8*   9.5*   7.4    7.3   10.2*  10.3*
10.4*   8.8*   9.7*  10.0*  10.8*  11.1*  12.7*
11.5*  11.8*  12.6*  13.0*  10.5*  10.5*  10.0*
 9.4*

Values marked * (highlighted in the original) are those which exceed 8.75 ft, for use in Eq. (1).


We show lots of motivating pictures from recent earthquake, tsunami and hurricane events; we update these pictures every year to make sure there are some that the students recognize from recent news stories. Our experience of delivering many outreach and engagement activities has told us that a non-mathematical introduction to such an activity, using topical, thought-provoking and often dramatic visual stimuli, helps to engage and enthuse the audience about any forthcoming hands-on activities. Figure 1 shows some of the pictures used, including hypothetical scenarios in New York and London. The class are then given some facts and figures about Hurricane Katrina and are asked to write some of these down in the space provided on their handout (see supporting webpage), including:

• AMWH during Katrina reached 14.4 ft
• Katrina was billed as the Storm of the Century

The remainder of the presentation focuses on AMWH data collected at a location in the Gulf of Mexico not far from New Orleans, Louisiana (Table 1). Notice the data span the 50 years up to, and including, 2004, the year before Katrina struck. Throughout the talk students are told to imagine themselves as the mathematician/statistician working as part of a scientific team investigating the design of a new sea wall being built to protect the city of New Orleans. The main premise of the activity is to think about how we can use historical data on extremes to estimate the likelihood of future AMWH larger than those ever recorded before; notice that the largest height in Table 1 is 13 ft, 1.4 ft lower than that observed during Katrina. On the Data Preview page of the Shiny application there is a drop-down menu from which various built-in datasets can be selected, one of which is the AMWH data shown in Table 1. A map showing the geographical location, with the raw data and various graphical/numerical summaries, is automatically displayed, along with a dataset description; see Figure 2.

Fig. 1. Visual stimuli used to motivate the study of environmental extremes. Top row: Hurricane Katrina; middle row: Hypothetical flooding in New York and London as a result of climate change; bottom row: the Boxing Day Tsunami in the Indian Ocean, 2004. The top-right photograph is the first author’s own, taken during a research visit to New Orleans (2011)

Part 2 - Basic activity using relative frequencies

The first statistical activity requires students to estimate the probability that the AMWH for any randomly selected year in the future (possibly 2005) exceeds 8.75 ft, using a relative frequency approach. From the data in Table 1, we can see that 33 observations exceed 8.75 ft (highlighted) and so, expressing this as a proportion of the total number of AMWH we have, gives:

\[ P(\text{AMWH} > 8.75 \text{ feet}) = \frac{33}{50} = 0.66. \tag{1} \]

Students are also asked to find, in the same manner, exceedance probabilities for 11.25 and 14 ft, respectively giving:

\[ P(\text{AMWH} > 11.25 \text{ feet}) = \frac{6}{50} = 0.12; \]

\[ P(\text{AMWH} > 14 \text{ feet}) = \frac{0}{50} = 0. \tag{2} \]

After engaging with the class about the practical interpretations of these probabilities, with reference to the probability scale, the students are then asked to critique their interpretation of these probabilities, especially that given by Eq. (2). Using past data alone implies that an AMWH greater than 14 ft is impossible, although we know that wave heights did exceed this level during Katrina, and students are asked to recall this fact from earlier. The task of estimating such an event seems impossible, unless we extrapolate using a suitable probability model (see part 3 of this activity).

Although obtaining these relative frequencies is very simple, we find it is an extremely accessible way to begin the activity. Almost all students can complete this task without assistance and, more importantly, can see the relevance of simple ideas of probability from their school Statistics curriculum to the real world. To finish the first task, students are then asked to complete, by hand, the graph shown in Figure 3. Here, they must plot exceedance probabilities for AMWH of 6.5, 7.0, …, 12.5 ft, these probabilities being obtained using the relative frequency approach as before.
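For teachers who would like to check the hand calculations, the relative frequency approach can be sketched in a few lines of Python. This is only an illustration and is not part of the authors' materials; the function name `exceedance_probability` is our own, and the data are typed in from Table 1.

```python
# Annual maximum wave heights (feet) from Table 1, 1955-2004.
AMWH = [
    8.5, 8.9, 9.1, 8.9, 8.4, 9.7, 9.1,
    9.6, 8.7, 9.3, 9.6, 9.3, 8.7, 9.0,
    8.8, 8.9, 8.9, 12.2, 7.8, 7.7, 8.3,
    8.1, 7.3, 6.8, 6.7, 7.3, 7.6, 8.2,
    8.6, 9.8, 9.5, 7.4, 7.3, 10.2, 10.3,
    10.4, 8.8, 9.7, 10.0, 10.8, 11.1, 12.7,
    11.5, 11.8, 12.6, 13.0, 10.5, 10.5, 10.0,
    9.4,
]


def exceedance_probability(data, threshold):
    """Relative frequency estimate of P(X > threshold)."""
    return sum(1 for x in data if x > threshold) / len(data)


# The three probabilities in Eq. (1) and Eq. (2).
for height in (8.75, 11.25, 14.0):
    p = exceedance_probability(AMWH, height)
    print(f"P(AMWH > {height} ft) = {p:.2f}")

# Heights used for the hand-drawn plot: 6.5, 7.0, ..., 12.5 ft.
grid = [6.5 + 0.5 * k for k in range(13)]
curve = [(h, exceedance_probability(AMWH, h)) for h in grid]
```

Run on the Table 1 data, the loop prints 0.66, 0.12 and 0.00, in agreement with Eq. (1) and Eq. (2); `curve` holds the points students plot by hand.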

Clicking on the Relative frequency tab of our Shiny application reproduces the plot shown in Figure 3 within the application itself, as well as a table of results showing relative frequency exceedance probabilities across a range of values of the variable being studied; see the screenshot in Figure 4. For the AMWH at the location being studied here, the table produced covers the exceedance probabilities the students are asked to calculate for themselves in the handout and gives the probabilities shown in the plot in Figure 3.

Part 3 - Moving on: A probability model for extremes

We now discuss simple ideas of modelling and, given the aim of estimating probabilities of very rare events, we attempt to justify the need for a well-fitting model from which to extrapolate. We discuss that this requires a leap of faith in that a model which describes our observed data well can be extended beyond the reach of our data, and such uncertainty means we are more reliant than ever on the model we choose (and, of course, we do always choose a model; there is no correct model).

After discussing some history surrounding the development of Extreme Value Theory, students are introduced to Gumbel’s model for exceedance probabilities. All notions of density and distribution functions are avoided, the Gumbel model being simply presented as the survival function of the Type I extreme value distribution (Coles, 2001), this function giving model-based estimates of the relative frequencies obtained empirically from the data (e.g. Eq. (1) and Eq. (2)). See the Appendix for full details of the Gumbel model (and in particular Eq. (4)).
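For orientation, the survival function of the Type I extreme value (Gumbel) distribution is usually written as below. The symbols \(a\) (location) and \(b\) (scale) are our own choice here, picked to avoid a clash with the Normal parameters used in part 6; the Appendix may use different notation.

```latex
\[
  P(X > x) \;=\; 1 - \exp\!\left\{ -\exp\!\left[ -\,\frac{x - a}{b} \right] \right\},
  \qquad b > 0 .
\]
```

Larger values of \(x\) give smaller exceedance probabilities, but the probability never reaches exactly zero, which is what allows extrapolation beyond the largest observed height.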

At this point, the teacher/facilitator has two options, depending on their own level of confidence and the student audience: (i) work directly with the formula in Eq. (4) in the Appendix, demonstrating the use of Gumbel’s model for the AMWH data with a scientific calculator and allowing the students to try this out for themselves; (ii) use our Shiny web application to automate the calculations in Gumbel’s model, allowing students and facilitators alike to engage with the ideas behind the model without getting embroiled in the mathematics. In the discussion below, we focus mainly on approach (ii) as we believe many teachers/facilitators and students would be most comfortable with this; however, we also briefly discuss how we have used approach (i) with older students at recent school visits.


Fig. 2. Data preview page of the Shiny web application, showing the data selection menu, geographical location of the data collection site, the raw data and associated numerical and graphical summaries


The Probability Model page of our Shiny application allows the user to obtain model-based estimates of relative frequency exceedance probabilities from the Gumbel model by selecting the option Two-parameter Gumbel Model. A box appears, giving the functional form of the model and a description of the parameters in the model with their (maximum likelihood) estimates. However, this feature can be ignored, if desired, and attention immediately given to the Table of probabilities and Plot of probabilities underneath; see the screenshot in Figure 5. The model-based estimates shown here can be compared directly to those estimates obtained from the data on the Relative Frequency page of the application (Figure 4) or, analogously, to those obtained by hand by the students (e.g. Figure 3). Notice from Figure 5 that the user has the option to switch between a variety of commonly-used probability models; the Normal Model, in particular, is considered in the extension activity (see part 6 below). Underneath the table and plot another feature of the application is a slider which can be adjusted to allow model-based exceedance probabilities for any value of interest to be returned.
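Readers curious about what a maximum likelihood fit of the Gumbel model involves might find the following standard-library Python sketch useful. It is not the application's code (the application is written in R with Shiny); it assumes the usual two-parameter Gumbel form with location `loc` and scale `scale`, and the function names are our own. The scale likelihood equation is solved by bisection and the location then follows in closed form.

```python
import math

# Annual maximum wave heights (feet) from Table 1, 1955-2004.
AMWH = [
    8.5, 8.9, 9.1, 8.9, 8.4, 9.7, 9.1,
    9.6, 8.7, 9.3, 9.6, 9.3, 8.7, 9.0,
    8.8, 8.9, 8.9, 12.2, 7.8, 7.7, 8.3,
    8.1, 7.3, 6.8, 6.7, 7.3, 7.6, 8.2,
    8.6, 9.8, 9.5, 7.4, 7.3, 10.2, 10.3,
    10.4, 8.8, 9.7, 10.0, 10.8, 11.1, 12.7,
    11.5, 11.8, 12.6, 13.0, 10.5, 10.5, 10.0,
    9.4,
]


def gumbel_mle(data, tol=1e-10):
    """Maximum likelihood estimates (loc, scale) for the Gumbel model."""
    n = len(data)
    xbar = sum(data) / n
    xmin = min(data)

    def scale_equation(scale):
        # Weights exp(-(x - xmin)/scale); subtracting xmin avoids
        # dividing by a sum that has underflowed to zero.
        w = [math.exp(-(x - xmin) / scale) for x in data]
        weighted_mean = sum(x * wi for x, wi in zip(data, w)) / sum(w)
        return xbar - weighted_mean - scale

    # The equation is positive near zero and negative at the data range.
    lo, hi = 1e-6, max(data) - xmin
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if scale_equation(mid) > 0:
            lo = mid
        else:
            hi = mid
    scale = 0.5 * (lo + hi)
    mean_w = sum(math.exp(-(x - xmin) / scale) for x in data) / n
    loc = xmin - scale * math.log(mean_w)
    return loc, scale


def gumbel_exceedance(x, loc, scale):
    """Model-based estimate of P(X > x) under the Gumbel model."""
    return 1.0 - math.exp(-math.exp(-(x - loc) / scale))


loc, scale = gumbel_mle(AMWH)
print(f"location = {loc:.3f}, scale = {scale:.3f}")
for height in (8.75, 11.25, 14.0):
    print(f"P(AMWH > {height} ft) = {gumbel_exceedance(height, loc, scale):.3f}")
```

The printed probabilities can be set alongside the Gumbel row of Table 2; we make no claim that this sketch reproduces the application's output to every decimal place.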

If the teacher/facilitator chooses to work with Gumbel’s formula directly with the students, using a scientific calculator instead of our Shiny application, the estimates of the model parameters (as provided by the application) should be given. We recommend the use of trained classroom assistants to help the students perform the calculations on a scientific calculator. With a more experienced audience, discussion of the exponential function can be made here. We often allow students to work in pairs at this point, and although they find the calculator work challenging they are often extremely satisfied when they realize they can do it!

Discussion surrounding best-fitting models is usually made, supported by the Comparisons page of the Shiny application; see the screenshot in Figure 6. Students are reminded of the importance of a good-fitting model as a basis for extrapolation. An informal assessment of the Gumbel model, relative to other probability models, can be made by visually comparing the curves shown in Figure 6 to the relative frequencies from the data. More formally, the goodness-of-fit table in the application gives the sum of the squared vertical distances between the model-based estimates and the relative frequencies in the plot; the smaller this value, the better.
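The goodness-of-fit figure described above can be illustrated with a short Python sketch. Several details here are assumptions rather than facts about the application: the comparison grid (we reuse 6.5, 7.0, …, 12.5 ft from part 2), the function names, and the Gumbel parameters, which are obtained by the method of moments as a simple stand-in for maximum likelihood estimates.

```python
import math
from statistics import NormalDist, mean, stdev

# Annual maximum wave heights (feet) from Table 1, 1955-2004.
AMWH = [
    8.5, 8.9, 9.1, 8.9, 8.4, 9.7, 9.1,
    9.6, 8.7, 9.3, 9.6, 9.3, 8.7, 9.0,
    8.8, 8.9, 8.9, 12.2, 7.8, 7.7, 8.3,
    8.1, 7.3, 6.8, 6.7, 7.3, 7.6, 8.2,
    8.6, 9.8, 9.5, 7.4, 7.3, 10.2, 10.3,
    10.4, 8.8, 9.7, 10.0, 10.8, 11.1, 12.7,
    11.5, 11.8, 12.6, 13.0, 10.5, 10.5, 10.0,
    9.4,
]


def relative_frequency(data, height):
    """Empirical estimate of P(X > height)."""
    return sum(1 for x in data if x > height) / len(data)


def sum_squared_distance(model, data, grid):
    """Sum of squared vertical gaps between model and empirical curves."""
    return sum((model(h) - relative_frequency(data, h)) ** 2 for h in grid)


m, s = mean(AMWH), stdev(AMWH)
normal = NormalDist(m, s)

# Method-of-moments Gumbel parameters (assumption: a stand-in for the
# maximum likelihood estimates reported by the application).
scale = s * math.sqrt(6) / math.pi
loc = m - 0.5772156649 * scale  # Euler-Mascheroni constant

grid = [6.5 + 0.5 * k for k in range(13)]  # assumed comparison grid
models = {
    "Normal": lambda h: 1.0 - normal.cdf(h),
    "Gumbel": lambda h: 1.0 - math.exp(-math.exp(-(h - loc) / scale)),
}
for name, model in models.items():
    print(name, round(sum_squared_distance(model, AMWH, grid), 4))
```

Because the grid and fitting method are assumed, the printed figures should not be expected to match those in Figure 6; the sketch only shows how such a figure is formed.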

Students are asked to return model-based estimates of the three exceedance probabilities considered in the first part of the activity and complete Table 2. For the Gumbel model, they are advised to take a reading from the curve shown in the plot in Figures 5 or 6, or to use the slider in Figure 5 to obtain a more accurate estimate. We explain the relevance of the estimates based on the Normal Model in the extension activity (part 6).

Part 4 - Practical interpretation

Students are asked to think about why, according to our analysis and results in Table 2, Katrina might be considered the Storm of the Century. The ensuing discussion is often very interesting. Some students make the connection between the probability 0.01 and once in a hundred years straight away, whilst others do not make such a direct link but correctly remark that a probability of 0.01 “…means this sort of event is extremely unlikely”. Once explained, most students are often extremely satisfied with the title of the activity, and some are even excited by the fact that they can see the place of Statistics in the formation of such headlines and that they have managed to do the calculations themselves! At this point, we discuss the process of extrapolation that we referred to earlier, and the reliance, more than ever, on a well-fitting model for past observations on extremes.

Fig. 3. Plot of relative frequencies completed by students, with annual maximum wave heights on the x-axis and the associated relative frequency exceedance probabilities on the y-axis

Fig. 4. Plot of relative frequency exceedance probabilities in the Shiny web application, along with a slider bar, which can be used to obtain these probabilities. A slider bar is also included, which uses the inverse function to obtain empirical quantiles given a particular exceedance probability

Fig. 5. Exceedance probabilities based on the fitted Gumbel model as shown in the Shiny web application. The slider bars return model-based exceedance probabilities for any chosen value, as well as quantiles obtained on inversion of the fitted Gumbel model

Fig. 6. The comparisons page of the Shiny web application, in this case comparing relative frequency exceedance probabilities to those obtained from the Gumbel and Normal models. The goodness-of-fit figures show the squared vertical discrepancies between the model-based and empirical exceedance probabilities for each model

Part 5 - Structural design: transposition of formulae

With older students, we now usually discuss a potential application of the fitted Gumbel model: its use as a tool for assisting the design of a new sea wall. We discuss the trade-offs between safety and cost: the higher the sea wall, the greater the level of safety afforded to a town or city, but also the greater the construction costs incurred. The following question is posed:

How tall should the sea wall be to protect against the AMWH we might expect to see, on average, once every 500 years?

Students are now left to ponder how they might be able to use the fitted Gumbel model to help answer this question. Classroom assistants often wander round the room, asking students if they would know where to start with this. Usually, very few do. However, the following prompt is usually enough for some students to begin to tackle the problem:

\[ P(\text{AMWH} > x) = \frac{1}{500}. \tag{3} \]

Replacing the left-hand-side of Eq. (3) with the fitted Gumbel model (see Eq. (4) in the Appendix) and then solving for x is usually manageable for students who are both algebraically confident and familiar with exponentials/natural logarithms. For those who are not, especially younger students, we refer to the box at the bottom-right of the Probability Model page in the Shiny application (Figure 5). Here, the calculations are performed automatically depending on the exceedance probability determined by the value selected on the slider (500 in this example). Thus, the height of the sea-wall offering protection against the AMWH we might expect to see once (on average) every 500 years is 16.56 ft (to 2 d.p.). This is known as the 500 year return level estimate (the screenshot in Figure 5 shows the estimated 100 year return level).
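The transposition step can be sketched in Python as follows, assuming the standard Gumbel survival function 1 − exp(−exp(−(x − loc)/scale)). The function names are our own, and the parameter values below are hypothetical placeholders chosen only to make the example run; they are not the estimates reported by the application, so the printed height is not expected to equal 16.56 ft.

```python
import math


def gumbel_exceedance(x, loc, scale):
    """P(X > x) under a Gumbel model with the given location and scale."""
    return 1.0 - math.exp(-math.exp(-(x - loc) / scale))


def gumbel_return_level(period, loc, scale):
    """Height exceeded, on average, once every `period` years.

    Found by setting the exceedance probability equal to 1/period and
    rearranging for x, which needs two applications of the natural log.
    """
    return loc - scale * math.log(-math.log(1.0 - 1.0 / period))


# Hypothetical placeholder parameters, for illustration only.
loc, scale = 8.5, 1.2

for period in (100, 500):
    level = gumbel_return_level(period, loc, scale)
    check = gumbel_exceedance(level, loc, scale)
    print(f"{period}-year return level: {level:.2f} ft (check: 1/{1 / check:.0f})")
```

Substituting the returned height back into the exceedance formula recovers 1/period, which is a useful check for students doing the algebra by hand.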

Part 6 - Extension task: Comparison to the Normal distribution

The extension task here might prove useful for older students who have some experience of working with the Normal distribution. Interested teachers should read on; otherwise, the activities can end with the tasks in parts 4 or 5.

Students who are taking/have taken Statistics at a more advanced level might already be familiar with some basic models for probability. To complete this activity, and so that students can contextualize this work with their own study of probability models, we compare estimates of exceedance probabilities and quantiles, such as those given by the Gumbel model in Table 2, with those from a model they are more familiar with: the Normal N(μ, σ²) distribution. This requires estimation of the mean μ and variance σ² from the data in Table 1 (an exercise in summary statistics in its own right), but also the use of tables of cumulative probabilities from the standard normal distribution and quantiles from this distribution. Doing so gives the exceedance probabilities shown in the bottom row of Table 2, and an estimate of the 500-year return level of 13.71 ft, considerably smaller than that suggested by the Gumbel model. The exceedance probability associated with 14 ft is also 10 times smaller than that suggested by the Gumbel model. Of course, probabilities and quantiles from the Normal distribution can be obtained automatically from the Probability Model page of the Shiny application without the need to perform calculations by hand using statistical tables.
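As an alternative to statistical tables, the Normal calculations can be sketched with Python's standard library. This is an illustration rather than the authors' method; we assume the sample standard deviation (divisor n − 1) is used for σ, which appears consistent with the figures quoted above.

```python
from statistics import NormalDist, mean, stdev

# Annual maximum wave heights (feet) from Table 1, 1955-2004.
AMWH = [
    8.5, 8.9, 9.1, 8.9, 8.4, 9.7, 9.1,
    9.6, 8.7, 9.3, 9.6, 9.3, 8.7, 9.0,
    8.8, 8.9, 8.9, 12.2, 7.8, 7.7, 8.3,
    8.1, 7.3, 6.8, 6.7, 7.3, 7.6, 8.2,
    8.6, 9.8, 9.5, 7.4, 7.3, 10.2, 10.3,
    10.4, 8.8, 9.7, 10.0, 10.8, 11.1, 12.7,
    11.5, 11.8, 12.6, 13.0, 10.5, 10.5, 10.0,
    9.4,
]

# Fit the Normal model using the sample mean and sample standard deviation.
normal = NormalDist(mean(AMWH), stdev(AMWH))

# Exceedance probabilities for the three heights in Table 2.
for height in (8.75, 11.25, 14.0):
    print(f"P(AMWH > {height} ft) = {1.0 - normal.cdf(height):.3f}")

# 500-year return level: the height exceeded with probability 1/500.
level_500 = normal.inv_cdf(1.0 - 1.0 / 500)
print(f"500-year return level = {level_500:.2f} ft")
```

Under the stated assumption, the results should agree with the Normal row of Table 2 and with the 13.71 ft return level to the precision quoted.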

Students are then asked:

• What might be the consequences of using the Normal distribution instead of the Gumbel model?

• Which model would you trust?

Of course, in a practical setting, using the Normal model relative to the Gumbel model results in an under-estimate of quantities such as the 500-year return level, possibly leading to substantial under-protection of a town or city if such a model were used to inform the design of a sea wall. Simple graphs of the data, such as histograms or boxplots, reveal the unsuitability of the Normal distribution for the AMWH data, with some positive skew; see Figure 2. More detailed discussions can lead to the consideration of standard errors for estimates of return levels, and perhaps confidence intervals, although the computation of such is often beyond the ability and experience of most students in the age range for which this activity is intended. Within the Shiny application, a check box can be selected if estimated standard errors are to be displayed.

Table 2. Model-based estimates of some exceedance probabilities, with associated empirical estimates

Probabilities         Exceeds 8.75 ft   Exceeds 11.25 ft   Exceeds 14 ft
Relative frequency    0.66              0.12               0
Gumbel model          0.575             0.1                0.01
Normal model          0.653             0.105              0.001

FURTHER DISCUSSION

Although we provide a practical motivation for the study of extremes (rather than averages, for example) and explain that filtering out a set of annual maxima might be a good way of classifying observations as extreme, we also explain that this approach is wasteful of data; we might have daily, or even hourly records (as is the case with the AMWH used in our activities here), and we discard all but the largest value in each year. With older students, we explain that the procedures we use assume that our observations are independent and identically distributed. In the case of our AMWH data, the largest hourly observations each year usually occur at some point during the hurricane season (often August or September), and so successive values in the series of annual maxima are usually far enough apart to be deemed independent. Using daily, weekly or monthly maxima would give us more data to work with, but in doing so, we are likely to encounter issues of dependence between consecutive maxima and other issues associated with non-stationarity, including seasonal variability. Studies have shown that violations of the assumption of independent and identically distributed observations can lead to biased estimates of return levels (e.g. Fawcett and Walshaw, 2016).

With older students, we occasionally discuss climate change. In part 3, we discuss the importance of a well-fitting model for historical observations as a basis for making predictions of future levels of AMWH; of course, any knowledge about how our variable is changing through time, perhaps as a result of climate change, should be utilized to provide more realistic estimates of return levels. Occasionally, and where appropriate, we explain how models like the Gumbel model can be adapted to account for changes in the underlying level of the extremes of our variable. For example, a simple way to account for trend might be to allow the location parameter in the Gumbel model to depend linearly on time. Recent studies examining AMWH at locations in the Gulf of Mexico extol the merits of such an approach, and there is evidence to suggest an increasing trend in the location parameter of the Gumbel model for the AMWH studied here.
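As a rough sketch of this idea, the location parameter can be written as μ(t) = μ0 + μ1 t and substituted into the Gumbel survival function. The Python below shows how the chance of exceeding a fixed level in a given year would then drift over time. The values of `MU0`, `MU1`, `SIGMA` and `LEVEL` are hypothetical, chosen only for illustration; they are not estimates from the AMWH data, and the function names are likewise assumptions.

```python
import math

def gumbel_exceedance(x, mu, sigma):
    """P(X > x) for a Gumbel model with location mu and scale sigma."""
    return 1.0 - math.exp(-math.exp(-(x - mu) / sigma))

def location(t, mu0, mu1):
    """Location parameter that depends linearly on time t (in years)."""
    return mu0 + mu1 * t

# Hypothetical values for illustration only (not fitted to any data)
MU0, MU1, SIGMA = 8.5, 0.02, 1.3
LEVEL = 14.0  # a fixed level of interest, e.g. a design height

probs = {
    t: gumbel_exceedance(LEVEL, location(t, MU0, MU1), SIGMA)
    for t in (0, 25, 50)
}
for t, p in probs.items():
    print(f"year {t:2d}: P(X > {LEVEL}) = {p:.4f}")
```

With a positive slope, the same level becomes more likely to be exceeded as time passes, which is why ignoring a trend can understate future risk.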

EVALUATION

We believe that the activities discussed in this paper have had a positive impact on students’ enthusiasm for Statistics. It is evident when we run the activities that students are generally engaged with the topic and many seem to genuinely enjoy taking part in the work. School teachers have often been even more enthusiastic, asking our permission to use the materials in class with other students and asking if any follow-up material exists. A key to the success of these activities, we have been told, is not just their demonstration of very practical applications of Statistics, but the fact that the material is directly related to our own personal research; as such, we are always extremely enthusiastic about the material. The merits of research-informed teaching and learning are discussed in, for example, Griffiths (2004) and Healey (2005), although as far as we are aware there is little in the way of evaluating the success of such methods in engagement and outreach activities.

Although rather anecdotal, teachers have told us that their students have become much more receptive to using Statistics in subjects other than Mathematics at school. After taking part in the activities discussed in this paper, some have also shown an increased enthusiasm for studying Science, Technology, Engineering and Mathematics subjects after their school study. Other evidence of the success of our activities comes from student evaluation questionnaires given out at the end of our sessions. For example, at a recent student conference held at our University, at which The Storm of the Century! activities were used, 65% of respondents (aged 16–17 years) said they would be more likely to study Mathematics/Statistics at University after having taken part in the sessions; 75% said they felt more enthusiastic about their school study of the subject. Other open-ended comments from recent school visits include:


© 2016 Teaching Statistics Trust, 39, 1, pp 2–13


“Wow! Loved the Storm of the Century. Didn’t know the stuff we learn at school could be so interesting”

“… ticked the boxes for me, I like to see how this stuff can actually be used in the real world”

“Storm of Century is great as I love Maths and Geography and didn’t know the two could be linked”

“Started off really easy but then we were doing high level stuff in no time, I never thought I’d be able to do that stuff.”

CONCLUSIONS

We have outlined some hands-on classroom activities centred around the analysis of annual maximum wave height data to enthuse students about real-world applications of Statistics. These activities have been used with students as young as 12 years old, although extra layers of complexity can be added on to include material relevant, and challenging, for older students. The activities have always been popular, and there is some evidence to suggest they have been successful in promoting the study of Statistics and its use in other school subjects (such as Geography). Interested readers are invited to take a closer look at the materials used for the activities discussed, available to download from our webpage, and use them where they deem appropriate. Other activities are also available from this webpage, including Speed Cameras Save Lives?, The Pepsi Challenge, The Lie Detector Test and The Game Show Problem (Revisited). The Shiny application can be used remotely by following the link from our webpage.

Appendix

In this paper, and in the Storm of the Century! activity, we focus primarily on modelling extremes using the Gumbel distribution. In fact, this is just one of three extreme value distributions, which can be shown, due to the Extremal Types Theorem, to be the limiting distributions for re-scaled maxima $(M_n - b_n)/a_n$, where

$$M_n = \max(X_1, X_2, \ldots, X_n)$$

and $X_i$, $i = 1, 2, \ldots, n$, are independent and identically distributed random variables. The other two extreme value distributions are often referred to as the Fréchet and Weibull distributions. The generalized extreme value (GEV) distribution unifies the three extreme value distributions, with the value of the shape parameter in this distribution controlling the tail heaviness and reducing the GEV to the Gumbel/Fréchet/Weibull models when it is zero/positive/negative. The Gumbel model, as used in this activity, has survival function:

$$P(X > x) = 1 - \exp\left\{-\exp\left[-\left(\frac{x - \mu}{\sigma}\right)\right]\right\}, \qquad (4)$$

where X is the random variable, x represents a specific value of this random variable, and μ and σ are parameters of location and scale, respectively. It is common practice to estimate these parameters via maximum likelihood; see, for example, Coles (2001, Ch. 2). In this activity, no description of maximum likelihood is given, and the estimates are simply provided by the Shiny application (incidentally, these are μ = 8.636 and σ = 1.275).
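Setting the survival function (4) equal to 1/r and solving for x gives the r-year return level, x_r = μ − σ log[−log(1 − 1/r)]. The short Python sketch below evaluates this using the estimates quoted above. The function names are hypothetical, and the code is meant as a check on the arithmetic rather than a reproduction of the Shiny application.

```python
import math

MU, SIGMA = 8.636, 1.275  # estimates quoted in the text

def survival(x, mu=MU, sigma=SIGMA):
    """Gumbel survival function P(X > x), as in equation (4)."""
    return 1.0 - math.exp(-math.exp(-(x - mu) / sigma))

def return_level(r, mu=MU, sigma=SIGMA):
    """Level exceeded with probability 1/r in any one year."""
    return mu - sigma * math.log(-math.log(1.0 - 1.0 / r))

levels = {r: return_level(r) for r in (10, 100, 500)}
for r, z in levels.items():
    print(f"{r}-year return level: {z:.2f} "
          f"(exceedance probability {survival(z):.4f})")
```

By construction, the exceedance probability printed alongside each level equals 1/r, which gives students a simple way to check the inversion.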

References

Chang, W., Cheng, J., Allaire, J.J., Xie, Y. and McPherson, J. (2015). Shiny. R Package Version 0.12.1. Available at: https://cran.r-project.org/web/packages/shiny/shiny.pdf

Coles, S.G. (2001). An Introduction to Statistical Modeling of Extreme Values, London: Springer-Verlag London.

Fawcett, L. and Walshaw, D. (2016). Sea-surge and wind speed extremes: optimal estimation strategies for planners and engineers. Stochastic Environmental Research and Risk Assessment, 30, 463–480.

Griffiths, R. (2004). Knowledge production and the research-teaching nexus: the case of the built environment disciplines. Studies in Higher Education, 29(6), 709–726.

Healey, M. (2005). Linking research and teaching: exploring disciplinary spaces and the role of inquiry-based learning. In: Barnett, R. (ed.) Reshaping the university: new relationships between research scholarship and teaching, pp. 67–78. Maidenhead: McGraw-Hill/Open University Press.

Taleb, N. (2007). The Black Swan, pp. 309. New York: Random House.
