Unit 1 - Lecture 1: Data collection, observational studies, experiments
Sta 101
Nicole Dalzell
Duke University
May 13, 2015
Welcome to Stat 101!
Welcome!
Professor: Nicole Dalzell
Sta 101 (N.Dalzell– Duke University) U1 - L1: Data coll., obs. studies, experiments May 13, 2015 2 / 1
Welcome to Stat 101! Introduction to Inference
So...what is statistics?
Statistics is the art and science of learning from data.
Data are a set of measurements taken on a set of individual units
Steps for Statistical Inference / Scientific Inquiry
1 Identify a hypothesis or research question
2 Collect relevant data
3 Analyze the data
4 Form a conclusion
5 Communicate the results
6 Present your data
Step 1 : Identify a Hypothesis or Research Question
A well-formed hypothesis will clearly identify a population and associated parameters of interest.
Population: group of individuals or subjects about whom we can make inferences.
Parameters: "True" values of characteristics in the population we want to study.
How many names given to newborn babies in 2012 in the United States begin with the letter "j"?
Population?
Parameter?
Step 2: Collect the data
Each year the Social Security Administration collects and releases data on how many babies are given each name.
They released these data for the years 1880 to 2013, separately for each gender.
For privacy reasons they restrict the list of names to those with at least 5 occurrences.
We often store and present such data in data sets, comprised of variables measured on individual cases.
Data Sets
Data set or data matrix: each column is a variable, each row is an observation.

      type      price   ...   weight
1     small     15.9    ...   2705
2     midsize   33.9    ...   3560    ← observation
...   ...       ...     ...   ...
54    midsize   26.7    ...   3245
Baby Names Data Set
Besides looking at the frequency of first initials, what else could we learn from this data set?
Visualize the Data: Rank Table
Top Baby Names in 2012
Rank  Male        Female
1     Jacob       Sophia
2     Mason       Emma
3     Ethan       Isabella
4     Noah        Olivia
5     William     Ava
6     Liam        Emily
7     Michael     Abigail
8     Jayden      Mia
9     Alexander   Madison
10    Aiden       Elizabeth
http://www.ssa.gov/oact/babynames
Visualize the Data: Rank Table
Top Baby Names in 2013
Rank  Male        Female
1     Noah        Sophia
2     Liam        Emma
3     Jacob       Olivia
4     Mason       Isabella
5     William     Ava
6     Ethan       Mia
7     Michael     Emily
8     Alexander   Abigail
9     Jayden      Madison
10    Daniel      Elizabeth
http://www.ssa.gov/oact/babynames
Visualize the Data: Time Dependencies
How has the popularity of a name changed over time?
http://www.babynamewizard.com/voyager#prefix=&sw=both&exact=false
http://www.babynamewizard.com
What about the first initials?
1 Obtain data from the Social Security website: name, gender, frequency. The yearly files have no header row, so name the columns on import.
d <- read.csv("yob2012.txt", header = FALSE, col.names = c("name", "gender", "frequency"))
2 Use an R function (substring) to extract the initial of each name.
d$initial <- substring(d$name, 1, 1)
3 Make a barplot of the initials, by gender if desired.
barplot(table(d$initial))
barplot(table(d$initial[d$gender == "M"]))
barplot(table(d$initial[d$gender == "F"]))
[Figure: bar plot "Initials - All names in 2012"; x-axis: first initial (A to Z); y-axis: count, 0 to 4000]
[Figure: bar plots "Initials - All names in 2012 (M)" and "Initials - All names in 2012 (F)"; x-axis: first initial (A to Z); y-axis: count]
Step 4: Form a conclusion
In 2012, newborn babies in the US were given 3,000 unique names that began with the letter "j", based on the data from the Social Security database.
The list of babies from the Social Security data set is a sample, a group of individuals taken from the entire population.
The number of individuals in the sample is usually denoted with the letter n.
A statistic is any function of the data collected in the sample (e.g., mean, median, etc.).
So, the count of the names in the Social Security data set for 2012 which begin with "j" is a statistic.
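As a rough illustration of "a statistic is a function of the sample data", the sketch below computes two such counts in R. The data frame `toy` and its frequencies are made up for illustration (they are not the Social Security counts); with the real file, the data frame read in earlier would take its place.

```r
# Toy stand-in for the names data; all values are made up for illustration.
toy <- data.frame(
  name      = c("Jacob", "Sophia", "Jayden", "Emma", "Julia", "Mason"),
  gender    = c("M", "F", "M", "F", "F", "M"),
  frequency = c(120, 150, 90, 140, 60, 110)
)

# Extract the first initial of each name.
toy$initial <- substring(toy$name, 1, 1)

# Statistic 1: number of distinct names beginning with "J".
j_names <- length(unique(toy$name[toy$initial == "J"]))

# Statistic 2: number of babies given a name beginning with "J".
j_babies <- sum(toy$frequency[toy$initial == "J"])

print(c(j_names = j_names, j_babies = j_babies))
```

Note that the two counts answer different questions: the first counts names, the second counts babies. The research question on the earlier slide asks about names.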
Welcome to Stat 101! Populations and Samples
Data Collection
Be aware that there exist "bad" samples.
"There are three kinds of lies: lies, damned lies, and statistics."
If poor sampling techniques are used, then the observed statistics will not be applicable to the true population of interest.
Example Data Collection:
Raise your hand if you have been on an airplane in the past two years.
What does this tell us about how many 17-30 year olds have ridden an airplane in the past two years?
Welcome to Stat 101! Sampling from a population
Census
Wouldn't it be better to just include everyone and "sample" the entire population, i.e. conduct a census?
Some individuals are hard to locate or hard to measure. And these difficult-to-find people may have certain characteristics that distinguish them from the rest of the population.
Populations rarely stand still. Even if you could take a census, the population changes constantly, so it's never possible to get a perfect measure.
http://www.npr.org/templates/story/story.php?storyId=125380052
Exploratory analysis to inference
Sampling is natural.
Think about sampling something you are cooking - you taste (examine) a small part of what you're cooking to get an idea about the dish as a whole.
When you taste a spoonful of chili and decide the spoonful you tasted isn't spicy enough, that's exploratory analysis.
If you generalize and conclude that your entire chili needs chili powder, that's an inference.
For your inference to be valid, the spoonful you tasted (the sample) needs to be representative of the entire pot (the population).
If your spoonful comes only from the surface and the chili powder is collected at the bottom of the pot, what you tasted is probably not representative of the whole pot.
If you first stir the chili thoroughly before you taste, your spoonful will more likely be representative of the whole pot.
Welcome to Stat 101! Sampling bias
Landon vs. FDR
A historical example of a biased sample yielding misleading results:
In 1936, Landon sought the Republican presidential nomination opposing the re-election of FDR.
The Literary Digest Poll
The Literary Digest polled about 10 million Americans, and got responses from about 2.4 million.
The poll showed that Landon would likely be the overwhelming winner and FDR would get only 43% of the votes.
Election result: FDR won, with 62% of the votes.
The magazine was completely discredited because of the poll, and was soon discontinued.
The Literary Digest Poll - what went wrong?
The magazine had surveyed
its own readers,
registered automobile owners, and
registered telephone users.
These groups had incomes well above the national average of the day (remember, this is Great Depression era), which resulted in lists of voters far more likely to support Republicans than a truly typical voter of the time, i.e. the sample was not representative of the American population at the time.
Large samples are preferable, but...
The Literary Digest election poll was based on a sample size of 2.4 million, which is huge, but since the sample was biased, the sample did not yield an accurate prediction.
Back to the chili analogy: If the chili is not well stirred, it doesn't matter how large a spoon you have, it will still not taste right. If the chili is well stirred, a small spoon will suffice to test the chili.
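The "big spoon, unstirred pot" idea can be tried out in a small simulation. This is only a sketch under made-up assumptions: a hypothetical population of 100,000 voters in which a wealthy minority supports a candidate at a much lower rate than everyone else. The 62% overall support echoes the election result above, but none of the other numbers come from the 1936 data.

```r
set.seed(101)  # fix the random draws so the sketch is reproducible

# Hypothetical population of 100,000 voters; all numbers are made up.
# 20,000 are "wealthy" (30% support the candidate); 80,000 are not (70% support).
wealthy  <- rep(c(TRUE, FALSE), times = c(20000, 80000))
supports <- c(rep(c(1, 0), times = c(6000, 14000)),   # wealthy voters
              rep(c(1, 0), times = c(56000, 24000)))  # everyone else

truth <- mean(supports)  # population parameter: 0.62

# Huge but biased sample: 10,000 voters drawn only from the wealthy group.
biased_sample <- sample(supports[wealthy], size = 10000)

# Small simple random sample: 500 voters drawn from the whole population.
random_sample <- sample(supports, size = 500)

biased_est <- mean(biased_sample)
random_est <- mean(random_sample)

print(c(truth = truth, biased = biased_est, random = random_est))
```

Under these assumptions the small random sample should land much closer to the true proportion than the large biased one, because sample size cannot fix a sampling frame that misses most of the population.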
A few sources of bias
Non-response: If only a (non-random) fraction of the randomly sampled people choose to respond to a survey, the sample may no longer be representative of the population.
Voluntary response: Occurs when the sample consists of people who volunteer to respond because they have strong opinions on the issue, and hence is not representative of the population.
[Figure source: edition.com, Aug 29, 2013]
Convenience sample: Individuals who are easily accessible are more likely to be included in the sample.
What type of bias do reviews on Amazon.com have? What about reviews on RateMyProfessor.com?
Participation question
A school district is considering whether it will no longer allow high school students to park at school after two recent accidents where students were severely injured. As a first step, they survey parents by mail, asking them whether or not the parents would object to this policy change. Of 6,000 surveys that go out, 1,200 are returned. Of these 1,200 surveys that were completed, 960 agreed with the policy change and 240 disagreed. Which of the following statements are true?
I. Some of the mailings may have never reached the parents.
II. The school district has strong support from parents to move forward with the policy approval.
III. It is possible that a majority of the parents of high school students disagree with the policy change.
IV. The survey results are unlikely to be biased because all parents were mailed a survey.
(a) Only I (b) I and II (c) I and III (d) III and IV (e) Only IV
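Before choosing an answer, it may help to work out the arithmetic behind the survey. The sketch below uses only the counts given in the question; the variable names are mine. It computes the response rate and the most extreme shares of disagreement that are consistent with the 4,800 surveys that never came back.

```r
mailed    <- 6000
returned  <- 1200
agreed    <- 960
disagreed <- 240

response_rate   <- returned / mailed   # share of surveys that came back
agree_among_ret <- agreed / returned   # share of respondents who agreed

# Non-respondents: we know nothing about their opinions.
not_returned <- mailed - returned

# Extreme cases for the share of ALL surveyed parents who disagree:
min_disagree <- disagreed / mailed                   # every non-respondent agrees
max_disagree <- (disagreed + not_returned) / mailed  # every non-respondent disagrees

print(c(response_rate = response_rate,
        agree_among_returned = agree_among_ret,
        min_disagree = min_disagree,
        max_disagree = max_disagree))
```

The gap between the two extremes shows how little the returned surveys pin down when most of the sample does not respond.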
A picture’s worth a lot, but...
A lot of the time we only have part of the story.
BabyCenter: "Our data comes from nearly half a million parents who shared their baby's name with us in 2014."
http://www.babycenter.com/top-baby-names-2014
1 "The Netflix effect"
Orange is the New Black: Galina, Piper, Nicky, Alex, Gloria
House of Cards: Garrett, Claire, Robin, Wright
2 "A blizzard of Frozen names" (Elsa, Hans, Kristin)
Are we comfortable making decisions about these name trends based on this data? The "name Elsa soared 29 percent on our list of names for baby girls". Is this sample statistic enough for us to conclude that the population parameter, the percent of newborn girls in the United States who are named Elsa, has increased from 2013 to 2014?
Welcome to Stat 101! Observational studies and experiments
Causality versus Correlation
1 Is there an increase in the number of baby girls named Elsa from 2013 to 2014?
2 Has the popularity of Frozen caused an increase in the number of baby girls who were named Elsa? (Causal effect)
3 Is the popularity of Frozen related to the increase in the number of baby girls who were named Elsa? (Correlation, or relationship)
We collect our data differently depending on the type of relationship (causal or correlational) that we are interested in.
Observational studies and experiments
An experimental study is a controlled study in which the researchers impose treatments upon the subjects.
Experiments are the preferred method of data collection because results can often be interpreted as causal, i.e., we can conclude that the treatments caused the response of the study.
Subjects are assigned to control and treatment groups using random assignment.
Blocking
We would like to design an experiment to investigate whether energy gels make you run faster:
Treatment: energy gel
Control: no energy gel
It is suspected that energy gels might affect pro and amateur athletes differently, therefore we block for pro status:
Divide the sample into pro and amateur
Randomly assign pro athletes to treatment and control groups
Randomly assign amateur athletes to treatment and control groups
Pro/amateur status is equally represented in the resulting treatment and control groups
Why is this important? Can you think of other variables to block for?
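The blocking steps above can be sketched in a few lines of R. The roster below (8 pro and 12 amateur runners) is hypothetical, and the loop is just one simple way to randomize within blocks, not necessarily how a real study would be run.

```r
set.seed(101)  # reproducible assignment

# Hypothetical roster: 8 pro and 12 amateur runners (made-up sample).
runners <- data.frame(
  id     = 1:20,
  status = rep(c("pro", "amateur"), times = c(8, 12))
)

# Within each block, shuffle an equal number of treatment and control labels.
runners$group <- NA_character_
for (s in unique(runners$status)) {
  in_block <- runners$status == s
  labels   <- rep(c("treatment", "control"), length.out = sum(in_block))
  runners$group[in_block] <- sample(labels)
}

# Each block is split evenly between the two groups.
assignment_table <- table(runners$status, runners$group)
print(assignment_table)
```

Because the shuffling happens separately inside each block, pro and amateur runners end up equally represented in the treatment and control groups no matter how the random draw turns out.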
Observational studies and experiments (cont.)
An experimental study is a controlled study in which the researchers impose treatments upon the subjects.
Subjects are assigned to control and treatment groups using random assignment.
Experiments are the preferred method of data collection because results can often be interpreted as causal, i.e., we can conclude that the treatments caused the response of the study.
Experiments are not always feasible or ethical.
An observational study is a study in which the researchers did not assign the subjects to treatments.
Observational studies retain the notion of treatment and control groups.
Observational studies still require the researcher to clearly define a research question. This requires identifying the response variable that will be measured on each subject in the study.
Welcome to Stat 101! Cereal breakfast
What type of study is this, observational study or an experiment?
"Girls who regularly ate breakfast, particularly one that includes cereal, were slimmer than those who skipped the morning meal, according to a study that tracked nearly 2,400 girls for 10 years. [...] As part of the survey, the girls were asked once a year what they had eaten during the previous three days."
What is the conclusion of the study?
Who sponsored the study?
3 possible explanations:
1 Eating breakfast causes girls to be thinner.
2 Being thin causes girls to eat breakfast.
3 A third variable is responsible for both. What could it be?
Extraneous variables that affect both the explanatory and the response variable, and that make it seem like there is a relationship between the two, are called confounding variables.
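To see how a confounder can manufacture a relationship, here is a small simulation sketch. Everything in it is an assumption made for illustration: `z` stands in for an unnamed third variable, and `x` and `y` are generated so that neither affects the other. The `lm` call is a preview of regression, which this lecture has not covered.

```r
set.seed(101)  # reproducible simulation
n <- 5000

# z is a hypothetical confounder that drives both x and y.
z <- rnorm(n)
x <- z + rnorm(n)   # "explanatory" variable: depends on z, not on y
y <- z + rnorm(n)   # "response" variable: depends on z, not on x

# x and y are clearly correlated even though neither affects the other.
raw_cor <- cor(x, y)

# Once z is included in the model, the coefficient on x is close to zero.
adj_slope <- unname(coef(lm(y ~ x + z))["x"])

print(c(raw_cor = raw_cor, adj_slope = adj_slope))
```

In an observational study we usually cannot measure every such `z`, which is why the apparent relationship between `x` and `y` cannot be read as causal.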
Observational studies and experiments (Recap)
Observational study: Researchers collect data in a way that does not directly interfere with how the data arise, i.e. they merely "observe", and can only establish an association between the explanatory and response variables.
Experiment: Researchers randomly assign subjects to various treatments in order to establish causal connections between the explanatory and response variables.
If you're going to walk away with one thing from this class, let it be "correlation does not imply causation".
http://xkcd.com/552/
Random assignment vs. random sampling
                    Random assignment                   No random assignment
Random sampling     Causal conclusion, generalized to   No causal conclusion, correlation       Generalizability
                    the whole population.               statement generalized to the whole
                    (ideal experiment)                  population.
                                                        (most observational studies)
No random sampling  Causal conclusion, only for the     No causal conclusion, correlation       No generalizability
                    sample.                             statement only for the sample.
                    (most experiments)                  (bad observational studies)
                    Causation                           Correlation
Welcome to Stat 101! Observations and variables
Types of variables
all variables
  numerical
    continuous (measured)
    discrete (counted)
  categorical
    regular categorical (unordered categories)
    ordinal (ordered categories)
Types of variables (cont.)
type: small, midsize or large.
price: average price in $1000’s
mpgCity: city mileage per gallon
drivetrain: front, rear, 4WD
passengers: passenger capacity
weight: car weight in pounds
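One way to connect these variable types to software is to look at how R might store them. In the sketch below, the type, price and weight values are the three rows visible in the data matrix shown earlier; the drivetrain and passengers values are made up for illustration, mpgCity is left out because no values appear on the slides, and treating type as ordinal is one reasonable reading rather than the only one.

```r
# Three rows from the data matrix shown earlier (type, price, weight);
# the drivetrain and passengers values are made up for illustration.
car_data <- data.frame(
  type       = factor(c("small", "midsize", "midsize"),
                      levels = c("small", "midsize", "large"), ordered = TRUE),
  price      = c(15.9, 33.9, 26.7),                  # continuous (in $1000s)
  drivetrain = factor(c("front", "front", "rear")),  # hypothetical values
  passengers = c(5L, 5L, 6L),                        # hypothetical values
  weight     = c(2705, 3560, 3245)                   # pounds
)

# Report how R stores each variable.
var_class <- sapply(car_data, function(col) class(col)[1])
print(var_class)

# Ordered categories can be compared with < and >.
small_before_midsize <- car_data$type[1] < car_data$type[2]
print(small_before_midsize)
```

Here an ordered factor plays the role of an ordinal variable, a plain factor the role of a regular categorical variable, integers the role of a discrete variable, and doubles the role of a continuous one.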
Welcome to Stat 101! Principles of experimental design
Participation question
A study is designed to test the effect of light level and noise level on exam performance of students. The researcher also believes that light and noise levels might have different effects on males and females, so wants to make sure both genders are represented equally under different conditions. Which of the below is correct?
(a) There are 3 explanatory variables (light, noise, gender) and 1 response variable (exam performance)
(b) There are 2 explanatory variables (light and noise), 1 blocking variable (gender), and 1 response variable (exam performance)
(c) There is 1 explanatory variable (gender) and 3 response variables (light, noise, exam performance)
(d) There are 2 blocking variables (light and noise), 1 explanatory variable (gender), and 1 response variable (exam performance)
Difference between blocking and explanatory variables
Factors are conditions we can impose on the experimental units.
Blocking variables are characteristics that the experimental units come with, that we would like to control for.
Blocking is like stratifying, except used in experimental settings when randomly assigning, as opposed to when sampling.
Principles of experimental design
Reading: OpenIntro Chapter 1.5: Experiments
1 Control: Compare treatment of interest to a control group.
2 Randomize: Randomly assign subjects to treatments.
3 Replicate: Within a study, replicate by collecting a sufficiently large sample. Or replicate the entire study.
4 Block: If there are variables that are known or suspected to affect the response variable, first group subjects into blocks based on these variables, and then randomize cases within each block to treatment groups.
More experimental design terminology...
Placebo: fake treatment, often used as the control group for medical studies
Placebo effect: experimental units showing improvement simply because they believe they are receiving a special treatment
Blinding: when experimental units do not know whether they are in the control or treatment group
Double-blind: when both the experimental units and the researchers do not know who is in the control and who is in the treatment group
Welcome to Stat 101! Recap
Participation question
What is the main difference between observational studies and experiments?
(a) Experiments take place in a lab while observational studies do not need to.
(b) In an observational study we only look at what happened in the past.
(c) Most experiments use random assignment while observational studies do not.
(d) Observational studies are completely useless since no causal inference can be made based on their findings.
More...
Want more baby name analysis?
Freakonomics podcast: How Much Does Your Name Matter?
http://freakonomics.com/2013/04/08/how-much-does-your-name-matter-a-new-freakonomics-radio-podcast/
Syllabus & policies Logistics
General Info
Instructor: Nicole Dalzell - [email protected]
Chemistry 214
Lecture: MTuWThF 12:30 PM - 1:45 PM
Perkins Classroom 5
Lab: TuWTh 2 PM - 3 PM
Old Chemistry 101
OH: Tentative: Monday 2:30 PM - 3:30 PM
Wednesday 10-11 AM
Or by appointment
Required materials
Textbook: OpenIntro Statistics
Diez, Barr, Cetinkaya-Rundel
CreateSpace, 2nd Edition, 2012
ISBN: 978-1478217206
Calculator: (Optional) You might need a four-function calculator that can do square roots for this class. No limitation on the type of calculator you can use.
Webpage
https://stat.duke.edu/~nmd16/courses/Summer15/sta101.001-1/
Syllabus & policies Goals and topics
[Figure: course topic map]
Design of studies
Exploratory data analysis
Probability
Inference: Bayesian inference; frequentist inference (CLT & simulation)
  numerical: one mean & median, two means & medians, many means
  categorical: one proportion, two proportions, many proportions
Modeling (numerical response): 1 explanatory, many explanatory
Syllabus & policies Details
Course structure
Seven learning units.
Set of learning objectives and required and suggested readings, videos, etc. for each unit.
Prior to beginning the unit, complete the readings and familiarize yourselves with the learning objectives.
Class time: split between lecture, discussion/application.
Computing labs.
Class - duration of unit
Slides will be posted on the course webpage (under schedule) on the day of each class.
Discussion of concepts as well as hands-on activities and exercises to complement them.
Attend class to keep up with the pace and not fall behind, and to contribute to application activities completed in teams.
You are responsible for all the material covered in all components of the course, not just the class. Please ask questions in class, in office hours or by e-mail if you are struggling (or just curious); do not wait until just before an exam, when it may be too late.
Participation questions: attendance and participation
Objective: Make you an active participant and help me pace the class.
On new material being discussed in class that day.
Credit for participation, regardless of whether you have the correct answer.
Up to two unexcused late arrivals or absences will not affect your participation grade.
While I might sometimes call on you during the class discussion, it is your responsibility to be an active participant without being called on.
Problem sets and labs
Problem sets:
Objective: Help you develop a more in-depth understanding of the material and help you prepare for exams and projects.
Individual: collaborate but don't copy! Submit in class, show all work.
Labs:
Objective: Give you hands-on experience with data analysis using statistical software and provide you with tools for the projects.
In partners: turn in lab report on Sakai by the following day at 5 PM.
Lowest score dropped for both.
Project
Objective: Give you independent applied research experience using real data and statistical methods.
Individual
Statistical inference exploring the distributional characteristics of one variable or the relationship between two variables
Choose a research question, find data, analyze it, write up your results
Exams
Midterm: Monday, June 1, in class
Final: Wednesday, June 24th (9:00 AM - 12:00 PM) (Cumulative)
Exam dates cannot be changed. No make-up exams will be given. If you cannot take the exams on these dates you should drop this class.
You must bring a calculator to the exams (no cell phones, iPods, etc.) and you are also allowed to bring one sheet of notes ("cheat sheet"). This sheet must be no larger than 8 1/2" × 11" and must be prepared by you (no photocopies). You may use both sides of the sheet.
Grading
In Class Participation/Activities: 5%
Quizzes: 5%
Problem sets: 15%
Labs: 10%
Project: 20%
Midterm: 20%
Final: 25%
Syllabus & policies Support
I will regularly send announcements by email, so make sure to check your email daily.
While email is the quickest way to reach me outside of class, it is much more efficient to answer most statistical questions in person.
Piazza on Sakai
Content-related questions should be posted on Piazza, which you may access through the course Sakai site.
Title your questions.
Check if your question has already been answered before posting a new question.
I will be answering questions on Piazza daily and all students are expected to answer questions as well.
“Watch” to be notified when a new question is posted.
Office hours
Instructor: Mondays 2:00 - 3:00 PM
Wednesdays 10-11 AM
You are highly encouraged to stop by with any questions or comments about the class, or just to say hi and introduce yourself.
You must attempt problem sets before office hours.
Syllabus & policies Policies
Policies
Late work policy for problem sets and lab reports:
late but submitted during class: lose 10% of points
after class on due date: lose 20% of points
next day: lose 40% of points
later than next day: lose all points
Late work policy for project: 10% off for each day (24-hour period) late.
No make-ups
Regrade requests: within one week, no regrade for number of points deducted for a mistake, no regrade after the final
Academic Dishonesty
Any form of academic dishonesty will result in an immediate 0 on the given assignment and will be reported to the Office of Student Conduct. Additional penalties may also be assessed if deemed appropriate.
Some examples:
Use of disallowed materials (including any form of communication with classmates or accessing the web) during exams and readiness assessments.
Plagiarism of any kind.
Use of outside answer keys or solution manuals for the homework.
If you have any questions about whether something is or is not allowed, ask me beforehand.
Syllabus & policies Tips
Tips for success
1 Complete the reading before a new unit begins, and then review again after the unit is over.
2 Be an active participant during lectures and labs.
3 Ask questions - during class or office hours, or by email. Ask me and your classmates.
4 Do the problem sets - start early and make sure you attempt and understand all questions.
5 Start your project early and allow adequate time to complete it.
6 Give yourself plenty of time to prepare a good cheat sheet for exams. This requires going through the material and taking the time to review the concepts that you're not comfortable with.
7 Do not procrastinate - don't let a unit go by with unanswered questions, as it will just make the following unit's material even more difficult to follow.
To do
1 Download or purchase the textbook.
www.openintro.org
2 Read the syllabus and let me know if you have any questions.
3 Start reviewing the resources for Unit 1.
https://stat.duke.edu/~nmd16/courses/Summer15/sta101.001-1/resources/unit1.html