Introduction to Statistics Topics 1 - 5 Nellie Hedrick.

Date post: 14-Dec-2015
Upload: estefany-points
Introduction to Statistics

Topics 1 - 5 Nellie Hedrick

Statistics

Statistics is the study of data; it is the science of reasoning from data.

What is meant by the term data? You will find that data vary and that variability abounds in everyday life.

• Observational units – the objects described by a set of data.

• Variability – the phenomenon of a variable taking on different values or categories from observational unit to observational unit.

• Quantitative variables – take numerical values for which numerical operations make sense, such as height, weight, time, …

• Categorical variables – place an individual into one of several groups or categories, such as gender, cities in Oklahoma, states in the USA, …

• Binary variables – categorical variables that can take only two possible values: male/female, yes/no, …

• Research question – often looks for patterns in a variable, compares a variable across different groups, or looks for a relationship between variables.

More on Observational Units and Variables:

• The distinction between categorical and quantitative variables is very important: it determines which statistical tools to use for analyzing a given data set.

• Determine whether data are measured quantitatively or categorically:
  • How many hours you slept in the past 24 hours (quantitative)
  • Whether you slept for at least 7 hours in the past 24 hours (categorical, binary)

• Watch for variables that take numerical values that are really just category labels, such as zip code, …

• Watch out:
  • To determine whether something is actually a variable, ask yourself whether it represents a question that can be asked of each observational unit, and
  • whether the values can potentially vary from observational unit to observational unit.
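
The classification rules above can be sketched in Python. This is only an illustration: the helper `classify_variable`, its `labels_only` flag, and the sample answers are all hypothetical and not part of the course materials. The flag is needed because code cannot tell from the values alone that a number such as a zip code is really a category label; the analyst has to supply that judgment.

```python
def classify_variable(values, labels_only=False):
    """Classify a variable as 'quantitative', 'binary', or 'categorical'.

    labels_only marks numeric codes (e.g. zip codes) that are really
    category labels; this cannot be detected from the values alone.
    """
    distinct = set(values)
    # Booleans are excluded on purpose: True/False is a yes/no category.
    is_numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool)
        for v in distinct
    )
    if is_numeric and not labels_only:
        return "quantitative"
    if len(distinct) == 2:
        return "binary"
    return "categorical"


# Hypothetical answers from five students (the observational units).
hours_slept = [6.5, 8.0, 7.0, 5.5, 9.0]
slept_7_plus = [h >= 7 for h in hours_slept]
zip_codes = [73019, 74074, 73019, 73034, 74074]

print(classify_variable(hours_slept))                  # quantitative
print(classify_variable(slept_7_plus))                 # binary
print(classify_variable(zip_codes, labels_only=True))  # categorical
```

Without `labels_only=True`, the zip codes would be misclassified as quantitative, which is exactly the mistake the slide warns about.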

Wrap-Up

• Statistics is the science of data.

• Data are not mere numbers.

• Data are collected with a purpose and have meaning in some context.

• The fundamental concept of statistics is variability.

• As we go through the course, you will learn to classify variables and determine which statistical tools to apply to the data.

• Always consider data in context and anticipate reasonable values for the data collected and analyzed.

• A variable is a characteristic that varies from one observational unit (for example, one person) to another.

• Identify variables as categorical (possibly binary) or quantitative.

Activity 1-6 page 11

Activity 1-9 page 12

Activity 1-13 page 12

Topic 2 – Data and Distributions and the Graphing Calculator

Picturing Distributions with Graphs

The distribution of a variable tells us what values it takes and how often it takes these values. We are looking for patterns of variation.

• Categorical variables – place an individual into one of several groups or categories.

• Quantitative variables – take numerical values for which numerical operations make sense.

• Distribution of a variable – what values it takes and how often.

Graphical Representations of Data: Categorical Variable

• Bar Chart

[Figure: bar chart titled "Distribution of Percent of Students Attended Training by Class"; vertical axis: percent of students (0 to 100); horizontal axis: Freshman, Sophomore, Junior, Senior]

Class       Frequency (%)
Freshman    14.3
Sophomore   42.9
Junior      7.1
Senior      35.7
Total       100
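
As a rough sketch of how the percentages in the table above turn into a bar chart, the Python snippet below prints a text version. The function name `text_bar_chart` and the choice of one `#` per 2 percentage points are illustrative assumptions; only the four percentages come from the slide.

```python
# Percent of students who attended training, by class (from the slide's table).
percent_by_class = {
    "Freshman": 14.3,
    "Sophomore": 42.9,
    "Junior": 7.1,
    "Senior": 35.7,
}


def text_bar_chart(percents, scale=2.0):
    """Return one line per category; each '#' stands for `scale` percent."""
    lines = []
    for label, pct in percents.items():
        bar = "#" * round(pct / scale)
        lines.append(f"{label:<10} {bar} {pct:.1f}%")
    return lines


for line in text_bar_chart(percent_by_class):
    print(line)

# The categories cover every student, so the percentages should total 100.
total = sum(percent_by_class.values())
print(f"Total: {total:.1f}%")
```

Checking that the percentages total 100 is a quick way to confirm that the categories cover every observational unit exactly once.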

Activity 2-2: Hand Washing (page 17)

In August 2005, researchers for the American Society for Microbiology and the Soap and Detergent Association monitored the behavior of more than 6300 users of public restrooms. They observed people in public venues such as Turner Field in Atlanta and Grand Central Station in New York City. They found that 2393 of 3206 men washed their hands, compared to 2802 of 3130 women.

a. What proportion of the men washed their hands? What proportion of the women washed their hands?

b. Are these proportions consistent with the following pair of bar graphs?

c. Comment on what your calculations and the bar graph reveal about whether or not one gender is more likely to wash their hands after using a public restroom.

d. For each city, estimate the proportion of people who washed their hands as accurately as you can from the graph. Atlanta: Chicago: New York: San Francisco:

e. Comment on what the bar graphs reveal about how these cities compare with regard to hand washing.
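
The arithmetic for part (a) can be checked with a few lines of Python. The counts are the ones reported in the activity; the helper name `proportion` is just an illustrative choice.

```python
# Counts reported in Activity 2-2.
men_washed, men_total = 2393, 3206
women_washed, women_total = 2802, 3130


def proportion(successes, total):
    """Proportion as a number between 0 and 1."""
    return successes / total


p_men = proportion(men_washed, men_total)
p_women = proportion(women_washed, women_total)

# Report each result both as a proportion (0 to 1) and as a percent.
print(f"Men:   proportion {p_men:.3f} = {100 * p_men:.1f}%")
print(f"Women: proportion {p_women:.3f} = {100 * p_women:.1f}%")
```

The proportions come out to about 0.746 for men and 0.895 for women, that is, about 74.6% and 89.5%. The same value reads as a proportion on the 0 to 1 scale or as a percent on the 0% to 100% scale, so check which one a question asks for.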

Activity 2-2: Hand Washing (page 17)

Studying whether people wash their hands after using a public restroom:

• We can look at the percentage of all people observed who washed their hands.

• Look at the variation between men and women.

• Look at the variation between people in different cities.

• Look at the variation between men and women within each city.

Activity 2-4: Buckle Up (page 19)

The National Highway Traffic Safety Administration (NHTSA) reports the percentage of residents in each state who regularly wear a seatbelt in a car and also whether the state has a primary or secondary type of seatbelt law. A primary law means that motorists can be stopped based solely on belt usage, while a secondary law means that the motorist can be stopped only for another reason. The 2005 data appear in the next table (s = secondary, p = primary, * = not known):

a. What are the observational units for these data?

b. Classify each of the variables in the table as categorical (also binary) or quantitative.

c. What would you estimate is a typical usage percentage for a state with a primary-type seatbelt law? How about a state with a secondary-type law? (Do not perform any calculations; base your answers on a casual reading of the dotplots.) Primary: Secondary:

d. Does a state with a primary law always have a higher usage percentage than a state with a secondary law? Explain. If not, identify a pair of states for which the state with a primary law has a lower usage percentage than the state with a secondary law.

e. Do states with a primary law tend to have higher usage percentages than states with a secondary law? Explain how you can tell from the dotplots.

f. Do the data seem to support the contention that tougher (primary) laws lead to more seatbelt usage? Can you draw this conclusion definitively? Explain.

Activity 2-4: Buckle Up (page 19)

• What type of variable is each one?

• Create a visual display: a dotplot is a useful method for displaying small data sets of a quantitative variable.

• Label the axis, especially if there is more than one group.

• Bar graphs or dotplots are usually more illuminating when we are comparing the distribution of a variable between two or more groups.

• Statistical tendency – when comparing two or more groups or analyzing a data set, use words like "tend to" and "on average" to express the results. Reserve "lead to" for cause-and-effect claims that the study actually supports.

Watch Out and In Brief

• Bar graphs or dotplots are usually more illuminating when we are comparing the distribution of a variable between two or more groups.

• Statistical tendency – when comparing two or more groups or analyzing a data set, remember that a tendency is not a hard-and-fast rule, for categorical and quantitative variables alike. Be careful with your language. This is also true for cause-and-effect conclusions.

• Label your graphs.

• Be careful to notice whether a question asks for a proportion (0 to 1) or a percent (0% to 100%).

• Bar graphs are easier to compare than raw data.

• Always relate your comments to the context of the data and, ideally, to the question of interest.

Watch Out and Wrap-Up (continued)

• Consistency refers to how variable, or spread out, the values in a data set are for a quantitative variable.

• When describing a distribution, refer to both center (tendency) and spread (consistency).
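
To see why both center and spread matter, the sketch below summarizes two made-up data sets that share the same center but differ in spread. The quiz scores and the helper `describe` are hypothetical, and the particular measures (median, mean, range, standard deviation) are one reasonable choice rather than something these slides prescribe.

```python
import statistics

# Hypothetical quiz scores for two sections (not from the slides).
section_a = [70, 72, 74, 75, 76, 78, 80]
section_b = [50, 60, 70, 75, 80, 90, 100]


def describe(values):
    """Summarize center (median, mean) and spread (range, standard deviation)."""
    return {
        "median": statistics.median(values),
        "mean": statistics.mean(values),
        "range": max(values) - min(values),
        "stdev": statistics.stdev(values),
    }


for name, scores in [("A", section_a), ("B", section_b)]:
    s = describe(scores)
    print(f"Section {name}: median={s['median']}, mean={s['mean']:.1f}, "
          f"range={s['range']}, stdev={s['stdev']:.1f}")
```

Both sections have a median and mean of 75, but section B's scores are far more spread out (less consistent), so reporting the center alone would hide the difference.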

Exercises 2-9 page 27

Exercises 2-16 page 30

Exercises 2-12 page 28

Topic 3: Drawing Conclusions from Studies

• Data give you insight into interesting questions.

• The key idea is generalizing the results of a study to a larger group than the one used in the study itself.

• Population – the entire group of people or objects (observational units) of interest in a study.

• Sample – a (typically small) part of the population from whom or about which data are gathered in order to learn about the population. If the sample is selected carefully (so that it is representative of the population), you can learn a lot about the population.

• Sample size – the number of observational units (people or objects) studied in a sample.

• Sampling bias – a sampling procedure is biased if it tends systematically to overrepresent certain segments of the population and underrepresent others.

More Definitions – Activity 3-1 (page 35)

• Convenience sample – a sample selected because its members are conveniently available.

• Voluntary response – a sample selected in such a way that members of the population decide for themselves whether or not to be part of the study.

• Nonresponse – a problem that can arise when selected observational units do not respond to the study.

• Sampling frame – the list used to select the subjects; bias can arise when the list does not represent all the variation in the population.

• Parameter – a number that describes the population (P-P).

• Statistic – a number that describes the sample (S-S).
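
The difference between a parameter and a statistic, and the effect of an unrepresentative sampling frame, can be illustrated with a small simulation. Everything below is hypothetical: the population, the support rates, and the idea that the frame lists only phone owners are assumptions chosen purely to make the bias visible.

```python
import random

rng = random.Random(1936)  # fixed seed so the run is repeatable


def make_people(count, owns_phone, support_rate):
    """Build hypothetical voters; each supports candidate A with the given rate."""
    return [
        {"owns_phone": owns_phone, "supports_a": rng.random() < support_rate}
        for _ in range(count)
    ]


# Assumed population: phone owners favor candidate A more than non-owners.
population = make_people(3000, True, 0.60) + make_people(7000, False, 0.30)


def proportion_supporting(people):
    """Proportion of the given people who support candidate A."""
    return sum(p["supports_a"] for p in people) / len(people)


# Parameter: describes the whole population (rarely computable in practice).
parameter = proportion_supporting(population)

# Statistic from a biased sampling frame: a list of phone owners only.
phone_owners = [p for p in population if p["owns_phone"]]
biased_statistic = proportion_supporting(rng.sample(phone_owners, 500))

# Statistic from a random sample drawn from the whole population.
random_statistic = proportion_supporting(rng.sample(population, 500))

print(f"parameter (population)       = {parameter:.3f}")
print(f"statistic (phone-owner list) = {biased_statistic:.3f}")
print(f"statistic (random sample)    = {random_statistic:.3f}")
```

Both samples have the same size, yet the statistic from the phone-owner list lands far from the parameter because the frame systematically overrepresents one segment of the population.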

Activity 3-1 (page 34)

Elvis Presley is reported to have died in his Graceland mansion on August 16, 1977. On the 12th anniversary of this event, a Dallas record company wanted to learn the opinions of all adult Americans on the issue of whether Elvis was really dead. But of course they could not ask every adult American this question, so they sponsored a national call-in survey. Listeners of more than 100 radio stations were asked to call a 1-900 number (at a charge of $2.50) to voice an opinion concerning whether Elvis was really dead. It turned out that 56% of the callers thought that Elvis was alive. This scenario is very common in statistics: wanting to learn about a large group based on data from a smaller group.

Activity 3-1 (page 34, continued)

• In 1936, Literary Digest magazine conducted the most extensive (to that date) public opinion poll in history. They mailed out questionnaires to over 10 million people whose names and addresses they had obtained from telephone books and vehicle registration lists. More than 2.4 million people responded, with 57% indicating they would vote for Republican Alf Landon in the upcoming presidential election. (Incumbent Democrat Franklin Roosevelt won the actual election, carrying 63% of the popular vote.)

More Definitions – Activity 3-4 (page 39)

• Explanatory variable – the variable whose effect you want to study.

• Response variable – the variable that you suspect is affected by the explanatory variable.

• Observational study – a study in which researchers passively observe and record information about observational units.

• Lurking variable – a variable whose possible effects are not accounted for in an observational study. Such an unrecorded variable is called a lurking variable.

• Confounding variable – a lurking variable whose effects on the response variable are indistinguishable from the effects of the explanatory variable.

• Activity 3-4 page 39

• Exercise 3-8 page 46

Wrap-Up: Key questions to consider

• What two things can prevent you from drawing certain conclusions from a study?
  • Bias and confounding

• To what population can you reasonably generalize the results of a study?
  • It depends on how you selected your sample.

• Can you reasonably draw a cause-and-effect connection between the explanatory and response variables?
  • It depends on whether or not the explanatory variable was assigned to the observational units.

Topic 4 – Random Sampling

• One way to avoid a biased sampling method is to give every member of the population the same chance of being selected for the sample.

• Your selection method should ensure that every possible sample (of the desired sample size) has an equal chance of being the sample ultimately selected.

• Such sampling is called simple random sampling (SRS).

• Unbiased – a statistic is said to provide unbiased estimates of a population parameter if the values of the statistic from different random samples are centered at the actual parameter value.
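
A minimal sketch of simple random sampling and of what "unbiased" means, assuming a made-up population of 1,000 numeric values. Python's `random.sample` draws without replacement so that every subset of the requested size is equally likely, which matches the definition of an SRS; the population, seed, and sample size are illustrative assumptions.

```python
import random
import statistics

rng = random.Random(4)  # fixed seed so the run is repeatable

# Hypothetical population: 1,000 values between 1 and 10.
population = [rng.randint(1, 10) for _ in range(1000)]
parameter = statistics.mean(population)  # population mean


def srs_mean(pop, n):
    """Mean of one simple random sample of size n (drawn without replacement)."""
    return statistics.mean(rng.sample(pop, n))


# Take many samples of size 10 and see where the sample means are centered.
sample_means = [srs_mean(population, 10) for _ in range(2000)]
center = statistics.mean(sample_means)

print(f"population mean (parameter) = {parameter:.3f}")
print(f"center of 2000 sample means = {center:.3f}")
print(f"smallest / largest sample mean = "
      f"{min(sample_means):.1f} / {max(sample_means):.1f}")
```

Individual sample means vary considerably, but their center sits very close to the population mean. That is the sense in which the sample mean from an SRS is an unbiased estimate.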

Definitions

• Sampling variability – an important statistical property: the values of sample statistics vary from sample to sample.

• Precision – the precision of a sample statistic refers to how much its values vary from sample to sample.
  • The bigger the sample size, the more precise the statistic: its values are closer together than those from smaller samples.
  • With an unbiased sampling method, a statistic from a larger sample tends to provide a more accurate estimate of the corresponding population parameter.
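
The link between sample size and precision can be checked by simulation. The sketch below assumes a hypothetical population of 10,000 units in which 40% have some trait, and measures how much the sample proportion varies across 500 repeated random samples at each of three sample sizes; all of these numbers are illustrative choices.

```python
import random
import statistics

rng = random.Random(5)  # fixed seed so the run is repeatable

# Hypothetical population: 40% of 10,000 units have the trait (coded 1).
population = [1] * 4000 + [0] * 6000


def sample_proportions(pop, n, repetitions=500):
    """Sample proportions from repeated simple random samples of size n."""
    return [sum(rng.sample(pop, n)) / n for _ in range(repetitions)]


# Spread (standard deviation) of the sample proportion at each sample size.
spread = {}
for n in (25, 100, 400):
    props = sample_proportions(population, n)
    spread[n] = statistics.stdev(props)
    print(f"n={n:>3}: center={statistics.mean(props):.3f}, "
          f"spread={spread[n]:.3f}")
```

The centers stay near 0.40 for every sample size, while the spread shrinks as the sample size grows: larger samples give more precise statistics.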

Activity 4-1 page 54

Activity 4-2 page 57

Exercise 4-18 page 73

Wrap-Up

• Do not confuse the sample size with the number of samples taken in a study.

• Although taking many samples is crucial to assessing how a sample statistic varies from sample to sample, the number of samples does not affect the sampling variability; the sample size does.

• As long as the population is large relative to the sample size (at least 10 times as large), the precision of a sample statistic depends on the sample size and not on the population size.
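
The last point can be explored with a simulation that holds the sample size fixed and changes only the population size. The populations (1,000 and 100,000 units, each with 40% coded 1), the sample size of 100, and the function name are hypothetical choices for illustration.

```python
import random
import statistics

rng = random.Random(6)  # fixed seed so the run is repeatable


def spread_of_sample_proportion(pop_size, n=100, repetitions=1000):
    """Standard deviation of the sample proportion across repeated random
    samples of size n from a 0/1 population in which 40% of units are 1."""
    ones = pop_size * 2 // 5
    population = [1] * ones + [0] * (pop_size - ones)
    props = [sum(rng.sample(population, n)) / n for _ in range(repetitions)]
    return statistics.stdev(props)


# Same sample size (100), very different population sizes.
small = spread_of_sample_proportion(1_000)
large = spread_of_sample_proportion(100_000)

print(f"population   1,000: spread = {small:.4f}")
print(f"population 100,000: spread = {large:.4f}")
```

Even though one population is 100 times larger than the other, the two spreads should come out close to each other, which is consistent with precision depending on the sample size rather than on the population size.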

Topic 5: Designing Experiments

• SELF STUDY

• QUIZ – Assignment 1

