Chapter 1 Handout - Taft Collegefaculty.taftcollege.edu/bjean/stat-1510/handouts/CH1-5.pdf · Kern...

Chapter 1 - Introduction

Learning Objectives

1. Develop an understanding of the basic terminology used in statistics.

2. Realize the importance that statistics plays in our lives.

3. Establish a working terminology which will allow for technical discussion.

4. Understand the different scales used in the measurement of data.

5. Understand the different types of variables used in statistical analysis.

Why should we study statistics?

1. The study of statistics is nothing more than the understanding of data analysis.

2. The analysis of data affects our lives on a daily basis:

- Interest rates

- Unemployment rate

- Stocks and bonds

- Medical research

Basic Terms and Definitions

Population:

Census:

Sample:

Variable:

Data: (singular)

Data: (plural)

Parameter: A numerical value summarizing __________.

Sample Statistic: A numerical value summarizing __________.

What is the study of statistics?

Statistics (the study of): The collection, analysis, summary and presentation of data. From the Latin word status, meaning the state of affairs.

Descriptive Statistics:

Inferential Statistics:

Level of Measurement (Measurement Scales)

Nominal Scale

1.

2.

3.

4.

Examples: gender, race, hair color, eye color

Ordinal Scale

1.

2.

3.

4.

Examples: Military rank, Likert scale, grades

Interval Scale

1.

2.

Examples:

Temperature:

Historical Time:

Abraham Lincoln was assassinated in 1865.

John F. Kennedy was assassinated in 1963.

Ratio Scale

1.

2.

3.

Examples: Weight, Time to complete a task, Distance (length)

Quantitative vs. Qualitative Variables

Qualitative variable: (or attribute)

Nominal and Ordinal Variables are Qualitative.

Quantitative variable: (or numerical)

Interval and Ratio Scaled Variables are Quantitative

Discrete and Continuous Variables

Discrete:

Continuous:

Chapter 2 - Experimental Design and Data Collection

Learning Objectives

1. Understand the different sources of error that occur when measurements are made.

2. Recognize the different types of experiments conducted.

3. Understand the importance of randomization in data collection.

4. Utilize different sampling techniques in the data collection process.

What is Good Research?

Good research generates dependable data.

-

-

Appropriate research ethics is paramount.

Employ the Scientific Method

We can help reduce the probability of faulty research by maintaining the standards of the scientific method.

1.

2.

3.

4.

5.

Junk Science

Junk science is the use of faulty scientific data and/or faulty analysis of otherwise good data.

Commonly, junk science is used intentionally: by various components of the media to instill the "shock factor" in a story, by politicians to "prove" their point, and by businesses to bolster their product and/or attack a competitor's product.

Sources of Measurement Error

The true value of an object being measured can be represented as:

Bias:

A biased sample is:

Error due to Chance (random error):

Types of Design

There are two primary types of designs that we will be addressing here. The first design is a controlled experiment and the second is an observational study.

Controlled Experiment

In a controlled experiment, the experimenter "controls" or manipulates the environment, then records the results. Controlled experiments are typically broken down into two major groups:

1.

2.

Blind experiment:

Placebo:

Double blind experiment:

Prospective Study:

Controlled Experiment (prospective study) - Aspirin and Heart Attacks Example

The following information was taken from research completed by the Physicians' Health Study Research Group at Harvard Medical School.

The Physicians' Health Study was a five-year study testing whether regular usage of aspirin reduces mortality from cardiovascular disease. Every day, physicians participating in the study took one aspirin tablet or a placebo. The study was a blind study. Of the 11,034 physicians taking the placebo, 189 experienced myocardial infarction (heart attack), whereas only 104 of the 11,037 taking aspirin suffered a myocardial infarction. Based on these results, it can be shown that the odds of suffering a myocardial infarction were approximately 1.83 times higher for persons taking the placebo (a relative risk of approximately 1.82).
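
The comparison between the two groups can be checked with a few lines of arithmetic. The sketch below, in Python, uses only the four counts quoted above; the variable names are illustrative. It computes both the ratio of the two heart attack proportions (relative risk) and the ratio of the odds (odds ratio), since a figure of about 1.83 matches the odds ratio while the relative risk comes out slightly lower.

```python
# Counts quoted in the handout for the Physicians' Health Study.
placebo_n, placebo_mi = 11034, 189
aspirin_n, aspirin_mi = 11037, 104

# Risk: proportion of each group that experienced a heart attack.
placebo_risk = placebo_mi / placebo_n
aspirin_risk = aspirin_mi / aspirin_n

# Relative risk: how many times larger the placebo proportion is.
relative_risk = placebo_risk / aspirin_risk

# Odds: events divided by non-events; the odds ratio compares the two odds.
placebo_odds = placebo_mi / (placebo_n - placebo_mi)
aspirin_odds = aspirin_mi / (aspirin_n - aspirin_mi)
odds_ratio = placebo_odds / aspirin_odds

print(f"placebo risk:  {placebo_risk:.4f}")
print(f"aspirin risk:  {aspirin_risk:.4f}")
print(f"relative risk: {relative_risk:.2f}")  # 1.82
print(f"odds ratio:    {odds_ratio:.2f}")     # 1.83
```

Because heart attacks were rare in both groups, the two measures are nearly equal here.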

Source: Preliminary Report: Findings from the Aspirin Component of the Ongoing Physicians' Health Study. New England Journal of Medicine, 318: 262-264 (1988).

Observational Study

Retrospective Study: Study of past events.

Cross Sectional Study:

Randomization is a key element in experimental design.

One of the main objectives of randomization is the reduction of bias by ensuring the groups (if making a comparison between a treatment group and a control or placebo group) are similar to begin with.

Randomization also helps guard against unknown confounding factors in control vs. experimental groups.

Confounding factor: (sometimes referred to as a "__________") A hidden factor that has an effect on the response we are attempting to measure.

Observational Study - Pediatric Coccidioidomycosis (Valley Fever)

Kern County and the California San Joaquin Valley had a major coccidioidomycosis (valley fever) epidemic from 1991 to 1993. During this period 56 children, 33 males and 23 females, with ages ranging from 3 months to 17 years, were diagnosed and treated at the local county hospital. Analysis revealed that 50 cases had primary coccidioidomycosis while 6 were disseminated, with 2 having meningitis (inflammation of the membranes of the brain or spinal cord) and 4 with osseous coccidioidomycosis (disseminated to the bones). The disease was predominant in Hispanic children, 36/56 (64%), vs. non-Hispanic whites, 7/56 (13%), compared to the hospital population of 48 percent Hispanic and 40 percent non-Hispanic.

Source: Proceedings of the 5th International Conference of Coccidioidomycosis - Pediatric Coccidioidomycosis During the Kern County-California Epidemic 1991-93, Amin N, Castroverde E, Khandaker N, Kim J, Pham D, Patel S, Rangel S, Jean B, Sporer R, Peck R, & Pershadsingh H.

Data Collection

Simple random sample:

Generating random numbers with your TI83/84.
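
For readers working on a computer rather than a calculator, a simple random sample can be sketched in Python. This is an illustration only: the roster of 40 numbered students, the sample size of 5, and the function name `simple_random_sample` are all hypothetical and do not come from the handout.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct items so that every subset of size n is equally likely."""
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    return rng.sample(population, n)

# Hypothetical class roster: students numbered 1 through 40.
roster = list(range(1, 41))

sample = simple_random_sample(roster, 5, seed=1510)
print(sample)
```

Sampling is done without replacement, so no student can be chosen twice.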

Systematic 1-in-k Random Sampling:

1.

2.

3.

4.

5.
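
The numbered steps are left blank in the handout, so the following Python sketch shows one common textbook version of 1-in-k systematic sampling, not necessarily the five steps intended here: pick a random starting point among the first k items, then take every k-th item after it. The roster, the value k = 8, and the function name are hypothetical.

```python
import random

def systematic_sample(population, k, seed=None):
    """Take every k-th item, starting from a random position among the first k."""
    rng = random.Random(seed)
    start = rng.randrange(k)   # random index 0 .. k-1
    return population[start::k]

# Hypothetical ordered list of 40 students; a 1-in-8 sample yields 5 students.
roster = list(range(1, 41))

sample = systematic_sample(roster, k=8, seed=1510)
print(sample)
```

Only the starting point is random; once it is chosen, the rest of the sample is determined.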

Stratified Random Sampling:

1.

2.
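
As a rough illustration of the idea, stratified random sampling can be sketched in Python as a simple random sample drawn separately within each stratum. The strata (class standing), the student labels, and the choice of two students per stratum are hypothetical; the handout's own steps are left blank.

```python
import random

def stratified_sample(strata, n_per_stratum, seed=None):
    """Draw a simple random sample of the same size from every stratum."""
    rng = random.Random(seed)
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

# Hypothetical strata: students grouped by class standing.
strata = {
    "freshman":  ["F1", "F2", "F3", "F4", "F5"],
    "sophomore": ["S1", "S2", "S3", "S4"],
    "other":     ["O1", "O2", "O3"],
}

sample = stratified_sample(strata, n_per_stratum=2, seed=1510)
for name, chosen in sample.items():
    print(name, chosen)
```

Every stratum is guaranteed to appear in the sample, which a simple random sample of the whole group does not guarantee.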

Convenience Samples

• Mail Surveys

• Telephone Opinion Polls

Chapter 3 – Graphical Displays of Univariate Data

Learning Objectives

1. Construct and interpret appropriate graphical displays for both qualitative and quantitative data.

2. Recognize and classify shapes of distributions when appropriate.

Frequency Distribution Table (Grades Example)

Now produce a bar graph by hand.

Frequency Distribution Table (Class Exam Score Example)

Stem-and-Leaf Display: Use the same exam score data to produce a stem-and-leaf display.

62 91 85 97 44

73 80 86 70 79

99 87 96 94 72

93 84 91 72 65
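
A hand-drawn display can be checked against the short Python sketch below. It assumes the tens digit is the stem and the ones digit is the leaf, which is a natural choice for two-digit exam scores but is not stated in the handout; the function name is illustrative.

```python
from collections import defaultdict

scores = [62, 91, 85, 97, 44,
          73, 80, 86, 70, 79,
          99, 87, 96, 94, 72,
          93, 84, 91, 72, 65]

def stem_and_leaf(values):
    """Group sorted values by tens digit (stem); the ones digits are the leaves."""
    stems = defaultdict(list)
    for v in sorted(values):
        stems[v // 10].append(v % 10)
    # Keep empty stems so that gaps in the data remain visible.
    return {s: stems.get(s, []) for s in range(min(stems), max(stems) + 1)}

display = stem_and_leaf(scores)
for stem, leaves in display.items():
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")
```

The stem 5 prints with no leaves because no score falls in the 50s.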

Produce a Frequency Distribution Table and Bar Graph of Class Hair Color

Shapes of Distributions

Scatter Plots

Year    Pollution Index    Deaths
1987    2.5                147
1988    2.6                130
1989    8.3                210
1990    3.4                130
1991    1.3                114
1992    3.8                162
1993    11.6               208
1994    6.4                178

Chapter 4 – Measurements of Location and Position

Learning Objectives

1. Compute and interpret the location of the center of a data set along the number line.

2. Compute and interpret the position of a data value relative to the other data values for a data set.

3. Identify the appropriate measure of center for a data set by distributional shape.

Measurements of Location

The Sample Mean

English    Parameter    Statistic    What does it measure?

1.

2.

3.

Calculating the Sample Mean: Consider the data values 32, 1, 30
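
As a quick check of the hand calculation, the sample mean is the sum of the values divided by the number of values. A minimal Python sketch (variable names are illustrative):

```python
data = [32, 1, 30]

# Sample mean: add the values, then divide by how many there are.
sample_mean = sum(data) / len(data)   # (32 + 1 + 30) / 3

print(sample_mean)  # 21.0
```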

The Sample Median

English    Parameter    Statistic    What does it measure?

Calculating the Median
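
The handout gives no data under this heading, so the sketch below reuses the values 32, 1, 30 from the sample mean example and adds one hypothetical extra value (5) to show the even-count case. It is a minimal Python illustration of the usual rule: sort the data, then take the middle value, or the average of the two middle values.

```python
def median(values):
    """Middle value of the sorted data; average the two middle values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

odd_median = median([32, 1, 30])       # sorted: 1, 30, 32
even_median = median([32, 1, 30, 5])   # sorted: 1, 5, 30, 32 (5 is a made-up value)

print(odd_median)   # 30
print(even_median)  # 17.5
```

For 32, 1, 30 the median is 30 while the mean is 21, because the single small value pulls the mean down but not the median.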

Mean vs. Median

The Mode

1.

2.

3.

Category    Frequency
A           8
B           10
C           11
D           6
F           5

A A B A C F D F C D

C C F D A C B B F B

B A C B D A C C B B

B A C D C A D F B C
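
The frequency table and the mode can both be recovered from the 40 letter grades listed above. A short Python sketch using the standard library's `Counter` (variable names are illustrative):

```python
from collections import Counter

grades = (
    "A A B A C F D F C D "
    "C C F D A C B B F B "
    "B A C B D A C C B B "
    "B A C D C A D F B C"
).split()

# Tally how often each grade occurs.
counts = Counter(grades)

# The mode is the category (or categories) with the highest frequency.
highest = max(counts.values())
modes = sorted(g for g, c in counts.items() if c == highest)

for grade in sorted(counts):
    print(grade, counts[grade])
print("mode:", modes)  # ['C']
```

The tallies agree with the Category/Frequency table: A 8, B 10, C 11, D 6, F 5.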

Exam Scores

92 83 31 56 87 71 62 74 86 77

73 98 82 70 67 75 46 91 80 88

72 74 81 77 91 83 67 51 79 91

Measurements of Position: Quartiles and Percentiles

Measurement scales applicable:

English    Parameter    Statistic    What does it measure?

Frequency by class: 1, 1, 2, 3, 10, 8, 5

Quartiles: Divide the data into “quarters”.

Combined, these make up the Five Number Summary.

• Minimum

• Q1

• Median

• Q3

• Maximum

The Five Number Summary is used to create a box-and-whisker plot.

Consider the following exam score data.

62 91 85 97 44

73 80 86 70 79

99 87 96 94 72

93 84 91 72 65
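
The five number summary for these 20 scores can be sketched in Python as follows. Quartile conventions differ between textbooks and software; this sketch assumes Q1 and Q3 are the medians of the lower and upper halves of the sorted data, which is an assumption rather than something the handout states. Function names are illustrative.

```python
scores = [62, 91, 85, 97, 44,
          73, 80, 86, 70, 79,
          99, 87, 96, 94, 72,
          93, 84, 91, 72, 65]

def median_sorted(ordered):
    """Median of a list that is already sorted."""
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) using the median-of-halves rule."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    upper = ordered[half + n % 2:]   # skip the middle value when n is odd
    return (ordered[0], median_sorted(lower), median_sorted(ordered),
            median_sorted(upper), ordered[-1])

summary = five_number_summary(scores)
print(summary)  # (44, 72.0, 84.5, 92.0, 99)
```

These five values are the ones marked on a box-and-whisker plot.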

Percentiles

1.

2.

Procedure For Finding Percentiles

1. Sort the data from smallest value to largest value.

2. Find the location by multiplying the sample size, n, by the proportion represented by the percentile.

3. If the location calculated is a non-integer value, round up to the next integer value. Identify the data value in that position as the desired percentile.

4. If the location calculated is an integer value, average the data value in that position and the next position. The average of the two values will be identified as the desired percentile.

Consider the exam score data and identify P85.
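
The four-step procedure translates directly into code. The Python sketch below assumes that "the exam score data" means the 20 scores used in the quartile example; the function name `percentile` and the choice to pass the percentile as a whole number (85 for P85) are illustrative. Whole-number arithmetic is used for the location so that the integer test in steps 3 and 4 is exact.

```python
scores = [62, 91, 85, 97, 44,
          73, 80, 86, 70, 79,
          99, 87, 96, 94, 72,
          93, 84, 91, 72, 65]

def percentile(values, p):
    """Percentile by the handout's procedure; p is a whole number from 1 to 99."""
    if not 0 < p < 100:
        raise ValueError("p must be between 1 and 99")
    ordered = sorted(values)                  # step 1: sort
    n = len(ordered)
    whole, remainder = divmod(n * p, 100)     # step 2: location = n * (p / 100)
    if remainder:                             # step 3: non-integer, round up
        return ordered[whole]                 # position whole + 1, counted from 1
    # step 4: integer location, average that position and the next one
    return (ordered[whole - 1] + ordered[whole]) / 2

p85 = percentile(scores, 85)
print(p85)  # 95.0
```

Here the location is 20 × 0.85 = 17, an integer, so P85 is the average of the 17th and 18th sorted scores, (94 + 96) / 2 = 95.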

Chapter 5: Measurements of Variability

Learning Objectives

1. Compute and interpret measures of variability for a data set.

2. Determine the appropriate measure of variability to use for a data set based on distributional shape.

3. Apply and interpret the Empirical Rule and Chebyshev’s Rule for a data set.

Example Data:

92 83 31 56 87 71 62 74 86 77

73 98 82 70 67 75 46 91 80 88

72 74 81 77 91 83 67 51 79 91

Why do we care about measurements of variability?

Range
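
For the example data, the range is simply the largest value minus the smallest. A minimal Python check (variable names are illustrative):

```python
data = [92, 83, 31, 56, 87, 71, 62, 74, 86, 77,
        73, 98, 82, 70, 67, 75, 46, 91, 80, 88,
        72, 74, 81, 77, 91, 83, 67, 51, 79, 91]

# Range = maximum - minimum.
data_range = max(data) - min(data)

print(max(data), min(data), data_range)  # 98 31 67
```

Because it uses only the two most extreme values, the range is sensitive to a single unusual score such as the 31.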

Variance and Standard Deviation

English    Parameter    Statistic    What does it measure?

Variance

Standard Deviation

What is the meaning of “variability about the mean”?

Find the sample variance and standard deviation for our example data:

Stat > Calc > 1-Var Stats (list name)
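
The calculator result can be cross-checked by computing the sample variance and standard deviation from their definitions. The Python sketch below assumes the usual sample formulas, with n - 1 in the denominator; variable names are illustrative.

```python
import math

data = [92, 83, 31, 56, 87, 71, 62, 74, 86, 77,
        73, 98, 82, 70, 67, 75, 46, 91, 80, 88,
        72, 74, 81, 77, 91, 83, 67, 51, 79, 91]

n = len(data)
mean = sum(data) / n

# Sum of squared deviations about the mean.
sum_sq_dev = sum((x - mean) ** 2 for x in data)

# Sample variance divides by n - 1; the standard deviation is its square root.
sample_variance = sum_sq_dev / (n - 1)
sample_sd = math.sqrt(sample_variance)

print(f"n = {n}, mean = {mean:.2f}")
print(f"sample variance = {sample_variance:.2f}")   # about 218.90
print(f"sample standard deviation = {sample_sd:.2f}")  # about 14.80
```

The standard deviation is in the same units as the data (exam points), which is why it is usually easier to interpret than the variance.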

The Empirical Rule

Applies to:

Chebyshev’s Rule

Applies to:

Number of sd’s    Empirical Rule    Chebyshev’s Rule
1
2
3
4

Consider our example data. Which rule appears to apply best to this data? Verify.
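
One way to verify is to count how many of the 30 example values fall within 1, 2, 3, and 4 standard deviations of the mean and set those proportions beside each rule. The Python sketch below does this, using the standard statements of the two rules as reference values: roughly 68%, 95%, and 99.7% for the Empirical Rule, and at least 1 - 1/k² for Chebyshev's Rule. Variable names are illustrative, and the sketch only produces the comparison; interpreting it is left to the reader.

```python
import statistics

data = [92, 83, 31, 56, 87, 71, 62, 74, 86, 77,
        73, 98, 82, 70, 67, 75, 46, 91, 80, 88,
        72, 74, 81, 77, 91, 83, 67, 51, 79, 91]

n = len(data)
mean = statistics.mean(data)
sd = statistics.stdev(data)   # sample standard deviation (n - 1 denominator)

# Approximate Empirical Rule percentages for 1, 2, and 3 standard deviations.
empirical = {1: 0.68, 2: 0.95, 3: 0.997}

rows = []
for k in (1, 2, 3, 4):
    within = sum(1 for x in data if abs(x - mean) <= k * sd)
    chebyshev_min = 1 - 1 / k ** 2   # Chebyshev's lower bound
    rows.append((k, within, within / n, chebyshev_min))

for k, within, observed, chebyshev_min in rows:
    emp = f"{empirical[k]:.1%}" if k in empirical else "n/a"
    print(f"k={k}: {within}/{n} = {observed:.1%} observed | "
          f"Empirical {emp} | Chebyshev at least {chebyshev_min:.1%}")
```

For this data, 21 of 30 values (70%) lie within one standard deviation, 29 of 30 (about 96.7%) within two, and all 30 within three.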

