
Stat 2411 Statistical Methods

Date post: 23-Feb-2016
Upload: bly
Page 1: Stat 2411 Statistical Methods

Stat 2411 Statistical Methods

Chapter 2: Summarizing data


Summarizing Data

Data are collected to answer questions. Analyzing the data takes both careful thinking and statistical methods.

Example: 8 lb test Fishing Line

Question: Which type(s) of line are strongest?


2.1 Listing numerical data

• Trilene XL: 11.5 11.3 11.7 11.6 11.7 11.4 11.5 11.5 11.6 11.4

• Trilene XT: 11.6 11.8 11.7 11.7 11.5 11.6 11.6 11.8 11.4 11.7

• Stren: 11.1 11.1 11.2 11.0 11.1 11.3 11.2 10.9 11.0 11.1


Plotting of the data: Dot diagram

When analyzing data, always plot the data!

A dot diagram:

        XL      XT      Stren
11.8            **
11.7    **      ***
11.6    **      ***
11.5    ***     *
11.4    **      *
11.3    *               *
11.2                    **
11.1                    ****
11.0                    **
10.9                    *
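A dot diagram like this can also be built by counting how often each value occurs in each sample. The following is a minimal sketch in Python using only the standard library; the function name `dot_diagram` and the column layout are illustrative assumptions, not part of the course material.

```python
from collections import Counter

# Fishing line values as listed in section 2.1
samples = {
    "XL": [11.5, 11.3, 11.7, 11.6, 11.7, 11.4, 11.5, 11.5, 11.6, 11.4],
    "XT": [11.6, 11.8, 11.7, 11.7, 11.5, 11.6, 11.6, 11.8, 11.4, 11.7],
    "Stren": [11.1, 11.1, 11.2, 11.0, 11.1, 11.3, 11.2, 10.9, 11.0, 11.1],
}

def dot_diagram(samples):
    """Return the lines of a text dot diagram, largest value first."""
    counts = {name: Counter(values) for name, values in samples.items()}
    all_values = sorted({v for values in samples.values() for v in values},
                        reverse=True)
    # Each column is as wide as the largest sample, so stars never overflow.
    width = max(len(values) for values in samples.values())
    lines = ["value " + " ".join(name.ljust(width) for name in samples)]
    for v in all_values:
        cells = [("*" * counts[name][v]).ljust(width) for name in samples]
        lines.append(f"{v:<5} " + " ".join(cells))
    return [line.rstrip() for line in lines]

for line in dot_diagram(samples):
    print(line)
```

Each row holds one observed value and each column one type of line, so the separation of Stren from the two Trilene lines shows up as stars in different rows.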


Plotting of the data: Bar Chart

• A bar chart: Trilene XL

[Figure: bar chart of the Trilene XL values; horizontal axis: 11.3, 11.4, 11.5, 11.6, 11.7]


2.2 Stem and Leaf Diagram

1) Separate each observation into 2 parts

• Stem: everything but the rightmost digit

• Leaf: the final digit

2) Write the stems in a vertical column, then draw a vertical line next to them

3) Write each leaf in a row to the right of its stem


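Stem Leaf plot

Systolic bp data

108 134 100 108 112 112 112 122 116 116 120 108 108 96 114 108 128 114 112 124 90 102 106 124 130 116

Stems: 9, 10, 11, 12, 13

[Figure: stem and leaf plot in progress, with the leaves of the first few observations entered]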


Completed Stem Leaf plot

 9 | 0 6
10 | 0 2 6 8 8 8 8 8
11 | 2 2 2 2 4 4 6 6 6
12 | 0 2 4 4 8
13 | 0 4
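The three steps of section 2.2 can be sketched in a few lines of Python. This is only an illustration for whole-number data: the function name `stem_and_leaf` and the output layout are assumptions made here, and the leaves are sorted within each stem, as in the completed plot.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Build a stem and leaf display for non-negative whole numbers.

    Stem: everything but the rightmost digit. Leaf: the final digit.
    """
    leaves = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        leaves[stem].append(leaf)
    # Keep stems that have no leaves so gaps in the data stay visible.
    stems = range(min(leaves), max(leaves) + 1)
    return [(f"{s:>3} | " + " ".join(str(d) for d in leaves[s])).rstrip()
            for s in stems]

# Systolic bp data from the slide
bp = [108, 134, 100, 108, 112, 112, 112, 122, 116, 116, 120, 108, 108,
      96, 114, 108, 128, 114, 112, 124, 90, 102, 106, 124, 130, 116]

for row in stem_and_leaf(bp):
    print(row)
```

For data recorded to one decimal place, the same function could be used after multiplying each value by 10 and rounding, so that the stems are the ones and the leaves are the tenths.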


Stem and Leaf Diagram Exercise

Cardiac output in middle aged runners. (Journal of Sports Medicine)

20.9 17.9 19.9 16.0 12.8 23.2 21.2 21.0 20.9 15.0 22.2 22.2 18.3 19.8 21.0 15.8 23.6 20.6

Tip: Stem = ones, Leaves = tenths

12 | 8
13 |
14 |
15 | 0 8
16 | 0
17 | 9
18 | 3
19 | 8 9
20 | 6 9 9
21 | 0 0 2 6 9
22 | 2 2


2.3 Frequency Distributions

With larger data sets it helps to count the number of values falling in each of several summary classes, usually 5-15 classes.

E.g. Suspended solids in agricultural watersheds. (Water Resources Bulletin)

Suspended Solids (ppm)    Frequency
30-39                     8
40-49                     7
50-59                     5
60-69                     11
70-79                     6
80-89                     1
90-99                     2


Frequency Distributions

See the book for:

– Class limits

– Upper class limits

– Lower class limits

– Class marks

– Class intervals
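As a rough illustration of these terms, the sketch below applies the usual textbook definitions to the suspended solids table: the class mark is taken as the midpoint of the lower and upper class limits, and the class interval as the distance between successive lower class limits. These definitions and the function name `summarize` are assumptions made for this example and should be checked against the book's wording.

```python
# (lower class limit, upper class limit, frequency) for suspended solids (ppm)
classes = [(30, 39, 8), (40, 49, 7), (50, 59, 5), (60, 69, 11),
           (70, 79, 6), (80, 89, 1), (90, 99, 2)]

def summarize(classes):
    """Return (lower, upper, class mark, class interval, freq, relative freq)."""
    total = sum(freq for _, _, freq in classes)
    # Class interval: distance between successive lower class limits
    # (assumes all classes have the same width, as they do here).
    interval = classes[1][0] - classes[0][0]
    rows = []
    for lower, upper, freq in classes:
        mark = (lower + upper) / 2  # class mark: midpoint of the class limits
        rows.append((lower, upper, mark, interval, freq, freq / total))
    return rows

for lower, upper, mark, interval, freq, rel in summarize(classes):
    print(f"{lower}-{upper}  mark={mark}  interval={interval}  "
          f"freq={freq:2d}  rel={rel:.3f}")
```

The relative frequencies are not on the slide; they are included only because they follow directly from the counts (40 values in total).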


2.4 Graphical Representations

• A histogram represents a frequency distribution with bars.

[Figure: histogram of the suspended solids data; classes 30-39, 40-49, 50-59, 60-69, 70-79, 80-89, 90-99 with bar heights 8, 7, 5, 11, 6, 1, 2]


Pie Chart (degrees = 360 x %)

Tree     #     %        Degrees
Oak      50    62.5%    225
Maple    20    25%      90
Ash      10    12.5%    45
Total    80             360

[Figure: pie chart with slices for oak, maple and ash]
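The degrees column follows from multiplying each category's share of the total by 360. A minimal sketch of that arithmetic (the function name `pie_degrees` is made up for this example):

```python
def pie_degrees(counts):
    """Map each category to (percent of total, degrees of the pie slice)."""
    total = sum(counts.values())
    result = {}
    for name, n in counts.items():
        fraction = n / total
        result[name] = (100 * fraction, 360 * fraction)
    return result

# Tree counts from the slide
trees = {"Oak": 50, "Maple": 20, "Ash": 10}

for name, (pct, deg) in pie_degrees(trees).items():
    print(f"{name:<6}{pct:6.1f}%{deg:7.1f}")
```

For the oak row, 50 / 80 = 0.625, so the slice is 62.5% of the circle, or 0.625 x 360 = 225 degrees.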


2.5 Two Variable Data: Scattergram

Cma     Chromosome Abnormal %
0.11    2
0.19    5
0.51    13
0.53    15
1.08    25
1.62    28
1.73    36
2.36    45
2.72    56
3.12    59
3.88    63
4.18    60

[Figure: scattergram with Cma on the horizontal axis (0.0 to 5.0) and % abnormal on the vertical axis (0 to 80)]


Plotting Original Data

• Always plot original data points.

– This is the first thing to do when analyzing data

– This is very important!


Plotting Cancer Study Results

• The following plots are from a study by Dr. Terry Rose-Hellekant in the Medical School Duluth

• Treatments

– Tamoxifen

– Placebo

• Some mice develop breast cancer


• The data are RT-PCR expression values corresponding to particular genes

– In RT-PCR the values are roughly on a log base 2 scale of the RNA content.

• PUM1 is a “housekeeping” gene

– It accounts for RNA quality in the sample

– For example, time since death in a study of schizophrenia on deceased patients’ brains


Two groups can be compared with back to back stem and leaf diagrams.

E.g. Stopping distances of bikes

Treaded tire |    | Smooth tire
             | 34 | 1 8 9
             | 35 | 5
           5 | 36 |
         6 4 | 37 | 5
             | 38 |
           1 | 39 | 1
         2 0 | 40 |

Or dot diagrams

           340   350   360   370   380   390   400
Treaded     |     |     | *   | **  |     | *   | **
Smooth      | *** | *   |     | *   |     | *   |
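A back to back display can be sketched with the same stem and leaf logic, writing one group's leaves to the left of the stems and the other's to the right. In the sketch below the stopping distances are read off the display above (stem = tens, leaf = ones), and the function name `back_to_back` is an assumption made for illustration.

```python
from collections import defaultdict

def back_to_back(left, right):
    """Return rows of a back to back stem and leaf display for whole numbers."""
    left_leaves, right_leaves = defaultdict(list), defaultdict(list)
    for v in sorted(left):
        left_leaves[v // 10].append(v % 10)
    for v in sorted(right):
        right_leaves[v // 10].append(v % 10)
    lo = min(min(left_leaves), min(right_leaves))
    hi = max(max(left_leaves), max(right_leaves))
    stems = list(range(lo, hi + 1))
    # Left-hand leaves are written outward from the stem, so they are reversed.
    left_text = {s: " ".join(str(d) for d in reversed(left_leaves[s]))
                 for s in stems}
    width = max(len(text) for text in left_text.values())
    rows = []
    for s in stems:
        right_text = " ".join(str(d) for d in right_leaves[s])
        rows.append(f"{left_text[s]:>{width}} | {s} | {right_text}".rstrip())
    return rows

# Values read off the display (assumed: stem = tens, leaf = ones)
treaded = [365, 376, 374, 391, 402, 400]
smooth = [341, 348, 349, 355, 375, 391]

for row in back_to_back(treaded, smooth):
    print(row)
```

Sharing one column of stems puts both groups on the same scale, which is what makes the comparison easy to see.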


When there are associations between sets of data values, plot the data accordingly.

E.g., Snowfall for Duluth and White Bear Lake 1972-2000

A not very good way to plot the data:

[Figure: back to back dot diagram of the snowfall totals, WB Lake on the left and Duluth on the right, on a shared scale from 20 to 130]


Snowfall plot

[Figure: snow_total (0 to 140) plotted against year (1972 to 1997), with separate series for Duluth and White Bear]


A study of trace metals in South Indian River

[Figure: six sampling locations, numbered 1 to 6]

T = top water zinc concentration (mg/L)
B = bottom water zinc concentration (mg/L)

Site      1      2      3      4      5      6
Top       0.415  0.238  0.390  0.410  0.605  0.609
Bottom    0.430  0.266  0.567  0.531  0.707  0.716


• One of the first things to do when analyzing data is to PLOT the data

• This is not a useful way to plot the data: there is no clear distinction between bottom water and top water zinc, even though Bottom>Top at all 6 locations.

[Figure: "Zinc In River", zinc (0 to 0.8) plotted against depth (1=Top, 2=Bottom)]
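The statement that Bottom>Top at all 6 locations can be checked directly from the listed values by taking the difference within each pair. A minimal sketch (the function name `paired_differences` is made up for this example):

```python
# Zinc concentrations (mg/L) at the six locations, in order
top = [0.415, 0.238, 0.390, 0.410, 0.605, 0.609]
bottom = [0.430, 0.266, 0.567, 0.531, 0.707, 0.716]

def paired_differences(top, bottom):
    """Return Bottom - Top for each location, rounded to 3 decimals."""
    return [round(b - t, 3) for t, b in zip(top, bottom)]

diffs = paired_differences(top, bottom)
for site, d in enumerate(diffs, start=1):
    print(f"site {site}: bottom - top = {d:+.3f}")
print("Bottom > Top at every site:", all(d > 0 for d in diffs))
```

Every difference is positive, yet the two groups overlap as a whole (Top runs from 0.238 to 0.609, Bottom from 0.266 to 0.716), which is why a plot that ignores the pairing hides the pattern.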


A better way

[Figure: zinc (0.2 to 0.7) at Top and Bottom, with a line joining the two points of each pair]

Connect points in the same pair.


A better way

[Figure: scatter plot of the paired zinc values, both axes from 0 to 0.8, with the reference line Bottom=Top]


The following plot implies a natural ordering of sites from 1 to 6. It is not the best way to plot the data unless sites 1-6 correspond to a natural ordering, such as distance downstream of a factory.

[Figure: zinc (0 to 0.8) plotted against site (1 to 6), with separate series for Top and Bottom]

