Midterm Review!

Post on 19-Mar-2016




Unit I (Chapters 1-6) – Exploring Data

Unit II (Chapters 7-10) – Regression

Unit III (Chapters 11-13) – Experiments

Unit IV (Chapters 14-17) – Probability

Unit 1 (Chapters 1-6)

Exploratory Data Analysis

Key Ideas
Identifying types of variables
Describing data with numbers, graphs and words (CUSS – Center, Unusual features, Shape, Spread)
Comparing two data sets (CUSS)
Resistant vs. non-resistant statistics
Finding outliers
Picking the right graph for your data
Contingency tables – marginal & conditional totals
Normal models

Identifying types of variables
Quantitative: variables you can average, where it makes sense to do so
Categorical: variables which fit into categories

CUSS Center

Unusual

Shape

Spread

CUSS
Be sure that if you talk about the mean, then you also talk about…
Similarly for the median…

Describing a distribution

Example – comparing using CUSS

Resistant
Classify the following as resistant/non-resistant:
Mean, Median, Mode, Standard Deviation, IQR, Range, r, R^2
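One way to check a classification is to change a single value to an extreme one and see which statistics move. The sketch below uses made-up data and only covers five of the statistics listed above. It assumes Python's "inclusive" quartile method, which can give slightly different quartiles than a calculator does.

```python
import statistics

def summarize(data):
    """Return common summary statistics for a list of numbers."""
    # quantiles(..., n=4) returns the three cut points Q1, median, Q3
    q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return {
        "mean": statistics.mean(data),
        "median": statistics.median(data),
        "sd": statistics.stdev(data),       # sample standard deviation
        "iqr": q3 - q1,
        "range": max(data) - min(data),
    }

# Made-up data, then the same data with the largest value made extreme
base = [2, 4, 5, 6, 7, 8, 10]
with_outlier = [2, 4, 5, 6, 7, 8, 100]

before = summarize(base)
after = summarize(with_outlier)

for name in before:
    print(f"{name:>6}: {before[name]:8.2f} -> {after[name]:8.2f}")
```

In this example the median and IQR stay put while the mean, standard deviation and range all jump, which is what "resistant" vs. "non-resistant" is getting at.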

Potential Outliers – How do I find ‘em?
Look:
1.5*IQR rule (must memorize)
Look at SDs – more than 2 away for normal distributions, more than 3 if we don’t know what the distribution looks like
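The 1.5*IQR rule can be sketched as two fences computed from Q1 and Q3: anything below Q1 - 1.5*IQR or above Q3 + 1.5*IQR is a potential outlier. The function names and the quartile values here are hypothetical, chosen only for illustration.

```python
def outlier_fences(q1, q3):
    """Return the (lower, upper) fences from the 1.5*IQR rule."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def is_potential_outlier(value, q1, q3):
    """True if value falls strictly outside the fences."""
    low, high = outlier_fences(q1, q3)
    return value < low or value > high

# Hypothetical quartiles: Q1 = 20, Q3 = 30, so IQR = 10
low, high = outlier_fences(20, 30)
print(low, high)                         # 5.0 45.0
print(is_potential_outlier(50, 20, 30))  # True
```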

Choosing a graph – advantages/disadvantages

Dotplots

Box&Whisker

Stem & leaf

Histogram

Ogives (cumulative frequency)

Contingency Table

Normal Models


Unit I (Chapters 1-6) Calculator Stuff
Put values in lists
Create: Histogram
Do 1-VarStats – find: mean, standard deviation (which one to use?), 5-number summary
Normalcdf(low z, high z)
InvNorm(area to LEFT of cut point)
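For checking calculator answers, the two normal commands can be mimicked with Python's standard library. This is only a sketch for the standard normal model (mean 0, SD 1); the function names `normalcdf` and `inv_norm` are made up here to mirror the calculator commands.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal model: mean 0, standard deviation 1

def normalcdf(low_z, high_z):
    """Area under the standard normal curve between two z-scores."""
    return Z.cdf(high_z) - Z.cdf(low_z)

def inv_norm(area_left):
    """z-score that has the given area to its LEFT."""
    return Z.inv_cdf(area_left)

print(round(normalcdf(-1, 1), 4))  # about 68% of the area is within 1 SD
print(round(inv_norm(0.975), 2))   # cut point with 97.5% of the area to its left
```

If a question gives the area to the RIGHT of the cut point, subtract it from 1 before passing it to `inv_norm`.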

Chapters 1-6 I can do by hand:
Use a 5-number summary to create a boxplot
Find outliers using the 1.5*IQR rule
Use a boxplot to create a 5-number summary
Create & interpret a stem & leaf plot

Hot Tips
Know how the mean follows the skewness, but the median doesn’t.
Be ready to crank out the outlier test given only Q1 and Q3.
Compare shapes, compare centers (using mean or median), and compare spreads (using standard deviation or IQR). Use context.

Remember, the y-axis on a histogram shows frequency, not data.

If you are going to discuss how unusual a data point is, use IQR or standard deviation to compare it to the center.

Know how to use InvNorm – you are finding the z-score for the area to the LEFT of your cut point.

Unit I Key Problems
Chapter 3 #5, 15
Chapter 4 #5, 15, 19, 29
Chapter 5 #13, 23 (outlier test for b!), 25, 29, 31

Unit I (Chapters 1-6) Vocab
Categorical variable, Histogram, Boxplot
Quantitative variable, Stemplot, Dotplot
Pie Chart, Relative Frequency, Frequency Table
Marginal Distribution, Conditional Distribution, Modified Boxplot
Bar Chart, Cumulative Freq Plot (Ogive), Skewed Left/Right
Uniform, Unimodal, Bimodal
5-number summary, IQR
Quartile(s), Variance, Range

Unit II Review

Chapters 7-10
Scatterplots and Regression

Key Concepts
Describe a scatterplot IN CONTEXT – SUDS (Shape, Unusual features, Direction, Strength). Use r if you have it.
Be able to interpret a regression given a computer printout
Interpret in context:
Slope
Y-intercept
R^2 (coefficient of determination)
Correlation coefficient (r)
S (standard deviation of residuals)
Find a residual and interpret its meaning

More Key Concepts
Outliers and influential points
Non-resistance of r and LSRL
Why we call an LSRL an LSRL
The importance of residual plots – what do they tell us?
Using logs, ln’s, etc. to linearize
Be careful with wording!

SUDS

Computer Output
Regression Analysis: IQ versus Time in KY (in years)

Predictor     Coef   SE Coef      T      P
Constant   129.092     5.996  21.53  0.000
Time        -5.196     1.146  -4.54  0.001

S = 13.1089   R-Sq = 69.6%   R-Sq(adj) = 66.2%

Analysis of Variance
Source           DF      SS      MS      F      P
Regression        1  3536.0  3536.0  20.58  0.001
Residual Error    9  1546.6   171.8
Total            10  5082.5
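Reading this printout, the LSRL is predicted IQ = 129.092 - 5.196(time in KY, in years). The sketch below uses those coefficients to make a prediction, compute a residual, and recover r from R-Sq. The observed point (4 years, IQ of 100) is hypothetical and not from the data set; the function names are made up for illustration.

```python
import math

# Coefficients read from the computer output above
intercept = 129.092   # "Constant" row
slope = -5.196        # "Time" row
r_squared = 0.696     # R-Sq = 69.6%

def predict_iq(years):
    """Predicted IQ from the LSRL for a given time in KY (years)."""
    return intercept + slope * years

def residual(observed_iq, years):
    """Residual = observed - predicted."""
    return observed_iq - predict_iq(years)

# r is the square root of R^2, with the same sign as the slope
r = math.copysign(math.sqrt(r_squared), slope)

# Hypothetical observation: 4 years in KY, observed IQ of 100
print(round(predict_iq(4), 3))     # 108.308
print(round(residual(100, 4), 3))  # -8.308 (the model over-predicts this point)
print(round(r, 3))                 # -0.834
```

A negative residual means the actual value is below the line's prediction. The negative r matches the negative slope.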

Residuals and why LSRL

Why Residual Plots are important

Outliers, resistance of r and LSRL

Re-expressing data
Know how to work with something like: log(y-hat) = 2.3 log(x) + 4
You won’t have to figure out how to re-express
Know how to interpret R^2 for the above equation (say R^2 = 85%)
Be able to look at residual plots of multiple re-expressions and determine which is the best.
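Working with the example equation means un-doing the log to get back to y-hat. The sketch below assumes "log" is base 10, so y-hat = 10^(2.3 log(x) + 4); if the model used ln instead, the base would be e. The function name `predict_y` is made up for illustration.

```python
import math

def predict_y(x):
    """Un-do the re-expression log(y-hat) = 2.3*log(x) + 4 (base 10 assumed)."""
    log_y_hat = 2.3 * math.log10(x) + 4   # the prediction is for log(y), not y
    return 10 ** log_y_hat                # raise 10 to that power to get y-hat

print(predict_y(1))    # log(1) = 0, so y-hat = 10^4
print(predict_y(10))   # log(10) = 1, so y-hat = 10^6.3
```

For R^2 = 85% on this model, the interpretation is about the re-expressed variables: 85% of the variation in log(y) is accounted for by the linear model with log(x).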

Unit II Calculator Stuff
LinReg – gets RESID list
Enter data and find equation of LSRL, r, R^2
Create scatterplot and residual plot

Hot Tips
Computing a residual from a point and the LSRL is very common.
The list of stuff to interpret in context is common, too.
Un-doing a transformed LSRL (Chapter 10) should be easy (Ch. 10 #1).
Make sure you don’t just write x and y for an equation. Define them in context.
It is highly doubtful you will need to find the LSRL or the residual plot on your calculator. It is essential that you can read the LSRL from computer output and interpret a given residual plot.
Don’t forget that r not only tells you the strength of the linear relationship, it also tells you whether it’s positive or negative. Make sure to include that fact in any interpretation of r.

Unit II Key Problems
Chapter 7 #1, 5, 11, 17
Chapter 8 #5, 7, 9, 35
Chapter 9 #1, 11
Chapter 10 #2

Good to REALLY make sure you have it down:
Chapter 7 #9 (tricky like an AP question)
Chapter 8 #1ab
Chapter 10 #1