Statistics – Ch 3.1 Notes Name: __________________________ Describing Relationships
Explanatory and Response Variables
A _____________________________ variable measures an outcome of a study.
An ____________________________ variable may help explain or influence changes in a response variable.
Note: In many studies, the goal is to show that changes in one or more explanatory variables actually cause changes in a response variable. However, other explanatory-response relationships don’t involve direct causation.
Independent Practice: Julie asks, “Can I predict a state’s mean SAT Math score if I know its mean SAT Critical Reading score?” Jim wants to know how the mean SAT Math and Critical Reading scores this year in the 50 states are related to each other. For each student, identify the explanatory variable and the response variable, if possible.
Displaying Relationships: Scatterplots
A scatterplot shows the relationship between two ____________________________ variables measured on the same individuals. The values of one variable appear on the horizontal axis, and the values of the other variable appear on the vertical axis. Each individual in the data appears as a point on the graph.
How to Make a Scatterplot
1. Decide which variable should go on each axis.
   • Remember, the eXplanatory variable goes on the X-axis!
2. Label and scale your axes.
3. Plot individual data values.
How to Examine a Scatterplot
As in any graph of data, look for the overall pattern and for striking departures from that pattern.
• You can describe the overall pattern of a scatterplot by the _________________, _________________ and ____________________ of the relationship.
• An important kind of departure is an ____________________, an individual value that falls outside the overall pattern of the relationship.
Key
response (dependent)
explanatory (independent)
Julie: mean SAT Critical Reading score is the explanatory variable and mean SAT Math score is the response variable.
Jim: he is simply interested in exploring the relationship between the two variables. There is no clear explanatory and response variable.
quantitative
direction, form, strength
outlier
Describing Scatterplots
Two variables have a ______________________ association when above-average values of one tend to accompany above-average values of the other and when below-average values also tend to occur together.
Two variables have a _________________________ association when above-average values of one tend to accompany below-average values of the other.
Describe the Scatterplot:
Independent Practice:
The table shows the number of passing yards, rushing yards, and points scored in each of the 16 regular season games for the 2008 Detroit Lions. Make a scatterplot to show the relationship between the number of passing yards and points scored. Describe what you see.

Passing Yards   Rushing Yards   Points
246             70              21
262             49              25
124             116             13
131             54              7
112             100             10
249             77              21
217             57              17
280             53              23
153             103             14
186             130             22
99              108             20
131             23              10
191             77              16
233             90              21
126             129             7
220             96              21
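The definition of positive association can be checked directly on the Lions data. The following is a minimal Python sketch, not part of the original notes: it counts the games in which passing yards and points fall on the same side of their means. The names `games`, `same_side`, and `opposite_side` are illustrative choices.

```python
from statistics import mean

# (passing yards, points) for the 16 games, copied from the table in these notes.
games = [
    (246, 21), (262, 25), (124, 13), (131, 7),
    (112, 10), (249, 21), (217, 17), (280, 23),
    (153, 14), (186, 22), (99, 20), (131, 10),
    (191, 16), (233, 21), (126, 7), (220, 21),
]

mean_yards = mean(x for x, _ in games)
mean_points = mean(y for _, y in games)

# A game supports a positive association when both values are above average
# or both are below average, i.e. the two deviations have the same sign.
same_side = sum(1 for x, y in games if (x - mean_yards) * (y - mean_points) > 0)

# A game works against it when one value is above average and the other below.
opposite_side = sum(1 for x, y in games if (x - mean_yards) * (y - mean_points) < 0)

print(f"mean passing yards = {mean_yards}, mean points = {mean_points}")
print(f"same side of both means: {same_side} games")
print(f"opposite sides of the means: {opposite_side} games")
```

Most games land on the same side of both means, which is what a positive association looks like when reduced to counts. The count says nothing about form or strength, so it supplements the scatterplot rather than replacing it.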
Key
positive
negative
There is a moderately strong, negative, curved relationship between the percent of students who take the SAT and the mean SAT Math score. There are two distinct clusters and two possible outliers.
There is a moderately strong, positive, linear association between number of passing yards and points scored. As passing yards increase, points scored increase. No points deviate greatly, but there is one game where the Lions only threw for 99 yards and scored 20 points.
Measuring Linear Association: Correlation
● A scatterplot displays the strength, direction, and form of the relationship between two quantitative variables.
● Linear relationships are important because a straight line is a simple pattern that is quite common. Unfortunately, our eyes are not good judges of how strong a linear relationship is.
Measuring Correlation
The ______________________ measures the direction and strength of the linear relationship between two quantitative variables.
• r is always a number between _____________________.
• ______________ indicates a positive association.
• ______________ indicates a negative association.
• Values of r near _____ indicate a very weak linear relationship.
• The strength of the linear relationship increases as r moves away from 0 towards ___________.
• The extreme values r = -1 and r = 1 occur only in the case of a perfect __________ relationship.
Correlation Examples
Key
correlation r
-1 and 1
r > 0
r < 0
0
-1 or 1
linear
Correlation Application
Practice Problem
The scatterplot of average points per game and number of wins for college football teams in the SEC is repeated to the right. For these data, r = 0.936.
1. Interpret the value of r in this context.
2. The point highlighted in red is Mississippi. What effect does Mississippi have on the correlation? Justify your answer.
Independent Practice
For each graph, estimate the correlation r. Interpret the value of r in context.
Key
The correlation of 0.936 confirms there is a strong, positive, linear relationship between points per game and wins in the SEC.
Mississippi makes the correlation stronger (closer to 1). If Mississippi were removed, the remaining points would not be as tightly clustered in a linear pattern.
r ≈ 0.9: strong, positive
r ≈ 0.5: moderate, positive
r ≈ -0.3: weak, negative
r ≈ 0.1: weak, positive
How to calculate correlation:
Suppose that we have data on variables x and y for n individuals. The values for the first individual are x1 and y1, the values for the second individual are x2 and y2, and so on. The means and standard deviations of the two variables are x-bar and sx for the x-values and y-bar and sy for the y-values. The correlation r between x and y is:
r = [1 / (n - 1)] × Σ [(xi - x-bar) / sx] × [(yi - y-bar) / sy]
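The calculation described above can be sketched in Python. This is an illustration rather than part of the original notes: it standardizes each x-value and y-value, multiplies the standardized pairs, adds the products, and divides by n - 1. The function name `correlation` is an assumption made for this example, and the data are the passing yards and points from the Lions table earlier in these notes.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """Sample correlation r: sum of products of standardized values over n - 1."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations (divide by n - 1)
    products = [((x - x_bar) / s_x) * ((y - y_bar) / s_y) for x, y in zip(xs, ys)]
    return sum(products) / (len(xs) - 1)

# Passing yards and points for the 16 games of the 2008 Detroit Lions.
passing_yards = [246, 262, 124, 131, 112, 249, 217, 280,
                 153, 186, 99, 131, 191, 233, 126, 220]
points = [21, 25, 13, 7, 10, 21, 17, 23, 14, 22, 20, 10, 16, 21, 7, 21]

r = correlation(passing_yards, points)
print(f"r = {r:.3f}")
```

The result is positive and well away from 0, in line with the description of the Lions scatterplot as a moderately strong positive linear association. Note that the sketch fails with a division by zero if every x-value (or every y-value) is identical, because the standard deviation is then 0.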
Facts about Correlation:
1. Correlation makes no distinction between explanatory and response variables.
2. Because r uses the standardized values of the observations, r does not change when we change the units of measurement of x, y, or both.
3. The correlation r itself has no unit of measurement.
Cautions about Correlation:
● correlation does not imply causation
● correlation requires that both variables be quantitative
● correlation does not describe curved relationships between variables, no matter how strong the relationship is
● a value of r close to -1 or 1 does not guarantee a linear relationship between two variables
● like mean and standard deviation, the correlation is not resistant: r is strongly affected by a few outlying observations
● correlation is not a complete summary of two-variable data
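Several of the facts and cautions above can be checked numerically. The following is a minimal Python sketch using made-up data (the values, the function name `correlation`, and the inches-to-centimeters conversion are all assumptions chosen for illustration, not taken from the notes).

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """Sample correlation r from standardized values (n - 1 in the denominator)."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)
    return sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
               for x, y in zip(xs, ys)) / (len(xs) - 1)

# Made-up, nearly linear data for illustration only.
x = [1, 2, 3, 4, 5, 6]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]

r_xy = correlation(x, y)

# Fact 1: swapping the roles of x and y leaves r unchanged.
r_yx = correlation(y, x)

# Fact 2: changing units (here, x from inches to centimeters) leaves r unchanged.
r_units = correlation([2.54 * v for v in x], y)

# Caution: a perfect curved relationship (y = x squared) can give r = 0,
# up to floating-point rounding.
x_sym = [-3, -2, -1, 0, 1, 2, 3]
r_curved = correlation(x_sym, [v ** 2 for v in x_sym])

# Caution: r is not resistant; adding one outlying observation changes it a lot.
r_outlier = correlation(x + [7], y + [0.0])

print(f"r(x, y)            = {r_xy:.4f}")
print(f"r(y, x)            = {r_yx:.4f}")
print(f"r after unit change = {r_units:.4f}")
print(f"r for y = x^2       = {r_curved:.4f}")
print(f"r with one outlier  = {r_outlier:.4f}")
```

The first three values agree, the curved example comes out at essentially 0 even though y is completely determined by x, and the single added point pulls r far below its original value.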