Inf. Stats

Two Variables

We often wish to compare two different variables.

Examples: results on different tests, age and ability, education (in years) and income, speed and accuracy, ...

Methods to compare two (or more) variables:
  – correlation coefficient
  – regression analysis

Nota bene! Both methods apply to numeric variables.

Background

Terminology: we speak of CASES, e.g., Joe, Sam, ..., and VARIABLES, e.g., height ($h$) and weight ($w$).

Then each variable has a VALUE for each case: $h_j$ is Joe's height, and $w_s$ is Sam's weight.

We compare two variables by comparing their values for a set of cases: $h_j$ vs. $w_j$, $h_s$ vs. $w_s$, etc.

Tabular Presentation

Example: Hoppenbrouwers measured pronunciation differences among pairs of dialects. We compare these to the geographic distance between the places where they are spoken.

Dialect Pair            Phon. Dist.   Geo. Dist.
Almelo/Haarlem
Almelo/Kerkrade
Almelo/Makkum
Almelo/Roodeschool
Almelo/Soest
Haarlem/Kerkrade
. . .                   . . .         . . .
Kerkrade/Soest
Makkum/Roodeschool
Makkum/Soest
Roodeschool/Soest
(numeric entries are illegible in the source)

Two variables (phonetic and geographic distance) and 15 cases (here, each pair is a separate CASE).

Scatterplots

One useful technique is to visualize the relation by graphing it.

[Figure: scatterplot with GEO_DST on the x-axis (0 to 400) and PHON_DST on the y-axis (.5 to 1.3)]

Scatterplots

Each dot is a case, whose $x$-value is geo. distance, and whose $y$-value is phon. distance.

[Figure: scatterplot with GEO_DST on the x-axis (0 to 400) and PHON_DST on the y-axis (.5 to 1.3)]

In general, we use the $x$-axis for INDEPENDENT variables, and the $y$-axis for (potentially) DEPENDENT ones. We don't know whether phon. distance depends on geo. distance, but it might (while the reverse is implausible).

Least Squares Regression

The simplest form of dependence is LINEAR: the independent variable determines a portion of the dependent value.

We can visualize this as fitting a straight line to the scatterplot.

[Figure: scatterplot with GEO_DST on the x-axis (0 to 400) and PHON_DST on the y-axis (.5 to 1.3)]

Least Squares Regression

[Figure: scatterplot with GEO_DST on the x-axis (0 to 400) and PHON_DST on the y-axis (.5 to 1.3)]

Like every straight line, this has an equation of the form:

$y = a + bx$

$a$ is the point where the line crosses the $y$-axis, the $y$-INTERCEPT, and $b$ the SLOPE.

Predicted vs. Observed Values

The independent variable determines the dependent value (somewhat); this is the predicted value $\hat{y}$, the value on the line.

Note also the actual $y$, the data dot, which is not always the same.

[Figure: scatterplot with GEO_DST on the x-axis (0 to 400) and PHON_DST on the y-axis (.5 to 1.3)]

Residuals

The difference between actual and predicted values, $(y_i - \hat{y}_i)$, is the RESIDUAL: what the linear model does not predict. It is the vertical distance between the dot and the line.

LEAST-SQUARES REGRESSION finds the line which minimizes the squared residuals, summed over all the data:

$\sum_i (y_i - \hat{y}_i)^2$

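The criterion can be sketched in a few lines of Python. This is only an illustration, not part of the original slides: the three data points, the two candidate lines, and the helper names `residuals` and `sum_squared_residuals` are assumptions made for the example. The residual is taken as observed minus predicted.

```python
# Illustrative data (assumed): geographic distance x, phonetic distance y.
xs = [100.0, 200.0, 300.0]
ys = [0.58, 1.18, 1.27]

def residuals(a, b, xs, ys):
    """Residuals y_i - (a + b*x_i) for the line with intercept a and slope b."""
    return [y - (a + b * x) for x, y in zip(xs, ys)]

def sum_squared_residuals(a, b, xs, ys):
    """The quantity that least-squares regression minimizes."""
    return sum(e * e for e in residuals(a, b, xs, ys))

# Two hypothetical candidate lines: same slope, different intercepts.
sse_line_1 = sum_squared_residuals(0.32, 0.00345, xs, ys)
sse_line_2 = sum_squared_residuals(0.235, 0.00345, xs, ys)

print(f"line 1: {sse_line_1:.5f}")
print(f"line 2: {sse_line_2:.5f}")
```

Of the two candidates, least squares prefers the one with the smaller sum. Line 2 passes exactly through two of the three dots, yet its sum of squared residuals is larger.
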
SPSS Regression

Least-squares regression finds the best straight line which models the data (minimizes the squared error).

    * * MULTIPLE REGRESSION * *
    Equation Number 1    Dependent Variable..  PHON_DST
    Block Number 1.  Method: Enter    GEO_DST
    Analysis of Variance [ignore!]
    ----------- Variables in the Equation -------------
    Variable          B           SE B
    GEO_DST        .001631     5.1714E-04
    (Constant)     .649778     .104898

$y = 0.65 + 0.0016x$

Residuals

Regression finds the best line, but is sensitive to extreme values. Examine residuals.

[Figure: "Residuals of Least Squares Regression"; x-axis GEO_DST (0 to 400), y-axis Residual (-.3 to .3)]

SPSS Plot of Residuals

[Figure: "Residuals of Least Squares Regression"; x-axis GEO_DST (0 to 400), y-axis Residual (-.3 to .3)]

Save residuals as a new variable, then graph them vs. the original $x$ value.

Watch out for extreme $x$ values: they are influential, though the residual may be small. See example 2.12 in Moore and McCabe.

Also examine OUTLIERS: large residuals.

Least Squares Regression (optional)

How does regression work? We express the squared residuals as a function of the line. This is a function in two variables: $a$, the intercept, and $b$, the slope.

$f(a, b) = \sum_i (y_i - \hat{y}_i)^2$
$= \sum_i (y_i - (a + b x_i))^2$
$= \sum_i (y_i - a - b x_i)^2$
$= \sum_i (a^2 + 2ab x_i - 2a y_i + b^2 x_i^2 - 2b x_i y_i + y_i^2)$

To minimize this function, find where its derivative $f' = 0$.

Least Squares Regression

$f(a, b) = \sum_i (a^2 + 2ab x_i - 2a y_i + b^2 x_i^2 - 2b x_i y_i + y_i^2)$

To minimize a function in two variables, look at the partial derivatives in $a$ and $b$:

$\frac{\partial f}{\partial a}(a, b) = \sum_i (2a + 2b x_i - 2 y_i)$

$\frac{\partial f}{\partial b}(a, b) = \sum_i (2a x_i + 2b x_i^2 - 2 x_i y_i)$

We then set each partial derivative to zero, and solve (the pair of linear equations).

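Setting both partial derivatives to zero gives a pair of linear equations whose general solution can be worked out once. The following is a sketch of that step; $n$ (the number of cases) and $\bar{x}$, $\bar{y}$ (the means of the two variables) are notation introduced here for convenience.

```latex
\begin{align*}
0 &= \sum_i (2a + 2b x_i - 2 y_i)
   = 2na + 2b \sum_i x_i - 2 \sum_i y_i
  &&\Rightarrow\quad a = \bar{y} - b\,\bar{x} \\
0 &= \sum_i (2a x_i + 2b x_i^2 - 2 x_i y_i)
  &&\Rightarrow\quad a \sum_i x_i + b \sum_i x_i^2 = \sum_i x_i y_i \\
b &= \frac{\sum_i x_i y_i - n\,\bar{x}\,\bar{y}}{\sum_i x_i^2 - n\,\bar{x}^2}
\end{align*}
```

The last line follows by substituting $a = \bar{y} - b\,\bar{x}$ into the second equation and using $\sum_i x_i = n\,\bar{x}$.
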
Regression: Tiny Example

Dialect Pair            Phon. Dist.   Geo. Dist.
Almelo/Haarlem          0.58          100
Almelo/Kerkrade         1.18          200
Kerkrade/Roodeschool    1.27          300

$\frac{\partial f}{\partial a}(a, b) = \sum_i (2a + 2b x_i - 2 y_i)$
$= 2a + 2b(100) - 2(0.58) + 2a + 2b(200) - 2(1.18) + 2a + 2b(300) - 2(1.27)$
$= 6a + 1200b - 6.06$

$\frac{\partial f}{\partial b}(a, b) = \sum_i (2a x_i + 2b x_i^2 - 2 x_i y_i)$
$= 2a(100) + 2b(100)^2 - 2(100)(0.58) + 2a(200) + 2b(200)^2 - 2(200)(1.18) + 2a(300) + 2b(300)^2 - 2(300)(1.27)$
$= 1200a + 280{,}000b - 1350$

Regression: Tiny Example

Now we solve these two linear equations (set to zero).

$0 = 6a + 1200b - 6.06$
$6a = 6.06 - 1200b$
$a = 1.01 - 200b$

$0 = 1200(1.01 - 200b) + 280{,}000b - 1350$
$= 1212 - 240{,}000b + 280{,}000b - 1350$
$40{,}000b = 1350 - 1212$
$b = 138/40{,}000 = 0.00345$
$a = 1.01 - 200(0.00345) = 0.32$

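The hand calculation can be checked mechanically. The sketch below solves the same two equations for the three cases of the tiny example; the function name `fit_line` and the use of Cramer's rule are choices made for this illustration, not something the slides prescribe.

```python
# The three cases of the tiny example: geo. distance x, phon. distance y.
xs = [100.0, 200.0, 300.0]
ys = [0.58, 1.18, 1.27]

def fit_line(xs, ys):
    """Solve df/da = 0 and df/db = 0 for the intercept a and slope b.

    Dividing both equations by 2 gives the linear system
        n*a      + sum(x)*b   = sum(y)
        sum(x)*a + sum(x^2)*b = sum(x*y)
    which is solved here by Cramer's rule.
    """
    n = len(xs)
    sx = sum(xs)
    sy = sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    det = n * sxx - sx * sx
    if det == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    a = (sy * sxx - sx * sxy) / det
    b = (n * sxy - sx * sy) / det
    return a, b

a, b = fit_line(xs, ys)
print(f"a = {a:.2f}, b = {b:.5f}")
```

Up to floating-point rounding, the result agrees with the values found by hand, $a = 0.32$ and $b = 0.00345$.
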
Example, Cont.

$y = 0.32 + 0.00345x$

[Figure: "Regression with 3 Cases"; x-axis GEO_DSTX (0 to 400), y-axis PH_DISTX (.5 to 1.3)]

Example, Cont.

    * * MULTIPLE REGRESSION * *
    Equation Number 1    Dependent Variable..  PH_DISTX
    Variable(s) Entered on Step Number 1..  GEO_DSTX
    ----------- Variables in the Equation ----------
    Variable          B           SE B
    GEO_DSTX       .003450     .001472
    (Constant)     .320000     .318041

Linear Regression

Asymmetric: appropriate when one variable might be "explained" by a second
  – Reaction time on basis of difficulty (negative!)
  – Child's ability on basis of parents'
  – etc.

No answer (yet) to: how well does $x$ explain $y$?

CORRELATION analysis provides answers.
  – Symmetric measure of the extent to which variables predict each other
  – Answer to: how well does $x$ explain $y$?

Regression and correlation are inappropriate when the "best line" is not straight (need transformations).

Correlation Coefficient

aka "Pearson's product-moment correlation"

$r_{xy} = \frac{1}{n-1} \sum_i \left(\frac{x_i - \bar{x}}{\sigma_x}\right) \left(\frac{y_i - \bar{y}}{\sigma_y}\right)$

reflects strength of relation
  $r = 0$: no correlation
  $r = 1$: perfect positive correlation
  $r = -1$: perfect negative correlation

no necessary dependence! shoe size and reading ability correlate: both are dependent on age

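A minimal sketch of the coefficient in Python follows. It assumes the standard deviations are sample standard deviations (divisor $n - 1$), which is the convention that goes with the $1/(n-1)$ factor; the function name `pearson_r` and the toy data are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson's product-moment correlation with the 1/(n-1) convention."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long samples with at least 2 cases")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sample standard deviations (divisor n - 1).
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("a constant variable has no defined correlation")
    total = sum(((x - mean_x) / sd_x) * ((y - mean_y) / sd_y)
                for x, y in zip(xs, ys))
    return total / (n - 1)

# Hypothetical toy data, chosen only to show the three reference values.
r_up = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])     # perfectly increasing line
r_down = pearson_r([1, 2, 3, 4], [8, 6, 4, 2])   # perfectly decreasing line
r_none = pearson_r([1, 2, 3, 4], [1, 3, 3, 1])   # no linear relation
print(r_up, r_down, r_none)
```

The three calls give values at (or within rounding error of) 1, -1 and 0 respectively. The last data set is far from random, which shows that $r = 0$ only rules out a linear relation.
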
Correlation Coefficient

    - - Correlation Coefficients - -

                  GEO_DST     PHON_DST
    GEO_DST       1.0000      .6584
                  (  15)      (  15)
                  P= .        P= .008
    PHON_DST      .6584       1.0000

Geographic and phonetic distance correlate at $r = 0.66$.

Correlation

$r_{xy} = \frac{1}{n-1} \sum_i \left(\frac{x_i - \bar{x}}{\sigma_x}\right) \left(\frac{y_i - \bar{y}}{\sigma_y}\right)$

alternative: $r_{xy} = \frac{1}{n-1} \sum_i z_{x_i} z_{y_i}$

$r$ is a "pure number": no units; insensitive to scale, percentages, ...
  – corr. w. temperature can ignore scale

symmetric: $r_{xy} = r_{yx}$

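Both properties can be checked numerically. The sketch below computes $r$ from $z$-scores and compares a temperature variable expressed in Celsius and in Fahrenheit; the readings, the `outcome` variable and the function names are hypothetical, and the sample standard deviation (divisor $n - 1$) is assumed.

```python
import math

def z_scores(values):
    """Standardize values: subtract the mean, divide by the sample SD."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]

def r_from_z(xs, ys):
    """Correlation as the sum of z-score products divided by n - 1."""
    zx, zy = z_scores(xs), z_scores(ys)
    return sum(p * q for p, q in zip(zx, zy)) / (len(xs) - 1)

# Hypothetical readings: temperature in Celsius and some numeric outcome.
celsius = [10.0, 15.0, 20.0, 25.0, 30.0]
outcome = [3.1, 3.9, 5.2, 5.8, 7.1]
fahrenheit = [c * 9 / 5 + 32 for c in celsius]

r_c = r_from_z(celsius, outcome)
r_f = r_from_z(fahrenheit, outcome)      # same data, different scale
r_swapped = r_from_z(outcome, celsius)   # roles of x and y exchanged
print(r_c, r_f, r_swapped)
```

All three values agree up to rounding: standardizing removes the unit and the origin of each variable, and the product of $z$-scores does not care which variable is called $x$.
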
Properties of Correlation

$r_{xy} = \frac{1}{n-1} \sum_i z_{x_i} z_{y_i}$

$r$ measures "clustering" relative to $\sigma_x$, $\sigma_y$

as $r \to 1$ (or $-1$), dots cluster near the regression line

careful when "eyeing" data
  – change in $\sigma$ affects apparent clustering
  – separate clusters lose in correlation
  – watch for nonlinear relations

Correlation/Regression

regression analysis, $r^2$: how much of $y$'s variance may be attributed to $x$?

nonsymmetric: $y$ analyzed as dependent on $x$

smoothed plot of $y$ averages (for $x$ groups)

always flatter than the SD line, the line with slope $\sigma_y/\sigma_x$ which passes through $(\bar{x}, \bar{y})$

regression line (Gauss): if $y = a + bx$ then $b = r\,\sigma_y/\sigma_x$

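The relation between the two slopes can be illustrated with a short Python sketch. The three data points and the helper name `summary` are assumptions made for the example; standard deviations use the divisor $n - 1$, though the ratio $\sigma_y/\sigma_x$ is the same with divisor $n$.

```python
import math

def summary(xs, ys):
    """Return means, SDs (divisor n - 1) and correlation r of two samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sd_x, sd_y = math.sqrt(sxx / (n - 1)), math.sqrt(syy / (n - 1))
    r = sxy / math.sqrt(sxx * syy)
    return mx, my, sd_x, sd_y, r

# Three (geo. distance, phon. distance) cases, used only as sample data.
xs = [100.0, 200.0, 300.0]
ys = [0.58, 1.18, 1.27]

mx, my, sd_x, sd_y, r = summary(xs, ys)
sd_line_slope = sd_y / sd_x   # slope of the SD line
b = r * sd_y / sd_x           # slope of the regression line
a = my - b * mx               # the regression line passes through the means
print(f"r = {r:.3f}, SD-line slope = {sd_line_slope:.5f}, "
      f"b = {b:.5f}, a = {a:.2f}")
```

Because $|r| < 1$ for these data, the regression slope $b$ is smaller in size than the SD-line slope, which is the "flatter" claim in numbers.
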
Interpretation of Correlation via Averages

Example: height, weight have corr. coeff. $r = 0.5$, with $\bar{h} = 178$ cm, $\bar{w} = 72$ kg, $\sigma_h = 6$ cm, $\sigma_w = 6$ kg

for each $\sigma_x$, there are $r\,\sigma_y$'s

what is the ave. weight of those 184 cm tall?

$184\ \text{cm} = 178 + 6\ \text{cm} = \bar{h} + \sigma_h$, so $z_h = 1$

ave. weight $= \bar{w} + r_{w,h}\, z_h\, \sigma_w = 72\ \text{kg} + 0.5 \cdot 6\ \text{kg} = 75\ \text{kg}$

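The same arithmetic as a small Python function, as a sketch: the function name `predicted_average` and its parameter names are hypothetical, while the numbers are those of the height/weight example.

```python
def predicted_average(x, mean_x, sd_x, mean_y, sd_y, r):
    """Average y among cases with the given x, by the regression method."""
    z_x = (x - mean_x) / sd_x        # how many SDs x lies from its mean
    return mean_y + r * z_x * sd_y   # move only r SDs of y per SD of x

# Figures from the height/weight example: r = 0.5, means 178 cm and 72 kg,
# SDs 6 cm and 6 kg.
weight = predicted_average(184, mean_x=178, sd_x=6, mean_y=72, sd_y=6, r=0.5)
print(weight)  # 75.0
```

Someone one SD above the mean height is predicted to be only half an SD above the mean weight, because $r = 0.5$.
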
Interpretation of Correlation via Averages

for each $\sigma_x$, there are $r\,\sigma_y$'s, where $r$ lies between 0 and 1

the average weight deviates less from its mean (in $z$ terms) than the height does, since $r < 1$

averages of correlated variables must "regress" toward the mean

Regression Fallacy

Height/weight example: in figuring weight for restricted groups, the weight deviation is less (in $z$ terms) than the height deviation.

Since $r < 1$, averages of corr. var. must "regress", but this is purely mathematical, not causal.

"Regression fallacy": seeing causation in regression
  – height correlation between parents and children ($r = 0.4$), but very tall parents have less tall children (still taller than ave.)
  – test-retest situations show extremes (high and low) closer to the mean on the second test
  – "the course showed no general improvement, but the worst students improved"

Correlation

measures strength of linear relation
symmetric: $r_{xy} = r_{yx}$
related to slope of regression line

Caution needed:
  – outliers: may reduce $r$
  – nonlinear association, e.g. intensity vs. loudness
  – "ecological correlations" use averages, rates; popular in politics, but overstate $r$ (based on individuals)
  – correlation $\neq$ causation; example: shoe size and reading ability

Regression Error

[Figure: data around a line, with labels "regression line" and "regression error"]

regr. error measures dispersion around the regr. line

regr. error can be calculated as a standard deviation (from the regression line), but also (shorter) as $\sqrt{1 - r^2}\;\sigma_y$

n.b. reg. error $\le \sigma_y$

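A sketch comparing the two calculations in Python. It assumes the divisor $n$ both for the root-mean-square of the residuals and for $\sigma_y$; with another divisor on one side only, the two numbers differ by a constant factor. The data and the function name `regression_error_two_ways` are hypothetical.

```python
import math

def regression_error_two_ways(xs, ys):
    """Return (rms of residuals, sqrt(1 - r^2) * SD of y), both with divisor n."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx                      # least-squares slope
    a = my - b * mx                    # least-squares intercept
    rms = math.sqrt(sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys)) / n)
    r = sxy / math.sqrt(sxx * syy)
    sd_y = math.sqrt(syy / n)
    return rms, math.sqrt(1 - r * r) * sd_y

# Hypothetical sample data.
xs = [100.0, 200.0, 300.0, 400.0]
ys = [0.58, 1.18, 1.27, 1.10]
direct, shortcut = regression_error_two_ways(xs, ys)
print(direct, shortcut)
```

The two numbers agree up to rounding, and neither can exceed $\sigma_y$ because $\sqrt{1 - r^2} \le 1$.
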
SPSS Plot of Regression Error

[Figure: scatterplot with GEO_DST on the x-axis (0 to 400) and PHON_DST on the y-axis (.5 to 1.3); Rsq = 0.4335]

Shows 2 standard errors around the regression line, where about 95% of the data should be found.

Next: Multiple Regression