SPSS/Excel Project 1
Running head: SPSS/EXCEL PROJECT
SPSS/Excel Project: Expenditure vs. Academic Performance
Hannah J. Anderson
Seattle Pacific University
EDU 6976 Interpreting & Applying Educational Research II
Autumn Quarter, 2009
This report is based on data from the Digest of Education Statistics. These data can be
used to determine whether relationships exist between various aspects of the educational system
and student achievement.
Part 1 Histograms, Box Plots & Frequency Distribution
The data collected is mainly from the years 2006 – 2007, and includes expenditure,
student/teacher ratio, salary, annual salary, percentage of students taking the SAT, average
reading, math, and writing SAT scores, region, number and percentage of students eligible for
free or reduced lunch, percent of students with disabilities, and total revenues.
First we will look at histograms for the variables, since a histogram lets us examine
the distribution of the data visually. The only categorical variable
of note is the region from which each state's data were collected. The
following table shows the frequency distribution for this categorical variable:
Region Count of States
1.00 13
2.00 12
3.00 17
4.00 9
Region 1 includes states from the West, region 2 includes states from the Midwest,
region 3 includes states from the South, and region 4 includes states from the East. It is
important to note the difference between the number of states in regions 3 and 4: region 3
contributes 8 more states to the data than region 4.
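The counts in the table above can be converted to relative frequencies with a few lines of Python. This is only an illustrative sketch: the report's own tables were produced in SPSS and Excel, and the names `region_counts` and `relative_frequencies` are chosen here for the example.

```python
# Region counts copied from the frequency table above.
region_counts = {1: 13, 2: 12, 3: 17, 4: 9}

# Number of reporting units across all four regions.
total_states = sum(region_counts.values())


def relative_frequencies(counts):
    """Return each category's share of the total as a percentage."""
    total = sum(counts.values())
    return {key: 100 * n / total for key, n in counts.items()}


shares = relative_frequencies(region_counts)
for region, n in region_counts.items():
    print(f"Region {region}: {n} states ({shares[region]:.1f}%)")

# Difference between the largest and smallest regions (regions 3 and 4).
largest_gap = region_counts[3] - region_counts[4]
```

Reporting shares alongside raw counts makes it easier to see how unevenly the regions are represented before any regional totals are compared.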
The following histograms and box plots cover each variable that was
reported. A histogram is a visual representation of the distribution of the reported
data, while a box plot shows the median, the interquartile range, and the upper and lower
whiskers, which together summarize the spread of the data.
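As a rough illustration of what a box plot reports, the sketch below computes the median, quartiles, interquartile range, whiskers, and outliers for a short list of numbers. It assumes the common rule that whiskers extend to the most extreme values within 1.5 interquartile ranges of the box; SPSS and Excel may use slightly different quartile conventions, and the sample values are hypothetical, not taken from the Digest data.

```python
import statistics


def box_plot_summary(values):
    """Median, quartiles, IQR, whiskers and outliers for a list of numbers."""
    data = sorted(values)
    # "inclusive" treats the data as the whole population of interest.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    inside = [v for v in data if lower_fence <= v <= upper_fence]
    return {
        "median": median,
        "q1": q1,
        "q3": q3,
        "iqr": iqr,
        "lower_whisker": min(inside),
        "upper_whisker": max(inside),
        "outliers": [v for v in data if v < lower_fence or v > upper_fence],
    }


# Hypothetical values for illustration only.
sample = [12, 13, 14, 15, 15, 16, 17, 18, 30]
summary = box_plot_summary(sample)
print(summary)
```

In this hypothetical sample the value 30 lies beyond the upper fence, so it is listed as an outlier and the upper whisker stops at 18.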
Part I: Histograms and Boxplots
Figure 1.1
Figure 1.2: Annual Salary
As can be seen in Figures 1.1 and 1.2, the median of all salaries reported was
$45,575, with the highest reported salary being $61,372 and the lowest being
$35,607. The largest number of salaries fell in the $40,000 to $45,000 range, while the
fewest fell in the $60,000 to $65,000 range. The box plot also shows that the
salaries above the median span a wider range than those below it.
[Figure 1.1 histogram: Frequency - Annual Salary (per $1,000)]
Figure 2.1
Figure 2.2: Total Revenues 2005 – 2006
Figures 2.1 and 2.2 represent the total revenues reported by each state for 2005 – 2006.
As can be seen, the median total revenue was $6,346,033, with the lowest reported
revenue being $958,109 and the highest total revenue within the
whiskers of the box plot in Figure 2.2 being $22,799,624. There are three
outliers in this variable: Texas at $39,691,436, New York at
$46,776,452, and California at $63,785,872. The frequency distribution shows
that the most common range of reported revenue was between $1,000,000 and
$5,000,000. If the three outliers are removed from the reported data,
the median falls to $5,668,758, so including these
three outliers raises the median revenue by roughly
$677,000.
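The effect of outliers on a median can be checked by computing the median twice, once with and once without the flagged values. The sketch below does this for a hypothetical list of revenues (not the Digest figures) and then repeats the subtraction for the two medians reported above; the function name `median_shift` is an assumption made for this example.

```python
import statistics


def median_shift(values, outliers):
    """Return (median of all values, median without outliers, difference)."""
    kept = [v for v in values if v not in outliers]
    full = statistics.median(values)
    trimmed = statistics.median(kept)
    return full, trimmed, full - trimmed


# Hypothetical revenues in millions of dollars, for illustration only.
revenues = [1, 2, 3, 4, 5, 6, 7, 40, 47, 64]
full, trimmed, shift = median_shift(revenues, outliers={40, 47, 64})
print(full, trimmed, shift)

# The same subtraction applied to the two medians reported in the text.
reported_shift = 6_346_033 - 5_668_758
print(reported_shift)
```

Because the median depends only on rank order, removing three very large values moves it far less than it would move the mean.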
[Figure 2.1 histogram: Frequency - Total Revenues (per 1,000,000)]
Figures 3.1 and 3.2 (histograms): Frequency - Pupil/Student Ratio - Fall 2005 and Fall 2006
Figure 3.3: Pupil/Student Ratio 2005
Figure 3.4: Pupil/Student Ratio 2006
Figures 3.1, 3.2, 3.3 and 3.4 represent the student-to-teacher ratio reported by each
state. As can be seen in the above figures, the median is the same for both years, at 15
students per teacher in both 2005 and 2006. Yet in the fall of 2005 the most common ratio
was 14 – 15 students per teacher, while in the fall of 2006 the 13 – 14 range
was the most common, as can be seen in the two frequency
distributions. In both years the same three states are
outliers: Utah at 22.10, and Arizona and California, averaging ratios of
20.75 and 20.85 students per teacher respectively.
Figure 4.1: Frequency - IDEA - Percent of students with disabilities
Figure 4.2: Percent of students with disabilities
Figures 4.1 and 4.2 represent the reported percentage of students with disabilities for
2006 – 2007. From the given data it can be determined that the median percentage is 14.3%,
with the upper whisker at 19.9% and the lower whisker at 10.5%. The frequency
distribution shows that the largest number of states fall within the 14 – 15% range, with the
15 – 16% range being the second most common.
Figure 5.1: Frequency - Percent of Students Eligible for Free/Reduced Lunch
Figure 5.2: Percent of students who are eligible for free or reduced-price lunch
Figures 5.1 and 5.2 represent the percent of students in elementary and secondary schools
who are eligible for free or reduced-price lunch. From Figure 5.2 it can be determined that the median
percentage of students is 37.35%, with the upper whisker at 67.5% and the lower whisker
at 17.7%. From Figure 5.1 it can be seen that the largest number of states reported a
percentage in the 30 – 35% range, with thirteen states reporting in this range.
Figure 6.1: Frequency - Students Eligible for Free/Reduced Lunch (per 1,000)
Figure 6.2: Total number of students who are eligible for free or reduced-price lunch
Figures 6.1 and 6.2 represent the total number of students in elementary and secondary
schools who are eligible for free or reduced-price lunch as reported by each state. From these
graphs it can be determined that the median number of students is 271,839, with the lowest
number reported being 24,467 students and the highest being
3,042,713. There are four outliers above the upper whisker in Figure 6.2:
New York reporting 1,179,269 students, Florida reporting 1,179,269 students, Texas reporting
2,126,815 students, and California reporting 3,042,713 students. If these four outliers are
removed from the reported data, the median number of students decreases from 271,839 to 266,179
students, a difference of 5,660 students.
Figures 7.1 and 7.2 (histograms): Frequency - Enrollment 2005 and Enrollment 2006 - Per 100,000
Figure 7.3: Enrollment 2005 – Per 100,000
Figure 7.4: Enrollment 2006 – Per 100,000
Figures 7.1, 7.2, 7.3 and 7.4 represent total student enrollment reported by each state
for the years 2005 and 2006. Figures 7.1 and 7.2 show that the distribution
is positively skewed, meaning that most states cluster at the low
end of the scale with a long tail of high values. The frequency distributions show that the
most commonly reported enrollment was in the 100,000 to 500,000 student
range. Figure 7.3 shows that the median total enrollment for 2005 was 654,526, and
Figure 7.4 shows that the median total enrollment for 2006 was 675,851,
an increase of 21,325 students from 2005 to 2006. There are four
outliers in both 2005 and 2006: Florida, Illinois, Texas and California.
California had the largest reported total enrollment in both years, averaging
6,422,011 students across 2005 and 2006.
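Positive skew can also be checked numerically rather than read off a histogram. The sketch below computes a moment-based skewness coefficient and compares the mean with the median; both checks point the same way when a few large values stretch the right tail. The enrollment values are hypothetical, and `sample_skewness` is a name chosen for this example.

```python
import statistics


def sample_skewness(values):
    """Moment-based skewness: positive when the right tail is longer."""
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5


# Hypothetical enrollments in units of 100,000 students.
enrollment = [1, 1, 2, 2, 3, 4, 5, 9, 20, 63]
skew = sample_skewness(enrollment)
mean_above_median = statistics.fmean(enrollment) > statistics.median(enrollment)
print(round(skew, 2), mean_above_median)
```

A mean well above the median, as in this hypothetical list, is the same pattern the enrollment histograms show: many small states and a handful of very large ones.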
Figure 8.1: Frequency - Average Reading SAT Scores
Figure 8.2: Frequency - Average Writing SAT Scores
Figure 8.3: Frequency - Average Math SAT Scores
Figure 8.4: Average Verbal SAT Scores
Figure 8.5: Average Writing SAT Scores
Figure 8.6: Average Math SAT Scores
Figures 8.1 through 8.6 represent the average SAT scores reported by each state in each of the
three areas: verbal, math, and writing. Figures 8.1 and 8.4 show average verbal
SAT scores. The frequency distribution shows that the most common average score
reported was in the 480 – 500 range. Figure 8.4 shows that the median verbal score
reported was 523, with the upper whisker at 610 and the lower whisker at 482. The
interquartile range was 71 for the given data. There are no outliers in
any of the three reported areas of SAT scores; all of the state averages fall within
the whiskers.
Figures 8.2 and 8.5 represent the average writing SAT scores reported by the various
states. From this data we can see that the median reported score was 511, with the upper whisker
at 591 and the lower whisker at 472. The interquartile range for Figure 8.5 was 73.5,
only 2.5 points larger than the interquartile range for Figure 8.4.
Figures 8.3 and 8.6 represent the average math scores reported by the various states.
From this data it can be seen that the median reported average math score was 529, with the
upper whisker at 617 and the lower whisker at 472. Figure 8.3 also shows
that the most frequently occurring score was in the range of 490 to 510. The
interquartile range for Figure 8.6 was 59 points. This was the smallest interquartile range
among the three SAT areas reported: 14.5 points smaller than that of the average writing scores
and 12 points smaller than that of the average verbal scores.
Figure 9.1: Frequency - Percent of graduates taking the SAT
Figure 9.2: Percent of graduates taking the SAT
Figures 9.1 and 9.2 represent the percentage of graduates taking the SAT in 2006 –
2007. Figure 9.1 shows that the most common percentage of graduates
taking the SAT was in the 0 – 10% range. The distribution of percentages is
positively skewed, since a large share of the reported percentages fall at the low
end. Figure 9.2 shows that the median percentage was 32%, with
the upper whisker at 100% and the lower whisker at 3%. Figure 9.2 also shows that the
interquartile range is 60.5 percentage points.
Figure 10.1: Frequency - School Expenditure
Figure 10.2: School Expenditure per Student
Figures 10.1 and 10.2 represent data collected from each state on school expenditure per
student. Figure 10.1 shows that the most common expenditure per
student was in the $9,000 to $11,000 range. Figure 10.2 shows that the median
amount spent per student was $9,805, with the upper whisker at $14,277 and the
lower whisker at $482.00. The interquartile range for this box plot was $2,787, and there
were three outliers: New Jersey at $15,759
per student, New York at $16,511 per student, and D.C. at $18,339. If these three outliers were
removed, the median would decrease to $9,506, a difference of almost $300.
Part II: Differences in Regions
In this next section, we will discuss the differences that were found between the
regions in this study. Each figure summarizes the findings for one variable
by region.
Region Expense/Pupil % of Total
1 $120,184 23%
2 $118,865 23%
3 $165,255 31%
4 $122,413 23%
Total $526,717
Figure 11.1
Figure 11.1 represents the per-pupil expense reported by the states in each region. As can be seen in
this figure, Region 3 represented 31% of the total expense per pupil, which is 8 percentage points more than any
other region.
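The percentages in Figure 11.1 can be reproduced from the dollar amounts, and the same figures can be divided by the number of states in each region as a cross-check. The sketch below assumes, and this is only an assumption, that each regional amount is a sum of state-level per-pupil figures; the report does not say how the regional values were built. If that assumption holds, the larger share for Region 3 partly reflects that it contains 17 states.

```python
# Figures copied from the region frequency table and Figure 11.1.
states_per_region = {1: 13, 2: 12, 3: 17, 4: 9}
expense_per_region = {1: 120_184, 2: 118_865, 3: 165_255, 4: 122_413}

total = sum(expense_per_region.values())

# Share of the total, rounded to whole percentages as in Figure 11.1.
share = {r: round(100 * v / total) for r, v in expense_per_region.items()}

# Assumption: each regional amount is a sum over that region's states,
# so dividing by the state count gives an average per-pupil expense.
per_state_average = {
    r: expense_per_region[r] / states_per_region[r] for r in expense_per_region
}

for r in sorted(expense_per_region):
    print(f"Region {r}: {share[r]}% of total, "
          f"about ${per_state_average[r]:,.0f} per state")
```

Comparing the two columns shows why a share of a total and an average per state can rank the regions differently, so conclusions about which region spends the most per pupil depend on which of the two is meant.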
Teacher/Pupil Ratio
Region 2005 2006
1 18 18
2 15 15
3 15 15
4 13 13
Total 15 15
Figure 11.2
Figure 11.2 represents the pupil-to-teacher ratio in both 2005 and 2006. It should be
noted that no regional median changed between the two years. As can be seen, Region 1 had the highest
median ratio, which was 18 students/teacher, and Region 4 had the lowest median ratio, which
was 13 students/teacher. From Figure 11.1 and Figure 11.2 one can note that
Region 1 has the highest pupil/teacher ratio and the second lowest expense/pupil.
Region Avg. Salary
1 $47,223
2 $46,313
3 $45,717
4 $53,865
Figure 11.3
Figure 11.3 represents the average reported teacher salary for each region. As can be
seen, Region 4 has the highest reported average salary at $53,865, with Region 3 having the
lowest average salary at $45,717. This is notable
because Region 3 has the lowest average salary yet spends the most on each pupil, as can be seen in
Figure 11.1. This suggests that higher expense per pupil does not necessarily raise teachers’
salaries. To draw a sound conclusion about why Region 4 has the highest
salary by over $5,000, one would need the average cost of living for each region. If this
showed that Region 4 had the highest cost of living, as national figures may show, then it
would make sense that it would also have the highest average salary.
Region % Taking SAT
1 33%
2 13%
3 40%
4 81%
Figure 11.4
Figure 11.4 represents the percentage of students taking the SAT in each region. As can
be seen, Region 4 has the highest percentage by over 40 percentage points, while Region 2 has the
lowest at 13%. To draw a sound conclusion from this data one would
need to know the percentage of schools requiring the SAT in each region. One possible explanation
for the high percentage in Region 4 is that many colleges and
universities on the East Coast require SAT scores for admission, while many
colleges and universities in the Midwest do not, which may explain the stark contrast
between the regions.
Region Average of reading Average of math Average of writing
1 529 535 515
2 577 587 564
3 527 527 521
4 504 512 497
Figure 11.5
Figure 11.5 represents the average score on each of the three parts of the SAT. As
can be seen, Region 2 had the highest average score in all three areas, and Region 4 had the
lowest average score in all three areas. One practical explanation of this finding is that
the averages for Region 2 came from only the 13% of students who took the SAT, as seen in
Figure 11.4, while the averages for Region 4 came from 81% of the
students in that region. It therefore seems reasonable to attribute the lower
averages for Region 4 to the much larger share of students contributing to those
averages.
Region SES %
1 40%
2 34%
3 49%
4 30%
Figure 11.6
Figure 11.6 represents the percentage of total students on free or reduced lunch in each
region. From this representation one can see that Region 3 had the highest percentage of
students on free/reduced lunch at 49%, while Region 4 had the lowest percentage at
30%. It is interesting to note that Region 3 spends the most money per
student but also has the highest percentage of students on free and reduced lunch. It might therefore be
that some of the expenditure per student goes towards the free and reduced lunch
program in Region 3.
Region IDEA %
1 12.4%
2 14.9%
3 14.0%
4 16.3%
Figure 11.7
Figure 11.7 represents the percentage of students with disabilities in each region.
As can be seen, Region 4 had the highest percentage of students with disabilities at 16.3%. This
may have affected the scores reported for the SAT
sections, as well as the revenue and expenditure per student. Region 1 had the lowest percentage
of students with disabilities at 12.4%.
Region Avg. Revenues
1 $8,785,486
2 $9,680,181
3 $9,842,097
4 $13,661,646
Figure 11.8
Figure 11.8 represents the average total revenues reported for each region. As can be seen,
Region 4 had the highest average revenue, at $13,661,646, with Region 1 having the
lowest average revenue at $8,785,486. As mentioned above, the fact that Region 4 has the
highest percentage of students with disabilities may contribute to this, as may the
popularity of private and charter schools in this region. One reason for believing that
there could be a correlation is that Region 1 has both the lowest average revenue and the
lowest percentage of students with disabilities. However, data
would have to be found on how much extra revenue a school receives for students with disabilities.
Part III: Scatterplots & Regression Lines
Figure 12.1: Average SAT Score vs. Expenditure per student
Trendline: f(x) = −0.0181355421351082x + 1788.20192836819, R² = 0.164618003067412
Figure 12.1 shows the slope, intercept, and regression line for the relationship
between average SAT scores and expenditure per student. The negative slope of -0.01814 shows that
average SAT scores tend to fall as expenditure per student rises. A negative
relationship like this implies that states with higher SAT scores tended to spend less money
per student. With R² at 0.1646, only about 16% of the variation in SAT scores can be
explained by expenditure per student. Therefore, it can be concluded that expenditure per student
does not appear to be the most important variable for increasing SAT scores.
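For readers who want to see where a trendline's slope, intercept, and R² come from, the sketch below fits a least-squares line by hand. The data pairs are hypothetical and chosen only so the arithmetic is easy to follow; the report's own trendlines were produced by the spreadsheet's charting tool. The last lines apply the trendline reported in Figure 12.1 to the median expenditure of $9,805 from Figure 10.2.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit y = slope * x + intercept, with R squared."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)  # share of variance in y explained
    return slope, intercept, r_squared


# Hypothetical (expenditure, total SAT score) pairs for illustration only.
expenditure = [6000, 8000, 10000, 12000, 14000]
sat_total = [1700, 1600, 1650, 1500, 1550]
slope, intercept, r_squared = fit_line(expenditure, sat_total)
print(slope, intercept, r_squared)

# Score predicted by the Figure 12.1 trendline at the median expenditure.
predicted = -0.0181355421351082 * 9805 + 1788.20192836819
print(round(predicted, 1))
```

The slope gives the direction and size of the change in score per dollar, while R² gives the share of variation explained; the two answer different questions and should be read separately.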
Figure 12.2: Average SAT Score vs. Average Student/Teacher Ratio
Trendline: f(x) = −1.46623618031585x + 1623.1772508335, R² = 0.00108735884031996
Figure 12.2 shows the slope, intercept and regression line of the relationship between
average SAT scores and average student/teacher ratio for 2005 and 2006. The negative slope of -1.4662
indicates a negative relationship, implying that states with higher SAT scores tend to have
fewer students per teacher. Taking the square root of the coefficient of determination
gives a value of 0.033, meaning that there is only a very slight correlation between the two variables.
Because the correlation is so slight, the most that can be concluded is that the student/teacher ratio
plays a small part in SAT scores.
Figure 12.3: Average Salary vs. Expenditure per student
Figure 12.3 shows the slope, intercept, and regression line of the relationship between
average teacher salary and expenditure per student. The positive slope of 1.9127 shows that
the two variables rise together. This means that when salary goes up,
the expenditure per student increases as well. A coefficient of correlation of 0.6894 shows that
this is a moderately strong positive correlation between the two variables. Also,
R² = 0.4753 means that about 47.5% of the variation in average reported teacher salaries is explained by
expenditure per student.
Figure 12.3 trendline: f(x) = 1.91272568637067x + 27924.8600912137, R² = 0.475312632437971
Figure 12.4: Average Salary vs. Student/Teacher Ratio
Trendline: f(x) = 219.83332083361x + 44339.3361376102, R² = 0.00634468112316711
Figure 12.4 shows the slope, intercept, and regression line of the relationship between
average salary and student/teacher ratio. The coefficient of correlation of 0.0797 shows
that there is a weak positive correlation between these two variables. The positive slope of 219.83
means that as the student/teacher ratio goes up, average salary tends to
increase slightly as well. With R² at 0.0063, it can be concluded from this study that
average salary does not play an influential part in the student/teacher ratio.
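The correlation coefficients quoted in this part can be recovered from the trendline labels, since for a line with a single predictor the correlation has the same sign as the slope and a magnitude equal to the square root of R². The sketch below applies that relationship to the four trendlines in Figures 12.1 to 12.4; the dictionary keys and the function name are labels chosen for this example.

```python
import math

# (slope, R squared) pairs copied from the trendline labels in Part III.
models = {
    "12.1 SAT vs. expenditure": (-0.0181355421351082, 0.164618003067412),
    "12.2 SAT vs. student/teacher ratio": (-1.46623618031585, 0.00108735884031996),
    "12.3 salary vs. expenditure": (1.91272568637067, 0.475312632437971),
    "12.4 salary vs. student/teacher ratio": (219.83332083361, 0.00634468112316711),
}


def correlation_from_fit(slope, r_squared):
    """Correlation r: magnitude sqrt(R squared), sign taken from the slope."""
    return math.copysign(math.sqrt(r_squared), slope)


r_values = {name: correlation_from_fit(*pair) for name, pair in models.items()}
for name, r in r_values.items():
    print(f"Figure {name}: r = {r:.4f}")
```

Note that the size of a slope depends on the units of the two variables, so the slope of 219.83 in Figure 12.4 goes with a much weaker correlation than the slope of 1.9127 in Figure 12.3.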
Discussion and Recommendations
The data analysis shows that there is a slight correlation between the average
student/teacher ratio and SAT scores. It would be easy to conclude from this that in some cases,
the lower the student/teacher ratio, the higher the SAT scores. The strongest correlation was
between average salary and expenditure per student. From this one can conclude that the
higher the teacher salary, the more money is spent per student. This would imply that a state that
pays its teachers well is also spending a good deal of money on its students.
Also of interest was the lack of correlation between average salary
and student/teacher ratio. One could conclude that salary does not play an influential role in
student/teacher ratio. From an educator’s standpoint, looking at all of the data in this
statistical analysis, it can be seen that the money spent on a student does not necessarily play a
role in academic success. To educators this should say that we cannot simply complain
that our districts are not spending enough money on our students, but instead need to focus on how
our teaching practices affect our students’ academic success. It would be interesting to compare
percentages of highly qualified teachers with SAT scores, or the number of AP courses offered with
SAT scores. This would show how educational practices and qualifications affect student
achievement, since this data analysis shows that expenditure does not play a large role.