Kinds of data
10 red, 15 blue, 5 green
160cm, 172cm, 181cm
4 bedroomed, 3 bedroomed, 2 bedroomed
size 12, size 14, size 16, size 18
fred, lissy, max, jack
callum, zoe, luke
stephen
baby 5lb3oz, 6lb10oz, 7lb12oz, 11lb1oz
Qualitative
Quantitative: Continuous or Discrete
Page 10, Exercise 2A, Q1 - 5 and 7
Averages
There are three types of average and they all begin with M
.....most popular value or class
.....middle value if all values are placed in order
.....the sum of all the values divided by how many values there are
Mean
How would you say the mean average differed from the median average?
In which circumstances might you use the mean rather than the median, and vice versa?
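One way to explore these questions is to compute all three averages for a small data set and then see what happens when an extreme value is removed. The sketch below uses Python's standard `statistics` module; the numbers are invented purely for illustration and do not come from the textbook.

```python
from statistics import mean, median, mode

# Hypothetical data, invented for illustration only.
values = [2, 3, 3, 4, 5, 6, 40]

print("mode  :", mode(values))    # most popular value
print("median:", median(values))  # middle value once the data is in order
print("mean  :", mean(values))    # sum of the values divided by how many there are

# Remove the one extreme value and compare how far each average moves.
trimmed = values[:-1]
print("median without 40:", median(trimmed))
print("mean without 40  :", mean(trimmed))
```

Comparing the two sets of output shows which average is pulled furthest by a single unusual value.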
Rather than describing how to find the mean in words we need to learn some notation.
The sum of all the x values should be written as: $\sum x$
The sum of the values in the fx column should be written as: $\sum fx$
Page 15, Exercise 2B, Q1, 2, 4, 6 and 7
In what kind of questions will we need to add the values in the fx column rather than just add together all the x values?
Try finding the mean of the year 7 girl heights and compare it to the year 7 boy heights.
Would you use the mean or the median to summarise this data, and why?
If you used the raw data for the above calculation, why?
If you used the grouped data from the frequency polygon or histogram work, why am I going to tell you you've made a mistake?
We use the "x bar" notation, $\bar{x}$, to represent a sample mean.
If I were using all the data possible it would be called a population mean, and we use the "mu" notation, $\mu$.
Mean Formulae
for a list of data: $\bar{x} = \frac{\sum x}{n}$
for a frequency table of data: $\bar{x} = \frac{\sum fx}{\sum f}$
for grouped data it is necessary to find the midpoint of each class first, use this as the value for x, and then use the frequency table formula above.
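The three cases can be sketched in Python as follows. The function names are made up for this illustration, and the frequency table and grouped classes are hypothetical; only the short list of seven numbers appears elsewhere in these notes.

```python
def mean_of_list(xs):
    """Mean of raw data: sum of x divided by n."""
    return sum(xs) / len(xs)


def mean_of_frequency_table(table):
    """Mean from (x, f) pairs: sum of f*x divided by sum of f."""
    total_fx = sum(x * f for x, f in table)
    total_f = sum(f for _, f in table)
    return total_fx / total_f


def mean_of_grouped_data(classes):
    """Estimated mean from (lower, upper, f) classes, using each midpoint as x."""
    midpoint_table = [((lower + upper) / 2, f) for lower, upper, f in classes]
    return mean_of_frequency_table(midpoint_table)


# A list of data.
print(mean_of_list([3, 5, 5, 6, 8, 10, 10]))

# Hypothetical frequency table: the value 1 occurs twice, 2 three times, 3 five times.
print(mean_of_frequency_table([(1, 2), (2, 3), (3, 5)]))

# Hypothetical grouped data: 4 values in 140-150 and 6 values in 150-160.
print(mean_of_grouped_data([(140, 150, 4), (150, 160, 6)]))
```

The grouped result is only an estimate, because the midpoints stand in for values we no longer have.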
Page 18, Exercise 2C, Q1, 4 and 5
Stem and Leaf
Put the data below into a suitable stem and leaf diagram.
127, 135, 147, 147, 149, 139, 145, 155, 149, 155, 151, 159, 139, 141, 155, 160, 138, 144, 155, 148, 156, 143, 147, 157, 152, 150, 161, 133, 146, 155
The data represents the heights of a first year class in a boys' school.
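If you want to check your diagram, one possible sketch in Python is shown below. It assumes the tens are used as stems and the units as leaves (so 12 | 7 means 127); the function name `stem_and_leaf` is made up for this example.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Group sorted values by stem (value // 10) with ordered leaves (value % 10)."""
    stems = defaultdict(list)
    for value in sorted(values):
        stems[value // 10].append(value % 10)
    return dict(stems)


boys = [127, 135, 147, 147, 149, 139, 145, 155, 149, 155,
        151, 159, 139, 141, 155, 160, 138, 144, 155, 148,
        156, 143, 147, 157, 152, 150, 161, 133, 146, 155]

diagram = stem_and_leaf(boys)
for stem, leaves in diagram.items():
    print(f"{stem:>3} | {' '.join(str(leaf) for leaf in leaves)}")
print("Key: 12 | 7 means 127")
```

Sorting before grouping keeps both the stems and the leaves in order, which is what makes the median and quartiles easy to read off.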
How else can you summarise or represent this data?
Below is a list of the heights of 30 year 7 girls. Add these to the other side of your stem and leaf diagram and make some comparative statements based on suitable summary data you find.
127, 145, 147, 147, 149, 149, 145, 165, 139, 157, 152, 169, 129, 121, 158, 160, 148, 141, 155, 148, 156, 143, 157, 156, 152, 150, 161, 133, 146, 155
Page 55, Exercise 4A, Q2 and 5
If you have grouped your data with equal class widths, you have little to worry about. However, if the class widths are uneven you will need to plot them against frequency density rather than just frequency.
$\text{Frequency Density} = \frac{\text{Class Frequency}}{\text{Class Width}}$
Sometimes Relative Frequency Density is plotted on the y axis. This can be
calculated as:
$\text{Rel Freq Dens} = \frac{\text{Class Frequency}}{\text{Total Frequency}}$
It is best that histograms are plotted against frequency density or relative frequency density.
They should also only be drawn with continuous data.
Discrete or qualitative data can be plotted in bar charts, but their bars should not really touch as the categories aren't connected.
Histograms
Histograms are similar to bar charts apart from the consideration of areas.
In a bar chart, all of the bars are the same width and the only thing that matters is the
height of the bar. In a histogram, the area is the important
thing.
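To see why area matters, the sketch below works out the frequency density (class frequency divided by class width) for some classes of uneven width and checks that each bar's area recovers the class frequency. The classes are hypothetical and the function name is made up for this example.

```python
def frequency_densities(classes):
    """Return frequency / class width for each (lower, upper, frequency) class."""
    return [f / (upper - lower) for lower, upper, f in classes]


# Hypothetical classes with uneven widths: (lower bound, upper bound, frequency).
classes = [(0, 10, 5), (10, 20, 12), (20, 40, 8)]
densities = frequency_densities(classes)

for (lower, upper, f), density in zip(classes, densities):
    width = upper - lower
    # The area of the bar (height x width) gives back the class frequency.
    print(f"{lower}-{upper}: frequency {f}, density {density}, area {density * width}")
```

The last class has a larger frequency than the first but a shorter bar, because its frequency is spread over a class twice as wide.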
Page 64, Exercise 4E, Q1, 4 and 5
We may also need to find the average from these
grouped continuous data sets
Page 22, Exercise 2D, Q1 - 5
Summarising Data
What types of data summary have you come up with so far?
And how do they differ?
Give examples of when one type would be better than another.
Quartiles
You will recall that finding the median from a list of data involves adding one to the number of values and then halving the result to find out which value (placed in order) you should use.
For example, in a list of 7 numbers the median value is the (7 + 1) / 2 th value, i.e. the 4th value.
3, 5, 5, 6, 8, 10, 10
If you consider the quartiles of this list you can see that they are the 2nd and 6th values. The position of the lower quartile can be found by dividing the number of values by 4 (use three quarters of the number of values for the upper quartile). If this gives a whole number, find the average of the value in that position and the value above it. If it gives a decimal rather than a whole number, always round UP to the next position.
In a list of 14 numbers the lower quartile would be taken as the 14 / 4 th value, or 3.5th value, so you'd take the 4th value.
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14
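The rule above can be sketched in Python as follows. The function names are made up for this illustration. The median is found here by applying the same rule to half the number of values, which agrees with the "add one then halve" method for the lists in these notes.

```python
import math


def ranked_value(ordered, position):
    """Value at a 1-based position that may be fractional.

    Whole number: average that value and the one above it.
    Decimal: round UP to the next position.
    """
    if position == int(position):
        k = int(position)
        return (ordered[k - 1] + ordered[k]) / 2
    return ordered[math.ceil(position) - 1]


def quartiles(values):
    """Return (lower quartile, median, upper quartile) using the rule above."""
    ordered = sorted(values)
    n = len(ordered)
    return (ranked_value(ordered, n / 4),
            ranked_value(ordered, n / 2),
            ranked_value(ordered, 3 * n / 4))


print(quartiles([3, 5, 5, 6, 8, 10, 10]))   # the list of 7 numbers
print(quartiles(range(1, 15)))              # the list of 14 numbers
```

Other textbooks and software packages use slightly different quartile rules, so results from a calculator or spreadsheet may not match exactly.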
Page 34, Exercise 3A, Q1 and 2
Interquartile Ranges and Boxplots
We use the Quartiles and minimum and maximum values to draw boxplots.
These are great for comparing the spread of data between two or more data sets
[Boxplot diagram with Q0, Q1, Q2, Q3 and Q4 labelled]
Where:
Q0 = min value
Q1 = lower quartile
Q2 = median
Q3 = upper quartile
Q4 = max value
Draw box plots to compare the year 7 height data you put into stem and
leaf diagrams earlier
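Before drawing, it can help to collect the five values each box plot needs. The sketch below is one way to do that in Python for the two height lists given earlier, assuming the quartile rule from the Quartiles section (whole-number position: average with the value above; decimal position: round up). The function names are made up for this illustration.

```python
import math


def ranked_value(ordered, position):
    """Value at a 1-based position, using the quartile rule from these notes."""
    if position == int(position):
        k = int(position)
        return (ordered[k - 1] + ordered[k]) / 2
    return ordered[math.ceil(position) - 1]


def five_number_summary(values):
    """Return (Q0, Q1, Q2, Q3, Q4): min, lower quartile, median, upper quartile, max."""
    ordered = sorted(values)
    n = len(ordered)
    return (ordered[0],
            ranked_value(ordered, n / 4),
            ranked_value(ordered, n / 2),
            ranked_value(ordered, 3 * n / 4),
            ordered[-1])


boys = [127, 135, 147, 147, 149, 139, 145, 155, 149, 155,
        151, 159, 139, 141, 155, 160, 138, 144, 155, 148,
        156, 143, 147, 157, 152, 150, 161, 133, 146, 155]
girls = [127, 145, 147, 147, 149, 149, 145, 165, 139, 157,
         152, 169, 129, 121, 158, 160, 148, 141, 155, 148,
         156, 143, 157, 156, 152, 150, 161, 133, 146, 155]

for name, heights in (("boys", boys), ("girls", girls)):
    q0, q1, q2, q3, q4 = five_number_summary(heights)
    # The interquartile range is the width of the box: Q3 - Q1.
    print(f"{name}: Q0={q0} Q1={q1} Q2={q2} Q3={q3} Q4={q4} IQR={q3 - q1}")
```

The interquartile range (Q3 minus Q1) and the overall range (Q4 minus Q0) give two ways of comparing the spread of the two groups.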
Read page 57
Page 58, Exercise 4B, Q1 - 2
Page 59, Exercise 4C, Q1 - 2
Page 61, Exercise 4D, Q1 - 2
Cumulative Frequency
We have seen how to find the quartiles from a list of data or stem and leaf diagrams.
We have also seen that data is often stored in frequency distributions. If these are grouped it
becomes difficult to find these quartiles. Why?
We used to overcome this by drawing cumulative frequency curves.
60 students got below 50 marks.
30 students got below 40 marks.
By joining up these two points we pretend the 30 students with between 40 and 50 marks are spread evenly throughout the band.
Interpolation
We could actually find this value much quicker by using some simple mathematics known as
interpolating.
Think how you could find the mark of the 40th student from this year 10 class using just the
data rather than reading from the graph.
Now try estimating the mark of the 55th student.
Remember, estimating doesn't mean guessing; it involves exact calculations, but the result is unlikely to be the true mark as the students are unlikely to be spaced evenly throughout the class. We cannot know the exact mark without the raw data, and we are not given this with grouped data, hence we estimate.
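As a worked sketch, assume the two cumulative frequency points quoted earlier are the relevant ones: 30 students below 40 marks and 60 students below 50 marks, so 30 students lie in the 40 to 50 band. Spreading those 30 students evenly across the 10 marks gives:

```latex
\begin{align*}
\text{40th student} &\approx 40 + \frac{40 - 30}{60 - 30} \times (50 - 40) = 40 + \frac{10}{30} \times 10 \approx 43.3 \\
\text{55th student} &\approx 40 + \frac{55 - 30}{60 - 30} \times (50 - 40) = 40 + \frac{25}{30} \times 10 \approx 48.3
\end{align*}
```

In each case the fraction measures how far through the band the student sits, and multiplying by the band width converts that into marks.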
Quantiles
We can divide the data into as many equal parts as we like.
Quartiles divide the data into four, deciles into ten and percentiles into 100.
The formula below is known as interpolating and estimates a quantile by assuming the data collected in each class is spread evenly. It should be very similar to the formula you created earlier to find quartiles without drawing a cumulative frequency curve.
$$\text{Quantile} = b + \frac{Qn - f}{f_m} \times w$$
Where:
$f_m$ is the frequency of the class the quantile falls in
$f$ is the cumulative frequency up to the class the quantile falls in
$w$ is the class width of the class the quantile falls in
$Q$ is the quantile you are finding, expressed as a fraction
$n$ is the number of data values
$b$ is the LOWER bound of the class the quantile falls in
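The interpolation formula for a quantile, b + (Qn - f) / fm x w, can be sketched as a small Python function. The function and parameter names are made up for this illustration. The example reuses the 40 to 50 marks class from the cumulative frequency work (30 students in the class, 30 students below it) and assumes a hypothetical total of 80 students, which is not given in these notes.

```python
def interpolate_quantile(q, n, b, f, fm, w):
    """Estimate a quantile from grouped data by linear interpolation.

    q  : the quantile as a fraction (0.5 for the median, 0.25 for the lower quartile)
    n  : the number of data values
    b  : the lower bound of the class the quantile falls in
    f  : the cumulative frequency up to that class
    fm : the frequency of that class
    w  : the width of that class
    """
    return b + (q * n - f) / fm * w


# Hypothetical total of 80 students, so the median is the 0.5 x 80 = 40th value.
estimate = interpolate_quantile(q=0.5, n=80, b=40, f=30, fm=30, w=10)
print(round(estimate, 1))
```

You still need to identify which class the quantile falls in first, by comparing Qn with the cumulative frequencies.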
Page 34, Exercise 3A, Q3 and 5
Page 37, Exercise 3B, Q1, 3 and 5