IB Math Studies – Topic 6. IB Course Guide Description.

Post on 17-Dec-2015


IB Math Studies – Topic 6

Statistics

IB Course Guide Description


Continuous: numerical data that can take any value within a continuous range and is obtained by measuring.

Discrete: numerical data that takes whole-number values and is obtained by counting.

Continuous and Discrete Data

Presenting and Interpreting Data

Stem-and-leaf Plots

A stem-and-leaf plot, sometimes called a stemplot, is an easy way of writing down data in groups. It is used for small data sets.

For numbers with two digits, the first digit forms part of the stem and the second digit forms a leaf.
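As a rough illustration of the stem and leaf rule above, the following Python sketch groups two-digit whole numbers by their first digit. The function names and the sample scores are made up for this example and are not part of the course guide.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Group two-digit whole numbers: tens digit is the stem, units digit the leaf."""
    groups = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)  # e.g. 47 -> stem 4, leaf 7
        groups[stem].append(leaf)
    return dict(groups)


def format_stemplot(values):
    """Return the plot as text, one 'stem | leaves' row per stem."""
    groups = stem_and_leaf(values)
    rows = []
    for stem in sorted(groups):
        leaves = " ".join(str(leaf) for leaf in groups[stem])
        rows.append(f"{stem} | {leaves}")
    return "\n".join(rows)


# Hypothetical test scores, invented for illustration only.
scores = [23, 31, 25, 47, 38, 31, 42, 29]
print(format_stemplot(scores))
```

Each row of the printed plot lists one stem followed by its leaves in increasing order.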

• A line graph, used much like a histogram, that gives a visual appreciation of the shape of the frequency distribution.

• The midpoint of each bar is used to represent the whole interval.

• Lines are then drawn between these midpoints.
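The midpoint idea can be sketched in Python as follows. This is only an illustration: the class boundaries (equal-width classes starting at a chosen value), the function name and the heights are assumptions made for the example.

```python
def frequency_polygon_points(data, start, width, n_classes):
    """Return (midpoint, frequency) pairs for equal-width classes.

    Class i covers start + i*width <= x < start + (i+1)*width.
    Values outside the classes are ignored in this sketch.
    """
    counts = [0] * n_classes
    for x in data:
        index = int((x - start) // width)
        if 0 <= index < n_classes:
            counts[index] += 1
    midpoints = [start + width * (i + 0.5) for i in range(n_classes)]
    return list(zip(midpoints, counts))


# Hypothetical heights in cm, invented for illustration only.
heights = [151, 158, 162, 164, 167, 171, 173, 178]
points = frequency_polygon_points(heights, start=150, width=10, n_classes=3)
for midpoint, frequency in points:
    print(midpoint, frequency)
```

Joining the printed points with straight lines gives the frequency polygon; the same counts drawn as touching columns give the histogram.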

Frequency Polygon

A histogram is a vertical column graph used to represent continuous grouped data.

There are no gaps between the columns in a histogram because the data is continuous.

Histograms

A box-and-whisker plot is a visual display of some of the descriptive statistics of a data set.

Box and Whisker Plot

• Outliers are extraordinary data values that are usually separated from the main body of the data.

• The upper boundary = upper quartile + 1.5 × IQR

• The lower boundary = lower quartile – 1.5 × IQR
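A small Python sketch of the two boundary formulas above. The quartiles are passed in as known values, and the data set and function names are invented for this example.

```python
def outlier_boundaries(q1, q3):
    """Return (lower boundary, upper boundary) using the 1.5 x IQR rule."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def find_outliers(data, q1, q3):
    """List the values that fall outside the two boundaries."""
    lower, upper = outlier_boundaries(q1, q3)
    return [x for x in data if x < lower or x > upper]


# Hypothetical data; Q1 = 5.5 and Q3 = 9.5 are the medians of the
# lower and upper halves of this ordered list.
data = [2, 5, 6, 7, 8, 9, 10, 25]
lower, upper = outlier_boundaries(5.5, 9.5)
outliers = find_outliers(data, 5.5, 9.5)
print(lower, upper, outliers)
```

Here the IQR is 4, so the boundaries are -0.5 and 15.5, and only the value 25 lies outside them.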

Summarizing the Data

• Mean: is the arithmetic average obtained by adding all the scores and dividing by the total number of scores.

• Mode: is the score that occurs most frequently.

• Median: is the middle score after the scores have been placed in order.
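The three measures can be checked with Python's standard `statistics` module. The scores below are invented for illustration.

```python
import statistics

# Hypothetical scores, invented for illustration only.
scores = [4, 7, 7, 8, 10, 12]

mean = statistics.mean(scores)      # sum of scores / number of scores = 48 / 6
median = statistics.median(scores)  # average of the two middle scores, 7 and 8
mode = statistics.mode(scores)      # 7 occurs twice, more than any other score

print(mean, median, mode)
```

With an even number of scores there is no single middle score, so the median is taken as the average of the two middle ones.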

Grouped Discrete Data

Grouped Continuous Data

Range: is the difference between the maximum data value and the minimum data value.

Range = maximum data value – minimum data value

Interquartile Range: is the range of the middle half (50%) of the data.

The data set is divided into quarters by the lower quartile (Q1), the median (Q2) and the upper quartile (Q3).

IQR = Q3 – Q1
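A short Python sketch of the range and IQR calculations. Several quartile conventions exist; this example assumes the "median of each half" method (the middle value is left out of both halves when the number of values is odd), which may differ from other software. The data and function name are invented.

```python
import statistics


def quartiles(data):
    """Return (Q1, Q2, Q3) using the median-of-halves convention (an assumption)."""
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    lower_half = ordered[:half]
    upper_half = ordered[half + (n % 2):]  # skip the middle value when n is odd
    return (statistics.median(lower_half),
            statistics.median(ordered),
            statistics.median(upper_half))


# Hypothetical data, invented for illustration only.
data = [3, 5, 7, 8, 9, 11, 13, 15, 20]
q1, q2, q3 = quartiles(data)
data_range = max(data) - min(data)
iqr = q3 - q1
print(data_range, q1, q2, q3, iqr)
```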

Measure of Dispersion

Standard Deviation: measures the deviation between scores and the mean.
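As a sketch of the calculation for ungrouped data, the Python below computes the standard deviation step by step and compares it with `statistics.pstdev`. It assumes the population form (dividing by n rather than n - 1); the scores are invented.

```python
import math
import statistics


def standard_deviation(scores):
    """Population standard deviation: root of the mean squared deviation."""
    n = len(scores)
    mean = sum(scores) / n
    squared_deviations = [(x - mean) ** 2 for x in scores]
    return math.sqrt(sum(squared_deviations) / n)


# Hypothetical scores, invented for illustration only.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
sd = standard_deviation(scores)
print(sd, statistics.pstdev(scores))
```

For these scores the mean is 5, the squared deviations sum to 32, and 32 / 8 = 4, so the standard deviation is 2.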

Ungrouped Data

Grouped Discrete Data

Standard Deviation

Correlation

• A correlation refers to the relationship or association between two variables.

A scatter diagram indicates the relationship between two variables.

If there is a relationship, we can draw in the “line of best fit”

Line of Best Fit

Drawing a Line of Best Fit

• Calculate the mean of the x values, $\bar{x}$, and the mean of the y values, $\bar{y}$.

• Mark the mean point $(\bar{x}, \bar{y})$ on the scatter plot.

• Draw a line through the mean point that passes through the middle of the data, with an equal number of points above and below the line.

The line of best fit on a scatter diagram is called a “regression line” and it can be calculated from the data pairs.

Regression Line

$y - \bar{y} = \frac{s_{xy}}{s_x^2}(x - \bar{x})$

• The regression line is used for prediction purposes.

• The regression line is less reliable when extended far beyond the region of the data.
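The regression line can be computed from the data pairs along these lines. This Python sketch assumes the slope is the covariance s_xy divided by the variance of x, both in population form (divided by n); the data pairs and function name are invented.

```python
def regression_line(xs, ys):
    """Return (slope, intercept) of the line through the mean point."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both divided by n).
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    s_x2 = sum((x - mean_x) ** 2 for x in xs) / n
    slope = s_xy / s_x2
    intercept = mean_y - slope * mean_x  # line passes through (mean_x, mean_y)
    return slope, intercept


# Hypothetical data pairs, invented for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
slope, intercept = regression_line(xs, ys)

# Prediction just beyond the data; far beyond it the line is less reliable.
predicted = slope * 6 + intercept
print(slope, intercept, predicted)
```

For this data the mean point is (3, 4), s_xy is 1.2 and the variance of x is 2, giving a slope of about 0.6 and an intercept of about 2.2.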

Correlation Coefficient

• -1 indicates perfect negative correlation.

• 0 indicates no correlation

• +1 indicates perfect positive correlation.

• 0.25 ≤ r < 0.5 = weak correlation

• 0.5 ≤ r < 0.75 = moderate correlation

• 0.75 ≤ r <1 = strong correlation
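A sketch of how r might be computed and described in Python. It assumes r is Pearson's product-moment coefficient, r = s_xy / (s_x s_y), and that the strength bands above apply to the size of r for negative values as well. The label returned for values under 0.25 is a placeholder, since no band is listed for them. Data and function names are invented.

```python
import math


def correlation_coefficient(xs, ys):
    """Pearson's r: covariance divided by the product of the standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    s_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    s_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return s_xy / (s_x * s_y)


def describe_strength(r):
    """Map the size of r onto the bands listed above (applied to |r|, an assumption)."""
    size = abs(r)
    if math.isclose(size, 1.0):
        return "perfect"
    if size >= 0.75:
        return "strong"
    if size >= 0.5:
        return "moderate"
    if size >= 0.25:
        return "weak"
    return "below the listed bands"  # placeholder label, not from the source


# Hypothetical data pairs, invented for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = correlation_coefficient(xs, ys)
print(round(r, 3), describe_strength(r))
```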

The Chi-Squared Test

How many people are in the sample? How many males? How many females?

This is called a 2 x 2 contingency table.

1) Write the null hypothesis (H0) and the alternate hypothesis (H1).

2) Create contingency tables for observed and expected values.

3) Calculate the chi-square statistic and degrees of freedom.

4) Find the chi-squared critical value (booklet).

• Depends on the level of significance (p) and the degrees of freedom (v).

5) Determine whether or not to accept the null hypothesis.
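Steps 2 to 5 can be sketched in Python as below. The observed table is invented, the function names are hypothetical, and the critical value 3.841 is the standard table value for a 5% significance level with 1 degree of freedom; in practice it would be read from the booklet.

```python
def expected_table(observed):
    """Expected value for each cell = row total x column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    return [[r * c / total for c in col_totals] for r in row_totals]


def chi_squared(observed):
    """Return (chi-squared statistic, degrees of freedom) for a contingency table."""
    expected = expected_table(observed)
    statistic = sum((o - e) ** 2 / e
                    for obs_row, exp_row in zip(observed, expected)
                    for o, e in zip(obs_row, exp_row))
    df = (len(observed) - 1) * (len(observed[0]) - 1)  # (rows - 1)(columns - 1)
    return statistic, df


# Hypothetical 2 x 2 observed table, invented for illustration only.
observed = [[30, 20],
            [20, 30]]
chi2_calc, df = chi_squared(observed)

critical_value = 3.841  # table value for 5% significance, 1 degree of freedom
reject_null = chi2_calc > critical_value
print(chi2_calc, df, reject_null)
```

Every expected value here is 25, so the statistic is 4 × (5² / 25) = 4. Since 4 is greater than 3.841, the null hypothesis would not be accepted at the 5% level.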

On the calculator:

Put your contingency table in matrix A.

STAT → TESTS → C: χ2 Test

Observed: [A] Expected: [B] (this is where you want to go)

Calculate

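$\chi^2_{calc} = \sum \frac{(f_o - f_e)^2}{f_e}$

where $f_o$ are the observed frequencies and $f_e$ are the expected frequencies.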