In the name of Allah Kareem, Most Beneficent, Most Gracious, the Most Merciful!
Session Objectives
After this session, students will be able to analyze collected data using descriptive statistics by:
• Producing summaries of data in both tabular and graphical forms
• Calculating central tendency using the mean, median, and mode
• Calculating the dispersion of data using the range, interquartile range (IQR), and standard deviation
• Checking whether the data are normally distributed using the normal curve
SUPERIOR GROUP OF COLLEGES
Analyzing Data
“The process of breaking down the complex data to gain better understanding of it.”
There are two types of statistics:
• Descriptive statistics
• Inferential statistics
In this session we will work on descriptive statistics.
Descriptive statistics
Descriptive statistics are used to describe, summarize, organize, and simplify data in quantitative terms. We will cover:
1. Summarizing Numerical Data
2. Measures of Central Tendency
3. Measurement of Dispersion
4. Checking Data Normality
1. Summarizing

Variable       Tabular Presentation           Graphical Presentation
Categorical    Frequency Distribution Table   Bar chart
Numerical      Five Figure Summary            Box Plot / Histograms
Frequency Distribution: Summarizing categorical data
A frequency distribution is a tally or count of the number of times each score on a single variable occurs.
Analyze → Descriptive Statistics → Frequencies → move religion to the variable box → OK (make sure that the Display frequency tables box is checked)
Frequency table for religion

                            Frequency   Percent   Valid Percent   Cumulative Percent
Valid     Muslims              30         40.0        44.8              44.8
          Christians           23         30.7        34.3              79.1
          Hindus               14         18.7        20.9             100.0
          Total                67         89.3       100.0
Missing   other religion        4          5.3
          blank                 4          5.3
          Total                 8         10.7
Total                          75        100.0
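The columns of the frequency table can be reproduced by hand from the raw counts. The following is a minimal Python sketch, not SPSS output: the counts are copied from the table for religion, and the function name `frequency_table` is a hypothetical helper introduced only for illustration. Percent uses all 75 cases as the base, while Valid Percent uses only the 67 non-missing cases.

```python
# Counts copied from the frequency table for religion.
valid_counts = {"Muslims": 30, "Christians": 23, "Hindus": 14}
missing_counts = {"other religion": 4, "blank": 4}


def frequency_table(valid, missing):
    """Return rows of (category, frequency, percent, valid percent, cumulative percent)."""
    n_valid = sum(valid.values())                # cases with a usable answer
    n_total = n_valid + sum(missing.values())    # all cases, including missing
    rows = []
    cumulative = 0.0
    for category, count in valid.items():
        percent = 100 * count / n_total          # share of all cases
        valid_percent = 100 * count / n_valid    # share of valid cases only
        cumulative += valid_percent              # running total of valid percent
        rows.append((category, count, round(percent, 1),
                     round(valid_percent, 1), round(cumulative, 1)))
    return rows


rows = frequency_table(valid_counts, missing_counts)
for row in rows:
    print(row)
```

The printed rows should match the Valid section of the table, which is a quick way to check that the difference between Percent and Valid Percent is understood.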
Bar Charts: Summarizing categorical data
With nominal data, it is better to make a bar chart of the frequency distribution for variables like religion, ethnic group, or other nominal variables, because the categories that happen to be adjacent in your frequency distribution are not necessarily adjacent in any meaningful order. To get a bar chart select:
Graphs → Legacy Dialogs → Interactive → Bar chart → move variable to the box → OK
Summarizing Numerical Data
Five Figure Summary
It is used to summarize numerical data. The five figures are the following values located in the data:
1. Minimum value
2. Maximum value
3. Median
4. Lower quartile
5. Upper quartile
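As an illustration of the five figure summary, here is a minimal Python sketch using only the standard library. The data set is hypothetical, and the function name `five_figure_summary` is an assumption made for this example. Quartile conventions differ between textbooks and software; this sketch uses the "inclusive" method of `statistics.quantiles`, so other tools may report slightly different quartiles for the same data.

```python
import statistics

# Hypothetical data set, used only for illustration.
data = [2, 4, 4, 5, 7, 9, 10, 12, 15]


def five_figure_summary(values):
    """Return the minimum, lower quartile, median, upper quartile and maximum."""
    ordered = sorted(values)
    # quantiles(..., n=4) returns the three cut points Q1, Q2, Q3.
    # The "inclusive" method is one of several quartile conventions (an assumption here).
    q1, _, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return {
        "minimum": ordered[0],
        "lower_quartile": q1,
        "median": statistics.median(ordered),
        "upper_quartile": q3,
        "maximum": ordered[-1],
    }


summary = five_figure_summary(data)
print(summary)
```

For these nine values the sketch gives a minimum of 2, a lower quartile of 4, a median of 7, an upper quartile of 10, and a maximum of 15.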
Exercise: Calculate Five Figure Summary
Department B: 30 employees
1 1 2 2 2 3 3 4 4 5
5 6 6 6 6 6 7 7 8 8
9 9 10 11 13 13 15 15 19 20
Exercise: Calculate Five Figure Summary
Department B: 30 employees
2 2 3 3 3 4 4 5 5 5 6 6 7 7 7 7 8 8 8 8 8 8 10 10 12 15
Box and Whisker Plot
For ordinal and normal data, the box and whisker plot is useful. It is a graphical representation of the distribution of scores and is helpful in distinguishing between ordinal and normally distributed data.
Graphs → Legacy Dialogs → Interactive → Boxplot → move gender to the x-axis and SAT math to the y-axis → OK
Histogram
Histograms are like bar graphs, but there is no space between the bars, indicating that a continuous variable theoretically underlies the scores. Histograms can be used even if the data, as measured, are not continuous, provided the underlying variable is conceptualized as continuous. To draw a histogram select:
Graphs → Legacy Dialogs → Interactive → Histogram → move variable to the box → OK
MEASUREMENT OF CENTRAL TENDENCY
Mean. The arithmetic average or mean takes into account all of the available information in computing the central tendency of a frequency distribution.
Median. The middle score or median is the appropriate measure of central tendency for ordinal level raw data.
Mode. The most common category, or mode, can be used with any kind of data but generally provides the least precise information about central tendency.
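The three measures can be compared on a small example. This is a minimal Python sketch using the standard `statistics` module; the scores are hypothetical values chosen for illustration and are not taken from the SAT math data used in the session.

```python
import statistics

# Hypothetical test scores, used only for illustration.
scores = [460, 480, 500, 500, 510]

mean_score = statistics.mean(scores)      # sum of all scores divided by their count
median_score = statistics.median(scores)  # middle score once the scores are ordered
mode_score = statistics.mode(scores)      # most frequently occurring score

print("Mean:", mean_score)
print("Median:", median_score)
print("Mode:", mode_score)
```

Here the mean (490) is pulled below the median and mode (both 500) by the low score of 460, which shows that the mean uses every value while the median and mode do not.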
Measure of Central Tendency
Exercise
Statistics: scholastic aptitude test - math

N        Valid      75
         Missing     0
Mean               490.53
Median             490.00
Mode               500
Analyze → Descriptive Statistics → Frequencies → put SAT Math into the variable box → click Statistics → mark Mean, Median, and Mode → click Continue → OK
Measures of Variability
Range. The range (highest minus lowest score) is the crudest measure of variability but does give an indication of the spread in scores if they are ordered.
Standard Deviation. The standard deviation is based on the deviation (x) of each score from the mean of all scores.
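To make the deviation-based definition concrete, here is a minimal Python sketch that computes the range, the interquartile range, and the standard deviation for a hypothetical set of scores. Two assumptions are made for this example: the quartiles use the "inclusive" method of `statistics.quantiles` (conventions vary), and the standard deviation is the sample version, which divides by n - 1.

```python
import math
import statistics

# Hypothetical scores, used only for illustration.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

# Range: highest minus lowest score.
value_range = max(scores) - min(scores)

# Interquartile range: upper quartile minus lower quartile
# (the "inclusive" quartile method is an assumption of this sketch).
q1, _, q3 = statistics.quantiles(scores, n=4, method="inclusive")
iqr = q3 - q1

# Standard deviation built from the deviation of each score from the mean.
mean_score = statistics.mean(scores)
deviations = [score - mean_score for score in scores]
sum_of_squares = sum(d ** 2 for d in deviations)
sample_sd = math.sqrt(sum_of_squares / (len(scores) - 1))  # sample SD, divides by n - 1

print("Range:", value_range)
print("IQR:", iqr)
print("Standard deviation:", round(sample_sd, 3))
```

The hand-computed value can be checked against `statistics.stdev(scores)`, which uses the same n - 1 formula.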
Analyze → Descriptive Statistics → Frequencies → put SAT Math into the variable box → click Statistics → mark Range and Std. deviation → click Continue → OK
                          Nominal   Dichotomous   Ordinal   Normal
Frequency Distribution    Yes       Yes           Yes       OK
Bar Chart                 Yes       Yes           Yes       OK
Histogram                 No        No            OK        Yes
Frequency Polygon         No        No            OK        Yes
Box & Whisker Plot        No        No            Yes       Yes
Central Tendency
  Mean                    No        OK            OK        Yes
  Median                  No        OK            Yes       OK
  Mode                    Yes       Yes           OK        OK
Variability
  Range                   No        Always 1      Yes       Yes
  Standard Deviation      No        No            OK        Yes
  Interquartile Range     No        No            OK        OK
  How many categories     Yes       Always 2      OK        No
Shape
  Skewness                No        No            Yes       Yes