Statistical Analysis of Data with report writing

Date post: 07-Apr-2018
Upload: usmansiddiq1

  • 8/6/2019 Statistical Analysis of Data with report writing

    Statistical Analysis of Data

    By: Usman Siddique

    0300-4556959

    M.Phil Applied Linguistics

    Lahore Pakistan

    Introduction

    Raw data can take a variety of forms, including measurements, survey responses, and observations. In its raw form, this information can be incredibly useful, but also overwhelming. Over the course of the data analysis process, the raw data is ordered in a way that makes it useful.

    The data for statistical analysis and report writing were taken from the University of Texas via the internet. The data include facts and figures about the main areas of the university, such as faculty members and students. The data set has fifty-one (51) variables and one thousand three hundred and fifty-eight (1,358) cases (individuals). The data are in tabular form, with fifty-one columns and 1,358 rows: columns represent variables (categories) and rows represent individual cases.

    Objectives

    It is impossible to do a sound analysis without knowing what you wish to achieve. Too often an analysis is started without a clear idea of where it is going. The result is usually a lot of wasted time and an inadequate analysis. Avoid this by deciding on the objectives of the analysis before starting it.

    These are the objectives of the data analysis:

    1. Understanding of data

    2. Finding association among variables

    3. Finding significant difference between the groups

    4. Data reduction

    Questions

    1. Is there any significant association between student sex and student grade point average?

    2. Is there any significant difference between the faculty rank of males and females?

    3. Is there any significant difference between faculty salary and four levels of highest degree earned?

    4. Is there any significant difference between student expected grade and student grade point average?

    5. Is there any significant correlation between faculty rank and salary?

    Analysis of Data

    Analysis of data is a process of inspecting, cleaning, transforming, and modeling data with the goal of highlighting useful information, suggesting conclusions, and supporting decision making. Data analysis has multiple facets and approaches, encompassing diverse techniques under a variety of names, in different business, science, and social science domains.

    Several analyses can be used during the initial data analysis phase:

      o Univariate statistics
      o Bivariate associations (correlations)
      o Graphical techniques (scatter plots)

    It is important to take the measurement levels of the variables into account for the analyses, as special statistical techniques are available for each level:

    Nominal and ordinal variables
      o Frequency counts (numbers and percentages)
      o Associations
          - cross-tabulations
          - hierarchical log-linear analysis (restricted to a maximum of 8 variables)
          - log-linear analysis (to identify relevant/important variables and possible confounders)
      o Exact tests or bootstrapping (in case subgroups are small)
      o Computation of new variables

    Continuous variables
      o Distribution
          - Statistics (mean, standard deviation, variance, skewness, kurtosis)
          - Stem-and-leaf displays
          - Box plots

    Statistical analysis of Data

    Statistical data analysis is the data analysis plan used to examine the research questions or hypotheses. It specifies the statistic to be used, the assumptions of that statistic, and how the statistic will be interpreted. The sample size requirement, which states how many participants are required for the study, is also part of the statistical data analysis.

    Descriptive statistics are useful for describing the basic features of data, for example, the summary statistics for the scale variables and measures of the data. In a research study with large data, descriptive statistics may help us to manage the data and present it in a summary table. For instance, in a cricket match, descriptive statistics can help us to manage the records of a player and to compare one player's records with another player's records.

    1. Measure of central tendency: In descriptive statistics, the measure of central tendency measures the average value of the sample. There are two types of averages: mathematical averages (such as the arithmetic mean) and positional averages (such as the median and the mode).

    2. Measure of dispersion: In descriptive statistics, we can elaborate upon the data further by measuring the dispersion. The range, the standard deviation, and the variance are usually used to measure the dispersion. The range is defined as the difference between the highest and the lowest value. The standard deviation is also called the root mean square deviation. The variance is the square of the standard deviation.
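
    As a minimal sketch of these measures, the following Python snippet uses the standard statistics module on a small list of made-up scores (the values are illustrative and are not taken from the university data). It uses the population formulas, which match the root mean square definition; statistics.stdev and statistics.variance would give the sample (n - 1) versions instead.

```python
import statistics

# Hypothetical scores, for illustration only (not from the data set).
scores = [2, 4, 4, 4, 5, 5, 7, 9]

# Measures of central tendency
mean = statistics.mean(scores)      # mathematical average
median = statistics.median(scores)  # positional average (middle value)
mode = statistics.mode(scores)      # positional average (most frequent value)

# Measures of dispersion
value_range = max(scores) - min(scores)   # highest minus lowest value
variance = statistics.pvariance(scores)   # population variance
std_dev = statistics.pstdev(scores)       # root mean square deviation

print(mean, median, mode, value_range, variance, std_dev)
```

    Squaring std_dev gives back variance, which is the relation between the two measures described above.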

    Statistical tools used for data analysis and report writing

    Correlation

    Correlation is a measure of the relation between two or more variables. The measurement scales used should be at least interval scales, but other correlation coefficients are available to handle other types of data. Correlation coefficients can range from -1.00 to +1.00. The value of -1.00 represents a perfect negative correlation while a value of +1.00 represents a perfect positive correlation. A value of 0.00 represents a lack of correlation. The most widely used type of correlation coefficient is Pearson r, also called linear or product-moment correlation.

    Simple Linear Correlation or Pearson correlation assumes that the two variables are measured on at least interval scales, and it determines the extent to which values of the two variables are "proportional" to each other. The value of correlation (i.e., correlation coefficient) does not depend on the specific measurement units used; for example, the correlation between height and weight will be identical regardless of whether inches and pounds, or centimeters and kilograms, are used as measurement units.
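
    The unit independence described above can be checked with a short Python sketch. The function pearson_r and the height and weight values are hypothetical and written only for this illustration; they are not part of the data set or of any statistical package.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares of the deviations
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical heights (inches) and weights (pounds), for illustration only.
height_in = [60, 62, 65, 68, 70, 72]
weight_lb = [110, 120, 140, 155, 160, 180]

# The same measurements converted to centimeters and kilograms.
height_cm = [h * 2.54 for h in height_in]
weight_kg = [w * 0.45359237 for w in weight_lb]

r_imperial = pearson_r(height_in, weight_lb)
r_metric = pearson_r(height_cm, weight_kg)
print(round(r_imperial, 4), round(r_metric, 4))  # the two values agree
```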

    t-Test for Independent Samples

    The t-test is the most commonly used method to evaluate the differences in means between two groups. Theoretically, the t-test can be used even if the sample sizes are very small (e.g., as small as 10; some researchers claim that even smaller n's are possible), as long as the variables are normally distributed within each group and the variation of scores in the two groups is not reliably different. The equality of variances assumption can be verified with the F test, or you can use the more robust Levene's test. If these conditions are not met, then you can evaluate the differences in means between two groups using one of the nonparametric alternatives to the t-test.

    Paired sample t-Test

    Paired sample t-test helps us to take advantage of one specific type of design in which an important source of within-group variation can be easily identified and excluded from the analysis. Specifically, if two groups of observations are based on the same sample of subjects who were tested twice (e.g., before and after a treatment), then a considerable part of the within-group variation in both groups of scores can be attributed to the initial individual differences between subjects.

    Specifically, instead of treating each group separately, and analyzing raw scores, we can look only at the differences between the two measures (e.g., "pre-test" and "post-test") in each subject. By subtracting the first score from the second for each subject and then analyzing only those "pure (paired) differences," we will exclude the entire part of the variation in our data set that results from unequal base levels of individual subjects.
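
    The idea of analyzing only the paired differences can be sketched in Python as follows. The pre-test and post-test scores are made up for illustration; the t statistic is the mean difference divided by its standard error, with n - 1 degrees of freedom.

```python
import math
import statistics

# Hypothetical pre-test and post-test scores for the same six subjects.
pre = [10, 12, 9, 14, 11, 13]
post = [12, 14, 10, 17, 11, 15]

# Analyze only the paired differences (second score minus first score).
diffs = [after - before for before, after in zip(pre, post)]

n = len(diffs)
mean_diff = statistics.mean(diffs)
sd_diff = statistics.stdev(diffs)   # sample standard deviation (n - 1)
se_diff = sd_diff / math.sqrt(n)    # standard error of the mean difference
t_stat = mean_diff / se_diff        # paired-sample t statistic
df = n - 1

print(diffs, round(t_stat, 3), df)
```

    Because each subject serves as his or her own baseline, the differing base levels of the subjects do not enter the calculation at all.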

    Cross-tabulation

    Cross-tabulation is a combination of two (or more) frequency tables arranged such that each cell in the resulting table represents a unique combination of specific values of cross-tabulated variables. Thus, cross-tabulation allows us to examine frequencies of observations that belong to specific categories on more than one variable. Only categorical (nominal) variables or variables with a relatively small number of different meaningful values should be cross-tabulated. Note that in the cases where we do want to include a continuous variable in a cross-tabulation (e.g., income), we can first recode it into a particular number of distinct ranges (e.g., low, medium, high). For example, suppose we conduct a simple study in which males and females are asked to choose one of two different brands of soda pop (brand A and brand B).
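
    The soda example can be sketched in Python with the standard collections module. The eight responses below are made up for illustration only, since no counts are given for this example; each cell of the resulting table counts one combination of sex and brand.

```python
from collections import Counter

# Hypothetical responses (sex, preferred brand), for illustration only.
responses = [
    ("male", "A"), ("male", "B"), ("male", "A"), ("male", "A"),
    ("female", "B"), ("female", "B"), ("female", "A"), ("female", "B"),
]

# Each cell counts one unique combination of the two categorical variables.
cells = Counter(responses)

rows = ["male", "female"]
cols = ["A", "B"]

# Print the 2 x 2 table with row and column totals.
print(f"{'':8}{'A':>4}{'B':>4}{'Total':>7}")
for r in rows:
    counts = [cells[(r, c)] for c in cols]
    print(f"{r:8}{counts[0]:>4}{counts[1]:>4}{sum(counts):>7}")

col_totals = [sum(cells[(r, c)] for r in rows) for c in cols]
print(f"{'Total':8}{col_totals[0]:>4}{col_totals[1]:>4}{sum(col_totals):>7}")
```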

    Chi-square

    Chi-square Test for Association is a test of statistical significance widely used in bivariate tabular association analysis. Typically, the hypothesis is whether or not two different populations are different enough in some characteristic or aspect of their behavior, based on two random samples. This test procedure is also known as the Pearson chi-square test.

    ANOVA

    Analysis of variance (ANOVA) is used to test for significant differences between means. If we are only comparing two means, ANOVA will produce the same results as the t-test for independent samples (if we are comparing two different groups of cases or observations) or the t-test for dependent samples (if we are comparing two variables in one set of cases or observations). The purpose of analysis of variance is to test differences in means (for groups or variables) for statistical significance. This is accomplished by analyzing the variance, that is, by partitioning the total variance into the component that is due to true random error and the components that are due to differences between means. These latter variance components are then tested for statistical significance, and, if significant, we reject the null hypothesis of no differences between means and accept the alternative hypothesis that the means (in the population) are different from each other.
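
    The partitioning of the total variance can be sketched in Python as follows. The three groups and their scores are hypothetical and chosen only so that the arithmetic is easy to follow; the between-groups and within-groups sums of squares add up to the total, and F is the ratio of the two mean squares.

```python
# Hypothetical scores for three groups, for illustration only.
groups = {
    "group_1": [4, 5, 6],
    "group_2": [7, 8, 9],
    "group_3": [10, 11, 12],
}

all_values = [v for values in groups.values() for v in values]
grand_mean = sum(all_values) / len(all_values)

# Between-groups sum of squares: group means around the grand mean.
ss_between = sum(
    len(values) * (sum(values) / len(values) - grand_mean) ** 2
    for values in groups.values()
)

# Within-groups sum of squares: values around their own group mean (error).
ss_within = sum(
    (v - sum(values) / len(values)) ** 2
    for values in groups.values()
    for v in values
)

# Total sum of squares: all values around the grand mean.
ss_total = sum((v - grand_mean) ** 2 for v in all_values)

df_between = len(groups) - 1
df_within = len(all_values) - len(groups)

# F is the ratio of the between-groups and within-groups mean squares.
f_ratio = (ss_between / df_between) / (ss_within / df_within)
print(ss_between, ss_within, ss_total, f_ratio)
```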

    Reporting

    There are five reports. Each report consists of six parts, as given below:

    Problem Statement

    Null Hypothesis

    Statistics to be used

    Findings

    Conclusion

    Histogram

    Report # 1

    Variables to analyze

    1. Student sex

    2. Grade point average

    Statistical Tool

    Crosstab

    Problem Statement

    Is there any significant association between student sex and student grade point average?

    Null Hypothesis

    There is no significant association between student sex and student grade point average.

    Statistics to be used

    Since the variables represent nominal and ordinal levels of measurement, a crosstab will be used to test for a significant association between student sex and grade point average.

    STUDENTS SEX * STUDENTS GRADE POINT AVERAGE Crosstabulation

    Count

                   STUDENTS GRADE POINT AVERAGE
    STUDENTS SEX   LESS THAN 2.00   2.0-2.49   2.50-2.99   3.0-3.49   3.5-4.0   Total
    MALE                       24        118         163        179       123     607
    FEMALE                     17        117         196        236       177     743
    Total                      41        235         359        415       300    1350

    Chi-Square Tests

                                   Value      df   Asymp. Sig. (2-sided)
    Pearson Chi-Square             8.164(a)    4   .086
    Likelihood Ratio               8.146       4   .086
    Linear-by-Linear Association   7.347       1   .007
    N of Valid Cases               1350

    a. 0 cells (.0%) have expected count less than 5. The minimum expected count is 18.43.
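
    As a check on the table, the Pearson chi-square value can be recomputed from the observed counts of the crosstabulation. The sketch below is written for this illustration and is not the SPSS procedure itself; the helper chi2_sf_even_df is a hypothetical name and uses a closed-form upper-tail probability that holds only when the degrees of freedom are even (here df = 4).

```python
import math

# Observed counts from the crosstabulation above (rows: MALE, FEMALE).
observed = [
    [24, 118, 163, 179, 123],
    [17, 117, 196, 236, 177],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count per cell under independence: row total * column total / N
expected = [[r * c / n for c in col_totals] for r in row_totals]

chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(observed) - 1) * (len(col_totals) - 1)

def chi2_sf_even_df(x, df):
    """Upper-tail probability of the chi-square distribution (even df only)."""
    half = x / 2
    return math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(df // 2))

p_value = chi2_sf_even_df(chi_square, df)
min_expected = min(min(row) for row in expected)

print(round(chi_square, 3), df, round(p_value, 3), round(min_expected, 2))
```

    The resulting p-value is then compared with the chosen significance level (0.05 is the level used in the tables of this report).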

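    Findings

    The cross-tab shows that the association between student sex and student grade point average is not significant at the 0.05 level (Chi-square = 8.164, df = 4, sig = .086). The null hypothesis claiming no significant association between student sex and student grade point average is therefore not rejected. The linear-by-linear association, however, is significant (7.347, df = 1, sig = .007), which points to a trend across the ordered grade bands.

    Conclusion

    Comparatively more female students than male students fall in the higher grade bands, although the overall association does not reach significance at the 0.05 level.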

    Histogram

    Report # 2

    Variables to analyze

    1. Faculty rank

    2. Faculty sex

    Statistical Tool

    Independent Sample t-test

    Problem Statement

    Is there any significant difference between the faculty rank of males and females?

    Null Hypothesis

    There is no significant difference between the faculty rank of males and females.

    Statistics to be used

    Since faculty rank represents an ordinal level of measurement, an independent sample t-test will be used to test for a significant difference between the faculty rank of males and females.

    Findings

    Group Statistics

                   FACULTY SEX     N   Mean   Std. Deviation   Std. Error Mean
    FACULTY RANK   MALE          849   2.64            1.047              .036
                   FEMALE        579   2.11             .858              .036

    Independent Samples Test (FACULTY RANK)

    Levene's Test for Equality of Variances

    F        Sig.
    50.610   .000

    t-test for Equality of Means (Lower and Upper are the bounds of the 95% Confidence Interval of the Difference)

                                  t        df        Sig. (2-tailed)   Mean Difference   Std. Error Difference   Lower   Upper
    Equal variances assumed       10.182   1426      .000              .535              .053                    .432    .638
    Equal variances not assumed   10.568   1.379E3   .000              .535              .051                    .436    .634
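
    The two rows of the t-test can be approximately reproduced from the summary figures in the Group Statistics table. The sketch below is an illustration under that assumption, not the SPSS computation; it uses the reported mean difference of .535 because the unrounded group means are not shown, so small rounding differences from the table are to be expected.

```python
import math

# Summary statistics from the Group Statistics table (FACULTY RANK by sex).
n1, sd1 = 849, 1.047   # male
n2, sd2 = 579, 0.858   # female
mean_diff = 0.535      # reported mean difference

# Equal variances assumed: pooled variance and its standard error.
pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
se_pooled = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
t_pooled = mean_diff / se_pooled
df_pooled = n1 + n2 - 2

# Equal variances not assumed (Welch): separate variance term per group.
v1, v2 = sd1**2 / n1, sd2**2 / n2
se_welch = math.sqrt(v1 + v2)
t_welch = mean_diff / se_welch
df_welch = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

print(round(t_pooled, 2), df_pooled, round(t_welch, 2), round(df_welch))
```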

    (a) Levene's test indicates that the groups of male and female faculty members are not homogeneous in variability (F = 50.610, sig = .000), so the lower row (equal variances not assumed) will be used to measure the significant difference.

    (b) The t-test indicates that there is a significant difference between the faculty rank of males and females (t = 10.568, sig = .000). The null hypothesis claiming no significant difference between the faculty rank of males and females is therefore rejected.

    Conclusion

    Male and female faculty members do not hold equal ranks on average.

    Histogram

    Report # 3

    Variables to analyze

    1. Faculty salary

    2. Four highest degree levels

    Statistical Tool

    ANOVA (Analysis of Variance)

    Problem Statement

    Is there any significant difference between faculty salary and four levels of highest degree earned?

    Null Hypothesis

    There is no significant difference between faculty salary and four levels of highest degree earned.

    Statistics to be used

    Since faculty salary is compared across four groups defined by the highest degree earned by the faculty members, ANOVA will be used.

    ANOVA

    FACULTY SALARY

                     Sum of Squares     df   Mean Square         F   Sig.
    Between Groups        17519.789      3      5839.930   130.734   .000
    Within Groups         63610.329   1424        44.670
    Total                 81130.118   1427
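
    The entries of the ANOVA table are tied together by simple arithmetic, which the following Python sketch reproduces from the sums of squares and degrees of freedom reported above. It is only a consistency check on the table, not a re-analysis of the data.

```python
# Sums of squares and degrees of freedom from the ANOVA table above.
ss_between, df_between = 17519.789, 3
ss_within, df_within = 63610.329, 1424

# Mean square = sum of squares / degrees of freedom
ms_between = ss_between / df_between
ms_within = ss_within / df_within

# F ratio = between-groups mean square / within-groups mean square
f_ratio = ms_between / ms_within

# The two components add up to the total row of the table.
ss_total = ss_between + ss_within
df_total = df_between + df_within

print(round(ms_between, 3), round(ms_within, 3), round(f_ratio, 3))
print(round(ss_total, 3), df_total)
```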

    Multiple Comparisons

    FACULTY SALARY
    LSD

    (I) HIGHEST      (J) HIGHEST      Mean Difference   Std. Error   Sig.   95% Confidence Interval
    DEGREE EARNED    DEGREE EARNED    (I-J)                                 Lower Bound   Upper Bound
    B.A.             M.A.             -3.078            1.710        .072   -6.43         .28
                     PH.D.            -9.933*           1.589        .000   -13.05        -6.82
                     UNKNOWN          -1.895            1.627        .244   -5.09         1.30
    M.A.             B.A.             3.078             1.710        .072   -.28          6.43
                     PH.D.            -6.855*           .697         .000   -8.22         -5.49
                     UNKNOWN          1.183             .779         .129   -.34          2.71
    PH.D.            B.A.             9.933*            1.589        .000   6.82          13.05
                     M.A.             6.855*            .697         .000   5.49          8.22
                     UNKNOWN          8.038*            .456         .000   7.14          8.93
    UNKNOWN          B.A.             1.895             1.627        .244   -1.30         5.09
                     M.A.             -1.183            .779         .129   -2.71         .34
                     PH.D.            -8.038*           .456         .000   -8.93         -7.14

    *. The mean difference is significant at the 0.05 level.

    Findings

    (a) Figures given in the above ANOVA table indicate that there is a significant difference in faculty salary across the four levels of highest degree earned (F = 130.734, sig = .000).

    (b) The LSD post hoc test indicates that Ph.D. holders differ significantly in salary from each of the other three groups (B.A., M.A. and UNKNOWN), while the differences among the B.A., M.A. and UNKNOWN groups are not significant at the 0.05 level. The null hypothesis stating no significant difference is therefore rejected.

    Conclusion

    Faculty members with a higher degree also have higher salaries; in particular, Ph.D. holders earn significantly more than the other groups.

    Histogram

    Report # 4

    Variables to analyze

    1. Student expected grade

    2. Student grade point average

    Statistical Tool

    Paired sample t-test

    Problem Statement

    Is there any significant difference between student expected grade and student grade point average?

    Null Hypothesis

    There is no significant difference between student expected grade and student grade point average.

    Statistics to be used

    Since both variables are measured on the same individuals, a paired sample t-test will be used to test for a significant difference between the student expected grade and grade point average.

    Findings

    Paired Samples Correlations

                                                                       N      Correlation   Sig.
    Pair 1   EXPECTED GRADE IN COURSE & STUDENTS GRADE POINT AVERAGE   1358   .357          .000

    Paired Samples Test

    Pair 1: EXPECTED GRADE IN COURSE - STUDENTS GRADE POINT AVERAGE

    Paired Differences (Lower and Upper are the bounds of the 95% Confidence Interval of the Difference)

    Mean   Std. Deviation   Std. Error Mean   Lower   Upper
    .493   1.110            .030              .434    .552

    t        df     Sig. (2-tailed)
    16.358   1357   .000
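
    The figures in the Paired Samples Test table can be approximately reproduced from the mean and standard deviation of the paired differences. The sketch below is an illustration based on the rounded values in the table, so the t value differs slightly from the reported 16.358; the critical value 1.96 is an assumption that is reasonable only because the degrees of freedom are very large.

```python
import math

# Values from the Paired Samples Test table above.
n = 1358
mean_diff = 0.493   # mean of (expected grade - grade point average)
sd_diff = 1.110     # standard deviation of the paired differences

se_diff = sd_diff / math.sqrt(n)   # standard error of the mean difference
t_stat = mean_diff / se_diff       # paired-sample t statistic
df = n - 1

# Approximate 95% confidence interval; with df this large the critical
# t value is very close to the normal value 1.96.
critical = 1.96
ci_lower = mean_diff - critical * se_diff
ci_upper = mean_diff + critical * se_diff

print(round(se_diff, 3), round(t_stat, 2), df, round(ci_lower, 3), round(ci_upper, 3))
```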

    (a) The correlation between the expected grade and grade point average is low but significant (r = .357, sig = .000); the use of the paired sample t-test is therefore justified.

    (b) The paired sample t-test indicates that there is a significant difference between the expected grade and grade point average of the students (t = 16.358, df = 1357, sig = .000), so the null hypothesis claiming no significant difference is rejected.

    Conclusion

    On average, students expected higher grades than they obtained (mean difference = .493); most of the students did not get the grades they expected.

    Histogram

    Report # 5

    Variables to analyze

    1. Faculty salary

    2. Faculty rank

    Statistical Tool

    Pearson Correlation

    Problem Statement

    Is there any significant correlation between faculty rank and salary?

    Null Hypothesis

    There is no significant correlation between faculty rank and salary.

    Statistics to be used

    Since both variables are treated as ratio-level measurements, Pearson correlation will be used to find the correlation.

    Correlations

                                           FACULTY SALARY   FACULTY RANK
    FACULTY SALARY   Pearson Correlation   1                .715**
                     Sig. (2-tailed)                        .000
                     N                     1428             1428
    FACULTY RANK     Pearson Correlation   .715**           1
                     Sig. (2-tailed)       .000
                     N                     1428             1428

    **. Correlation is significant at the 0.01 level (2-tailed).
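
    The significance of a Pearson correlation is commonly tested with a t statistic on n - 2 degrees of freedom. The sketch below applies that standard formula to the r and N reported in the table; it is an illustration, and the report itself only gives the resulting significance value.

```python
import math

# Values from the Correlations table above.
r = 0.715
n = 1428

# Test statistic for the null hypothesis of no correlation (df = n - 2).
df = n - 2
t_stat = r * math.sqrt(df) / math.sqrt(1 - r**2)

# Squared correlation: proportion of variance shared by the two variables.
r_squared = r**2

print(round(t_stat, 2), df, round(r_squared, 3))
```

    With a sample this large, a t statistic of this size is consistent with the reported significance of .000.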

    Findings

    The Pearson correlation between faculty rank and salary indicates a high positive correlation (r = .715, sig = .000), so the null hypothesis claiming no significant correlation is rejected.

    Conclusion

    Salary is strongly related to faculty rank: those who hold higher ranks tend to receive higher salaries.

    Histogram

