Date post: | 16-Dec-2015 |
University of Minnesota Educational Psychology
What Can We Learn from Quantitative Data in Statistics Education Research?
Sterling Hilton Brigham Young University
Andy Zieffler University of Minnesota
John Holcomb Cleveland State University
Marsha Lovett Carnegie Mellon University
Introduction
Components of a research program
  Generate ideas (pre-clinical)
    Develop a conceptual framework
  Frame question (pre-clinical, Phase I)
    Constructs and Measurement
    Design and Methods
    Pilot study
  Examine question (Phase I, Phase II)
    Establish efficacy (small)
  Generalize findings (Phase III)
    Larger studies in varied settings
  Extend findings (Phase IV)
    Longitudinal studies
    Different populations
Introduction
Quantitative methods in research program
  Framing: measurement development
    Validity and reliability
  Framing: pilot study
  Examine
  Generalize
  Extend
Statistics education research is primarily in the “generate” and “frame” phases
Introduction
Purpose: Introduce two instruments that are in different stages of development and discuss how they have been and might be used in statistics education research
  Comprehensive Assessment of Outcomes in a First Statistics course (CAOS)
  Survey of Attitudes Toward Statistics (SATS)
Assessment Resource Tools for Improving Statistical Thinking
Several online assessments
  ARTIST Topic Scales
  Comprehensive Assessment of Outcomes in a First Statistics course (CAOS)
  Statistics Thinking and Reasoning Test (START)
ARTIST Topic Scales
7-15 MC items
Many topics
  Data Collection
  Data Representation
  Measures of Center
  Measures of Spread
  Normal Distribution
  Probability
  Bivariate Quantitative Data
  Bivariate Categorical Data
  Sampling Distributions
  Confidence Intervals
  Significance Tests
CAOS Test
40 MC items
Designed to assess students’ statistical reasoning after any first course in statistics.
CAOS test focuses on statistical literacy and conceptual understanding, with a focus on reasoning about variability.
Developed through a three-year process of acquiring and writing items, testing and revising items, and gathering evidence of reliability and validity.
CAOS Test
Reliability Analysis
  Sample of 10287
  Cronbach’s alpha coefficient of .77
Content Validity Evidence
  18 expert raters
  Unanimous agreement that CAOS measures important basic learning outcomes
  All raters agreed with the statement “CAOS measures outcomes for which I would be disappointed if they were not achieved by students who succeed in my statistics courses.”
  Some raters indicated topics that they felt were missing from the scale - no agreement among these raters about the topics that were missing.
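For readers who want to compute a reliability coefficient on their own class data, here is a minimal sketch of Cronbach's alpha using the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), where k is the number of items. The function name and the tiny 0/1 response matrix are illustrative assumptions, not CAOS data.

```python
from statistics import pvariance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of rows (one row of item scores per student)."""
    k = len(scores[0])
    if k < 2:
        raise ValueError("need at least two items")
    # Variance of each item across students.
    item_vars = [pvariance([row[j] for row in scores]) for j in range(k)]
    # Variance of each student's total score.
    total_var = pvariance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical incorrect/correct (0/1) responses: 5 students x 4 items.
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 2))
```

The same variance type (population here) must be used for the items and the totals; the ratio, and so alpha, is unchanged if sample variances are used for both.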
START Test
14 MC items
Identified through a principal components analysis performed on CAOS data gathered in Fall 2005 and Spring 2006 (n = 1470).
Alpha coefficient from that data set was calculated to be 0.74.
Use of Quantitative Measures in a Phase I Study
Exploratory Studies
  What can we find out about students’ understanding?
  Where are students having difficulties?
  Are there inconsistencies in students’ reasoning?
Example Item 1
Measured Learning Outcome
Understanding the interpretation of a median in the context of boxplots.
Example Item 1
The two boxplots below display final exam scores for all students in two different sections of the same course
Example Item 1
Which section has a greater percentage of students with scores at or above 80?
a) Section A
b) Section B
c) Both sections are about equal.
Example Item 1
Pretest   Posttest   Response (N = 754)
73.7%     65.6%      Section A
6.6%      6.1%       Section B
19.6%     28.2%      Both sections are about equal.
Example Item 1
Is this surprising?
What can we learn from students’ responses to this item?
Implications/Directions for research? Teaching?
Example Item 2
Researchers surveyed 1,000 randomly selected adults in the U.S. A statistically significant, strong positive correlation was found between income level and the number of containers of recycling they typically collect in a week. Please select the best interpretation of this result.
Example Item 2
a) We can not conclude whether earning more money causes more recycling among U.S. adults because this type of design does not allow us to infer causation.
b) This sample is too small to draw any conclusions about the relationship between income level and amount of recycling for adults in the U.S.
c) This result indicates that earning more money influences people to recycle more than people who earn less money.
Example Item 2
Pretest   Posttest   Response (N = 743)
54.6%     52.6%      We can not conclude whether earning more money causes more recycling among U.S. adults because this type of design does not allow us to infer causation.
18.3%     11.4%      This sample is too small to draw any conclusions about the relationship between income level and amount of recycling for adults in the U.S.
27.1%     35.9%      This result indicates that earning more money influences people to recycle more than people who earn less money.
Example Item 2
Is this surprising?
What can we learn from students’ responses to this item?
Implications/Directions for research? Teaching?
Example Item 3
Measured Learning Outcome
Ability to match a scatterplot to a verbal description of a bivariate relationship.
Example Item 3
Bone density is typically measured as a standardized score with a mean of 0 and a standard deviation of 1. Lower scores correspond to lower bone density. Which of the following graphs shows that as women grow older they tend to have lower bone density?
Example Item 3
Pretest   Posttest   Response (N = 748)
90.5%     92.5%      Graph A
6.1%      6.6%       Graph B
3.3%      0.9%       Graph C
Example Item 3
Is this surprising?
What can we learn from students’ responses to this item?
Implications/Directions for research? Teaching?
Example Item 4
Measured Learning Outcome
Understanding of the purpose of randomization in an experiment.
Example Item 4
A recent research study randomly divided participants into groups who were given different levels of Vitamin E to take daily. One group received only a placebo pill. The research study followed the participants for eight years to see how many developed a particular type of cancer during that time period. Which of the following responses gives the best explanation as to the purpose of randomization in this study?
Example Item 4
a) To increase the accuracy of the research results.
b) To ensure that all potential cancer patients had an equal chance of being selected for the study.
c) To reduce the amount of sampling error.
d) To produce treatment groups with similar characteristics.
e) To prevent skewness in the results.
Example Item 4
Pretest   Posttest   Response (N = 754)
41.4%     31.8%      To increase the accuracy of the research results.
13.5%     19.8%      To ensure that all potential cancer patients had an equal chance of being selected for the study.
22.7%     29.4%      To reduce the amount of sampling error.
8.5%      12.3%      To produce treatment groups with similar characteristics.
13.9%     6.6%       To prevent skewness in the results.
Example Item 4
Is this surprising?
What can we learn from students’ responses to this item?
Implications/Directions for research? Teaching?
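One simple way to start answering these questions is to tabulate how each response option moved from pretest to posttest. The sketch below does this for Example Item 4 using the percentages reported on the slide; the option letters follow the item as shown, and the function name is an illustrative assumption. It works only from the reported marginal percentages, so it describes shifts in the response distribution and cannot show how individual students changed.

```python
# Response percentages for Example Item 4 as reported (N = 754):
# option letter -> (pretest %, posttest %).
item4 = {
    "a": (41.4, 31.8),  # increase the accuracy of the results
    "b": (13.5, 19.8),  # equal chance of being selected
    "c": (22.7, 29.4),  # reduce sampling error
    "d": (8.5, 12.3),   # treatment groups with similar characteristics
    "e": (13.9, 6.6),   # prevent skewness
}


def point_changes(table):
    """Posttest minus pretest, in percentage points, for each option."""
    return {opt: round(post - pre, 1) for opt, (pre, post) in table.items()}


changes = point_changes(item4)
# Options ordered from largest gain to largest loss.
ranked = sorted(changes, key=changes.get, reverse=True)

for opt in ranked:
    print(f"{opt}: {changes[opt]:+.1f} points")
```

Listing the options by size of change makes it easy to see which distractors gained ground during the course, which is one input for deciding which students to interview.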
How Can We Use the Results?
Begin to look for underlying reasons students are having difficulties
  Examine the research literature
  Interview students to gain a more in-depth understanding of their reasoning
  Compare results with data from other classes (other teachers, schools)
How Can We Use the Results?
They can inform our instruction
  Reconsider how difficult or easy some concepts are for students
  Rethink how we currently teach these ideas
  Add new activities or tools
  Re-allocate classroom time
Change the way we assess students
  Assessment items better aligned with learning outcomes
  Assessment items that probe students’ reasoning
SATS
Survey of Attitudes Toward Statistics
  Candace Schau and Tom Dauphinee (http://www.unm.edu/~cschau/satshomepage.htm)
Twenty-eight item survey
Seven-point Likert scale response
  Strongly disagree     Neither agree nor disagree     Strongly agree
  1      2      3      4      5      6      7
SATS
Original four subscales
  Value (9 items; α range .80 - .90)
    “Statistics is worthless.”
  Affect (6 items; α range .80 - .85)
    “I like statistics.”
  Cognitive Competence (6 items; α range .77 - .85)
    “I have no idea of what’s going on in statistics.”
  Difficulty (7 items; α range .64 - .79)
    “Statistics is a complicated subject.”
SATS
Two additional subscales
  Interest (4 items)
    “I am interested in using statistics.”
  Effort (4 items)
    “I plan to complete all of my statistics assignments.”
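Because some items are worded negatively (for example “Statistics is worthless.”), responses need to be put on a common direction before a subscale score is formed. The sketch below assumes the usual approach for Likert subscales: reverse the negatively worded items on the 1 to 7 scale and average the items in the subscale. The function names, the choice of three Value items, and the student's responses are hypothetical; check the official SATS scoring instructions for the exact item keys.

```python
def reverse(response, low=1, high=7):
    """Flip a Likert response so a high value always means a positive attitude."""
    return low + high - response


def subscale_score(responses, negatively_worded):
    """Mean of one student's item responses after reversing negative items."""
    adjusted = [
        reverse(r) if item in negatively_worded else r
        for item, r in responses.items()
    ]
    return sum(adjusted) / len(adjusted)


# Hypothetical responses from one student to three Value items.
value_responses = {
    "Statistics is worthless.": 2,
    "Statistical skills will make me more employable.": 6,
    "Statistics is irrelevant in my life.": 1,
}
# Assumed set of negatively worded items among those three.
negative_items = {
    "Statistics is worthless.",
    "Statistics is irrelevant in my life.",
}
score = subscale_score(value_responses, negative_items)
print(round(score, 2))
```

Keeping scores on the original 1 to 7 metric makes pretest and posttest subscale means directly comparable, with 4 as the neutral point.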
SATS
Attitude is a multi-faceted outcome
Issues to consider
  Pre-existing attitudes
  Direction and magnitude of changes over a semester
  Relevance of items to study
Using the SATS: A Case Study
Assessment of a project-rich introductory statistics course
Fall 2004, at Cleveland State University
Class 1: 30 students Pre/Post
Class 2: 16 students Pre/Post
SATS administered first day and final exam day
Class 1: Project-Rich
4 team projects that used/required
  Real data
  Computer software
  Collaboration
  Writing
Individualized Mid-Term and Take-home Data Analysis Exams
http://academic.csuohio.edu/holcombj/eku/index.html Login: holcomb pwd: projects22
[Figures: six plots of pretest SATS component scores (vertical axis, 1 to 7) by Class (horizontal axis, 1 and 2): PreAFFECT vs Class, PreCOGCOMP vs Class, PreVALUE vs Class, PreDIFFICULTY vs Class, PreINTEREST vs Class, PreEFFORT vs Class]
Class 1: Change from Pre to Post (2-sided tests)
Significant Differences for:
  Cognitive Competence
  Value
  Difficulty*
  Interest
Insignificant Differences for:
  Affect
  Effort
*(Not Significant with Nonparametric Test)
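The slides do not say which parametric or nonparametric tests were run. As a stand-in, the sketch below uses an exact sign-flip permutation test, one distribution-free way to test a pre/post change with paired data. The differences are hypothetical, not the Class 1 data, and the function name is an assumption.

```python
from itertools import product


def sign_flip_p_value(diffs):
    """Exact two-sided p-value for the mean of paired differences.

    Under the null hypothesis of no change, each difference is equally
    likely to be positive or negative, so every sign pattern is enumerated
    and compared with the observed sum.
    """
    observed = abs(sum(diffs))
    extreme = 0
    total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        flipped = abs(sum(s * d for s, d in zip(signs, diffs)))
        if flipped >= observed - 1e-12:  # tolerance for float rounding
            extreme += 1
    return extreme / total


# Hypothetical post-minus-pre component differences for 8 students.
diffs = [0.5, 1.0, 0.25, 0.75, 1.5, 0.5, 1.25, 1.0]
p = sign_flip_p_value(diffs)
print(p)
```

Full enumeration doubles in cost with each added student, so it suits small samples; for a class of 30, sampling random sign patterns would be the practical variant.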
[Figure: Six Components for Class 1: Pre - Post. Pre/post differences for each component; vertical axis from -6.00 to 6.00]
  diffAFFECT: p = 0.541
  diffCOGCOMP: p = 0.018
  diffVALUE: p = 0.038
  diffDIFFICULTY: p = 0.049
  diffINTEREST: p = 0.006
  diffEFFORT: p = 0.881
Class 2: Change from Pre to Post (2-sided tests)
Significant Differences
  Affect (wrong direction)
Insignificant Differences
  Cognitive Competence
  Value
  Difficulty
  Interest
  Effort
[Figure: Six Components for Class 2: Pre - Post. Pre/post differences for each component; vertical axis from -3.00 to 4.00]
  diffAFFECT: p = 0.020
  diffCOGCOMP: p = 0.522
  diffVALUE: p = 0.247
  diffDIFFICULTY: p = 0.303
  diffINTEREST: p = 0.062
  diffEFFORT: p = 0.051
Multivariate Analysis of Post Data
Class: Significant vs Insignificant
Significant Differences
  Affect
  Value
  Interest
Insignificant Differences
  Cognitive Competence
  Difficulty
  Effort
Does SATS Ask the Right Questions?
Value Component Questions
  Statistics is worthless.
  Statistics should be a required part of my professional training.
  Statistical skills will make me more employable.
  Statistics is not useful to the typical professional.
  Statistical thinking is not applicable in my life outside my job.
  I use statistics in my everyday life.
  Statistics conclusions are rarely presented in everyday life.
  I will have no application for statistics in my profession.
  Statistics is irrelevant in my life.
Instructors: Do try this at home!
But first, set your expectations
  Results may not be as high as you desire by the end of your course (e.g., CAOS)
  Results may not change from the beginning to the end of your course or in the direction you anticipate (e.g., SATS)
  Same is true for other instruments, too
How might you use such data?
To better understand students’ learning of particular concepts and skills
To identify different patterns of student performance
To establish a starting point for further inquiry
To make your teaching and students’ learning more effective
To assess where students start and to reveal areas of difficulty during the course
Some Practical Considerations
Motivating students to take these instruments seriously
  Grading?
  Feedback
Instrument integrity
Time to administer
Others?
INQUERI Project
INQUERI = Initiative for Quantitative Education Research Infrastructure
  To build a research infrastructure by focusing on the development, deployment, user training, and archiving of high quality research methods, instruments, and data
  To disseminate these methods and results
  To catalyze research collaborations
See www.inqueri.org
Back to the Big Picture
Focus on the question/goal you want to address and relate that to past research
Start small
  Using existing instruments is one way
  Working within your own course to start
Share with colleagues, connect with the literature, and then extend
References
delMas, R., Garfield, J., Ooms, A., & Chance, B. (2006). Assessing students' conceptual understanding after a first course in statistics. Paper presented at the Annual Meeting of the American Educational Research Association, San Francisco, CA.
Garfield, J., delMas, R., & Chance, B. (n.d.). Assessment Resource Tools for Improving Statistical Thinking. Retrieved May 8, 2007, from https://app.gen.umn.edu/artist/index.html.
References
http://www.unm.edu/~cschau/satshomepage.htm
Dauphinee, T. L., Schau, C., & Stevens, J. J. (1997). Survey of Attitudes Toward Statistics: Factor structure and factorial invariance for females and males. Structural Equation Modeling, 4, 129-141.
Schau, C., Stevens, J., Dauphinee, T. L., & Del Vecchio, A. (1995). The development and validation of the Survey of Attitudes Toward Statistics. Educational and Psychological Measurement, 55, 868-875.
Hilton, S. C., Schau, C., & Olsen, J. A. (2003). Survey of Attitudes Toward Statistics: Factor structure invariance by gender and by administration time. Structural Equation Modeling, 11, 92-109.
Contact Information
Sterling Hilton [email protected]
Andy Zieffler [email protected]
John Holcomb [email protected]
Marsha Lovett [email protected]