Revising the Course Evaluation Form
Process followed by
The Effective Teaching Committee of George Mason University
March 7, 2018
Charge of the Committee
The Effective Teaching Committee is charged with developing and helping implement
procedures that encourage and reward effective teaching, enabling faculty to improve their
teaching effectiveness independent of any evaluation procedures, and implementing
procedures for evaluation of effective teaching.
The Committee is charged with recommending policy to the Faculty Senate and
monitoring the use of such policy for the evaluation of teachers and courses, including the
following:
A. Review, improve, and provide guidance to Institutional Research and Reporting on the
course evaluation form and related procedures at least once every three years;
B. Review existing policies relating to the faculty evaluation process, identify alternatives to
these policies and recommend changes to the Faculty Senate;
C. Work closely with the Center for Teaching Excellence to support the use of formative and
self-assessment techniques and materials for promoting faculty professional growth and
teaching effectiveness, including strategies for robust student feedback.
Background
Of the three prongs of its charge, the Effective Teaching Committee focused during
AY 2016-17 on the Course Evaluation Form, since the form had not undergone any kind of
significant review or revision for over a decade. The Committee aims not only to recommend
revisions that make the form more research-based, more useful to faculty for improving
their teaching, and fairer when used for faculty evaluation, but also to validate
the form for the dual purposes for which it is currently used: faculty evaluation and
improvement of teaching.
We decided to follow a rigorous development process to include: (1) determining
elements of effective teaching; (2) obtaining feedback from the Mason community on those
elements; (3) drafting a policy for use of Course Evaluation Form data; (4) obtaining feedback
from the Mason community on that policy; (5) revising course items; (6) pilot testing and
revising the items again; (7) finalizing and using the revised Course Evaluation Form.
We chose to undertake this process for two main reasons: (1) to ensure that the
inferences made about teaching and the subsequent decisions based on those inferences are
valid and can be supported by an instrument that adheres to measurement development
principles, and (2) to protect all parties involved in a high-stakes evaluation process. We
recommend revisions to items on the form based on a review of the literature on the validity
and reliability of using university student evaluations of teaching (SETs), as well as on faculty
and student input for what they consider to be indicators of effective teaching.
In an effort to establish the construct validity of categories and items on the form, we
reviewed a variety of sources on teaching effectiveness and identified eighteen potential
categories that we ultimately collapsed into thirteen for ease of response on a survey. Some
of the categories overlapped, as became evident in the survey results and in
feedback from faculty. We identified categories of effective teaching by reviewing the criteria
for teaching excellence set by the Center for Teaching and Faculty Excellence, the Provost’s
criteria for genuine excellence in teaching, and databases for course evaluation form
categories and items used by other universities.
During Fall 2016 and Spring 2017, we sought to determine the relative importance of
the identified categories of effective teaching to four distinct stakeholder groups:
(1) all faculty, (2) faculty evaluation committees from every school, (3) academic deans and
administrators, and (4) a stratified random sample of students.
Methods
We conducted a survey to determine which aspects of teaching effectiveness each stakeholder
group perceived to be the most important. Most surveys were online, but the initial surveys in
Fall 2016 were distributed on paper at one of two meetings – the Provost’s Academic Council
and a forum of Faculty Evaluation Committee (FEC) Chairs. Surveys administered to Deans and
FEC members included questions about the perceived purpose of the Course Evaluation Form
and how they actually used the data generated by the form; their responses will inform our
policy recommendations to the Faculty Senate.
November 2016
We made brief presentations at meetings with the Provost’s Academic Council (Deans and
Directors) to familiarize them with the Committee’s charge and to ask them to take the survey.
We also met with Faculty Evaluation Committee (FEC) Chairs to ask for their input on the
survey, and they asked that we conduct the survey with all of their committee members
(annual evaluation, contract renewal, promotion & tenure), which we did online. Although the
total number of FEC members has not yet been confirmed, we received surveys from 25
respondents, some of whom may also have responded to the all-faculty survey conducted in
January.
January 2017
We conducted two online surveys, one for all instructional faculty and another for a stratified
random sample (25%) of the entire student population. The Faculty Senate distributed the all-
faculty survey and the Office of Institutional Research and Assessment (OIRA) distributed the
survey to the student sample. In each of these surveys we asked respondents to rate each of
thirteen categories of teaching effectiveness according to its importance to them.
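For readers unfamiliar with the sampling approach, the following is a minimal sketch of how a 25% stratified random sample might be drawn. It is an illustration under stated assumptions, not the procedure the distributing office actually used: this report does not say which strata were chosen, so the sketch assumes stratification by school, and the roster, the function name stratified_sample, and the seed are all hypothetical.

```python
import random
from collections import defaultdict


def stratified_sample(population, stratum_of, fraction=0.25, seed=0):
    """Draw the same fraction from every stratum (proportional allocation).

    population: list of units (here, hypothetical student records)
    stratum_of: function mapping a unit to its stratum label
    fraction:   share of each stratum to sample (0.25 mirrors the 25% above)
    seed:       fixed seed so the draw is reproducible (an assumption)
    """
    rng = random.Random(seed)

    # Group units by stratum.
    strata = defaultdict(list)
    for unit in population:
        strata[stratum_of(unit)].append(unit)

    # Sample without replacement within each stratum.
    sample = []
    for label in sorted(strata):
        members = strata[label]
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical roster of (student_id, school) pairs; not real enrollment data.
roster = (
    [(i, "Engineering") for i in range(400)]
    + [(i, "Business") for i in range(400, 600)]
    + [(i, "Education") for i in range(600, 680)]
)

sample = stratified_sample(roster, stratum_of=lambda student: student[1])
print(len(sample), "of", len(roster), "students selected")
```

Sampling the same fraction within each stratum keeps every school represented in proportion to its enrollment, which a simple random draw from the whole roster would only achieve on average.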
Surveys conducted with faculty also allowed for open-ended responses on what they would
suggest adding to the current Course Evaluation Form.
Surveys conducted with Deans and Directors, as well as with Faculty Evaluation Committee
members, had additional open-ended responses on what they saw as the purpose of the form,
how they defined effective teaching, and how they used the results of Course Evaluation
Forms. Their responses will guide our policy recommendations for how results are to be used
for valid and fair evaluation of faculty (see Purpose of the Course Evaluation Form, below).
Surveys conducted with students contained only the thirteen categories associated with
effective teaching.
Instructional faculty, faculty evaluation committees, administrators, and students all
tended to agree on the most important aspects of effective teaching. Across all schools,
faculty and students identified the following five categories as the most relevant for measuring
effective teaching: Communication, Commitment to Teaching, Respect for Students,
Preparation & Organization, and Passion for Teaching. Some differences across faculty role
(adjunct/term/tenure-line) were evident but not significant.
In addition to the indicators of teaching effectiveness selected by survey respondents
during AY 2016-17, findings from a previous survey of faculty conducted in Spring 2014 by
this Committee on the usefulness of the Course Evaluation Form suggested that fewer than
half (40%) of all respondents were satisfied with the current course evaluation form, and
almost one-third expressed dissatisfaction. Almost half (47%) of respondents indicated that
free or open-ended responses were of most use to them. Respondents suggested putting the
forms online, using fewer questions, using different types of questions, and adding more
open-ended, instructor-generated, and course-specific questions. Additional
categories recommended were questions on the instructor’s use of Blackboard and
technology, along with a different evaluation form for online courses and distance education.
Sept. - Dec. 2017
The Committee searched for and identified possible items for inclusion on a new form
based on the six categories of effective teaching identified by stakeholders over the past year.
We consulted with the Office of Digital Learning for suggestions on adding items specific to
online courses. We also met with the Faculty Senate Chair and the Director of OIRE to inquire
about the process and make plans to pilot the new form in Spring 2018.
Feb. 2018
To determine the clarity and relevance of new items being proposed for the Course
Evaluation Form, we met with a number of faculty and student focus groups and conducted an
online survey with Program Chairs. Participants in each focus group were asked to identify
items that were unclear or unnecessary for determining teaching effectiveness. Based on
feedback from these focus groups, we made further revisions to our proposed items. We met
with the following focus groups: (1) graduate students in the College of Education; (2)
undergraduate students in a face-to-face Spanish class; (3) graduate students in an online
Spanish course; (4) students in an engineering class; (5) Student Senate; (6) faculty from one
program in the Graduate School of Education. We also conducted an online survey asking for
feedback on the items from 168 Program Chairs.
Purpose of the Course Evaluation Form
Over time, the Committee’s understanding has changed with regard to our charge and
its various components. Based on our survey data, Deans and other administrators don’t
necessarily agree on the purpose of the form. Faculty do not agree that the form is useful for
evaluating teaching effectiveness. A number of instructional faculty, faculty evaluation
committee members, and administrators commented either on the survey itself, in a separate
email, or by personal communication that the Course Evaluation Form should not be used to
determine teaching effectiveness because it lacks reliability and/or validity for that purpose.
The perception is that it is a measure of student satisfaction based on an anticipated course
grade or of how the instructor interacted with each student, and our preliminary review of the
research appears to support this perception. The literature does not appear to support the
validity of using the form for evaluating teaching effectiveness.
It appears that the Course Evaluation Form suffers from trying to serve two masters,
formative and summative assessment of teaching performance, without having been
validated for either. This is a critical consideration, because in order to revise the items on
the Course Evaluation Form, we need to first establish its purpose, then validate the items for
that purpose. Based on the data, we see a need for a broader discussion on the purposes of
the form, whether two separate forms are needed for two distinct purposes, and alternative
approaches to faculty evaluation that are supported by the research. We propose the Faculty
Senate engage in this discussion as soon as possible.
Recommended Next Steps
The Committee recommends:
1. Piloting the revised draft form with online courses in May 2018 and face-to-face courses in
December 2018 with the intent to establish validity and reliability of the form.
2. Based on results of the pilots, revising items again for a final form.
Members of the Effective Teaching Committee, 2017 - 18
Lorraine Valdez Pierce, Chair, College of Education & Human Development
Mihai Boicu, Volgenau School of Engineering
Esperanza Roman-Mendoza,
Tom Wood, School of Integrative Studies
Alexandria Zylstra, School of Business
Student Evaluations of Teaching
References
Adams, M. J. D., & Umbach, P. D. (2012). Nonresponse and online student evaluations of teaching: Understanding the influence of salience, fatigue, and academic environments. Research in Higher Education, 53, 576-591.
Adams, M. J. D. & Umbach, P. (2010). Who Doesn't Respond and Why? An Analysis of Nonresponse to Online Student Evaluations of Teaching. Presented at the annual meeting of the Association for the Study of Higher Education.
Alhija, F. N. A., & Fresko, B. (2009). Student evaluation of instruction: What can be learned from students' written comments? Studies in Educational Evaluation, 35 (1), 37-44.
Avery, R. J., Bryant, W. K., Mathios, A., Kang, H., & Bell, D. (2006). Electronic course evaluations: Does an online delivery system influence student evaluations? The Journal of Economic Education, 37 (1), 21-37.
Amrein-Beardsley, A. & Haladyna, T. (2012). Validating a theory-based survey to
evaluate teaching effectiveness in higher education. Journal on Excellence
in College Teaching, 23 (1), 17-42.
Benton, S.L. & Cashin, W. E. (2012). Student ratings of teaching: A summary of
research and literature. IDEA Paper #50. Manhattan, KS: The IDEA Center.
Benton, S. L., Webster, R., Gross, A. B., & Pallett, W. H. (2010). An analysis of IDEA student ratings of instruction using paper versus online survey methods, 2002-2008 data. IDEA Technical Report #16 : The IDEA Center.
Berk, R. A. (2005). Survey of 12 strategies to measure teaching effectiveness.
International Journal of Teaching & Learning in Higher Education, 17 (1): 48-62.
Boring, A. (2015). Gender biases in student evaluations of teachers. Working
Paper. OFCE-PRESAGE-Sciences Po and LEDa-DIAL (France).
Boring, A., K. Ottoboni, & P. Stark (2016). Student Evaluations of Teaching (Mostly) Do Not Measure Teaching Effectiveness. Science Open Research.
Braga, M., M. Paccagnella, & M. Pellizzari. (2014). Evaluating students’
evaluations of professors. Economics of Education Review, 41, 71-78.
Bubb, D.K., G. Schraw, D.E. James, B. G. Brents, K. F. Kaalberg, G. C. Marchand, P.
Amy, & A. Cammett. (May-June 2013). Making the Case for Formative
Assessment: How it Improves Student Engagement and Faculty Summative
Course Evaluations. Assessment Update, 25 (3): 8-9, 12. Available online at
http://wileyonlinelibrary.com
Chang, T. S. (2004). The results of student ratings: Paper vs. online. Journal of
Taiwan Normal University, 49 (1), 171-186.
Chang, L. (Sept. 1994). A psychometric evaluation of 4-Point and 6-point Likert-
type scales in relation to reliability and validity. Applied Psychological
Measurement, 18 (3): 205-215.
Carrell, S. E. & J. E. West (2010). Does professor quality matter? Evidence from
random assignment of students to professors. Journal of Political Economy.
Vol. 118, No. 3, 409-432.
Chickering, A. W. & Z. F. Gamson (1999). Development and Adaptations of the
Seven Principles for Good Practice in Undergraduate Education. New
directions for teaching and learning (80), pp. 75-81.
Davis, B. G. (2009). Tools for teaching, 2nd Ed. San Francisco, CA: Jossey-Bass.
(chapter on student rating forms for interpreting student evaluations).
Dolmans, D., R. Kamp, R. Stalmeijer, J. Whittingham, & I. Wolfhagen. (2014). Biases in course evaluations: ‘what does the evidence say?’ Medical Education, 48 (2). Available online at http://onlinelibrary.wiley.com/doi/10.1111/medu.12297/abstract
Flaherty, C. (2015). AAUP committee survey data raises questions on
effectiveness of student teaching evaluations. Retrieved from
https://www.insidehighered.com/news/2015/06/10/aaup-committee-survey-data-raise-questions-
effectiveness-student-teaching
Ford, T. (2016). The problem of bias and student course evaluations. TopHat Blog
retrieved on March 22, 2016 from http://blog.tophat.com/student-course-
evaluations/?mkt_tok=3RkMMJWWfF9wsRolsqzAZKXonjHpfsX56%2B4kWaKylMI%2F0ER3fOvrP
UfGjI4FT8NrI%2BSLDwEYGJlv6SgFT7bDMapn07gFWRD0TD7slJfbfYRPf6Ba2Jwyq%2F4%3D
Glenn, D. (Dec. 19, 2010). 2 studies shed new light on the meaning of course
evaluations. Chronicle of Higher Education. Available online at
http://chronicle.com.mutex.gmu.edu/article/2-Studies-Shed-New-Light-
on/125745/#sthash.kIPDWM6R.dpuf
Glenn, D. (2010). Rating your professors: Scholars test improved course
evaluations. Chronicle of Higher Education. Retrieved on Sept. 12, 2016
from http://www.chronicle.com.mutex.gmu.edu/article/evaluations-that-make-
the/65226
Gravestock, P. & Gregor-Greenleaf, E. (2008). Student Course Evaluations:
Research, Models and Trends. Toronto: Higher Education Quality Council of
Ontario.
Hardy, N. (2003). Online ratings: Fact and fiction. In D. L. Sorenson & T. D. Johnson
(Eds.), Online student ratings of instruction. New Directions for Teaching
and Learning (Vol. 96, pp. 31-38). San Francisco: Jossey-Bass.
Hativa, N., Many, A., & Dayagi, R. (2010). The whys and wherefores of teacher
evaluation by their students. [Hebrew]. Al Hagova, 9, 30-37.
Hativa, N. (2013). Student ratings of instruction: A Practical approach to
designing, operating, and reporting. (see Chapter 7: Online Ratings)
Higher Learning Commission. (2016). Determining qualified faculty through HLC’s
criteria for accreditation and assumed practices. Guidelines for institutions
and peer reviewers. Chicago, IL: Author.
IDEA. (2011). Paper versus online survey delivery. IDEA Research Notes No. 4 : The
IDEA Center.
Johnson, T. D. (2003). Online student ratings: Will students respond? In D. L.
Sorenson & T. D. Johnson (Eds.), Online student ratings of instruction. New
Directions for Teaching and Learning (Vol. 96, pp. 49-59). San Francisco:
Jossey-Bass.
Jones, S. J. (2012). Reading between the lines of online course evaluations:
Identifiable actions that improve student perceptions of teaching
effectiveness and course value. Journal of Asynchronous Learning
Networks, 16 (1). Available online at http://files.eric.ed.gov/fulltext/EJ971039.pdf
Kulik, J. A. (2005). Online collection of student evaluations of teaching. Retrieved
April 2012, from http://www.umich.edu/~eande/tq/OnLineTQExp.pdf
Kwan, K. 1999. How Fair are Student Ratings in Assessing the Teaching
Performance of University Teachers? Assessment & Evaluation in Higher
Education 24:2.
Layne, L. (2012). Defining effective teaching. Journal on Excellence in College
Teaching, 23 (1), 43-68.
Layne, B. H., DeCristoforo, J. R., & McGinty, D. (1999). Electronic versus traditional
student ratings of instruction. Research in Higher Education, 40 (2), 221-
232.
Linse, A. R. (2010, Feb. 22). Building in-house online course eval system.
Professional and Organizational Development (POD) Network in Higher
Education, Listserv commentary.
Linse, A. R. (2012, April 27). Early release of the final course grade for students
who have completed the SRI form for that course. Professional and
Organizational Development (POD) Network in Higher Education, Listserv
commentary.
MacNell, L., A. Driscoll, and A. N. Hunt. (2014). What’s in a name: Exposing
gender bias in student ratings of teaching. Innovative Higher Education, 1–
13.
Marks, M., D. Fairris, & T. Beleche. (2010). Do Course Evaluations Reflect Student
Learning? Evidence from a Pretest/Posttest Setting. Riverside, CA:
University of California.
http://faculty.ucr.edu/~mmarks/Papers/marks2010course.pdf
Marlin, J. W. Jr. & J. F. Niss. (1980). End-of-course evaluations as indicators of
student learning and instructor effectiveness. The Journal of Economic
Education, Vol. 11, No. 2, 16-27.
Marsh, H. W. (2007). Students’ Evaluations of University Teaching:
Dimensionality, Reliability, Validity, Potential Biases and Usefulness in The
Scholarship of Teaching and Learning in Higher Education: An Evidence-
Based Perspective, (eds. R.P. Perry & J.C. Smart), pp 319-383.
Morgan, D. A., J. Sneed & L. Swinney. 2003. Are student evaluations a valid
measure of teaching effectiveness: perceptions of accounting faculty
members and administrators. Management Research News, 26 (7): 17-32.
Morley, D. (2014). Assessing the reliability of student evaluations of teaching:
Choosing the right coefficient. Assessment & Evaluation in Higher
Education 39 (2), 127–139.
Nulty, D. D. (2008). The adequacy of response rates to online and paper surveys:
What can be done? Assessment and Evaluation in Higher Education, 33,
301-314.
Perrett, J. I. (2013). Exploring graduate and undergraduate course evaluations
administered on paper and online: a case study. Assessment & Evaluation
in Higher Education, 38 (1). Available online at
http://www.tandfonline.com/doi/abs/10.1080/02602938.2011.604123#.VSwOAvnF-IC
Poropat, A. (2014). Students don’t know what’s best for their own learning. The
Conversation. Available at https://theconversation.com/students-dont-know-
whats-best-for-their-own-learning-33835
Porter, S. R., & Umbach, P. D. (2006). Student survey response rates across
institutions: Why do they vary? Research in Higher Education, 47 (2), 229-
247.
Rampichini, C., Grilli, & Petrucci. (2004). Analysis of university course evaluations:
from descriptive measures to multilevel models. Statistical Methods and
Applications, 13 (3): 357-373.
Sonner, B. S. (2000). A is for “Adjunct”: Examining Grade Inflation in Higher
Education. Journal of Education for Business, 76 (1), 5-8.
http://www.tandfonline.com/doi/abs/10.1080/08832320009599042
Spooren, P. & Van Loon. F. (2012). Who participates (not)? A non-response
analysis on students’ evaluations of teaching. Procedia - Social and
Behavioral Sciences 69: 990 – 996. Available online at
http://www.sciencedirect.com/science/article/pii/S1877042812054857
Schiekirka, S. & T. Raupach. (2015). A systematic review of factors influencing
student ratings in undergraduate medical education course evaluations.
BMC Medical Education, 15 (30). Available online at
http://www.biomedcentral.com/1472-6920/15/30
Sorenson, D. L., & Johnson, T. D. (Eds.). (2003). Online student ratings of
instruction. New Directions for Teaching and Learning (Vol. 96). San
Francisco: Jossey-Bass.
Sorenson, D. L., & Reiner, C. (2003). Charting the uncharted seas of online student
ratings of instruction. In D. L. Sorenson & T. D. Johnson (Eds.), Online
student ratings of instruction. New Directions for Teaching and Learning
(Vol. 96, pp. 1-24). San Francisco: Jossey-Bass.
Sprague, J. & K. Massoni. 2005. Student Evaluations And Gendered Expectations:
What We Can’t Count Can Hurt Us. Sex Roles: A Journal of Research 53, 11‐
12: 779‐793.
Stroebe, W. (2016). Why good teaching evaluations may reward bad teaching:
On grade inflation and other unintended consequences of student
evaluations. Perspectives on Psychological Science, Vol. 11 (6), 800- 816.
Stark, P.B. & Boring, A. (2014). Student evaluations of teaching (mostly) do not
measure teaching effectiveness. Science Open Research.
Stark, P.B. & R. Freishtat. (Sept. 2014). An evaluation of course evaluations.
Available online from Science Open Research at
https://www.scienceopen.com/document/vid/42e6aae5-246b-4900-8015-
dc99b467b6e4;jsessionid=U935-XeG5IRlPGAxfXeyagAb.slave:so-app2-prd?0
Taylor, A. & C. McQuiggan (2008). Faculty development programming: If we
build it, will they come? EDUCAUSE Quarterly, No. 3. Retrieved on Jan. 15,
2017 from http://er.educause.edu/articles/2008/8/faculty-development-
programming-if-we-build-it-will-they-come
Vasey, C. & L. Carroll. (May-June 2016). How Do We Evaluate Teaching? Findings
from a survey of faculty members. Academe, online magazine of the
American Association of University Professors (AAUP).
Venette, S., Sellnow, D., & McIntyre, K. (2010). Charting new territory: Assessing
the online frontier of student ratings of instruction. Assessment &
Evaluation in Higher Education, 35 (1), 97-111.
Wilson, K.L., A. Lizzio, & P. Ramsden (1997). The development, validation, and
application of the Course Experience Questionnaire. Studies in Higher
Education, Vol. 22, No. 1.
http://www.tandfonline.com/doi/abs/10.1080/03075079712331381121
Zumbach, J. & Funke, J. (May 2014). Influences of mood on academic course
evaluations. Practical Assessment, Research, & Evaluation, 19 (4).
Available online at http://pareonline.net/getvn.asp?v=19&n=4