
Critical Thinking Inter-Institutional Team Final Report: Assessing Assessment

Compiled by Mary Walczak, St. Olaf College

With contributions from:
Nancy Bostrum, Macalester College
David Lopatto, Grinnell College
Robert McClure, St. Olaf College
Karl Wirth, Macalester College
Chico Zimmerman, Carleton College

Collectively, the Inter-institutional Critical Thinking Team assessed six distinct instruments:

CLA: Collegiate Learning Assessment
CAAP CT: Collegiate Assessment of Academic Proficiency Critical Thinking Test
MAP: Mentored Advanced Projects
SO-CLASSE: St. Olaf Collegiate Learning Assessment Survey on Student Engagement
CCTST: California Critical Thinking Skills Test
WSU CT Rubric: Washington State University Critical Thinking Rubric

In this report, we will briefly describe our conclusions about each instrument and summarize recommendations for assessing critical thinking.

Collegiate Learning Assessment

Three of our four institutions are finishing a four-year longitudinal study of student learning gains as measured by the Collegiate Learning Assessment (CLA). Overall, faculty have been pleased with the instrument as a valid measure of student abilities to reason with evidence. The scoring rubric utilized to evaluate student work is thorough. On the other hand, the scores are not broken down into clear categories (critical thinking, analytic reasoning, written communication, and problem-solving). It is, therefore, difficult to ascertain whether students are making specific gains in critical thinking, as distinct from the other skills in play. This, of course, is not a new problem in attempting to measure critical thinking, whose definition is notoriously hard to pin down, although most can agree on its general features.

In fact, three years ago, the inter-institutional team came up with the following qualities of critical thinking:

• ability and inclination to evaluate evidence
• willingness/inclination to challenge received opinion
• ability and inclination to assess one's own position, and to recognize what's known and unknown
• ability to combine understanding, appreciation, and evaluation of a text or argument
• ability to identify theme, thesis, policy, argument, etc. in a discipline-appropriate way
• some "intellectual virtues" that link to CT: curiosity, courage, commitment, accountability
• ability to recognize and critique assumptions, including one's own
• ability to "transfer training" from one area of study to another
• ability to assess data/information in a rigorous or disciplined way
• an inclination to introspection, self-examination

This is a daunting list to try to assess with any one instrument, especially the more attitudinal aspects. The CLA, therefore, offers a window into some of the elements listed above, and it makes an admirable effort to construct real-life applications of skills associated with critical thinking. When used in conjunction with other data, it can be a useful part of assessing critical thinking. It is also noteworthy that the test attempts to measure “value-added” learning by controlling for factors such as ACT/SAT score and prior educational experience. The comparative possibilities with other institutions are a strong positive feature as well. The costs, both in actual fees and in administrative time, however, limit the test’s attractiveness.

An exacerbating factor in using the CLA is student recruiting. We are studying the class of 2009 and have asked them to take the CLA three times: as incoming first-year students, as rising juniors, and as graduating seniors. In some cases there is evidence that rising juniors did not take the CLA seriously, for instance when they spent only 30 minutes completing a test that allows three hours. Macalester College was quite concerned that the scores of rising juniors did not reflect their true abilities and conducted in-depth interviews with CLA participants to better understand the situation. The in-depth interviews provided valuable insight into the challenges of CLA administration and helped to guide plans for the spring 2009 administration. Interviews revealed that students were not surprised by the overall decrease in scores, and they cited several possible reasons for this outcome, including lack of motivation or understanding of the purpose of the CLA, an immediate cash reward regardless of performance or time on task, and competition with coursework or other obligations. The students were also asked what would motivate them to do their best on the CLA. The largest number of responses centered on making the CLA a meaningful experience with clear, high expectations. Students suggested that monetary incentives be linked to performance or time spent doing the CLA. Students also reported the following motivating factors: the CLA is important for Macalester, they want to see how their scores changed over the four years, and they are curious how Macalester compares to other institutions.

Collegiate Assessment of Academic Proficiency Critical Thinking Test

A much more manageable performance test to administer is the CAAP Critical Thinking Test. Carleton is just beginning a four-year longitudinal study of liberal arts learning (the Wabash National Study of Liberal Arts Education), part of which involves the use of this 40-minute test to measure student gains in critical thinking ability over time. While this test does not ask students to produce their own arguments, it does involve authentic work in evaluating arguments. This measure of critical thinking, while greatly limited in light of the list of features above, will be linked to a variety of other measures in the Wabash National Study (including attitudes toward diversity). Rather than utilizing a random sample of a class as the CLA does, the WNS will administer the CAAP CT test to the entire cohort. This will allow Carleton to conduct more extensive comparisons and analysis with other “local” and national data, which may provide a more appropriate comparison group than CLA institutions. Given the very limited scope of the CAAP CT test, it might be useful to administer the CAAP Reading Test, which has a separate sub-score for reasoning skills, and the CAAP Science Test, which has separate scores for analyzing and evaluating evidence. These three tests would provide a fuller picture of our students’ critical thinking skills, more in line with the features generated by the inter-institutional team. The trade-off is time and cost, although there are significant savings in administering more than one CAAP test.

Mentored Advanced Projects

Grinnell College’s Mentored Advanced Project program features a very simple and efficient evaluation form that documents actual student performance on authentic, original research.
In other words, it evaluates critical thinking skills “in the wild.” The web-based form is easy to administer and is linked to a database whose design and output features are extremely adaptable to the needs of the institution. As deployed by Grinnell, its limitation is that it does not necessarily provide a very accurate snapshot of critical thinking in the general student population, since the students taking on a MAP are highly self-selecting. However, at a school that requires a disciplinary capstone experience of all students, such an evaluation form could provide a great deal of assessment data with very little additional effort on the part of faculty (who are already evaluating the capstone products anyway). While the Grinnell tool has been created with its own needs in mind, any school could produce a similar tool that addressed the issues or features of critical thinking that needed to be measured on its own campus. Over time, the data would provide excellent information for assessing effectiveness at the program and institutional levels. It could also be linked with other assessment measures to ascertain “value-added” learning. An added benefit would be the campus-wide conversations necessary to generate the evaluation form. These would heighten faculty awareness of the elements of critical thinking that the college values and provide a more consistent vocabulary for discussing those elements with students. Because this assessment is embedded in work that students are already doing, its benefits are very high relative to the costs in development and administration. It is limited by its lack of comparability across institutions and potential problems with inter-rater reliability.

St. Olaf Collegiate Learning Assessment Survey on Student Engagement

The Collegiate Learning Assessment Survey on Student Engagement is a fifteen-minute questionnaire that was designed and piloted at St. Olaf in spring 2006 to accompany the Collegiate Learning Assessment. Its purpose is to examine the dimensions of student educational experience that are most closely related to the learning outcomes measured by the CLA: thinking critically, interpreting data judiciously, solving problems imaginatively, and writing effectively. Items are adapted (with permission) from the National Survey of Student Engagement. The instrument was administered to seniors who completed the CLA in spring 2006 and to members of the class of 2009 as rising juniors. By examining the CLASSE results in light of the CLA results, we can determine whether students with different kinds of CLA results (e.g., “well below average” or “well above average” changes over their college years) also had different kinds of educational experiences.

Although this instrument is not a direct measure of student learning like some of the others assessed, valuable information can be obtained about student behaviors in relation to their course work. Furthermore, the connection between SO-CLASSE and the CLA allows analysis of CLA performance in relation to those behaviors. The series of questions pertaining to receiving feedback from instructors and acting on that feedback provides data for an interesting faculty conversation. The main drawback of this instrument at this time is the staff time required for data analysis.

California Critical Thinking Skills Test v. 2000 (http://www.insightassessment.com/9test-cctst2k.html)

The California Critical Thinking Skills Test is widely used and is based on the Delphi Expert Consensus Definition of Critical Thinking. The test is described by Insight Assessment, the distributing organization, as an “objective measure of critical thinking skills.” The 34-item multiple-choice exam uses familiar topics and asks for analysis and interpretation of provided information, inference, and evaluation. Performance is reported on six scales: Analysis, Inference, Evaluation, Inductive Reasoning, Deductive Reasoning, and Total Critical Thinking Skill.

To assess the impact of using this assessment instrument and other related class activities, Professors John Welckle and Bob McClure (Education, St. Olaf College) administered the California Critical Thinking Skills Test, Form 2000, as a pre- and post-assessment in their fall 2008 course "Human-Environment Interaction: A Geographic Perspective on Current Issues of Global Concern." In addition, a team of Center for Interdisciplinary Research (CIR) students under the direction of Professors McClure and Sharon Lane-Getaz (Statistics Education, St. Olaf College) conducted focus groups of students within the course to gain additional insight into the impact the rubric and related activities had on their development as critical thinkers. The results of this concurrent, mixed-methods study will be presented at a conference in La Crosse, WI, later this spring.


Washington State University Critical Thinking Rubric (http://wsuctproject.wsu.edu/ctr.htm)

Professors Welckle and McClure also utilized a modified version of the Modified Washington State University Critical & Integrative Thinking Rubric (4.24.07) in their Human-Environment Interaction course. The modified rubric was used primarily as a teaching device in an attempt to assist students in the development of their critical thinking skills.

The Washington State University Critical Thinking Rubric is widely used and adopted. As a sidelight, the AAC&U Valid Assessment of Learning in Undergraduate Education (VALUE) Project Critical Thinking Metarubrics Team (see http://www.aacu.org/value/metarubrics.cfm) reviewed many rubrics based on the Washington State Rubric and developed a “metarubric” by distilling the essential aspects of critical thinking. The performance criteria for the critical thinking metarubric are: Explanation of issues; Investigation of evidence; Influence of context and assumptions; Own perspective, hypothesis, or position; Conclusions, implications and consequences.

Macalester Learning Assessment

Faculty at Macalester College developed an assessment instrument called the Macalester Learning Assessment (MLA) to assess student learning of all the outcomes in the Teagle grant: critical thinking, quantitative reasoning, effective writing, and global understanding. Other Macalester learning outcomes, such as the liberal arts and multiculturalism, were also included in the MLA. During orientation in 2007, the MLA was administered to 155 first-year students. In the summer of 2008, a group of 20 faculty met to examine the student responses to the MLA and to discuss the efficacy of this assessment instrument. The faculty concluded that the questions on the MLA either did not elicit responses that were well aligned with the College's learning outcomes or did not yield useful assessment data, shortcomings that were largely attributed to the design of the instrument. In examining the scoring rubric for the CLA, in fact, the faculty decided that the CLA is a much better instrument for assessing student critical thinking abilities and questioned whether Macalester should pursue development of a home-grown critical thinking instrument. However, for the other learning outcomes (e.g., multiculturalism, global understanding, quantitative reasoning, and the liberal arts) it may be worth the effort to develop a new instrument.

Summary

Critical thinking is a core component of a college education. In fact, there is likely no institution of higher education in the United States (or beyond, for that matter) that would dispute the central role critical thinking plays in higher education. At the same time, we would find variability in institutional definitions of critical thinking, which adds an additional challenge for those interested in assessing this central outcome. Critical thinking is a vital, but slippery, educational good to measure, as evidenced by the multiple critical thinking qualities identified by the inter-institutional team.

There is room in higher education for all the assessment instruments we have considered here. Each instrument has its own strengths and weaknesses, and instruments should be selected based on institutional needs. Future attempts to assess critical thinking can productively use any of these instruments in combination with other measures to improve the way institutions foster student learning.


Assessing Assessment Instruments: A Rubric for Comparing Multiple Instruments

I. Summary of descriptive information

Instruments Assessed:
CLA: Collegiate Learning Assessment
CAAP CT: Collegiate Assessment of Academic Proficiency Critical Thinking Test
MAP: Mentored Advanced Projects
SO-CLASSE: St. Olaf Collegiate Learning Assessment Survey on Student Engagement
CCTST: California Critical Thinking Skills Test
WSU CT Rubric: Washington State University Critical Thinking Rubric

Instruments: CLA, CAAP CT, MAP (evaluated by Carleton), MAP (evaluated by Grinnell), SO-CLASSE, CCTST, WSU CT Rubric

Learning focus
CLA: Critical thinking, analytic reasoning, written communication, problem solving
CAAP CT: Analyzing elements of arguments, evaluating arguments, extending arguments
MAP (Carleton): Research and critical thinking skills
MAP (Grinnell): Independent research skills
SO-CLASSE: Critical thinking, effective writing, and quantitative reasoning
CCTST: Critical thinking
WSU CT Rubric: Critical thinking

Dimensions
CLA: Making and breaking actual arguments
CAAP CT: Read and analyze arguments, make inferences
MAP (Carleton): Independence, research design, intellectual curiosity, reading, sources of information, types of information, judging information, argumentation, evidence, and factual and theoretical context
MAP (Grinnell): Student characteristics displayed during independent research projects, including independence, intellectual curiosity, use of information, argumentation, and use of evidence
SO-CLASSE: Measures student experiences
CCTST: CCTST scales: Analysis, Inference, Evaluation, Inductive Reasoning, Deductive Reasoning, and Total Critical Thinking Skill score
WSU CT Rubric: Student behaviors, dispositions, and writing

Origins
CLA: CAE
CAAP CT: ACT
MAP (Carleton): Grinnell College
MAP (Grinnell): Developed at Grinnell College by David Lopatto, Marci Sortor, and others
SO-CLASSE: Developed at St. Olaf College by IR&E based on items from NSSE (with permission)
CCTST: Developed and marketed by Insight Assessment based upon the Delphi Expert Consensus Definition of Critical Thinking
WSU CT Rubric: Washington State University

Administrators
CLA: School
CAAP CT: School
MAP (Carleton): School
MAP (Grinnell): School
SO-CLASSE: School
CCTST: School/Faculty
WSU CT Rubric: Faculty

Sample
CLA: Random sample
CAAP CT: Entire cohort
MAP (Carleton): Self-selecting sample
MAP (Grinnell): Administered to students, normally upper-level, who participate in a MAP
SO-CLASSE: Random sample of students completing the CLA
CCTST: Can be used in many ways; here it was used at the course level, in part as a self-assessment guide
WSU CT Rubric: Used in classroom teaching; students can use it as a self-assessment guide

Item types
CLA: Performance task
CAAP CT: Multiple choice
MAP (Carleton): Checklist
MAP (Grinnell): Rating scales with four levels
SO-CLASSE: Rating scale and short answer
CCTST: Multiple choice
WSU CT Rubric: Rating scales

Performance vs. perceptions
CLA: Performance
CAAP CT: Performance
MAP (Carleton): Performance
MAP (Grinnell): Performance
SO-CLASSE: Perceptions
CCTST: Perceptions
WSU CT Rubric: Perceptions

Customization
CLA: No
CAAP CT: Yes
MAP (Carleton): Yes
MAP (Grinnell): Yes
SO-CLASSE: Yes
CCTST: No
WSU CT Rubric: Yes

Technology
CLA: Web-based
CAAP CT: Paper
MAP (Carleton): Web-based
MAP (Grinnell): Web-based, but could be done on paper
SO-CLASSE: Paper, but could be web-based
CCTST: Paper
WSU CT Rubric: Paper

Completion time
CLA: 90 minutes
CAAP CT: 40 minutes
MAP (Carleton): 15 minutes
MAP (Grinnell): 10 minutes
SO-CLASSE: 15 minutes
CCTST: 30 minutes
WSU CT Rubric: Time varies depending on how many dimensions are scored

Testing window
CLA: Fall/spring
CAAP CT: Year-round
MAP (Carleton): End of term
MAP (Grinnell): After the completion of a project
SO-CLASSE: In conjunction with institutional use of NSSE or as a stand-alone instrument
CCTST: Used as a pre- and post-assessment
WSU CT Rubric: Frequently and ongoing during the course

Cost
CLA: Unknown
CAAP CT: $13.50 per student
MAP (Carleton): None
MAP (Grinnell): None
SO-CLASSE: None
CCTST: Varies depending on number of sheets scored; we paid $610 for 100-sheet capacity
WSU CT Rubric: None

Reports
CLA: 1800-point scale with national benchmarks
CAAP CT: Percentile with national norms
MAP (Carleton): As designed by school
MAP (Grinnell): Frequencies are compiled and reported
SO-CLASSE: Results are reported as frequencies and percentages
CCTST: Scored by Insight Assessment; cannot be self-scored
WSU CT Rubric: Students can self-assess or faculty can score

Data files
CLA: Unknown
CAAP CT: Unknown
MAP (Carleton): Unknown
MAP (Grinnell): N.A.
SO-CLASSE: Student-identifiable data are available
CCTST: Data are scored and analyzed by Insight Assessment; results are returned
WSU CT Rubric: Raw data are kept by either student or professor

Data security
CLA: Unknown
CAAP CT: Unknown
MAP (Carleton): Unknown
MAP (Grinnell): The same as other files on the college computing system
SO-CLASSE: Identifiable data are only available to IR&E staff
CCTST: Unknown; instructor kept raw data
WSU CT Rubric: Completed sheets are kept by the professor or returned to students, just like other papers or exams

Comparisons
CLA: Yes
CAAP CT: Yes
MAP (Carleton): No
MAP (Grinnell): Not yet possible
SO-CLASSE: No
CCTST: Yes
WSU CT Rubric: No

Additional info:


II. Summary of evaluative information

Scores: 2 = Excellent, 1 = Satisfactory, 0 = Poor

Characteristics          CLA   CAAP CT   MAP (Carleton)   MAP (Grinnell)   SO-CLASSE   CCTST   WSU CT Rubric
Mission synergy           2       2            2                2               2         1          2
Conceptual alignment      1       1            2                2               1         1          2
Credibility               1       1            2                2               2         2          2
Validity/reliability      2       2            1                2               1         2          1
Manageability             0       1            2                2               1         2          2
Representativeness        1       2            0                2               1         2          2
Actionability             1       1            1                2               2         1          2
Cost-effectiveness        1       1            2                2               1         0          2
Sustainability            0       1            1                2               2         0          2
Other
Overall quality           1       1            2                2               1         1          2

Sum of scores            10      13           15               20              14        12         19

Comments:


Assessing Assessment Instruments: Carleton College

Name of instrument: Collegiate Learning Assessment

I. Descriptive information

Learning focus: What intended learning outcome(s) is the instrument intended to measure?
The test measures critical thinking, analytic reasoning, written communication, and problem solving.

Dimensions: What aspects of each learning outcome is it intended to measure (e.g., knowledge, attitudes, experiences, behaviors, etc.)?
The test measures actual skills in analyzing and using evidence to make arguments and to “break” arguments.

Origins: Who developed it and how?
The test was developed by the Council for Aid to Education, a non-profit formerly affiliated with the RAND Corporation.

Administrators: Who administers it (company, non-profit, the school itself)?
The test is administered on campus by the school itself.

Sample: Who can complete it (first-years, seniors, any class, etc.)?
The test is typically given to incoming first-year students in the fall, and then to a comparison group in the spring, either the same cohort of first-year students or seniors, to provide cross-sectional data on “value-added.”

Item types: Are items multiple-choice, short-answer, essay, rating scales, performance tasks, or a combination?
The test is based on performance tasks and analytical writing tasks.

Performance vs. perceptions: Does the instrument provide a direct measure of what students know/can do, or indirect measures based on self-reporting and perceptions?
The test measures what students can do, but does not depend on them having any particular or specialized knowledge.

Customization: Can institutions add their own items?
The test is not customizable.

Technology: How is it completed (paper, on-line, either)? What platforms are required?
The test is web-based, amenable to any platform.

Completion time: How long does it take to complete it?
The test requires 90 minutes.

Testing window: When and how often can it be administered?
Tests are administered in the fall and in the spring.

Cost: What does it cost per administration?
Unknown, but I’m guessing a lot.

Reports: Who does the scoring? How is it scored? How are results reported (simple frequencies and percentages, indexes, benchmarks)?
CAE does the scoring on an 1800-point scale (or more). Data are reported with benchmarks and national comparisons.

Data files: Are raw data files returned to the institution? Are the data identifiable so a student’s results can be linked with other information about that student?
Data can be linked to individual students’ information, although the unit of analysis is purported to be primarily the institution.

Data security: To what extent, and how, is data security maintained?
Unknown.

Comparisons: Do institutions receive aggregated data from other institutions? Can they request specific comparison groups? Can the instrument be used to track changes in student outcomes over time?
Comparative data are available in aggregate from other institutions. Longitudinal studies are possible.

Additional information:


II. Evaluation rubric For each of the characteristics listed below, mark the box that indicates your evaluation of the quality of the instrument with respect to that characteristic. If you wish, add comments in the box (or on the reverse) to explain why you evaluated the instrument in this way. The number next to each rating indicates the score you will enter into the accompanying rubric for comparing multiple instruments.

Instrument characteristics (rated Excellent = 2, Satisfactory = 1, Poor = 0):

Mission synergy: The instrument measures outcomes appropriate to the mission of the institution and/or program being assessed. Rating: Excellent (2)

Conceptual alignment: The definition of the outcome implicit in the instrument fits the definition of the outcome by the institution or program. Rating: Satisfactory (1)

Credibility: The instrument answers questions posed by faculty and administrators in ways they are likely to find meaningful and persuasive. Rating: Satisfactory (1)

Validity and reliability: The instrument measures what it claims to measure and results are consistent over time when the conditions are consistent. Rating: Excellent (2)

Manageability: The instrument can be administered with reasonable institutional effort. Rating: Poor (0)

Representativeness: Recruitment of participants yields representative data. Rating: Satisfactory (1)

Actionability: Faculty can use what is learned from the instrument to improve curriculum and instruction and to strengthen student learning. Rating: Satisfactory (1)

Cost-effectiveness: The cost of collecting and analyzing the data is commensurate with the knowledge gained. Rating: Satisfactory (1)

Sustainability: There is institutional support (staff, funds, time) to continue this assessment in a reasonable manner. Rating: Poor (0)

Other characteristic (describe):

Overall quality: Viewed holistically, the instrument supports mission-driven, meaningful, and manageable assessment. Rating: Satisfactory (1)


Name of instrument: CAAP Critical Thinking Test

I. Descriptive information

Learning focus: What intended learning outcome(s) is the instrument intended to measure?
Analysis of elements of an argument; evaluation of an argument; extension of an argument. The test “assess[es] the students' ability to identify essential elements of an argument, including hypotheses, premises, and conclusions, and also their ability to identify logical fallacies, exaggerated claims, unstated assumptions, analogies, and multiple points of view. Students are also tested regarding their ability to analyze the structure of arguments, including their ability to distinguish between statements of fact and opinion, to make judgments about equivalent and nonequivalent statements, and to recognize inductive and deductive arguments and supported and unsupported claims. Also tested is students' ability to recognize patterns and sequences of arguments, including their ability to see relationships of premises, sub-arguments, and sub-conclusions to the overall argument.”

Dimensions: What aspects of each learning outcome is it intended to measure (e.g., knowledge, attitudes, experiences, behaviors, etc.)?
The test measures knowledge of critical thinking as embodied through the ability to read critically, analyze arguments, and make inferences. It does not ask students to make their own arguments.

Origins: Who developed it and how?
The test was developed by the non-profit company ACT as part of a suite of tests in the Collegiate Assessment of Academic Proficiency. It is intended as a standardized measure of actual student abilities that will allow comparison of achievement across campuses.

Administrators: Who administers it (company, non-profit, the school itself)?
The test is administered by the school itself and graded by ACT. Scores are returned with data about national norms.

Sample: Who can complete it (first-years, seniors, any class, etc.)?
Any class can complete it, although it is most often used to show “value-added” performance longitudinally.

Item types: Are items multiple-choice, short-answer, essay, rating scales, performance tasks, or a combination?
Multiple choice.

Performance vs. perceptions: Does the instrument provide a direct measure of what students know/can do, or indirect measures based on self-reporting and perceptions?
The test measures actual student performance, although it is limited by the multiple-choice format.

Customization: Can institutions add their own items?
Institutions can add up to 9 localized questions.

Technology: How is it completed (paper, on-line, either)? What platforms are required?
The test is currently only available in paper format.

Completion time: How long does it take to complete it?
The test takes 40 minutes.

Testing window: When and how often can it be administered?
The test can be given at any time.

Cost: What does it cost per administration?
The cost is $13.50 per student for 1-500 students, slightly less for more than 500 students.

Reports: Who does the scoring? How is it scored? How are results reported (simple frequencies and percentages, indexes, benchmarks)?
ACT does the scoring. Results are given in percentiles with national norms as well as institutional norms.

Data files: Are raw data files returned to the institution? Are the data identifiable so a student’s results can be linked with other information about that student?
Unknown as to raw data. ACT can link individual data to other tests administered by ACT.

Data security: To what extent, and how, is data security maintained?
ACT has a solid reputation, although I am unaware of its actual security policies and procedures.

Comparisons: Do institutions receive aggregated data from other institutions? Can they request specific comparison groups? Can the instrument be used to track changes in student outcomes over time?
Yes, comparative data are available and longitudinal studies are common.

Additional information:

II. Evaluation rubric
For each of the characteristics listed below, mark the box that indicates your evaluation of the quality of the instrument with respect to that characteristic. If you wish, add comments in the box (or on the reverse) to explain why you evaluated the instrument in this way. The number next to each rating indicates the score you will enter into the accompanying rubric for comparing multiple instruments.

Instrument characteristics (rated Excellent = 2, Satisfactory = 1, Poor = 0):

Mission synergy: The instrument measures outcomes appropriate to the mission of the institution and/or program being assessed. Rating: Excellent (2)

Conceptual alignment: The definition of the outcome implicit in the instrument fits the definition of the outcome by the institution or program. Rating: Satisfactory (1) (a much narrower definition of CT)

Credibility: The instrument answers questions posed by faculty and administrators in ways they are likely to find meaningful and persuasive. Rating: Satisfactory (1)

Validity and reliability: The instrument measures what it claims to measure and results are consistent over time when the conditions are consistent. Rating: Excellent (2)

Manageability: The instrument can be administered with reasonable institutional effort. Rating: Satisfactory (1)

Representativeness: Recruitment of participants yields representative data. Rating: Excellent (2)

Actionability: Faculty can use what is learned from the instrument to improve curriculum and instruction and to strengthen student learning. Rating: Satisfactory (1)

Cost-effectiveness: The cost of collecting and analyzing the data is commensurate with the knowledge gained. Rating: Satisfactory (1)

Sustainability: There is institutional support (staff, funds, time) to continue this assessment in a reasonable manner. Rating: Satisfactory (1)

Other characteristic (describe):

Overall quality: Viewed holistically, the instrument supports mission-driven, meaningful, and manageable assessment. Rating: Satisfactory (1)


Name of instrument: MAP Report Form

I. Descriptive information

Learning focus: What intended learning outcome(s) is the instrument intended to measure?
Mentored Advanced Projects (MAPs) are independent research projects that are guided by faculty. They can be associated with a particular course, but are often part of a summer research project being conducted by the professor. The projects must result in a visible product that is then evaluated with the MAP Report Form. The form addresses the student’s research and critical thinking skills.

Dimensions: What aspects of each learning outcome is it intended to measure (e.g., knowledge, attitudes, experiences, behaviors, etc.)?
There are ten dimensions evaluated on a 4-point scale: independence, research design, intellectual curiosity, reading, sources of information, types of information, judging information, argumentation, evidence, and factual and theoretical context.

Origins: Who developed it and how?
The report form was developed at Grinnell College by David Lopatto and colleagues in response to the need to regularize the MAP program and to give consistent credit to faculty for leading them.

Administrators: Who administers it (company, non-profit, the school itself)?
The form is administered by the school and filled out by the directing professor.

Sample: Who can complete it (first-years, seniors, any class, etc.)?
MAPs are open to any student, but the emphasis on advanced, independent research limits them to upper-class students in practice.

Item types: Are items multiple-choice, short-answer, essay, rating scales, performance tasks, or a combination?
Each of the ten dimensions is rated with one of four descriptive statements reflecting student achievement or behavior. For example, the four possibilities for the dimension of “evidence” are: 1) Student does not use evidence; 2) Student uses evidence without judging its quality; 3) Student manipulates evidence to fit his/her preconceptions; 4) Student considers relevant evidence fairly. Since the statements are always listed in order of increasing achievement, a higher overall “score” indicates higher performance in critical thinking.

Performance vs. perceptions: Does the instrument provide a direct measure of what students know/can do, or indirect measures based on self-reporting and perceptions?
The report is based on actual student work, reflecting both the process and the final product of research.

Customization: Can institutions add their own items?
This format is ideal for customization.

Technology: How is it completed (paper, on-line, either)? What platforms are required?
The form is available on-line and is linked to a database.

Completion time: How long does it take to complete it?
The form takes about 10-15 minutes to complete, but it comes at the end of a long advising/mentoring relationship.

Testing window: When and how often can it be administered?
This is limited only by faculty time and attention and student willingness to conduct independent research.

Cost: What does it cost per administration?
There are start-up costs for establishing the web-based form and linking it to a database, and small maintenance costs after that.

Reports: Who does the scoring? How is it scored? How are results reported (simple frequencies and percentages, indexes, benchmarks)?
Faculty do all the “scoring” of individual students. Analysis of the data is left to the school or interested offices.

Data files: Are raw data files returned to the institution? Are the data identifiable so a student’s results can be linked with other information about that student?
The data certainly could be linked to other institutional data concerning the student.

Data security: To what extent, and how, is data security maintained?
Unknown.

Comparisons: Do institutions receive aggregated data from other institutions? Can they request specific comparison groups? Can the instrument be used to track changes in student outcomes over time?
There would be no comparative data outside the institution, but intramural data could be easily collected and compared. Longitudinal tracking could only be done if a similar report were filled out for a student early in his/her career.

Additional information:

II. Evaluation rubric
For each of the characteristics listed below, mark the box that indicates your evaluation of the quality of the instrument with respect to that characteristic. If you wish, add comments in the box (or on the reverse) to explain why you evaluated the instrument in this way. The number next to each rating indicates the score you will enter into the accompanying rubric for comparing multiple instruments.

Instrument characteristics (rated Excellent = 2, Satisfactory = 1, Poor = 0):

Mission synergy: The instrument measures outcomes appropriate to the mission of the institution and/or program being assessed. Rating: Excellent (2)

Conceptual alignment: The definition of the outcome implicit in the instrument fits the definition of the outcome by the institution or program. Rating: Excellent (2)

Credibility: The instrument answers questions posed by faculty and administrators in ways they are likely to find meaningful and persuasive. Rating: Excellent (2)

Validity and reliability: The instrument measures what it claims to measure and results are consistent over time when the conditions are consistent. Rating: Satisfactory (1)

Manageability: The instrument can be administered with reasonable institutional effort. Rating: Excellent (2)

Representativeness: Recruitment of participants yields representative data. Rating: Poor (0) (this would be improved if it were adopted for mandatory senior capstone experiences)

Actionability: Faculty can use what is learned from the instrument to improve curriculum and instruction and to strengthen student learning. Rating: Satisfactory (1)

Cost-effectiveness: The cost of collecting and analyzing the data is commensurate with the knowledge gained. Rating: Excellent (2)

Sustainability: There is institutional support (staff, funds, time) to continue this assessment in a reasonable manner. Rating: Satisfactory (1) (this really depends on faculty buy-in)

Other characteristic (describe):

Overall quality: Viewed holistically, the instrument supports mission-driven, meaningful, and manageable assessment. Rating: Excellent (2)


Assessing Assessment Instruments: A Rubric for Evaluating Individual Instruments

Name of instrument: Grinnell College MAP evaluation form (MAP stands for Mentored Advanced Project; a more generic name might be “research and critical thinking form”)

I. Descriptive information

Learning focus: What intended learning outcome(s) is the instrument intended to measure?
Originally, at Grinnell College, the form is used by a faculty member who has directed a student mentored advanced project, normally an individually guided piece of independent research designed for dissemination or publication.

Dimensions: What aspects of each learning outcome is it intended to measure (e.g., knowledge, attitudes, experiences, behaviors, etc.)?
The items evaluate student characteristics displayed during the project, such as independence and intellectual curiosity, as well as thinking skills such as use of information, argumentation, and use of evidence.

Origins: Who developed it and how?
The rubric was developed at Grinnell College by a group that included David Lopatto, Professor of Psychology, Marci Sortor, Professor of History, and others.

Administrators: Who administers it (company, non-profit, the school itself)?
The college administers it.

Sample: Who can complete it (first-years, seniors, any class, etc.)?
It is administered to students, normally upper-level, who participate in a MAP.

Item types: Are items multiple-choice, short-answer, essay, rating scales, performance tasks, or a combination?
The key items are rating scales with four levels.

Performance vs. perceptions: Does the instrument provide a direct measure of what students know/can do, or indirect measures based on self-reporting and perceptions?
The information is direct in the sense that it is an evaluation of student performance by faculty.

Customization: Can institutions add their own items?
The form is not tied to any computer platform or other method. Institutions can and have taken the form and employed it as they see fit.

Technology: How is it completed (paper, on-line, either)? What platforms are required?
Grinnell administers the form locally on-line. It could be done on paper.

Completion time: How long does it take to complete it?
10 minutes, if the faculty evaluator has thought about the student project or already graded it.

Testing window: When and how often can it be administered?
Grinnell administers it one time after the completion of a MAP.

Cost: What does it cost per administration?
Negligible.

Reports: Who does the scoring? How is it scored? How are results reported (simple frequencies and percentages, indexes, benchmarks)?
Frequencies are compiled and reported. In addition, we have attempted some validity studies in which we correlate the data with other measures of student performance, such as grades.

Data files: Are raw data files returned to the institution? Are the data identifiable so a student’s results can be linked with other information about that student?
N.A.

Data security: To what extent, and how, is data security maintained?
The files are protected in the same way that other files are protected on the college computing system.

Comparisons: Do institutions receive aggregated data from other institutions? Can they request specific comparison groups? Can the instrument be used to track changes in student outcomes over time?
Not yet possible.

Additional information:

II. Evaluation rubric
For each of the characteristics listed below, mark the box that indicates your evaluation of the quality of the instrument with respect to that characteristic. If you wish, add comments in the box (or on the reverse) to explain why you evaluated the instrument in this way. The number next to each rating indicates the score you will enter into the accompanying rubric for comparing multiple instruments.

Instrument characteristics (rated Excellent = 2, Satisfactory = 1, Poor = 0):

Mission synergy: The instrument measures outcomes appropriate to the mission of the institution and/or program being assessed. Rating: Excellent (2). The form is customized for Grinnell’s purposes.

Conceptual alignment: The definition of the outcome implicit in the instrument fits the definition of the outcome by the institution or program. Rating: Excellent (2). Same.

Credibility: The instrument answers questions posed by faculty and administrators in ways they are likely to find meaningful and persuasive. Rating: Excellent (2). We have piloted this form with faculty from across the disciplines to make sure the language is acceptable to all.

Validity and reliability: The instrument measures what it claims to measure and results are consistent over time when the conditions are consistent. Rating: Excellent (2). So far, our data show positive correlations with student GPA.

Manageability: The instrument can be administered with reasonable institutional effort. Rating: Excellent (2). It is easy to administer.

Representativeness: Recruitment of participants yields representative data. Rating: Excellent (2). All MAP students are evaluated; this is a census of the entire group.

Actionability: Faculty can use what is learned from the instrument to improve curriculum and instruction and to strengthen student learning. Rating: Excellent (2). Yes.

Cost-effectiveness: The cost of collecting and analyzing the data is commensurate with the knowledge gained. Rating: Excellent (2). It is cheap to use.

Sustainability: There is institutional support (staff, funds, time) to continue this assessment in a reasonable manner. Rating: Excellent (2). Our institutional research office supports the project.

Other characteristic (describe):

Overall quality: Viewed holistically, the instrument supports mission-driven, meaningful, and manageable assessment. Rating: Excellent (2). We think so.


Assessing Assessment Instruments: A Rubric for Evaluating Individual Instruments

Name of instrument: The Collegiate Learning Assessment Survey on Student Engagement (CLASSE)

I. Descriptive information

Learning focus: What intended learning outcome(s) is the instrument intended to measure?
This instrument spans critical thinking, effective writing, and quantitative reasoning.

Dimensions: What aspects of each learning outcome is it intended to measure (e.g., knowledge, attitudes, experiences, behaviors, etc.)?
CLASSE measures student experiences.

Origins: Who developed it and how?
CLASSE was developed by the St. Olaf College Office of Institutional Research & Evaluation (IR&E) from items from the National Survey of Student Engagement (NSSE), with permission.

Administrators: Who administers it (company, non-profit, the school itself)?
St. Olaf College (IR&E) administered the CLASSE.

Sample: Who can complete it (first-years, seniors, any class, etc.)?
Seniors completing the NSSE (spring 2006) and juniors (class of 2009, fall 2007).

Item types: Are items multiple-choice, short-answer, essay, rating scales, performance tasks, or a combination?
Rating scale and short answer.

Performance vs. perceptions: Does the instrument provide a direct measure of what students know/can do, or indirect measures based on self-reporting and perceptions?
Indirect measure based on experiences and perceptions.

Customization: Can institutions add their own items?
Yes.

Technology: How is it completed (paper, on-line, either)? What platforms are required?
CLASSE was completed by email, but could be administered in paper or electronic form.

Completion time: How long does it take to complete it?
15 minutes.

Testing window: When and how often can it be administered?
In conjunction with institutional use of NSSE or as a stand-alone instrument.

Cost: What does it cost per administration?
No direct costs; staff time to administer and analyze data.

Reports: Who does the scoring? How is it scored? How are results reported (simple frequencies and percentages, indexes, benchmarks)?
IR&E did the scoring; results are reported as frequencies and percentages.

Data files: Are raw data files returned to the institution? Are the data identifiable so a student’s results can be linked with other information about that student?
Since the institution administers the survey, raw data that are identifiable by student are available.

Data security: To what extent, and how, is data security maintained?
Identifiable data are only available to IR&E staff.

Comparisons: Do institutions receive aggregated data from other institutions? Can they request specific comparison groups? Can the instrument be used to track changes in student outcomes over time?
N/A

Additional information:

II. Evaluation rubric
For each of the characteristics listed below, mark the box that indicates your evaluation of the quality of the instrument with respect to that characteristic. If you wish, add comments in the box (or on the reverse) to explain why you evaluated the instrument in this way. The number next to each rating indicates the score you will enter into the accompanying rubric for comparing multiple instruments.

Instrument characteristics (rated Excellent = 2, Satisfactory = 1, Poor = 0):

Mission synergy: The instrument measures outcomes appropriate to the mission of the institution and/or program being assessed. Rating: Excellent (2). Critical thinking, effective writing, and quantitative reasoning are central to the mission of the College.

Conceptual alignment: The definition of the outcome implicit in the instrument fits the definition of the outcome by the institution or program. Rating: Satisfactory (1). Some discrepancy across disciplines, especially regarding writing 10-20 page papers.

Credibility: The instrument answers questions posed by faculty and administrators in ways they are likely to find meaningful and persuasive. Rating: Excellent (2). Breaks out the extent to which students received and acted upon feedback from faculty.

Validity and reliability: The instrument measures what it claims to measure and results are consistent over time when the conditions are consistent. Rating: Satisfactory (1). Two administrations of the instrument, to seniors and juniors, show consistent results.

Manageability: The instrument can be administered with reasonable institutional effort. Rating: Satisfactory (1). Administration is fairly easy; data analysis takes significant staff time.

Representativeness: Recruitment of participants yields representative data. Rating: Satisfactory (1). Sample sizes (73 and 78 students) are nearly 10% of the class, but we expect some self-selection issues among those who participated.

Actionability: Faculty can use what is learned from the instrument to improve curriculum and instruction and to strengthen student learning. Rating: Excellent (2). Yes; in particular, the difference between what students report about receiving and acting on feedback suggests some changes to our practices to increase these numbers.

Cost-effectiveness: The cost of collecting and analyzing the data is commensurate with the knowledge gained. Rating: Satisfactory (1). Probably, although it is hard to ascertain the cost of staff time in data analysis.

Sustainability: There is institutional support (staff, funds, time) to continue this assessment in a reasonable manner. Rating: Excellent (2). Yes.

Other characteristic (describe):

Overall quality: Viewed holistically, the instrument supports mission-driven, meaningful, and manageable assessment. Rating: Satisfactory (1)


Assessing Assessment Instruments: A Rubric for Evaluating Individual Instruments

Name of instrument: California Critical Thinking Skills Test (CCTST), Form 2000

I. Descriptive information
Learning focus: What intended learning outcome(s) is the instrument intended to measure? Critical thinking.
Dimensions: What aspects of each learning outcome is it intended to measure (e.g., knowledge, attitudes, experiences, behaviors, etc.)? This assessment measures critical thinking skills.
Origins: Who developed it and how? Developed and marketed by Insight Assessment, based upon the Delphi Expert Consensus Definition of Critical Thinking.
Administrators: Who administers it (company, non-profit, the school itself)? The instrument was administered by faculty (myself and John Welckle).
Sample: Who can complete it (first-years, seniors, any class, etc.)? Any level can use it. We teach it to the students and have them use it as a self-assessment guide.
Item types: Are items multiple-choice, short-answer, essay, rating scales, performance tasks, or a combination? Multiple choice.
Performance vs. perceptions: Does the instrument provide a direct measure of what students know/can do, or indirect measures based on self-reporting and perceptions? More indirect/subjective measures of perceptions of critical thinking strength.
Customization: Can institutions add their own items? No.
Technology: How is it completed (paper, on-line, either)? What platforms are required? It is completed on paper.
Completion time: How long does it take to complete it? 30 minutes.
Testing window: When and how often can it be administered? We used it as a pre- and post-assessment; research on the test indicates this is acceptable.
Cost: What does it cost per administration? Varies depending on the number of sheets scored; we paid $610 for a 100-sheet capacity.
Reports: Who does the scoring? How is it scored? How are results reported (simple frequencies and percentages, indexes, benchmarks)? Insight Assessment does the scoring; the test cannot be self-scored.
Data files: Are raw data files returned to the institution? Are the data identifiable so a student's results can be linked with other information about that student? Data is scored and analyzed by Insight Assessment, and results are returned.
Data security: To what extent, and how, is data security maintained? It is scored by Insight Assessment; Professor McClure kept the raw data.
Comparisons: Do institutions receive aggregated data from other institutions? Can they request specific comparison groups? Can the instrument be used to track changes in student outcomes over time? Yes.
Additional information:
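Because the CCTST was administered here as a pre- and post-assessment, a gain analysis is the natural follow-up. The sketch below is a minimal, hypothetical example of that kind of summary: the scores are invented (Insight Assessment does the actual scoring and returns the results), and the function name summarize_gains is illustrative only, not part of any vendor tooling.

from statistics import mean, stdev

def summarize_gains(pre_scores, post_scores):
    """Summarize pre/post gains for matched students.

    pre_scores and post_scores are parallel lists: each index is the
    same student tested before and after the course.
    """
    gains = [post - pre for pre, post in zip(pre_scores, post_scores)]
    return {
        "n": len(gains),
        "mean_pre": mean(pre_scores),
        "mean_post": mean(post_scores),
        "mean_gain": mean(gains),
        "gain_sd": stdev(gains) if len(gains) > 1 else 0.0,
    }

# Hypothetical CCTST total scores, invented for illustration only.
pre = [15, 18, 21, 17, 19, 22, 16, 20]
post = [17, 19, 24, 18, 21, 25, 18, 22]

print(summarize_gains(pre, post))

A summary of this kind (sample size, mean pre, mean post, mean gain, and the spread of gains) is one straightforward way to report whether the pre/post design shows movement, before deciding whether a more formal significance test is warranted.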


II. Evaluation rubric
For each of the characteristics listed below, mark the box that indicates your evaluation of the quality of the instrument with respect to that characteristic. If you wish, add comments in the box (or on the reverse) to explain why you evaluated the instrument in this way. The number next to each rating indicates the score you will enter into the accompanying rubric for comparing multiple instruments.

Instrument characteristics (rating scale: Excellent = 2, Satisfactory = 1, Poor = 0)

Mission synergy: The instrument measures outcomes appropriate to the mission of the institution and/or program being assessed.
Rating: 1

Conceptual alignment: The definition of the outcome implicit in the instrument fits the definition of the outcome by the institution or program.
Rating: 1

Credibility: The instrument answers questions posed by faculty and administrators in ways they are likely to find meaningful and persuasive.
Rating: 2

Validity and reliability: The instrument measures what it claims to measure, and results are consistent over time when the conditions are consistent.
Rating: 2

Manageability: The instrument can be administered with reasonable institutional effort.
Rating: 2

Representativeness: Recruitment of participants yields representative data.
Rating: 2 (no recruitment needed)

Actionability: Faculty can use what is learned from the instrument to improve curriculum and instruction and to strengthen student learning.
Rating: 1

Cost-effectiveness: The cost of collecting and analyzing the data is commensurate with the knowledge gained.
Rating: 0

Sustainability: There is institutional support (staff, funds, time) to continue this assessment in a reasonable manner.
Rating: 0

Other characteristic (describe):

Overall quality: Viewed holistically, the instrument supports mission-driven, meaningful, and manageable assessment.
Rating: 1


Assessing Assessment Instruments: A Rubric for Evaluating Individual Instruments

Name of instrument: Modified Washington State University Critical & Integrative Thinking Rubric (4.24.07)

I. Descriptive information
Learning focus: What intended learning outcome(s) is the instrument intended to measure? Critical thinking.
Dimensions: What aspects of each learning outcome is it intended to measure (e.g., knowledge, attitudes, experiences, behaviors, etc.)? This rubric is intended to assess behaviors. It also addresses, to an extent, dispositions and writing.
Origins: Who developed it and how? Originally developed at Washington State University; modified for local use.
Administrators: Who administers it (company, non-profit, the school itself)? The instrument was administered by faculty (myself and John Welckle).
Sample: Who can complete it (first-years, seniors, any class, etc.)? Any level can use it. We teach it to the students and have them use it as a self-assessment guide.
Item types: Are items multiple-choice, short-answer, essay, rating scales, performance tasks, or a combination? Rating scales used to assess seven dimensions of critical thinking.
Performance vs. perceptions: Does the instrument provide a direct measure of what students know/can do, or indirect measures based on self-reporting and perceptions? More indirect/subjective measures of perceptions of performance.
Customization: Can institutions add their own items? Yes; we did.
Technology: How is it completed (paper, on-line, either)? What platforms are required? It is a one-page paper form.
Completion time: How long does it take to complete it? Time varies depending on how many dimensions are scored. It is fairly easy to administer and a great teaching tool.
Testing window: When and how often can it be administered? We use it frequently and on an ongoing basis throughout our course.
Cost: What does it cost per administration? None.
Reports: Who does the scoring? How is it scored? How are results reported (simple frequencies and percentages, indexes, benchmarks)? Students can self-assess or faculty can score.
Data files: Are raw data files returned to the institution? Are the data identifiable so a student's results can be linked with other information about that student? Raw data is kept by either the student or the professor.
Data security: To what extent, and how, is data security maintained? The rubric itself is a public document; completed sheets are kept by the professor or returned to students, just like other papers or exams.
Comparisons: Do institutions receive aggregated data from other institutions? Can they request specific comparison groups? Can the instrument be used to track changes in student outcomes over time? No.


Additional information:

II. Evaluation rubric
For each of the characteristics listed below, mark the box that indicates your evaluation of the quality of the instrument with respect to that characteristic. If you wish, add comments in the box (or on the reverse) to explain why you evaluated the instrument in this way. The number next to each rating indicates the score you will enter into the accompanying rubric for comparing multiple instruments.

Instrument characteristics (rating scale: Excellent = 2, Satisfactory = 1, Poor = 0)

Mission synergy: The instrument measures outcomes appropriate to the mission of the institution and/or program being assessed.
Rating: 2

Conceptual alignment: The definition of the outcome implicit in the instrument fits the definition of the outcome by the institution or program.
Rating: 2

Credibility: The instrument answers questions posed by faculty and administrators in ways they are likely to find meaningful and persuasive.
Rating: 2

Validity and reliability: The instrument measures what it claims to measure, and results are consistent over time when the conditions are consistent.
Rating: 1

Manageability: The instrument can be administered with reasonable institutional effort.
Rating: 2

Representativeness: Recruitment of participants yields representative data.
Rating: 2 (no recruitment needed)

Actionability: Faculty can use what is learned from the instrument to improve curriculum and instruction and to strengthen student learning.
Rating: 2

Cost-effectiveness: The cost of collecting and analyzing the data is commensurate with the knowledge gained.
Rating: 2

Sustainability: There is institutional support (staff, funds, time) to continue this assessment in a reasonable manner.
Rating: 2

Other characteristic (describe):

Overall quality: Viewed holistically, the instrument supports mission-driven, meaningful, and manageable assessment.
Rating: 2
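The rubric instructions above note that each numeric rating is entered into an accompanying rubric for comparing multiple instruments. That comparison rubric is not reproduced in this report, so the following is only a minimal sketch of one plausible tally, using the ratings reported above for the CCTST and the modified WSU rubric; the blank "Other characteristic" rows and the holistic "Overall quality" ratings are excluded from the totals, and any weighting in the actual comparison rubric is not represented.

# Ratings transcribed from the evaluation tables above
# (2 = Excellent, 1 = Satisfactory, 0 = Poor).
RATINGS = {
    "CCTST (Form 2000)": {
        "Mission synergy": 1, "Conceptual alignment": 1, "Credibility": 2,
        "Validity and reliability": 2, "Manageability": 2,
        "Representativeness": 2, "Actionability": 1,
        "Cost-effectiveness": 0, "Sustainability": 0,
    },
    "Modified WSU CT rubric": {
        "Mission synergy": 2, "Conceptual alignment": 2, "Credibility": 2,
        "Validity and reliability": 1, "Manageability": 2,
        "Representativeness": 2, "Actionability": 2,
        "Cost-effectiveness": 2, "Sustainability": 2,
    },
}

# One simple, unweighted way to compare instruments: total the
# per-characteristic ratings against the maximum possible score.
for name, scores in RATINGS.items():
    total = sum(scores.values())
    print(f"{name}: {total} / {2 * len(scores)}")

On this unweighted tally the modified WSU rubric totals 17 of 18 possible points and the CCTST totals 11 of 18, which is consistent with their holistic "Overall quality" ratings of 2 and 1 above.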


