WY State Model Educator Support and Evaluation System. July 18, 2013

THE WYOMING STATE MODEL EDUCATOR SUPPORT AND EVALUATION SYSTEM

DRAFT: July 18, 2013

Introduction

The Advisory Committee to the Wyoming Select Committee on Educational Accountability was

charged with carrying out the recommendations put forth in the Wyoming Accountability in

Education Act of 2012 (WEA 65) and House Bill 0072 (2013 Chapter 167). The specific charge

for the Advisory Committee was to design a State Model for educator evaluation in Wyoming.

The Select Committee was quite clear that they wanted a balance between state and local control

and, in keeping with Wyoming’s educational philosophy, the Select Committee placed

considerable authority for making specific design and implementation decisions with local

educational leaders and teachers. However, in order to best support the work of districts, the

Advisory Committee produced this document: The Wyoming State Model for Educator Support

and Evaluation System. This model system outlines methods and design decisions necessary for

implementing an educator evaluation system and indicates where the Advisory Committee

recommends that the requirements be “tight” or more standardized across districts and

where flexibility is expected and even encouraged. The Advisory Committee intends for the

Model System described below to be usable by districts as the basis for their local

systems if they choose. The Model System will not be “plug and play” in that local districts will

still have many decisions to make to operationalize their local system, but this Model System is

designed to make districts’ jobs considerably easier.

A critical aspect of the model system, as reflected below in the key principles, is the intention to

build both an internally coherent system and an educator evaluation system that is coherent with

other educational accountability systems in Wyoming. A coherent system would use

information from the school accountability system (and perhaps the district accreditation system)

to supplement the information generated from the educator accountability systems. For example,

if a school has demonstrated high achievement and the students are growing at admirable rates,

there is good evidence of high quality education in the school. This suggests that the State (likely

WDE) can trust that the educators in the building are performing well. Relying on the larger

sample sizes associated with a school than with any individual teacher means that the

determinations are that much more reliable. This intent to build off of the information from the

school accountability does not relieve school districts from implementing educator evaluation

systems, but it could mean that the state would have to provide far less oversight of educator

evaluation systems in high performing schools.


Key Principles

The following principles guided the development of the Wyoming State Model Support and

Evaluation System. The Advisory Committee kept these principles at the center of its

deliberations in the development of the various components of the system and these principles

are at the heart of the recommendations discussed throughout this document. As noted below,

the primary purpose of the system is to maximize student learning and improvements in student

learning. The system must maintain the focus on student learning and all of the following

principles support this primary purpose.

1. The primary purpose of Wyoming’s educator evaluation system is to support and

promote increases in student learning in Wyoming schools such that all Wyoming

students graduate ready for college or careers.

2. The system must be designed coherently to support a system of continuous school

improvement. A coherent system will work with the school and leader accountability

systems and foster collaboration among educators, administrators, and other

stakeholders.

3. The State Model and locally-aligned versions of the system shall be designed to

promote opportunities for meaningful professional growth of educators by providing

specific and timely feedback on multiple aspects of professional practice and student

learning.

4. The system must be designed and implemented with integrity. A system designed

with integrity will be transparent such that all relevant participants clearly understand

the expectations.

5. The State Model must allow for flexibility to best fit local contexts and needs. The

local evaluation systems should be designed collaboratively by administrators and

educators, with input gathered from parents and community members.

6. The system will provide credible information to support hiring, placement, and career

ladder decisions in a defensible manner.

7. The system must be supported by local and state policy makers to ensure that leaders

and teachers have the proper opportunities and resources to successfully implement

the system.

Domains of the Wyoming Educator Evaluation State Model

A key aspect of the State Model is that it will contain five major components: four domains of

professional practice and one domain of student performance data. The four domains of

professional practice noted below represent the overarching categories of the Interstate Teacher


Assessment and Support Consortium Model Core Teaching Standards (InTASC Standards)1.

Districts will use a variety of tools to measure professional practice (e.g., Danielson’s

Framework for Teaching; Marzano’s Art and Science of Teaching). The Advisory

Committee does not want to limit the options to specific tools, but recommends that all local

systems measure the four domains of effective teaching described in the InTASC Standards and

that district leaders document the degree to which their selected tool provides evidence for each of

the four domains.

1. Learner and Learning

2. Content Knowledge

3. Instructional Practice

4. Professional Responsibility

The State Model is designed to promote coherence and integration among the five domains.

Therefore, the Advisory Committee recommends weighting each component, especially student

learning, as equally as possible in the overall evaluation of each educator. Further, there is an

important difference between nominal (intended) and effective (actual) weights and the Advisory

Committee recommends that as each district pilots its system, it analyzes the data to determine

the actual weight of the various domains. This actual weighting will depend on the variability in

the responses to the specific instruments used in each district. In the following sections, the

major components of the State Model are discussed in more detail.
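The difference between nominal and effective weights can be checked with a short calculation once pilot data are available. The sketch below is illustrative only. It assumes the overall score is a weighted sum of the components and uses one common definition of effective weight: the covariance of each weighted component with the composite, divided by the composite variance. The function names, component names, and scores are hypothetical and are not part of the State Model.

```python
from statistics import mean


def covariance(xs, ys):
    """Sample covariance of two equal-length score lists."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def effective_weights(scores, nominal):
    """Return each component's share of the composite score variance.

    scores  -- dict: component name -> list of scores, one per educator
    nominal -- dict: component name -> intended (nominal) weight
    Assumes a weighted-sum composite that varies across educators.
    """
    n = len(next(iter(scores.values())))
    composite = [sum(nominal[c] * scores[c][i] for c in scores) for i in range(n)]
    total_var = covariance(composite, composite)
    return {
        c: covariance([nominal[c] * s for s in scores[c]], composite) / total_var
        for c in scores
    }


# Hypothetical pilot data: six educators, each component scored 1-4.
pilot_scores = {
    "learner_and_learning": [3, 3, 3, 4, 3, 4],
    "content_knowledge": [3, 3, 4, 4, 3, 3],
    "instructional_practice": [2, 3, 3, 3, 3, 3],
    "professional_responsibility": [3, 3, 3, 3, 3, 4],
    "student_learning": [1, 2, 3, 4, 2, 4],
}
nominal_weights = {name: 0.2 for name in pilot_scores}  # equal nominal weights
actual_weights = effective_weights(pilot_scores, nominal_weights)

for name, weight in actual_weights.items():
    print(f"{name}: nominal {nominal_weights[name]:.2f}, effective {weight:.2f}")
```

In this made-up data the practice ratings vary little while the student learning scores spread widely, so student learning accounts for roughly half of the composite variance despite its nominal 20% weight.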

The Advisory Committee recommends requiring that districts provide evidence that any tool

used for evaluating teacher practices validly measures all four domains. The Advisory

Committee was willing to allow districts to alter the weighting of the various domains as long as

student learning counts for at least 20% and all domains are fully evaluated in each three-year

period. The Advisory Committee further recommends that teachers in each district have input

into the weighting decisions of each district’s system.
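A district could check a proposed weighting against the 20% floor with a few lines of code. The following is a minimal sketch, not a prescribed procedure. The function name, the component keys, and the assumption that weights are expressed as fractions summing to 1 are all hypothetical.

```python
import math

MIN_STUDENT_LEARNING_WEIGHT = 0.20  # floor stated in the text above


def check_weights(weights, student_key="student_learning"):
    """Return a list of problems found in a proposed set of component weights.

    weights -- dict: component name -> weight as a fraction of the overall rating
    Assumption (not from the State Model): weights must sum to 1.
    """
    problems = []
    if not math.isclose(sum(weights.values()), 1.0, abs_tol=1e-9):
        problems.append("weights do not sum to 100%")
    if weights.get(student_key, 0.0) < MIN_STUDENT_LEARNING_WEIGHT:
        problems.append("student learning counts for less than 20%")
    return problems


# Hypothetical district proposal that under-weights student learning.
proposed = {
    "learner_and_learning": 0.25,
    "content_knowledge": 0.20,
    "instructional_practice": 0.25,
    "professional_responsibility": 0.15,
    "student_learning": 0.15,
}
print(check_weights(proposed))
```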

Standards of Professional Practice

The State Model uses InTASC Standards as the framework for evaluating teachers relative to the

four domains of effective teaching. This recommendation is based in part on ensuring that the

State Model is not tied to any commercial products, but to open source materials widely used by

multiple states and districts. Local districts may adopt tools or approaches to add more

specificity to the InTASC Standards, but the Advisory Committee recommends requiring

districts to document that any tools used in their local model are supported by research

or at least best practice. The specific InTASC Standards, grouped by domain, are

presented below. For a more complete explanation of the standards, please refer to the InTASC

document referenced in the footnote.

1 Council of Chief State School Officers. (2011, April). Interstate Teacher Assessment and Support Consortium

(InTASC) Model Core Teaching Standards: A Resource for State Dialogue. Washington, DC: Author.

http://www.ccsso.org/Resources/Programs/Interstate_Teacher_Assessment_Consortium_(InTASC).html

Learner and Learning

Standard #1: Learner Development. The teacher understands how learners grow and develop,

recognizing that patterns of learning and development vary individually within

and across the cognitive, linguistic, social, emotional, and physical areas, and

designs and implements developmentally appropriate and challenging learning

experiences.

Standard #2: Learning Differences. The teacher uses understanding of individual differences

and diverse cultures and communities to ensure inclusive learning environments

that enable each learner to meet high standards.

Standard #3: Learning Environments. The teacher works with others to create environments

that support individual and collaborative learning, and that encourage positive

social interaction, active engagement in learning, and self motivation.

Content Knowledge

Standard #4: Content Knowledge. The teacher understands the central concepts, tools of

inquiry, and structures of the discipline(s) he or she teaches and creates learning

experiences that make the discipline accessible and meaningful for learners to

assure mastery of the content.

Standard #5: Application of Content. The teacher understands how to connect concepts and

use differing perspectives to engage learners in critical thinking, creativity, and

collaborative problem solving related to authentic local and global issues.

Instructional Practice

Standard #6: Assessment. The teacher understands and uses multiple methods of assessment to

engage learners in their own growth, to monitor learner progress, and to guide the

teacher’s and learner’s decision making.

Standard #7: Planning for Instruction. The teacher plans instruction that supports every

student in meeting rigorous learning goals by drawing upon knowledge of content

areas, curriculum, cross-disciplinary skills, and pedagogy, as well as knowledge

of learners and the community context.

Standard #8: Instructional Strategies. The teacher understands and uses a variety of

instructional strategies to encourage learners to develop deep understanding of

content areas and their connections, and to build skills to apply knowledge in

meaningful ways.


Professional Responsibility

Standard #9: Professional Learning and Ethical Practice. The teacher engages in ongoing

professional learning and uses evidence to continually evaluate his/her practice,

particularly the effects of his/her choices and actions on others (learners, families,

other professionals, and the community), and adapts practice to meet the needs of

each learner.

Standard #10: Leadership and Collaboration. The teacher seeks appropriate leadership roles

and opportunities to take responsibility for student learning, to collaborate with

learners, families, colleagues, other school professionals, and community

members to ensure learner growth, and to advance the profession.

Performance Standards

All Wyoming schools, as determined by their districts, will classify all licensed personnel, as

illustrated by the State Model, as highly effective, effective, needs improvement, or

ineffective based on data from measures of the standards for professional practice and measures

of student performance. The evaluation system will produce an overall rating for each teacher.

To arrive at an overall rating, a description of performance that characterizes the types of

knowledge, skills, dispositions, and behaviors of an “effective” teacher (as well as other levels)

must be provided. Further, if there is any hope of comparable ratings across the state, common

performance level descriptors must be used. Performance standards describe “how good is good

enough” and the “performance level descriptor” (PLD) is the narrative component of the

performance standard that describes the key qualities that differentiate educators at each of the

various levels.

The InTASC Standards provide performance descriptors for each of the ten standards, but they

do not provide an overall description for various levels of teacher effectiveness. One might ask,

why not require educators to meet the requirements on each of the ten standards in order to be

classified as effective? Such a conjunctive system where candidates must meet every threshold

in order to be classified as “effective” is both unrealistic and unreliable. No Child Left Behind’s

(NCLB) Adequate Yearly Progress (AYP) system is the most recent, well known example of a

conjunctive system that leads to many unreliable and invalid decisions. Therefore, a more

compensatory approach where stronger performance in one area may offset weaker performance

in other areas is more reliable and often much more realistic. Further, hybrid systems can clearly

value important aspects of the domain while allowing some compensatory decisions elsewhere in

the system. Therefore, an educator evaluation system that results in an overall classification for

each educator must also include an omnibus description of educator effectiveness. This

definition is also critical to help guide the data collection and validity evaluation of the system.
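The contrast between conjunctive, compensatory, and hybrid decision rules can be made concrete with a small example. The sketch below is illustrative only. The 1-4 rating scale, the cut score of 3, the floor of 2, and the particular hybrid rule (a compensatory average plus a minimum on every standard) are assumptions chosen for the example, not rules adopted by the Advisory Committee.

```python
from statistics import mean


def conjunctive(scores, cut=3):
    """Effective only if every standard meets the cut score."""
    return all(s >= cut for s in scores)


def compensatory(scores, cut=3):
    """Effective if the average across standards meets the cut score."""
    return mean(scores) >= cut


def hybrid(scores, cut=3, floor=2):
    """Compensatory overall, but no single standard may fall below the floor."""
    return mean(scores) >= cut and min(scores) >= floor


# Hypothetical ratings on the ten InTASC standards (1-4 scale):
# strong on nine standards, weaker on one.
ratings = [4, 4, 4, 4, 4, 2, 4, 4, 4, 4]

print("conjunctive: ", conjunctive(ratings))
print("compensatory:", compensatory(ratings))
print("hybrid:      ", hybrid(ratings))
```

Under these assumed cut scores, the conjunctive rule rejects the teacher because of the single rating of 2, while the compensatory and hybrid rules classify the same teacher as effective.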


The State Model provides PLDs for each of the four overall levels of the system. These

descriptors connect the standards for professional practice with the various data produced by the

measurement instruments used in the system. This overall description is necessary, because an

effective teacher is not necessarily a simple sum of the scores on the various

components/indicators in the system. Further, defining an effective teacher as one who is

effective on each component will establish a “conjunctive” system (e.g., NCLB-AYP) with the

potential negative consequence of having very few teachers classified as effective or highly

effective. DRAFT PLDs for Wyoming’s State Model Support and Evaluation System are presented

below. Each PLD essentially describes the final evaluation of how well a teacher has performed

in any given year based on all factors considered. The Advisory Committee believes that in

order to validly classify the performance of educators into one of the four levels named above, a

profile or decision matrix should be established to appropriately balance the different

components and indicators so that an educator never receives an unexpected overall

rating.

Highly Effective

Teachers performing at the highly effective level consistently advance student growth and

achievement. They set and maintain high expectations for learning and achievement for all

students and create an environment of mutual respect, inquisitiveness, and caring.

Highly effective teachers demonstrate extensive knowledge of content, standards, and

competencies, and connect them to relevant local and global issues. These teachers model and

encourage innovation, creativity, critical thinking, and inquiry processes for their students, and

use their expertise and skills to engage their students in authentic, accessible, and meaningful

learning opportunities aligned to the content, standards, and competencies.

Highly effective educators facilitate personalized learning2 through intentional, flexible, and

research-based strategies. They are literate in multiple forms of assessment and incorporate

these multiple assessment strategies to evaluate student learning and adjust instruction

accordingly. Highly effective educators integrate technology into their instructional and

assessment approaches in ways that advance student learning opportunities.

Finally, highly effective educators consistently demonstrate leadership in their contributions to

their school’s academic progress and culture of growth. They engage productively in learning

communities and continuously strive to maximize their own self-directed professional growth.

These educators consistently uphold high standards of professional practice.

2 The United States Department of Education (ED) 2010 National Education Technology Plan: “Transforming

American Education: Learning Powered by Technology”. Personalized Learning: Personalized learning refers to

instruction that is paced to learning needs, tailored to learning preferences, and tailored to the specific interests of

different learners. In an environment that is fully personalized the learning objectives and content as well as the

method and pace may all vary (so personalization encompasses differentiation and individualization).

Effective

Educators performing at the effective level generally advance student growth and achievement.

[NOTES: James—even the worst teacher advances student achievement—need to be more

explicit.] They set and maintain high expectations for learning and achievement for all students,

create an environment of mutual respect and caring, and engage students in appropriate

learning opportunities.

Effective educators demonstrate sound knowledge of content, standards, and competencies, and

connect them to relevant real world issues. These teachers model and encourage innovation,

creativity, critical thinking, and inquiry processes for their students, and use their expertise and

skills to engage their students in authentic, accessible, and meaningful learning opportunities

aligned to the content, standards, and competencies.

Effective educators facilitate personalized learning through research-based strategies. They use

multiple forms of assessment to evaluate student learning and adjust instruction accordingly.

Effective educators appropriately integrate technology into their instructional and assessment

approaches.

Finally, effective educators contribute collaboratively to their school’s academic progress and

culture of growth by engaging in learning communities, fostering their own self-directed

professional growth, and frequently providing leadership to support improvements in their

colleagues’ performance. These educators consistently uphold professional standards of

practice.

Needs Improvement

Educators performing at the needs improvement level inconsistently advance student growth and

achievement. They establish expectations for learning and achievement for most students and

engage students in appropriate learning opportunities.

Educators performing at the needs improvement level demonstrate knowledge of content,

standards, and competencies. These educators use their knowledge and skills to engage their

students in accessible and meaningful learning opportunities aligned to the content, standards,

and perhaps competencies.

Educators performing at the needs improvement level attempt to facilitate personalized learning

using a mix of research-based and other strategies. They use multiple forms of assessment to


evaluate student learning, but do not consistently use the results to adjust instruction

accordingly. Educators performing at the needs improvement level may use technology in their

instruction and assessment approaches.

Finally, educators performing at the needs improvement level participate in learning

communities, but do not consistently attend to their own self-directed professional growth.

These educators uphold professional standards of practice.

The Advisory Committee recognizes that this description may be seen as reflecting emerging growth,

as opposed to solely noting deficiencies, particularly for beginning educators or those

experienced educators undertaking a new assignment. Local school districts should make

clear in their narrative which situation applies in the evaluation process.

Ineffective

Educators performing at the ineffective level may advance some student growth and

achievement, but frequently fail to improve most students’ growth. They are unable to establish

ambitious and reasonable expectations for student learning for most students and may be unable

to engage students in appropriate learning opportunities.

Educators performing at the ineffective level may have a limited knowledge of content,

standards, and competencies, but these teachers do not use their knowledge and skills to engage

their students in accessible and meaningful learning opportunities aligned to the content,

standards, and perhaps competencies.

Educators performing at the ineffective level may attempt to facilitate personalized learning

using a mix of research-based and other strategies, using multiple measures and technology, but

cannot prove consistent improvement in instruction.

Finally, educators performing at the ineffective level participate in learning communities, but do

not attend to their own self-directed professional growth and/or support the growth of their

colleagues. These educators generally uphold professional standards of practice.

NOTE: The following is an alternative conception of the PLDs produced by Kris Cundall that

we think merits serious consideration.


| Domain | Highly Effective | Effective | Needs Improvement | Ineffective |
|---|---|---|---|---|
| Learner/Learning | Learning experiences are consistently reflective of individual differences and inclusive learning environments. | Learning experiences are mostly reflective of individual differences and inclusive learning environments. | Learning experiences are sometimes reflective of individual differences and inclusive learning environments. | Learning experiences are usually not reflective of individual differences and inclusive learning environments. |
| Content Knowledge | The teacher has an extensive knowledge of content area and consistently connects concepts and engages learners. | The teacher has strong knowledge of content area and frequently connects concepts and engages learners. | The teacher has a basic knowledge of content area and sometimes connects concepts and engages learners. | The teacher has a basic or insufficient knowledge of content area and infrequently connects concepts and engages learners. |
| Instructional Practice | | | | |
| Professional Responsibility | The teacher continually engages in professional learning and evaluates and adapts his/her practice. | The teacher engages in professional learning and continually evaluates and adapts his/her practice. | The teacher engages in professional learning and sometimes adapts his/her practice. | The teacher attends professional learning and sometimes adapts his/her practice. |

Note: No matter which performance level descriptors we recommend, they will need to be vetted

by key stakeholders around the state before finalizing this report. The Advisory Committee

strongly endorses employing a set of common performance descriptors for Wyoming in order to

promote comparable expectations for educators across districts.

General Evaluation State Model

The general measurement State Model describes the overall approach for how local districts

following the State Model would approach the data collection involved in evaluating educators.

The measurement State Model follows from the key principles outlined at the beginning of this

document. There are four domains of educator practice along with evaluations based on student

achievement. The general measurement State Model is tied to this overall depiction, but

provides more structure for the State Model and perhaps local instantiations of the State Model.

All evaluations, conducted using the State Model, shall include:


[NOTE TO ADVISORY COMMITTEE: We need to decide what aspects we would like to

require of all districts versus what we are recommending for all districts.]

Professional practice measures

Multiple approaches and measures will be used to collect data on educator practices to best tailor

the data collection approaches to the complex nature of teaching practice. Each educator shall conduct

a self-assessment each year that will be used as the foundation of a goal setting meeting with the

principal and/or peer coach (mentor). The self assessment and collaboratively established goals

will be used to focus the professional practice data collection for the year in which the educator

is being formally evaluated. For the years in which the educator is not undergoing a formal

evaluation, the self assessment and goals shall be used to guide professional development and

formative evaluation. Data related to professional practices shall be collected using:

• A selective set of artifacts related to these goals and key aspects of professional practice, particularly for standards 4 (content knowledge), 6 (assessment), and 7 (planning for instruction), and

• Observations of practice by educational leaders and potentially peers.

Measures of student performance

• Student Learning Objectives (SLOs)

• Student Growth Percentiles (SGPs), if applicable

• The SLO and/or SGP results may be “shared” among multiple educators depending upon local theories of action around school improvement.

As part of the general measurement approach, the State Model includes the use of multiple

measures of each domain when possible and when the use of the multiple measures improves the

validity of the evaluation decision. In addition to multiple measures, the Advisory Committee

recognizes the challenge of having enough expertise and time in any single individual to conduct

all required evaluations. Therefore, the State Model includes the optional use of peer teams, in

addition to building-level administrators, to participate and advise in the evaluation process.

The Advisory Committee further recommends that at least part of the SLO and/or SGP results be

shared among multiple educators depending upon local theories of action around school

improvement.

Documentation of Practice: Artifact Collection

The artifact is a critical component of WY’s State Model and contributes data to multiple

domains of teacher practice. The Advisory Committee suggests that all educators establish (or

maintain) yearly professional goals in consultation with their supervisor or designee and

document the process and products associated with these goals through a selective collection of


artifacts documenting teachers’ professional practices and evidence of student learning. The

Wyoming Department of Education (WDE) or other designees will produce resources designed

to support the use of artifacts for evaluation purposes. The Advisory Committee recommends

that each educator’s evaluation incorporate the following components:

• Documentation of self-assessment

• Documentation of collaboratively (among educator, administrator, and perhaps peer team) established specific goals

• A plan, including identified professional development, for achieving the goals

• Includes, among other things, analyses of key artifacts such as student work from specific assignments, planning documents, and assessments related to the established goals

• Self-reflection at the end of the year to self-evaluate the extent to which the specific goals have been achieved

Implementation and Differentiation

The Advisory Committee has been sensitive to balancing the needs of creating a valid system

with an understanding that the system or one like it must be implemented by all school districts

without creating an unmanageable burden. While many states require a full evaluation of every

teacher every year, the Advisory Committee quickly recognized that this would place an

impossible and inefficient burden on WY schools. Therefore, the Advisory Committee

recommends differentiating evaluations according to the experience and status of the schools’

educators. Ultimately, each district shall enact a policy and set of procedures to differentiate

evaluation systems for its different classes of educators (e.g., novice, veteran, and/or high

performing, low performing) and to the specific evaluation questions to be investigated. Each

educator shall be evaluated at least once, using the full system, within the first three years of

implementation, with novice educators evaluated every year. To the extent possible, yearly

evaluations shall include multiple years of student performance results.

Novice educators, defined as those within the first three years of the teaching profession, must be

evaluated every year until they are rated “effective” for two consecutive years. In order to be

granted professional (continuing contract) status, educators must be rated effective for two

consecutive years along with other local district requirements. These two events can happen

concurrently. Districts may decide to focus specific aspects of the evaluation for novice

educators by reducing the demands of certain aspects of the systems and/or focusing the

evaluation on specific standards.

Teachers with professional status (continuing contract) receiving an ineffective or needs

improvement rating shall be evaluated every year until they receive “effective” ratings or better


for two consecutive evaluations or until other actions are taken. Once these teachers receive two

consecutive effective ratings, they shall receive summative evaluations every three years.
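The evaluation cycle described in the two paragraphs above can be summarized as a simple rule. The sketch below is one possible reading, not district policy. The function name and rating labels are hypothetical. It treats “highly effective” as satisfying the “effective” requirement for novices as well as for teachers with professional status, and it ignores other local district requirements and any “other actions” a district might take.

```python
EFFECTIVE_OR_BETTER = {"effective", "highly effective"}


def years_until_next_summative(rating_history):
    """Return 1 for annual evaluation or 3 for the three-year cycle.

    rating_history -- list of summative ratings, oldest first
    An educator stays on annual evaluation until the two most recent
    ratings are both "effective" or better.
    """
    last_two = rating_history[-2:]
    if len(last_two) == 2 and all(r in EFFECTIVE_OR_BETTER for r in last_two):
        return 3
    return 1


# Hypothetical rating histories.
print(years_until_next_summative([]))                                  # new educator
print(years_until_next_summative(["needs improvement", "effective"]))
print(years_until_next_summative(["effective", "highly effective"]))
```

Because the rule looks only at the two most recent ratings, an educator on the three-year cycle who later receives a needs improvement or ineffective rating returns to annual evaluation, as the text requires.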

Specific Measurement State Model

NOTE TO ADVISORY COMMITTEE: Again, we provide a lot of recommendations for how

we would collect data for an evaluation. You need to decide what we want to require,

recommend, and/or suggest.

The specific measurement State Model adds the details to the general measurement State Model

to guide the data collection methods in order to successfully conduct educator evaluations. Such

a detailed measurement State Model would describe the type and frequency of data collection

approaches for each of the major domains. The following section includes a brief review of the

relevant InTASC standards, organized by major domain, and then provides recommendations for

how the performance of educators related to each domain may be evaluated. Additional

guidance by WDE, this Advisory Committee, or others will help fully describe the specific

measurement procedures and policies to be enacted for the various educators in the system.

Domain 1: Learner and Learning

Standard #1: Learner Development. The teacher understands how learners grow and develop,

recognizing that patterns of learning and development vary individually within

and across the cognitive, linguistic, social, emotional, and physical areas, and

designs and implements developmentally appropriate and challenging learning

experiences.

Standard #2: Learning Differences. The teacher uses understanding of individual differences

and diverse cultures and communities to ensure inclusive learning environments

that enable each learner to meet high standards.

Standard #3: Learning Environments. The teacher works with others to create environments

that support individual and collaborative learning, and that encourage positive

social interaction, active engagement in learning, and self motivation.

Well structured and multiple classroom observations may be used to contribute data for

evaluating educators in relationship to standards 2 and 3. However, such observations would be

unlikely to reveal enough information about teachers’ understanding of learner development

(standard 1) to enable evaluators to make valid judgments, so other evidence is needed. For example, planning documents

that describe how the educator includes an understanding of learning theory and individual

differences would be a source of information for judging educators. Similarly, evidence of

reading and understanding relevant literature could provide documentation for educators’

consideration of learner development as part of the teaching process. Of course, a thoughtful

evaluator would want to ensure that the educator could apply such theoretical and/or empirical


reading to actual classroom practice. Some of this understanding could be revealed through

reflection and planning documents, but also through pre- and post-observation conferences.

Given the variety of information necessary to support decisions related to this domain, the

Advisory Committee recommends that local evaluation systems include sources of evidence,

similar to the examples described here, in the evaluation of educators according to Domain 1.

Domain 2: Content Knowledge

Standard #4: Content Knowledge. The teacher understands the central concepts, tools of

inquiry, and structures of the discipline(s) he or she teaches and creates learning

experiences that make the discipline accessible and meaningful for learners to

assure mastery of the content.

Standard #5: Application of Content. The teacher understands how to connect concepts and

use differing perspectives to engage learners in critical thinking, creativity, and

collaborative problem solving related to authentic local and global issues.

This domain requires a teacher to demonstrate deep knowledge of disciplinary content and how

to connect that content knowledge with appropriate instructional strategies or what is referred to

as pedagogical content knowledge. Similar to Domain 1, it is unlikely that evaluators could

collect information about content and pedagogical content knowledge simply through

observations of practice. Content knowledge (standard 4) must be evaluated through collection

of artifacts such as successful completion of programs of study and/or in-depth discussions with

experts in the relevant content area. Once high levels of content knowledge have been

established, the educator should include, as part of her/his self-reflection and goal setting, plans

to stay current and improve her/his understanding of the discipline. The educator should be

expected to document and reflect on her/his new understandings of the discipline as part of the

artifact collections.

Pedagogical content knowledge or the application of content to instructional practice (standard 5)

should also be evaluated by examining planning and reflection documents. However, evaluators

may gather critical information related to standard 5 through structured observations of practice

that include pre- and post-observation conferences to allow for reflection on this standard.

Domain 3: Instructional Practice

Standard #6: Assessment. The teacher understands and uses multiple methods of assessment to

engage learners in their own growth, to monitor learner progress, and to guide the

teacher’s and learner’s decision making.


Standard #7: Planning for Instruction. The teacher plans instruction that supports every

student in meeting rigorous learning goals by drawing upon knowledge of content

areas, curriculum, cross-disciplinary skills, and pedagogy, as well as knowledge

of learners and the community context.

Standard #8: Instructional Strategies. The teacher understands and uses a variety of

instructional strategies to encourage learners to develop deep understanding of

content areas and their connections, and to build skills to apply knowledge in

meaningful ways.

Information about the way in which an educator plans for instruction (standard 7) and uses

assessment (standard 6) may be revealed through pre- and post-observation conferences,

particularly planning for instruction, but examining artifacts such as unit plans, syllabi, and

assessment tools would reveal important information about these standards. Further, the

Advisory Committee is convinced that evaluators cannot validly judge how well educators

understand and use assessment to improve learning (standard 6) without hearing or reading how

educators use student work to reflect on what was revealed in the assessment process and what

instructional decisions should be made based on these results.

On the other hand, capturing information about educators’ use of appropriate instructional

strategies (standard 8) would be very difficult without direct classroom observations. The

Advisory Committee recognizes that any manageable schedule of observations will be

necessarily “thin.” In the years that the teacher is evaluated, the Advisory Committee

recommends that teachers be observed formally on at least three different occasions. The

general time frame/unit of instruction for the observations shall be determined in consultation with the

educator, but the specific lessons observed may be unannounced. At least one of the

observations, but preferably most of them, shall be tied to aspects of the curriculum that are the

focus of the Student Learning Objectives (SLOs) in order to use information about what students

have learned to triangulate the information. Further, the observations shall include an analysis

and discussion of relevant documents associated with the unit of study being observed. These

documents may include lesson plans, assessments, assignments, student work, and other relevant

documents associated with the teaching, learning, and assessment of the unit.

Domain 4: Professional Responsibility

Standard #9: Professional Learning and Ethical Practice. The teacher engages in ongoing

professional learning and uses evidence to continually evaluate his/her practice,

particularly the effects of his/her choices and actions on others (learners, families,


other professionals, and the community), and adapts practice to meet the needs of

each learner.

Standard #10: Leadership and Collaboration. The teacher seeks appropriate leadership roles

and opportunities to take responsibility for student learning, to collaborate with

learners, families, colleagues, other school professionals, and community

members to ensure learner growth, and to advance the profession.

Professional responsibility generally cannot be evaluated from classroom observations. To

clarify, professional responsibility may be observed informally through noticing how the

educator interacts with colleagues, parents, or others, but it is unlikely that information about

professional responsibility can be collected through formal classroom observations. The

Advisory Committee recommends that the yearly self-reflection and goal setting activities

specifically address aspects of professional responsibility and establish the focus of professional

responsibility of the given year. The Advisory Committee deliberated whether teachers new to

the profession should be exempted from being evaluated on Domain 4, but overwhelmingly

recommended that all educators be expected to demonstrate their responsibility as

professional educators. One potential difference between novice and experienced educators is that

novice educators may focus on more inward-facing aspects of this domain, as discussed in

standard 9, while experienced educators may continue to focus on these internal aspects of

responsibility, but would also be expected to become more outward-facing leaders whether in the

school, the district, or the profession at large. The specific focus of the professional

responsibility will guide the required data collection and reflection.

Student surveys

The Advisory Committee discussed the merits and challenges associated with incorporating the

results from student surveys into teacher evaluation decisions. On one hand, using information

from students solves a major “sampling problem” associated with both teacher observations and

student test scores. Even the most ambitious observation schedule of something like four or five

one-hour observations in a year (and most would consider 2-3 ambitious) is still four or five

hours out of a possible 720 instructional hours each year (180 days x 4 instructional hours each

day). Student Growth Percentiles (or value-added models) based on PAWS or potential SBAC

assessments, while technically strong, are only a sample of students’ knowledge and skills and

suffer from limited reliability based on small numbers of students in a given class or, in

Wyoming’s case, school. Student surveys, in contrast, collect information from those who are with the

teacher for essentially 100% of the teacher’s instructional performance. Further, by including

enough questions (e.g. 25-40), it is possible to generate fairly reliable results. In fact, the student

surveys were the most positive influence on the composite rating of teachers’ performance in

the Measures of Effective Teaching (MET) project when surveys, VAM results, and observations

were combined for a teacher rating.
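
As an illustration only, the sampling arithmetic above can be reproduced with a short calculation. The function name below is hypothetical; the default inputs are the figures quoted in the text (180 days, 4 instructional hours each day, one-hour observations).

```python
def observed_share(num_observations, hours_per_observation=1.0,
                   school_days=180, instructional_hours_per_day=4):
    """Fraction of a year's instructional time covered by formal observations."""
    total_hours = school_days * instructional_hours_per_day  # 720 with the defaults
    return num_observations * hours_per_observation / total_hours


# Compare a typical schedule (3 observations) with an ambitious one (5).
for n in (3, 5):
    print(f"{n} one-hour observations cover {observed_share(n):.2%} of instructional time")
```

Even the ambitious schedule covers well under one percent of instructional time, which is the "sampling problem" the paragraph describes.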


On the other hand, increasing reliability does not mean an increase in validity will automatically

follow. Several researchers have raised concerns that having students participate in the

consequential evaluation may change the “social contract” in the classroom. This concern

should not be taken lightly and if surveys are used, care must be taken in the design to deal with

potential challenges to the validity of the teacher evaluations.

The Advisory Committee has several recommendations if surveys are incorporated into district

evaluation systems:

1. Survey questions must be predominantly “low inference” type questions that ask about

specific practices (e.g., “how many times each week does your teacher ask you to explain your

reasoning”) compared with questions about feelings (e.g., “does your teacher care about

you?”).

2. Surveys should be piloted extensively so students can get used to completing surveys and

school personnel can gain an understanding of how the surveys relate to other

information about teachers.

3. Instead of incorporating the results of surveys into evaluations directly, districts should

consider using surveys as an additional factor to raise or lower a teacher’s evaluation.

4. In order to most conservatively provide the type of additional information called for in

#3, districts and schools should consider using the surveys normatively. In other words,

the survey results would only be a factor to adjust the evaluation results if the teacher’s

survey results were noticeably higher or lower than the average for other teachers at that

same grade span.

5. Student surveys should be designed to provide information regarding the standards for

which students would likely have meaningful insights. This would include most of

Domain 1 as well as standards 5, 6, and 8.
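
The normative use of surveys described in recommendations 3 and 4 could be sketched as follows. This is a minimal illustration, not a prescribed method: the function name, the rating scale, and the one-standard-deviation threshold for "noticeably higher or lower" are all assumptions that a district would need to set for itself.

```python
from statistics import mean, stdev


def survey_adjustment(teacher_score, peer_scores, threshold_sd=1.0):
    """Return 'raise', 'lower', or 'none' for a teacher's survey result.

    peer_scores: survey averages for other teachers in the same grade span.
    threshold_sd: hypothetical cutoff for a "noticeable" difference.
    """
    if len(peer_scores) < 2:
        return "none"  # too few peers to form a meaningful comparison
    peer_sd = stdev(peer_scores)
    if peer_sd == 0:
        return "none"  # no spread among peers, so no basis for adjustment
    z = (teacher_score - mean(peer_scores)) / peer_sd
    if z >= threshold_sd:
        return "raise"
    if z <= -threshold_sd:
        return "lower"
    return "none"


# Hypothetical survey averages for peers in the same grade span.
peers = [3.0, 3.2, 2.8, 3.1, 2.9]
print(survey_adjustment(3.4, peers))
print(survey_adjustment(3.05, peers))
```

Under this design most teachers receive no adjustment; the survey only matters when a result stands clearly apart from the grade-span norm.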

Domain 5: Student Performance

As stated in the first guiding principle of this State Model, the primary purpose of Wyoming’s

educator evaluation and the reason for engaging in this work is to support and promote increases

in student learning in Wyoming schools. Therefore, the results of student achievement must be

incorporated in the evaluations of all educators. While this sounds intuitively straightforward, it

is one of the most complex aspects of new forms of educator evaluation. The Wyoming State

Model uses a three part approach for incorporating student achievement and growth into

evaluations in order to attempt to maximize the benefits of doing so, while striving to minimize

potential unintended negative consequences.


Student Learning Objectives (SLO) form the foundation of Wyoming’s approach for

documenting changes in student performance associated with a teacher or group of educators

and, as such, all educators will have the results of SLOs incorporated into their evaluations. For

educators in “tested” subjects and grades, those grades and subjects for which there is a state,

standardized test as well as a state test in the same subject in the previous year, student

performance will be evaluated using Student Growth Percentiles (SGP), and the results of SGP

analyses, along with SLO results, will be used in the evaluations of educators in tested subjects

and grades. Both SGP and SLO approaches are described in more detail below.

Both SGP and SLO approaches can be used to attribute the academic achievement and growth of

students to individual educators or to appropriate aggregations of educators such as grade or

content-level teams or even the whole school. Distributing student performance results to

multiple educators is referred to as “shared attribution.” The tradeoffs associated with shared

attribution are also discussed below.

Student Learning Objectives (SLO)

All teachers, whether in “tested” or “non-tested” subjects and grades shall be required to

document student academic performance each year using SLOs in accordance with Wyoming’s

SLO guidance (see Appendix A). Both SGP and SLO analyses shall produce results in at least

three classifications of performance, to the extent possible, such as: high, typical/average, and

low. The results of the SLO determinations shall be incorporated into the evaluation of all

educators according to the rules described below in the section on combining multiple measures.

Calculating Student Performance Results in “Tested” Subjects and Grades

The growing interest in reforming long-standing approaches for evaluating and compensating

teachers has been characterized by, among other things, incorporating student performance results

in teacher evaluations. Advances in growth and value-added models in education have

contributed to the interest in using changes in student test scores over time as part of educator

accountability systems. Many districts, states, and non-governmental organizations have

embraced these test-based accountability initiatives, but the initial focus has been on the content

areas and grade levels for which there are state standardized tests, generally administered at the

end of each school year, or “tested” subjects/grades. Student performance for these tested

subjects and grades is generally evaluated using complex statistical models such as value-added

or student growth percentile models.

There are several possible approaches that Wyoming could use for evaluating student

performance in tested grades, but in order to adhere to the coherence principle, the Advisory

Committee recommends using the same Student Growth Percentile model currently being used

for the school accountability system. However, moving from school to teacher accountability is

not necessarily as simple as it sounds. Appendix B outlines multiple considerations for

using SGPs in educator evaluation.

WDE shall produce Student Growth Percentiles (SGP) results documenting the individual

student and aggregate growth for students. These results will be reported for the whole school

level and for identifiable student groups in the school. A student–level file will be provided to

each district to use for aggregating SGP results according to the attribution rules in each district’s

evaluation plan, whether for individual teachers, specific groups of teachers, or both. These

results, based on PAWS and eventually Smarter Balanced Assessment Consortium (SBAC) test

scores or another assessment, using the SGP model, shall be incorporated into teachers’

evaluations using either a shared or individual attribution model.
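
To illustrate how a district might aggregate the student-level file under different attribution rules, a minimal sketch follows. The record layout, the field names, and the use of the median as the summary statistic are assumptions made for illustration; the State Model does not specify them.

```python
from collections import defaultdict
from statistics import median

# Hypothetical student-level records: one row per student.
records = [
    {"teacher": "T1", "grade": 5, "sgp": 40},
    {"teacher": "T1", "grade": 5, "sgp": 60},
    {"teacher": "T2", "grade": 5, "sgp": 70},
    {"teacher": "T2", "grade": 5, "sgp": 30},
    {"teacher": "T2", "grade": 5, "sgp": 80},
]


def aggregate_sgp(rows, unit_key):
    """Summarize student SGPs for each attribution unit.

    unit_key: 'teacher' for individual attribution, 'grade' for attribution
    shared across a grade-level team.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[row[unit_key]].append(row["sgp"])
    # The median is an assumed summary statistic, not one named in the text.
    return {unit: median(values) for unit, values in groups.items()}


individual = aggregate_sgp(records, "teacher")  # one result per teacher
shared = aggregate_sgp(records, "grade")        # one result per grade-level team
print(individual, shared)
```

The same student-level file supports both approaches; only the grouping key changes, which is why the attribution rules can be left to each district's evaluation plan.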

Shared Attribution

The Advisory Committee recognizes the challenges of properly attributing the results of student

performance to individual teachers. It is easy to think of many examples where it does not make

much sense to attribute the performance of students to any individual teachers, such as the case

when grade-level teams of teachers place students into differentiated instructional groups and

instruction is provided by educators other than the students’ regular teachers. Therefore,

the Wyoming State Model relies on a mix of shared attribution and individual attribution of

student performance results. The SGP results, based on state tests in grades 3-8, should,

depending on the specific theory of improvement for the particular school, be shared among

educators at the same grade and/or teaching the same subject areas. SLO results, assuming

groups of educators are working on the same SLO, may also be shared among educators at the

same grade and/or content area. However, SLOs allow for more control than state test results

and the State Model requires that at least some portion of the SLOs used to document student

performance be attributed to the individual educator of record. Like anything else in

accountability system design, there are both advantages and disadvantages to using shared

attribution.

One of the major concerns with attributing the results of student performance to individual

teachers is that many fear that this could erode collaborative cultures at many schools, especially

if the results are used in some sort of “zero sum game” accountability design. Shared attribution

approaches, if implemented sensibly, can help promote both collaboration and internal (to the

group of teachers) accountability orientations, both of which are associated with high

performing schools and organizations. Another concern for policy makers and accountability

system designers is the potential for unintended negative consequences of having the mathematics and

reading teachers in grades 4-8 evaluated in potentially very different ways than the other 70-75%

of educators in the district. This could lead to higher rates of attrition from these subjects and

grades or perhaps a feeling of professional isolation. The requirement for all educators to


participate in the SLO process is one hedge against this potential problem. However, sharing the

results of all of the student performance indicators among multiple educators, as appropriate, is

one way to recognize the contributions of other educators to student performance, especially in

reading and math. Finally, one of the major concerns with tying student performance results to

individual teachers involves the reliability concerns when dealing with such small groups of

students. Aggregating the student performance results for multiple educators is one way to

ameliorate, but far from eliminate, these reliability challenges.

This discussion could lead one to ask why, given the many advantages of shared attribution, a

system would include any other approach. Of course, there are potential disadvantages to

shared attribution too. One important disadvantage, which could be reduced with careful

design, is that educators may be held accountable for results over which they may have little to no

control. This was a considerable criticism of Tennessee’s approach for including student

performance results in the evaluations of teachers from non-tested subjects and grades. This

threat is likely greatest when student performance on the state math and/or reading tests is

attributed to all educators in the school as opposed to a finer-grained aggregation. Another

potential disadvantage to shared attribution is that it may mask true variability in educator

quality. If we believe that educator quality is truly variable along a continuum of being able to

influence student performance, then pooling results among multiple educators could mask such

differences. Of course, being able to separate the “signal” (true variability) from the “noise”

(unreliability in the system) is not easy with such small samples. This is more problematic at the

elementary level with self-contained classrooms of 20 students or so compared to a middle school

where a teacher might be responsible for the math or reading instruction of over 100 students.

The Advisory Committee is well aware that the assumption of larger middle school student loads may not hold true in many of

Wyoming’s small schools and districts.

Therefore, the Advisory Committee recommends that sharing student performance results among

multiple educators should be based on more than just reliability concerns, but such decisions

must be tied to local theories of improvement. For example, if the focus of improvement

activities is the grade level team, then attribution should be shared among educators at that grade

and not at the whole school level. Therefore, the first step in implementing any sort of shared

attribution approach involves a careful articulation of the school’s locus of improvement actions.

This theory of improvement (action) should also make clear which subjects are shared and with

whom. For example, does the 5th grade team share both math and ELA results or just one

subject? Finally, while the Advisory Committee favors shared attribution approaches in many

cases and for at least some of the weight in the accountability determinations, it also

recommends that at least some of the changes in student performance be attributed to individual

teachers. This might best be accomplished with SLOs rather than SGPs because of the closer

ties to the specific course, but the Advisory Committee suggests leaving this specific decision to

local school districts.


Combining Multiple Measures

There are many approaches for combining multiple indicators to yield a single outcome:

compensatory, conjunctive, disjunctive, and profile methods. Compensatory means that higher

performance in one measure may offset or compensate for lower performance on another

measure. Conjunctive means that acceptable performance must be achieved for every measure

(e.g., AYP). Disjunctive means that performance must be acceptable on at least one measure. A

profile refers to a defined pattern of performance that is judged against specific performance

level descriptions. A profile approach is often operationalized using a matrix to combine

indicators for making judgments. Given the challenges involved in characterizing the

complexities of teaching, the State Model must employ a thoughtful approach for combining the

multiple sources of data in order to produce the most valid inferences about overall teacher

quality possible.
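
The first three combination rules can be contrasted with a brief sketch. The function names, the rating scale, the cut score, and the use of an unweighted mean for the compensatory rule are assumptions chosen only for illustration.

```python
def compensatory(scores, cut):
    """Higher scores may offset lower ones: judge the (unweighted) average."""
    return sum(scores) / len(scores) >= cut


def conjunctive(scores, cut):
    """Every measure must reach the cut score."""
    return all(score >= cut for score in scores)


def disjunctive(scores, cut):
    """At least one measure must reach the cut score."""
    return any(score >= cut for score in scores)


# Hypothetical ratings on three measures, judged against a cut score of 2.5.
scores = [4, 3, 1]
cut = 2.5
print(compensatory(scores, cut), conjunctive(scores, cut), disjunctive(scores, cut))
```

The same set of ratings passes under the compensatory and disjunctive rules but fails under the conjunctive rule, because a single low measure is decisive there.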

A compensatory approach recognizes that some degree of variability in performance across

indicators may be expected. Such an approach has a higher degree of reliability because the

overall decision is based on multiple indicators evaluated more holistically. Conjunctive

decisions are less reliable because errors accumulate across multiple judgments meaning a

teacher might fail to be classified as effective due to poor performance on the least reliable

measure. A conjunctive approach does not appear to make much sense for an educator

evaluation system. A disjunctive method is used when any one component is viewed as adequate

assurance the teacher met expectations. Again, this does not appear to make much sense in a

teacher evaluation system. Finally, profiles are useful especially when there are certain patterns

that can be described that reflect valued performance that are not easily captured, usually

because the combinations of criteria are judged to be not equivalent.

These approaches should not be regarded as mutually exclusive. It is possible, for example, to

combine aspects of compensatory and profile ‘rules’ to arrive at a final result. For example, a

compensatory approach may be used to aggregate the data from the multiple measures within

any single domain, while a profile approach could be used to combine information across

domains. A major advantage of a profile or decision matrix approach is that once established,

the teacher can never receive an unexpected overall rating, whereas simple averages

characteristic of a compensatory approach can produce some surprising outcomes.

The Advisory Committee recommends using, as part of the State Model, an approach for

combining the various sources of information that avoids mechanistic approaches such as simple

averaging, but that takes into account the nature of the different sources of information. A

“panel” or “decision matrix” approach for combining the multiple measures allows the goals of


the system to be reflected explicitly and not buried in some numerical composite. An example of

such a panel approach is found below.

EXAMPLE #1: A 4 x 3 Panel Approach for Combining Multiple Measures (based on an

approximate 25/75 weighting between student performance and teacher practices)

“Professional Practice”    “Student Performance” Rating
Rating                     1                    2                    3
4                          Automatic Review     Highly Effective     Highly Effective
3                          Needs Improvement    Effective            Effective
2                          Needs Improvement    Needs Improvement    Needs Improvement
1                          Ineffective          Ineffective          Automatic Review

Again, this is just an example and this sort of 4 x 3 matrix might be useful with an immature

system such as the type we would expect during early implementation phases. As the system

matures and more data are available for each educator, particularly in terms of student

performance, more expansive matrices may be appropriate, such as the example 5 x 4 matrix

below.

EXAMPLE #2: A 5 x 4 Panel Approach for Combining Multiple Measures (based on an

approximate 25/75 weighting between student performance and teacher practices)

“Professional Practice”    “Student Performance” Rating
Rating                     1                    2                    3                    4
5                          Needs Improvement    Effective            Highly Effective     Highly Effective
4                          Needs Improvement    Effective            Effective            Effective
3                          Needs Improvement    Needs Improvement    Effective            Effective
2                          Ineffective          Needs Improvement    Needs Improvement    Needs Improvement
1                          Ineffective          Ineffective          Needs Improvement    Needs Improvement
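
A panel of this kind is simply a lookup table, as the following sketch of the 4 x 3 example shows. The cell values are those of the 4 x 3 example in this document; the names `PANEL_4X3` and `overall_rating` are hypothetical.

```python
# Rows: "Professional Practice" rating (4 = highest).
# Columns: "Student Performance" rating 1, 2, 3 (left to right).
PANEL_4X3 = {
    4: ("Automatic Review", "Highly Effective", "Highly Effective"),
    3: ("Needs Improvement", "Effective", "Effective"),
    2: ("Needs Improvement", "Needs Improvement", "Needs Improvement"),
    1: ("Ineffective", "Ineffective", "Automatic Review"),
}


def overall_rating(practice, student_performance):
    """Look up the overall rating for a pair of component ratings."""
    if practice not in PANEL_4X3:
        raise ValueError(f"practice rating must be 1-4, got {practice}")
    row = PANEL_4X3[practice]
    if not 1 <= student_performance <= len(row):
        raise ValueError(f"student performance rating must be 1-3, got {student_performance}")
    return row[student_performance - 1]


print(overall_rating(3, 2))
print(overall_rating(4, 1))
```

Because every combination is written out in advance, the values placed in each cell state the system's priorities explicitly, including the cells where sharply conflicting evidence triggers an automatic review.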

NOTE to Advisory Committee: How much detail do we want to include in the State Model for

combining across the various domains and for making overall determinations? We think it

would be really useful for us to actually deliberate on the values we want to put in the matrices

that we include in the document as models.



Supports and Consequences

Assumptions

As stated in the guiding principles, Wyoming’s State Model is being designed such that it can

support improvements in teaching and learning. As part of this design, the Advisory Committee

emphasizes the importance of reporting detailed and actionable information so that educators and

their leaders have the information they need to guide efforts to improve their practice. This

means that educators need to receive information on each of the indicators in the system, while

recognizing that the information at the indicator level is considerably less reliable than the total

evaluation. This will require thorough documentation of each local system, in terms of the components

and indicators outlined in this document, so that all educators understand the nature of the

information on which they will be evaluated.

The WY State Model and all local systems must produce an overall effectiveness rating that

guides support, career development, and employment decisions. The overall rating can only be

an overall flag to guide support since the detailed information is necessary to allow for focused

support and development.

Supports

A critical support is ensuring that each educator understands the rules by which they will be

evaluated. Therefore, each district shall develop and implement a process for training all

licensed personnel on the educator evaluation system including the consequences associated with

the ratings.

In order to fulfill one of the major guiding principles that the system is being designed to

improve educators’ performance, the State Model requires that each Wyoming school district

include a well-specified and formalized process of mentoring and support designed to

improve the performance of all educators in the district. The support and mentoring systems

should be designed collaboratively with teachers and administrators based on research and

documented best practices.

Districts shall provide training for all personnel who will be conducting classroom observations

as part of a defined training and qualification process. This training will help leaders better

understand differences in instructional quality so that they can better support their teachers’

improvement efforts. Additionally, all evaluators (administrators) must receive evidence-based

training on how best to provide feedback to those evaluated in order to support understanding of

the information derived from the evaluation system and to improve practice.


Note to Advisory Committee: Do we want to say anything about consequences in the state

model? Shall these all be suggestions and not requirements or even recommendations?

Consequences

Ultimately, the system will lead to certain consequences for educators falling well below or well

above expectations. While the system is designed for improvement and a significant support

system is required to help struggling educators, there will likely come a point where educators

may need to be counseled out of the profession. The State Model includes the following

expectations for such eventualities:

1. Educators rated ineffective or needs improvement in one year must be placed on a directed

professional growth (improvement) plan that includes receiving targeted support. These

support systems must be research-based to the maximum extent possible. Further, the

evaluations of the educators involved in a directed professional growth plan shall include

additional data sources such as video records of classroom teaching experiences. The

video recording of classroom teaching is designed to serve two purposes. First, it can be a very

effective feedback tool for all educators, but particularly for struggling educators if

viewed with an expert mentor. Second, the video evidence will allow for review by an

appeals panel within the school district to ensure the accuracy of the principal ratings for

classroom performance.

2. The State Model requires that an experienced educator with two consecutive years of

ineffective ratings will lose her/his current (continuing contract) status and may be

dismissed without additional cause. The Advisory Committee recognizes that such

potential consequences will need to be incorporated into locally-negotiated personnel

contracts.

3. After receiving a second consecutive “needs improvement” rating, the educator will be

considered to have received his/her first year of an ineffective rating.

4. An educator rated highly effective for two consecutive ratings should receive recognition,

as determined by the local district, and may assume a “teacher leader role” as part of the

mentoring and support system.
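
One possible reading of the four expectations above is sketched below. The function name, the rating labels as strings, and the `experienced` flag are assumptions for illustration. The sketch looks only at the two most recent ratings and does not attempt to chain expectation 3 into expectation 2, which local policy would need to define.

```python
def consequences(history, experienced=True):
    """Return the expectations triggered by an educator's rating history.

    history: list of overall ratings, oldest first, most recent last.
    """
    flags = []
    if not history:
        return flags
    last = history[-1]
    prev = history[-2] if len(history) >= 2 else None
    # Expectation 1: a single low rating triggers a directed growth plan.
    if last in ("Ineffective", "Needs Improvement"):
        flags.append("directed professional growth plan")
    # Expectation 2: two consecutive ineffective ratings (experienced educators).
    if experienced and last == prev == "Ineffective":
        flags.append("loss of continuing contract status")
    # Expectation 3: a second consecutive needs improvement rating.
    if last == prev == "Needs Improvement":
        flags.append("counts as first year of an ineffective rating")
    # Expectation 4: two consecutive highly effective ratings.
    if last == prev == "Highly Effective":
        flags.append("recognition; may assume a teacher leader role")
    return flags


print(consequences(["Effective", "Needs Improvement", "Needs Improvement"]))
```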

Implementation Recommendations

The Advisory Committee, as can be seen from the preceding discussion, has been very

thoughtful about designing a State Model for educator evaluation in Wyoming. We have

attempted to outline a clear approach to addressing the complexities of designing and

implementing educator evaluation systems in Wyoming. However, the Advisory Committee

wants to stress that there are enormous challenges to implementing such systems in any locale.

One positive aspect of having Wyoming follow other states and districts in this work is that we

have the opportunity to learn from the experiences of others. One of the most striking things


being learned is that significant time and thoughtfulness are needed to implement these systems

well.

This would be true even under conditions where the state standards and assessment systems were

stable. As we know, Wyoming has recently adopted the Common Core State Standards which

call for deeper levels of understanding on the part of students than ever before. Shifting

instructional practices and curriculum will require considerable effort on the part of local school

districts. Adding requirements for a new school accountability system will further stress

systems. Therefore, the Advisory Committee appreciates that the educator evaluation system in

Wyoming can be implemented with an extended pilot period, both to phase in the system

gradually and to allow formative feedback to inform adjustments to the system before it is

implemented operationally.


APPENDIX A:

Student Learning Objectives: Guidance

NOTE TO ADVISORY COMMITTEE: Appendices A and B still need some work.

Introduction

The Wyoming Accountability Advisory Committee recommends the use of Student Learning

Objectives (SLOs) to document educators’ contributions to student performance in both “tested”

and “non-tested” subjects and grades. SLOs are content- and grade/course-specific measurable

learning objectives that can be used to document student learning over a defined period of time.

In essence, educators establish learning goals for individual students or groups of students, monitor

students’ progress toward these goals, and then evaluate the degree to which educators help

students achieve these goals. This is a key advantage of the SLO approach. It is designed to

reflect and incentivize good teaching practices such as setting clear learning targets,

differentiating instruction for students, monitoring students’ progress toward these targets, and

evaluating the extent to which students have met the targets.

There are several important considerations for employing SLOs in educator evaluations. First,

the quality of the objectives and the validity of the inferences that can be made from the SLO

process must be assured. Second, the process by which the objectives are established must be

considered if the objectives are to be seen as fair for all educators. Third, the measurement approaches

and tools must enable educators and their evaluators to judge the extent to which educators have

met their objectives. Finally, the oversight and support, especially the professional development

necessary to help educators and administrators learn how to set and evaluate meaningful

objectives, and the cross school/district monitoring will be critical to assure fairness and rigor

within and across schools and districts.

While many have an interest in developing “growth-based” SLOs (i.e., measuring the change in

student achievement over two or more points in time), most will be “status-based,” usually

roughly conditioned on estimated initial understanding, then evaluating the degree to which

students reach specific targets on the measurement at the end of the instructional period. This

distinction between growth and status SLOs is discussed in more detail in Marion, et al. (2012).³

This section of the report will help guide educators and administrators in designing and

implementing a local SLO process. It is divided into four sections: 1) The Objectives; 2)

³ Marion, S., DePascale, C., Domaleski, C., Gong, B., and Diaz-Bilello, E. (2012, May). Considerations for analyzing educators’

contributions to student learning in non-tested subjects and grades with a focus on Student Learning Objectives.

http://www.nciea.org/publication_PDFs/Measurement%20Considerations%20for%20NTSG_052212.pdf


The Objective Setting Process; 3) Assessment/Measures; and, 4) Oversight and Support. Each

section provides both recommendations and a rationale for the recommendations. To the extent

applicable, reference is made to the distinction between the early implementation years

and a more complete operational system.

The Objectives

The number and specificity of the objectives are important considerations in terms of

maximizing the validity of the evidence regarding the claims one is trying to make as a result of

the SLO process. At a minimum, evaluators are at least implicitly claiming that the results of the

SLO determinations for a given time period are a fair and valid depiction of the learning results

of an individual or group of students associated with a particular educator or educators. The

intention is to clearly use the results of the SLO process as evidence of the quality of a particular

educational experience in a particular setting.

SLOs will work best if they are situated within the theory of action or theory of improvement for

the particular school. In order to help ensure the validity of the claims about educators from the

SLO process, it is important to use a sufficient number and representativeness of objectives to

ensure that the domain of the course is appropriately sampled, but not so many objectives that

certain objectives become trivialized. As such, educational leaders should consider requiring

that at least a portion of the SLOs in the building will be shared among a group of educators

(e.g., grade level team). Further, while most SLOs will be tailored to the specific learning targets

in the particular class or course, district and school leaders should work to have SLOs related to

overall school improvement goals to the extent practical. The following recommendations are

designed to maximize the validity of the inferences from the SLOs related to educator quality

while trying to manage the implementation challenges of a new SLO process.

1. All non-administrator educator evaluations shall include a minimum of two, individually-

based SLOs for each individual educator in a building during the first pilot year. By the

first operational year up to four SLOs per teacher should be the requirement to ensure

that the subjects and grades are more appropriately represented in the complete set of

SLOs. Reliability concerns can be mitigated by:

a. Using multiple measures for each SLO, and,

b. Increasing the number of SLOs, each with its own measure.

2. Objectives for each educator should be as representative of the set of courses/subjects

they teach as possible. For example, a middle or high school teacher should have

objectives from multiple sections or courses. This does not mean that every

course/section is represented, but there should be an effort to ensure such representation

over time. Similarly, objectives for elementary school teachers should be as

representative as possible for the subjects that these teachers teach.


3. The objectives shall be linked to the appropriate specific content and skills from the

Wyoming Content Standards and/or course standards. The SLOs should be targeted to

“enduring understandings” or high priority standards. In other words, given the limited

number of student learning objectives for each teacher, they should be tied to the most

critical learning outcomes. It will be important for educators to focus on the most

important outcomes and be cautious not to narrow the curriculum.

5. Each educator shall participate in at least one shared or aggregate objective. This may be

in alignment with a school wide goal or could be a grade level or content area goal

(typically for middle or high school). This should be based on a theory of

action/improvement that leaves the school and district able to decide on the appropriate

aggregation (e.g., grade level teams) based on school/district philosophy. For example,

most schools have “literacy across the curriculum” initiatives in place and it will make

sense to maintain focus on such initiatives through the SLO process.

6. Objectives for each individual educator, and especially the shared/aggregate objectives,

should reflect consideration of the overall school improvement plan.

7. Growth-based objectives should be encouraged, but employed only where it is possible to do

so in technically defensible ways (Marion, et al., 2012).

8. The objectives should be ambitious, but realistic. Further, the objectives should be rich

enough such that educators are not simply classified as having met or not met the specific

objectives. The student learning objectives should be tied to a rubric of performance that

includes at least three or four levels. The objectives should be able to produce nuanced

results such as “clearly not met,” “partially met,” “met objective,” and “exceeded

objective,” as categories of performance. Such an approach will encourage objectives

rich enough to support such a scoring scheme and will hopefully maximize the chances of

capturing the true variance in educators.

Establishing the Learning Goal

The process of establishing the student learning objectives is critical to the fairness, educator

buy-in, and manageability of the SLOs. A process should be established so that educators are

held to similar levels of rigor at least within a school building. The focus should be on trying to

implement as comparable a process within each school as possible. Hopefully in the long run,

this comparability will expand across the district. If SLOs are to lead to the improvements in

student learning that many hope to see, educators should fully participate in the process and not


“have SLOs done to them.” The following recommendations are designed to address these

concerns.

1. Each district shall establish a model for ensuring that learning goals across the

district are as comparable as possible. Participating on statewide peer teams to establish

learning goals for content areas may be an option for districts to consider. Further, the

principal or her/his designees shall consider comparability when approving all learning

goals in the building.

2. Generally, the school principal is legally responsible for the evaluation of all personnel in

the building and therefore should approve all objectives. However, the principal,

especially at the secondary level, should consider employing a team approach to take

advantage of distributed leadership and expertise. Having a single point person (or team)

can help ensure the comparability of SLOs across the school building.

3. In addition to school administrators, teams of educators shall be involved in establishing

both shared and individual teacher objectives. Team members may include: members

of the same academic department, grade level colleagues, district content area experts,

and other qualified individuals. This recommendation is designed to address three major

concerns: content knowledge, comparability, and buy-in.

4. Each educator shall have considerable say in establishing her/his SLOs. Shared district

learning goals can influence educator SLOs, but with administrator approval, significant

educator input is appropriate to better fit the needs of each educator’s particular classes.

5. Relevant performance data on the students for whom learning goals will be established,

including longitudinal information where available, as well as data from the same course

in prior years, shall be used to assist in establishing meaningful objectives.

6. The learning goals for each course should be established ideally prior to the beginning of

the course, but no later than within four weeks of the start of the course.

Assessments/Measures

Even with rigorous and appropriate learning goals, SLOs may be meaningless without high

quality measures to evaluate the degree to which students achieved these learning goals. In fact,

the quality of the measures may be the Achilles Heel in the entire SLO process, because outside

of a few core content areas, the quality of the available measures is quite variable at best.

However, rather than using concerns about potential measures as a reason to abandon the SLO


process, the SLO approach should motivate an upgrade of the quality of measures and

assessments available for teachers to document student learning.

Educators should rely on the best measures available to evaluate the specific SLOs. The use of

the measures should be driven by the fit between the particular learning targets and the

assessments used to evaluate the SLOs. The highest quality assessments should be used to

evaluate the SLOs, but these assessments should be the ones that best match the specific learning

goals. It will be a challenge in the early years to find high quality assessments to evaluate the

SLOs, but this should be seen as an opportunity to improve the quality of local assessments.

This is one of the main reasons why it makes sense to focus first on status-based SLOs. It will

be hard enough to develop or select at least one high quality assessment to evaluate SLOs

without the challenge of needing to find both a high quality pretest and posttest (again, see

Marion, et al., 2012). The following recommendations are intended to help guide the assessment

component of the SLO process.

1. Common district assessments, tied to the specific learning goals, created by the district or

other entities shall be used to evaluate SLOs to the extent that the assessment provides a

valid measure of the learning goal. Determining what constitutes a valid measure of the

learning goal is not an easy task and there will be other resources available, such as

quality criteria for assessments, to help districts evaluate the technical quality of various

assessments.

2. WDE and a consortium of districts shall be encouraged to facilitate the development of

resources/tools (e.g., common rubrics, common assessments) as examples to aid in the

assessment of SLOs in non-tested subjects and grades. It makes little sense for every

district to tackle this challenge on its own, so this recommendation is intended to

encourage cross-district collaboration to build higher quality SLOs and associated

assessments than would be possible if each district were working on its own. Because the

Advisory Committee is concerned about the cost, both in terms of time and money, of

creating new common assessments for courses and grades where there are currently no

state-supported assessments, criteria for quality student assessments will be established,

models and examples will be provided, and local districts and schools will be

provided professional development on creating quality assessments. This is an important

aspect of building professional human capacity.

5. Educator performance on the SLOs should generally be scored using three categories of

performance (e.g., exceeded SLO, met SLO, and did not meet SLO).

Oversight and Support


Designing and implementing an SLO process assumes that teachers and leaders have the

knowledge and skills to establish appropriate learning goals, set ambitious and meaningful

performance targets, locate or develop assessments suitable for measuring student learning

relative to these goals, and evaluate educator performance according to how well the students

performed. Educators will need professional development to gain the knowledge and skills

necessary to sustain wide-scale implementation of the SLO process. Further, some level of

monitoring and oversight at the state level is necessary to promote comparability in SLO

processes and outcomes. Comparability of SLOs and SLO outcomes is a major concern of the

Wyoming Advisory Committee. As such, the recommendations discussed below are intended to

help ensure comparability of goals and objectives starting from the classroom (i.e., multiple

SLOs within the same classroom and across classrooms should be comparable) to the school,

district, and state. The recommendations that follow are intended to address the support

necessary to successfully implement an SLO approach for documenting educator contributions to

student learning as well as to provide guidance around the type of monitoring and support the

Advisory Committee recommends for the state and districts.

1. WDE, based on recommendations from the Advisory Committee, shall create clear

guidance for creating a local SLO process that includes the items described in this

document. This guidance shall describe criteria for developing and evaluating high

quality SLOs and should provide examples of both high quality and weaker (for contrast)

SLOs.

2. A State SLO Advisory Review Committee shall be established to review and support the

SLO process including evaluating the quality and rigor of objectives, assessment

measures, and performance expectations (what counts as “good enough”). This SLO

Advisory Review Committee will be designed to ameliorate differences in SLOs across

districts due, in part, to differences in district capacity. At a minimum, districts shall

conduct such processes across schools within their districts.

3. WDE along with contributing schools and districts shall develop a resource bank of

exemplar SLOs and potential assessment instruments.

4. Each district, with WDE support, shall design a structure and process for providing

professional development on the development of an SLO process for its educators and

administrators. This shall include training for educational leaders on how to work with

their teachers in establishing meaningful and rigorous learning objectives, how to

establish and support peer teams, and how to determine what types of assessments are

suitable for evaluating SLOs. The support for educators shall include training on using

use data to establish learning objectives, determining the appropriateness and


meaningfulness of targets, monitoring student progress toward the targets, and using

assessments to evaluate the degree to which students met the targets.⁴

5. As part of the pilot of the educator evaluation system, special attention should be devoted

to the ways that student growth measures work within the systems. The results of the

pilot process shall be reported and used to inform subsequent modifications to the SLO

process and the weighting of student growth in the Wyoming evaluation system.

⁴ Note: The Center for Assessment has already created and posted an SLO Toolkit.


APPENDIX B:

Considerations When Calculating Student Performance Results in “Tested” Subjects and

Grades

Incorporating the results of student achievement tests requires the Advisory Committee to

consider and make recommendations about several important issues. The following pages lay

out many of these considerations to provide background information for decisions to be made by

the Advisory Committee.

Tests Included

It is assumed that the grade/subject tests included in the Wyoming State Model will be the same

as those included in the school accountability system. Creating as much overlap as possible

among the set of included tests is a desirable feature of coherence. The proposed school

accountability system to meet the requirements of WEA 65 includes academic growth based on

state assessment results (PAWS currently) in grades 4-8 in reading and mathematics. Therefore,

these grades and content areas should serve as the basis for inclusion of SGP in the Wyoming

State Model as well.

Obviously, it is not desirable to exclude high schools from SGP calculations. The State Board

and the legislature are currently considering a plan for implementing end of course tests (EOC),

which may open up new options for calculating SGPs at the high school. However, calculating

growth at the high school level is extremely complex, particularly if, as expected, there is

variability in course sequence. Therefore, until we know much more about the developing high

school assessment system, the focus should be on grades 4-8 in reading and mathematics.

Teacher/Leader of Record

Another important consideration in operationalizing growth in Wyoming’s Educator Evaluation

State Model is determining which teacher/leader should be held accountable for a student’s

performance (leaving aside for the moment the discussion of shared attribution). A suitable

definition - and an accompanying data system that permits operationalization of this definition -

should establish the conditions and circumstances governing the connection of educators with

classes and account for the variety of learning environments in Wyoming’s schools. For

example, the Data Quality Campaign (DQC) (2010) advises states seeking to use assessment data

to inform educator evaluation to:

• Account for contributions of multiple educators in a single course

• Enable teachers to review rosters for accuracy

• Account for schedule changes and variable class environments such as virtual classes or labs

• Link attendance records with teachers to track actual days of instruction

Based on the model for defining teacher of record offered by DQC (2010b), the following

questions are important to address in order to arrive at an operational definition for included

teacher/leader of record. Sample responses, intended only as ‘placeholders’ at this time, are

provided. It is recommended that the Advisory Committee carefully consider each.

• What educators and leaders will be included?

  o The primary educator who provides instruction contributing to and culminating in

    the statewide PAWS test in reading or mathematics

  o Elementary and middle school principals

  o Other building level leaders/administrators whose role is primarily associated

    with instruction

• How much instructional time is required to establish a link?

  o Teacher has primary responsibility for instruction in the class of record

  o Minimum of 90 days of instruction (approximately half of the full academic year)

    for the class of record

• What prior measures will be required?

  o At least one prior year summative state test score in the same content area

• Will any courses/schools be specifically excluded and why?

• Will any teachers/leaders be specifically excluded and why?

• What is the minimum n size?

  o Class and school growth estimates reported for groups of 20 or more students, but

    multiple years of data can be aggregated to reach 20 students.

• What is the inclusion rule?

  o Class scores are not reported if contributing students represent fewer than 25% of

    class size.

  o School scores are not reported if contributing students represent fewer than 25%

    of school size.

• What students will be included?

  o Students in grades 4-8 continuously enrolled for the full academic year in the

    current year participating in the state PAWS in reading or math.

  o All prior test scores in PAWS reading or math regardless of term of enrollment.
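The placeholder responses above can be read as a set of business rules. The Python sketch below is one hypothetical way to encode them, offered for illustration only. The record layout, the function names, and the reading of "class size" as the number of students on the roster are assumptions; the thresholds (90 days, 20 students, 25%) are the placeholder values listed above, not adopted policy.

```python
from dataclasses import dataclass
from typing import List, Optional

# Placeholder thresholds from the sample responses above (not adopted policy).
MIN_INSTRUCTION_DAYS = 90
MIN_N = 20
MIN_INCLUSION_RATE = 0.25


@dataclass
class StudentRecord:  # hypothetical record layout
    grade: int
    full_academic_year: bool        # continuously enrolled in the current year
    days_with_teacher: int          # days of instruction in the class of record
    current_score: Optional[float]  # current-year PAWS score, if any
    prior_score: Optional[float]    # prior-year score in the same content area


def contributes(s: StudentRecord) -> bool:
    """True if the student's result can count toward the class growth estimate."""
    return (
        4 <= s.grade <= 8
        and s.full_academic_year
        and s.days_with_teacher >= MIN_INSTRUCTION_DAYS
        and s.current_score is not None
        and s.prior_score is not None
    )


def class_is_reportable(roster: List[StudentRecord], prior_years_n: int = 0) -> bool:
    """Apply the minimum n size and the inclusion rule to one class.

    prior_years_n is the number of contributing students carried over from
    earlier years, since multiple years may be pooled to reach the minimum n.
    """
    if not roster:
        return False
    n = sum(contributes(s) for s in roster)
    inclusion_rate = n / len(roster)
    return (n + prior_years_n) >= MIN_N and inclusion_rate >= MIN_INCLUSION_RATE
```

Writing the rules down in this form makes open questions visible, for example whether the 25% inclusion rule should be checked against the current roster only (as assumed here) or against the pooled multi-year group.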

Missing/Incomplete Data

Another ‘data issue’ to address is missing and/or incomplete data. This situation exists when any

of the following occur:


• One or more prior (pre) test scores are missing

• The current year (post) test score is missing

• The student is not continuously enrolled in a single building/class throughout the term of

  instruction

• The student record is missing or incomplete (e.g., test scores but no identifier)

Missing data can impact the precision and stability of the model and introduce systematic bias in

the resulting estimates (Braun et al., 2010). Moreover, it is generally acknowledged that data are

not Missing At Random (MAR), meaning that it is likely that the performance of students with

missing or incomplete data differs systematically from that of students with complete records. Consider,

for example, that mobility rates are typically higher for economically disadvantaged students

compared to other students.

There is no single or best approach to dealing with missing data. It is recommended that

Wyoming take these near-term steps moving forward.

• Identify business rules to clearly define which data are usable and which are not.

• Investigate the extent to which data are missing for districts, schools, and classes. Seek to

  understand patterns of missing data for various levels of performance and by subgroup.

  Such analyses will help determine the extent to which data are MAR or differ in a

  systematic manner.
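As a rough illustration of the second step, the Python sketch below tabulates the share of students with a missing prior or current score in each subgroup. The record layout, the subgroup labels, and the sample values are made up for the example; a real analysis would draw on the state data system and would also break results out by district, school, class, and performance level.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional, Tuple

# Hypothetical record: (subgroup label, prior score, current score); None = missing.
Record = Tuple[str, Optional[float], Optional[float]]


def missing_rate_by_group(records: Iterable[Record]) -> Dict[str, float]:
    """Share of students in each subgroup lacking a prior or current score."""
    totals = defaultdict(int)
    missing = defaultdict(int)
    for group, prior, current in records:
        totals[group] += 1
        if prior is None or current is None:
            missing[group] += 1
    return {group: missing[group] / totals[group] for group in totals}


# Made-up sample, for illustration only.
sample = [
    ("econ_disadvantaged", 580.0, None),
    ("econ_disadvantaged", 575.0, 600.0),
    ("other", 610.0, 625.0),
    ("other", 605.0, 630.0),
]
rates = missing_rate_by_group(sample)
```

Large differences in these rates across subgroups would suggest that missingness is systematic rather than random.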

Multiple Educators and Shared Attribution

Another issue to consider is how to handle circumstances where students receive instruction

from multiple educators. This may be regarded as a special case of the teacher/leader of record

issue, but merits specific attention.

There are three general cases that lead to this occurrence. First, the student may receive planned,

ongoing instruction from multiple teachers, as with a team teaching approach or scheduled

support sessions. Second, changes can occur throughout the year, such as a leave of absence for

the primary instructor or a student’s transfer to another class. Finally, additional instruction

can occur in a variety of contexts, such as when a student receives tutoring outside of class.

Whatever the case, multiple sources of instruction will likely have an impact on student

achievement.

Some researchers have hypothesized that a ‘dosage’ model may be appropriate in such

circumstances. That is, if Ms. Smith provides 70% of instruction and Mr. Jones provides 30% of

instruction, then the outcomes are assigned to the educators consistent with the proportion of


instruction provided. While it may be useful to research the feasibility of this approach, the

following caveats should be considered:

• It is unlikely that proportional contribution to instruction can be captured with precision,

  particularly when it is unscheduled. Also, it will be necessary to create potentially

  complex connections in the state data system to account for this.

• The proportional contribution to instruction may not be governed by time alone. For

  example, an hour spent introducing new concepts to a class may not represent the same

  ‘instructional contribution’ as an hour spent overseeing time allotted for student directed

  study.

• The research on attributing a student’s academic performance to teachers and leaders is

  emerging, even for the least ambiguous circumstances when the teacher of record is well

  defined. Much less is known about the credibility of results based on proportional

  attribution of scores.

Therefore, we strongly recommend using the shared attribution model discussed in the

main part of this document and basing decisions on which results get shared by which teachers on

an explicit theory of action or improvement for the school.
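For readers who want to see what the ‘dosage’ idea discussed above would involve, the Python sketch below shows the arithmetic only. It is not a recommended method: the function names are hypothetical, the use of a weighted mean of student growth percentiles is an assumption chosen for simplicity, and the sketch does nothing to address the caveats listed above.

```python
from typing import Dict, List, Optional, Tuple


def dosage_weights(instruction_share: Dict[str, float]) -> Dict[str, float]:
    """Normalize each educator's share of instruction so the weights sum to 1."""
    total = sum(instruction_share.values())
    if total <= 0:
        raise ValueError("instruction shares must sum to a positive value")
    return {name: share / total for name, share in instruction_share.items()}


def weighted_mean_sgp(
    students: List[Tuple[float, Dict[str, float]]], educator: str
) -> Optional[float]:
    """Dosage-weighted mean growth percentile for one educator.

    Each student is (growth percentile, dosage weights for that student).
    Returns None if the educator taught none of the students.
    """
    numerator = sum(sgp * w.get(educator, 0.0) for sgp, w in students)
    denominator = sum(w.get(educator, 0.0) for _, w in students)
    return numerator / denominator if denominator else None


# The example from the text: Ms. Smith provides 70% of instruction, Mr. Jones 30%.
weights = dosage_weights({"Smith": 70, "Jones": 30})
```

Even this small sketch depends on knowing each educator's share of instruction for every student, which is exactly the information the first caveat suggests will be hard to capture.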

Performance Standards

Coherence

In order to maximize the coherence between the school and educator evaluation systems, it is

desirable for performance expectations for growth at the class level to be similar by design to

growth targets at the school level. By so doing, the likelihood that outcomes will be favorable

for schools but not educators at that school (or vice versa) will be minimized. Additionally, it is

critical to ensure that the system does not create incentives that are in conflict.

More specifically, it is expected that the growth outcome for classes will be the median student

growth percentile (MGP) and that standards for meeting and exceeding targets will be coherent

with those established for the school. At the time of writing this document, the growth targets

for schools have not been finalized, but draft plans call for three categories of performance—

high, typical, and low—at the school level in grades 4-8 based on PAWS.
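A minimal Python sketch of this calculation follows. Because the growth targets had not been finalized at the time of writing, the two cut points below are purely hypothetical placeholders, as are the function names.

```python
from statistics import median
from typing import Sequence

# Hypothetical cut points; the actual school-level targets were not yet final.
LOW_CUT = 35
HIGH_CUT = 65


def median_growth_percentile(sgps: Sequence[float]) -> float:
    """Median student growth percentile (MGP) for a class or school."""
    if not sgps:
        raise ValueError("at least one student growth percentile is required")
    return median(sgps)


def growth_category(mgp: float, low_cut: float = LOW_CUT, high_cut: float = HIGH_CUT) -> str:
    """Place an MGP into one of three categories: low, typical, or high."""
    if mgp < low_cut:
        return "low"
    if mgp > high_cut:
        return "high"
    return "typical"
```

Using the same function and the same cut points at the class and school levels is one simple way to keep the two sets of expectations coherent by design.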

Before moving forward with growth standards for the educator accountability system, there are

at least three critical considerations that should be addressed by the Advisory Committee. The

first is to determine the number and type of growth levels that need to be produced to support the

intended purposes and uses of the system. The second consideration is to explore the extent to

which the proposed growth rates are both attainable and meaningful at the class level. Based on

the documentation provided to date, it appears that the school targets were selected normatively.


That is, performance cutscores were selected based on the percentages of schools that would end

up in each of the three categories. However, it is less clear if the growth rates in the proposed

meets and exceeds range are sufficient to establish meaningful growth to be on track to

achieve or maintain proficiency or readiness.

Finally, it is important to deal with the inherent unreliability of class level outcomes. Given that

class level results will be much more variable and subject to sampling error than school level

results, mechanisms must be put in place to deal with the lack of stability of outcomes in order to

have a greater degree of confidence in the results. The remaining two sections will address these

issues.

Reporting Outcomes

It is essential to determine the number and type of growth outcomes necessary to support the

purposes and uses of the educator evaluation system. In general, there is a tension between

reporting high-level results that are more reliable and the desire to report more nuanced but less

precise outcomes for multiple indicators. For example, there will be a much higher level of

confidence in classifications of class effects as low, typical, or high compared to class effects

described on a ten-point scale from 1 (ineffective) to 10 (highly effective). In the latter case,

stakeholders may regard this information as useful to understand more fine grained degrees of

difference, but such a scale may carry only the appearance of precision that is not supported by

evidence, particularly for adjacent ratings. The same issue is generally true for reporting units.

That is, results for individual content areas or classes will be much less defensible (and results

based on strands or subscores will be almost certainly indefensible) than aggregate results for

multiple classes. The goal, of course, is to find the balance between the necessary specificity of

outcomes and an acceptable level of precision. As a matter of best practice, it is advisable to

privilege technical defensibility, in order to provide the best case for results to be meaningfully

interpreted and utilized.

Norm and Criterion Referenced Growth

Broadly, approaches to identifying growth standards can be characterized as either norm-

referenced or criterion-referenced. A norm-referenced approach compares student achievement

to an expectation often based on a distribution of observed performance. Alternatively,

criterion-referenced growth standards establish a specific target outcome. For example,

requiring students who are not proficient to grow at a rate such that they achieve proficiency in a

set amount of time is a criterion referenced approach.

Each approach has advantages and limitations. Setting a norm-referenced expectation is useful

for identifying comparably high or low growth. Indeed, it seems intuitively reasonable to


describe valued growth as that which is significantly higher than that of other students. However, a

limitation is that some students who grow at very high rates relative to their peers may not

achieve proficiency in a reasonable amount of time. A criterion-referenced standard resolves this

potential ‘growth to nowhere’ problem, but raises a new issue: some students may be so far

below standard that even at exceptionally high rates of growth the student will not achieve

proficiency in a reasonable time frame. Particularly when growth is used for accountability

purposes, this can create a condition where some classes are uniformly disadvantaged.

Conversely, very high performing classes could exhibit little or no growth and meet standard.
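The two limitations of a criterion-referenced target can be seen in a small worked example. The Python sketch below assumes a simple linear path to proficiency; the scale scores, the proficiency cut of 600, the three-year window, and the function names are all hypothetical and are used only to show the arithmetic.

```python
def required_annual_gain(current_score: float, proficiency_cut: float, years_allowed: int) -> float:
    """Scale-score gain needed each year to reach the cut within the window."""
    if years_allowed <= 0:
        raise ValueError("years_allowed must be positive")
    return max(0.0, (proficiency_cut - current_score) / years_allowed)


def meets_criterion_target(
    observed_gain: float, current_score: float, proficiency_cut: float, years_allowed: int
) -> bool:
    """True if the observed one-year gain keeps the student on track."""
    return observed_gain >= required_annual_gain(current_score, proficiency_cut, years_allowed)


# A student far below the cut misses the target despite a sizable gain of 25 points,
far_below = meets_criterion_target(25, current_score=500, proficiency_cut=600, years_allowed=3)
# while a student already above the cut meets it with no gain at all.
already_above = meets_criterion_target(0, current_score=650, proficiency_cut=600, years_allowed=3)
```

In this made-up case the first student would need to gain about 33 points per year, so the target is missed, while the second student's required gain is zero.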

While the Advisory Committee recommended blending both normative and criterion approaches

for evaluating growth for school accountability purposes, standards for growth in educator

evaluation systems should only be normative. This is due to the fact that students, rightfully so

in many cases, are not randomly assigned to teachers. Requiring teachers to equally advance

students toward meaningful outcomes (e.g., proficient) does not take into account that this is

much more challenging for students far below proficient than for students closer to the proficient

cut. However, expecting all teachers to have their students grow at meaningful rates compared

to each student’s academic peers in a normative sense is fairer to all educators in the system.

Reliability

Reliability refers to the consistency or stability of a measure. In this case, we are interested in

the reliability of the measures of teacher/leader effectiveness based on a system influenced by

growth estimates. Reliability is challenging in this context due to the error in achievement

measures and growth measures and the likely variation in the performance of teachers, about

which little is known. We know little, except anecdotally, about the extent to which

performance differs across content areas for the same teacher. For example, would we expect a

teacher to be effective in ELA but not math? If so, to what extent would the levels of

effectiveness differ? Further, how stable is teaching effectiveness across years? Could a teacher

be effective one year but not the next and, if so, to what would we attribute this variability?

Ultimately, it is challenging to disentangle measurement error from true variation in

performance. In the end, an educator evaluation system is built on the assumption that

performance is “stable enough” to reliably detect some differences in true effectiveness.

One way to mitigate issues of unreliability is to base overall outcomes on aggregations of results

within content areas for the current year and across multiple years. For example, if a teacher

teaches three sections of the same mathematics class, the median growth informing the

performance category is based on all students across sections. Additionally, if that teacher has

results for the prior year, the teacher’s outcome for the current year could be based on the median

of the two years combined. The idea behind this approach is to minimize uncertainty. The

reliability of overall outcomes will also be improved by the manner in which additional elements

aside from academic growth are incorporated into the system (e.g., professional practices), but

that will be addressed separately.
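
The aggregation described above can be illustrated with a minimal sketch. It assumes that growth is expressed as student growth percentiles (SGPs) and that "the median of the two years combined" means pooling all students from both years before taking the median; the State Model does not specify this, and a district could reasonably choose another rule. The function name and all student values are hypothetical.

```python
from statistics import median


def median_growth(sgps_by_section, prior_year_sgps=None):
    """Return the median SGP across all sections, optionally pooled with a prior year.

    Assumption: students from every section (and the prior year, if given)
    are pooled into one list before the median is taken.
    """
    pooled = [sgp for section in sgps_by_section for sgp in section]
    if prior_year_sgps is not None:
        pooled.extend(prior_year_sgps)
    if not pooled:
        raise ValueError("no student growth percentiles supplied")
    return median(pooled)


# Hypothetical SGPs for three sections of the same mathematics class.
section_a = [35, 62, 48]
section_b = [71, 55]
section_c = [40, 66, 58, 29]

# Hypothetical SGPs for the same teacher's students in the prior year.
prior_year = [44, 52, 61, 38, 70]

current_only = median_growth([section_a, section_b, section_c])
two_year = median_growth([section_a, section_b, section_c], prior_year)

print(current_only)  # 55
print(two_year)      # 53.5
```

Pooling increases the number of students behind each estimate, which is the sense in which the aggregation reduces uncertainty; it does not by itself separate measurement error from true year-to-year variation in performance.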

Shared and Individual Attribution of Student Performance Results

The Advisory Committee recognizes the challenges of properly attributing the results of student

performance to individual teachers. Therefore, the State Model relies on a mix of shared

attribution and individual attribution of student performance results. The SGP results, based on

PAWS tests in grades 3-8, should, depending on the specific theory of improvement for the

particular school, be shared among educators at the same grade and/or teaching the same subject

areas. SLO results, assuming groups of educators are working on the same SLO, may also be

shared among educators at the same grade and/or content area. However, SLOs allow for more

control than state test results and the State Model requires that at least some portion of the SLOs

used to document student performance be attributed to the individual educator of record.

References

Data Quality Campaign. (2010a). Strengthening the teacher-student data link to inform teacher

quality efforts. Retrieved from: www.DataQualityCampaign.org/resources/947.

Data Quality Campaign. (2010b). Developing a definition of teacher of record. Retrieved from:

http://dataqualitycampaign.org/files/Teacher%20of%20Record.pdf.

National Research Council. (2010). Getting value out of value-added. H. Braun, N. Chudowsky,

and J. Koenig (eds.). Washington, DC: National Academy Press.
