Does Helping John Help Sue? Evidence of Spillovers in ...

This work is distributed as a Discussion Paper by the

STANFORD INSTITUTE FOR ECONOMIC POLICY RESEARCH

SIEPR Discussion Paper No. 16-020

Does Helping John Help Sue?

Evidence of Spillovers in Education

By

Isaac M. Opper

Stanford Institute for Economic Policy Research, Stanford University, Stanford, CA 94305, (650) 725-1874

The Stanford Institute for Economic Policy Research at Stanford University supports research bearing on economic and public policy issues. The SIEPR Discussion Paper Series reports on research and policy analysis conducted by researchers affiliated with the Institute. Working papers in this series reflect the views of the authors and not necessarily those of the Stanford Institute for Economic Policy Research or Stanford University.

Does Helping John Help Sue?

Evidence of Spillovers in Education∗

Isaac M. Opper†

April 28, 2016‡

Abstract

Using the fact that multiple elementary schools feed into the same middle school, I

demonstrate that the positive impact that teachers have on their own students spills

over to affect their students’ future peers. Although this indirect effect on any particular

individual is small, a teacher impacts many more students indirectly than directly, so the

indirect value is a sizable portion of a teacher’s total value; I find that ignoring teachers’

indirect effects underestimates their value by roughly 35%. Because the spillovers also

affect teacher value added estimates, I develop a method of moments estimator of teacher

value added that accounts for the spillovers and show that accounting for the spillovers

does not have a large impact on the ranking of teachers in New York City. I conclude

by showing that the spillovers occur within groups of students who share the same race

and gender, which highlights the crucial importance that the social network plays in

disseminating the effect.

∗ This paper would not have been possible without the guidance, advice, and support of Ran Abramitzky, Liran Einav, Caroline Hoxby, and Raj Chetty. I’ve also been helped, either directly or indirectly, by nearly every member of the Stanford faculty, including Tim Bresnahan, Mark Duggan, Matthew Gentzkow, Dave Donaldson, Jonathan Levin, Susanna Loeb, Petra Persson, Brad Larsen, Melanie Morton, and Pascaline Dupas. I’m also grateful to Magne Mogstad and Enrico Moretti for taking the time to talk with me about the project. In addition to help from faculty, the paper (and the author) has benefited greatly from many discussions with my fellow students, including, but not limited to: Zoe Cullen, Lindsay Fox, Rui Xu, Andres Drenek, Diego Perez, Pietro Tebaldi, and Joseph Orsini. Finally, I want to give special thanks to Igor Popov and Michael Dinerstein who have both helped the project from the very beginning, and without whom the project would not have managed to morph from “idea” to “paper.” But don’t blame any of the people above for potential errors you might find; those are solely my fault.

† Mailing Address: Stanford University Department of Economics, 579 Serra Mall, Stanford, CA 94305. Email: [email protected].

‡ The most recent version of the paper can be found at: https://web.stanford.edu/~imopper


I Introduction

Convincing the average economist, school administrator, or parent that teachers matter

is not a difficult task. For the empirically minded, the importance of teachers has been

unequivocally demonstrated in the last half-century by volumes of research that document

large and persistent differences in the effectiveness of individual teachers.1 More recently,

improvements in data quality have allowed researchers to better quantify the impact of

teachers on their students’ short-term achievement, whether measured by test scores,

attendance, or discipline, as well as longer-term outcomes, such as high school graduation,

college attendance, and later-in-life earnings.2

Yet the value of teachers potentially extends beyond the impact they have on their own

students. One reason for this is that by increasing the ability of their own students, effective

teachers increase the peer ability for a much larger group of students. If students are affected

by their peers’ ability, broadly defined, the teacher thus indirectly affects all of his or her

students’ future peers. In this paper, I quantify this effect directly and demonstrate that

ignoring it leads to substantial underestimates of a teacher’s total value.

Estimating these spillovers is complicated by the well-known reflection problem (Manski

(1993)), along with the difficulties associated with correlated unobservables and endogenous

group formation. Further complicating matters, the estimation of the spillovers requires me

to distinguish between a student having high quality peers and a student having peers

who had high quality teachers. To understand why this distinction is important, suppose

that after having a good teacher, a student is more motivated to work hard in school and

that it is this increased motivation that positively affects the student’s subsequent peers.

This increase in motivation is partly captured by test scores, but test scores capture many

different components. So applying the literature on peer effects to this question will not

necessarily lead to the correct spillover estimates.3

1 Hanushek (1971) and Murnane (1975) were two of the first papers that used an empirical approach to demonstrate the importance of teachers.

2 See Jackson et al. (2014), Koedel et al. (2015) and Staiger and Rockoff (2010) for three recent overviews of this research.

The approach in this paper is to estimate the spillovers by using the fact that multiple

elementary schools feed the same middle school. This way, I am able to treat one subgroup

in a larger class with a high quality teacher.4 Suppose, for example, that two elementary

schools, which for concreteness I will call Milton Fien and Robert J. Christen, feed one

middle school, and that in 2008 an effective teacher enters Milton Fien. By comparing the

students who attended Milton Fien after 2008 to those who attended Milton Fien before

2008, it is possible to estimate the direct effect of the teacher. Instead, I focus on the middle

school students who did not attend Milton Fien. While these students were not directly

affected by the new teacher, those who made the transition to middle school after 2008

entered a middle school with peers who had received better education than those who made

the transition earlier. Comparing the middle school test scores of students who attended

Robert J. Christen after 2008 to those who attended Robert J. Christen before 2008, I

directly estimate the indirect effect of the teacher.

This approach is valid as long as teacher turnover in a student’s neighboring elementary

school is uncorrelated with unobservable determinants of his or her test scores. A natural

concern is unobserved neighborhood shocks that both draw high quality teachers to the

local schools and lead independently to higher test scores. To determine whether this is a

concern, I conduct two tests. First, I test whether teacher transitions are correlated within

neighborhoods, and show that an effective teacher entering one elementary school does

not affect the probability that effective teachers enter the neighboring elementary schools.

While this suggests that there are no neighborhood shocks, I also use a placebo test to rule

out most other endogeneity concerns. After showing that the quality of a student’s current

peers’ previous teachers affects his or her test scores, I demonstrate that the quality of his

or her future peers’ previous teachers does not affect his or her test scores.

3 A similar point is made with more mathematical rigor and is explored in detail in Fruehwirth (2014).

4 Keeping the peer groups fixed and exogenously treating a portion of them is often referred to as the “partial population approach,” and is discussed in detail in Moffitt (2001). Other papers that have used the partial population approach in different contexts include Angelucci et al. (2010), Avvisati et al. (2014), Bobonis and Finan (2009), Dahl et al. (2014), Duflo and Saez (2003), Kuhn et al. (2011), Kremer and Miguel (2007), and Lalive and Cattaneo (2009).

The estimation itself is done using administrative data on all students who attended a

public school in New York City from 1990 to 2010. I show that an effective teacher not only

impacts his or her own students, but also individuals who later share a class with them.

This effect is both statistically and economically significant. My results suggest that an

increase in the average quality of a student’s peers’ previous teachers affects his or her test

scores by around 40% as much as an increase in his or her own teacher’s quality.

These spillovers have a large effect on teacher value estimates. In particular, I find

that ignoring a teacher’s effect on his or her student’s future peers understates a teacher’s

value by around 35%. These spillovers also lead naturally to concerns about direct value

added measures. To see why, suppose that half of teacher j’s students previously had an

ineffective teacher. The negative effect that this teacher had on his or her own students

is controlled for when estimating teacher j’s value added, but the negative effect that this

teacher had on the other half of the class is not controlled for. In practice, this negative

effect is misattributed to teacher j when constructing his or her value added. Motivated

by this, I develop a method of moments estimator that simultaneously estimates teacher value

added and the spillover parameter. By comparing these teacher value added measures to

the traditional measures of teacher value added, I demonstrate that, in New York City at

least, accounting for the spillovers does not have a large effect on the ranking of teachers.

I conclude by shedding some light on why the spillovers occur, by providing some

evidence about what characteristics are generating the spillovers and more precisely defining

the relevant peer group. To understand the question of what spills over, I show that the

spillovers occur within-subjects and not across-subjects; that is, after showing that a

student’s English test score depends on the quality of his or her peers’ previous English teachers,

I show that it does not also depend on the quality of his or her peers’ previous math

teachers.5 Finally, I show that the estimated spillovers occur within groups of students who

are the same race and gender, as opposed to within the entire school system. This

illustrates the crucial importance of social networks in disseminating the effect, and suggests

that much of the spillovers are due to peer-to-peer interactions, rather than the entire effect

being mediated by changes to the classroom dynamics at the middle school.

5 The opposite is true for math test scores: they depend more on the quality of their peers’ previous math teachers than English teachers, but the difference is less pronounced for math than for English.

This paper ties in to the large literature on teacher value, which has generally

concentrated on whether teacher value added measures are biased,6 stable over time,7 stable

over measures,8 and correlated with longer-term outcomes.9 Nearly every paper on teacher

value, however, has focused only on how teachers affect their own students. The only other

paper I am aware of that quantifies different channels through which teachers can add value

to the school system is Jackson and Bruegmann (2009). While Jackson and Bruegmann

(2009) focuses on the fact that teachers can affect their peers, I focus on the fact that

students can affect their peers.

While my focus is on teacher value, the channel I investigate means that my paper is

also related to the literature on how students are affected by their peers.10 Unlike most peer

effect papers, I do not use exogenous changes in a peer group’s composition to identify peer

effects, instead using an exogenous treatment that only affects a portion of a larger group.

This means that my paper does not speak directly to the question of how regrouping students

affects their test scores, but provides evidence on how an educational treatment targeted at

a population within a school can end up affecting the entire school system. The fact that

these two questions are conceptually different, and therefore can have different answers, was

first discussed by Manski (1993), in the context of endogenous versus exogenous peer effects.

Since most peer effect papers use an exogenous resorting of students as the identification

strategy, very few are able to separately identify endogenous and exogenous peer effects.11

6 The question of whether teacher VA measures are biased has received much attention in both the media and the academic literature. Recent papers include: Kane and Staiger (2008), Rothstein (2010), Paufler and Amrein-Beardsley (2014), Goldhaber and Chaplin (2015), Kinsler (2012), Koedel and Betts (2011), Kane et al. (2013), Chetty et al. (2014a), Bacher-Hicks et al. (2014), Angrist et al. (2015), Deming (2014), Glazerman et al. (2013) and Rothstein (2014).

7 Many studies have explored how stable the VA measures are over time, including McCaffrey et al. (2009), Loeb and Candelaria (2012), Goldhaber and Hansen (2013), Chetty et al. (2014a), and Glazerman et al. (2010).

8 In addition to papers such as Corcoran et al. (2013), Lockwood et al. (2007) and Papay (2011), which explore whether the VA measures depend on the test, a number of papers investigate whether VA measures are correlated with subjective performance reviews. These include: Jacob and Lefgren (2008) and Rockoff and Speroni (2010). More recently, the Measures of Effective Teaching Project (MET) explores in detail how different measures of teacher effectiveness are correlated and how they can best be aggregated. For more details on the MET project see Kane et al., eds (2014).

9 In particular, Chetty et al. (2014b) and Jackson (2014) together show that high VA teachers increase the probability that their students graduate from high school, the likelihood that they attend college, and their later-in-life earnings.

10 Recent overviews of this research include Sacerdote (2011), Epple and Romano (2011), and Sacerdote (2014).

The rest of this paper proceeds as follows. I start in Section II by describing the

data. I then fill in the details of the empirical strategy in Section III. Section IV presents the

main regression results and provides evidence in support of the identification assumption

by running two placebo tests and a specification test. Section V discusses how the spillovers

affect teacher value calculations and describes how I simultaneously estimate teacher value

added measures and the spillovers. Finally, Section VI provides some evidence on what

spills over and more precisely estimates the relevant peer group.

II Data, Context, and Teacher Value Added Estimation

In this section, I will briefly discuss the data I use and give a short description of the

teacher value added measure I use as a measure of teacher quality. This section is meant only

to provide a brief overview, and I leave the details to Appendix A.

II.A Data and Context

The data used for my analysis consist of student-level administrative data from the New

York City Department of Education.12 It includes yearly information on the roughly 1.9

million students who attended grades 3-8 in New York City from the 1990-1991 school year

until the 2010-2011 school year. To simplify notation, in the rest of the paper I’ll denote

each school year by the year it began, e.g. the 2005-2006 school year as the 2005 school

year.

11 Two exceptions are: Bramoulle et al. (2009) and Fruehwirth (2013).

12 This paper would be incomplete without recognizing my enormous debt to Suzanne Elgendy at the NYCDOE. Suzanne shepherded me throughout the entire process, always responding to my questions quickly and without complaint.

For each observation, the data links every student to their school and grade. It also

maps each student to his or her math teacher and English teacher, who are generally the

same person in elementary school. In addition, I observe the year-end math and English

test scores for each student. I follow convention and normalize the test score measures

for each exam by year, grade, and subject, so that the distribution of test scores for each

grade in every year has an average of zero and a standard deviation of one for both the

English test and the math test. This makes teacher VA and the regression coefficients easily

interpretable, and adjusts for changes in the scale of the test scores that occurred during this

time period.13 It also means that the estimates are not affected by aggregate changes to the

education system in New York City, since these changes are absorbed by the renormalization

of the test scores each year. Finally, I observe some demographic information about the

student, most notably his or her gender and his or her race.
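The normalization described above can be illustrated with a minimal sketch. It assumes a hypothetical record layout (dictionaries with `year`, `grade`, `subject`, and `score` keys) and is not the author's actual code; it only shows that standardizing within each year-grade-subject cell removes differences in the test scale.

```python
from collections import defaultdict
from statistics import mean, pstdev

def normalize_scores(records):
    """Add a z-score computed within each (year, grade, subject) cell."""
    cells = defaultdict(list)
    for r in records:
        cells[(r["year"], r["grade"], r["subject"])].append(r["score"])
    # Mean and population standard deviation of the raw scores in each cell.
    stats = {key: (mean(v), pstdev(v)) for key, v in cells.items()}
    out = []
    for r in records:
        m, s = stats[(r["year"], r["grade"], r["subject"])]
        # Guard against a cell with no variation in scores.
        z = (r["score"] - m) / s if s > 0 else 0.0
        out.append({**r, "z": z})
    return out

# Hypothetical raw scores reported on two different scales.
records = [
    {"year": 2005, "grade": 5, "subject": "math", "score": 640.0},
    {"year": 2005, "grade": 5, "subject": "math", "score": 660.0},
    {"year": 2005, "grade": 5, "subject": "math", "score": 700.0},
    {"year": 2006, "grade": 5, "subject": "math", "score": 30.0},
    {"year": 2006, "grade": 5, "subject": "math", "score": 50.0},
]
normalized = normalize_scores(records)
```

After this step, every cell has mean zero and standard deviation one, so a change in the reporting scale between years has no effect on the normalized scores.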

Sample Restrictions. I restrict the sample in a few important ways. First, I drop

students in all non-standard grade codes and those in classes that have a disproportionate

number of the students classified as special education students. These tend to be separate

special education classrooms, which are usually co-taught and in which many students are

exempt from the year-end tests.

Second, I correct information that appears to be a misclassification. In particular, I code

as missing elementary school teachers who are initially assigned to more than 50 students

or fewer than 10 students in a year. For middle school, I assume that any teacher matched

to more than 200 students in one year is a misclassification and code these individuals as

not being matched to any teacher.
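As an illustration of the misclassification rule above, the following sketch applies the stated thresholds (more than 50 or fewer than 10 students for elementary teachers, more than 200 for middle school teachers). The data layout, function name, and identifiers are hypothetical; only the thresholds come from the text.

```python
from collections import Counter

# Thresholds taken from the text; everything else here is illustrative.
ELEMENTARY_MIN, ELEMENTARY_MAX = 10, 50
MIDDLE_MAX = 200

def clean_teacher_links(links, level):
    """links: list of (student_id, teacher_id, year); level: 'elementary' or 'middle'.

    Returns the same links with teacher_id replaced by None wherever the
    teacher-year student count looks like a misclassification.
    """
    counts = Counter((teacher, year) for _, teacher, year in links)
    cleaned = []
    for student, teacher, year in links:
        n = counts[(teacher, year)]
        if level == "elementary":
            implausible = n > ELEMENTARY_MAX or n < ELEMENTARY_MIN
        else:
            implausible = n > MIDDLE_MAX
        cleaned.append((student, None if implausible else teacher, year))
    return cleaned

# Hypothetical links: teacher T1 has 12 students, teacher T2 has only 3.
links = [(f"s{k}", "T1", 2005) for k in range(12)]
links += [(f"s{k}", "T2", 2005) for k in range(12, 15)]
cleaned = clean_teacher_links(links, "elementary")
```

In this example the three students linked to T2 are coded as having no teacher, while the T1 links are kept.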

Finally, as discussed in Section III, my empirical strategy rests on comparing how a

group of students who transitioned from a particular elementary school (e.g. Milton Fien)

to a particular middle school (e.g. Riverdale/Kingsbridge Academy) in a particular year

scores on their tests relative to a group of students who made the same transition in the

previous year. Thus any students who make a non-traditional school transition, potentially

because their parents moved to a different part of New York City, are implicitly dropped

from the regressions since there is no comparison group of students who made the same

transition in the previous year.14 Even with these restrictions, I am still left with around 7

million student-subject-year observations.

13 The scale of the test changed during this time in part because the testing regime varied over the time period. During the early years all tests were district specific, before the state of New York mandated statewide math and English tests in 4th and 8th grade in the late 1990s. Finally, in 2006, all tests became statewide as a result of No Child Left Behind. See Kelleher (2014) for more information about the recent history of testing in New York City, as well as a description of other changes that occurred during the time period in question.

II.B Teacher Value Added

In addition to the above variables, the analysis requires a measure of teacher quality. I

follow convention and measure the quality of teachers by estimating their effect on the

contemporaneous test scores of their students. This measure is commonly referred to as

the teacher’s value added (VA). I estimate VA using the same technique as in Chetty et al.

(2014a), so I will not elaborate on all the technical details in this paper.15,16 Instead, I’ll

quickly sketch the main steps of the technique to provide some intuition, and leave a more

detailed description to Appendix A and Chetty et al. (2014a).

Estimating Teacher Value Added. The Chetty et al. (2014a) method for estimating

VA begins by removing the determinants of students’ test scores that a teacher cannot

affect. This is done by regressing students’ test score on a vector of student observables.17

While it is important to include as controls a flexible function of the students’ previous test

scores, adding additional controls does little to change the VA estimates. Thus, my main

specification includes only cubics of the students’ lagged math and English test scores. I

also conduct a number of robustness checks which use different control vectors to estimate

VA.18

14 I address the potential selection concern that this causes in Appendix B.

15 This means I estimate teacher VA using a technique that assumes there are no spillovers, and then use those estimates to test whether spillovers exist. I develop a VA estimation procedure that accounts for the spillovers later. Since this procedure does not change the results in any meaningful way, I’ve delayed that discussion until later in the paper.

16 For some of the analysis I use the Stata ado file, called vam.ado, which was provided by Chetty et al. (2014a). This is possible in part due to the hard work of Michael Stepner, who ensured that the ado file was flexible enough for me to use and clear enough for me to be confident about the details of the program.

17 In practice, students have two test scores per year: English and math. I estimate teacher VA separately for these two subjects, as well as for elementary and middle school teachers.

The regression results in student-subject-year level residuals, which are then aggregated

to the teacher-subject-year level. To simplify wording, I will generally start referring to a

“student-subject” combination as simply a “student” and a “teacher-subject” combination

as simply a “teacher.” These teacher-year measures combine the impact of the teacher on

his or her students with all the uncontrolled for determinants of the students’ test scores.

To remove the contemporaneous error terms from the eventual teacher VA measures, the

Chetty et al. (2014a) technique predicts the teacher-year residuals with all other teacher-

year residuals from the same teacher. It is this prediction that becomes the teacher VA

measure for that year. I will henceforth denote the estimated VA of teacher $j$ in year $t$ as:

$\hat{\mu}_{j,t}$.
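The last two steps sketched above (aggregating residuals to the teacher-year level and predicting each teacher-year from the same teacher's other years) can be illustrated as follows. This is a deliberately simplified sketch, not the Chetty et al. (2014a) estimator: it assumes the student residuals are already computed, and it uses a single shrinkage weight from a no-intercept regression of own-year means on other-year means rather than separate weights by lag. All names and data are hypothetical.

```python
from collections import defaultdict
from statistics import mean

def teacher_year_means(residuals):
    """residuals: iterable of (teacher, year, student_residual) tuples."""
    cells = defaultdict(list)
    for teacher, year, res in residuals:
        cells[(teacher, year)].append(res)
    return {key: mean(v) for key, v in cells.items()}

def leave_year_out_va(residuals):
    """Predict each teacher-year mean residual from the teacher's other years."""
    by_teacher = defaultdict(dict)
    for (teacher, year), m in teacher_year_means(residuals).items():
        by_teacher[teacher][year] = m
    # Pair each own-year mean with the mean of the same teacher's other years.
    pairs = {}
    for teacher, years in by_teacher.items():
        for year, own in years.items():
            others = [v for y, v in years.items() if y != year]
            if others:  # teachers observed in one year only get no estimate
                pairs[(teacher, year)] = (own, mean(others))
    # Simplifying assumption: one common weight from a no-intercept OLS.
    num = sum(own * oth for own, oth in pairs.values())
    den = sum(oth * oth for _, oth in pairs.values())
    weight = num / den if den > 0 else 0.0
    return {key: weight * oth for key, (_, oth) in pairs.items()}

# Hypothetical student-level residuals.
residuals = [
    ("A", 2005, 0.1), ("A", 2005, 0.3), ("A", 2006, 0.2),
    ("B", 2005, -0.2), ("B", 2006, -0.1), ("B", 2006, -0.3),
    ("C", 2006, 0.5),
]
va = leave_year_out_va(residuals)
```

Because each estimate uses only other years' residuals from the same teacher, a one-off shock to a teacher's class in year $t$ does not enter that teacher's year-$t$ prediction directly.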

Descriptive Statistics. Figure 1 plots the distributions of the estimated teacher VA. The

standard deviation of $\hat{\mu}_{j,t}$ is 0.13 for math and 0.10 for English. In general, elementary

school teachers who are good at increasing their students’ math scores also tend to be good

at increasing their students’ English test scores. This is shown in Figure 2, which plots the

within-teacher-year correlations between estimated English VA and math VA.

Although a teacher’s estimated VA fluctuates over time, the vast majority of variation

in teacher VA is across teachers. This can be seen in the autocorrelation estimates, which

are shown in Figure 3. While autocorrelations of teacher-year residuals are never above 0.5,

autocorrelations of a teacher’s VA measure are much higher, being anywhere from 0.8 to

0.95.

18 These are shown in Appendix B, which demonstrates both the magnitude of the coefficients and their t-statistics are nearly identical to the ones presented in Table 1 when using VA estimates that include all the controls used in Chetty et al. (2014a).

III Empirical Strategy and Identification

III.A Empirical Strategy

Harkening back to the stylized example in the introduction, suppose that two elementary

schools (Milton Fien and Robert J. Christen) feed one middle school (Riverdale/Kingsbridge

Academy). Teacher entry into and exit out of Milton Fien only affects Riverdale/Kingsbridge

students who attended Robert J. Christen through the change in the quality of their peers’

previous teachers. Thus, estimating how the middle school test scores of these former-

Robert J. Christen students depend on teacher transitions at Milton Fien cleanly identifies

the channel in question. In reality, some Riverdale/Kingsbridge students attended neither

Milton Fien nor Robert J. Christen, and some Milton Fien students attended a different

middle school. This subsection discusses how these complications are dealt with in a

regression framework.

The main regression specification is straightforward; I simply regress changes in mean

test scores across cohorts on changes in the mean teacher VA of their peers’ previous teachers

and on changes in a vector of other controls. To formalize this, I need to add some notation;

first, let (i, t) be a cohort of individuals who transferred from the same elementary school to

the same middle school, whose combination is denoted by $i$, in year $t$. Second, let $y_{i,t}$ denote

the average test scores of these students and $\Delta y_{i,t} = y_{i,t} - y_{i,t-1}$. We can likewise define

$\Delta X_{i,t}$ as the change in the average value of various control variables of the two cohorts.

These controls vary across the different specifications, and at various times include the

cohort’s previous teachers’ own teacher VA, their current teachers’ VA, their own previous

test scores, and their new peers’ baseline test scores. Finally, letting $c(i,t)$ denote the set

of students who are cohort $i$’s peers in year $t$, we can define $\Delta\hat{\mu}^{raw}_{c(i,t),t-1}$ as the changes

in the mean teacher VA of their peers’ previous teachers. This gives rise to the following

regression:

$$\Delta y_{i,t} = \alpha + \beta\,\Delta X_{i,t} + \gamma\,\Delta\hat{\mu}^{raw}_{c(i,t),t-1} + \Delta\varepsilon_{i,t} \tag{1}$$
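A stripped-down version of regression (1) can be sketched as below. It assumes no control variables (so the $\beta\,\Delta X_{i,t}$ term is dropped) and uses hypothetical cohort-level data constructed so that the true coefficient is 0.4; the function names are illustrative and this is not the specification actually estimated in the paper.

```python
def first_differences(values):
    """values: dict mapping (i, t) to a number; returns (i, t) -> value_t - value_{t-1}."""
    return {(i, t): v - values[(i, t - 1)]
            for (i, t), v in values.items() if (i, t - 1) in values}

def ols_bivariate(x, y):
    """Return (intercept, slope) from an OLS regression of y on x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((a - x_bar) ** 2 for a in x)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Hypothetical data keyed by (feeder pattern i, year t): the mean VA of the
# cohort's peers' previous teachers.
peer_teacher_va = {
    ("i1", 2007): 0.00, ("i1", 2008): 0.10, ("i1", 2009): 0.05,
    ("i2", 2007): 0.02, ("i2", 2008): -0.03, ("i2", 2009): 0.07,
}
# Mean scores built with a coefficient of 0.4, purely for illustration.
mean_score = {key: 0.1 + 0.4 * v for key, v in peer_teacher_va.items()}

d_mu = first_differences(peer_teacher_va)
d_y = first_differences(mean_score)
keys = sorted(d_mu)
alpha, gamma = ols_bivariate([d_mu[k] for k in keys], [d_y[k] for k in keys])
```

Differencing across adjacent cohorts of the same feeder pattern removes anything fixed about that elementary-to-middle-school pair, so $\gamma$ is identified from year-to-year changes only.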

Using the raw average to construct $\Delta\hat{\mu}^{raw}_{c(i,t),t-1}$ means that the measure varies for three

reasons: first, it varies because of teachers entering or exiting the neighboring schools;

second, it varies because of changes in the sorting patterns of students to teachers at the

neighboring elementary schools; third, it varies because of changes in the way students at the

neighboring elementary school get sorted to middle schools. I want to restrict the identifying

variation to be variation in the teacher quality at i’s neighboring elementary schools and

not variation in sorting patterns.19 I therefore construct a measure that excludes other

variation, which I denote as $\Delta\hat{\mu}_{c(i,t),t-1}$. For notation, let $\mu_{s_e,t-1}$ be the average 5th grade

teacher VA of elementary school $s_e$ in time $t-1$ and $\Delta\mu_{s_e,t-1} = \mu_{s_e,t-1} - \mu_{s_e,t-2}$. Finally,

I will denote the overall fraction of students in middle school $s_m$ in year $t$ who attended

elementary school $s_e$ in year $t-1$ as $\alpha_{s_e,s_m,t}$.

If cohort $i$ attended elementary school $s_e$ and is attending middle school $s_m$ in period

$t$, I construct the measure as:

$$\Delta\hat{\mu}_{c(i,t),t-1} = \sum_{\forall s'_e \neq s_e} \alpha_{s'_e,s_m,t}\,\Delta\mu_{s'_e,t-1} \tag{2}$$
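The measure in equation (2) can be computed with a few lines of code. The sketch below uses the school names from the stylized example, but the shares and school-level VA values are made up, and it omits the leave-out adjustment the paper applies when estimating the school means.

```python
def spillover_measure(own_school, shares, school_va, year):
    """Weighted sum of changes in mean 5th grade teacher VA at the other feeder schools.

    shares: dict mapping each elementary school to its share of the middle
            school's year-`year` students (the alpha terms in equation (2)).
    school_va: dict mapping (school, year) to that school's mean teacher VA.
    """
    total = 0.0
    for school, share in shares.items():
        if school == own_school:
            continue  # the cohort's own elementary school is excluded
        delta = school_va[(school, year - 1)] - school_va[(school, year - 2)]
        total += share * delta
    return total

# Hypothetical inputs for a middle school entered in 2009.
shares = {"Milton Fien": 0.5, "Robert J. Christen": 0.4, "Other": 0.1}
school_va = {
    ("Milton Fien", 2007): 0.00, ("Milton Fien", 2008): 0.10,
    ("Robert J. Christen", 2007): 0.03, ("Robert J. Christen", 2008): 0.03,
    ("Other", 2007): 0.02, ("Other", 2008): 0.02,
}
christen_measure = spillover_measure("Robert J. Christen", shares, school_va, 2009)
fien_measure = spillover_measure("Milton Fien", shares, school_va, 2009)
```

In this example only Milton Fien's teacher quality changes, so the measure is positive for the Robert J. Christen cohort and zero for the Milton Fien cohort, whose own school is excluded from the sum.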

To understand this measure, it is worth discussing the two ways it differs from the raw

average of i’s new peers’ previous teachers’ VA. First, I estimate the change in average

teacher VA at each elementary school and then take a weighted sum of these changes instead

of taking a weighted sum of teacher VA at each elementary school and then estimating how

that changed.20 Doing so ensures that changes in the way students at the neighboring

elementary school get sorted to middle schools has no effect on the measure. Second, I

estimate the change in average teacher VA at each elementary school, instead of the change

in average teacher VA of students who attend middle school $s_m$; this ensures that changes

in the type of students at $s'_e$ who choose $s_m$ will not affect the measure, nor will changes in

how the teachers at elementary school $s'_e$ are assigned to students.21

19 If the changes in sorting patterns are exogenous, using this measure reduces my statistical power. While there is no evidence that the small sorting changes in my data are endogenous, since statistical power is not an issue in my analysis, I err on the side of caution and use this measure.

20 Mathematically, this means that my measure is $\sum_{\forall s'_e \neq s_e} \alpha_{s'_e,s_m,t}\,\Delta\mu_{s'_e,t-1}$ instead of $\sum_{\forall s'_e \neq s_e} \Delta\left(\alpha_{s'_e,s_m,t}\,\mu_{s'_e,t-1}\right)$.
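The first of these two differences can be made explicit with a short decomposition. The following is a sketch under two assumptions that are not stated in the text: the set of feeder schools is the same in both years, and $\Delta\alpha_{s'_e,s_m,t}$ is defined as $\alpha_{s'_e,s_m,t} - \alpha_{s'_e,s_m,t-1}$.

```latex
\begin{align*}
\Delta\Big(\sum_{s'_e \neq s_e} \alpha_{s'_e,s_m,t}\,\mu_{s'_e,t-1}\Big)
  &= \sum_{s'_e \neq s_e} \alpha_{s'_e,s_m,t}\,\mu_{s'_e,t-1}
   - \sum_{s'_e \neq s_e} \alpha_{s'_e,s_m,t-1}\,\mu_{s'_e,t-2} \\
  &= \underbrace{\sum_{s'_e \neq s_e} \alpha_{s'_e,s_m,t}\,\Delta\mu_{s'_e,t-1}}_{\text{changes in teacher quality}}
   + \underbrace{\sum_{s'_e \neq s_e} \Delta\alpha_{s'_e,s_m,t}\,\mu_{s'_e,t-2}}_{\text{changes in sorting to middle schools}}
\end{align*}
```

Under these assumptions, the measure in equation (2) keeps only the first term, which is why changes in how students from neighboring elementary schools sort into middle schools do not move it.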

There are two more details worth mentioning. First, when estimating the components

of $\Delta\mu_{s'_e,t-1}$ (i.e. $\mu_{s'_e,t-1}$ and $\mu_{s'_e,t-2}$) I exclude individuals who attended elementary school

$s'_e$ in either year $t-1$ or $t-2$. This, which is identical to the approach in Chetty et al.

(2014a), removes any mechanical correlation between changes in cohort $(i,t)$’s new peers’

underlying quality and my measure of their previous teachers’ quality. Second, during the

early to mid-1990s, I am unable to match a number of students to their teachers.22 To

account for this, I estimate $\alpha_{s'_e,s_m,t}$ as the overall fraction of students in middle school $s_m$

in year $t$ who attended elementary school $s'_e$ in year $t-1$ and who have non-missing teacher

VA estimates.23 This weighting, however, only matters in the early years of the data when

the missing data is an issue, and I get similar results when I run regressions that only use

the later years.24

III.B Identification

For my approach to correctly identify the indirect effect of a teacher, teacher turnover in a

student’s neighboring elementary school must be uncorrelated with unobservable

determinants of his or her test scores. Note that this identification assumption is quite similar to

that of a number of other papers, including Chetty et al. (2014a) and Jackson and

Bruegmann (2009). There is one difference, however; while I assume that teacher turnover in

a student’s neighboring elementary school is uncorrelated with unobservable determinants

of his or her test scores, they assume that teacher turnover in a student’s own elementary

school is uncorrelated with unobservable determinants of his or her test scores.

A natural identification concern is that there are neighborhood shocks that both draw in

high quality teachers and lead independently to higher test score growth for the

neighborhood students. If this were the case, we would expect to see intra-neighborhood correlations

in teacher entry and exit, with good teachers entering one elementary school being correlated

with good teachers entering the neighboring elementary schools. Figure 4a shows that this

intra-neighborhood correlation is low. In addition, Figures 4b and 4c show that, within a

single elementary school, a good teacher entering the school in one year does not increase

the probability that good teachers enter the school in subsequent years and that a good

teacher entering 5th grade slightly decreases the chance that a good teacher enters the 4th

grade in the same year. Although I test my identification assumption later using a placebo

test and a specification test, these results are suggestive evidence that the identification

assumption holds.25

21 Even with these adjustments, however, the measure I construct is quite correlated (ρ = 0.71) with the raw average and the results are similar when using the raw average as the regressor instead of the measure discussed.

22 For a more detailed discussion of this, see Appendix A.

23 This is akin to assuming that the net effect of teacher transitions among the unmatched teachers is zero.

24 By shrinking the measure toward zero when I am missing VA measures, I am ensuring that the parameter is estimated using variation at the schools and years that are not missing VA measures. It is therefore unsurprising that doing this explicitly, by running the regressions on more recent years, gives the same result as doing this implicitly.

IV Are There Spillovers?

IV.A Regression Results

I now present the results of the baseline regression which illustrates the main result of the

paper: students are affected by their peers’ previous teachers in a statistically significant

and economically meaningful way. This is shown in Table 1. Although the magnitude

and statistical significance differ slightly across the four columns, the results are broadly

consistent: a coefficient around 0.4, with a standard error of around 0.13. In addition, Table

2 shows that the main result holds regardless of whether using English or math test scores

and regardless of whether focusing exclusively on the elementary-to-middle school transition

or exploiting the fact that some students transfer between schools in every grade.26

To understand the magnitude of the estimate, imagine that the family of a young el-

ementary student decides to move to rural Montana, where there is only one teacher per

25 It is also worth noting that I control for these measures in the regressions discussed below, so even if the correlations shown in Figure 4 were strongly positive, it would not necessarily cause a problem. Instead, a strongly positive correlation would be suggestive of a concern, rather than proof of it.

26 I conduct a number of additional robustness checks, which I discuss in Appendix B.


grade. To make the example more concrete, I’ll call the young student Richard and assume

that he is about to enter 2nd grade. My results suggest that the school’s 1st grade teacher had

a large effect on Richard, even though Richard entered in 2nd grade. In fact, the coefficients

reported in Table 1 suggest that, because all of Richard’s new peers had the same 1st grade

teacher, the 1st grade teacher had around 40% as much of an impact on Richard’s 2nd

grade test scores as Richard’s new 2nd grade teacher.27 Note that this estimate only

measures the 1st grade teacher’s effect on Richard that is due to the teacher’s direct effect

on his or her students. It ignores a number of other ways that he or she could have an

impact on Richard, such as affecting the 2nd grade teacher or affecting Richard directly due

to their personal interactions at the school.

This hypothetical example needs to be presented with the caveat that it reflects a

relatively large extrapolation of my results out of sample. New York City is not Montana,

and it is very rarely the case in New York City that a student’s peers all had the same

teacher in the previous year. This means that the standard deviation of a student’s peers’

previous teachers’ average VA is less than the standard deviation of a student’s teacher’s

VA. We can instead ask how a one standard deviation increase in Richard’s peers’ 1st grade

teachers’ average VA affects his test scores, relative to a one standard deviation increase

in Richard’s 2nd grade teacher’s VA. Using the coefficients in Table 1 and the fact that,

in New York City, the standard deviation of a student’s peers’ previous teachers’ average VA

is a little over 60% of the standard deviation of a student’s teacher’s VA, I find that had

Richard grown up in New York City, the draw from the distribution of his peers’ 1st grade

teacher’s VA would have affected his 2nd grade test scores by about 25% as much as the

draw from the distribution of his teacher’s VA.
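
The arithmetic behind the 25% figure can be sketched as follows. This is only an illustration using the rounded values quoted above (a coefficient of roughly 0.4, a standard deviation ratio of roughly 0.6, and the one-for-one own-teacher effect from footnote 27); the variable and function names are hypothetical and are not part of the original analysis.

```python
# Illustrative arithmetic only; all inputs are the rounded figures quoted in the text.
spillover_coefficient = 0.4    # approximate Table 1 coefficient on peers' previous teachers' average VA
own_teacher_coefficient = 1.0  # own teacher's VA moves test scores one-for-one (footnote 27)
sd_ratio = 0.6                 # SD of peers' average lagged VA relative to SD of own teacher's VA


def relative_effect_per_sd(gamma: float, own: float, ratio_of_sds: float) -> float:
    """Effect of a one-SD change in peers' lagged teacher VA, relative to the
    effect of a one-SD change in the student's own teacher's VA."""
    return (gamma * ratio_of_sds) / own


ratio = relative_effect_per_sd(spillover_coefficient, own_teacher_coefficient, sd_ratio)
print(f"{ratio:.2f}")
```

With these rounded inputs the ratio is 0.24, in line with the "about 25%" figure; a ratio of standard deviations slightly above 0.6 moves it closer to 0.25.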

This example hints at a point I will return to later: the importance of these spillovers

depends on the variance of the distribution of a student’s peers’ lagged teacher VA. In

areas where the students in a class had a number of different teachers in the previous year

27 Implicit in this is the fact that an increase in a student’s teacher’s value added increases the student’s test score one-for-one, as shown by Chetty et al. (2014a).


and where teachers do not sort based on quality, the law of large numbers suggests that

the distribution of a student’s peers’ lagged teacher VA will have small variance and thus the

quantitative importance of the spillovers will be minimal. The law of large numbers does

not apply, however, if the average of teachers’ VA is not an average over random draws (i.e.,

if teachers sort into schools, neighborhoods, or districts) or if there are a small number of

draws (i.e., if most of the class had the same teacher in the previous year).
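
A small simulation illustrates the law-of-large-numbers argument. It is a sketch under stated assumptions, not part of the estimation: teacher VA is drawn independently from a normal distribution with standard deviation one, every lagged teacher contributes equally to the peer average, and the function name `sd_of_peer_average` is hypothetical.

```python
import random
import statistics


def sd_of_peer_average(n_teachers: int, n_classes: int = 5000,
                       teacher_sd: float = 1.0, seed: int = 0) -> float:
    """Standard deviation, across simulated classes, of the average lagged
    teacher VA when each class draws `n_teachers` independent lagged teachers."""
    rng = random.Random(seed)
    averages = [
        statistics.fmean([rng.gauss(0.0, teacher_sd) for _ in range(n_teachers)])
        for _ in range(n_classes)
    ]
    return statistics.pstdev(averages)


for n in (1, 4, 16):
    # For independent draws, theory gives teacher_sd / sqrt(n).
    print(n, round(sd_of_peer_average(n), 3))
```

With one lagged teacher per class (the rural Montana case) the peer average is as dispersed as teacher VA itself; with many independent lagged teachers its dispersion shrinks at the rate of one over the square root of the number of draws. Sorting of teachers would break the independence assumed here.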

I’ve focused so far on how 5th grade teachers affect their students’ 6th grade peers, but

it is possible that 4th grade teachers also matter to their students’ 6th grade peers. In fact,

understanding whether or not they do is important when measuring teachers’ indirect

value in Section V. Table 3 presents regressions similar to the ones reported in Table 1;

however, I now include both the teacher VA of a student’s peers’ 5th grade teachers and

the teacher VA of a student’s peers’ 4th grade teachers. Not surprisingly, the effect of a

student’s peers’ 5th grade teachers on his or her 6th grade test scores is larger than the

effect of a student’s peers’ 4th grade teachers, but the student’s peers’ 4th grade teachers

do matter.

IV.B Assessing the Identification Assumption

So far, I have shown that the previous teacher’s VA of a student’s peers is correlated with

the student’s own test scores, even if the student had no personal interaction with his or

her peers’ previous teachers. In addition, I showed evidence at the end of Section III

that the teacher transitions on which I base this finding appear to be exogenous. This

leads me to believe that the correlation I show is indeed the causal impact of a student’s

peers’ teachers on him or her, which acts through the change in his or her peers.28 But can

I be sure? This section presents two placebo tests and a specification test that support the

causal interpretation.

28 The focus here is on demonstrating that the parameter I estimate is “causal,” in the sense that it accurately reflects what would happen if I had the power to randomly swap a low VA teacher with a high VA teacher in New York City. I spend Section VI trying to better understand the underlying causes of the effect.

IV.B.1 Placebo Tests

Is a student affected by past or future VA changes at the neighboring elementary schools?

In the main specification, I regress changes in mean test scores across cohorts on changes

in the mean teacher VA of their peers’ previous teachers. Yet there is no reason why the

changes in the mean teacher VA of their peers’ previous teachers need to be measured in

the same year as the changes in mean test scores across cohorts. In particular, instead of

running the regression specified in Equation 1, I can instead run the following specification:

$$\Delta y_{i,t} = \alpha + \beta \Delta X_{i,t} + \sum_{k=-2}^{2} \gamma_k \Delta\hat{\mu}_{c(i,t),t-1+k} + \Delta\varepsilon_{i,t} \tag{3}$$

If the estimates in Table 1 are causal, γ̂0 will be similar to γ̂ in Table 1, and γ̂−2 = γ̂−1 =

γ̂1 = γ̂2 = 0.

The results are shown in tabular form in Table 4 and as a figure in Figure 5a. In all

specifications, changes in the relevant year affect the individuals more than changes in either

lag years or lead years. In each of the specifications, an F-test cannot reject the null hypothesis

that every γ̂k with k ≠ 0 is equal to zero. In contrast, it is possible to reject the null hypothesis

that all γ̂k are equal to each other. Finally, comparing the results from Table 4 to Table 1,

we see that the spillover estimates are similar. Thus, the results demonstrate that

the contemporaneous change in a student’s peers’ previous teacher VA is correlated with

their own test scores, but past or future changes are not.
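
The logic of this placebo test, Equation (3), can be illustrated on simulated data. The sketch below assumes a hypothetical data-generating process in which only the contemporaneous change (k = 0) matters, with a true coefficient of 0.4, independent standard-normal regressors, and no control vector; `ols` and `simulate` are illustrative helpers, not the code used to produce Table 4.

```python
import random


def ols(y, X):
    """Ordinary least squares via the normal equations (X'X) b = X'y,
    solved by Gauss-Jordan elimination with partial pivoting."""
    n, p = len(X), len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(X[r][i] * X[r][j] for r in range(n)) for j in range(p)]
         + [sum(X[r][i] * y[r] for r in range(n))] for i in range(p)]
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][p] / A[i][i] for i in range(p)]


def simulate(n=5000, gamma0=0.4, noise_sd=0.5, seed=0):
    """Hypothetical data: changes in peers' lagged teacher VA at k = -2..2;
    only the k = 0 change affects the change in test scores."""
    rng = random.Random(seed)
    y, X = [], []
    for _ in range(n):
        d_mu = [rng.gauss(0.0, 1.0) for _ in range(5)]  # k = -2, -1, 0, 1, 2
        y.append(gamma0 * d_mu[2] + rng.gauss(0.0, noise_sd))
        X.append([1.0] + d_mu)  # intercept plus the five lead/lag regressors
    return y, X


y, X = simulate()
coefs = ols(y, X)
gammas = dict(zip(range(-2, 3), coefs[1:]))  # maps k to the estimate of gamma_k
for k, g in gammas.items():
    print(k, round(g, 3))
```

Under this assumed process the estimate for k = 0 should be close to 0.4 and the lead and lag estimates close to zero, which is the pattern the placebo test looks for in the actual data.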

Is a student affected by VA changes at the neighboring elementary schools before he or she

enters middle school?

For the second placebo test, I estimate the effect of 4th grade teacher transitions at a

student’s neighboring elementary schools on his or her 5th grade test scores, instead of

estimating the effect of 5th grade teacher transitions at a student’s neighboring elementary

school on his or her 6th grade test scores. Stated another way, I now measure the correlation

between a student’s test scores and his or her future peers’ previous teacher’s VA, instead

of his or her current peers’ previous teacher’s VA. If the correlation I demonstrate in the

previous section is due to a confounding variable, it will not matter which transition I use.

If instead it is a causal effect, I should find no correlation when using the 5th grade test

scores.29

The results of the placebo test are presented in Table 5. All of the specifications give

coefficients that are statistically indistinguishable from zero. This is not simply due to larger

standard errors, as the estimated coefficients in Table 5 are always much lower than in Table

1. It is also possible to combine the two placebo tests, which is shown in Figure 5a.

Combined with the baseline results, this gives rise to a compelling story: changes to

the quality of a student’s eventual peers’ teachers have no effect on the student before they

become his or her peers; once they become his or her peers, however, the changes have a large and

significant impact.

IV.B.2 Specification Test

Is a student more affected by changes to the VA at the neighboring elementary schools

when more of his or her middle school peers had attended them?

If the correlation I have demonstrated so far is causal, changes in the average teacher VA at

students’ neighboring elementary schools will affect them more when more of their middle

school peers come from the neighboring elementary schools. This section provides more

evidence in support of the identification assumption by testing this proposition directly.

The key to this test is that the regression separately controls for average teacher VA

of student i’s neighboring elementary schools and for the interaction between the average

teacher VA of student i’s neighboring elementary schools and the fraction of students at

middle school s_m who did not attend student i’s neighboring elementary schools. If my

results are indeed due to the changes in a student’s peers’ underlying ability caused by the

quality of his or her teacher, the interaction term will matter and the un-interacted term

will not. If the results are due to spurious correlation, it is likely that this correlation will

29 I leave the details of the regression to Appendix C.


be picked up in the un-interacted term and not in the interacted term. Because the rest of

the regression is nearly identical to the previous ones, I leave the details to Appendix C.

Table 6 demonstrates that as the percentage of a student’s peers who are affected by

changes in teacher VA increases, the effect on the student increases. In all of the speci-

fications, the interaction term is positive, statistically significant, and very similar to the

results in Table 1. The coefficient on the unweighted change, in contrast, is a quite precisely

estimated zero. This provides one more piece of evidence that my results are indeed due to

the proposed causal mechanism.

V How Do Spillovers Affect Teacher Value?

The previous section estimated the spillovers from the perspective of a student. In this

section, I instead view the results from the perspective of a teacher by asking two questions.

First, I ask how much of a teacher’s value is due to the spillover effect I estimated in the last

section. Second, I ask whether accounting for these spillovers changes who is thought

of as a good teacher.

V.A How Much of Teachers’ Value is Due to Their Spillovers?

An important implication of the previous section’s results is that studies which ignore the

spillover effects of a teacher are underestimating the value of an effective teacher. Yet it is

not immediately obvious from Table 1 how large this underestimation is. This subsection

provides the answer by quantifying how large the indirect value of a teacher on his or her

own students’ future peers’ test scores is relative to the teacher’s direct value on his or her

own students’ test scores.

V.A.1 Indirect Value Calculations

The first step is to estimate each teacher’s VA measure, which is described in Section II

and Appendix A. Once I have these estimates, it is easy to calculate the teacher’s direct


value; I simply multiply his or her VA measure by the number of students he or she taught.

To more easily compare the direct value calculations to the indirect value calculations, it is

worth expressing this as a mathematical formula. Letting j(i, t) denote student i’s teacher

in year t, we can write teacher j′’s direct value in year t

as:30

$$\text{DirectValue}_{j',t} = \sum_{\forall i} \frac{dy_{i,t}}{d\mu_{j(i,t),t}} \cdot \frac{d\hat{\mu}_{j(i,t),t}}{d\hat{\mu}_{j',t}} \cdot \hat{\mu}_{j',t} \tag{4}$$

The formula is helpful because a very similar one can be used to calculate a teacher’s

indirect value. As a reminder, I denote student i’s peers’ previous teacher’s VA as µ̂c(i,t+1),t.

Then teacher j′’s indirect value in year t is:

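$$\text{IndirectValue}_{j',t} = \sum_{\forall i} \frac{dy_{i,t+1}}{d\hat{\mu}_{c(i,t+1),t}} \cdot \frac{d\hat{\mu}_{c(i,t+1),t}}{d\hat{\mu}_{j',t}} \cdot \hat{\mu}_{j',t} \tag{5}$$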

Section IV provides estimates of $\frac{dy_{i,t+1}}{d\hat{\mu}_{c(i,t+1),t}}$, which is the γ parameter reported in Table

1, so I need only determine $\frac{d\hat{\mu}_{c(i,t+1),t}}{d\hat{\mu}_{j',t}}$, or how teacher j′ affects student i’s peers’ previous

teacher VA.

To do so, however, I will need to make one additional assumption in order to more

clearly define µ̂c(i,t+1),t. More specifically, I need to determine whether this average should

include all of student i’s year t + 1 peers, or only those of student i’s year t + 1 peers who had

a different teacher than i in year t. Fundamentally, this depends on the mechanisms of the

estimated peer effects. Suppose, for example, that the reason for the indirect VA is that

peers learn from each other. In this case, an effective teacher improves the knowledge of

their students, who then pass that knowledge onto their new peers in the subsequent year.

This suggests that two students who had the same effective teacher do not positively affect

each other, since they have no additional knowledge to “pass on” to each other. I call this

the “Learning from Peers Model.”

In contrast, it is possible that the reason for the indirect VA is that students are

30 This simplifies to the description above because $\frac{d\hat{\mu}_{j(i,t),t}}{d\hat{\mu}_{j',t}} = 1$ if and only if student i had teacher j′ in year t and because Chetty et al. (2014a) demonstrate that $\frac{dy_{i,t}}{d\mu_{j(i,t),t}} = 1$.

easier to teach after they’ve had a good teacher. Maybe, for example, the student is more

motivated and so needs less attention from their next teacher. This frees up time for their

new teacher to focus on the other students in the class, improving their test scores. In this

case, a student is positively affected by their current peers having had an effective teacher,

even if the student also had that teacher. I call this the “Easier To Teach Model.” With

little empirical evidence to determine the correct model, I calculate teachers’ indirect value

under both models.

I also run two specifications which vary in how I account for the dynamic nature of the

estimates. Equations (4) and (5) provide the direct and indirect value calculations, respectively,

for what I call the “Static Calculations.” These include only the first year each student is

affected by the teacher.

Another approach, which I call the “Dynamic Calculations,” calculates the total effect a

teacher has in the three years after he or she taught a group of students.31 These calculations

are done using the following equations:

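$$\text{DirectValue}_{j',t} = \sum_{\forall i} \left[ \left( \frac{dy_{i,t+2}}{d\mu_{j(i,t),t}} + \frac{dy_{i,t+1}}{d\mu_{j(i,t),t}} + \frac{dy_{i,t}}{d\mu_{j(i,t),t}} \right) \cdot \frac{d\hat{\mu}_{j(i,t),t}}{d\hat{\mu}_{j',t}} \right] \cdot \hat{\mu}_{j',t} \tag{6}$$

$$\text{IndirectValue}_{j',t} = \sum_{\forall i} \left[ \left( \frac{dy_{i,t+2}}{d\hat{\mu}_{c(i,t+2),t}} \cdot \frac{d\hat{\mu}_{c(i,t+2),t}}{d\hat{\mu}_{j',t}} \right) + \left( \frac{dy_{i,t+1}}{d\hat{\mu}_{c(i,t+1),t}} \cdot \frac{d\hat{\mu}_{c(i,t+1),t}}{d\hat{\mu}_{j',t}} \right) \right] \cdot \hat{\mu}_{j',t} \tag{7}$$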

The Dynamic Calculations credit the teacher with his or her effect on his or her students’

test scores three times; conceptually, this might be triple-counting direct value.32 Likewise,

the Dynamic Calculations potentially double-count the effect of a teacher on his or her

students’ future peers, but they also account for the fact that, for example, a student’s peers

31 I only focus on the three years after a teacher has taught a group of students due to limitations in estimating spillover dynamics.

32 How to handle dynamics is complicated by the findings of Chetty et al. (2014b), which show that the effect of a teacher on a student’s test scores fades out rapidly, yet the teacher still has long-term effects on the student’s college enrollment decisions and lifetime earnings. Although surprising, the result that a treatment has short-term fade-out and long-term re-emergence has also been shown in the context of Head Start (Deming (2009)), Project STAR (Chetty et al. (2011)), and the Perry Preschool Program (Heckman et al. (2013)).

in 2009 can differ from a student’s peers in 2010.

There are two additional important assumptions that I use in all calculations used to

construct Table 7. First, I assume that a 0.1 standard deviation test score increase for ten

students is valued equivalently to one student having a 1.0 standard deviation test score

increase. Because the per-person indirect effect is much smaller than the direct effect, any

change to this weighting will have enormous impacts on the results in Table 7. If society

believes that ten students increasing their test scores by 0.1 has more “value” than one

student increasing his or her test score by 1, indirect VA becomes larger than direct VA. On

the other hand, if large increases are valued disproportionately more than small increases,

indirect VA ends up being relatively unimportant.

Second, I define peer groups at a school-grade-year level. In practice, the social structure

is more complicated than that. More likely, a peer group should be defined as a classroom or

a social group.33 That said, this assumption does not have much of an effect on the results

in Table 7. A more narrowly defined peer group would increase the per-person indirect

effect, but decrease the number of people affected. Given the linearity assumption, the

total change would be minimal.34

V.A.2 Results

As shown in Table 7, indirect value is around 20 - 30% of the total value, depending on

whether I use Dynamic Calculations or Static Calculations and the Learning From Peers

Model or the Easier to Teach Model. Put another way, the total value of a teacher is

around 25 - 45% higher than what is usually estimated, since the previous estimates do not

include indirect value in their calculations. While the indirect value is a larger percent of

the total value under the Easier to Teach Model than the Learning From Peers Model, the

difference only amounts to about 4 percentage points. The intuition for the small difference

is straightforward. The difference between the models is driven by how I treat students in

33 I attempt to determine the correct peer group definition in Section VI.

34 In fact, if you believe in the Easier to Teach Model and prefer the Static Calculations, the peer group definition has no effect on the results.


the same peer group who previously had the same teacher; in New York City, this is usually

a small fraction of the total students in the peer group.
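
The two ranges reported at the start of this subsection (indirect value of 20 - 30% of total value, and total value 25 - 45% higher than conventional estimates) are linked by simple arithmetic. As a worked illustration, let s denote indirect value as a share of total value; the symbol s is introduced here for convenience and does not appear elsewhere in the paper.

```latex
\[
\frac{\text{Total}}{\text{Direct}} - 1
  = \frac{\text{Indirect}}{\text{Direct}}
  = \frac{s}{1 - s},
\qquad
s = 0.20 \;\Rightarrow\; \frac{0.20}{0.80} = 0.25,
\qquad
s = 0.30 \;\Rightarrow\; \frac{0.30}{0.70} \approx 0.43 .
\]
```

With the rounded endpoints, a share of 20 to 30% therefore corresponds to an understatement of roughly 25 to 43%, consistent with the approximate range quoted above.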

It is important to note that the results do not suggest that teachers affect the individuals

who later share a class with their students by half as much as they affect their own students.

Such a claim would require a comparison of per-person effects, whereas the results in Table

7 report aggregate effects. As an example, suppose that there is a class of 30 students, 10

of whom previously had Ms. Smith as a teacher. Using the Static Calculation, Ms. Smith’s

per-person direct effect is simply her VA. Her per-person indirect effect, however, is a bit

more difficult to calculate. Since she taught one-third of the class, $\frac{d\hat{\mu}_{c(i,t+1),t}}{d\hat{\mu}_{j',t}} = \frac{1}{3}$. As shown

in Table 1, a one-unit increase in µ̂c(i,t+1),t leads to a 0.35 increase in a student’s test score,

so her per-person indirect effect is one-third times her VA times 0.35. Thus, her per-person

indirect effect is about 11 percent of her per-person direct effect.35 Yet, under the Easier

to Teach Model, she affects three times as many students indirectly as she does directly, so

her total indirect effect is 35 percent of her total direct effect.36
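
The Ms. Smith example can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the function name `spillover_ratios` is hypothetical, the inputs are the rounded figures from the example, and it holds the per-person indirect effect fixed at one-third times γ under both models, which is a simplifying assumption (only the count of indirectly affected students changes, following footnote 36).

```python
def spillover_ratios(class_size: int, n_taught: int, gamma: float,
                     include_own_students: bool):
    """Return (per-person ratio, aggregate ratio) of indirect to direct effect
    under the Static Calculation, both expressed relative to the teacher's VA.

    include_own_students=True  -> "Easier to Teach Model"
    include_own_students=False -> "Learning From Peers Model"
    """
    share_taught = n_taught / class_size      # d(mu_c) / d(mu_j')
    per_person_ratio = share_taught * gamma   # indirect effect per affected student
    n_indirect = class_size if include_own_students else class_size - n_taught
    aggregate_ratio = (n_indirect * per_person_ratio) / n_taught
    return per_person_ratio, aggregate_ratio


per_person, total_easier = spillover_ratios(30, 10, 0.35, include_own_students=True)
_, total_learning = spillover_ratios(30, 10, 0.35, include_own_students=False)
print(round(per_person, 3), round(total_easier, 3), round(total_learning, 3))
```

The per-person ratio is one-third of 0.35, and the aggregate ratio under the Easier to Teach Model is 0.35, matching the text. The smaller aggregate ratio under the Learning From Peers Model reflects only the smaller number of indirectly affected students in this simplified version.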

V.B Do Spillovers Affect Direct Teacher Value Added Estimates?

So far, I have taken as given the direct VA value for each teacher, yet the presence of the

spillovers has the potential to affect the direct VA measures. To see why, suppose that half

of teacher j’s students previously had an ineffective teacher. The negative effect that this

teacher had on his or her own students is controlled for when estimating teacher j’s VA. The

negative effect that this teacher had on the other half of the class, however, is not controlled

for; in practice, this negative effect is misattributed to teacher j when estimating his or her

direct VA.

The above example suggests that one should control for the spillovers when estimating

VA, otherwise the direct VA estimates are biased.37 Doing so, however, is more difficult

35 This is calculated as (1/3 · 0.35 · VA)/VA.

36 Under the Easier to Teach Model, she affects the entire class indirectly, and 10 students directly. Under the Learning From Peers Model, she affects 20 students indirectly, and 10 students directly.

37 It is important to note, however, that this bias is not a problem when comparing two different teachers who teach the same grade at the same school, since their students would have been taught by similar quality

than simply including another variable in the control vector used to residualize the student’s

test score. It is impossible to control for the spillovers without having teacher VA measures,

and yet the teacher VA measures are biased unless they are estimated while controlling for

the spillovers. I resolve this issue by simultaneously estimating teacher VA and the spillover

parameter using a method of moments estimator.

V.B.1 Method of Moments Estimator

The method of moments estimator is explicitly designed to mimic as closely as possible

both the Chetty et al. (2014a) technique for estimating teacher VA and the regressions I’ve

already discussed. Thus, the first steps of the method of moments estimator are the same as

Chetty et al. (2014a). First, I regress student i’s year t test score, denoted yi,t, on the same

vector of student i observables used before, denoted as Xi,t. I then use this to construct

student-level residuals, y∗i,t = yi,t − β̂Xi,t.

These student-level residuals are then aggregated to the teacher-year level, which I

denote Aj,t. Thus, denoting c(j, t) as the set of students that teacher j teaches in year t:

$$A_{j,t} \equiv \sum_{\forall i \in c(j,t)} \left( y_{i,t} - \hat{\beta} X_{i,t} \right) \tag{8}$$
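
The residualize-then-aggregate step can be sketched in a few lines. This is a toy illustration under strong assumptions: a single hypothetical covariate (a prior score) stands in for the full control vector Xi,t, the four records are made up, and the helper names `residualize` and `aggregate` are not from the original analysis. It follows Equation (8) as written, summing residuals within each teacher-year cell.

```python
from collections import defaultdict
import statistics


def residualize(y, x):
    """Residuals from a one-covariate OLS regression of y on x with an intercept."""
    x_bar, y_bar = statistics.fmean(x), statistics.fmean(y)
    beta = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
            / sum((xi - x_bar) ** 2 for xi in x))
    alpha = y_bar - beta * x_bar
    return [yi - alpha - beta * xi for xi, yi in zip(x, y)]


def aggregate(residuals, teacher_year):
    """Sum residuals within each (teacher, year) cell, as in Equation (8)."""
    totals = defaultdict(float)
    for r, key in zip(residuals, teacher_year):
        totals[key] += r
    return dict(totals)


# Hypothetical toy records: prior score x, current score y, (teacher, year) cell.
x = [0.0, 1.0, 2.0, 3.0]
y = [0.5, 1.0, 2.5, 3.0]
cells = [("j1", 2009), ("j1", 2009), ("j2", 2009), ("j2", 2009)]

A = aggregate(residualize(y, x), cells)
print({key: round(value, 3) for key, value in A.items()})
```

In this toy data the fitted slope is 0.9 and the intercept 0.4, so the cell totals are -0.2 for teacher j1 and 0.2 for teacher j2.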

The validity of the VA measures relies on the fact that these teacher-year values, Aj,t,

represent the true teacher VA plus a mean-zero error term. But the results from Section

IV suggest that Aj,t also consists of spillovers. Mathematically, this means that:

$$A_{j,t} = \mu_{j,t} + \gamma \mu_{c(j,t),t-1} + \nu_{j,t} \tag{9}$$

where µj,t is teacher j’s true VA in year t, µc(j,t),t−1 is the t − 1 average teacher VA of

the students that teacher j teaches in year t, γ is the spillover parameter, and νj,t is a

teachers. This means the bias might not be a problem, depending on how the VA estimates are used, but it also means that using within-school randomization of teachers to validate VA measures, as in Kane and Staiger (2008) and Kane et al. (2013), will not find evidence of the bias even if it does exist.

mean-zero error term.38 From this, it is clear that using Aj,t − γµc(j,t),t−1 in the place of

Aj,t will correct for the bias in teacher VA measures; the difficulty is that we cannot do so

without already knowing γ and µc(j,t),t−1.

Instead of using the true γ and µc(j,t),t−1, I initially use the estimated γ̂0 in Table 1 and

the teacher VA estimates derived from the traditional approach to estimate:

$$B(\hat{\gamma}_0)_{j,t} \equiv A_{j,t} - \hat{\gamma}_0 \hat{\mu}_{c(j,t),t-1} \tag{10}$$

I then estimate teacher VA using B(γ̂0)j,t in the same way that Chetty et al. (2014a) use

Aj,t, which is described in Appendix A. This gives rise to a new set of value added estimates,

which I will denote µ̂j,t(γ̂0) since they depend on the initial estimate of γ̂0.39 Armed with

these estimates, it is possible to run the same regression I used for the estimates presented

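in Table 1. As a reminder, this is:

$$\Delta y_{i,t} = \alpha + \beta \Delta X_{i,t} + \gamma_1 \Delta\hat{\mu}_{c(i,t),t-1}(\hat{\gamma}_0) + \Delta\varepsilon_{i,t}$$

By definition, this regression provides an estimate of γ̂1 such that: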

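$$\sum \left( \Delta y_{i,t} - \hat{\alpha} - \hat{\beta} \Delta X_{i,t} - \hat{\gamma}_1 \Delta\hat{\mu}_{c(i,t),t-1}(\hat{\gamma}_0) \right) \cdot \left( \Delta\hat{\mu}_{c(i,t),t-1}(\hat{\gamma}_0) \right) = 0 \tag{11}$$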

The problem with this estimate of γ̂1 is that it was generated using teacher VA measures

that were estimated by assuming that γ was γ̂0. To correct for this inconsistency, I iterate

this program K times until γ̂K ≈ γ̂K−1.40 More formally, this process defines a method of

moments estimator of γ, denoted γ̂MM, which solves for the γ such that:

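$$\sum \left( \Delta y_{i,t} - \hat{\alpha}_1 - \hat{\beta}_1 \Delta X_{i,t} - \hat{\gamma}^{MM} \Delta\hat{\mu}_{c(i,t),t-1}(\hat{\gamma}^{MM}) \right) \cdot \left( \Delta\hat{\mu}_{c(i,t),t-1}(\hat{\gamma}^{MM}) \right) = 0 \tag{12}$$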
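
The iteration described above can be sketched on simulated data. The code below is a stylized illustration, not the estimator used in the paper: it assumes a toy data-generating process with time-invariant teacher VA and a hypothetical random set of "feeder" teachers per class, it replaces the Chetty et al. (2014a) shrinkage procedure with a simple teacher-level mean, and it replaces the student-level first-difference regression with a teacher-year regression. All function names and parameter values are hypothetical.

```python
import random
import statistics


def simulate(n_teachers=300, n_years=10, n_feeders=4, gamma=0.4, noise_sd=0.3, seed=0):
    """Toy data: A[j][t] = mu_j + gamma * mean(mu of feeder teachers) + noise."""
    rng = random.Random(seed)
    mu = [rng.gauss(0.0, 1.0) for _ in range(n_teachers)]
    feeders = [[rng.sample(range(n_teachers), n_feeders) for _ in range(n_years)]
               for _ in range(n_teachers)]
    A = [[mu[j] + gamma * statistics.fmean([mu[f] for f in feeders[j][t]])
          + rng.gauss(0.0, noise_sd) for t in range(n_years)]
         for j in range(n_teachers)]
    return A, feeders


def peer_lagged_va(va, feeders):
    """Average estimated VA of each class's feeder teachers (analogue of mu_hat_c)."""
    return [[statistics.fmean([va[f] for f in year]) for year in row] for row in feeders]


def estimate_va(A, m_hat, gamma):
    """VA as the teacher-level mean of B = A - gamma * mu_hat_c (no shrinkage, no drift)."""
    return [statistics.fmean([a - gamma * m for a, m in zip(A_j, m_j)])
            for A_j, m_j in zip(A, m_hat)]


def estimate_gamma(A, va, m_hat):
    """OLS slope of (A - own VA) on mu_hat_c, a stand-in for the spillover regression."""
    x = [m for m_j in m_hat for m in m_j]
    y = [a - v for A_j, v in zip(A, va) for a in A_j]
    x_bar, y_bar = statistics.fmean(x), statistics.fmean(y)
    return (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
            / sum((xi - x_bar) ** 2 for xi in x))


def method_of_moments(A, feeders, gamma0=0.0, tol=0.001, max_iter=100):
    """Alternate between VA given gamma and gamma given VA until gamma settles."""
    va = [statistics.fmean(A_j) for A_j in A]  # conventional VA, i.e. gamma = 0
    gamma = gamma0
    for _ in range(max_iter):
        va = estimate_va(A, peer_lagged_va(va, feeders), gamma)
        new_gamma = estimate_gamma(A, va, peer_lagged_va(va, feeders))
        if abs(new_gamma - gamma) <= tol:
            return new_gamma, va
        gamma = new_gamma
    raise RuntimeError("iteration did not converge")


A, feeders = simulate()
gamma_mm, va_mm = method_of_moments(A, feeders)
print(round(gamma_mm, 3))
```

The stopping rule mirrors footnote 40 (successive estimates of γ within 0.001). In this toy setting the fixed point should lie close to the true value of 0.4 used in the simulation; repeating the run with a different `gamma0` is a simple way to check that the starting value does not matter.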

38 Note that this implicitly assumes the “Easier to Teach” model of the spillovers, discussed in Section V. It also ignores the dynamic results in Table 3, which suggest that I should also control for µc(j,t),t−2 in the specification.

39 To simplify notation, I will leave implicit that these estimates also depend on the initial estimates of teacher VA.

40 I run the program until the two estimates of γ do not differ by more than 0.001. I also repeat this procedure using different starting values of γ̂0.

V.B.2 Results

The method of moments estimation gives estimates of both the spillover parameter (γ̂MM)

and teacher value added (µ̂j,t(γ̂MM)). While the estimate of γ̂MM is slightly larger than the

estimated γ̂ from the reduced form regressions, it is less precisely estimated and the main

conclusions do not change. My focus here will be how these teacher VA estimates differ from

teacher VA estimates that do not correct for the spillovers. These latter estimates implicitly

assume that γ = 0, so I will denote them as µ̂j,t(0) and call them “conventional” teacher

VA estimates.

As illustrated in Figure 6, the estimates of µ̂j,t(γ̂MM ) are quite similar to the estimates

of µ̂j,t(0). In fact, the correlation between the conventional estimates of teacher VA and the

adjusted teacher VA estimates is 0.986. For comparison, this is roughly the same as the correlation

between the estimates of teacher VA that include teacher experience and those that do not,

as shown in Table 6 in Chetty et al. (2014a).

This seems a bit surprising. How can it be true that students are affected by their peers’

lagged teacher’s VA, but controlling for this does not change the VA estimates? While these

two results might seem to be conflicting, the explanation is clear. The bias in teacher VA

due to the spillovers depends not only on how large the spillovers are, but also on how much

variation there is in µc(j,t),t−1. The low variation in µc(j,t),t−1 in my data explains why the

correlation between µ̂j,t(γ̂MM ) and µ̂j,t(0) is so high.

The low variation in µc(j,t),t−1 is, in turn, driven by three factors. The first is that

even the best teachers have a limited impact on their students’ test scores, which means

that the variance of µ̂j,t is small relative to the variance of student test scores. The second

two are features of education in New York City: that teachers do not sort based on VA

across New York City and that students move between many schools in New York City.

The fact that students move between many schools means that µc(j,t),t−1 is an average over

a large number of teacher VA measures. The fact that teachers do not sort on VA means

that these averages consist of nearly independent draws of teacher VA. Together, the law


of large numbers means that the variance of µc(j,t),t−1 is small.41

VI What Spills Over and To Whom?

So far I have tried to determine whether or not, on average, a student is affected by his or

her peers’ previous teachers. Now that I have provided evidence that students are affected, the natural

follow-up question is why. Although I will not provide a definitive answer, I shed

some light on the question in this section.

To do so, I’ll use the fact that each teacher transition can be thought of as a mini-

experiment, affecting slightly different groups of students in slightly different ways. Ex-

ploiting these differences I first explore whether the spillovers occur within-subjects or

across-subjects and then more precisely determine the relevant peer group in which the

spillovers occur.

VI.A What Spills Over?

Although test scores are explicitly designed as a measure of a student’s subject-specific

knowledge, test scores have also been shown to serve as a proxy measure for other non-cognitive

characteristics of a student.42 While this means that they serve as a good measure

of “student achievement,” broadly defined, it also means that the results in Table 1 are

difficult to interpret. Is it actually a student’s peers’ knowledge that affects his or her test

scores, or is it their non-cognitive skills? Given that the only outcomes I see are test scores,

these are impossible to fully separate. But I can provide suggestive evidence on the question

by exploring whether the spillovers are subject specific or not. That is, does having peers

who had a high VA math teacher increase not only my math test score, but also my English

test scores?

41 Missing data on teacher VA artificially lowers the variance of µc(j,t),t−1, and therefore artificially increases the correlation between µ̂j,t(γ̂MM) and µ̂j,t(0). I therefore run the method of moments estimator using only post-1998 data, when the missing data becomes less important.

42 See Borghans et al. (2011). More generally, Almlund et al. (2011) and Heckman and Kautz (2012) are two good overviews of the research on non-cognitive skills.


To answer this, I run the following regression:

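$$\Delta y_{i,s,t} = \alpha + \beta \Delta X_{i,t} + \gamma_s \Delta\hat{\mu}_{c(i,t),s,t-1} + \gamma_{-s} \Delta\hat{\mu}_{c(i,t),-s,t-1} + \Delta\varepsilon_{i,t} \tag{13}$$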

which is identical to the main specification in Equation (1) except that it includes not

only the change in student i’s peers’ previous teacher value added in the same subject as

the test score, which is now denoted as ∆µ̂c(i,t),s,t−1, but it also includes the change in

student i’s peers’ previous teacher value added in the opposite subject as the test score,

denoted as ∆µ̂c(i,t),−s,t−1. Thus, comparing γs to γ−s determines whether the spillovers

occur within-subject or across-subject.

As shown in Table 8, the spillovers mainly occur within-subject.43 Whether this is

because a high VA math teacher motivates his or her students to work harder in math,

for example, or whether it is actually the mathematical knowledge is unclear. But it seems

likely that a student’s non-cognitive skills are less likely to vary across subjects than a

student’s cognitive skills; if so, this result does suggest that spillovers have some cognitive

component to them.

VI.B What is the Relevant Peer Group?

In addition to wondering what spills over, it is natural to wonder what the mechanism

is. In general, there are two plausible explanations. One possibility is that the effect is

due to peer-to-peer interactions. An example of this would be if a student who previously

had an excellent math teacher is motivated to work in math class, which provides a good

example to his or her friends. Another potential explanation for the effect is that it is due

to changes in the middle school classroom dynamics. One example of this is if a student

who previously had a good math teacher is better prepared for math

class, which potentially frees up time for his or her current math teacher to focus on the

43 As shown in Tables 9a and 9b, this result differs depending on the subject. If your peers had a good English teacher, it does increase your math test scores in a similar way as if they had a good math teacher, but not vice versa. This finding echoes the findings of Master et al. (2014), who show that an individual student’s math test scores are increased if he or she previously had a good English teacher, but not vice versa.


other students in the classroom.

One way to shed light on the mechanism is to determine the relevant peer group. If

the spillovers predominantly occur within groups of friends, it is likely that peer-to-peer

interactions are important. Since the administrative data does not have information on

which students are friends, I answer this question by using the fact that students are more

likely to be friends with people like themselves than with other students. I therefore test

whether the teacher quality of, for example, a Hispanic male student’s Hispanic male peers

affects him more than the teacher quality of his non-Hispanic female peers.

I define a group, denoted g, within a school as all the individuals who are the same race

and gender. I can then run the following regression:

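$$\Delta y_{i,g,t} = \alpha + \beta \Delta X_{i,t} + \gamma_g \Delta\hat{\mu}_{c(i,t),g,t-1} + \sum_{g' \neq g} \gamma_{g'} \Delta\hat{\mu}_{c(i,t),g',t-1} + \Delta\varepsilon_{i,t} \tag{14}$$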

which is almost identical to the main specification in Equation (1). It now includes both

the change in the previous teacher value added of students in i’s same school and grade who

are also in the same group as he or she is, which is now denoted as ∆µ̂c(i,t),g,t−1, and the

change in the previous teacher value added of students in i’s same school and grade who

are in different groups than he or she is, denoted as ∆µ̂c(i,t),g′,t−1. The measures ∆µ̂c(i,t),g′,t−1

are themselves constructed using the same formula as in Equation (2), except that the

flow rates from elementary schools to middle schools are allowed to vary by group. Thus,

separately identifying the γg’s in Equation (14) comes from the fact that the Hispanic males

at a particular middle school on average came from different elementary schools than the

Hispanic females or non-Hispanic males.44
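The role of group-specific flow rates can be sketched as follows. Equation (2) is not reproduced in this section, so the weighting below is an assumption made purely for illustration: each feeder school's change in average teacher value added is weighted by that school's share of the group's students at the middle school, and the student's own elementary school is excluded. The function name, school labels, and numbers are all hypothetical.

```python
def peer_va_change(flows, d_va, own_school):
    """Exposure-weighted change in feeder-school teacher VA, excluding own_school.

    flows: number of students in one race-gender group arriving at the middle
           school from each elementary school
    d_va:  year-to-year change in mean teacher VA at each elementary school
    """
    total = sum(flows.values())
    if total == 0:
        return 0.0
    # Students from own_school contribute no "peer" variation by construction.
    return sum(n * d_va[s] for s, n in flows.items() if s != own_school) / total

# Hypothetical changes in mean teacher VA at three feeder elementary schools
d_va = {"ES_A": 0.10, "ES_B": -0.05, "ES_C": 0.20}

# Hypothetical group-specific flows into one middle school
flows_by_group = {
    "hispanic_male": {"ES_A": 30, "ES_B": 10, "ES_C": 0},
    "hispanic_female": {"ES_A": 10, "ES_B": 10, "ES_C": 20},
}

# Measures for a Hispanic male student who attended ES_A
same_group = peer_va_change(flows_by_group["hispanic_male"], d_va, "ES_A")
other_group = peer_va_change(flows_by_group["hispanic_female"], d_va, "ES_A")
print(same_group, other_group)
```

Because the two groups draw on the feeder schools in different proportions, the same school-level changes produce different same-group and other-group measures, which is the variation that separates the γg’s.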

44 One could also explicitly use the fact that Hispanic males, on average, had different teachers within a particular elementary school than non-Hispanic females. This turns out to not add much statistical power, potentially because most principals ensure the classrooms are balanced on their race and gender makeup. Another possibility is to allow for the same teacher to be better at teaching males than females, for example. This creates identifying variation, even if the male and female students had the same teachers. This is possible when defining groups based on gender, but estimating teacher value added separately for each race, and especially for each combination of race and gender, is impossible. I show in Appendix B.C that this additional source of variation does not change the main conclusion that the spillovers occur within genders; there I also show that the result holds when defining groups based on whether or not the students are classified as English Language Learners.


As shown in Table 10, the previous teacher value added of individuals in the same school

and grade as student i does not affect his or her test scores if they are not the same race as

i. If they are the same race as i, in contrast, their previous teachers do affect his or her

test scores. But the effect is small, unless the students are both the same race and the

same gender as student i. Tables 11a and 11b show the result separately for race and

gender. They too demonstrate the main result: students are only affected by the quality

of the teachers who previously taught the other students at the school who are similar to

themselves.

This result illustrates the crucial importance that the social network plays in dissemi-

nating the effect, which has a number of important implications for policies. First, it implies

that there is a limit to how much policy makers can exploit the peer effects. The re-sorting

of students, for example, inherently disrupts the social network, which minimizes the role of

a student’s peers.45 Yet the result is not entirely negative; these estimates suggest that less

intensive treatments that do not disrupt the social network can generate large spillovers.

Second, the importance of social networks suggests that by using elementary-to-middle

school transitions, the estimates in this paper might underestimate the spillovers that a

within-school treatment, such as a targeted mentoring or tutoring program, might generate.

This is because the elementary-to-middle school transition is itself very disruptive to the

social network, and there is likely more peer interaction between students at the middle

school who attended the same elementary school than with students who attended a different

one.

VII Conclusion

Although discussions of teacher value are pervasive, nearly all of the discussion has focused

exclusively on how teachers affect their own students. In this paper, I show that these

45 This process is nicely demonstrated in Carrell et al. (2013) and is one potential way to explain differences between the estimates in this paper and the estimates in Hoxby and Salyer (2006), Imberman et al. (2012), and Angrist and Lang (2005).


discussions miss an important channel through which effective teachers add value to the

school system. More specifically, I show that the positive effect which teachers have on

their own students spills over to affect their students’ future peers.

To quantify the spillovers, I use the fact that multiple elementary schools feed into the

same middle school. In particular, I estimate how the entry of an effective teacher at one

elementary school eventually affects the students at the local middle school who did not

attend the elementary school the teacher entered. I show evidence that teacher transitions

at a student’s neighboring elementary school are uncorrelated with unobserved changes in

the student’s test scores, which suggests that this technique correctly identifies a teacher’s

indirect effect.

These spillovers have a large impact on teacher value estimates. Because teachers affect

many students indirectly, ignoring this effect on their students’ future peers understates a

teacher’s value by around 35%. It also leads to improper estimates of the teacher’s effect on

their own students, since all the spillovers are misattributed to the current teacher. Because

of this, I develop a method of moments estimator to simultaneously estimate each teacher’s

value added and the degree to which these gains spill over. In New York City, accounting

for the spillovers does not lead to a large change in the ranking of teachers.

There is a different reason, however, why the spillovers could affect the ranking of

teachers. In this paper, I have assumed that two teachers who are equally good at increasing

their students’ test scores are also equally good at increasing their students’ future peers’

test scores. However, it is quite possible that teachers differ both in their direct value-added

and in how this value spills over. If teachers do differ in both dimensions, direct value-added

estimates alone do not lead to accurate teacher rankings, even if estimated correctly and

without error.

An interesting follow-up question to this paper is thus: how do teachers differ in the degree

to which their direct value-added spills over? One natural approach to this question is to

separately estimate direct and indirect value added for each teacher.

Doing so, however, will require more structure than the current approach to value-added


estimates and likely require exploiting heterogeneity within a classroom. Another approach

is to determine which of the many components that feed into test scores are the ones that

spill over. If there were a clear answer to this, non-test score measures of teacher quality

would likely help predict the teachers who have disproportionately large indirect effects.

While I provided some evidence to direct this search, by showing that the spillovers occur

within-subject and not across-subject, there is much more work to be done in this respect.

The fact that the spillovers occur within groups of students who are the same race and

gender also highlights how important it is to understand the social network within a school

and, just as importantly, how education interventions affect these relationships. Without a

clear understanding of these issues, it is impossible to predict which policies will generate

the spillovers and who exactly will be affected.

In short, by providing evidence on an additional channel through which teachers affect

the school system, this paper reinforces just how important it is to have effective teachers

in our schools. But it also raises as many questions as it answers.

References

Almlund, Mathilde, Angela Lee Duckworth, James Heckman, and Tim Kautz,

“Personality Psychology and Economics,” Handbook of the Economics of Education, 2011,

4.

Angelucci, Manuela, Giacomo De Giorgi, Marcos A. Rangel, and Imran Rasul,

“Family Networks and School Enrolment: Evidence from a Randomized Social Experi-

ment,” Journal of Public Economics, 2010, 94, 197–221.

Angrist, Joshua D. and Kevin Lang, “Does School Integration Generate Peer Effects?

Evidence from Boston’s Metco Program,” American Economic Review, 2005, 94 (5),

1613–1634.

Angrist, Joshua D., Peter Hull, Parag Pathak, and Christopher Walters, “Leveraging Lotteries for

School Value-Added: Testing and Estimation,” 2015.

Avvisati, Francesco, Marc Gurgand, Nina Guyon, and Eric Maurin, “Getting

Parents Involved: A Field Experiment in Deprived Schools,” Review of Economic Studies,

2014, 81 (1), 57–83.

Bacher-Hicks, Andrew, Thomas J. Kane, and Douglas O. Staiger, “Validating

Teacher Effect Estimates Using Changes in Teacher Assignments in Los Angeles,” October

2014.

Bobonis, Gustavo J. and Frederico Finan, “Neighborhood Peer Effects in Secondary

School Enrollment Decisions,” Review of Economics and Statistics, 2009, 91, 695–716.

Borghans, Lex, Bart H.H. Golsteyn, James J. Heckman, and John Eric

Humphries, “Identification Problems in Personality Psychology,” NBER, 2011.

Bramoulle, Yann, Habiba Djebbari, and Bernard Fortin, “Identification of Peer

Effects Through Social Networks,” Journal of Econometrics, May 2009, 150 (1), 41–55.

Carrell, Scott E., Bruce I. Sacerdote, and James E. West, “From Natural Variation

to Optimal Policy? The Importance of Endogenous Peer Group Formation,” Economet-

rica, 2013, 81 (3), 855–882.

Chetty, Raj, John N. Friedman, and Jonah E. Rockoff, “Measuring the Impacts

of Teachers I: Evaluating Bias in Teacher Value-Added Estimates,” American Economic

Review, 2014, 104 (9), 2593–2632.

Chetty, Raj, John N. Friedman, and Jonah E. Rockoff, “Measuring the Impacts of Teachers II: Teacher Value-Added and Student

Outcomes in Adulthood,” American Economic Review, 2014, 104 (9), 2633–2679.

Chetty, Raj, John N. Friedman, Nathaniel Hilger, Emmanuel Saez, Diane Schanzenbach, and Danny

Yagan, “How Does Your Kindergarten Classroom Affect Your Earnings? Evidence From

Project STAR,” Quarterly Journal of Economics, 2011, 126 (4), 1593–1660.

Corcoran, Sean, Jennifer L. Jennings, and Andrew A. Beveridge, “Teacher Effec-

tiveness on High- and Low-Stakes Tests,” 2013.

Dahl, Gordon B., Katrine V. Loken, and Magne Mogstad, “Peer Effects in Program

Participation,” American Economic Review, 2014, 104 (7), 2049–2074.

Deming, David J., “Early Childhood Intervention and Life-Cycle Development: Evidence

From Head Start,” American Economic Journal: Applied Economics, 2009, 1 (3), 111–

134.

Deming, David J., “Using School Choice Lotteries to Test Measures of School Effectiveness,” American

Economic Review, 2014, 104 (5), 406–411.

Duflo, Esther and Emmanuel Saez, “The Role of Information and Social Interactions

in Retirement Plan Decisions: Evidence from a Randomized Experiment,” Quarterly

Journal of Economics, 2003, 118 (3), 815–842.

Epple, Dennis and Richard E. Romano, “Peer Effects in Education: A Survey of the

Theory and Evidence,” in “Handbook of Social Economics,” Vol. 1B 2011, chapter 20,

pp. 1053–1163.

Fruehwirth, Jane Cooley, “Identifying Peer Achievement Spillovers: Implications for

Desegregation and the Achievement Gap,” Quantitative Economics, 2013, 4, 85–124.

Fruehwirth, Jane Cooley, “Can Achievement Peer Effect Estimates Inform Policy? A View from Inside the Black

Box,” Review of Economics and Statistics, 2014.

Glazerman, Steven, Ali Protik, Bing ru Teh, Julie Brunch, Jeffrey Max, and

Elizabeth Warner, Transfer Incentives for High-Performing Teachers: Final Results

from a Multisite Randomized Experiment, United States Department of Education, 2013.

Glazerman, Steven, Susanna Loeb, Dan Goldhaber, Douglas O. Staiger, Stephen Raudenbush,

and Grover Whitehurst, “Evaluating Teachers: The Important Role of Value-Added,”

Brown Center on Education Policy at Brookings, 2010.

Goldhaber, Dan and Duncan Chaplin, “Assessing the ‘Rothstein Falsification Test’:

Does it Really Show Teacher Value-added Models are Biased?,” Journal of Research on

Educational Effectiveness, 2015, 8 (1), 8–35.

Goldhaber, Dan and Michael Hansen, “Is It Just a Bad Class? Assessing the Stability of Measured

Teacher Performance,” Economica, 2013, 80 (319), 589–612.

Hanushek, Eric A., “Teacher Characteristics and Gains in Student Achievement: Esti-

mation using Micro Data,” American Economic Review, 1971, 61 (2), 280–288.

Heckman, James J. and Tim Kautz, “Hard Evidence on Soft Skills,” Labour Eco-

nomics, 2012, 19 (4), 451–464.

Heckman, James, Rodrigo Pinto, and Peter Savelyev, “Understanding the Mech-

anisms Through Which an Influential Early Childhood Program Boosted Adult Out-

comes,” American Economic Review, 2013, 103 (6), 2052–2086.

Hoxby, Caroline and Gretchen Weingarth Salyer, “Taking Race Out of the Equation:

School Reassignment and the Structure of Peer Effects,” 2006.

Imberman, Scott A., Adriana D. Kugler, and Bruce I. Sacerdote, “Katrina’s

Children: Evidence on the Structure of Peer Effects from Hurricane Evacuees,” American

Economic Review, 2012, 102 (5), 2048–2082.

Jackson, C. Kirabo, “Non-Cognitive Ability, Test Scores, and Teacher Quality: Evidence

From 9th Grade Teachers in North Carolina,” NBER, 2014.

Jackson, C. Kirabo and Elias Bruegmann, “Teaching Students and Teaching Each Other: The Importance

of Peer Learning for Teachers,” American Economic Journal: Applied Economics, 2009.

Jackson, C. Kirabo, Jonah E. Rockoff, and Douglas O. Staiger, “Teacher Effects and Teacher-Related

Policies,” Annual Review of Economics, 2014, 6 (1), 801–825.

Jacob, Brian A. and Lars Lefgren, “Can Principals Identify Effective Teachers? Ev-

idence on Subjective Performance Evaluations in Education.,” Journal of Labor Eco-

nomics, 2008, 26 (1), 101–136.

Kane, Thomas J. and Douglas O. Staiger, “Estimating Teacher Impacts on Student

Achievement: An Experimental Evaluation,” NBER, 2008.

Kane, Thomas J., Daniel F. McCaffrey, Trey Miller, and Douglas O. Staiger, Have We Identified

Effective Teachers? Validating Measures of Effective Teaching Using Random Assign-

ment, Seattle, WA: Bill and Melinda Gates Foundation, 2013.

Kane, Thomas J., Kerri A. Kerr, and Robert C. Pianta, eds, Designing Teacher Evaluation Systems:

New Guidance from the Measures of Effective Teaching Project, Jossey-Bass, 2014.

Kelleher, Maureen, New York City’s Children First, Center for American Progress, January

2014.

Kinsler, Joshua, “Assessing Rothstein’s Critique of Teacher Value-Added Models,” Quan-

titative Economics, 2012, 3 (2), 333–362.

Koedel, Cory and Julian R. Betts, “Does Student Sorting Invalidate Value-Added

Models of Teacher Effectiveness? An Extended Analysis of the Rothstein Critique,”

Education Finance and Policy, 2011, 6 (1), 18–42.

Koedel, Cory, Kata Mihaly, and Jonah E. Rockoff, “Value-Added Modeling: A Review,”

Economics of Education Review, August 2015, 47, 180–195.

Kremer, Michael and Edward Miguel, “The Illusion of Sustainability,” The Quarterly

Journal of Economics, 2007, 122 (3), 1007–1065.

Kuhn, Peter, Peter Kooreman, Adriaan Soetevent, and Arie Kapteyn, “The

Effects of Lottery Prices on Winners and Their Neighbors: Evidence from the Dutch

Postcode Lottery,” American Economic Review, 2011, 101 (5), 2226–2247.

Lalive, Rafael and Alejandra Cattaneo, “Social Interactions and Schooling Decisions,”

Review of Economics and Statistics, 2009, 91, 457–477.

Lavy, Victor and Edith Sand, “On the Origins of Gender Human Capital Gaps: Short

and Long Term Consequences of Teachers’ Stereotypical Biases,” 2015.

Lockwood, J. R., Daniel F. McCaffrey, Laura S. Hamilton, Brian Stecher, Vi-

Nhuan Le, and Jose Felipe Martinez, “The Sensitivity of Value-Added Teacher Ef-

fect Estimates to Different Mathematics Achievement Measures,” Journal of Educational

Measurement, 2007, 44 (1), 47–67.

Loeb, Susanna and Christopher A. Candelaria, “Value-Added Stability Across Years,

Subjects, and Student Groups,” Carnegie Knowledge Brief, 2012.

Loeb, Susanna, James Soland, and Lindsay Fox, “Is a Good Teacher a Good Teacher For All?

Comparing Value-Added of Teachers with Their English Learners and Non-English Learners,”

Educational Evaluation and Policy Analysis, 2014, 36 (4), 457–475.

Manski, Charles F., “Identification of Endogenous Social Effects: The Reflection Prob-

lem,” The Review of Economic Studies, July 1993, 60 (3), 531–542.

Master, Benjamin, Susanna Loeb, and James Wyckoff, “Learning that Lasts: Un-

packing Variation in Teachers’ Effects on Students’ Long-Term Knowledge,” Working

Paper, 2014.

McCaffrey, Daniel F., Tim R. Sass, J. R. Lockwood, and Kata Mihaly, “The

Intertemporal Variability of Teacher Effect Estimates,” Education Finance and Policy,

2009, 4 (4), 572–606.

Moffitt, Robert A., “Policy Interventions, Low-Level Equilibria, and Social Interactions,”

in “Social Dynamics,” Cambridge: MIT Press, 2001, pp. 45–82.

Murnane, Richard, The Impact of School Resources on the Learning of Inner City Chil-

dren, Cambridge, MA: Ballinger, 1975.

Papay, John P., “Different Tests, Different Answers: The Stability of Teacher Value-

Added Estimates Across Outcome Measures,” American Educational Research Journal,

2011, 48 (1), 163–193.

Paufler, Noelle A. and Audrey Amrein-Beardsley, “The Random Assignment of

Students into Elementary Classrooms: Implications for Value-Added Analyses and

Interpretations,” American Educational Research Journal, 2014, 51 (1), 328–362.

Rockoff, Jonah E. and Cecilia Speroni, “Subjective and Objective Evaluations of

Teacher Effectiveness,” American Economic Review: Papers and Proceedings, May 2010,

100, 261–266.

Rothstein, Jesse, “Teacher Quality in Educational Production: Tracking, Decay, and

Student Achievement,” The Quarterly Journal of Economics, 2010, 125 (1), 175–214.

Rothstein, Jesse, “Revisiting the Impacts of Teachers,” 2014.

Sacerdote, Bruce, “Peer Effects in Education: How Might They Work, How Big Are They

and How Much Do We Know Thus Far?,” Handbook of the Economics of Education, 2011,

3.

Sacerdote, Bruce, “Experimental and Quasi-Experimental Analysis of Peer Effects: Two Steps Forward?,”

Annual Review of Economics, 2014, 6, 253–272.

Staiger, Douglas O. and Jonah E. Rockoff, “Searching for Effective Teachers with

Imperfect Information,” Journal of Economic Perspectives, Summer 2010, 24 (3), 97–

118.


VIII Tables and Figures

VIII.A Tables

Table 1: Indirect Effect Estimates

(1) (2) (3) (4)

VARIABLES Test Score Test Score Test Score Test Score

Peers' Previous Teacher VA 0.463*** 0.407*** 0.448*** 0.399***

(0.129) (0.129) (0.144) (0.131)

Own Previous Teacher VA X X

Current Teacher VA X X

Own Baseline Test Score X

Own Previous Test Score X

Peers' Baseline Test Score X

Grades 5-8 5-8 5-8 5-8

Subjects Math and English Math and English Math and English Math and English

Number of Clusters 14357 14357 11616 11836

Number of Cohorts 204097 204097 120718 136455

Number of Students 6654088 6654088 5528598 5586047

*** p<1%, ** p<5%, * p<10%. Each column reports coefficients from a 2SLS regression that uses the measure described in Section III as an instrument for the previous teacher quality of the student's peers who previously attended different schools. The constructed measure captures the teacher quality at the schools that feed the students' current school, but which he or she did not attend. Unlike the raw average, the measure is constructed to exclude variation caused by changes in how students are matched to teachers or changes in the peer composition of the current school. All variables are constructed as the year-to-year change in the school-by-lagged school-grade-subject-year average, for example how the 6th grade students at a particular middle school who went to a particular elementary school did in their math test scores, relative to the group of individuals who went to the same middle school and elementary school a year before. Standard errors, in parentheses, are clustered at the school-year level, and all regressions are weighted by the number of students in the cohort. The baseline test scores correspond to the test scores two years prior to the current test score.
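The variable construction described in the table notes (year-to-year changes in middle school by elementary school by grade by subject averages, with cohort-size weights) can be sketched as below. This is only an illustration under stated assumptions: the record layout, the function name `cohort_changes`, and the choice to weight by the current year's cohort size are hypothetical, and the scores are made up.

```python
from collections import defaultdict

def cohort_changes(records):
    """Return {cell: (change in mean score vs. prior year, cohort size)}.

    records: iterable of (middle_school, elementary_school, grade, subject, year, score)
    """
    sums = defaultdict(lambda: [0.0, 0])
    for middle, elem, grade, subject, year, score in records:
        cell = sums[(middle, elem, grade, subject, year)]
        cell[0] += score
        cell[1] += 1
    means = {key: (total / n, n) for key, (total, n) in sums.items()}

    changes = {}
    for (middle, elem, grade, subject, year), (mean, n) in means.items():
        prev = means.get((middle, elem, grade, subject, year - 1))
        if prev is not None:
            # First difference relative to last year's cohort in the same cell
            changes[(middle, elem, grade, subject, year)] = (mean - prev[0], n)
    return changes

# Made-up standardized math scores for 6th graders at MS1 who attended ES_A
records = [
    ("MS1", "ES_A", 6, "math", 2010, 0.0),
    ("MS1", "ES_A", 6, "math", 2010, 0.2),
    ("MS1", "ES_A", 6, "math", 2011, 0.3),
    ("MS1", "ES_A", 6, "math", 2011, 0.5),
    ("MS1", "ES_A", 6, "math", 2011, 0.4),
]
changes = cohort_changes(records)
print(changes)
```

Each resulting cell would be one observation in the regressions, with the cohort size serving as the regression weight.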


Table 2: Indirect Effect Estimates By Subject

(a) Math Test Scores


Peers' Previous Teacher VA 0.398*** 0.390** 0.488*** 0.477**

(0.147) (0.165) (0.177) (0.191)

Own Previous Teacher VA X X

Current Teacher VA X X

Own Baseline Test Score X X

Grades 5-8 5-8 6 6

Years All All All All

Subjects Math Math Math Math

Number of Clusters 14293 11520 6308 4043

Number of Cohorts 104614 62306 43447 26430

Number of Students 3420388 2859747 735390 538974

*** p<1%, ** p<5%, * p<10%. Each column reports coefficients from a 2SLS regression that uses the measure described in Section III as an instrument for the previous teacher quality of the student's peers who previously attended different schools. The constructed measure captures the teacher quality at the schools that feed the students' current school, but which he or she did not attend. Unlike the raw average, the measure is constructed to exclude variation caused by changes in how students are matched to teachers or changes in the peer composition of the current school. All variables are constructed as the year-to-year change in the school-by-lagged school-grade-subject-year average, for example how the 6th grade students at a particular middle school who went to a particular elementary school did in their math test scores, relative to the group of individuals who went to the same middle school and elementary school a year before. Standard errors, in parentheses, are clustered at the school-year level, and all regressions are weighted by the number of students in the cohort. The baseline test scores correspond to the test scores two years prior to the current test score.

(b) English Test Scores


Peers' Previous Teacher VA 0.562*** 0.536** 0.472** 0.380

(0.186) (0.220) (0.211) (0.240)

Own Previous Teacher VA X X

Current Teacher VA X X

Own Baseline Test Score X X

Grades 5-8 5-8 6 6

Years All All All All

Subjects English English English English

Number of Clusters 14249 11454 6289 3997

Number of Cohorts 99483 58412 42823 25711

Number of Students 3233700 2668851 715370 522416

*** p<1%, ** p<5%, * p<10%. Each column reports coefficients from a 2SLS regression that uses the measure described in Section III as an instrument for the previous teacher quality of the student's peers who previously attended different schools. The constructed measure captures the teacher quality at the schools that feed the students' current school, but which he or she did not attend. Unlike the raw average, the measure is constructed to exclude variation caused by changes in how students are matched to teachers or changes in the peer composition of the current school. All variables are constructed as the year-to-year change in the school-by-lagged school-grade-subject-year average, for example how the 6th grade students at a particular middle school who went to a particular elementary school did in their English test scores, relative to the group of individuals who went to the same middle school and elementary school a year before. Standard errors, in parentheses, are clustered at the school-year level, and all regressions are weighted by the number of students in the cohort. The baseline test scores correspond to the test scores two years prior to the current test score.



Table 3: Dynamic Spillovers



Peers' Twice Previous Teacher VA 0.310** 0.263*

(0.137) (0.136)

Own Previous Teacher VA X X

Own Twice Previous Teacher VA X

Grades 6-8 6-8 6-8 6-8

Subjects Math and English Math and English Math and English Math and English

Number of Clusters 14684 13991 13991 13991

Number of Cohorts 216409 190881 190881 190881

Number of Students 6984703 6253113 6253113 6253113

*** p<1%, ** p<5%, * p<10%. Each column reports coefficients from an OLS regression that uses the measure described in Section III, which captures the teacher quality at the schools that feed the students' current school, but which he or she did not attend. All variables are constructed as the year-to-year change in the school-by-lagged school-grade-subject-year average, for example how the 6th grade students at a particular middle school who went to a particular elementary school did in their math test scores, relative to the group of individuals who went to the same middle school and elementary school a year before. Standard errors, in parentheses, are clustered at the school-year level, and all regressions are weighted by the number of students in the cohort.



Table 4: First Placebo Test

(1) (2) (3) (4)

VARIABLES Test Score Test Score Test Score Test Score

Peers' Previous Teacher VA 0.540*** 0.478*** 0.492*** 0.377**

(0.156) (0.156) (0.170) (0.151)

F-Test: All Lag and Lead Coefficients = 0 0.217 0.264 0.188 0.216

F-Test: All Lag, Current, and Lead Coefficients Are Equal 0.0190 0.0230 0.0280 0.0260

Own Previous Teacher VA X X

Current Teacher VA X X

Own Baseline Test Score X

Own Previous Test Score X

Peers' Baseline Test Score X

Grades 5-8 5-8 5-8 5-8

Subjects Math and English Math and English Math and English Math and English

Number of Clusters 9470 9470 8442 8498

Number of Cohorts 78917 78917 56511 58076

Number of Students 4423939 4423939 3828849 3851672

*** p<1%, ** p<5%, * p<10%. Each column reports coefficients from an OLS regression that uses the measure described in Section III, which

captures the teacher quality at the schools that feed the students' current school, but which he or she did not attend. It also includes two

lags and two leads of this measure. All variables are constructed as the year-to-year change in the school-by-lagged school-grade-subject-

year average, for example how the 6th grade students at a particular middle school who went to a particular elementary school did in their

math test scores, relative to the group of individuals who went to the same middle school and elementary school a year before. Standard

errors, in parentheses, are clustered at the school-year level, and all regressions are weighted by the number of students in the cohort. The

baseline test scores correspond to the test scores two years prior to the current test score.



Table 5: Second Placebo Test


Future Peers' Previous Teacher VA 0.166 0.0931 0.139 -0.0721

(0.134) (0.134) (0.111) (0.101)

Own Previous Teacher VA X X

Current Teacher VA X X

Own Baseline Test Score X

Own Previous Test Score X

Peers' Baseline Test Score X

Grades 5 5 5 5

Subjects Math and English Math and English Math and English Math and English

Number of Clusters 9327 9327 8697 8713

Number of Cohorts 89388 89388 78391 82274

Number of Students 1309825 1309825 1239857 1246452

*** p<1%, ** p<5%, * p<10%. Each column reports coefficients from an OLS regression that uses the measure described in Section IV and Appendix C. The constructed measure captures the teacher quality at the schools that feed the students' future school, but which he or she did not attend. All variables are constructed as the year-to-year change in the school-by-lagged school-grade-subject-year average, for example how the 5th grade students who will go to a particular middle school and who are at a particular elementary school did in their math test scores, relative to the group of individuals who will go to the same middle school and elementary school a year before. Standard errors, in parentheses, are clustered at the school-year level, and all regressions are weighted by the number of students in the cohort. The baseline test scores correspond to the test scores two years prior to the current test score.



Table 6: Specification Test


Peers' Previous Teacher VA (Unweighted) -0.0169 -0.0301 -0.0445 0.00294

(0.0562) (0.0559) (0.0577) (0.0517)

Peers' Lagged Teacher VA (Unweighted) x Fraction of Peers 0.426*** 0.398*** 0.442*** 0.362***

(0.138) (0.138) (0.153) (0.132)

Own Previous Teacher VA X X

Current Teacher VA X X

Own Baseline Test Score X

Own Previous Test Score X

Peers' Baseline Test Score X

Grades 5-8 5-8 5-8 5-8

Subjects Math and English Math and English Math and English Math and English

Number of Clusters 14598 14598 11914 12112

Number of Cohorts 210540 210540 122909 138744

Observations 6856986 6856986 5664107 5723250

*** p<1%, ** p<5%, * p<10%. Each column reports coefficients from an OLS regression. "Peers' Lagged Teacher VA" is the average Teacher VA at the schools that feed a student's current school, but which he or she did not attend. "Fraction of Peers" corresponds to the fraction of students at the individual's current school who previously attended a different school. For more details, see Section IV and Appendix C. All variables are constructed as the year-to-year change in the school-by-lagged school-grade-subject-year average, for example how the 6th grade students at a particular middle school who went to a particular elementary school did in their math test scores, relative to the group of individuals who went to the same middle school and elementary school a year before. Standard errors, in parentheses, are clustered at the school-year level, and all regressions are weighted by the number of students in the cohort. The baseline test scores correspond to the test scores two years prior to the current test score.



Table 7: Total Teacher Value

                        Dynamic Calculations                  Static Calculations
                        Learning From     Easier to           Learning From     Easier to
                        Peers Model       Teach Model         Peers Model       Teach Model

Direct Value Added      52.91             52.91               31.27             31.27

Indirect Value Added    12.88             15.42               13.57             16.35

Total Value Added       65.79             68.33               44.84             47.62

Spillover Parameter
Value Function          Linear            Linear              Linear            Linear

This table demonstrates the average teacher's direct value added (i.e. increases in the test scores of his or her students) and indirect value added (i.e. increases in the test scores of his or her students' future peers). The different columns reflect different assumptions about the mechanisms of peer effects and the way dynamics are handled. For more information on the models, see Section V of the paper.

                        Dynamic Calculations                  Static Calculations
                        Learning From     Easier to           Learning From     Easier to
                        Peers Model       Teach Model         Peers Model       Teach Model

Direct Value            80.4%             77.4%               69.7%             65.7%

Indirect Value          19.6%             22.6%               30.3%             34.3%

This table demonstrates the fraction of the total value added of a teacher that is direct value added (i.e. increases in the test scores of his or her students) versus the indirect value added (i.e. increases in the test scores of his or her students' future peers). The different columns reflect different assumptions about the mechanisms of peer effects and the way dynamics are handled. For more information on the models, see Section V of the paper.
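The percentage shares in Table 7 follow arithmetically from the direct and indirect value added figures. The short sketch below recomputes them from the reported values; treating the first two columns as the dynamic calculations and the last two as the static ones is an assumption about the table layout, and the names `columns` and `decompose` are illustrative only.

```python
# Values transcribed from Table 7 in column order:
# (calculation, model, direct value added, indirect value added).
# The dynamic/static labels are an assumption about the column grouping.
columns = [
    ("dynamic", "learning from peers", 52.91, 12.88),
    ("dynamic", "easier to teach", 52.91, 15.42),
    ("static", "learning from peers", 31.27, 13.57),
    ("static", "easier to teach", 31.27, 16.35),
]

def decompose(direct, indirect):
    """Return total value and the percentage shares of its direct and indirect parts."""
    total = direct + indirect
    return total, 100 * direct / total, 100 * indirect / total

results = [decompose(d, i) for _, _, d, i in columns]
for (calc, model, _, _), (total, direct_pct, indirect_pct) in zip(columns, results):
    print(f"{calc:8s} {model:20s} total={total:.2f} "
          f"direct={direct_pct:.1f}% indirect={indirect_pct:.1f}%")
```

The indirect shares range from about 20% to about 34% across the four columns; the largest of these appears to be the basis for the roughly 35% figure quoted in the text.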



Table 8: Do Spillovers Occur Within Subjects?


Peers' Previous Teacher VA - Same Subject 0.354*** 0.305*** 0.305*** 0.336***

(0.0977) (0.0973) (0.115) (0.104)

Peers' Previous Teacher VA - Other Subject 0.0529 0.0561 0.0894 0.0222

(0.0955) (0.0956) (0.114) (0.105)

Own Previous Teacher VA X X

Current Teacher VA X X

Own Baseline Test Score X

Own Previous Test Score X

Peers' Baseline Test Score X

Grades 5-8 5-8 5-8 5-8

Subjects Math and English Math and English Math and English Math and English

Number of Clusters 14663 14663 12094 12269

Number of Cohorts 213310 213310 122737 138073

Number of Students 6951316 6951316 5730985 5790219

*** p<1%, ** p<5%, * p<10%. Each column reports coefficients from an OLS regression that uses the measure described in Section III, which captures the teacher quality at the schools that feed the students' current school, but which he or she did not attend. All variables are constructed as the year-to-year change in the school-by-lagged school-grade-subject-year average, for example how the 6th grade students at a particular middle school who went to a particular elementary school did in their math test scores, relative to the group of individuals who went to the same middle school and elementary school a year before. Standard errors, in parentheses, are clustered at the school-year level, and all regressions are weighted by the number of students in the cohort. The baseline test scores correspond to the test scores two years prior to the current test score.



Table 9: Do Spillovers Occur Within Subjects?

(a) Math Test Scores


Peers' Previous Teacher VA - Same Subject   0.319**   0.253   0.398*   0.336
    (0.146)   (0.160)   (0.217)   (0.218)
Own Previous Teacher VA   X   X
Current Teacher VA   X   X
Own Baseline Test Score   X   X
Grades   5-8   5-8   6   6
Subjects   Math   Math   Math   Math
Number of Clusters   14637   12054   6764   4370
Number of Cohorts   109161   63314   43608   26545
Number of Students   3568503   2961570   756381   555652

*** p<1%, ** p<5%, * p<10%. Each column reports coefficients from an OLS regression that uses the measure described in Section III, which captures the teacher quality at the schools that feed the student's current school, but which he or she did not attend. All variables are constructed as the year-to-year change in the school-by-lagged school-grade-subject-year average, for example how the 6th grade students at a particular middle school who went to a particular elementary school did in their math test scores, relative to the group of individuals who went to the same middle school and elementary school a year before. Standard errors, in parentheses, are clustered at the school-year level, and all regressions are weighted by the number of students in the cohort. The baseline test scores correspond to the test scores two years prior to the current test score.

(b) English Test Scores


Peers' Previous Teacher VA - Same Subject   0.399**   0.363   0.325   0.287
    (0.198)   (0.235)   (0.264)   (0.304)
Own Previous Teacher VA   X   X
Current Teacher VA   X   X
Own Baseline Test Score   X   X
Grades   5-8   5-8   6   6
Subjects   English   English   English   English
Number of Clusters   14583   11996   6754   4323
Number of Cohorts   104149   59423   43082   25906
Number of Students   3382813   2769416   736599   538668

*** p<1%, ** p<5%, * p<10%. Each column reports coefficients from an OLS regression that uses the measure described in Section III, which captures the teacher quality at the schools that feed the student's current school, but which he or she did not attend. All variables are constructed as the year-to-year change in the school-by-lagged school-grade-subject-year average, for example how the 6th grade students at a particular middle school who went to a particular elementary school did in their math test scores, relative to the group of individuals who went to the same middle school and elementary school a year before. Standard errors, in parentheses, are clustered at the school-year level, and all regressions are weighted by the number of students in the cohort. The baseline test scores correspond to the test scores two years prior to the current test score.



Table 10: What is the Relevant Peer Group?


Peers' Lagged Teacher VA - Same Race and Gender   0.294***   0.274***   0.217**   0.284***
    (0.102)   (0.101)   (0.103)   (0.0858)
Peers' Lagged Teacher VA - Same Race and Different Gender   0.140   0.127   0.214**   0.0970
    (0.0972)   (0.0970)   (0.0980)   (0.0819)
Peers' Lagged Teacher VA - Different Race and Same Gender   0.0491   0.0275   0.0173   0.0568
    (0.0899)   (0.0894)   (0.0890)   (0.0770)
Peers' Lagged Teacher VA - Different Race and Gender   -0.0174   -0.0316   -0.0438   -0.0541
    (0.0890)   (0.0884)   (0.0949)   (0.0773)
Own Previous Teacher VA   X   X
Current Teacher VA   X   X
Own Baseline Test Score   X
Own Previous Test Score   X
Grades   5-8   5-8   5-8   5-8
Subjects   Math and English   Math and English   Math and English   Math and English
Number of Clusters   13660   13660   11614   11929
Number of Cohorts   383604   383604   253216   324402
Number of Students   6068253   6068253   4667529   5540593

*** p<1%, ** p<5%, * p<10%. Each column reports coefficients from an OLS regression that uses the measures described in Section VI as the independent variables. The measure captures teacher quality at the elementary schools that feed the student's middle school, but which he or she did not attend. The fact that different peer types (on average) previously attended different schools generates variation between the four measures of "Peers' Lagged Teacher VA." All variables are constructed as the year-to-year change in the school-by-lagged school-grade-subject-peer group-year average, for example how the 6th grade Hispanic female students at a particular middle school who went to a particular elementary school did in their math test scores, relative to the group of 6th grade Hispanic females who went to the same middle school and elementary school a year before. Standard errors, in parentheses, are clustered at the school-year level, and all regressions are weighted by the number of students in the cohort. The baseline test scores correspond to the test scores two years prior to the current test score.



Table 11: What is the Relevant Peer Group?

(a) Race


Peers' Lagged Teacher VA - Same Peer Group   0.357***   0.369***   0.305**   0.292**
    (0.104)   (0.112)   (0.132)   (0.138)
Peers' Lagged Teacher VA - Other Peer Group   0.0613   -0.0973   0.193   0.129
    (0.104)   (0.115)   (0.136)   (0.140)
Own Previous Teacher VA   X   X   X   X
Current Teacher VA   X   X
Own Baseline Test Score   X   X
Grades   5-8   5-8   6   6
Subjects   Math and English   Math and English   Math and English   Math and English
Peer Group Definition   Race   Race   Race   Race
Number of Clusters   14040   11867   6342   4343
Number of Cohorts   289344   183176   103066   67352
Number of Students   6554689   5197023   1325436   983844

*** p<1%, ** p<5%, * p<10%. Each column reports coefficients from an OLS regression that uses the measures described in Section VI as the independent variables. The measure captures teacher quality at the elementary schools that feed the student's middle school, but which he or she did not attend. The fact that different peer types (on average) previously attended different schools generates variation between "Peers' Lagged Teacher VA - Same Peer Group" and "Peers' Lagged Teacher VA - Other Peer Group." All variables are constructed as the year-to-year change in the school-by-lagged school-grade-subject-peer group-year average, for example how the 6th grade Hispanic students at a particular middle school who went to a particular elementary school did in their math test scores, relative to the group of 6th grade Hispanic students who went to the same middle school and elementary school a year before. Standard errors, in parentheses, are clustered at the school-year level, and all regressions are weighted by the number of students in the cohort. The baseline test scores correspond to the test scores two years prior to the current test score.

(b) Gender


Peers' Lagged Teacher VA - Same Peer Group   0.299***   0.224**   0.348***   0.262**
    (0.106)   (0.113)   (0.131)   (0.130)
Peers' Lagged Teacher VA - Other Peer Group   0.142   0.160   0.159   0.211
    (0.107)   (0.115)   (0.135)   (0.135)
Own Previous Teacher VA   X   X   X   X
Current Teacher VA   X   X
Own Baseline Test Score   X   X
Grades   5-8   5-8   6   6
Subjects   Math and English   Math and English   Math and English   Math and English
Peer Group Definition   Female   Female   Female   Female
Number of Clusters   14135   11936   6495   4428
Number of Cohorts   241149   156920   100312   65059
Number of Students   6715379   5378328   1414867   1048997

*** p<1%, ** p<5%, * p<10%. Each column reports coefficients from an OLS regression that uses the measures described in Section VI as the independent variables. The measure captures teacher quality at the elementary schools that feed the student's middle school, but which he or she did not attend. The fact that different peer types (on average) previously attended different schools generates variation between "Peers' Lagged Teacher VA - Same Peer Group" and "Peers' Lagged Teacher VA - Other Peer Group." All variables are constructed as the year-to-year change in the school-by-lagged school-grade-subject-peer group-year average, for example how the 6th grade female students at a particular middle school who went to a particular elementary school did in their math test scores, relative to the group of 6th grade females who went to the same middle school and elementary school a year before. Standard errors, in parentheses, are clustered at the school-year level, and all regressions are weighted by the number of students in the cohort. The baseline test scores correspond to the test scores two years prior to the current test score.


VIII.B Figures

Figure 1: Teacher Value Added Distributions

[Figure: distributions of estimated teacher value added. Horizontal axis: Value Added (-1 to 1). Series: Math VA, English VA.]

Note: The distributions above show estimated value added distributions of elementary school math and English teachers.



Figure 2: Math Value Added and English Value Added Correlation

Note: This figure shows the correlation between a teacher’s math value added and his or her English value added. Each dot represents a different teacher-year. The red line shows the results of a regression of a teacher’s English value added on his or her math value added. For both the regression that generated the red line and the estimated correlation, I weight each teacher-year using the number of students the teacher taught in that particular year.



Figure 3: Autocorrelation Measures

[Figure: Autocorrelation (vertical axis, 0 to 1) by Lag (horizontal axis, 0 to 8). Series: Math Residuals, English Residuals, Estimated Math VA, Estimated English VA.]

Note: This figure shows the within-teacher autocorrelations of different measures. The solid lines show the correlation between the mean teacher-year residuals across different years the teacher is in the data. The residuals are derived from a regression of a student’s test score on a flexible cubic function of their lagged math and English test scores. The dashed lines instead show the correlation between teacher-year value added measures across different years the teacher is in the data.



Figure 4: Changes in Teacher Value Added

(a) Intra-Neighborhood Correlation

(b) Inter-temporal Correlation

(c) Inter-Grade Correlation

Note: In Figure 4a, each dot represents an elementary-by-middle school-by-year-by-subject observation. The x-axis measures the year-to-year change in the 5th grade teacher VA at their own elementary school. The y-axis measures the same difference, but for the average 5th grade teacher VA at their middle school peers’ elementary schools instead of at their own elementary school. In the remaining panels, each dot represents an elementary school-by-year-by-subject observation. Figure 4b shows how changes in 5th grade teacher VA in one year are correlated with changes in 5th grade teacher VA two years prior. I use the twice-lagged change instead of the lagged change to avoid any mechanical correlation between the two measures. Figure 4c, in contrast, shows how changes in 5th grade teacher VA are correlated with changes in the 4th grade teacher VA at the same elementary school. The graphs above show the results using only math test scores and not English, but the results are similar for English test scores.



Figure 5: Placebo Tests

(a) First Placebo Test

[Figure: Estimated Spillover Coefficient (vertical axis, -1 to 1) by Lead or Lag of Change (horizontal axis, -2 to 2). Series: Current Peers' Previous Teachers, 95% CI.]

(b) Second Placebo Test

[Figure: Estimated Spillover Coefficient (vertical axis, -.2 to .6) by Lead or Lag of Change (horizontal axis, -2 to 2). Series: Current Peers' Previous Teachers, Future Peers' Previous Teachers.]

Note: Figure 5a plots the coefficients from the regression specified in Equation (3) as well as the 95 percent confidence interval. Figure 5b repeats this procedure (without plotting the confidence intervals), and also plots the coefficients from the placebo regression discussed in Section IV.B. All regressions control for the individual’s previous test score, their current teacher value added, and their peers’ baseline test scores, and use math test scores as the outcome. Because the transition from elementary to middle school occurs between 5th and 6th grade, the Current Peers’ Previous Teachers regressions are run using 6th grade test scores and the Future Peers’ Previous Teachers regressions are run using 5th grade test scores.



Figure 6: Conventional and Adjusted Value Added Estimates

Note: This figure shows the correlation between a teacher’s value added measures when estimated conventionally and his or her value added when estimated using the method of moments estimator I discuss in Section V.B. For both the regression and the estimated correlation, I weight each teacher-subject-year by the number of his or her students.


A Data, Context, and Teacher Value Added Estimation

A.A Data and Context

Missing Data As shown in Figure A1, the data is missing teacher VA estimates for a large

fraction of students in the early-to-mid-1990s. This is due mostly to the fact that the data

system used to keep track of student to teacher matches was slowly phased in during this

time. The weighting scheme discussed in Section III implicitly accounts for this, but I

ensure that it does not affect the results by running the same specification using data only

from 1998-2010. These results are shown in Table A1, which closely matches Table 1.

A.B Teacher Value Added

Estimating Teacher Value Added The Chetty et al. (2014a) method for estimating VA proceeds in four main steps. The first is to remove determinants of student $i$’s test score that a teacher cannot affect. This is done by regressing student $i$’s year $t$ test score, denoted as $y_{i,t}$, on a vector of student $i$ observables, denoted as $X_{i,t}$. Importantly, $X_{i,t}$ contains cubics of student $i$’s lagged test scores. In my data, adding additional controls does little to change the VA estimates, a finding that resembles that of Chetty et al. (2014a).46 The regression to estimate the effect of $X_{i,t}$ on $y_{i,t}$ includes teacher fixed effects, which removes the possibility that the estimate is biased by teachers sorting based on the $X$’s. Once $\beta$ is estimated, I construct student-level residuals $y^{*}_{i,t} = y_{i,t} - \hat{\beta} X_{i,t}$.

Once these student-level residuals are constructed, the next step is to aggregate them to the teacher-year level. For teacher $j$, I denote his or her year $t$ measure as $A_{j,t}$. To be clear, $A_{j,t}$ is just the sum of his or her students’ residuals: $A_{j,t} \equiv \sum_{\forall i \in c(j,t)} \left( y_{i,t} - \hat{\beta} X_{i,t} \right)$, where $c(j,t)$ indicates the set of students that teacher $j$ teaches in year $t$.
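The first two steps can be sketched in Python on synthetic inputs. This is a minimal illustration rather than the code used for the paper: the function name `teacher_year_residual_sums`, the array-based inputs, and the use of a single lagged score (instead of both math and English) are assumptions made for the example, and the regression is unweighted.

```python
import numpy as np

def teacher_year_residual_sums(y, lagged, teacher, year):
    """Residualize scores on a cubic in the lagged score, estimating the
    coefficients with teacher fixed effects, then sum the residuals within
    each teacher-year cell to obtain A_{j,t}."""
    y = np.asarray(y, dtype=float)
    lagged = np.asarray(lagged, dtype=float)
    teacher = np.asarray(teacher)
    year = np.asarray(year)

    # Controls X_{i,t}: cubic in one lagged test score (an assumption here).
    X = np.column_stack([lagged, lagged**2, lagged**3])

    # Teacher fixed effects are absorbed by demeaning within teacher.
    def demean(v):
        out = v.copy()
        for j in np.unique(teacher):
            mask = teacher == j
            out[mask] -= v[mask].mean(axis=0)
        return out

    beta_hat, *_ = np.linalg.lstsq(demean(X), demean(y), rcond=None)

    # Student residuals y*_{i,t} = y_{i,t} - beta_hat X_{i,t}; the teacher
    # effect is deliberately left inside the residual.
    resid = y - X @ beta_hat

    # A_{j,t}: sum of residuals over the students of teacher j in year t.
    A = {}
    for j, t, r in zip(teacher, year, resid):
        A[(j, t)] = A.get((j, t), 0.0) + r
    return beta_hat, A
```

Estimating the coefficients within teacher but subtracting only the fitted controls is what keeps the teacher's contribution in the residual, which is the quantity the later steps try to isolate.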

The two steps above provide me with a measure $A_{j,t}$ for every teacher-year. This measure combines teacher $j$’s effect on his or her year $t$ students with all the other uncontrolled-for determinants of his or her students’ test score residuals. To remove the contemporaneous error terms from $A_{j,t}$, the Chetty et al. (2014a) estimation technique uses the inter-temporal correlation between $A_{j,t}$ and $A_{j,-t}$, where $A_{j,-t}$ is a vector of every $A_{j,t'}$ measure such that $t' \neq t$. In particular, it assumes a stationary process for both the true teacher VA and the student-level error terms and estimates $\mathrm{Cov}(A_{j,t}, A_{j,t-s}) \equiv \sigma_{A_s}$ for all $s \in \{1, 2, 3, 4, 5, 6, 7\}$, assuming that the correlations stabilize after seven years.
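A rough sketch of the covariance step follows. It pools all within-teacher pairs of years that are $s$ years apart, which is only justified under the stationarity assumption above. The function name `estimate_autocovariances` and the dictionary input are hypothetical, and the sketch is unweighted, whereas Chetty et al. (2014a) describe how to account for differing precision across teacher-years.

```python
import numpy as np

def estimate_autocovariances(A, max_lag=7):
    """Return {s: estimate of Cov(A_{j,t}, A_{j,t-s})} for s = 1..max_lag.

    A maps (teacher, year) -> teacher-year measure A_{j,t}.
    Lags with fewer than two overlapping teacher-year pairs are skipped.
    """
    sigma = {}
    for s in range(1, max_lag + 1):
        # All (A_{j,t}, A_{j,t-s}) pairs observed for the same teacher.
        pairs = [(a, A[(j, t - s)]) for (j, t), a in A.items() if (j, t - s) in A]
        if len(pairs) < 2:
            continue
        current, lagged = np.array(pairs).T
        sigma[s] = float(np.cov(current, lagged)[0, 1])
    return sigma
```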

46 My main specification includes only cubics of a student’s lagged math and English test scores, but both the magnitude of my coefficients and their t-statistics increase slightly when including all the controls used in Chetty et al. (2014a).

Once these inter-temporal covariances are estimated, the last step is to predict teacher $j$’s value of $A_{j,t}$ using $A_{j,-t}$. This is done using the estimates $\hat{\sigma}_{A_s}$ and the measures in $A_{j,-t}$. These predictions become the estimated VA of teacher $j$ in year $t$, which I will denote as $\hat{\mu}_{j,t}$. As an example, suppose that teacher $j$ was teaching in New York City from 2005 to 2010. Then teacher $j$’s estimated VA in 2007 is:47

$$\hat{\mu}_{j,2007} = \hat{\sigma}_{A_2} A_{j,2005} + \hat{\sigma}_{A_1} A_{j,2006} + \hat{\sigma}_{A_1} A_{j,2008} + \hat{\sigma}_{A_2} A_{j,2009} + \hat{\sigma}_{A_3} A_{j,2010} \tag{15}$$

and teacher $j$’s estimated value added in 2010 is:

$$\hat{\mu}_{j,2010} = \hat{\sigma}_{A_5} A_{j,2005} + \hat{\sigma}_{A_4} A_{j,2006} + \hat{\sigma}_{A_3} A_{j,2007} + \hat{\sigma}_{A_2} A_{j,2008} + \hat{\sigma}_{A_1} A_{j,2009} \tag{16}$$
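The weighting pattern in Equations (15) and (16) can be sketched in Python. This only illustrates the simplified example in the text, under the assumption that the estimated covariances enter directly as weights; the function name `leave_year_out_va` and the numeric inputs are hypothetical, and the full procedure in Appendix A of Chetty et al. (2014a) is more involved.

```python
def leave_year_out_va(A_j, sigma_hat, t):
    """Predict teacher j's year-t VA from every other year's measure.

    A_j:       {year: A_{j,year}} for a single teacher.
    sigma_hat: {lag s: estimated sigma_{A_s}}, used here as the weight on a
               measure that is s years away from t.
    The year-t measure itself is always left out.
    """
    return sum(
        sigma_hat[abs(t - year)] * a
        for year, a in A_j.items()
        if year != t
    )

# Hypothetical inputs for a teacher observed from 2005 to 2010.
A_j = {2005: 1.0, 2006: 2.0, 2007: 100.0, 2008: 3.0, 2009: 4.0, 2010: 5.0}
sigma_hat = {1: 0.5, 2: 0.4, 3: 0.3, 4: 0.2, 5: 0.1}
mu_2007 = leave_year_out_va(A_j, sigma_hat, 2007)
mu_2010 = leave_year_out_va(A_j, sigma_hat, 2010)
```

Because the own-year measure is skipped, the deliberately extreme 2007 value in this example has no influence on `mu_2007`, although it does enter `mu_2010` with the lag-3 weight.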

There are two important characteristics of these VA measures. First, notice that the test scores of teacher $j$’s year $t$ students have absolutely no impact on $\hat{\mu}_{j,t}$. This ensures that the correlation I find is not generated by student $i$ having an unusually good group of peers, but by student $i$ having a group of peers who had previously had unusually good teachers. Second, note that teacher $j$’s VA can change over time for two reasons. First, a different group of students is left out for every teacher VA measure; $\hat{\mu}_{j,2007}$ is estimated by excluding the 2007 group and including the 2008 group, while $\hat{\mu}_{j,2008}$ is estimated by including the 2007 group and excluding the 2008 group. The second reason is that, because $\sigma_{A_s} \geq \sigma_{A_{s'}}$ for every $s < s'$, recent years are weighted more strongly than distant years when estimating $\hat{\mu}_{j,t}$. In practice, however, there is very little within-teacher variation in VA.

B Robustness Checks

B.A Other VA Measures

Given the high correlation between different value added measures, it is unlikely that the

results would be affected by the value added model. This subsection ensures that is the case

by running the same baseline regression, while using the same control vector to estimate

VA measures as used in Chetty et al. (2014a). In addition to a student’s lagged test

scores, this specification includes student-level information on their: gender, lagged days

absent, relative age, race, absences, and discipline incidents. It also includes information

on whether the student has repeated the grade, whether or not he or she is classified as an

English Language Learner, and whether or not he or she is classified as having a learning

disability. This control vector also includes interactions of the cubic function of a student’s test scores with the student’s grade, to allow test score growth to differ depending on the student’s age. It includes classroom-level averages of all the previous controls as well as controls for the number of other students in the class.

The results from this specification are reported in Table A2; as can be seen, they closely match the results in Table 1.

47 For a more detailed discussion of how to account for the fact that teachers teach for a different number of years and for the fact that the variance of $A_{j,t}$ differs for every $(j,t)$ pair, see Appendix A of Chetty et al. (2014a).

B.B Pseudo-Zoned Schools

As discussed in Section III, the measure I use in each regression is not affected by changes in the way students at the neighboring elementary schools get sorted to middle schools. Yet it is affected by where the student attends middle school. Although most students do attend their closest middle school, students do have the flexibility to choose their middle school in the later years of the analysis. Since I always compare how students score relative to those who attended the same elementary and middle school in the previous year, it is unclear how, or if, this choice would bias the results. This section ensures that this choice does not have any effect on the results presented.

In theory, the best way to handle this choice is to construct the measure under the assumption that all students attend their zoned middle school, i.e. the school that they are defaulted into. Since I do not have this information, I instead use a different approach that has a similar flavor to using a student’s zoned school, which involves using what I call a student’s “pseudo-zoned school.” This technique involves constructing the measure by assuming that the student has the same probability of attending each middle school as the average person at his or her elementary school.

More specifically, this means the measure, denoted as $\Delta\hat{\mu}^{\text{pseudo zoned}}_{c(i,t),t-1}$, becomes:

$$\Delta\hat{\mu}^{\text{pseudo zoned}}_{c(i,t),t-1} = \sum_{\forall s'_m} \beta_{s_e,s'_m,t} \sum_{\forall s'_e \neq s_e} \alpha_{s'_e,s'_m,t} \, \Delta\mu_{s'_e,t-1} \tag{17}$$

where $\Delta\mu_{s'_e,t-1}$ is the change in the average 5th grade teacher VA in time $t-1$ at elementary school $s'_e$, $\alpha_{s'_e,s'_m,t}$ is the fraction of students at middle school $s'_m$ that attended elementary school $s'_e$, and $\beta_{s_e,s'_m,t}$ is the fraction of students who attended elementary school $s_e$ that move on to attend middle school $s'_m$. Note that $\Delta\hat{\mu}^{\text{pseudo zoned}}_{c(i,t),t-1}$ is only a function of the elementary school the student went to, and not a function of the middle school the student attended. This ensures that the student’s choice of middle school does not affect the measure. I then run the same specification outlined in Equation (1), but now use $\Delta\hat{\mu}^{\text{pseudo zoned}}_{c(i,t),t-1}$ as an instrument for $\Delta\hat{\mu}^{\text{raw}}_{c(i,t),t-1}$ instead of using $\Delta\hat{\mu}_{c(i,t),t-1}$.
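The double sum in Equation (17) can be sketched with arrays. This is a minimal illustration with hypothetical names: `alpha[e, m]` holds the fraction of middle school `m`'s students who came from elementary school `e`, `beta[e, m]` holds the fraction of elementary school `e`'s students who move on to middle school `m`, and `dmu[e]` holds the change in average 5th grade teacher VA at `e`. The time subscript is dropped because the sketch covers a single year.

```python
import numpy as np

def pseudo_zoned_measure(alpha, beta, dmu):
    """Return the pseudo-zoned measure for every elementary school.

    alpha, beta: arrays of shape (n_elementary, n_middle).
    dmu:         array of shape (n_elementary,).
    """
    alpha = np.asarray(alpha, dtype=float)
    beta = np.asarray(beta, dtype=float)
    dmu = np.asarray(dmu, dtype=float)

    # Inner sum over ALL feeder schools, one value per middle school.
    inner_all = alpha.T @ dmu                     # shape (n_middle,)
    # Subtract the own-school term so the inner sum excludes s_e itself.
    own_term = alpha * dmu[:, None]               # shape (n_elementary, n_middle)
    inner_excl = inner_all[None, :] - own_term    # shape (n_elementary, n_middle)
    # Outer sum: weight each middle school by the elementary school's flow rate.
    return (beta * inner_excl).sum(axis=1)

# Two elementary schools feeding a single middle school.
alpha = np.array([[0.6], [0.4]])
beta = np.array([[1.0], [1.0]])
dmu = np.array([0.1, 0.2])
measure = pseudo_zoned_measure(alpha, beta, dmu)
```

The result depends on a student's elementary school only, which is the property the text relies on when using the measure as an instrument.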

The regression results are reported in Table A3. As is clear, the point estimates are not different from those in Table 1, but the standard errors have increased, which is to be expected.


B.C Within-Group Spillovers

In Section VI, I use the fact that the flow rates from elementary schools to middle schools differ slightly by subgroup to show that the previous teacher value added of individuals in the same school and grade as student $i$ only affects student $i$ if they are of his or her same race and/or gender. In this subsection, I demonstrate the robustness of this result by using a different source of variation: the fact that some teachers might be better at teaching male students than female students, or vice versa. I also use the same approach to show that the spillovers occur within subgroups of students, if those subgroups are defined by whether or not they are classified as being an English Language Learner (ELL).48

Unsurprisingly, teachers who are good at increasing the test scores of one subgroup of students tend to be good at increasing those of others, a finding illustrated for the subgroups of gender and ELL status in Figures A2a and A2b, respectively. Yet there are persistent differences in how effective teachers are for different subgroups. I thus run regressions using the same specification as Equation (14), but allow the value added measures to vary depending on the subgroup in question. These results are shown in Tables A4a and A4b, which are broadly consistent with the results in Tables 11a and 11b. Again, they demonstrate that students are only affected by the quality of the teachers who previously taught the other students at the school who are similar to themselves.

C Placebo and Specification Test Details

C.A Placebo Test Details

As discussed in Section IV.B, the placebo test is designed to estimate the correlation between a student’s test scores and his or her future peers’ teachers’ VA using a similar procedure to my main specification. If no students switched schools between 4th and 5th grade, this specification would be identical to the one in Equation (1), but I would now use changes in the 4th grade teacher quality as the measure of $\Delta\hat{\mu}_{c(i,t),t-1}$ and changes in the 5th grade test scores as the measure of $\Delta y_{i,t}$. In practice, the number of students who change schools between 4th and 5th grade is small, but some do switch. This section discusses how to account for these switchers.

The first step is to estimate a new set of weights, denoted as $\alpha_{s^{4}_{e},s^{6}_{m},t}$, which is the fraction of the students at middle school $s^{6}_{m}$ who attended the elementary school $s^{4}_{e}$. The superscripts make explicit that these weights give the fraction of students attending 6th grade at middle school $s^{6}_{m}$ who attended 4th grade at $s^{4}_{e}$. If no student changed schools between 4th and 5th grade, these $\alpha$’s would be the same as the $\alpha_{s_e,s_m,t}$ that I used to construct the measure which entered the main specification.

48 Most of the evidence on whether a teacher’s effectiveness differs across genders has focused on gender bias. These biases have been shown to have a lasting negative effect in Lavy and Sand (2015). Evidence that some teachers’ effectiveness is different for ELL students than for non-ELL students is shown in Loeb et al. (2014).

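Given these weights, $\alpha_{s^{4}_{e},s^{6}_{m},t}$, I then construct a measure for the lagged teacher VA of student $i$’s future peers, denoted as $\Delta\hat{\mu}^{\text{placebo}}_{c(i,t),t-1}$. This is done similarly to before:

$$\Delta\hat{\mu}^{\text{placebo}}_{c(i,t),t-1} = \sum_{\forall s^{\prime 4}_{e} \neq s^{4}_{e}} \alpha_{s^{\prime 4}_{e},s^{6}_{m},t} \, \Delta\mu_{s^{\prime 4}_{e},t-1} \tag{18}$$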

There are two other changes that I need to make to account for the fact that some students change schools between 4th and 5th grade. First, I now define a cohort as the individuals who attended the same 4th grade, 5th grade, and 6th grade schools. Second, the switching of schools means that some individuals who went to school with a student in 6th grade and did not go to school with him or her in 4th grade did go to school with him or her in 5th grade. These students appear in the placebo measure, causing some positive correlation between it and a student’s 5th grade test scores. To account for this, I include an additional control in the placebo tests: the true lagged teacher VA of a student’s peers. This changes the results very little, but in theory makes the coefficients a more accurate test for the existence of spurious correlation.

Given these changes in the definition of cohort $i$ and the control vector $\Delta X_{i,t}$, the placebo regression stays the same as Equation (1):

$$\Delta y_{i,t} = \alpha + \beta \Delta X_{i,t} + \gamma^{\text{placebo}} \Delta\hat{\mu}^{\text{placebo}}_{c(i,t),t-1} + \Delta\varepsilon_{i,t} \tag{19}$$

While the main results of this specification are shown in Table 5, Tables A5a and A5b show this placebo result separately for math and English test scores and demonstrate that the results do not differ based on the subject.

C.B Specification Test Details

To understand the specification test, remember that the measure I use is defined as:

$$\Delta\hat{\mu}_{c(i,t),t-1} = \sum_{\forall s'_e \neq s_e} \alpha_{s'_e,s_m,t} \, \Delta\mu_{s'_e,t-1} \tag{20}$$

where $\Delta\mu_{s'_e,t-1}$ is the change in the average teacher value added at elementary school $s'_e$ in $t-1$ and $\alpha_{s'_e,s_m,t}$ is the fraction of students at middle school $s_m$ who came from elementary school $s'_e$. Multiplying and dividing the right-hand side by $(1 - \alpha_{s_e,s_m,t})$, which corresponds to the fraction of students at middle school $s_m$ who did not attend elementary school $s_e$, gives an equivalent representation:

$$\Delta\hat{\mu}_{c(i,t),t-1} = (1 - \alpha_{s_e,s_m,t}) \cdot \sum_{\forall s'_e \neq s_e} \frac{\alpha_{s'_e,s_m,t}}{1 - \alpha_{s_e,s_m,t}} \, \Delta\mu_{s'_e,t-1} \tag{21}$$

Note that $\frac{\alpha_{s'_e,s_m,t}}{1 - \alpha_{s_e,s_m,t}}$ is the fraction of students at middle school $s_m$ who attended $s'_e$, conditional on not having attended elementary school $s_e$. We can then write the measure as $(1 - \alpha_{s_e,s_m,t}) \Delta\hat{\mu}'_{c(i,t),t-1}$, if we define

$$\Delta\hat{\mu}'_{c(i,t),t-1} \equiv \sum_{\forall s'_e \neq s_e} \frac{\alpha_{s'_e,s_m,t}}{1 - \alpha_{s_e,s_m,t}} \, \Delta\mu_{s'_e,t-1} \tag{22}$$

In words, the measure is the change in the average teacher VA at student $i$’s neighboring elementary schools in time $t-1$ times the fraction of students at middle school $s_m$ who did not attend student $i$’s own elementary school.
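The equivalence between the raw measure and its rescaled form can be checked numerically. The sketch below uses hypothetical names and made-up shares for one middle school in one year: `alpha_col[e]` is the share of the middle school's students who came from elementary school `e`, and `own` is the index of the student's own elementary school.

```python
import numpy as np

def split_measure(alpha_col, dmu, own):
    """Return (raw measure, share from other schools, conditional average).

    The raw measure is the weighted sum over the other feeder schools. The
    conditional average re-weights the same schools so the weights sum to
    one. The function assumes at least some students come from other
    schools, so that the share from other schools is strictly positive.
    """
    alpha_col = np.asarray(alpha_col, dtype=float)
    dmu = np.asarray(dmu, dtype=float)
    others = np.arange(len(alpha_col)) != own

    raw = float(alpha_col[others] @ dmu[others])
    share_other = 1.0 - float(alpha_col[own])
    cond_avg = float((alpha_col[others] / share_other) @ dmu[others])
    return raw, share_other, cond_avg

# Hypothetical shares and VA changes for three feeder schools.
raw, share_other, cond_avg = split_measure([0.5, 0.3, 0.2], [0.1, 0.2, -0.1], own=0)
```

In this example the raw measure equals `share_other * cond_avg`, which is the factorization that lets the specification test separate the two components.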

Given this representation, the regression I run for the specification test is:

$$\Delta y_{i,t} = \alpha + \beta \Delta X_{i,t} + \gamma_0 \Delta\hat{\mu}'_{c(i,t),t-1} + \gamma_1 (1 - \alpha_{s_e,s_m,t}) \Delta\hat{\mu}'_{c(i,t),t-1} + \Delta\varepsilon_{i,t} \tag{23}$$

Other than separating out the interaction term, the rest of the regression is identical to the main regression discussed in Section III. While the main result is reported in Table 6, Table A6 reports additional specifications as measures of robustness.

The basic specification test assumes linearity in $(1 - \alpha_{s_e,s_m,t})$, but that is not necessary. Figure A3 plots the $\gamma_k$ coefficients from the following specification:

$$\Delta y_{i,t} = \beta \Delta X_{i,t} + \sum_{k=1}^{10} \left( \alpha_k I_k + \gamma_k I_k \Delta\hat{\mu}'_{c(i,t),t-1} \right) + \Delta\varepsilon_{i,t} \tag{24}$$

where $I_k$ is an indicator variable that equals one if and only if $(1 - \alpha_{s_e,s_m,t}) \in \left[ \frac{k-1}{10}, \frac{k}{10} \right]$. The figure also shows, in the background, the distribution of $(1 - \alpha_{s_e,s_m,t})$.
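A binned specification of this kind can be sketched with ordinary least squares. The example below is a simplified illustration and not the paper's estimation code: the function name `binned_spillover_coefficients` is hypothetical, the regression is unweighted with no clustered standard errors, and bins are treated as half-open intervals with a share of exactly one assigned to the last bin.

```python
import numpy as np

def binned_spillover_coefficients(dy, dX, dmu_prime, share_other, n_bins=10):
    """Estimate one spillover coefficient gamma_k per bin of share_other.

    dy:          outcome changes, shape (n,).
    dX:          control changes, shape (n,) or (n, p).
    dmu_prime:   conditional-average measure, shape (n,).
    share_other: fraction of the middle school from other schools, in [0, 1].
    Bins with no observations receive a coefficient of zero, because the
    least-squares routine returns the minimum-norm solution.
    """
    dy, dmu_prime, share_other = (
        np.asarray(v, dtype=float) for v in (dy, dmu_prime, share_other)
    )
    dX = np.asarray(dX, dtype=float).reshape(len(dy), -1)

    # Bin index 0..n_bins-1; a share of exactly 1.0 goes in the last bin.
    k = np.minimum((share_other * n_bins).astype(int), n_bins - 1)
    indicators = np.eye(n_bins)[k]                     # I_k, shape (n, n_bins)

    # Columns: controls, bin intercepts alpha_k, bin-specific slopes gamma_k.
    design = np.column_stack([dX, indicators, indicators * dmu_prime[:, None]])
    coef, *_ = np.linalg.lstsq(design, dy, rcond=None)
    return coef[dX.shape[1] + n_bins:]
```

Plotting the returned coefficients against the bin midpoints would give a rough analogue of the pattern described for Figure A3.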


D Appendix Figures and Tables

D.A Appendix Tables

Table A1: Excluding Early Years


Peers' Previous Teacher VA   0.547***   0.453***   0.528***   0.418**
    (0.144)   (0.148)   (0.167)   (0.163)
Own Previous Teacher VA   X   X
Current Teacher VA   X   X
Own Baseline Test Score   X   X
Grades   5-8   5-8   6   6
Subjects   Math and English   Math and English   Math and English   Math and English
Years   1998-2010   1998-2010   1998-2010   1998-2010
Number of Clusters   11106   9609   4885   3458
Number of Cohorts   169142   109115   74331   48811
Number of Students   5615320   4886302   1181478   934575

*** p<1%, ** p<5%, * p<10%. Each column reports coefficients from a 2SLS regression that uses the measure described in Section III as an instrument for the previous teacher quality of the student's peers who previously attended different schools. The constructed measure captures the teacher quality at the schools that feed the student's current school, but which he or she did not attend. Unlike the raw average, the measure is constructed to exclude variation caused by changes in how students are matched to teachers or changes in the peer composition of the current school. All variables are constructed as the year-to-year change in the school-by-lagged school-grade-subject-year average, for example how the 6th grade students at a particular middle school who went to a particular elementary school did in their math test scores, relative to the group of individuals who went to the same middle school and elementary school a year before. Standard errors, in parentheses, are clustered at the school-year level, and all regressions are weighted by the number of students in the cohort. The baseline test scores correspond to the test scores two years prior to the current test score.


Table A2: Different Value Added Measure


Peers' Previous Teacher VA   0.408***   0.356**   0.439***   0.432**
    (0.147)   (0.157)   (0.169)   (0.183)
Own Previous Teacher VA   X   X
Current Teacher VA   X   X
Own Baseline Test Score   X   X
Grades   5-8   5-8   6   6
Subjects   Math and English   Math and English   Math and English   Math and English
Value Added Control Vector   Chetty, Friedman, and Rockoff 2014a   Chetty, Friedman, and Rockoff 2014a   Chetty, Friedman, and Rockoff 2014a   Chetty, Friedman, and Rockoff 2014a
Number of Clusters   13497   10954   6300   4137
Number of Cohorts   171384   107658   85912   51865
Number of Students   5082303   4389179   1442962   1054543

*** p<1%, ** p<5%, * p<10%. Each column reports coefficients from a 2SLS regression that uses the measure described in Section III as an instrument for the previous teacher quality of the student's peers who previously attended different schools. The constructed measure captures the teacher quality at the schools that feed the student's current school, but which he or she did not attend. Unlike the raw average, the measure is constructed to exclude variation caused by changes in how students are matched to teachers or changes in the peer composition of the current school. All variables are constructed as the year-to-year change in the school-by-lagged school-grade-subject-year average, for example how the 6th grade students at a particular middle school who went to a particular elementary school did in their math test scores, relative to the group of individuals who went to the same middle school and elementary school a year before. Standard errors, in parentheses, are clustered at the school-year level, and all regressions are weighted by the number of students in the cohort. The baseline test scores correspond to the test scores two years prior to the current test score.



Table A3: Pseudo-Zoned Specification



Own Previous Teacher VA   X   X
Current Teacher VA   X   X
Own Baseline Test Score   X
Own Previous Test Score   X
Peer's Baseline Test Score   X
Grades   5-8   5-8   5-8   5-8
Subjects   Math and English   Math and English   Math and English   Math and English
Number of Clusters   15179   14357   11616   12154
Number of Cohorts   243788   204097   120718   145957
Number of Students   8034247   6654088   5528598   5893814

*** p<1%, ** p<5%, * p<10%. Each column reports coefficients from a 2SLS regression that uses the measure described in Appendix B as an instrument for the previous teacher quality of the student's peers who previously attended different schools. The constructed measure captures the teacher quality at the elementary schools that feed the middle schools that are usually attended by people at the student's elementary school, but which he or she did not attend. All variables are constructed as the year-to-year change in the school-by-lagged school-grade-subject-year average, for example how the 6th grade students at a particular middle school who went to a particular elementary school did in their math test scores, relative to the group of individuals who went to the same middle school and elementary school a year before. Standard errors, in parentheses, are clustered at the school-year level, and all regressions are weighted by the number of students in the cohort. The baseline test scores correspond to the test scores two years prior to the current test score.



Table A4: What is the Relevant Peer Group?

(a) Gender Specific VA Estimates


Peers' Lagged Teacher VA - Same Peer Group  0.299***  0.224**  0.348***  0.262**
  (0.106)  (0.113)  (0.131)  (0.130)

Peers' Lagged Teacher VA - Other Peer Group  0.142  0.160  0.159  0.211
  (0.107)  (0.115)  (0.135)  (0.135)

Own Previous Teacher VA  X  X  X  X
Current Teacher VA  X  X
Own Baseline Test Score  X  X
Grades  5-8  5-8  6  6
Subjects  Math and English  Math and English  Math and English  Math and English
Peer Group Definition  Female  Female  Female  Female
Number of Clusters  14135  11936  6495  4428
Number of Cohorts  241149  156920  100312  65059
Number of Students  6715379  5378328  1414867  1048997

*** p<1%, ** p<5%, * p<10%. Each column reports coefficients from an OLS regression that uses the measures described in Section VI as the independent variables. The measure captures teacher quality at the elementary schools that feed the student's middle school, but which he or she did not attend. Value Added is calculated separately for each peer group, which, along with the fact that different peer types (on average) previously attended different schools, generates variation between "Peers' Lagged Teacher VA - Same Peer Group" and "Peers' Lagged Teacher VA - Other Peer Group." All variables are constructed as the year-to-year change in the school-by-lagged school-grade-subject-peer group-year average, for example how the 6th grade female students at a particular middle school who went to a particular elementary school did in their math test scores, relative to the group of 6th grade females who went to the same middle school and elementary school a year before. Standard errors, in parentheses, are clustered at the school-year level, and all regressions are weighted by the number of students in the cohort. The baseline test scores correspond to the test scores two years prior to the current test score.

(b) English Language Learner Specific VA Estimates


Peers' Lagged Teacher VA - Same Peer Group  0.347**  0.424***  0.456**  0.527**
  (0.154)  (0.162)  (0.196)  (0.206)

Peers' Lagged Teacher VA - Other Peer Group  0.188  0.118  0.288*  0.286*
  (0.119)  (0.127)  (0.153)  (0.156)

Own Previous Teacher VA  X  X  X  X
Current Teacher VA  X  X
Own Baseline Test Score  X  X
Grades  5-8  5-8  6  6
Subjects  Math and English  Math and English  Math and English  Math and English
Peer Group Definition  English Language Learner  English Language Learner  English Language Learner  English Language Learner
Number of Clusters  8546  7163  3400  2422
Number of Cohorts  75429  50323  25444  16343
Number of Students  3649431  3186339  641727  495409

*** p<1%, ** p<5%, * p<10%. Each column reports coefficients from an OLS regression that uses the measures described in Section VI as the independent variables. The measure captures teacher quality at the elementary schools that feed the student's middle school, but which he or she did not attend. Value Added is calculated separately for each peer group, which, along with the fact that different peer types (on average) previously attended different schools, generates variation between "Peers' Lagged Teacher VA - Same Peer Group" and "Peers' Lagged Teacher VA - Other Peer Group." All variables are constructed as the year-to-year change in the school-by-lagged school-grade-subject-peer group-year average, for example how the 6th grade female students at a particular middle school who went to a particular elementary school did in their math test scores, relative to the group of 6th grade females who went to the same middle school and elementary school a year before. Standard errors, in parentheses, are clustered at the school-year level, and all regressions are weighted by the number of students in the cohort. The baseline test scores correspond to the test scores two years prior to the current test score.



Table A5: Second Placebo Test

(a) Math



Own Previous Teacher VA  X  X
Current Teacher VA  X  X
Own Baseline Test Score  X
Own Previous Test Score  X
Peer's Baseline Test Score  X
Grades  5  5  5  5
Subjects  Math  Math  Math  Math
Number of Clusters  9313  9313  8673  8686
Number of Cohorts  45025  45025  39770  41648
Number of Students  665389  665389  630612  633644

*** p<1%, ** p<5%, * p<10%. Each column reports coefficients from an OLS regression that uses the measure described in Section IV and Appendix C. The constructed measure captures the teacher quality at the schools that feed the student's future school, but which he or she did not attend. All variables are constructed as the year-to-year change in the school-by-lagged school-grade-subject-year average, for example how the 5th grade students who will go to a particular middle school and who are at a particular elementary school did in their math test scores, relative to the group of individuals who will go to the same middle school and elementary school a year before. Standard errors, in parentheses, are clustered at the school-year level, and all regressions are weighted by the number of students in the cohort. The baseline test scores correspond to the test scores two years prior to the current test score.

(b) English



Own Previous Teacher VA  X  X
Current Teacher VA  X  X
Own Baseline Test Score  X
Own Previous Test Score  X
Peer's Baseline Test Score  X
Grades  5  5  5  5
Subjects  English  English  English  English
Number of Clusters  9302  9302  8663  8682
Number of Cohorts  44363  44363  38621  40626
Number of Students  644436  644436  609245  612808

*** p<1%, ** p<5%, * p<10%. Each column reports coefficients from an OLS regression that uses the measure described in Section IV and Appendix C. The constructed measure captures the teacher quality at the schools that feed the student's future school, but which he or she did not attend. All variables are constructed as the year-to-year change in the school-by-lagged school-grade-subject-year average, for example how the 5th grade students who will go to a particular middle school and who are at a particular elementary school did in their math test scores, relative to the group of individuals who will go to the same middle school and elementary school a year before. Standard errors, in parentheses, are clustered at the school-year level, and all regressions are weighted by the number of students in the cohort. The baseline test scores correspond to the test scores two years prior to the current test score.



Table A6: Specification Test

(a) Math


Peers' Previous Teacher VA (Unweighted)  -0.00953  -0.0434  -0.0718  -0.0700
  (0.0624)  (0.0643)  (0.170)  (0.167)

Peers' Lagged Teacher VA (Unweighted) x Fraction of Peers  0.389**  0.428**  0.618**  0.638**
  (0.154)  (0.171)  (0.279)  (0.280)

Own Previous Teacher VA  X  X
Current Teacher VA  X  X
Own Baseline Test Score  X  X
Grades  5-8  5-8  6  6
Subjects  Math  Math  Math  Math
Number of Clusters  14560  11846  6589  4232
Number of Cohorts  107847  63409  43741  26624
Observations  3521403  2927500  748982  549459

*** p<1%, ** p<5%, * p<10%. Each column reports coefficients from an OLS regression. "Peers' Lagged Teacher VA" is the average Teacher VA at the schools that feed a student's current school, but which he or she did not attend. "Fraction of Peers" corresponds to the fraction of students at the individual's current school who previously attended a different school. For more details, see Section IV and Appendix C. All variables are constructed as the year-to-year change in the school-by-lagged school-grade-subject-year average, for example how the 6th grade students at a particular middle school who went to a particular elementary school did in their math test scores, relative to the group of individuals who went to the same middle school and elementary school a year before. Standard errors, in parentheses, are clustered at the school-year level, and all regressions are weighted by the number of students in the cohort. The baseline test scores correspond to the test scores two years prior to the current test score.

(b) English


Peers' Previous Teacher VA (Unweighted)  -0.0333  -0.0480  -0.189  -0.443**
  (0.0845)  (0.0915)  (0.213)  (0.208)

Peers' Lagged Teacher VA (Unweighted) x Fraction of Peers  0.480**  0.451*  0.666*  0.923**
  (0.207)  (0.242)  (0.344)  (0.360)

Own Previous Teacher VA  X  X
Current Teacher VA  X  X
Own Baseline Test Score  X  X
Grades  5-8  5-8  6  6
Subjects  English  English  English  English
Number of Clusters  14500  11781  6582  4189
Number of Cohorts  102693  59500  43125  25906
Observations  3335584  2736607  728965  532669

*** p<1%, ** p<5%, * p<10%. Each column reports coefficients from an OLS regression. "Peers' Lagged Teacher VA" is the average Teacher VA at the schools that feed a student's current school, but which he or she did not attend. "Fraction of Peers" corresponds to the fraction of students at the individual's current school who previously attended a different school. For more details, see Section IV and Appendix C. All variables are constructed as the year-to-year change in the school-by-lagged school-grade-subject-year average, for example how the 6th grade students at a particular middle school who went to a particular elementary school did in their math test scores, relative to the group of individuals who went to the same middle school and elementary school a year before. Standard errors, in parentheses, are clustered at the school-year level, and all regressions are weighted by the number of students in the cohort. The baseline test scores correspond to the test scores two years prior to the current test score.
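To make the interaction terms in Table A6 easier to read, the short sketch below computes the effect of peers' (unweighted) lagged teacher VA implied by the point estimates at different values of "Fraction of Peers". It assumes the specification is linear in the interaction, so the implied effect is the level coefficient plus the interaction coefficient times the fraction; the function name `implied_effect` is hypothetical, and the two coefficients are the point estimates from panel (a), column 1.

```python
def implied_effect(beta_level, beta_interaction, fraction_of_peers):
    """Implied effect of peers' lagged teacher VA at a given peer fraction.

    Assumes a linear interaction specification:
        effect(f) = beta_level + beta_interaction * f
    """
    return beta_level + beta_interaction * fraction_of_peers

# Point estimates from Table A6, panel (a), column 1.
beta_level = -0.00953       # Peers' Previous Teacher VA (Unweighted)
beta_interaction = 0.389    # ... x Fraction of Peers

effects = {f: implied_effect(beta_level, beta_interaction, f)
           for f in (0.0, 0.25, 0.5, 0.75, 1.0)}
for fraction, effect in effects.items():
    print(f"fraction of peers = {fraction:.2f}: implied effect = {effect:.4f}")
```

Under this reading, the implied effect is close to zero when almost none of a student's peers come from other feeder schools and rises roughly linearly as that fraction grows. The calculation uses point estimates only and ignores the sampling uncertainty reported in the table.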


D.B Appendix Figures

Figure A1: Fraction of 6th Grade Students Missing 5th Grade Teacher Value Added Estimates

[Figure A1 plot. Vertical axis: Percent of Individuals Missing VA Measures (ticks at 0.2, 0.4, 0.6, 0.8, 1). Horizontal axis: Year (1990 to 2010).]

Note: This figure shows the fraction of 6th grade students who are missing a teacher VA estimate for their 5th grade teacher. Very few individuals are matched to teachers in the early years of the data, but the match rate increases and stabilizes at a little over 80%. The match rate never reaches 100% for two reasons. First, any student who is new to the New York City public school system will not be matched to a previous teacher. Second, I cannot estimate VA for teachers who teach for fewer than three years in New York City, because I exclude two years of data from the estimation; see Section III for more information.



Figure A2: Correlations Between Subgroup Specific Value Added

(a) Gender Specific Value Added Estimates

(b) ELL Specific Value Added Estimates

Note: These figures show the within-teacher-year correlation between different teacher value added measures. The top figure shows how a teacher’s value added for male students is correlated with the same teacher’s value added for female students; the bottom figure shows how a teacher’s value added for students who are classified as English Language Learners (ELL) is correlated with the same teacher’s value added for those who are not. In both figures, and for both the regression that generated the red line and the estimated correlation, I weight each teacher using the number of students who attended the elementary school.
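A weighted correlation of the kind described in this note can be sketched as below. This is an illustrative implementation, not the paper's code: the function name `weighted_corr` and the example arrays are assumptions, with the weights standing in for the student counts mentioned in the note.

```python
import numpy as np

def weighted_corr(x, y, w):
    """Weighted Pearson correlation between x and y with weights w."""
    x, y, w = (np.asarray(a, dtype=float) for a in (x, y, w))
    mean_x = np.average(x, weights=w)
    mean_y = np.average(y, weights=w)
    cov_xy = np.average((x - mean_x) * (y - mean_y), weights=w)
    var_x = np.average((x - mean_x) ** 2, weights=w)
    var_y = np.average((y - mean_y) ** 2, weights=w)
    return cov_xy / np.sqrt(var_x * var_y)

# Hypothetical subgroup-specific VA estimates for five teacher-years.
va_male = np.array([0.10, -0.05, 0.20, 0.00, 0.15])
va_female = np.array([0.12, -0.02, 0.15, 0.03, 0.10])
n_students = np.array([120, 80, 200, 60, 150])   # weights

rho = weighted_corr(va_male, va_female, n_students)
print("weighted correlation:", rho)
```

With equal weights the function reduces to the ordinary Pearson correlation, which gives a simple way to check it against `np.corrcoef`.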



Figure A3: Specification Test

[Figure A3 plot. Vertical axis: Estimated Impact of Neighboring Elementary School Teachers' VA (-0.2 to 0.6). Horizontal axis: Fraction of Your Peers From Neighboring Elementary Schools (0 to 1). Legend: Estimated Effect; Linear Estimated Effect; Kernel Density Estimation.]

Note: The above figure shows the estimated coefficients from Equation (24), as well as the line implied by the linear specification. In addition, it plots the distribution of the fraction of current peers who previously attended a different school than the student, as opposed to previously attending the same one.
