Download - Abandonment and Crime - Columbia University

Abandonment and Crime

Matthew Yeaton∗

ECON w4999 - Senior Honors Thesis Workshop

April 18, 2013

Abstract

We construct a novel measure of housing abandonment, and build a proportional hazard model in order to estimate the ratios in probability of various explanatory variables, most notably violent crime. We find that violent crime has a strong causal impact on abandonment, and furthermore that while the existence of a positive causal relationship remains strong across various types of abandonment and various subsets of the data, the magnitude of the impact varies substantially and predictably in accordance with theory along these same measures. We are also able to make similar claims regarding other predictive variables, including rent levels and various large negative idiosyncratic shocks.

∗Matthew Yeaton, [email protected].

I. Introduction

The relationship between abandoned buildings and crime is one that often finds itself at

the forefront in popular culture and political debates about the status of American cities.

Richard M. Daley, the former mayor of Chicago, once famously declared in a press release

that “vacant buildings are ugly. They attract gangs,”1 summarizing public opinion about

the intertwined role of crime and abandonment quite simply and succinctly. Certainly, from

a layman’s perspective, this jump is not hard to fathom. Abandoned buildings can act as

the symbol of a fallen, crime-ridden city. Others may cite the broken windows theory, first

articulated by Wilson and Kelling (1982), such that general property dilapidation acts as a

signal that a neighborhood is ill-cared for and as a signal for general urban disorder, and

thus leads to more serious problems like crime.

Empirically, the correlative relationship between abandonment and crime seems to be

fairly robust. Spelman (1993), for example, found that crime rates on blocks with abandoned

buildings were twice as high as rates on similar blocks without abandoned buildings. The

question then becomes whether or not there is a causal relationship between the two, and if so,

which direction the causality moves. In this vein, the problem of whether or not abandonment

leads to increased crime is fairly well studied, for example by Skogan (1990) and Immergluck and Smith

(2006). However, the causal relationship may work in both directions, rather than simply

abandonment causing increased crime rates. Hence, it may also be true that higher crime

rates increase the rate at which buildings become abandoned in a given area. This problem

is not nearly as well studied, and as such, this paper will attempt to give some insight into

this particular direction of causality.

Simply put, we are interested in whether or not differences in crime rates may impact

the probability of abandonment, i.e. whether or not crime rates are related to the possibly

different rates at which occupied units become abandoned. Theoretically, we may think of

crime impacting abandonment hedonically through a demand channel. In the first place,

1Spielman (2000), p. 24

crime has been demonstrated to negatively impact housing values, as shown in Hellman

and Naroff (1979) or Lynch and Rasmussen (2001). Hence, crime in the area surrounding

a building may decrease the demand for housing within the building and therefore decrease

the profitability of the building, and through this channel increase the likelihood that the

owner abandons the building.

Furthermore, crime may impact abandonment beyond this direct demand channel. Beyond

profitability, an owner may find maintaining and even visiting a building in a high-crime

area to be particularly undesirable. Specifically, the owner’s fear of violent crime may influence

an owner to abandon a unit at a higher profitability threshold than he or she otherwise

might in the absence of violent crime, even if we hold demand for the housing units within the

building constant (and consequently hold profitability constant). Hence, under this scenario,

violent crime would have an additional positive impact on the probability of abandonment

even if we hold demand constant.

Before we are able to answer any of these questions, though, we must first address what

we mean by abandonment. Hence, there are two primary questions we are interested in

answering: first, how should we define and measure abandonment; and second, in what

scenarios, if any, do high crime rates differentially impact the probability of abandonment

as compared to low crime rates.

A. What Is Abandonment?

As per O’Flaherty (1993), there are three different things we could mean by the term abandonment:

“the cessation of an owner’s provision of maintenance and operating services to

a building,” the “loss of the owner’s legal right to the property,” and “the demolition of a

building.”2 For the purposes of this paper, we are focusing rather generally on all three of

these definitions. Clearly, the second two of these definitions are fairly black-and-white. One

results in a change in property ownership, while the other results in the disappearance of a

2O’Flaherty (1993), pg. 45

building. The first definition, however, is a bit trickier.

It is often ambiguous what constitutes this sort of abandoned property, and in particular,

we cannot equate abandoned with vacant, which is what is measured in most data.

Furthermore, many have found that there is a nontrivial difference in the rates calculated

using the all-inclusive measures and the more limited abandonment measures. Hence,

it pays to attempt to tease out which is which in our data in order to better estimate our

model. Specifically, measurement of this first definition will give us some pause, so that we

ought to think rather carefully about what it means.

A.1. Vacancy vs. Abandonment

To begin addressing the problem of the maintenance cessation definition and measurement

of abandonment, we must consider the distinction between vacancy and abandonment. Obviously,

not all vacant housing units are created equal, nor would all vacant housing units

be of the type to decrease the desirability of nearby properties. For example, an apartment

in mint condition vacated in the past week will not have nearly the same effect as an uninhabitable,

mold-filled apartment with broken walls and windows in abject condition that

has been abandoned for the past two years. Hence, what we are interested in is not so much

vacancy as abandonment, and we must be careful not to blur the lines between the two (at

least to the extent we are able).

However, discerning the difference between the two is not always straightforward. In fact, Apgar

(1990) found that in many cases housing units in abject condition and on the verge of

abandonment are considered to be vacant in standard vacancy calculations, instead of

being removed from the set of available housing. Similarly, Park (2000) finds that some

of the rising trend in higher vacancy rates of low rent units in particular can be attributed

to an increase in the sample of units in extremely poor condition that perhaps should be

considered abandoned rather than vacant.

Hence, we should not necessarily be satisfied with conventional measures of vacancy in

investigating what impact abandonment density has on unit value. Park (2000) notes that “if

a unit is in such bad shape that it is unlivable—either boarded up or the interior is exposed

to the elements—it is inappropriate to include it in the pool of vacant units available for

rent.”3 With this in mind, Park develops several methods of determining when a housing unit

is vacant and on the verge of abandonment, but primarily through looking at the vacancy

state of a unit over time, and removing those units that were consistently vacant. Park

found that removing these housing units that were abandoned and on the verge of being

abandoned significantly affected vacancy rates. Our approach will be theoretically similar,

but will include some methodological differences in order to be able to identify those units

which are vacant and on the verge of abandonment. Because of the detailed nature of our

data on vacancy, we will be able to give a well specified description and estimate of the true

level of impending abandonment in a neighborhood. This will allow us to sufficiently address

this first definition of abandonment.

B. Crime and Demand

Our model will not be a hedonic model, since we are not interested in demand per se, but

rather abandonment. Nevertheless, we are interested in demand in as much as it impacts

abandonment. Hence, we may draw inspiration from the hedonics literature in order to help

us create and form hypotheses about our abandonment model.

Hedonic analysis, or hedonic regression, is a method of estimating the contributory effects

of the constituent elements of demand. This method of analyzing the market value of complex

goods, and in particular the market value of housing, was first elaborated by Rosen (1974).

Since then, hedonic models have been used to understand the effects on housing demand of

a variety of housing and locational characteristics. The locational studies are clearly most

interesting to our analysis, and even these range from analysis of the effects of differential

3Park (2000), p. 97.

school quality4 to the impact on demand of nearby subsidized rental housing5.

However, most relevant within the hedonics literature for our purposes are those studies

interested in the relationship between housing demand and crime. These studies suggest

that crime levels do indeed affect property values. Thaler (1978) and Gibbons (2004) found

that property crime rates decreased housing prices in Rochester, New York and London,

respectively. Hellman and Naroff (1979) found a similar relationship using total crimes in

Boston, and Lynch and Rasmussen (2001) saw this again using violent crimes in Jacksonville,

Florida. Interestingly, they also found that while violent crimes were associated with decreased

demand, property crimes were associated with higher property demand. There are

a few reasons why this might be the case: first, it may be that smaller property crimes are

reported at a higher rate in high-rent neighborhoods than they would be in low-rent neighborhoods, due to a

decreased tolerance for general societal disorder in high-rent neighborhoods and higher rates of

policing and police presence, as suggested in Lynch and Rasmussen (2001). It also may be

that potential property thieves make the decision to commit crimes in the areas where their

efforts will be most fruitful, i.e. wealthy, high-rent, high-demand neighborhoods. Furthermore,

while an individual may fear for their personal safety in the case of violent crime, they

will likely not fear for their personal safety in the case of property crime (at least certainly

not to the same extent). Thus, it is unsurprising that at least in some locations the incentive

to commit crimes in wealthy neighborhoods in conjunction with the higher rate of reporting

property crimes in these neighborhoods will outweigh the potentially negative demand

pressures from higher risks of property crime.

Thus, the impact of property crime on demand is ambiguous, such that in some locations

it is associated with lower demand and in other locations with higher demand.

Violent crime, on the other hand, is unambiguous in decreasing demand. Thus, in as much

as demand impacts the abandonment decision through decreased building profitability, we

would expect violent crime to increase the probability of abandonment.

4Cheshire and Sheppard (2004)

5Ellen et al. (2007)

C. Hazard Rate Analysis

We are now faced with a question of model specification. How can we estimate the impact of

crime on the probability of abandonment? Survival analysis, and by extension hazard rate

analysis, can help us to answer this question.

Survival analysis is classically used in the biological sciences in order to estimate patients’

time to death, or in various engineering fields in order to estimate when a piece of machinery is

likely to break down. We can easily draw a parallel to our inquiry. Let us consider a building

an organism with a lifetime. We can then build and estimate a model of our building’s “time

to death.” Thus, rather than dealing with changes in the stock of abandoned buildings

between different neighborhoods, as we might be forced to in a linear model in order to

account for existing stock of abandoned buildings in a neighborhood, this type of model will

allow us to evaluate the factors that impact the probability that an individual unit that was

previously occupied becomes abandoned, and as such should be substantially more accurate

as we will be able to derive results on the micro-level without having to resort to aggregation

of data or having to worry about mis-measurement of the stock of abandoned buildings.

Survival analysis has been used in the context of the housing market before, but never

in this context. It has been often used to examine trends in housing market cycles, as in

Diebold and Rudebusch (1990) or Cunningham and Kolet (2007); however, these are largely

macro-level analyses. The most similar paper with regard to our purpose is Ambrose (2005),

which analyzes the decision to leave assisted housing programs. While this study is the most

methodologically similar, it nonetheless has an entirely separate focus from this study.

Hazard rate analysis is a subset of survival analysis that focuses on the factors that

determine death, system failure, or in our case abandonment, rather than the time to abandonment.

This will be the focus of our study, as we are primarily interested in the impact of

crime on the probability of abandonment under various contexts, rather than the time until

abandonment. We will develop a model based on Cox (1972) that will allow us to remain

agnostic as to the underlying distributions of our variables.
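For reference, the standard textbook form of the Cox (1972) proportional hazard model can be written as below. This is a sketch in generic notation, not necessarily the exact specification estimated later in the paper: the baseline hazard $h_0(t)$, the covariate vector $x_i$ for unit $i$, and the coefficient vector $\beta$ are symbols assumed here for illustration.

```latex
h(t \mid x_i) = h_0(t)\,\exp(x_i'\beta),
\qquad
\frac{h(t \mid x_i)}{h(t \mid x_j)} = \exp\big((x_i - x_j)'\beta\big)
```

Because the baseline hazard $h_0(t)$ cancels in the ratio, the hazard ratio between two units depends only on their covariates, which is what allows the model to remain agnostic about the underlying distribution. For a binary covariate, $\exp(\beta_k)$ is the hazard ratio associated with switching that covariate from zero to one.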

The structure of the rest of the paper will be as follows: first, this paper will expand

upon the earlier vacancy and abandonment literature by using the theoretical framework of

the Park method for discerning the difference between vacant and abandoned housing units

described above. We will then elaborate upon our hazard rate models and methodologies, and

finally present the results of our analysis, where we find that crime has a significant impact

on the probability of abandonment, and furthermore that the impact varies in accordance

with theory in conjunction with various types of abandonment and under a range of subsets

of the data.

II. Data

This study uses data from a few different sources, primarily: detailed unit-level property

characteristics data for both vacant and occupied units from the New York City Housing

and Vacancy Survey, and precinct-level crime data from the New York Police Department.

We construct an abandonment measure using data from the New York City Housing and

Vacancy Survey, and a violent crime measure and a property crime measure using the first

principal components of violent crime and property crime, respectively.

A. New York City Housing and Vacancy Survey

The New York City Housing and Vacancy Survey (NYCHVS) is conducted and sponsored

by the New York City Department of Housing Preservation and Development in order

to comply with rent regulation laws in both New York State and New York City, and is

made available by the United States Census Bureau. It is compiled and published every

three years with respondents in a stratified random sample of New York City housing units.

The sample is based on the decennial census, and as such the units in the sample are by

necessity changed every ten years. However, within any given ten year period, the units in

the sample are kept consistent, and the data is actually panel data. With this in mind, this

study uses data for the period 2000-2010, which includes the 2002, 2005, and 2008 NYCHVS.

The sample includes both vacant and occupied units, and both public and privately

owned units. Sample units for the NYCHVS come from three primary sources: first, housing units

included in the 2000 decennial census master address files; second, housing units constructed since

the 2000 decennial census, drawn from a file of addresses listing all residential units citywide that were

issued Certificates of Occupancy for new construction between January 1, 2000 and

October 31 of the year prior to the survey year; and third, units drawn from a

list of housing units located in structures owned by the city because the owner failed to pay

taxes on the property (in rem units).

For the 2002 and 2005 surveys, approximately 18,000 units throughout New York City

were chosen as a representative sample of the housing in the five boroughs, and each unit

represents approximately 180 similar units. For 2008, the number was increased to 21,000

units each representing approximately 170 similar units. This was done to take into account

the much higher rate of new constructions that had been added since the 2005 NYCHVS

than had been typical.

Field representatives hired by the New York City Department of Housing Preservation

and Development conducted interviews between January and May of the given year. Interview

response rates were 98%, 96%, and 98% in 2002, 2005, and 2008, respectively.

Table II shows the total number of units in our panel. We have 53,417 unit-year observations,

4,021 of which are vacant while the remaining 49,396 are occupied. This represents

22,115 unique units over the sample period.

Beyond our abandonment measure, which is discussed in detail below, we include several

other variables from the NYCHVS. Table III presents the number of units for a series of

binary variables by year. Specifically, these indicate whether or not the unit is owner-occupied, whether

or not the unit is currently experiencing legal problems, for example those stemming from

a will, a lawsuit, settlement of an estate, or some other legal matter that places the unit in

limbo, whether or not the owner is experiencing personal problems, for example old age or

sickness, whether or not the building has more than five units, and whether or not the unit

is in the bottom quartile of units in terms of rent for the given year. Note that the totals

for each year, and thus for the full sample, are lower here than in Table II. This is due to entries

missing coverage in one or more of these variables.

Each of these variables should theoretically have an explanatory impact on the probability

of abandonment. If a building is owner occupied, it is less likely to be a purely investment

building, and thus the abandonment decision becomes not only one of profitability but also a

question of simply having a roof over one’s head. Therefore we might expect the probability

of abandonment to be lower in the case of owner occupied buildings.

The motivation to include legal problems and personal problems is very similar. Both reflect

unexpected negative idiosyncratic shocks that could increase the likelihood of a building

tipping into abandonment. Thus, we should expect these both to increase the probability

of abandonment. As can be seen in Table III, these are very uncommon occurrences, with

incidence hovering between 0.1% and 0.2% of cases.

We include a many-units variable to capture differences in building types and sizes. We

denote a building as a many unit building if it includes more than five units. For extremely

large and expensive buildings there would need to be an absolutely enormous unexpected

shock to increase the probability of abandonment. Additionally, we might expect the types

of large institutional investors that have access to large apartment buildings to be better able

to estimate the relevance of crime rates in their investment decision. For all of these reasons

we should expect the probability of abandonment to be lower for many-unit buildings as

compared to smaller buildings.

The low rent variable is included to help control for demand. We characterize a unit as

low rent if it is in the bottom quartile for rent for a given year. This partition has been

selected to help us identify those units that might be most vulnerable to abandonment from

a demand perspective. Thus, in the event of additional high levels of violent crime and/or

an unexpected negative shock like those described above, these units would have the highest

probability of becoming abandoned.
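As a minimal sketch of how the low rent indicator could be built, the following flags units at or below the 25th percentile of rent within each survey year. The function name, the input layout, and the nearest-rank percentile rule are assumptions made for illustration; the text does not specify how ties or the exact cutoff are handled, and the rents shown are made up.

```python
import math

def bottom_quartile_flags(rents_by_year):
    """Flag units whose rent is at or below the 25th percentile for their year.

    rents_by_year maps year -> {unit_id: monthly_rent}. The nearest-rank
    percentile rule used here is an assumption, not the paper's stated method.
    """
    flags = {}
    for year, rents in rents_by_year.items():
        ordered = sorted(rents.values())
        # Nearest-rank 25th percentile: the smallest rent with at least
        # 25% of that year's units at or below it.
        rank = max(1, math.ceil(0.25 * len(ordered)))
        cutoff = ordered[rank - 1]
        flags[year] = {unit: rent <= cutoff for unit, rent in rents.items()}
    return flags

# Hypothetical rents for four units in one survey year.
example_rents = {2002: {"a": 500, "b": 900, "c": 1200, "d": 2000}}
low_rent = bottom_quartile_flags(example_rents)
```

Computing the cutoff separately for each year keeps the indicator relative to that year's rent distribution, matching the "for a given year" wording above.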

B. Abandonment Measure

The abandonment measure used in this paper is derived from several aspects of the New York

City Housing and Vacancy Survey Data. We will outline the various components contributing

to the measure before summarizing the definition of the measure below. Recalling the

definitions of abandonment from O’Flaherty (1993), we can begin to address abandonment

by looking at the second two definitions, namely “the loss of the owner’s legal right to the

property,” and “the demolition of a building.” We thus include a building if it has in rem

status, or if the unit is being held for planned demolition.

The first definition of abandonment, “the cessation of an owner’s provision of maintenance

and operating services to a building,” is more nuanced, as mentioned above. The NYCHVS

does not report abandonment, simply vacancy, and as mentioned above, several others, such

as O’Flaherty (1993) and Park (2000), have found that it can often be difficult to distinguish

units that are simply vacant from those that are on the verge of abandonment. However, Park

(2000) and Apgar (1988) find that there are significant differences in vacancy rates that have

been appropriately adjusted to remove these units on the margins of abandonment. Clearly

then, using vacancy as our rough gauge of abandonment is insufficient. Our goal will be to

apply Park’s method but in reverse: while she wanted to exclude the abandoned units to

more accurately measure vacancy, we want to exclude truly vacant units in order to more

accurately measure abandonment.

Park’s primary method of discerning units that are vacant but on the verge of abandonment

is to remove units from the sample that are abandoned in the subsequent period.

However, unlike in the American Housing Survey (which Park uses), the NYCHVS does not

list a reason for non-response in the survey data. While the NYCHVS lacks this, it does

include a large number of detailed characteristics on each vacant property, ranging from

amount of time that the housing unit has been vacant to whether or not the unit’s walls are

structurally intact. Hence, we must construct an alternative method of measuring units that

are vacant and on the verge of abandonment. Park (2000) notes that “if a unit is in such bad

shape that it is unlivable—either boarded up or the interior is exposed to the elements—it

is inappropriate to include it in the pool of vacant units available for rent.”6 We can use

this as a guiding principle as we construct our measure.

We begin in a similar way to Park, by using a longitudinal measure. As such, units that

have been consistently vacant for over a year are included in our pool. The New York City

Department of Housing Preservation and Development notes in Housing New York City 2008

that

In the City, which has been characterized by an acute housing shortage for the last several decades, a long-term rental vacancy duration raises questions as to either the absolute desirability of the rental unit within a rent context or its true availability. In other words, in the City’s rental housing market, an increase in vacancies lasting three or more months could mean that these units are probably being rejected by prospective renters as unsuitable or not preferable for one or a combination of reasons. . . [namely that] their housing and/or neighborhood physical and other conditions are not acceptable7.

Thus, in order to remain conservative with respect to our measure, we only include units

with a consistent vacancy duration of one year or more, far longer than the lower bound for

long-term vacancy.

As mentioned above, the NYCHVS includes a large number of property characteristics,

even for vacant units. We can use those characteristics to get a better picture of which

vacant units are in Park’s category “such bad shape that it is unlivable.” With this in

mind, we include several different categories of building dilapidation: problems with walls,

problems with windows, problems with stairways, and problems with floors. These problems

are summarized in the Table I below.

If a unit possesses any of these dilapidation criteria, it is added into the pool. Essentially,

this is anything that would indicate that the unit had not been lived in for some time, or could

6Park (2000), p. 97.

7Lee (2011), p. 376

Table I. Summary of Dilapidation Criteria

Walls                                     Windows
· missing outside wall material           · broken or missing windows
· major cracks in outside walls           · boarded up windows
· loose or hanging cornice, roofing,
  or other material

Stairways                                 Floors
· loose, broken, or missing steps         · sagging or sloping floors
· slanted or shifted doorsills or         · holes or missing flooring
  door frames

not handle a new occupant in the immediate future. These, along with our duration criterion,

are used to define abandonment. Like Park and Apgar, we find that this pool is significantly

different from the vacancy pool at very high significance levels, and that more interestingly,

its community district distribution is significantly different from that of vacancy. An

explicit definition of our verge-of-abandonment variable is given in equation 1 below:

\[
\text{abandonment} = 1 \quad \text{if}
\begin{cases}
\text{in rem status, OR} \\
\text{scheduled for demolition, OR} \\
\text{vacant for more than one year AND one or more Dilapidation Criteria from Table I}
\end{cases}
\tag{1}
\]
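The definition in equation 1 can be sketched as a small predicate. This is an illustration only: the `Unit` record and its field names are hypothetical and do not correspond to actual NYCHVS variable names, and the vacancy duration is assumed to be available in months.

```python
from dataclasses import dataclass

@dataclass
class Unit:
    # Hypothetical fields; these are not actual NYCHVS variable names.
    in_rem: bool               # city-owned after the owner failed to pay taxes
    held_for_demolition: bool  # unit is scheduled for demolition
    months_vacant: int         # 0 if the unit is currently occupied
    dilapidation_count: int    # number of Table I criteria observed

def is_abandoned(unit: Unit) -> bool:
    """Return True if the unit meets the abandonment definition in equation 1."""
    long_vacant = unit.months_vacant > 12        # vacant for more than one year
    dilapidated = unit.dilapidation_count >= 1   # one or more Table I criteria
    return unit.in_rem or unit.held_for_demolition or (long_vacant and dilapidated)

# A long-vacant unit with broken windows counts as abandoned;
# a recently vacated unit in the same condition does not.
long_vacant_unit = Unit(False, False, 18, 1)
recently_vacated_unit = Unit(False, False, 3, 2)
```

Note that long-term vacancy and dilapidation must hold jointly, whereas in rem status or planned demolition is sufficient on its own.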

Table IV shows tabulations of abandoned and non-abandoned units by year. We can see

that about 5% of units in a given year become abandoned. Table V shows various NYCHVS

and crime binary variables by abandonment status. There are a few things that immediately

jump out at us as interesting. First, note that both legal disputes and personal problems

are much more common among abandoned units than might be expected by the relatively

small number of abandoned units. In fact, personal problems are actually more common

under abandonment. Also note that small buildings are abandoned at a much higher rate

than large buildings. We see a similar effect for low rents, such that low rent units are much

more likely to be abandoned than high rent units. Both of these are in accordance with our

intuition and hypotheses, but more formal analysis is required to ensure that these effects

are more than simple correlations.

Figure 3 shows a heat map of abandonment rate by community district for each of the

years in the sample. We notice that abandonment rate seems to be essentially stagnant over

the sample period. Furthermore, we see that abandonment seems to be highly concentrated

in blocks of contiguous community districts. This suggests that there may be larger region

effects at play (potentially crime).

B.1. Comparison of Abandonment to Vacancy

While the construction of this abandonment measure may be interesting academically, the

measure is little more than an interesting interlude if it captures the same thing as vacancy.

Thus, we must confirm the difference between the two measures.

We can begin to do this by considering kernel density estimations for vacancy and abandonment,

as shown in Figure ??. We aggregate vacancy and abandonment at the level of

the community district and are interested in whether or not the rates of each seem to reflect

the same distribution. Figure ?? seems to suggest that the distributions are indeed different,

and certainly we can see that they have different means, which is confirmed by a Student’s

t-test with p-value < 0.0001. Beyond their means, however, the underlying distributions seem

to be rather different. Abandonment rate is much steeper and tighter than vacancy rate,

and vacancy rate seems to exhibit a second peak just above 10%.

We can test the equality of the distributions by employing a Kolmogorov-Smirnov test.

The K-S test is a non-parametric test of the equality of two distributions, taking as the

null hypothesis that the two distributions are equal. As shown in Table VII, we are able

to conclude with high confidence that these two variables come from different distributions,

and thus that our abandonment measure is capturing something different from vacancy.
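To make the mechanics of the test concrete, the following is a minimal pure-Python sketch of the two-sample K-S statistic, the largest gap between the two empirical distribution functions, with an asymptotic p-value. The district-level rates are made up for illustration, and the p-value uses the large-sample Kolmogorov series, which is only an approximation for small samples; this is not the code behind Table VII.

```python
import math

def ks_two_sample(a, b):
    """Return (D, approximate p-value) for the two-sample K-S test."""
    a, b = sorted(a), sorted(b)
    n, m = len(a), len(b)
    d = 0.0
    i = j = 0
    # Walk both sorted samples, tracking the gap between the empirical CDFs.
    while i < n and j < m:
        x = min(a[i], b[j])
        while i < n and a[i] <= x:
            i += 1
        while j < m and b[j] <= x:
            j += 1
        d = max(d, abs(i / n - j / m))
    if d == 0.0:
        return 0.0, 1.0
    # Asymptotic Kolmogorov distribution (an approximation for small samples).
    lam = math.sqrt(n * m / (n + m)) * d
    series = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                 for k in range(1, 101))
    return d, min(max(2.0 * series, 0.0), 1.0)

# Hypothetical community-district rates, for illustration only.
vacancy_rates = [0.04, 0.06, 0.07, 0.09, 0.11, 0.12, 0.13]
abandonment_rates = [0.01, 0.02, 0.02, 0.03, 0.03, 0.04, 0.05]
d_stat, p_value = ks_two_sample(vacancy_rates, abandonment_rates)
```

A large statistic with a small p-value leads us to reject the null hypothesis that the two samples come from the same distribution.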

C. New York City Crime Data

We also include crime data from the New York City Police Department by precinct for the

years 2002, 2005, and 2008, matching our NYCHVS panel. The data include all major

categories of both violent and property crimes, namely murder, rape, robbery, assault,

burglary, grand larceny, and grand larceny auto. Because of the relationship between

precincts and community districts, such that each community district contains nearly exactly

either one or two precincts, we are able to move our unit of measure from the level of the

precinct to the level of the community district and hence merge with the NYCHVS panel at

the level of the community district.

We standardize crimes by the number of housing units in the district to account for

differences in both size and densities between the various community districts. Grand larceny

displays outliers, and we Winsorize at the 5% level, which successfully rectifies the problem.

Summary statistics for the crime data are presented in Table ??. We note that there are no

substantial high or low outliers.
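A minimal sketch of these two preprocessing steps is given below. The function names are hypothetical, and the reading of "the 5% level" as replacing the lowest and highest 5% of observations with the nearest retained values is an assumption; the text does not say whether the 5% applies to each tail or to both tails combined.

```python
def per_housing_unit(crime_counts, housing_units):
    """Divide each district's crime count by its number of housing units."""
    return [c / h for c, h in zip(crime_counts, housing_units)]

def winsorize(values, percent=5):
    """Clip each tail of the data at the given percentage of observations.

    Assumption: `percent` applies to each tail separately.
    """
    ordered = sorted(values)
    n = len(ordered)
    k = (n * percent) // 100  # observations replaced in each tail
    lower, upper = ordered[k], ordered[n - 1 - k]
    return [min(max(v, lower), upper) for v in values]

# Made-up example: twenty values with one extreme high outlier.
raw = list(range(1, 20)) + [1000]
clipped = winsorize(raw)
```

Unlike trimming, Winsorizing keeps every district in the sample and only pulls the extreme values in to the nearest retained value, so the panel stays balanced.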

While we would like to include all of these crime variables in order to maximize our

measure of crime, the high correlations between the various types of crimes will prevent us

from immediately doing so. For example, correlations between murder, assault, rape, and

robbery range from 80% to over 90%. Thus, we must consider an alternate method of dealing

with these variables if we wish to include them all. Principal component analysis will allow

us to accomplish this goal and include all of these variables in order to encompass a more

full measure of crime.

We will also separate violent crimes (murder, assault, rape, and robbery), from property

crime (burglary, grand larceny, and grand larceny auto) in order to better understand the

possibly different impacts of the two types of crime on the probability of abandonment

discussed above.

C.1. Principal Component Analysis

Principal component analysis is sensitive to the relative scaling and centering of the data,

so we standardize the data so that each of the original variables now has mean of zero and

a variance of one. Principal component analysis is also sensitive to the inclusion of outliers

in the data, but our earlier treatment of outliers using Winsorization should have rectified

this issue. The box and whisker plots for each of the three years presented in Figure ??

confirm this. The figure shows the variance-standardized data, and the whiskers here are

one-and-a-half times the interquartile range. Hence, we see some points that fall outside of this

range, but none so egregious as to distort the general shape of the variance of the data.

Overall, the data is generally balanced, and we observe that each variable is on a similar

scale, as desired.

We implement principal component analysis using singular value decomposition (as described

below) to construct yearly indices of violent crime and property crime. We will

include the percentage of murders, assaults, rapes, and robberies in the violent crime measure, and

grand larcenies and grand larcenies auto in the property crime measure. This will successfully

deal with the high correlation problem while still guaranteeing that we will capture the

maximum variance in the data. We will then implement these complete crime variables into

our analysis.

C.2. PCA Crime Indices

One of the benefits of principal component analysis is that we can project the original vari-

ables onto the space of the first few components in order to make meaningful interpretations.

We can do this using the eigenvectors of the principal components. When the eigenvectors of

the principal components are discussed in the context of the original explanatory variables,

we call them scoring coefficients, and can use these coefficients to interpret projections of

the original data into reduced dimension spaces of our new orthogonal basis. We can also

interpret these projections visually, and it is the combination of seeing the projection of the

original vectors onto these simplified spaces, and looking at the specific scoring coefficients

that we can make the best interpretations.

Because of the variance-preserving nature of principal component analysis, we can also

easily construct an index of crime and abandonment that can embody a large amount of the

variation in the full data in only a few principal components. Recall also that the principal

components are by construction orthogonal, and thus independent under normality. Hence,

we must first decide how many principal components to include in our analysis before moving

on to interpretation of the meaning of the principal components. Roughly speaking, the

eigenvalues denote the stretch of the transformation in the direction of the corresponding

eigenvector, and in this case, we can use the relative magnitude of the eigenvalues as a way to

understand the variance explained by the principal component. By design, the first principal

component contains more information than the second, the second more than the third, and so

forth; however, we must select a point at which to stop including components. Figure ??

shows screeplots of the eigenvalues of the principal components of violent crime, and Figure 9

shows screeplots of the eigenvalues of the principal components of property crime. We first

notice that there is a large amount of similarity between the shapes of the screeplots, which is

not entirely surprising, given that there are only three years between each data collection year.

We can also see that there is a significant drop below one in explanatory ability after the

first component for each of the three screeplots for each crime type, particularly for violent

crime, so preliminarily we will construct our index using the first component. Another way of

thinking about this is by seeing the actual variance explained by each component. Figure 7

shows the actual variance explained by each principal component of violent crime, while

Figure 10 shows the actual variance explained by each component of property crime, as well

as the cumulative variance explained by the principal components. Here we can see that the

first component of violent crime explains over ninety percent of the variation in the data, while

the first component of property crime explains approximately 70% of the variance within

the data, which further confirms that construction of two one-component indices should be

acceptable.
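The selection rule discussed above (retain components whose eigenvalue exceeds one, and check the share of variance they explain) can be sketched as follows. The eigenvalues below are hypothetical numbers chosen for illustration; they are not the values behind the screeplots.

```python
def explained_variance(eigenvalues):
    """Return per-component and cumulative shares of total variance."""
    total = sum(eigenvalues)
    shares = [ev / total for ev in eigenvalues]
    cumulative = []
    running = 0.0
    for share in shares:
        running += share
        cumulative.append(running)
    return shares, cumulative


def components_above_one(eigenvalues):
    """Count components whose eigenvalue exceeds one (a common rule of thumb)."""
    return sum(1 for ev in eigenvalues if ev > 1.0)


# Hypothetical eigenvalues for four standardized violent crime variables.
eigenvalues = [3.68, 0.20, 0.08, 0.04]
shares, cumulative = explained_variance(eigenvalues)
retained = components_above_one(eigenvalues)
```

With standardized variables the eigenvalues sum to the number of variables, so an eigenvalue below one means that the component explains less variance than a single original variable would.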

Figure 5 and Figure 8 show our original explanatory variables and original data points

projected onto the space of the first two principal components. We can see that the first

component of violent crime has very high positive values in all of the violent crime variables,

and the first component of property crime has moderate or high positive values in some

of the property crime variables, with grand larceny causing the most trouble. This further

reinforces our hypothesis that the violent crime data are all highly jointly correlated, as we

suspected, while property crime may have more mixed origins and relationships to itself and

other variables of interest. We will be able to investigate this more closely within the context

of our model.

Finally, note the relationships between Figure 2 and Figure 3, which form the basis for

the stylized facts on which we base this study. Building and implementing our model will

allow us to test these hypotheses.

III. Models

We will build a proportional hazards model based on ?. The motivation behind this is

twofold. First, the nonparametric nature of the model will allow us to avoid restrictive

assumptions on our data that will likely not hold. Second, and more importantly, the model

will allow us to estimate the differences in the probabilities that units currently occupied

become abandoned, rather than trying to work with changes in the stock of units, which

is an essentially unmeasurable quantity. However, before we elaborate upon this model, we

must understand its fundamental origins.

A. Hazard Rate Analysis

While linear models may be useful in helping us to understand correlative relationships

between abandonment and crime in New York City, they will not be useful in helping us to

understand any causal relationships that may exist between abandonment and crime due to

the endogeneity and possible simultaneous causality that exists between these variables. In

order to rectify this problem, we can also build a hazard rate model to better understand the

relationship between crime and abandonment. Specifically, hazard rate analysis will allow

us to address the question of the differential probability of a unit becoming abandoned in a

given time period in areas with different crime rates. Within our sample, many units begin

occupied and then subsequently are abandoned in later periods. For the context of our

analysis, we will consider this abandonment our “event,” and crime will be our “treatment.”

Let us define the time that a unit becomes abandoned as T , where T is a random variable

with an underlying continuous probability distribution, f(t), where t is some time. Then

the cumulative probability, or lifetime distribution function, F (t), and the survival function,

S(t), are denoted

\[ F(t) = \int_0^t f(s)\,ds = \Pr(T \le t) = 1 - S(t) \tag{2} \]

i.e. S(t) = Pr(T > t). The survival function gives the probability that the time of abandonment

will be later than t. Let a denote the probability that the unit will be abandoned in the

next interval of time, conditional upon the unit not being abandoned before time t. Then a

will be denoted by

\[ a(t, \Delta t) = \Pr(t \le T \le t + \Delta t \mid T \ge t) \tag{3} \]

Then the hazard function, λ, is defined as the event rate at time t conditional on not being

abandoned until time t or later, i.e.

\[ \lambda(t) = \lim_{\Delta t \to 0^+} \frac{\Pr(t \le T \le t + \Delta t \mid T \ge t)}{\Delta t} = \frac{f(t)}{S(t)} \tag{4} \]

In other words, the hazard rate indicates the rate at which units are abandoned at time t

conditional upon the unit remaining un-abandoned until time t.
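
A short worked example may help fix ideas. Since f(t) = −S′(t), the hazard function can be written in terms of the survival function alone, and the survival function can in turn be recovered from the hazard (using S(0) = 1):

\[ \lambda(t) = \frac{f(t)}{S(t)} = -\frac{d}{dt} \log S(t), \qquad S(t) = \exp\left( -\int_0^t \lambda(s)\,ds \right). \]

As an illustration only, suppose T were exponentially distributed with rate θ (a hypothetical parameter that is not part of our model). Then f(t) = θ exp(−θt) and S(t) = exp(−θt), so that

\[ \lambda(t) = \frac{\theta e^{-\theta t}}{e^{-\theta t}} = \theta, \]

i.e. the hazard of abandonment would be constant over time. The proportional hazards approach avoids committing to any such functional form.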

In order to quantify differences in hazard rates between different groups of characteristics

we can utilize a Cox proportional hazards model. We begin with the baseline hazard function

described above. From now on, let us consider this baseline hazard function as λ0(t). Then

the hazard rate function for the Cox model is given by

\[ \lambda(t \mid X) = \lambda_0(t) \exp(\beta' X) \tag{5} \]

where X is the vector of explanatory variables. Hence, we can construct a partial likelihood

based on this new hazard function as follows

\[ L(\beta) = \prod_{j=1}^{J} \frac{\exp(\beta' X_j)}{\sum_{i \in R_j} \exp(\beta' X_i)} \tag{6} \]

where j indexes the event times, Xj is the explanatory variable vector for the unit

that became abandoned at time tj, and Rj denotes the risk set at time tj, namely the set of

units which are at risk of abandonment at tj. This will allow us to make claims about the

ratios of the probabilities of abandonment for different crime levels while remaining agnostic

as to the specific underlying hazard rate distribution, a significant advantage of the model.
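
To make the partial likelihood concrete, the following sketch evaluates its logarithm on a toy sample. It is an illustration under simplifying assumptions, not the estimation code used here: the function name and data are hypothetical, and it assumes no tied event times, whereas data observed in a few survey waves would require a tie correction.

```python
import math


def cox_partial_loglik(beta, times, events, covariates):
    """Log partial likelihood of a Cox model, assuming no tied event times.

    times[k]      observed time for unit k
    events[k]     1 if unit k was abandoned at times[k], 0 if censored
    covariates[k] list of explanatory variables for unit k
    """
    def linear_predictor(x):
        return sum(b * v for b, v in zip(beta, x))

    loglik = 0.0
    for j, (t_j, abandoned) in enumerate(zip(times, events)):
        if not abandoned:
            continue  # censored units enter only through the risk sets
        # Risk set: units still under observation at time t_j.
        risk_set = [i for i, t_i in enumerate(times) if t_i >= t_j]
        denominator = sum(math.exp(linear_predictor(covariates[i])) for i in risk_set)
        loglik += linear_predictor(covariates[j]) - math.log(denominator)
    return loglik


# Toy data: four units, one covariate (1 = hypothetical high-crime district).
times = [1, 2, 3, 4]
events = [1, 1, 0, 1]
covariates = [[1], [0], [1], [0]]
loglik_at_zero = cox_partial_loglik([0.0], times, events, covariates)
```

Note that the baseline hazard never appears in the computation, which is why the model can remain agnostic about its shape.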

IV. Methodologies

We detail the technique of principal component analysis, which is integral in our violent crime

and property crime measures. Furthermore, in order to better understand the relationship

between abandonment and crime, we will also build a hazard rate (or survival analysis)

model, which we can estimate and test using the various methods described below. This will

help us to deal with the intrinsic endogeneity between abandonment and crime.

A. Principal Component Analysis

As per Jolliffe (2002), the principal components are given by:

\[ {}_{n}Y^{T}_{p} = {}_{n}X^{T}_{p} \, {}_{p}W_{p} \tag{7} \]

where ${}_{n}Y^{T}_{p} = (y_1, y_2, \ldots, y_p)$ are the principal components, ${}_{n}X^{T}_{p}$ is the mean-centered transpose

of the data matrix, and ${}_{p}W_{p}$ is the matrix resulting from singular value decomposition of the

data matrix, ${}_{p}X_{n}$, as shown in equation (8):

\[ {}_{p}X_{n} = {}_{p}W_{p} \, {}_{p}\Sigma_{n} \, {}_{n}V^{T}_{n} \tag{8} \]

where ${}_{p}W_{p}$ is the matrix of eigenvectors of the covariance matrix ${}_{p}X_{n} \, {}_{n}X^{T}_{p}$, ${}_{p}\Sigma_{n}$ is a rectangular

diagonal matrix with the diagonal composed of elements of $\mathbb{R}^{+}$, and ${}_{n}V_{n}$ is the matrix of

eigenvectors of ${}_{n}X^{T}_{p} \, {}_{p}X_{n}$.
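
The link between the decomposition in equation (8) and the eigenvectors just mentioned can be made explicit. The following is a short sketch that drops the dimension subscripts and writes σ_k for the k-th diagonal element of Σ (a symbol introduced here only for illustration):

\[ X X^{T} = W \Sigma V^{T} V \Sigma^{T} W^{T} = W (\Sigma \Sigma^{T}) W^{T}, \]

since V is orthogonal, so that V^T V = I. Hence the columns of W are eigenvectors of X X^T with eigenvalues σ_k^2. Because the covariance matrix differs from X X^T only by a scalar factor, the share of total variance attributable to the k-th principal component is

\[ \frac{\sigma_k^{2}}{\sum_{l=1}^{p} \sigma_l^{2}}. \]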

Essentially, principal component analysis uses a particular orthogonal transformation

to turn our potentially correlated explanatory variables into a set of linearly independent

vectors called “principal components.” We can think of this as finding a new orthogonal

basis for the space of our explanatory variables that preserves the “structure” of the data

in the sense of maintaining variance. Furthermore, this new basis is constructed to help

us identify the most important gradients in the data. Hence, the first principal component

is in the direction of maximum variance in the data, the second principal component is in

the direction of maximum variance such that it is orthogonal to the first, and as such will

be in the direction of second most variance, etc. We are in effect “rotating” our data to

identify the directions of maximum variance. Note that, as mentioned above, the principal

components are simply a linear combination of the original variables, albeit chosen such that

the first component contains more information than the second, the second more than the

third, etc. for all p components.

Because the principal components are sensitive to the relative scaling of the vectors in

the data matrix, and can give misleading results if not mean-centered, we have standardized

the crime variables as discussed above before implementing the method.

There are several advantages to principal component analysis that we will be able to

take advantage of in the context of this study. Most importantly, we will be able to include

more rigorous measures of violent crime and property crime without dramatically violating

the independence assumption of the hazard rate model as we would if we included all of our

violent crime or property crime variables within our analysis. Furthermore, because principal

component analysis by design creates the new basis’ coordinates in order of importance, we

can project our original data onto this new basis, and can visualize what was originally high

dimensional data in the space of R2 or R3, which is a significant advantage in helping us to

interpret the meaning of the principal components. Hence, we can summarize the most important

information embodied in these two sets of variables in a single variable while maintaining

the assumptions of our model.
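
As an illustration of how several correlated crime variables collapse into a single index, the sketch below computes the first principal component of a small standardized data set. Two caveats apply: the data are hypothetical, and the leading eigenvector is found by power iteration on the correlation matrix, a simple stand-in chosen to keep the example dependency-free, rather than by the singular value decomposition described above.

```python
import math


def correlation_matrix(columns):
    """Standardize each variable and return (correlation matrix, z-scores)."""
    n = len(columns[0])
    z = []
    for col in columns:
        mean = sum(col) / n
        sd = math.sqrt(sum((v - mean) ** 2 for v in col) / n)
        z.append([(v - mean) / sd for v in col])
    p = len(z)
    corr = [[sum(a * b for a, b in zip(z[i], z[j])) / n for j in range(p)]
            for i in range(p)]
    return corr, z


def first_component(corr, iterations=500):
    """Leading eigenvector (unit length) and eigenvalue via power iteration."""
    p = len(corr)
    w = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        nxt = [sum(corr[i][j] * w[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(v * v for v in nxt))
        w = [v / norm for v in nxt]
    eigenvalue = sum(w[i] * sum(corr[i][j] * w[j] for j in range(p))
                     for i in range(p))
    return w, eigenvalue


# Hypothetical crime rates for five districts.
murder = [0.01, 0.02, 0.03, 0.05, 0.07]
assault = [0.20, 0.50, 0.80, 1.40, 2.10]
robbery = [0.30, 0.90, 1.20, 1.70, 2.40]

corr, z = correlation_matrix([murder, assault, robbery])
weights, eigenvalue = first_component(corr)

# The index is the projection of each district onto the first component.
index = [sum(w * z[k][i] for k, w in enumerate(weights))
         for i in range(len(murder))]
share_explained = eigenvalue / len(corr)
```

The entries of `weights` play the role of the scoring coefficients discussed earlier: when all of them are positive and of similar size, the index behaves like an overall level of crime.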

B. Hazard Rate Analysis

In order to undertake hazard rate analysis, we must examine the survival and hazard func-

tions. We can use the Nelson-Aalen estimator to get a sense of the cumulative hazard rate

function, Λ, where unsurprisingly

\[ \Lambda(t) = \int_0^t \lambda(s)\,ds \tag{9} \]

where λ is the hazard rate function described in our model above. The Nelson-Aalen es-

timator is non-parametric, and is able to handle censoring of data, so we can both remain

agnostic with regard to the underlying abandonment probability distributions and include

the large number of right-censored units in our data. The Nelson-Aalen estimator is given

by

\[ H(t) = \sum_{t_i \le t} \frac{d_i}{n_i} \tag{10} \]

where di is the number of abandonments at time ti, and ni is the total number of units at

risk at time ti. This method will give us some preliminary indication of the differences in

cumulative hazard rates for units across various crime parameters.
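
A minimal sketch of the estimator, using hypothetical data, shows how right-censored units remain in the risk set until they drop out of observation. The function name and inputs are assumptions made for the example.

```python
def nelson_aalen(times, events):
    """Nelson-Aalen cumulative hazard at each distinct event time.

    times[k]  last time at which unit k was observed
    events[k] 1 if unit k was abandoned at times[k], 0 if right-censored
    Returns a list of (event time, cumulative hazard) pairs.
    """
    curve = []
    cumulative = 0.0
    event_times = sorted({t for t, e in zip(times, events) if e})
    for t in event_times:
        d = sum(1 for tk, ek in zip(times, events) if tk == t and ek)
        n = sum(1 for tk in times if tk >= t)  # units still at risk at t
        cumulative += d / n
        curve.append((t, cumulative))
    return curve


# Hypothetical sample: six units observed over three periods.
times = [1, 1, 2, 2, 3, 3]
events = [1, 0, 1, 1, 0, 1]
curve = nelson_aalen(times, events)
```

Computing the curve separately for units in high and low crime districts gives the kind of preliminary comparison described above.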

We will utilize the Peto log-rank and the Wilcoxon-Breslow-Gehan statistics to test the

null hypothesis that the hazard rates are the same across the various crime parameters. Both

statistics are given by

\[ Z = v' V^{-1} v \tag{11} \]

where V is the covariance matrix, and v is a vector whose components are given by

\[ v_i = \sum_{j=1}^{J} w_j (d_{ij} - e_{ij}) \tag{12} \]

where i = 1, . . . , I indexes the distinct groups, the summation runs over the J unique

event times, dij is the number of abandonments in group i at time j, eij is the

expected number of abandonments in group i at time j, and wj is the weight for time period

j, where wj = 1 for all j for the Peto log-rank statistic and wj is the number of units at risk of

abandonment at time j for the Wilcoxon-Breslow-Gehan statistic. Each is distributed

χ2 with degrees of freedom equal to the rank of V . The Cox proportional hazard model can

easily be estimated by maximizing the partial likelihood, so we will not go into any depth as to its

estimation.
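
For the two-group case that we use below (high versus low crime), the vector v reduces to a single number and V to its variance, so the statistic can be computed directly. The sketch below is an illustration on hypothetical data. The function name is an assumption, the variance term is the standard hypergeometric one, and the p-value uses the fact that a χ2 variable with one degree of freedom is a squared standard normal.

```python
import math


def weighted_logrank(times, events, groups, weight="logrank"):
    """Two-group test of equal hazards: returns (Z, p-value).

    weight="logrank"  uses w_j = 1
    weight="wilcoxon" uses w_j = number of units at risk at time j
    groups[k] is 1 for the group of interest and 0 otherwise.
    """
    v = 0.0
    variance = 0.0
    for t in sorted({t for t, e in zip(times, events) if e}):
        n = sum(1 for tk in times if tk >= t)
        n1 = sum(1 for tk, g in zip(times, groups) if tk >= t and g == 1)
        d = sum(1 for tk, e in zip(times, events) if tk == t and e)
        d1 = sum(1 for tk, e, g in zip(times, events, groups)
                 if tk == t and e and g == 1)
        w = 1.0 if weight == "logrank" else float(n)
        v += w * (d1 - d * n1 / n)  # observed minus expected in group 1
        if n > 1:
            variance += w * w * d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    z_stat = v * v / variance
    p_value = math.erfc(math.sqrt(z_stat / 2.0))  # chi-squared, 1 d.f.
    return z_stat, p_value


# Hypothetical sample: group 1 units are all abandoned before group 0 units.
times = [1, 2, 3, 4, 5, 6, 7, 8]
events = [1, 1, 1, 1, 1, 1, 1, 1]
groups = [1, 1, 1, 1, 0, 0, 0, 0]
z_stat, p_value = weighted_logrank(times, events, groups)
z_wilcoxon, p_wilcoxon = weighted_logrank(times, events, groups, weight="wilcoxon")
```

The Wilcoxon-type weights put more emphasis on early event times, when many units are still at risk, which is why the two statistics can disagree when the hazards diverge only late in the sample.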

V. Results

We estimate sets of hazard ratios for three different types of abandonment, and for various

subsets of variables in order to tease out the various relationships within the data. Most

importantly we are able to make some claims about the conditional impact of violent crime

on different subsets of both abandonment and the various explanatory variables.

We begin by using the full set of abandoned units, before estimating specialized models

for abandonment due to dilapidation and abandonment due to in rem status. This ability

to measure the impact of our explanatory variables on different types of abandonment is an

extremely interesting and valuable benefit of our abandonment measure.

A. Hazard Rate and Hazard Ratio Analysis

The various explanatory variables are as described above, with the addition of the high

violent crime and high property crime variables. Units whose community districts have the

first principal component of violent crime and property crime over the period above the

city-wide median are denoted high violent crime and high property crime, respectively.
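
The construction of these indicators amounts to a median split of the district-level index, which can be sketched as follows. The district labels and index values are hypothetical, and the treatment of a district that falls exactly at the median (classified as low here) is an assumption.

```python
import statistics


def high_crime_flags(index_by_district):
    """Flag districts whose index is strictly above the city-wide median."""
    cutoff = statistics.median(index_by_district.values())
    return {district: score > cutoff
            for district, score in index_by_district.items()}


# Hypothetical first-component scores for five community districts.
violent_index = {"CD1": -1.2, "CD2": -0.3, "CD3": 0.1, "CD4": 0.4, "CD5": 1.0}
flags = high_crime_flags(violent_index)
```

Each housing unit then inherits the flag of the community district in which it is located.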

We begin by analyzing the cumulative hazard rate function, Λ, which we can estimate

with the Nelson-Aalen estimator. This is plotted in Figure 11 below. We can superficially

see that the cumulative hazard rate is consistently much higher for units in high violent

crime versus low violent crime districts, which is unsurprising considering our hypotheses.

We then test the null hypothesis that the underlying hazard rates are the same, us-

ing the Peto log-rank and Wilcoxon-Breslow-Gehan statistics and find that both reject the

null at extremely high significance levels (p-values <0.00001). Hence, we can conclude the

probability of abandonment is significantly affected by the crime level in the unit’s district.

Using the Cox proportional hazard model, we can quantify the difference in probabilities,

and control for a range of other variables. A hazard ratio represents a ratio of probabilities.

Note the hazard ratio (1.271) of high violent crime on the first model estimation in Table VIII.

Thus, the probability that a unit will be abandoned in the next period, conditional upon the

unit not being abandoned in an earlier period, is 1.271 times as high for units in high violent

crime districts as for units in low violent crime districts.
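
The arithmetic linking a reported hazard ratio to the underlying coefficient can be sketched as follows, using the hazard ratio of 1.271 and its reported standard error of 0.136. One assumption is made loudly here: that the tabulated standard errors are on the hazard ratio scale (obtained by the delta method), as is the convention in some statistical packages. If they are instead on the coefficient scale, the last four quantities would change.

```python
import math

hazard_ratio = 1.271  # reported hazard ratio for high violent crime
se_hazard_ratio = 0.136  # reported standard error (assumed to be on the ratio scale)

# A hazard ratio is the exponentiated Cox coefficient.
beta = math.log(hazard_ratio)

# Delta method: se(exp(beta)) is approximately exp(beta) * se(beta).
se_beta = se_hazard_ratio / hazard_ratio

# Wald test of beta = 0 and a 95% confidence interval on the ratio scale.
z = beta / se_beta
p_value = math.erfc(abs(z) / math.sqrt(2.0))
ci_low = math.exp(beta - 1.96 * se_beta)
ci_high = math.exp(beta + 1.96 * se_beta)
```

Under that assumption the implied p-value falls between 0.01 and 0.05, which agrees with the two-star significance marker attached to this estimate in the tables.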

Table VIII shows the full model and various subset models for the full set of abandoned

units, Table IX shows the full model and various subset models for the set of units that have

become abandoned due to dilapidation, and Table X shows the full model for the set of units

that have become abandoned due to in rem status. Most importantly, we should note that

violent crime is always highly significant for the full model, and only loses its significance

for the high rent model. Furthermore, the hazard ratio for violent crime is always largest

under the low rent subset, indicating that the impact of surrounding violent crime on the

probability of abandonment is most important when the distance to abandonment is lowest.

As discussed earlier, we would expect violent crime to be most important for dilapidation

abandonment, and we see that the distance or ratio between the low rent and high rent

violent crime hazard ratio is largest in this situation.

We notice that rent level is most important under in rem abandonment, indicating that

this decision may be primarily economic (i.e. exercising the option of abandonment) as

opposed to the possibly larger range of effects at play under dilapidation abandonment.

Finally, we see that legal disputes and personal problems are most important for dilapidation

abandonment when rent is low. This would seem to indicate that risk preferences play a part

in the abandonment process. Specifically, when an individual either suffers a large physical

or financial setback due to sickness or injury, or the ownership of the unit is in question,

units whose profitability and value were already marginal dramatically increase in probability

of abandonment.

Thus we see that the impact of crime seems to vary predictably in magnitude for each

of our abandonment measures. Several other measures, such as sudden negative shocks like

legal trouble or notable sickness, seem to exhibit similar effects.

We also considered models that included change in crime rather than level of crime, but

found them to be highly insignificant. While higher crime levels increased the probability that

a unit would become abandoned, we found that higher changes in crime actually decreased

the probability that a unit would become abandoned. It could be because agents might

not be as sensitive to very small changes in crime when making abandonment decisions, so

such changes might not factor very heavily. Change in assaults also had much less variation between

community districts than level, so that could also explain why it seems to be less of a factor

within this time period. Since changes are small and relatively homogeneous, it could be

that the prevailing idea of a neighborhood has not changed, or at least has not had time

to change, and as such has not been able to affect abandonment decisions. It could also

be because the areas with the smallest change in crime (even smallest percent change in

crime) already have very low crime, so more change is not really relevant to the decision.

What I think is most likely, however, is that decreases are mostly taking place in high crime

neighborhoods, so all this is picking up is the level, but in reverse.

VI. Conclusion

Using our novel measure of housing abandonment and our proportional hazard model, we

have been able to estimate the ratios in probability of various explanatory variables, most

notably violent crime. We find that violent crime has a strong causal impact on abandon-

ment, and furthermore that while the existence of a positive causal relationship remains

strong across various types of abandonment and various subsets of the data, the magnitude

of the impact varies substantially and predictably in accordance with theory along these

same measures. We are also able to make similar claims regarding other predictive variables,

including rent levels and various large negative idiosyncratic shocks.

The contribution of this paper is twofold. First, we are able to suggest (for what I believe

is the first time) that violent crime has a causal impact on the probability of abandonment.

Furthermore, through our constructed abandonment measure and our acknowledgement of

the various types of abandonment, we are able to measure the changing effects of our ex-

planatory variables on types of abandonment that may have very different theoretical and

empirical causes. Here, we see that the impact of relatively static demand decreasing pres-

sures, such as rent levels or crime, may have an entirely different impact on differential

probabilities of abandonment than more instantaneous negative shocks, like legal troubles

or dramatically poor health, necessitating the use of more nuanced and specific policies to

address what appear to be somewhat disparate problems.

References

Brent W. Ambrose. A hazard rate analysis of leavers and stayers in assisted housing pro-

grams. Cityscape: A Journal of Policy Development and Research, 8(2), 2005. URL

http://www.huduser.org/periodicals/cityscpe/vol8num2/ch4.pdf.

William C. Apgar. Rental housing in the U.S. The Joint Center for Housing Studies of

Harvard University - Working Paper, W88-1, 1988.

William C. Apgar. Preservation of existing housing: A key element in a revitalized housing

policy. The Joint Center for Housing Studies of Harvard University - Working Paper,

W90-1, 1990.

Paul Cheshire and Stephen Sheppard. Capitalising the value of free schools: The impact of

supply characteristics and uncertainty. Economic Journal, 114(499):F397–F424, November

2004. URL http://ideas.repec.org/a/ecj/econjl/v114y2004i499pf397-f424.html.

D. R. Cox. Regression models and life-tables. Journal of the Royal Statistical Society. Series

B (Methodological), 34(2):187–220, 1972. ISSN 00359246. URL

http://www.jstor.org/stable/2985181.

Rose Cunningham and Ilan Kolet. Housing market cycles and duration dependence in the

United States and Canada. 07(2), 2007. URL

http://ideas.repec.org/p/bca/bocawp/07-2.html.

Francis X. Diebold and Glenn D. Rudebusch. A nonparametric investigation of duration

dependence in the American business cycle. Journal of Political Economy, 98(3):596–

616, 1990. ISSN 00223808. URL http://www.jstor.org/stable/2937701.

Ingrid Gould Ellen, Amy Ellen Schwartz, Ioan Voicu, and Michael H. Schill. Does federally

subsidized rental housing depress neighborhood property values? Journal of Policy Anal-

ysis and Management, 26(2):257–280, 2007. ISSN 1520-6688. doi: 10.1002/pam.20247.

URL http://dx.doi.org/10.1002/pam.20247.

Steve Gibbons. The costs of urban property crime. The Economic Journal, 114(499):

F441–F463, 2004. ISSN 1468-0297. doi: 10.1111/j.1468-0297.2004.00254.x. URL

http://dx.doi.org/10.1111/j.1468-0297.2004.00254.x.

Daryl A. Hellman and Joel L. Naroff. The impact of crime on urban residential property

values. Urban Studies, 16(1):105–112, 1979. doi: 10.1080/713702454. URL

http://usj.sagepub.com/content/16/1/105.short.

D. Immergluck and G. Smith. The impact of single-family mortgage foreclosures on neigh-

borhood crime. Housing Studies, 21(6):851–866, 2006. doi: 10.1080/02673030600917743.

Moon Wha Lee. Housing New York City 2008. Department of Housing Preservation and

Development, 2011.

Allen Lynch and David Rasmussen. Measuring the impact of crime on house prices. Applied

Economics, 33(15):1981–1989, 2001. URL

http://EconPapers.repec.org/RePEc:taf:applec:v:33:y:2001:i:15:p:1981-1989.

Brendan O’Flaherty. Abandoned buildings: A stochastic analysis. Journal of Urban Eco-

nomics, 34(1):43 – 74, 1993. ISSN 0094-1190. doi: 10.1006/juec.1993.1025. URL

http://www.sciencedirect.com/science/article/pii/S0094119083710259.

June Ying Shann-Hwa Park. Increased homelessness and low rent housing vacancy

rates. Journal of Housing Economics, 9(1-2):76 – 103, 2000. ISSN 1051-1377. doi:

10.1006/jhec.2000.0263. URL

http://www.sciencedirect.com/science/article/pii/S1051137700902638.

Sherwin Rosen. Hedonic prices and implicit markets: Product differentiation in pure

competition. Journal of Political Economy, 82(1):34–55, Jan.-Feb. 1974. URL

http://ideas.repec.org/a/ucp/jpolec/v82y1974i1p34-55.html.

Wesley G. Skogan. Disorder and Decline: Crime and the Spiral of Decay in American

Neighbourhoods. University of California Press, 1990. ISBN 9780520076938. URL

http://books.google.com/books?id=ASrAMJh7LngC.

William Spelman. Abandoned buildings: Magnets for crime? Journal of Criminal Justice,

21(5):481–495, 1993. URL

http://EconPapers.repec.org/RePEc:eee:jcjust:v:21:y:1993:i:5:p:481-495.

Fran Spielman. City to target hazards of abandoned buildings. Chicago Sun Times, 15

March:24, 2000.

Richard Thaler. A note on the value of crime control: Evidence from the property market.

Journal of Urban Economics, 5(1):137–145, 1978. ISSN 0094-1190. doi:

10.1016/0094-1190(78)90042-6. URL

http://www.sciencedirect.com/science/article/pii/0094119078900426.

James Q. Wilson and George L. Kelling. Broken windows: The police and neighborhood

safety. The Atlantic Monthly, 1982.

Appendices

A Tables

Year     Occupied    Vacant     Total
2002       15,894     1,263    17,157
2005       15,547     1,287    16,834
2008       17,955     1,471    19,426
Total      49,396     4,021    53,417

Table II. Summary Tabulations for New York City Housing and Vacancy Survey by Occupancy Status: This table presents the number of units by occupancy status for each of the years in our panel. We see that

Year     Owner Occ.      Legal Disp.    Pers. Prob.    Many Units       Low Rent         Total
            N      %       N      %      N      %        N       %       N       %
2002     2006  17.2%      14   0.1%      8   0.1%    11501   98.8%    2909   25.0%    11638
2005     1794  15.8%      19   0.2%      5   0.1%    11241   98.8%    2842   25.0%    11375
2008     1906  14.7%      28   0.2%     11   0.1%    12884   99.1%    3095   23.8%    12999
Total    5706  15.8%      61   0.2%     24   0.1%    35626   98.9%    8846   24.6%    36012

Table III. Summary Tabulations for New York City Housing and Vacancy Survey Binary Variables: This table presents the number of units in each of the following categories by year: whether or not the unit is owner-occupied, whether or not the unit is currently experiencing legal problems (i.e. those stemming from a will, a lawsuit, settlement of an estate, or some other legal matter that places the unit in limbo), whether or not the owner is experiencing personal problems (i.e. old age or sickness) that impede the rent or sale of a unit, whether or not the building has more than five units, and whether or not the unit is in the bottom quartile of units in terms of rents.

Year     Unabandoned    Abandoned     Total
2002          11,013          608    11,621
2005          10,806          547    11,353
2008          12,366          592    12,958
Total         34,185        1,747    35,932

Table IV. Summary Tabulation for Abandoned Units: This table presents a tabulation of abandoned and unabandoned units by year. We see that abandonment is fairly consistent throughout the period, but is much lower than vacancy. We see that the rate of new abandonment for the entire city is 5.5%, 5.2%, and 3.4% for 2002, 2005, and 2008, respectively.

Variable               Unabandoned    Abandoned    Total
Low Violent Crime            16401          479    16880
High Violent Crime           17903         1265    19168
Low Property Crime           16638         1296    17934
High Property Crime          17666          448    18114
Not Owner Occ.               28693         1639    30332
Owner Occ.                    5611          105     5716
Few Units                      233          165      398
Many Units                   34071         1579    35650
No Legal Disp.               34269         1715    35984
Legal Disp.                     35           29       64
No Pers. Prob.               34293         1730    36023
Pers. Prob.                     11           14       25
High Rents                   26636          436    27072
Low Rents                     7668         1308     8976

Table V. NYCHVS and Crime Variables by Abandonment Status: This table shows tabulations for our explanatory variables by abandonment. There are a few things that jump out at us as particularly interesting. First, note that both legal disputes and personal problems are much more common among abandoned units than might be expected by the relatively small number of abandoned units. Also note that small buildings are abandoned at a much higher rate than large buildings. Both of these are in accordance with our intuition and hypotheses.

VARIABLES             Mean     SD    Median     P5     P95
Murder                0.03   0.03      0.02   0.00    0.07
Rape                  0.07   0.04      0.06   0.01    0.15
Robbery               1.23   0.65      1.17   0.28    2.42
Assault               0.90   0.56      0.80   0.17    2.10
Burglary              1.12   0.40      1.15   0.48    1.88
Grand Larceny         2.15   1.30      1.81   1.01    4.62
Grand Larceny Auto    0.70   0.34      0.69   0.18    1.34

Table VI. Summary Statistics for Crime Data: Reporting unit is the Community District. Each variable is per total housing units in each community district. We have standardized in this way to account for the fact that community districts are not precisely homogeneous in size. We have Winsorized grand larceny at the 5% level, and subsequently we do not observe any outliers. No other variables exhibit outliers.

Smaller group     D Stat.    P-value    Corrected
0:                  0.055      0.612
1:                 -0.449      0.000
Combined K-S:       0.449     <0.001       <0.001

Table VII. Two-sample Kolmogorov-Smirnov test for equality of Abandonment and Vacancy distribution functions: This table presents the results of a Kolmogorov-Smirnov test on the equality of the distributions of the percentage of abandonment and the percentage of vacancy at the level of the community district. The K-S test is a non-parametric test of the equality of two distributions, taking as the null hypothesis that the two distributions are equal. We are able to conclude with high confidence that these two variables come from different distributions, and thus that our abandonment measure is capturing something different from vacancy.

                     (1)           (2)          (3)           (4)                 (5)
VARIABLES            Full Model    Low Rent     High Rent     High Viol. Crime    Low Viol. Crime

High Viol. Crime     1.981***      2.048***     1.613***
                     (0.133)       (0.173)      (0.180)
High Prop. Crime     0.383***      0.364***     0.495***      0.407***            0.319***
                     (0.0246)      (0.0289)     (0.0555)      (0.0288)            (0.0509)
Owner-Occupied       0.467***      0.411***     0.570***      0.333***            0.589***
                     (0.0557)      (0.0774)     (0.0883)      (0.0588)            (0.105)
> 5 Units            0.0479***     0.174***     0.0211***     0.0538***           0.0340***
                     (0.00453)     (0.0383)     (0.00247)     (0.00623)           (0.00562)
Legal Disp.          2.064***      3.000***     4.204***      1.748**             4.571***
                     (0.462)       (1.087)      (1.219)       (0.480)             (1.787)
Pers. Prob.          5.180***      3.831*       3.114***      8.596***            3.706***
                     (1.437)       (2.734)      (0.945)       (3.633)             (1.377)
Low Rent             6.291***                                 6.430***            5.892***
                     (0.406)                                  (0.518)             (0.649)

Observations         35,003        8,393        26,610        18,524              16,479

Standard Errors in parentheses
*** p<0.01, ** p<0.05, * p<0.1

Table VIII. Cox Proportional Hazard Model Estimations Under Various Conditions: This table presents hazard ratios for a range of explanatory variables under a range of model specifications. Abandonment here is the full sample of abandoned units, and includes all types of abandonment.

(1) (2) (3) (4) (5)VARIABLES Full Model Low Rent High Rent High Viol. Crime Low Viol. Crime

High Viol. Crime 1.271** 2.172*** 1.061(0.136) (0.497) (0.136)

High Prop. Crime 0.575*** 0.554*** 0.608*** 0.594*** 0.583***(0.0606) (0.102) (0.0783) (0.0772) (0.107)

Owner Occ. 0.731** 0.660 0.757* 0.701* 0.812(0.104) (0.218) (0.122) (0.141) (0.166)

> 5 Units 0.0181*** 0.0386*** 0.0141*** 0.0177*** 0.0188***(0.00215) (0.0107) (0.00190) (0.00265) (0.00364)

Legal Disp. 3.896*** 8.587*** 2.983*** 4.017*** 4.464***(0.851) (3.499) (0.767) (1.020) (1.954)

Pers. Prob. 3.029*** 15.41*** 2.088** 2.647** 3.403***(0.795) (8.259) (0.653) (1.088) (1.190)

Low Rent 1.822*** 2.173*** 1.193(0.195) (0.287) (0.244)

Observations 35,465 8,811 26,654 18,857 16,608Standard Errors in parentheses*** p<0.01, ** p<0.05, * p<0.1

Table IX. Cox Proportional Hazard Model Estimations Under Various Conditions - Dilapidation Abandonment: This table presents hazard ratios for a range of explanatory variables under a range of model specifications. Abandonment here is a subset of the full set of abandoned units, and in particular includes only those units that became abandoned due to the dilapidation criterion.

                   (1)
VARIABLES          Full Model

High Viol. Crime   2.361***
                   (0.198)

High Prop. Crime   0.311***
                   (0.0252)

Owner Occ.         0.249***
                   (0.0579)

> 5 Units          -
                   (-)

Legal Disp.        -
                   (-)

Pers. Prob.        -
                   (-)

Low Rent           17.55***
                   (1.895)

Observations       35,506

Standard errors in parentheses
*** p<0.01, ** p<0.05, * p<0.1

Table X. Cox Proportional Hazard Model Estimations Under Various Conditions - in rem Abandonment: This table presents hazard ratios for a range of explanatory variables under a range of model specifications. Abandonment here is a subset of the full set of abandoned units, and in particular includes only those units that became abandoned due to the loss of the owner’s legal right to the property through in rem status.

B Figures

[Plot: kernel density curves for Percent Abandoned and Percent Vacant. Vertical axis: Density (0 to 20). Horizontal axis: Percent (0 to .5). kernel = epanechnikov, bandwidth = 0.0080]

Figure 1. Kernel Density Estimations for Rates of Vacancy and Abandonment by Community District: This plot shows the kernel density estimates for the rate of vacancy (red, dashed) and the rate of abandonment (blue, solid) within the city’s community districts. A simple visual inspection suggests that the two rates follow different distributions.

[Panels: 2002, 2005, 2008]

Figure 2. First Principal Component of Crime (Violent Crime) per Unit over the Sample Period: This figure shows the first principal component of crime (which corresponds to violent crime) per total housing units in each of the 55 community districts in New York City. We have standardized in this way to account for the fact that community districts are not homogeneous in size. Darker community districts sustained more assaults.

[Panels: 2002, 2005, 2008]

Figure 3. Abandonment Density: This figure shows abandonments per total housing units in each of the 55 community districts in New York City. We have standardized in this way to account for the fact that community districts are not homogeneous in size. Darker community districts have more abandoned units. There seem to be many parallels between the abandonment and assault maps, and this is borne out in their high correlations.

[Box and whisker plots, one panel each for 2002, 2005, and 2008. Horizontal axis: standardized value (−1 to 4). Variables: murder, rape, robbery, assault, burglary, grLarceny, grLarcenyAuto.]

Figure 4. Box and Whisker Plot of Standardized Crime Data: This figure shows box and whisker plots of the mean- and variance-standardized crime variables by year. We can easily see here that there are no significant outliers, and that the Winsorization of the grand larceny variable has been successful. All of this is important in the context of principal component analysis, which is sensitive both to outliers and to the relative scaling of the data. Standardizing the data puts all of the variables on the same scale.
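The two preprocessing steps named in the caption can be sketched as follows. This is a minimal illustration, not the procedure actually used: the 5th and 95th percentile cutoffs, the floor-index quantile rule, the function names, and the input values are all assumptions.

```python
import statistics

def winsorize(values, lower_q=0.05, upper_q=0.95):
    """Clamp values to empirical quantile cutoffs (cutoffs here are illustrative)."""
    ordered = sorted(values)
    n = len(ordered)
    # Simple floor-index quantiles; other quantile conventions differ slightly.
    lo = ordered[int(lower_q * (n - 1))]
    hi = ordered[int(upper_q * (n - 1))]
    return [min(max(v, lo), hi) for v in values]

def standardize(values):
    """Rescale to mean 0 and population standard deviation 1."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]

# Hypothetical counts with one extreme value, for illustration only.
raw = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 500]
clamped = winsorize(raw)
scaled = standardize(clamped)
print(max(clamped))
```

Winsorizing first keeps a single extreme observation from inflating the standard deviation used in the rescaling step.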

[Plots of variable vectors, one panel each for 2002, 2005, and 2008. Horizontal axis: Component 1. Vertical axis: Component 2. Variables: murder, rape, assault, robbery.]

Figure 5. First vs. Second Principal Components for Violent Crime Data: This figure shows the original variable vectors from our data set projected onto the space of the first two principal components. We can see that the first component has very high positive values in all of the violent crime variables, and a moderate value in the abandonment variable, further reinforcing our hypothesis that the crime data is all highly correlated, and moderately correlated with abandonment. The second principal component is low, but negative, in most violent crimes, and high and positive in property crimes, specifically burglary and grand larceny, and high and negative in abandonment. This is interesting, and perhaps does not have an immediately intuitive explanation like the first component does. However, it is unarguably true that this component has something to do with property and property crime, as burglary, grand larceny, and abandonment are its most important factors. Overall, the first component seems to be the violent crime component, while the second component seems to be the abandonment and property crime component.
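To illustrate why strongly correlated variables all load positively on the first component, the sketch below extracts the leading eigenvector of a correlation matrix by power iteration. Power iteration is used here only because it needs no external library; it is not claimed to be the method behind the figure. The correlation matrix is hypothetical and deliberately simple (every pair of variables correlated at 0.8).

```python
import math

def first_component(corr, iterations=500):
    """Leading eigenvector and eigenvalue of a symmetric matrix via power iteration."""
    n = len(corr)
    v = [1.0 / math.sqrt(n)] * n  # start from an equal-weight unit vector
    for _ in range(iterations):
        w = [sum(corr[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient v' C v gives the eigenvalue for the unit vector v.
    eigenvalue = sum(
        v[i] * sum(corr[i][j] * v[j] for j in range(n)) for i in range(n)
    )
    return v, eigenvalue

# Hypothetical correlations among murder, rape, assault, robbery (not thesis data).
corr = [
    [1.0, 0.8, 0.8, 0.8],
    [0.8, 1.0, 0.8, 0.8],
    [0.8, 0.8, 1.0, 0.8],
    [0.8, 0.8, 0.8, 1.0],
]
loadings, eigenvalue = first_component(corr)
print(loadings, eigenvalue)
```

In this equal-correlation case every variable receives the same positive loading, which is the pattern the caption describes for the first component.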

[Scree plots, one panel each for 2002, 2005, and 2008. Horizontal axis: Principal Component (1 to 4). Vertical axis: Eigenvalue.]

Figure 6. Scree Plot of Eigenvalues of the Principal Components for Violent Crime Data: This figure shows a scree plot of the eigenvalues of all eight principal components of the primary model. The eigenvalues are an indication of the amount that the transformation stretches the data, so because we are performing principal component analysis, the eigenvalues tell us the relative importance of the principal components in explaining the variance within the data.

[Plots, one panel each for 2002, 2005, and 2008. Horizontal axis: Principal Component (1 to 4). Left axis: Cumulative Variance Explained (%). Right axis: Variance Explained (%). Series: Cum. Var. Expl., Var. Expl.]

Figure 7. Percent Variance Explained by the Principal Components for Violent Crime Data: This plot shows the variance explained by the individual principal components (right axis) and the cumulative variance of the principal components thus far (left axis). We see that the first violent crime component explains approximately 90% of the variance within the data.
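The link between the eigenvalues in the scree plot and the shares plotted here is a simple normalization: each component's share is its eigenvalue divided by the sum of all eigenvalues. A minimal sketch follows, using hypothetical eigenvalues chosen so that the first share is 90%; they are not read off the figure.

```python
from itertools import accumulate

def variance_explained(eigenvalues):
    """Return (individual shares, cumulative shares) of total variance."""
    total = sum(eigenvalues)
    shares = [ev / total for ev in eigenvalues]
    return shares, list(accumulate(shares))

# Hypothetical eigenvalues for four standardized variables. With standardized
# inputs the eigenvalues of the correlation matrix sum to the variable count.
shares, cumulative = variance_explained([3.6, 0.2, 0.12, 0.08])
print(shares[0], cumulative[-1])
```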

[Plots of variable vectors, one panel each for 2002, 2005, and 2008. Horizontal axis: Component 1. Vertical axis: Component 2. Variables: burglary, grLarceny, grLarcenyAuto.]

Figure 8. First vs. Second Principal Components for Property Crime Data: This figure shows the original variable vectors from our data set projected onto the space of the first two principal components. We can see that the first component has very high positive values in many of the crime variables, and a moderate value in the abandonment variable, further reinforcing our hypothesis that the crime data is all highly correlated, and moderately correlated with abandonment. The second principal component is low, but negative, in most violent crimes, and high and positive in property crimes, specifically burglary and grand larceny, and high and negative in abandonment. This is interesting, and perhaps does not have an immediately intuitive explanation like the first component does. However, it is unarguably true that this component has something to do with property and property crime, as burglary, grand larceny, and abandonment are its most important factors. Overall, the first component seems to be the violent crime component, while the second component seems to be the abandonment and property crime component.

[Scree plots, one panel each for 2002, 2005, and 2008. Horizontal axis: Principal Component (1 to 3). Vertical axis: Eigenvalue.]

Figure 9. Scree Plot of Eigenvalues of the Principal Components for Property Crime Data: This figure shows a scree plot of the eigenvalues of all eight principal components of the primary model. The eigenvalues are an indication of the amount that the transformation stretches the data, so because we are performing principal component analysis, the eigenvalues tell us the relative importance of the principal components in explaining the variance within the data.

[Plots, one panel each for 2002, 2005, and 2008. Horizontal axis: Principal Component (1 to 3). Left axis: Cumulative Variance Explained (%). Right axis: Variance Explained (%). Series: Cum. Var. Expl., Var. Expl.]

Figure 10. Percent Variance Explained by the Principal Components for Property Crime Data: This plot shows the variance explained by the individual principal components (right axis) and the cumulative variance of the principal components thus far (left axis). We see that the violent crime component and the abandonment and property crime component explain approximately 70% of the variance within the data.

[Plot: Nelson-Aalen cumulative hazard estimates. Horizontal axis: analysis time (2002 to 2008). Vertical axis: cumulative hazard (0.00 to 0.15).]

Figure 11. Nelson-Aalen Estimate of the Cumulative Hazard Rate Function: This plot shows the cumulative hazard rate estimates for high crime (red, dashed) and low crime (blue, solid). We can see that the cumulative hazard rate function is much higher for units in community districts deemed high crime. This is in line with our assumptions.
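For reference, the Nelson-Aalen estimator sums, over each observed event time, the number of events divided by the number of units still at risk. The sketch below is a minimal implementation under the usual right-censoring setup; the function name and the durations are hypothetical, and in practice the estimator would be computed separately for the high crime and low crime groups.

```python
from collections import Counter

def nelson_aalen(times, events):
    """Return [(t, H(t))] at each distinct event time.

    times  : observed duration for each unit
    events : 1 if the event (here, abandonment) occurred at that time,
             0 if the unit was censored
    """
    event_counts = Counter(t for t, e in zip(times, events) if e)
    cumulative = 0.0
    estimate = []
    for t in sorted(event_counts):
        at_risk = sum(1 for u in times if u >= t)  # units still under observation
        cumulative += event_counts[t] / at_risk
        estimate.append((t, cumulative))
    return estimate

# Hypothetical durations in years since 2002, for illustration only.
times = [1, 2, 2, 3, 4, 4]
events = [1, 1, 0, 1, 0, 0]
curve = nelson_aalen(times, events)
print(curve)
```

A group whose curve rises faster accumulates hazard more quickly, which is the comparison the figure draws between the two groups.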
