Abandonment and Crime
Matthew Yeaton∗
ECON w4999 - Senior Honors Thesis Workshop
April 18, 2013
Abstract
We construct a novel measure of housing abandonment, and build a proportional hazard model in order to estimate the ratios in probability of various explanatory variables, most notably violent crime. We find that violent crime has a strong causal impact on abandonment, and furthermore that while the existence of a positive causal relationship remains strong across various types of abandonment and various subsets of the data, the magnitude of the impact varies substantially and predictably in accordance with theory along these same measures. We are also able to make similar claims regarding other predictive variables, including rent levels and various large negative idiosyncratic shocks.
∗Matthew Yeaton, [email protected].
I. Introduction
The relationship between abandoned buildings and crime is one that often finds itself at
the forefront in popular culture and political debates about the status of American cities.
Richard M. Daley, the former mayor of Chicago, once famously declared in a press release
that “vacant buildings are ugly. They attract gangs,”1 summarizing public opinion about
the intertwined role of crime and abandonment quite simply and succinctly. Certainly, from
a layman’s perspective, this jump is not hard to fathom. Abandoned buildings can act as
the symbol of a fallen, crime-ridden city. Others may cite the broken windows theory, first
articulated by Wilson and Kelling (1982), such that general property dilapidation acts as a
signal that a neighborhood is ill-cared for and as a signal for general urban disorder, and
thus leads to more serious problems like crime.
Empirically, the correlative relationship between abandonment and crime seems to be
fairly robust. Spelman (1993), for example, found that crime rates on blocks with abandoned
buildings were twice as high as rates on similar blocks without abandoned buildings. The
question then becomes whether or not there is a causal relationship between the two, and if so,
which direction the causality moves. In this vein, the problem of whether or not abandonment
leads to increased crime is fairly well studied, for example by Skogan (1990) and Immergluck and Smith
(2006). However, the causal relationship may work in both directions, rather than simply
abandonment causing increased crime rates. Hence, it may also be true that higher crime
rates increase the rate at which buildings become abandoned in a given area. This problem
is not nearly as well studied, and as such, this paper will attempt to give some insight into
this particular direction of causality.
Simply put, we are interested in whether or not differences in crime rates may impact
the probability of abandonment, i.e. whether or not crime rates are related to the possibly
different rates at which occupied units become abandoned. Theoretically, we may think of
crime impacting abandonment hedonically through a demand channel. In the first place,
1Spielman (2000), p. 24
crime has been demonstrated to negatively impact housing values, as shown in Hellman
and Naroff (1979) or Lynch and Rasmussen (2001). Hence, crime in the area surrounding
a building may decrease the demand for housing within the building and therefore decrease
the profitability of the building, and through this channel increase the likelihood that the
owner abandons the building.
Furthermore, crime may impact abandonment beyond this direct demand channel. Be-
yond profitability, an owner may find maintaining and even visiting a building in a high-crime
area to be particularly undesirable. Specifically, the owner’s fear of violent crime may influ-
ence an owner to abandon a unit at a higher profitability threshold than he or she otherwise
might in the absence of violent crime, even if we hold demand for the housing units within the
building constant (and consequently hold profitability constant). Hence, under this scenario,
violent crime would have an additional positive impact on the probability of abandonment
even if we hold demand constant.
Before we are able to answer any of these questions, though, we must first address what
we mean by abandonment. Hence, there are two primary questions we are interested in
answering: first, how should we define and measure abandonment; and second, in what
scenarios, if any, do high crime rates differentially impact the probability of abandonment
as compared to low crime rates.
A. What Is Abandonment?
As per O’Flaherty (1993), there are three different things we could mean by the term
abandonment: “the cessation of an owner’s provision of maintenance and operating services to
a building,” the “loss of the owner’s legal right to the property,” and “the demolition of a
building.”2 For the purposes of this paper, we are focusing rather generally on all three of
these definitions. Clearly, the second two of these definitions are fairly black-and-white. One
results in a change in property ownership, while the other results in the disappearance of a
2O’Flaherty (1993), pg. 45
building. The first definition, however, is a bit trickier.
It is often ambiguous what constitutes this sort of abandoned property, and in
particular, we cannot equate abandoned with vacant, which is what is measured in most data.
Furthermore, many have found that there is a nontrivial difference in the rates calculated
using the all-inclusive measures and the more limited abandonment measures. Hence,
it pays to attempt to tease out which is which in our data in order to better estimate our
model. Specifically, measurement of this first definition will give us some pause, so that we
ought to think rather carefully about what it means.
A.1. Vacancy vs. Abandonment
To begin addressing the problem of the maintenance cessation definition and measurement
of abandonment, we must consider the distinction between vacancy and abandonment.
Obviously, not all vacant housing units are created equal, nor would all vacant housing units
be of the type to decrease the desirability of nearby properties. For example, an apartment
in mint condition vacated in the past week will not have nearly the same effect as an un-
inhabitable, mold-filled apartment with broken walls and windows in abject condition that
has been abandoned for the past two years. Hence, what we are interested in is not so much
vacancy as abandonment, and we must be careful not to blur the lines between the two (at
least to the extent we are able).
However, discerning the difference between the two is not always straightforward. In fact, Apgar
(1990) found that in many cases housing units in abject condition and on the verge of
abandonment are considered to be vacant in standard vacancy calculations, instead of
being removed from the set of available housing. Similarly, Park (2000) finds that some
of the rising trend in higher vacancy rates of low rent units in particular can be attributed
to an increase in the sample of units in extremely poor condition that perhaps should be
considered abandoned rather than vacant.
Hence, we should not necessarily be satisfied with conventional measures of vacancy in
investigating what impact abandonment density has on unit value. Park (2000) notes that “if
a unit is in such bad shape that it is unlivable—either boarded up or the interior is exposed
to the elements—it is inappropriate to include it in the pool of vacant units available for
rent.”3 With this in mind, Park develops several methods of determining when a housing unit
is vacant and on the verge of abandonment, primarily by looking at the vacancy
state of a unit over time and removing those units that were consistently vacant. Park
found that removing these housing units that were abandoned or on the verge of being
abandoned significantly affected vacancy rates. Our approach will be theoretically similar,
but will include some methodological differences in order to be able to identify those units
which are vacant and on the verge of abandonment. Because of the detailed nature of our
data on vacancy, we will be able to give a well-specified description and estimate of the true
level of impending abandonment in a neighborhood. This will allow us to sufficiently address
this first definition of abandonment.
B. Crime and Demand
Our model will not be a hedonic model, since we are not interested in demand per se, but
rather abandonment. Nevertheless, we are interested in demand inasmuch as it impacts
abandonment. Hence, we may draw inspiration from the hedonics literature in order to help
us create and form hypotheses about our abandonment model.
Hedonic analysis, or hedonic regression, is a method of estimating the contributory effects
of the constituent elements of demand. This method of analyzing the market value of complex
goods, and in particular the market value of housing, was first elaborated by Rosen (1974).
Since then, hedonic models have been used to understand the effects on housing demand of
a variety of housing and locational characteristics. The locational studies are clearly most
interesting to our analysis, and even these range from analysis of the effects of differential
3Park (2000), p. 97.
school quality4 to the impact on demand of nearby subsidized rental housing5.
However, most relevant within the hedonics literature for our purposes are those studies
interested in the relationship between housing demand and crime. These studies suggest
that crime levels do indeed affect property values. Thaler (1978) and Gibbons (2004) found
that property crime rates decreased housing prices in Rochester, New York and London,
respectively. Hellman and Naroff (1979) found a similar relationship using total crimes in
Boston, and Lynch and Rasmussen (2001) saw this again using violent crimes in Jacksonville,
Florida. Interestingly, they also found that while violent crimes were associated with de-
creased demand, property crimes were associated with higher property demand. There are
a few reasons why this might be the case: first, it may be that smaller property crimes are
reported at a higher rate in high-rent neighborhoods than they would be in low-rent neighborhoods, due to a
decreased tolerance for general societal disorder in high-rent neighborhoods and higher rates of
policing and police presence there, as suggested in Lynch and Rasmussen (2001). It also may be
that potential property thieves make the decision to commit crimes in the areas where their
efforts will be most fruitful, i.e. wealthy, high-rent, high demand neighborhoods. Further-
more, while an individual may fear for their personal safety in the case of violent crime, they
will likely not fear for their personal safety in the case of property crime (at least certainly
not to the same extent). Thus, it is unsurprising that at least in some locations the incentive
to commit crimes in wealthy neighborhoods in conjunction with the higher rate of report-
ing property crimes in these neighborhoods will outweigh the potentially negative demand
pressures from higher risks of property crime.
Thus, the impact of property crime on demand is ambiguous, such that in some locations
it is associated with lower demand and in other locations with higher demand.
Violent crime, on the other hand, is unambiguous in decreasing demand. Thus, inasmuch
as demand impacts the abandonment decision through decreased building profitability, we
would expect violent crime to increase the probability of abandonment.
4Cheshire and Sheppard (2004)
5Ellen et al. (2007)
C. Hazard Rate Analysis
We are now faced with a question of model specification. How can we estimate the impact of
crime on the probability of abandonment? Survival analysis, and by extension hazard rate
analysis, can help us to answer this question.
Survival analysis is classically used in the biological sciences in order to estimate patients’
time to death, or in various engineering fields in order to estimate when a piece of machinery is
likely to break down. We can easily draw a parallel to our inquiry. Let us consider a building
an organism with a lifetime. We can then build and estimate a model of our building’s “time
to death.” Thus, rather than dealing with changes in the stock of abandoned buildings
between different neighborhoods, as we might be forced to in a linear model in order to
account for existing stock of abandoned buildings in a neighborhood, this type of model will
allow us to evaluate the factors that impact the probability that an individual unit that was
previously occupied becomes abandoned, and as such should be substantially more accurate
as we will be able to derive results on the micro-level without having to resort to aggregation
of data or having to worry about mis-measurement of the stock of abandoned buildings.
Survival analysis has been used in the context of the housing market before, but never
in this context. It has often been used to examine trends in housing market cycles, as in
Diebold and Rudebusch (1990) or Cunningham and Kolet (2007); however, these are largely
macro-level analyses. The most similar paper with regard to our purpose is Ambrose (2005),
which analyzes the decision to leave assisted housing programs. While this study is the most
methodologically similar, it nonetheless has an entirely separate focus from this study.
Hazard rate analysis is a subset of survival analysis that focuses on the factors that
determine death, system failure, or in our case abandonment, rather than the time to aban-
donment. This will be the focus of our study, as we are primarily interested in the impact of
crime on the probability of abandonment under various contexts, rather than the time until
abandonment. We will develop a model based on Cox (1972) that will allow us to remain
agnostic as to the underlying distributions of our variables.
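As a point of reference, the proportional hazards specification of Cox (1972) is usually written as below. The notation is an assumption made here for illustration and is not necessarily the notation used later in the paper: $h_0(t)$ is an unspecified baseline hazard, $x_i$ is a vector of covariates for unit $i$ (for example violent crime and the NYCHVS indicators), and $\beta$ is the coefficient vector.

```latex
h(t \mid x_i) = h_0(t)\,\exp\!\left(x_i^{\top}\beta\right),
\qquad
\frac{h(t \mid x_i)}{h(t \mid x_j)} = \exp\!\left((x_i - x_j)^{\top}\beta\right).
```

Because $h_0(t)$ cancels in the ratio, $\exp(\beta_k)$ can be read as the hazard ratio associated with a one-unit increase in covariate $k$, without having to specify the shape of the baseline hazard.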
The structure of the rest of the paper will be as follows: first, this paper will expand
upon the earlier vacancy and abandonment literature by using the theoretical framework of
the Park method for discerning the difference between vacant and abandoned housing units
described above. We will then elaborate upon our hazard rate models and methodologies, and
finally present the results of our analysis, where we find that crime has a significant impact
on the probability of abandonment, and furthermore that the impact varies in accordance
with theory in conjunction with various types of abandonment and under a range of subsets
of the data.
II. Data
This study uses data from two primary sources: detailed unit-level property
characteristics data for both vacant and occupied units from the New York City Housing
and Vacancy Survey, and precinct-level crime data from the New York Police Department.
We construct an abandonment measure using data from the New York City Housing and
Vacancy Survey, and a violent crime measure and a property crime measure using the first
principal components of violent crime and property crime, respectively.
A. New York City Housing and Vacancy Survey
The New York City Housing and Vacancy Survey (NYCHVS) is conducted and sponsored
by the New York City Department of Housing Preservation and Development in order
to comply with rent regulation laws in both New York State and New York City, and is
made available by the United States Census Bureau. It is compiled and published every
three years with respondents in a stratified random sample of New York City housing units.
The sample is based on the decennial census, and as such the units in the sample are by
necessity changed every ten years. However, within any given ten-year period, the units in
the sample are kept consistent, so the data form a panel. With this in mind, this
study uses data for the period 2000-2010, which includes the 2002, 2005, and 2008 NYCHVS.
The sample includes both vacant and occupied units, and both publicly and privately
owned units. Sample units for the NYCHVS come from three primary sources. The first is housing units
included in the 2000 decennial census master address files. The second is housing units constructed since
the 2000 decennial census, drawn from a file of addresses listing all residential units, citywide,
issued Certificates of Occupancy for new construction between January 1, 2000 and
October 31 of the year prior to the survey year. Finally, sample units were drawn from a
list of housing units located in structures owned by the city because the owner failed to pay
taxes on the property (in rem units).
For the 2002 and 2005 surveys, approximately 18,000 units throughout New York City
were chosen as a representative sample of the housing in the five boroughs, and each unit
represents approximately 180 similar units. For 2008, the number was increased to 21,000
units, each representing approximately 170 similar units. This was done to take into account
the much higher than typical rate of new construction since the 2005 NYCHVS.
Field representatives hired by the New York City Department of Housing Preservation
and Development conducted interviews between January and May of the given year. Inter-
view response rates were 98%, 96%, and 98% in 2002, 2005, and 2008, respectively.
Table II shows the total number of units in our panel. We have 53,417 unit-year obser-
vations, 4,021 of which are vacant while the remaining 49,396 are occupied. This represents
22,115 unique units over the sample period.
Beyond our abandonment measure, which is discussed in detail below, we include several
other variables from the NYCHVS. Table III presents the number of units for a series of
binary variables by year. Specifically, these indicate whether or not the unit is owner-occupied; whether
or not the unit is currently experiencing legal problems, for example those stemming from
a will, a lawsuit, settlement of an estate, or some other legal matter that places the unit in
limbo; whether or not the owner is experiencing personal problems, for example old age or
sickness; whether or not the building has more than five units; and whether or not the unit
is in the bottom quartile of units in terms of rent for the given year. Note that the totals
for each year, and thus for the full sample, are lower here than in Table II. This is due to entries
missing coverage in one or more of these variables.
Each of these variables should theoretically have an explanatory impact on the probability
of abandonment. If a building is owner occupied, it is less likely to be a purely investment
building, and thus the abandonment decision becomes not only one of profitability but also a
question of simply having a roof over one’s head. Therefore we might expect the probability
of abandonment to be lower in the case of owner occupied buildings.
The motivation to include legal problems and personal problems is very similar. Both re-
flect unexpected negative idiosyncratic shocks that could increase the likelihood of a building
tipping into abandonment. Thus, we should expect these both to increase the probability
of abandonment. As can be seen in Table III, these are very uncommon occurrences, with
incidence hovering between 0.1% and 0.2% of cases.
We include a many-units variable to capture differences in building types and sizes. We
denote a building as a many-unit building if it includes more than five units. For extremely
large and expensive buildings there would need to be an absolutely enormous unexpected
shock to increase the probability of abandonment. Additionally, we might expect the types
of large institutional investors that have access to large apartment buildings to be better able
to estimate the relevance of crime rates in their investment decision. For all of these reasons
we should expect the probability of abandonment to be lower for many-unit buildings as
compared to smaller buildings.
The low rent variable is included to help control for demand. We characterize a unit as
low rent if it is in the bottom quartile for rent for a given year. This partition has been
selected to help us identify those units that might be most vulnerable to abandonment from
a demand perspective. Thus, in the event of additional high levels of violent crime and/or
an unexpected negative shock like those described above, these units would have the highest
probability of becoming abandoned.
B. Abandonment Measure
The abandonment measure used in this paper is derived from several aspects of the New York
City Housing and Vacancy Survey Data. We will outline the various components contribut-
ing to the measure before summarizing the definition of the measure below. Recalling the
definitions of abandonment from O’Flaherty (1993), we can begin to address abandonment
by looking at the second two definitions, namely “the loss of the owner’s legal right to the
property,” and “the demolition of a building.” We thus include a unit if it has in rem
status, or if the unit is being held for planned demolition.
The first definition of abandonment, “the cessation of an owner’s provision of maintenance
and operating services to a building,” is more nuanced, as mentioned above. The NYCHVS
does not report abandonment, only vacancy, and several others, such
as O’Flaherty (1993) and Park (2000), have found that it can often be difficult to distinguish
units that are simply vacant from those that are on the verge of abandonment. However, Park
(2000) and Apgar (1988) find that there are significant differences in vacancy rates that have
been appropriately adjusted to remove these units on the margins of abandonment. Clearly
then, using vacancy as our rough gauge of abandonment is insufficient. Our goal will be to
apply Park’s method but in reverse: while she wanted to exclude the abandoned units to
more accurately measure vacancy, we want to exclude truly vacant units in order to more
accurately measure abandonment.
Park’s primary method of discerning units that are vacant but on the verge of aban-
donment is to remove units from the sample that are abandoned in the subsequent period.
However, unlike in the American Housing Survey (which Park uses), the NYCHVS does not
list a reason for non-response in the survey data. While the NYCHVS lacks this, it does
include a large number of detailed characteristics on each abandoned property, ranging from
amount of time that the housing unit has been vacant to whether or not the unit’s walls are
structurally intact. Hence, we must construct an alternative method of measuring units that
are vacant and on the verge of abandonment. Park (2000) notes that “if a unit is in such bad
shape that it is unlivable—either boarded up or the interior is exposed to the elements—it
is inappropriate to include it in the pool of vacant units available for rent.”6 We can use
this as a guiding principle as we construct our measure.
We begin in a similar way to Park, by using a longitudinal measure. As such, units that
have been consistently vacant for over a year are included in our pool. The New York City
Department of Housing Preservation and Development notes in Housing New York City 2008
that
In the City, which has been characterized by an acute housing shortage for the last several decades, a long-term rental vacancy duration raises questions as to either the absolute desirability of the rental unit within a rent context or its true availability. In other words, in the City’s rental housing market, an increase in vacancies lasting three or more months could mean that these units are probably being rejected by prospective renters as unsuitable or not preferable for one or a combination of reasons. . . [namely that] their housing and/or neighborhood physical and other conditions are not acceptable7.
Thus, in order to remain conservative with respect to our measure, we only include units
with a consistent vacancy duration of one year or more, far longer than the lower bound for
long-term vacancy.
As mentioned above, the NYCHVS includes a large number of property characteristics,
even for vacant units. We can use those characteristics to get a better picture of which
vacant units are in Park’s category “such bad shape that it is unlivable.” With this in
mind, we include several different categories of building dilapidation: problems with walls,
problems with windows, problems with stairways, and problems with floors. These problems
are summarized in Table I below.
6Park (2000), p. 97.
7Lee (2011), p. 376
Table I. Summary of Dilapidation Criteria
Walls: missing outside wall material; major cracks in outside walls; loose or hanging cornice, roofing, or other material
Windows: broken or missing windows; boarded up windows
Stairways: loose, broken, or missing steps; slanted or shifted doorsills or door frames
Floors: sagging or sloping floors; holes or missing flooring
If a unit possesses any of these dilapidation criteria, it is added into the pool. Essentially,
this is anything that would indicate that the unit had not been lived in for some time, or could
not handle a new occupant in the immediate future. These, along with our duration criterion,
are used to define abandonment. Like Park and Apgar, we find that this pool is significantly
different from the vacancy pool at very high significance levels, and, more interestingly, that
its community district distribution is significantly different from that of vacancy. An
explicit definition of our verge-of-abandonment variable is given in equation 1 below:
\[
\text{abandonment} = 1 \quad \text{if}
\begin{cases}
\text{in rem status, OR} \\
\text{scheduled for demolition, OR} \\
\text{vacant for more than one year AND one or more Dilapidation Criteria from Table I}
\end{cases}
\tag{1}
\]
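The rule in equation 1 can be sketched as a small function. This is only an illustration of the logic: the record type and every field name below are hypothetical and are not NYCHVS variable names, and vacancy duration is assumed to be available in months.

```python
from dataclasses import dataclass


@dataclass
class Unit:
    # All field names are hypothetical placeholders, not NYCHVS variables.
    in_rem: bool = False
    held_for_demolition: bool = False
    vacant: bool = False
    months_vacant: int = 0
    wall_problem: bool = False
    window_problem: bool = False
    stairway_problem: bool = False
    floor_problem: bool = False


def is_abandoned(u: Unit) -> int:
    """Return 1 if the unit meets any of the three criteria in equation 1."""
    # One or more dilapidation criteria from Table I.
    dilapidated = (u.wall_problem or u.window_problem
                   or u.stairway_problem or u.floor_problem)
    # Vacant for more than one year (duration assumed to be in months).
    long_vacant = u.vacant and u.months_vacant > 12
    return int(u.in_rem or u.held_for_demolition
               or (long_vacant and dilapidated))
```

For example, a unit that has been vacant for two years with boarded up windows is flagged, while a unit vacant for two years with no dilapidation problems is not.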
Table IV shows tabulations of abandoned and non-abandoned buildings by year. We can see
that about 5% of units in a given year become abandoned. Table V shows various NYCHVS
and crime binary variables by abandonment status. There are a few things that immediately
jump out at us as interesting. First, note that both legal disputes and personal problems
are much more common among abandoned units than might be expected given the relatively
small number of abandoned units. In fact, personal problems are actually more common
under abandonment. Also note that small buildings are abandoned at a much higher rate
than large buildings. We see a similar effect for low rents, such that low rent units are much
more likely to be abandoned than high rent units. Both of these are in accordance with our
intuition and hypotheses, but more formal analysis is required to ensure that these effects
are more than simple correlations.
Figure 3 shows a heat map of abandonment rate by community district for each of the
years in the sample. We notice that abandonment rate seems to be essentially stagnant over
the sample period. Furthermore, we see that abandonment seems to be highly concentrated
in blocks of contiguous community districts. This suggests that there may be larger region
effects at play (potentially crime).
B.1. Comparison of Abandonment to Vacancy
While the construction of this abandonment measure may be interesting academically, the
measure is little more than an interesting interlude if it captures the same thing as vacancy.
Thus, we must confirm the difference between the two measures.
We can begin to do this by considering kernel density estimations for vacancy and aban-
donment, as shown in Figure ??. We aggregate vacancy and abandonment at the level of
the community district and are interested in whether or not the rates of each seem to reflect
the same distribution. Figure ?? seems to suggest that the distributions are indeed different,
and certainly we can see that they have different means, which is confirmed by a Student’s
t-test with p-value < 0.0001. Beyond their means, however, the underlying distributions seem
to be rather different. Abandonment rate is much steeper and tighter than vacancy rate,
and vacancy rate seems to exhibit a second peak just above 10%.
We can test the equality of the distributions by employing a Kolmogorov-Smirnov test.
The K-S test is a non-parametric test of the equality of two distributions, taking as the
null hypothesis that the two distributions are equal. As shown in Table VII, we are able
to conclude with high confidence that these two variables come from different distributions,
and thus that our abandonment measure is capturing something different from vacancy.
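To make the test concrete, the following is a minimal sketch of the two-sample Kolmogorov-Smirnov statistic in plain Python. The function names and the district-level rates are hypothetical and are not the data behind Table VII. The critical value uses the standard large-sample approximation, which is only rough for samples as small as the ones shown.

```python
import math
from bisect import bisect_right


def ks_statistic(a, b):
    """Largest vertical gap between the two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    gap = 0.0
    # The gap can only change at observed values, so checking those suffices.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


def ks_critical_value(n, m, alpha=0.05):
    """Large-sample critical value for sample sizes n and m."""
    c = math.sqrt(-0.5 * math.log(alpha / 2))
    return c * math.sqrt((n + m) / (n * m))


# Hypothetical community district rates, for illustration only.
vacancy_rates = [0.04, 0.06, 0.08, 0.10, 0.11, 0.12]
abandonment_rates = [0.01, 0.02, 0.02, 0.03, 0.03, 0.04]

d = ks_statistic(vacancy_rates, abandonment_rates)
# Reject the null of equal distributions when d exceeds the critical value.
reject_equal = d > ks_critical_value(len(vacancy_rates), len(abandonment_rates))
```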
C. New York City Crime Data
We also include crime data from the New York City Police Department by precinct for the
years 2002, 2005, and 2008, matching our NYCHVS panel. The data include all major
categories of both violent and property crimes, namely murder, rape, robbery, assault,
burglary, grand larceny, and grand larceny auto. Because of the relationship between
precincts and community districts, such that each community district contains almost exactly
either one or two precincts, we are able to move our unit of measure from the level of the
precinct to the level of the community district and hence merge with the NYCHVS panel at
the level of the community district.
We standardize crimes by the number of housing units in the district to account for
differences in both size and densities between the various community districts. Grand larceny
displays outliers, and we Winsorize at the 5% level, which successfully rectifies the problem.
Summary statistics for the crime data are presented in Table ??. We note that there are no
substantial high or low outliers.
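The two preprocessing steps above can be sketched as follows. This is an illustration under stated assumptions rather than the exact procedure used: "the 5% level" is taken here to mean that the lowest and highest 5% of observations are each clamped to the nearest retained value, and the function names and example figures are hypothetical.

```python
def crimes_per_unit(counts, housing_units):
    """Crime counts divided by the number of housing units in each district."""
    return [c / h for c, h in zip(counts, housing_units)]


def winsorize(values, tail=0.05):
    """Clamp the lowest and highest `tail` share of observations.

    Assumes a non-empty list and tail < 0.5.
    """
    ordered = sorted(values)
    k = int(tail * len(ordered))  # observations clamped in each tail
    low, high = ordered[k], ordered[-k - 1]
    return [min(max(v, low), high) for v in values]


# Hypothetical district-level counts with one extreme value.
larceny_counts = list(range(1, 20)) + [1000]
clamped = winsorize(larceny_counts)
```

With fewer than twenty observations and a 5% tail, `k` is zero and the data are returned unchanged, which is worth keeping in mind for small samples.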
While we would like to include all of these crime variables in order to maximize our
measure of crime, the high correlations between the various types of crimes will prevent us
from immediately doing so. For example, correlations between murder, assault, rape, and
robbery range from 80% to over 90%. Thus, we must consider an alternate method of dealing
with these variables if we wish to include them all. Principal component analysis will allow
us to accomplish this goal and include all of these variables in order to capture a
fuller measure of crime.
We will also separate violent crimes (murder, assault, rape, and robbery), from property
crime (burglary, grand larceny, and grand larceny auto) in order to better understand the
possibly different impacts of the two types of crime on the probability of abandonment
discussed above.
C.1. Principal Component Analysis
Principal component analysis is sensitive to the relative scaling and centering of the data,
so we standardize the data so that each of the original variables now has mean of zero and
a variance of one. Principal component analysis is also sensitive to the inclusion of outliers
in the data, but our earlier treatment of outliers using Winsorization should have rectified
this issue. The box and whisker plots for each of the three years presented in Figure ??
confirm this. The figure shows the variance-standardized data, and the whiskers here extend to
one and a half times the interquartile range. Hence, we see some points that fall outside of this
range, but none so egregious as to distort the general shape of the variance of the data.
Overall, the data is generally balanced, and we observe that each variable is on a similar
scale, as desired.
We implement principal component analysis using singular value decomposition (as de-
scribed below) to construct yearly indices of violent crime and property crime. We will
include the percentage of murders, assaults, rapes, and robberies in the violent crime measure, and
of burglaries, grand larcenies, and grand larcenies auto in the property crime measure. This will
successfully deal with the high correlation problem while still guaranteeing that we will capture the
maximum variance in the data. We will then incorporate these composite crime variables into
our analysis.
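As an illustration of how such an index can be built, the sketch below standardizes each crime variable and extracts the first principal component. It deliberately swaps in power iteration on the correlation matrix for the singular value decomposition used in this paper, since power iteration needs only the standard library; for the first component the two give the same result up to sign. The function names are hypothetical, and the sketch assumes the variables are positively correlated and none is constant.

```python
import math


def standardize(column):
    """Rescale one variable to mean zero and (population) variance one."""
    n = len(column)
    mean = sum(column) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in column) / n)
    return [(x - mean) / sd for x in column]


def first_component(columns, iterations=500):
    """Return (eigenvalue, loadings, scores) for the first principal component."""
    z = [standardize(c) for c in columns]
    n, p = len(z[0]), len(z)
    # Correlation matrix of the standardized variables.
    corr = [[sum(a * b for a, b in zip(z[i], z[j])) / n for j in range(p)]
            for i in range(p)]
    # Power iteration: repeatedly apply the matrix and renormalize.
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(corr[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient gives the variance explained by this component.
    eigenvalue = sum(v[i] * sum(corr[i][j] * v[j] for j in range(p))
                     for i in range(p))
    # Index value for each district: projection onto the loadings.
    scores = [sum(v[j] * z[j][k] for j in range(p)) for k in range(n)]
    return eigenvalue, v, scores
```

Here `columns` would hold one list per crime type (for example murder, assault, rape, and robbery rates by community district), and `scores` would serve as the crime index for that year.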
C.2. PCA Crime Indices
One of the benefits of principal component analysis is that we can project the original vari-
ables onto the space of the first few components in order to make meaningful interpretations.
We can do this using the eigenvectors of the principal components. When the eigenvectors of
the principal components are discussed in the context of the original explanatory variables,
we call them scoring coefficients, and can use these coefficients to interpret projections of
the original data into reduced dimension spaces of our new orthogonal basis. We can also
interpret these projections visually, and it is by combining the projection of the
original vectors onto these simplified spaces with the specific scoring coefficients
that we can make the best interpretations.
Because of the variance-preserving nature of principal component analysis, we can also
easily construct an index of crime and abandonment that can embody a large amount of the
variation in the full data in only a few principal components. Recall also that the principal
components are by construction orthogonal, and thus independent under normality. Hence,
we must first decide how many principal components to include in our analysis before moving
on to interpretation of the meaning of the principal components. Roughly speaking, the
eigenvalues denote the stretch of the transformation in the direction of the corresponding
eigenvector, and in this case, we can use the relative magnitude of the eigenvalues as a way to
understand the variance explained by the principal component. By design, the first principal
component contains more information than the second, the second more than the third, and so
forth; however, we must select a point at which to stop including components. Figure ??
shows screeplots of the eigenvalues of the principal components of violent crime, and Figure 9
shows screeplots of the eigenvalues of the principal components of property crime. We first
notice that there is a large amount of similarity between the shapes of the screeplots, which is
not entirely surprising, given that there are only three years between each data collection year.
We can also see that there is a significant drop below one in explanatory ability after the
first component for each of the three screeplots for each crime type, particularly for violent
crime, so preliminarily we will construct our index using the first component. Another way of
thinking about this is by seeing the actual variance explained by each component. Figure 7
shows the actual variance explained by each principal component of violent crime, while
Figure 10 shows the actual variance explained by each component of property crime, as well
as the cumulative variance explained by the principal components. Here we can see that the
first component of violent crime explains over ninety percent of the variation in the data, while
the first component of property crime explains approximately 70% of the variance within
the data, which further confirms that construction of two one-component indices should be
acceptable.
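The selection logic described above can be sketched in a few lines. The eigenvalues below are hypothetical numbers chosen only so that the first share lands near ninety percent; they are not the estimates behind our screeplots, and the function names are ours.

```python
def variance_explained(eigenvalues):
    """Return each component's share of total variance and the running total."""
    total = sum(eigenvalues)
    shares = [ev / total for ev in eigenvalues]
    cumulative = []
    running = 0.0
    for share in shares:
        running += share
        cumulative.append(running)
    return shares, cumulative

def components_above_one(eigenvalues):
    """Count components whose eigenvalue exceeds one (a common rule of thumb)."""
    return sum(1 for ev in eigenvalues if ev > 1.0)

# Hypothetical eigenvalues for four standardized violent crime variables.
# They sum to 4 because each standardized variable contributes unit variance.
eigenvalues = [3.64, 0.20, 0.10, 0.06]
shares, cumulative = variance_explained(eigenvalues)
n_keep = components_above_one(eigenvalues)
```

With standardized inputs an eigenvalue below one means the component carries less variance than a single original variable, which is why the drop below one after the first component supports a one-component index.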
Figure 5 and Figure 8 show our original explanatory variables and original data points
projected onto the space of the first two principal components. We can see that the first
component of violent crime has very high positive values in all of the violent crime variables,
and the first component of property crime has moderate or high positive values in some
of the property crime variables, with grand larceny causing the most trouble. This further
reinforces our hypothesis that the violent crime data is all highly jointly correlated, as we
suspected, while property crime may have more mixed origins and relationships to itself and
other variables of interest. We will be able to investigate this more closely within the context
of our model.
Finally, note the relationships between Figure 2 and Figure 3, which form the basis for
the stylized facts on which we base this study. Building and implementing our model will
allow us to test these hypotheses.
III. Models
We will build a proportional hazards model based on ?. The motivation behind this is
twofold. First, the nonparametric nature of the model will allow us to avoid restrictive
assumptions on our data that will likely not hold. Second, and more importantly, the model
will allow us to estimate the differences in the probabilities that units currently occupied
become abandoned, rather than trying to work with changes in the stock of units, which
is an essentially unmeasurable quantity. However, before we elaborate upon this model, we
must understand its fundamental origins.
A. Hazard Rate Analysis
While linear models may be useful in helping us to understand correlative relationships
between abandonment and crime in New York City, they will not be useful in helping us to
understand any causal relationships that may exist between abandonment and crime due to
the endogeneity and possible simultaneous causality that exists between these variables. In
order to rectify this problem, we can also build a hazard rate model to better understand the
relationship between crime and abandonment. Specifically, hazard rate analysis will allow
us to address the question of the differential probability of a unit becoming abandoned in a
given time period in areas with different crime rates. Within our sample, many units begin
occupied and then subsequently are abandoned in later periods. For the context of our
analysis, we will consider this abandonment our “event,” and crime will be our “treatment.”
Let us define the time that a unit becomes abandoned as T, where T is a random variable
with an underlying continuous probability density, f(t), where t is some time. Then
the cumulative probability, or lifetime distribution function, F(t), and the survival function,
S(t), are denoted
$$F(t) = \int_0^t f(s)\,ds = \Pr(T \le t) = 1 - S(t) \tag{2}$$
i.e. S(t) = Pr(T > t). The survival function gives the probability that the time of abandonment
will be later than t. Let a denote the probability that the unit will be abandoned in the
next interval of time, conditional upon the unit not being abandoned before time t. Then a
will be denoted by
$$a(t, \Delta t) = \Pr(t \le T \le t + \Delta t \mid T \ge t) \tag{3}$$
Then the hazard function, λ, is defined as the event rate at time t conditional on not being
abandoned until time t or later, i.e.
$$\lambda(t) = \lim_{\Delta t \to 0^+} \frac{\Pr(t \le T \le t + \Delta t \mid T \ge t)}{\Delta t} = \frac{f(t)}{S(t)} \tag{4}$$
In other words, the hazard rate indicates the rate at which units are abandoned at time t
conditional upon the unit remaining un-abandoned until time t.
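To make the definition concrete, the following sketch checks numerically that the limit in equation (4) agrees with f(t)/S(t). The constant-hazard (exponential) case and the rate of 0.25 are assumptions chosen purely because they have a closed form; nothing here is estimated from our data.

```python
import math

RATE = 0.25  # hypothetical constant abandonment rate per period

def density(t):
    """f(t) for an exponential time to abandonment."""
    return RATE * math.exp(-RATE * t)

def survival(t):
    """S(t) = Pr(T > t)."""
    return math.exp(-RATE * t)

def hazard(t):
    """lambda(t) = f(t) / S(t)."""
    return density(t) / survival(t)

def conditional_prob(t, dt):
    """Pr(t <= T <= t + dt | T >= t), written in terms of S."""
    return (survival(t) - survival(t + dt)) / survival(t)

# Finite-difference version of the limit in the hazard definition.
approx_hazard = conditional_prob(2.0, 1e-6) / 1e-6
```

In this special case the hazard equals the rate at every t, so the finite-difference quotient should sit very close to 0.25.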
In order to quantify differences in hazard rates between different groups of characteristics
we can utilize a Cox proportional hazards model. We begin with the baseline hazard function
described above. From now on, let us consider this baseline hazard function as λ0(t). Then
the hazard rate function for the Cox model is given by
$$\lambda(t \mid X) = \lambda_0(t) \exp(\beta' X) \tag{5}$$
where X is the vector of explanatory variables. Hence, we can construct a partial likelihood
based on this new hazard function as follows
$$L(\beta) = \prod_{j=1}^{J} \frac{\exp(\beta' X_j)}{\sum_{i \in R_j} \exp(\beta' X_i)} \tag{6}$$
where j indexes the J event times, Xj is the explanatory variable vector for the unit
that became abandoned at time tj, and Rj denotes the risk set at time tj, namely the set of
units which are at risk of abandonment at tj. This will allow us to make claims about the
ratios of the probabilities of abandonment for different crime levels while remaining agnostic
as to the specific underlying hazard rate distribution, a significant advantage of the model.
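A minimal sketch of the log of the partial likelihood in equation (6) is given below. It assumes no tied event times, and the four-unit data set, the variable names, and the single binary covariate are hypothetical; it is not the estimation routine used for our tables.

```python
import math

def linear_predictor(beta, x):
    """Compute beta'x for one unit."""
    return sum(b * v for b, v in zip(beta, x))

def cox_partial_log_likelihood(beta, times, events, covariates):
    """Log partial likelihood, assuming no tied event times."""
    total = 0.0
    for j, (t_j, abandoned) in enumerate(zip(times, events)):
        if not abandoned:
            continue  # censored units enter only through the risk sets
        # Risk set R_j: every unit still under observation at time t_j.
        risk_set = [i for i, t_i in enumerate(times) if t_i >= t_j]
        denom = sum(math.exp(linear_predictor(beta, covariates[i])) for i in risk_set)
        total += linear_predictor(beta, covariates[j]) - math.log(denom)
    return total

# Hypothetical data: observation time, abandonment flag, high-crime dummy.
times = [1, 2, 3, 4]
events = [1, 1, 0, 1]
covariates = [[1], [0], [1], [0]]
loglik_at_zero = cox_partial_log_likelihood([0.0], times, events, covariates)
```

The baseline hazard never appears in the function, which is the sense in which the model stays agnostic about it; a fitted coefficient b would be reported as the hazard ratio exp(b).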
IV. Methodologies
We detail the technique of principal component analysis, which is integral in our violent crime
and property crime measures. Furthermore, in order to better understand the relationship
between abandonment and crime, we will also build a hazard rate (or survival analysis)
model, which we can estimate and test using the various methods described below. This will
help us to deal with the intrinsic endogeneity between abandonment and crime.
A. Principal Component Analysis
As per Jolliffe (2002), the principal components are given by:
$${}_{n}Y^{T}_{p} = {}_{n}X^{T}_{p}\; {}_{p}W_{p} \tag{7}$$
where ${}_{n}Y^{T}_{p} = (y_1, y_2, \ldots, y_p)$ is the matrix of principal components, ${}_{n}X^{T}_{p}$ is the
mean-centered transpose of the data matrix, and ${}_{p}W_{p}$ is the matrix resulting from the
singular value decomposition of the data matrix, ${}_{p}X_{n}$, as shown in equation (8):
$${}_{p}X_{n} = {}_{p}W_{p}\; {}_{p}\Sigma_{n}\; {}_{n}V^{T}_{n} \tag{8}$$
where ${}_{p}W_{p}$ is the matrix of eigenvectors of ${}_{p}X_{n}X^{T}_{p}$ (the covariance matrix, up to a
scaling factor), ${}_{p}\Sigma_{n}$ is a rectangular diagonal matrix whose diagonal entries are nonnegative
real numbers, and ${}_{n}V_{n}$ is the matrix of eigenvectors of ${}_{n}X^{T}_{p}X_{n}$.
Essentially, principal component analysis uses a particular orthogonal transformation
to turn our potentially correlated explanatory variables into a set of linearly uncorrelated
vectors called “principal components.” We can think of this as finding a new orthogonal
basis for the space of our explanatory variables that preserves the “structure” of the data
in the sense of maintaining variance. Furthermore, this new basis is constructed to help
us identify the most important gradients in the data. Hence, the first principal component
is in the direction of maximum variance in the data, the second principal component is in
the direction of maximum variance such that it is orthogonal to the first, and as such will
be in the direction of second most variance, etc. We are in effect “rotating” our data to
identify the directions of maximum variance. Note that, as mentioned above, the principal
components are simply a linear combination of the original variables, albeit chosen such that
the first component contains more information than the second, the second more than the
third, etc. for all p components.
Because the principal components are sensitive to the relative scaling of the vectors in
the data matrix, and can give misleading results if not mean-centered, we have standardized
the crime variables as discussed above before implementing the method.
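The construction of a one-component index can be sketched as follows. Note that this sketch swaps in power iteration on the correlation matrix in place of a singular value decomposition, purely to stay within the Python standard library; both target the same leading direction. The five districts and their rates are made-up numbers, and the function names are ours.

```python
import math
import statistics

def standardize_columns(rows):
    """Give every column mean zero and unit (population) variance."""
    cols = list(zip(*rows))
    means = [statistics.fmean(c) for c in cols]
    sds = [statistics.pstdev(c) for c in cols]
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in rows]

def correlation_matrix(z_rows):
    """Correlation matrix of already standardized rows."""
    n, p = len(z_rows), len(z_rows[0])
    return [[sum(r[a] * r[b] for r in z_rows) / n for b in range(p)] for a in range(p)]

def first_component(matrix, iterations=500):
    """Power iteration: leading eigenvector and eigenvalue of a symmetric matrix."""
    p = len(matrix)
    w = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        nxt = [sum(matrix[a][b] * w[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in nxt))
        w = [x / norm for x in nxt]
    mw = [sum(matrix[a][b] * w[b] for b in range(p)) for a in range(p)]
    return w, sum(wa * ma for wa, ma in zip(w, mw))

# Hypothetical per-unit rates (murder, rape, robbery, assault) for five districts.
data = [
    [0.01, 0.03, 0.5, 0.4],
    [0.02, 0.05, 0.9, 0.7],
    [0.03, 0.07, 1.2, 0.9],
    [0.05, 0.10, 1.8, 1.5],
    [0.07, 0.14, 2.4, 2.0],
]
z = standardize_columns(data)
weights, eigenvalue = first_component(correlation_matrix(z))
index = [sum(w * x for w, x in zip(weights, row)) for row in z]  # crime index
share = eigenvalue / len(weights)  # share of total variance in component one
```

The entries of `weights` play the role of the scoring coefficients, and `index` is the projection of each district onto the first component, which is what a single-variable crime measure would use.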
There are several advantages to principal component analysis that we will be able to
take advantage of in the context of this study. Most importantly, we will be able to include
more rigorous measures of violent crime and property crime without dramatically violating
the independence assumption of the hazard rate model as we would if we included all of our
violent crime or property crime variables within our analysis. Furthermore, because principal
component analysis by design creates the new basis’ coordinates in order of importance, we
can project our original data onto this new basis, and can visualize what was originally high
dimensional data in the space of R2 or R3, which is a significant advantage in helping us to
interpret the meaning of the principal components. Hence, we can summarize the most important
information embodied in these two sets of variables in a single variable while maintaining
the assumptions of our model.
B. Hazard Rate Analysis
In order to undertake hazard rate analysis, we must examine the survival and hazard functions.
We can use the Nelson-Aalen estimator to get a sense of the cumulative hazard rate
function, Λ, where unsurprisingly
$$\Lambda(t) = \int_0^t \lambda(s)\,ds \tag{9}$$
where λ is the hazard rate function described in our model above. The Nelson-Aalen es-
timator is non-parametric, and is able to handle censoring of data, so we can both remain
agnostic with regard to the underlying abandonment probability distributions and include
the large number of right-censored units in our data. The Nelson-Aalen estimator is given
by
$$H(t) = \sum_{t_i \le t} \frac{d_i}{n_i} \tag{10}$$
where di is the number of abandonments at time ti, and ni is the total number of units at
risk at time ti. This method will give us some preliminary indication of the differences in
cumulative hazard rates for units across various crime parameters.
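The estimator in equation (10) is simple enough to sketch directly. The six units, their observation times (coded here as survey waves 1 to 3), and the function name are hypothetical.

```python
def nelson_aalen(times, events):
    """Return [(t_i, H(t_i))] at each distinct event time."""
    event_times = sorted({t for t, e in zip(times, events) if e})
    cumulative = 0.0
    estimate = []
    for t_i in event_times:
        d_i = sum(1 for t, e in zip(times, events) if e and t == t_i)  # abandonments
        n_i = sum(1 for t in times if t >= t_i)                        # units at risk
        cumulative += d_i / n_i
        estimate.append((t_i, cumulative))
    return estimate

# Hypothetical units: wave at which each unit is last observed, and whether
# it was abandoned (1) or right-censored (0) at that wave.
times = [1, 1, 2, 3, 3, 3]
events = [1, 0, 1, 1, 1, 0]
cumulative_hazard = nelson_aalen(times, events)
```

Censored units never add to the numerator, but they remain in the at-risk count until they drop out, which is how the estimator uses the right-censored observations.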
We will utilize the Peto log-rank and the Wilcoxon-Breslow-Gehan statistics to test the
null hypothesis that the hazard rates are the same across the various crime parameters. Both
statistics are given by
$$Z = v' V^{-1} v \tag{11}$$
where V is the covariance matrix of v, and v is a vector whose components are given by
$$v_i = \sum_{j=1}^{J} w_j (d_{ij} - e_{ij}) \tag{12}$$
where i = 1, ..., I indexes the I distinct groups, the summation runs over the J unique
event times, dij is the number of abandonments in group i at time j, eij is the
expected number of abandonments in group i at time j under the null hypothesis, and wj is
the weight for time period j, where wj = 1 ∀j for the Peto log-rank statistic and wj is the
number of units at risk for abandonment at each time j for the Wilcoxon-Breslow-Gehan
statistic. Each is distributed
χ² with degrees of freedom equal to the rank of V. The Cox proportional hazard model can
easily be estimated using maximum likelihood, so we will not go into any depth as to its
estimation.
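For the special case of two groups, the quadratic form in equation (11) collapses to a squared sum divided by a variance, which can be sketched as below. The hypergeometric variance term is the standard two-group form and is an assumption about how V is computed here; the data and names are hypothetical.

```python
def weighted_logrank(times, events, groups, weight="logrank"):
    """Two-group statistic v^2 / V; chi-squared with one degree of freedom under the null."""
    v = 0.0
    var = 0.0
    for t_j in sorted({t for t, e in zip(times, events) if e}):
        at_risk = [g for t, g in zip(times, groups) if t >= t_j]
        n_j = len(at_risk)
        n_1j = sum(1 for g in at_risk if g == 1)
        d_j = sum(1 for t, e in zip(times, events) if e and t == t_j)
        d_1j = sum(1 for t, e, g in zip(times, events, groups)
                   if e and t == t_j and g == 1)
        # Unit weights give the log-rank test; at-risk counts give the
        # Wilcoxon-Breslow-Gehan test.
        w_j = 1.0 if weight == "logrank" else float(n_j)
        e_1j = d_j * n_1j / n_j  # expected abandonments in group 1
        v += w_j * (d_1j - e_1j)
        if n_j > 1:
            var += (w_j ** 2) * d_j * (n_1j / n_j) * (1 - n_1j / n_j) * (n_j - d_j) / (n_j - 1)
    return v * v / var

# Hypothetical units: group 1 = high violent crime district, 0 = low.
times = [1, 2, 3, 4, 5, 6]
events = [1, 1, 1, 1, 1, 1]
groups = [1, 1, 1, 0, 0, 0]
logrank_stat = weighted_logrank(times, events, groups, weight="logrank")
wilcoxon_stat = weighted_logrank(times, events, groups, weight="wilcoxon")
```

Because the Wilcoxon-Breslow-Gehan weights shrink as the risk set shrinks, that version places more emphasis on early differences between the groups than the log-rank version does.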
V. Results
We estimate sets of hazard ratios for three different types of abandonment, and for various
subsets of variables in order to tease out the various relationships within the data. Most
importantly we are able to make some claims about the conditional impact of violent crime
on different subsets of both abandonment and the various explanatory variables.
We begin by using the full set of abandoned units, before estimating specialized models
for abandonment due to dilapidation and abandonment due to in rem status. This ability
to measure the impact of our explanatory variables on different types of abandonment is an
extremely interesting and valuable benefit of our abandonment measure.
A. Hazard Rate and Hazard Ratio Analysis
The various explanatory variables are as described above, with the addition of the high
violent crime and high property crime variables. Units whose community districts have the
first principal component of violent crime and property crime over the period above the
city-wide median are denoted high violent crime and high property crime, respectively.
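The median split just described could be sketched as follows. The district labels and index values are hypothetical, as is the function name.

```python
import statistics

def high_crime_flags(index_by_district):
    """Flag districts whose crime index lies strictly above the city-wide median."""
    median = statistics.median(index_by_district.values())
    return {district: score > median for district, score in index_by_district.items()}

# Hypothetical first-principal-component scores for four districts.
violent_index = {"A": -1.2, "B": -0.3, "C": 0.4, "D": 1.1}
high_violent_crime = high_crime_flags(violent_index)
```

Each unit would then inherit the flag of the community district in which it is located.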
We begin by analyzing the cumulative hazard rate function, Λ, which we can estimate
with the Nelson-Aalen estimator. This is plotted in Figure 11 below. A visual inspection
shows that the cumulative hazard rate is consistently much higher for units in high violent
crime districts than in low violent crime districts, which is unsurprising considering our hypotheses.
We then test the null hypothesis that the underlying hazard rates are the same, using
the Peto log-rank and Wilcoxon-Breslow-Gehan statistics, and find that both reject the
null at extremely high significance levels (p-values <0.00001). Hence, we can conclude that the
probability of abandonment differs significantly with the crime level in the unit’s district.
Using the Cox proportional hazard model, we can quantify the difference in probabilities,
and control for a range of other variables. A hazard ratio is the ratio of the hazard rates,
i.e. of the conditional probabilities of abandonment, for two groups.
Note the hazard ratio (1.981) of high violent crime in the first model estimation in Table VIII.
Thus, the probability that a unit will be abandoned in the next period, conditional upon the
unit not being abandoned in an earlier period, is 1.981 times as high for units in high violent
crime districts as for units in low violent crime districts.
Table VIII shows the full model and various subset models for the full set of abandoned
units, Table IX shows the full model and various subset models for the set of units that have
become abandoned due to dilapidation, and Table X shows the full model for the set of units
that have become abandoned due to in rem status. Most importantly, we should note that
violent crime is always highly significant for the full model, and only loses its significance
for the high rent model. Furthermore, the hazard ratio for violent crime is always largest
under the low rent subset, indicating that the impact of surrounding violent crime on the
probability of abandonment is most important when the distance to abandonment is lowest.
As discussed earlier, we would expect violent crime to be most important for dilapidation
abandonment, and we see that the distance or ratio between the low rent and high rent
violent crime hazard ratio is largest in this situation.
We notice that rent level is most important under in rem abandonment, indicating that
this decision may be primarily economic (i.e. exercising the option of abandonment) as
opposed to the possibly larger range of effects at play under dilapidation abandonment.
Finally, we see that legal disputes and personal problems are most important for dilapidation
abandonment when rent is low. This would seem to indicate that risk preferences play a part
in the abandonment process. Specifically, when an individual either suffers a large physical
or financial setback due to sickness or injury, or the ownership of the unit is in question,
units whose profitability and value were already marginal dramatically increase in probability
of abandonment.
Thus we see that the impact of crime seems to vary predictably in magnitude for each
of our abandonment measures. Several other measures, such as sudden negative shocks like
legal trouble or notable sickness seem to exhibit similar effects.
We also considered models that included change in crime rather than level of crime, but
found them to be highly insignificant. While higher crime levels increased the probability that
a unit would become abandoned, we found that higher changes in crime actually decreased
the probability that a unit would become abandoned. One possible explanation is that agents
are not very sensitive to small changes in crime when making abandonment decisions, so
such changes might not factor very heavily. Change in assaults also had much less variation between
community districts than level, so that could also explain why it seems to be less of a factor
within this time period. Since changes are small and relatively homogeneous, it could be
that the prevailing idea of a neighborhood has not changed, or at least has not had time
to change, and as such has not been able to affect abandonment decisions. It could also
be because the areas with the smallest change in crime (even smallest percent change in
crime) already have very low crime, so more change is not really relevant to the decision.
What I think is most likely, however, is that decreases are mostly taking place in high crime
neighborhoods, so all this is picking up is the level, but in reverse.
VI. Conclusion
Using our novel measure of housing abandonment and our proportional hazard model, we
have been able to estimate the ratios in probability of various explanatory variables, most
notably violent crime. We find that violent crime has a strong causal impact on abandon-
ment, and furthermore that while the existence of a positive causal relationship remains
strong across various types of abandonment and various subsets of the data, the magnitude
of the impact varies substantially and predictably in accordance with theory along these
same measures. We are also able to make similar claims regarding other predictive variables,
including rent levels and various large negative idiosyncratic shocks.
The contribution of this paper is twofold. First, we are able to suggest (for what I believe
is the first time) that violent crime has a causal impact on the probability of abandonment.
Furthermore, through our constructed abandonment measure and our acknowledgement of
the various types of abandonment, we are able to measure the changing effects of our ex-
planatory variables on types of abandonment that may have very different theoretical and
empirical causes. Here, we see that relatively static demand-decreasing pressures,
such as rent levels or crime, may have an entirely different impact on differential
probabilities of abandonment than more instantaneous negative shocks, like legal troubles
or dramatically poor health, necessitating the use of more nuanced and specific policies to
address what appear to be somewhat disparate problems.
References
Brent W. Ambrose. A hazard rate analysis of leavers and stayers in assisted housing pro-
grams. Cityscape: A Journal of Policy Development and Research, 8(2), 2005. URL
http://www.huduser.org/periodicals/cityscpe/vol8num2/ch4.pdf.
William C. Apgar. Rental housing in the U.S. The Joint Center for Housing Studies of
Harvard University - Working Paper, W88-1, 1988.
William C. Apgar. Preservation of existing housing: A key element in a revitalized housing
policy. The Joint Center for Housing Studies of Harvard University - Working Paper,
W90-1, 1990.
Paul Cheshire and Stephen Sheppard. Capitalising the value of free schools: The impact of
supply characteristics and uncertainty. Economic Journal, 114(499):F397–F424, November
2004. URL http://ideas.repec.org/a/ecj/econjl/v114y2004i499pf397-f424.html.
D. R. Cox. Regression models and life-tables. Journal of the Royal Statistical Society. Series
B (Methodological), 34(2):187–220, 1972. ISSN 00359246. URL http://www.jstor.
org/stable/2985181.
Rose Cunningham and Ilan Kolet. Housing market cycles and duration dependence in the
United States and Canada. 07(2), 2007. URL http://ideas.repec.org/p/bca/bocawp/
07-2.html.
Francis X. Diebold and Glenn D. Rudebusch. A nonparametric investigation of duration
dependence in the American business cycle. Journal of Political Economy, 98(3):596–
616, 1990. ISSN 00223808. URL http://www.jstor.org/stable/2937701.
Ingrid Gould Ellen, Amy Ellen Schwartz, Ioan Voicu, and Michael H. Schill. Does federally
subsidized rental housing depress neighborhood property values? Journal of Policy Analysis
and Management, 26(2):257–280, 2007. ISSN 1520-6688. doi: 10.1002/pam.20247.
URL http://dx.doi.org/10.1002/pam.20247.
Steve Gibbons. The costs of urban property crime. The Economic Journal, 114(499):
F441–F463, 2004. ISSN 1468-0297. doi: 10.1111/j.1468-0297.2004.00254.x. URL http:
//dx.doi.org/10.1111/j.1468-0297.2004.00254.x.
Daryl A. Hellman and Joel L. Naroff. The impact of crime on urban residential property
values. Urban Studies, 16(1):105–112, 1979. doi: 10.1080/713702454. URL http://usj.
sagepub.com/content/16/1/105.short.
D. Immergluck and G. Smith. The impact of single-family mortgage foreclosures on neigh-
borhood crime. Housing Studies, 21(6):851–866, 2006. doi: 10.1080/02673030600917743.
Moon Wha Lee. Housing New York City 2008. Department of Housing Preservation and
Development, 2011.
Allen Lynch and David Rasmussen. Measuring the impact of crime on house prices. Applied
Economics, 33(15):1981–1989, 2001. URL http://EconPapers.repec.org/RePEc:taf:
applec:v:33:y:2001:i:15:p:1981-1989.
Brendan O’Flaherty. Abandoned buildings: A stochastic analysis. Journal of Urban Eco-
nomics, 34(1):43 – 74, 1993. ISSN 0094-1190. doi: 10.1006/juec.1993.1025. URL
http://www.sciencedirect.com/science/article/pii/S0094119083710259.
June Ying Shann-Hwa Park. Increased homelessness and low rent housing vacancy
rates. Journal of Housing Economics, 9(1-2):76 – 103, 2000. ISSN 1051-1377. doi:
10.1006/jhec.2000.0263. URL http://www.sciencedirect.com/science/article/pii/
S1051137700902638.
Sherwin Rosen. Hedonic prices and implicit markets: Product differentiation in pure
competition. Journal of Political Economy, 82(1):34–55, Jan.-Feb. 1974. URL http:
//ideas.repec.org/a/ucp/jpolec/v82y1974i1p34-55.html.
Wesley G. Skogan. Disorder and Decline: Crime and the Spiral of Decay in American
Neighbourhoods. University of California Press, 1990. ISBN 9780520076938. URL
http://books.google.com/books?id=ASrAMJh7LngC.
William Spelman. Abandoned buildings: Magnets for crime? Journal of Criminal Justice,
21(5):481–495, 1993. URL http://EconPapers.repec.org/RePEc:eee:jcjust:v:21:y:
1993:i:5:p:481-495.
Fran Spielman. City to target hazards of abandoned buildings. Chicago Sun Times, 15
March:24, 2000.
Richard Thaler. A note on the value of crime control: Evidence from the property mar-
ket. Journal of Urban Economics, 5(1):137 – 145, 1978. ISSN 0094-1190. doi: 10.1016/
0094-1190(78)90042-6. URL http://www.sciencedirect.com/science/article/pii/
0094119078900426.
James Q. Wilson and George L. Kelling. Broken windows: The police and neighborhood
safety. The Atlantic Monthly, 1982.
Appendices
A Tables
Year     Occupied   Vacant    Total
2002       15,894    1,263   17,157
2005       15,547    1,287   16,834
2008       17,955    1,471   19,426
Total      49,396    4,021   53,417

Table II. Summary Tabulations for New York City Housing and Vacancy Survey by Occupancy Status: This table presents the number of units by occupancy status for each of the years in our panel. We see that
Year    Owner Occ.       Legal Disp.    Pers. Prob.    Many Units        Low Rent         Total
2002    2006   17.2%     14   0.1%       8   0.1%      11501   98.8%     2909   25.0%     11638
2005    1794   15.8%     19   0.2%       5   0.1%      11241   98.8%     2842   25.0%     11375
2008    1906   14.7%     28   0.2%      11   0.1%      12884   99.1%     3095   23.8%     12999
Total   5706   15.8%     61   0.2%      24   0.1%      35626   98.9%     8846   24.6%     36012

Table III. Summary Tabulations for New York City Housing and Vacancy Survey Binary Variables: This table presents the number of units in each of the following categories by year: whether or not the unit is owner-occupied, whether or not the unit is currently experiencing legal problems (i.e. those stemming from a will, a lawsuit, settlement of an estate, or some other legal matter that places the unit in limbo), whether or not the owner is experiencing personal problems (i.e. old age or sickness) that impede the rent or sale of a unit, whether or not the building has more than five units, and whether or not the unit is in the bottom quartile of units in terms of rents.
Year     Unabandoned   Abandoned    Total
2002          11,013         608   11,621
2005          10,806         547   11,353
2008          12,366         592   12,958
Total         34,185       1,747   35,932
Table IV. Summary Tabulation for Abandoned Units: This table presents a tabulation of abandoned and unabandoned units by year. We see that abandonment is fairly consistent throughout the period, but is much lower than vacancy. We see that the rate of new abandonment for the entire city is 5.5%, 5.2%, and 3.4% for 2002, 2005, and 2008, respectively.
Variable               Unabandoned   Abandoned   Total
Low Violent Crime            16401         479   16880
High Violent Crime           17903        1265   19168
Low Property Crime           16638        1296   17934
High Property Crime          17666         448   18114
Not Owner Occ.               28693        1639   30332
Owner Occ.                    5611         105    5716
Few Units                      233         165     398
Many Units                   34071        1579   35650
No Legal Disp.               34269        1715   35984
Legal Disp.                     35          29      64
No Pers. Prob.               34293        1730   36023
Pers. Prob.                     11          14      25
High Rents                   26636         436   27072
Low Rents                     7668        1308    8976

Table V. NYCHVS and Crime Variables by Abandonment Status: This table shows tabulations for our explanatory variables by abandonment. There are a few things that jump out at us as particularly interesting. First, note that both legal disputes and personal problems are much more common among abandoned units than might be expected by the relatively small number of abandoned units. Also note that small buildings are abandoned at a much higher rate than large buildings. Both of these are in accordance with our intuition and hypotheses.
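The comparisons drawn in the caption can be made explicit by converting the counts in Table V into within-category abandonment shares. The counts below are copied from the table; the dictionary layout and function name are ours, introduced only for this illustration.

```python
# (unabandoned, abandoned) counts taken from Table V.
counts = {
    "Few Units": (233, 165),
    "Many Units": (34071, 1579),
    "No Legal Disp.": (34269, 1715),
    "Legal Disp.": (35, 29),
    "No Pers. Prob.": (34293, 1730),
    "Pers. Prob.": (11, 14),
}

def abandonment_share(unabandoned, abandoned):
    """Share of units in a category that are abandoned."""
    return abandoned / (unabandoned + abandoned)

shares = {name: abandonment_share(*pair) for name, pair in counts.items()}

for name, share in shares.items():
    print(f"{name:<16}{share:6.1%}")
```

These are raw shares with no controls, so they are descriptive only; the hazard ratios in the model tables are the conditional counterparts.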
VARIABLES             Mean     SD   Median     P5    P95
Murder                0.03   0.03     0.02   0.00   0.07
Rape                  0.07   0.04     0.06   0.01   0.15
Robbery               1.23   0.65     1.17   0.28   2.42
Assault               0.90   0.56     0.80   0.17   2.10
Burglary              1.12   0.40     1.15   0.48   1.88
Grand Larceny         2.15   1.30     1.81   1.01   4.62
Grand Larceny Auto    0.70   0.34     0.69   0.18   1.34

Table VI. Summary Statistics for Crime Data: Reporting unit is the Community District. Each variable is per total housing units in each community district. We have standardized in this way to account for the fact that community districts are not precisely homogeneous in size. We have Winsorized grand larceny at the 5% level, and subsequently we do not observe any outliers. No other variables exhibit outliers.
Smaller group     D Stat.    P-value    Corrected
0:                 0.055      0.612
1:                -0.449      0.000
Combined K-S:      0.449     <0.001     <0.001

Table VII. Two-sample Kolmogorov-Smirnov test for equality of Abandonment and Vacancy distribution functions: This table presents the results of a Kolmogorov-Smirnov test on the equality of the distributions of the percentage of abandonment and the percentage of vacancy at the level of the community district. The K-S test is a non-parametric test of the equality of two distributions, taking as the null hypothesis that the two distributions are equal. We are able to conclude with high confidence that these two variables come from different distributions, and thus that our abandonment measure is capturing something different from vacancy.
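The combined D statistic reported in the table is the largest gap between two empirical distribution functions, which can be sketched as below. The two samples are hypothetical district-level rates, not our data, and the sketch computes only the statistic, not its p-value.

```python
def ks_statistic(sample_a, sample_b):
    """Largest absolute gap between the two empirical distribution functions."""
    points = sorted(set(sample_a) | set(sample_b))
    n_a, n_b = len(sample_a), len(sample_b)
    gap = 0.0
    for x in points:
        cdf_a = sum(1 for v in sample_a if v <= x) / n_a
        cdf_b = sum(1 for v in sample_b if v <= x) / n_b
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap

# Hypothetical abandonment and vacancy rates for four districts each.
abandonment_rates = [0.02, 0.03, 0.05, 0.06]
vacancy_rates = [0.04, 0.07, 0.08, 0.09]
d_stat = ks_statistic(abandonment_rates, vacancy_rates)
```

A value of D near zero would mean the two rates are distributed almost identically across districts, while a large D is evidence that abandonment and vacancy are measuring different things.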
                    (1)          (2)         (3)          (4)                (5)
VARIABLES           Full Model   Low Rent    High Rent    High Viol. Crime   Low Viol. Crime

High Viol. Crime    1.981***     2.048***    1.613***
                    (0.133)      (0.173)     (0.180)
High Prop. Crime    0.383***     0.364***    0.495***     0.407***           0.319***
                    (0.0246)     (0.0289)    (0.0555)     (0.0288)           (0.0509)
Owner-Occupied      0.467***     0.411***    0.570***     0.333***           0.589***
                    (0.0557)     (0.0774)    (0.0883)     (0.0588)           (0.105)
> 5 Units           0.0479***    0.174***    0.0211***    0.0538***          0.0340***
                    (0.00453)    (0.0383)    (0.00247)    (0.00623)          (0.00562)
Legal Disp.         2.064***     3.000***    4.204***     1.748**            4.571***
                    (0.462)      (1.087)     (1.219)      (0.480)            (1.787)
Pers. Prob.         5.180***     3.831*      3.114***     8.596***           3.706***
                    (1.437)      (2.734)     (0.945)      (3.633)            (1.377)
Low Rent            6.291***                              6.430***           5.892***
                    (0.406)                               (0.518)            (0.649)

Observations        35,003       8,393       26,610       18,524             16,479

Standard Errors in parentheses
*** p<0.01, ** p<0.05, * p<0.1

Table VIII. Cox Proportional Hazard Model Estimations Under Various Conditions: This table presents hazard ratios for a range of explanatory variables under a range of model specifications. Abandonment here is the full sample of abandoned units, and includes all types of abandonment.
                    (1)          (2)         (3)          (4)                (5)
VARIABLES           Full Model   Low Rent    High Rent    High Viol. Crime   Low Viol. Crime

High Viol. Crime    1.271**      2.172***    1.061
                    (0.136)      (0.497)     (0.136)
High Prop. Crime    0.575***     0.554***    0.608***     0.594***           0.583***
                    (0.0606)     (0.102)     (0.0783)     (0.0772)           (0.107)
Owner Occ.          0.731**      0.660       0.757*       0.701*             0.812
                    (0.104)      (0.218)     (0.122)      (0.141)            (0.166)
> 5 Units           0.0181***    0.0386***   0.0141***    0.0177***          0.0188***
                    (0.00215)    (0.0107)    (0.00190)    (0.00265)          (0.00364)
Legal Disp.         3.896***     8.587***    2.983***     4.017***           4.464***
                    (0.851)      (3.499)     (0.767)      (1.020)            (1.954)
Pers. Prob.         3.029***     15.41***    2.088**      2.647**            3.403***
                    (0.795)      (8.259)     (0.653)      (1.088)            (1.190)
Low Rent            1.822***                              2.173***           1.193
                    (0.195)                               (0.287)            (0.244)

Observations        35,465       8,811       26,654       18,857             16,608

Standard Errors in parentheses
*** p<0.01, ** p<0.05, * p<0.1

Table IX. Cox Proportional Hazard Model Estimations Under Various Conditions - Dilapidation Abandonment: This table presents hazard ratios for a range of explanatory variables under a range of model specifications. Abandonment here is a subset of the full set of abandoned units, and in particular only includes those units that became abandoned due to the dilapidation criterion.
                    (1)
VARIABLES           Full Model

High Viol. Crime    2.361***
                    (0.198)
High Prop. Crime    0.311***
                    (0.0252)
Owner Occ.          0.249***
                    (0.0579)
> 5 Units           -
                    (-)
Legal Disp.         -
                    (-)
Pers. Prob.         -
                    (-)
Low Rent            17.55***
                    (1.895)

Observations        35,506

Standard Error in parentheses
*** p<0.01, ** p<0.05, * p<0.1

Table X. Cox Proportional Hazard Model Estimations Under Various Conditions - in rem Abandonment: This table presents hazard ratios for a range of explanatory variables under a range of model specifications. Abandonment here is a subset of the full set of abandoned units, and in particular only includes those units that became abandoned due to the loss of the owner’s legal right to the property through in rem status.
B Figures
[Figure 1 plot: kernel density curves for Percent Abandoned and Percent Vacant; x-axis: Percent; y-axis: Density; kernel = epanechnikov, bandwidth = 0.0080]

Figure 1. Kernel Density Estimations for Rates of Vacancy and Abandonment by Community District: This plot shows the kernel density estimates for rates of vacancy (red, dashed) and rates of abandonment (blue, solid) within the city’s community districts. A simple visual inspection seems to suggest that the two rates follow different distributions.
[Figure 2 panels: 2002, 2005, 2008]

Figure 2. First Principal Component of Crime (Violent Crime) per Unit over the Sample Period: This figure shows the first principal component of crime (which corresponds to violent crime) per total housing units in each of the 55 community districts in New York City. We have standardized in this way to account for the fact that community districts are not homogeneous in size. Darker community districts sustained more assaults.
[Figure 3 panels: 2002, 2005, 2008]

Figure 3. Abandonment Density: This figure shows abandonments per total housing units in each of the 55 community districts in New York City. We have standardized in this way to account for the fact that community districts are not homogeneous in size. Darker community districts have more abandoned units. There seem to be many parallels between the abandonment and assault maps, and this is borne out in their high correlations.
[Figure 4 plots omitted. Panels: 2002, 2005, 2008. Horizontal axis: standardized value (-1 to 4). Variables: murder, rape, robbery, assault, burglary, grLarceny, grLarcenyAuto.]
Figure 4. Box and Whisker Plot of Standardized Crime Data: This figure shows box and whisker plots of the mean- and variance-standardized crime variables by year. We can easily see here that there are no significant outliers, and that the Winsorization of the grand larceny variable has been successful. All of this is important in the context of principal component analysis, which is sensitive both to outliers and to the relative scaling of the data. Standardizing the data puts all of the variables on the same scale.
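The two preprocessing steps named in the Figure 4 caption can be sketched as follows. This is a minimal illustration under assumptions of ours: the percentile cutoffs (5th and 95th) and the counts are hypothetical, since the caption does not state the cutoffs used, and the function names are illustrative.

```python
import statistics

def winsorize(values, lower_pct=0.05, upper_pct=0.95):
    """Clip values to the empirical lower and upper percentile cutoffs."""
    ordered = sorted(values)
    n = len(ordered)
    low = ordered[int(lower_pct * (n - 1))]
    high = ordered[int(upper_pct * (n - 1))]
    return [min(max(v, low), high) for v in values]

def standardize(values):
    """Subtract the mean and divide by the (population) standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]

# Hypothetical grand larceny counts with one extreme district (NOT thesis data).
grand_larceny = [120, 135, 150, 160, 170, 180, 190, 210, 230, 2500]

clipped = winsorize(grand_larceny)   # the extreme value is pulled in to 230
z_scores = standardize(clipped)      # mean 0, variance 1

print(max(clipped), round(max(z_scores), 2))
```

Winsorizing before standardizing keeps a single extreme district from inflating the standard deviation, which would otherwise compress every other observation toward zero.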
[Figure 5 plots omitted. Panels: 2002, 2005, 2008. Horizontal axis: Component 1. Vertical axis: Component 2. Variable vectors: murder, rape, assault, robbery.]
Figure 5. First vs. Second Principal Components for Violent Crime Data: This figure shows the original variable vectors from our data set projected onto the space of the first two principal components. We can see that the first component has very high positive values in all of the violent crime variables, and a moderate value in the abandonment variable, further reinforcing our hypothesis that the crime data are all highly correlated, and moderately correlated with abandonment. The second principal component is low, but negative, in most violent crimes, high and positive in property crimes, specifically burglary and grand larceny, and high and negative in abandonment. This is interesting, and perhaps does not have an immediately intuitive explanation like the first component does. However, this component clearly has something to do with property and property crime, as burglary, grand larceny, and abandonment are its most important factors. Overall, the first component seems to be the violent crime component, while the second component seems to be the abandonment and property crime component.
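The loadings plotted in Figure 5 are the entries of the eigenvectors of the correlation matrix of the standardized variables. As a rough sketch of how the first component arises, the code below recovers the dominant eigenvector by power iteration. The correlation matrix is hypothetical and chosen only to mimic strongly correlated crime variables; power iteration is our choice for a dependency-free illustration and is not necessarily the routine used to produce the figure.

```python
import math

def mat_vec(matrix, vec):
    """Multiply a square matrix (a list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vec)) for row in matrix]

def first_component(matrix, iterations=200):
    """Power iteration: dominant eigenvalue and unit-length eigenvector."""
    n = len(matrix)
    vec = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        product = mat_vec(matrix, vec)
        norm = math.sqrt(sum(p * p for p in product))
        vec = [p / norm for p in product]
    # Rayleigh quotient of the converged unit vector gives the eigenvalue.
    eigenvalue = sum(v * p for v, p in zip(vec, mat_vec(matrix, vec)))
    return eigenvalue, vec

# Hypothetical correlation matrix for (murder, rape, assault, robbery).
corr = [
    [1.00, 0.70, 0.80, 0.75],
    [0.70, 1.00, 0.65, 0.70],
    [0.80, 0.65, 1.00, 0.85],
    [0.75, 0.70, 0.85, 1.00],
]

eigenvalue, loadings = first_component(corr)
for name, loading in zip(["murder", "rape", "assault", "robbery"], loadings):
    print(f"{name}: {loading:.3f}")
print(f"first eigenvalue: {eigenvalue:.3f}")
```

When every pairwise correlation is positive, as in this hypothetical matrix, all first-component loadings share the same sign, which is the pattern the caption describes for the violent crime variables.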
[Figure 6 plots omitted. Panels: 2002, 2005, 2008. Horizontal axis: Principal Component (1 to 4). Vertical axis: Eigenvalue.]
Figure 6. Scree Plot of Eigenvalues of the Principal Components for Violent Crime Data: This figure shows a scree plot of the eigenvalues of all eight principal components of the primary model. The eigenvalues indicate the amount by which the transformation stretches the data; because we are performing principal component analysis, the eigenvalues tell us the relative importance of the principal components in explaining the variance within the data.
[Figure 7 plots omitted. Panels: 2002, 2005, 2008. Horizontal axis: Principal Component (1 to 4). Left axis: Cumulative Variance Explained (%). Right axis: Variance Explained (%). Series: Cum. Var. Expl., Var. Expl.]
Figure 7. Percent Variance Explained by the Principal Components for Violent Crime Data: This plot shows the variance explained by the individual principal components (right axis) and the cumulative variance explained by the components up to that point (left axis). We see that the first violent crime component explains approximately 90% of the variance within the data.
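The quantities plotted in Figure 7 follow directly from the eigenvalues: each component's share is its eigenvalue divided by the sum of all eigenvalues, and the cumulative series is the running total of those shares. The sketch below uses hypothetical eigenvalues, chosen only so that the first share is 90%; they are not the estimates behind the figure.

```python
def variance_explained(eigenvalues):
    """Return (shares, cumulative shares) of variance for each component."""
    total = sum(eigenvalues)
    shares = [ev / total for ev in eigenvalues]
    cumulative = []
    running = 0.0
    for share in shares:
        running += share
        cumulative.append(running)
    return shares, cumulative

# Hypothetical eigenvalues for four standardized variables (they sum to 4).
eigenvalues = [3.6, 0.2, 0.12, 0.08]

shares, cumulative = variance_explained(eigenvalues)
for k, (s, c) in enumerate(zip(shares, cumulative), start=1):
    print(f"component {k}: {s:.1%} of variance, {c:.1%} cumulative")
```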
[Figure 8 plots omitted. Panels: 2002, 2005, 2008. Horizontal axis: Component 1. Vertical axis: Component 2. Variable vectors: burglary, grLarceny, grLarcenyAuto.]
Figure 8. First vs. Second Principal Components for Property Crime Data: This figure shows the original variable vectors from our data set projected onto the space of the first two principal components. We can see that the first component has very high positive values in many of the crime variables, and a moderate value in the abandonment variable, further reinforcing our hypothesis that the crime data are all highly correlated, and moderately correlated with abandonment. The second principal component is low, but negative, in most violent crimes, high and positive in property crimes, specifically burglary and grand larceny, and high and negative in abandonment. This is interesting, and perhaps does not have an immediately intuitive explanation like the first component does. However, this component clearly has something to do with property and property crime, as burglary, grand larceny, and abandonment are its most important factors. Overall, the first component seems to be the violent crime component, while the second component seems to be the abandonment and property crime component.
[Figure 9 plots omitted. Panels: 2002, 2005, 2008. Horizontal axis: Principal Component (1 to 3). Vertical axis: Eigenvalue.]
Figure 9. Scree Plot of Eigenvalues of the Principal Components for Property Crime Data: This figure shows a scree plot of the eigenvalues of all eight principal components of the primary model. The eigenvalues indicate the amount by which the transformation stretches the data; because we are performing principal component analysis, the eigenvalues tell us the relative importance of the principal components in explaining the variance within the data.
[Figure 10 plots omitted. Panels: 2002, 2005, 2008. Horizontal axis: Principal Component (1 to 3). Left axis: Cumulative Variance Explained (%). Right axis: Variance Explained (%). Series: Cum. Var. Expl., Var. Expl.]
Figure 10. Percent Variance Explained by the Principal Components for Property Crime Data: This plot shows the variance explained by the individual principal components (right axis) and the cumulative variance explained by the components up to that point (left axis). We see that the violent crime component and the abandonment and property crime component explain approximately 70% of the variance within the data.
[Figure 11 plot omitted. Horizontal axis: analysis time (2002 to 2008). Vertical axis: 0.00 to 0.15.]
Figure 11. Nelson-Aalen Estimate of the Cumulative Hazard Rate Function: This plot shows the cumulative hazard rate estimates for high crime (red, dashed) and low crime (blue, solid). We can see that the cumulative hazard rate function is much higher for units in community districts deemed high crime. This is in line with our assumptions.
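For reference, the Nelson-Aalen estimator plotted in Figure 11 sums, over each observed event time, the number of events divided by the number of units still at risk. The sketch below implements that definition on a tiny hypothetical sample; the durations and event flags are illustrative and are not the thesis data, and the function name is our own.

```python
def nelson_aalen(durations, events):
    """Nelson-Aalen cumulative hazard: list of (time, H(time)) at event times.

    durations: observed time for each unit (event or censoring time)
    events:    1 if the unit experienced the event, 0 if censored
    """
    event_times = sorted({t for t, e in zip(durations, events) if e})
    cumulative = 0.0
    estimate = []
    for t in event_times:
        at_risk = sum(1 for d in durations if d >= t)
        failures = sum(1 for d, e in zip(durations, events) if d == t and e)
        cumulative += failures / at_risk
        estimate.append((t, cumulative))
    return estimate

# Hypothetical sample: four units, one of them censored at time 2.
durations = [1, 2, 2, 3]
events = [1, 1, 0, 1]

for time, hazard in nelson_aalen(durations, events):
    print(f"t = {time}: cumulative hazard = {hazard:.4f}")
```

Computing the estimate separately for the high crime and low crime groups and plotting the two step functions against time yields a comparison of the kind shown in the figure.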