Spatial Statistics
Applied to point data
Centrographic Statistics
• Most basic type of descriptor for spatial distributions, includes:
– Mean Center
– Median Center
– Standard Deviation
– Standard Distance
– Standard Deviational Ellipse
• Two dimensional correlates to basic statistical moments of a single-variable distribution
• Modified from one dimensional to two dimensional
Mean Center
• Simply the mean of X and Y
• Also called center of gravity
• The differences between the mean X and the individual X values sum to zero (same for Y)
$$\bar{X} = \frac{\sum_{i=1}^{N} X_i}{N}, \qquad \bar{Y} = \frac{\sum_{i=1}^{N} Y_i}{N}$$
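A minimal sketch of the mean center calculation. The function name `mean_center` and the sample coordinates are illustrative assumptions, not part of the original slides.

```python
def mean_center(points):
    """Return (mean X, mean Y) for a list of (x, y) tuples."""
    n = len(points)
    if n == 0:
        raise ValueError("need at least one point")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    return mean_x, mean_y


# Hypothetical coordinates, for illustration only.
points = [(1.0, 2.0), (3.0, 4.0), (5.0, 9.0)]
mc = mean_center(points)
print(mc)  # (3.0, 5.0)

# The deviations of X from the mean X sum to zero (same for Y).
residual_x = sum(x - mc[0] for x, _ in points)
residual_y = sum(y - mc[1] for _, y in points)
print(residual_x, residual_y)  # 0.0 0.0
```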
Weighted Mean Center
• Produced by weighting each coordinate by another variable (e.g., population)
• Points associated with areas can have the characteristics of those areas included
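$$\bar{X} = \frac{\sum_{i=1}^{N} W_i X_i}{N}, \qquad \bar{Y} = \frac{\sum_{i=1}^{N} W_i Y_i}{N}$$
where $W_i$ is the weight of point i, scaled so that the weights sum to N. With unscaled weights (e.g., raw population counts), divide by $\sum_{i=1}^{N} W_i$ instead of N.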
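A short sketch of a weighted mean center. It divides by the sum of the weights, so the weights do not need to be pre-scaled; that normalization, the function name `weighted_mean_center`, and the sample values are assumptions made for illustration.

```python
def weighted_mean_center(points, weights):
    """Return the weighted (mean X, mean Y) for (x, y) tuples and their weights."""
    if len(points) != len(weights):
        raise ValueError("points and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    wx = sum(w * x for (x, _), w in zip(points, weights)) / total_weight
    wy = sum(w * y for (_, y), w in zip(points, weights)) / total_weight
    return wx, wy


# Hypothetical points with a population-like weight on each one.
points = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
weights = [1.0, 1.0, 2.0]
print(weighted_mean_center(points, weights))  # (2.5, 5.0)
```

The heavier weight on the third point pulls the center toward it, compared with the unweighted mean center of the same points.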
Standard Deviation of X and Y
• A measure of dispersion
• Does not provide a single summary statistic of the dispersion
$$S_x = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \bar{X})^2}{N - 1}}, \qquad S_y = \sqrt{\frac{\sum_{i=1}^{N} (Y_i - \bar{Y})^2}{N - 1}}$$
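A minimal sketch of the two standard deviations, assuming the sample form with N - 1 in the denominator. The function name `std_xy` and the coordinates are hypothetical.

```python
import math


def std_xy(points):
    """Return (S_x, S_y), the sample standard deviations of X and Y."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    s_x = math.sqrt(sum((x - mean_x) ** 2 for x, _ in points) / (n - 1))
    s_y = math.sqrt(sum((y - mean_y) ** 2 for _, y in points) / (n - 1))
    return s_x, s_y


# Hypothetical coordinates, for illustration only.
points = [(0.0, 0.0), (2.0, 3.0), (4.0, 6.0)]
print(std_xy(points))  # (2.0, 3.0)
```

The result is two separate numbers, one per axis, which is why this measure does not give a single summary of dispersion.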
Standard Distance Deviation
• Represents the standard deviation of the distance of each point from the mean center
• Is the two dimensional equivalent of standard deviation
• Where:
$$S_{xy} = \sqrt{\frac{\sum_{i=1}^{N} (d_{iMC})^2}{N - 2}}$$
where $d_{iMC}$ is the distance between each point, i, and the mean center, and N is the total number of points. We subtract 2 from the number of points to provide an unbiased estimate of the standard distance, since there are two constants (the mean X and the mean Y).
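A minimal sketch of the standard distance deviation with the N - 2 denominator described above. The function name `standard_distance` and the sample points are illustrative assumptions.

```python
import math


def standard_distance(points):
    """Return the standard distance deviation of (x, y) points around their mean center."""
    n = len(points)
    if n < 3:
        raise ValueError("need at least three points for the N - 2 denominator")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Squared distance of each point from the mean center.
    squared = sum((x - mean_x) ** 2 + (y - mean_y) ** 2 for x, y in points)
    return math.sqrt(squared / (n - 2))


# Hypothetical points on the corners of a square; the mean center is (2, 2).
points = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0), (4.0, 4.0)]
print(standard_distance(points))  # 4.0
```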
Standard Distance Deviation
• Because it is an average distance from the mean center, it is represented as a single vector
Standard Deviation Ellipse
• While the standard distance deviation is a good single measure of the dispersion of the incidents around the mean center, it does not show the potential skewed nature of the data (anisotropy).
• The standard deviation ellipse gives dispersion in two dimensions
Standard Deviational Ellipse
$$\sigma_{\text{Distribution}} = \sqrt{\frac{\sigma_x^2 + \sigma_y^2}{2}}$$
where $\sigma_x$ and $\sigma_y$ are the two standard deviations in the x and y directions, which are orthogonal to each other and define an ellipse.
Testing the Differences
Crime Analysis with Centrographic Statistics
• A good “free” software product for doing some basic spatial statistics is CrimeStat
• Review of CrimeStat Figures 4.19 – 4.26
– Seeing the relationship between mean center, standard distance, and standard deviational ellipse
• Centrographic Statistics in Monroe County
Point Pattern Analysis
• The spatial pattern of the distribution of a set of point features.
– Spatial properties of the entire body of points are studied rather than the individual entities
– Points are zero-dimensional objects, so the only valid measures of their distribution are the number of occurrences in the pattern and their respective geographic locations
Descriptive Statistics of Point Features
• Frequency: number of point features occurring on a map
Types of Distribution
• Three general patterns
– Random: any point is equally likely to occur at any location, and the position of any point is not affected by the position of any other point. There is no apparent ordering of the distribution
– Uniform: every point is as far from all of its neighbors as possible
– Clustered: many points are concentrated close together, leaving large areas that contain very few, if any, points
Quadrat Analysis
• Based on a measure derived from data obtained after a uniform grid network is drawn over a map of the distribution of interest
• The frequency count, the number of points occurring within each quadrat, is recorded first
• These data are then used to compute a measure called the variance
• The variance compares the number of points in each grid cell with the average number of points over all of the cells
• The variance of the distribution is compared to the characteristics of a random distribution
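The frequency count step can be sketched as follows, assuming square quadrats on a regular grid. The function name `quadrat_counts`, its parameters, and the sample points are hypothetical and not taken from the slides.

```python
def quadrat_counts(points, xmin, ymin, cell_size, n_cols, n_rows):
    """Count points per quadrat on a uniform grid, returned row by row."""
    counts = [0] * (n_cols * n_rows)
    for x, y in points:
        col = int((x - xmin) // cell_size)
        row = int((y - ymin) // cell_size)
        # Points outside the grid (or exactly on its upper edges) are skipped.
        if 0 <= col < n_cols and 0 <= row < n_rows:
            counts[row * n_cols + col] += 1
    return counts


# Hypothetical points on a 2 x 2 grid of unit quadrats.
points = [(0.5, 0.5), (0.7, 0.2), (1.5, 0.5), (0.5, 1.5)]
print(quadrat_counts(points, xmin=0.0, ymin=0.0, cell_size=1.0, n_cols=2, n_rows=2))
# [2, 1, 1, 0]
```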
Example: number of points per quadrat (x) for three patterns, each on a grid of 10 quadrats (2 columns × 5 rows)

Quadrat #    RANDOM          UNIFORM         CLUSTERED
             x      x^2      x      x^2      x      x^2
1            3      9        2      4        0      0
2            1      1        2      4        0      0
3            5      25       2      4        0      0
4            0      0        2      4        0      0
5            2      4        2      4        10     100
6            1      1        2      4        10     100
7            1      1        2      4        0      0
8            3      9        2      4        0      0
9            3      9        2      4        0      0
10           1      1        2      4        0      0
Total        20     60       20     40       20     200
Mean         2.000           2.000           2.000
Variance     2.222           0.000           17.778
Var/Mean     1.111           0.000           8.889
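$$\text{Variance} = \frac{\sum x^2 - \left[ (\sum x)^2 / N \right]}{N - 1}$$
$$\text{Variance-mean ratio} = \frac{\text{variance}}{\text{mean}}$$
where N = number of quadrats (10 in the example)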
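The variance-mean ratio can be checked against the three example patterns with a short sketch. The quadrat counts are the ones from the example; the function name `variance_mean_ratio` is an illustrative assumption.

```python
def variance_mean_ratio(counts):
    """Return (variance, mean, variance/mean) for a list of quadrat counts."""
    n = len(counts)
    if n < 2:
        raise ValueError("need at least two quadrats")
    total = sum(counts)
    sum_of_squares = sum(c * c for c in counts)
    variance = (sum_of_squares - total ** 2 / n) / (n - 1)
    mean = total / n
    if mean == 0:
        raise ValueError("no points in any quadrat")
    return variance, mean, variance / mean


# Points per quadrat for the three example patterns.
random_counts = [3, 1, 5, 0, 2, 1, 1, 3, 3, 1]
uniform_counts = [2] * 10
clustered_counts = [0, 0, 0, 0, 10, 10, 0, 0, 0, 0]

for name, counts in [("random", random_counts),
                     ("uniform", uniform_counts),
                     ("clustered", clustered_counts)]:
    variance, mean, ratio = variance_mean_ratio(counts)
    print(name, round(variance, 3), round(mean, 3), round(ratio, 3))
# random 2.222 2.0 1.111
# uniform 0.0 2.0 0.0
# clustered 17.778 2.0 8.889
```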
Quadrat Analysis
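• For a random distribution, the variance and the mean are expected to be the same. Therefore, we would expect a variance-mean ratio around 1
• Values other than 1 would indicate a non-random distribution: a ratio well below 1 suggests a uniform pattern, and a ratio well above 1 suggests a clustered pattern.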
Weakness of Quadrat Analysis
• Quadrat size and orientation
– If the quadrats are too small, they may contain only a couple of points. If they are too large, they may contain too many points
• Some have suggested that the quadrat area should be twice the mean area per point (2A/N, where A is the area of the study region and N is the number of points)
• Or, test different sizes (or orientations) to determine the effects of each test on the results
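A small sketch of the "twice the mean area per point" suggestion, assuming square quadrats. The function name `suggested_quadrat_side` and the input values are hypothetical.

```python
import math


def suggested_quadrat_side(region_area, n_points):
    """Side length of a square quadrat whose area is twice the mean area per point."""
    if region_area <= 0 or n_points <= 0:
        raise ValueError("region_area and n_points must be positive")
    mean_area_per_point = region_area / n_points
    quadrat_area = 2 * mean_area_per_point
    return math.sqrt(quadrat_area)


# Illustrative values: 10 points in a region of area 50.
side = suggested_quadrat_side(region_area=50.0, n_points=10)
print(round(side, 3))  # 3.162
```

Running the analysis again with a few other side lengths is one way to see how sensitive the result is to quadrat size.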
Weakness of Quadrat Method
• Actually a measure of dispersion, and not really pattern, because it is based primarily on the density of points, and not their arrangement in relation to one another
• Results in a single measure for the entire distribution, so variations within the region are not recognized
Nearest-Neighbour Analysis
• Unlike quadrat analysis, it uses distances between points as its basis.
• The mean of the distance observed between each point and its nearest neighbour is compared with the expected mean distance that would occur if the distribution were random
Example: nearest-neighbour distances (r) for three patterns of 10 points in a region of area 50. NN is the point number of the nearest neighbour. The pattern labels follow from the R values: near 1 for random, near 0 for clustered, near 2 for uniform.

                 RANDOM            CLUSTERED         UNIFORM
Point            NN     r          NN     r          NN     r
1                2      1          2      0.1        3      2.2
2                3      0.1        3      0.1        4      2.2
3                2      0.1        2      0.1        4      2.2
4                5      1          5      0.1        5      2.2
5                4      1          4      0.1        7      2.2
6                5      2          5      0.1        7      2.2
7                6      2.7        6      0.1        8      2.2
8                10     1          9      0.1        9      2.2
9                10     1          10     0.1        10     2.2
10               9      1          9      0.1        9      2.2
Sum of r         10.9              1                 22
Mean r           1.09              0.1               2.2
Area of Region   50                50                50
Density          0.2               0.2               0.2
Expected Mean    1.118034          1.118034          1.118034
R                0.9749256         0.0894427         1.9677398
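$$\bar{r} = \frac{\sum r}{n}, \qquad d = \frac{n}{\text{area}}, \qquad \bar{r}(e) = \frac{0.5}{\sqrt{d}}, \qquad R = \frac{\bar{r}}{\bar{r}(e)}$$
where $\bar{r}$ is the observed mean nearest-neighbour distance, n is the number of points, d is the density of points in the region, $\bar{r}(e)$ is the expected mean distance for a random distribution, and R is the ratio of observed to expected mean distance.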
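The nearest-neighbour calculation can be sketched in two steps: find each point's nearest-neighbour distance, then compare the observed mean with the expected mean. The function names and the three sample coordinates are illustrative assumptions; the distances and area in the second part come from the first example table (mean distance 1.09).

```python
import math


def nearest_neighbour_distances(points):
    """Return, for each (x, y) point, the distance to its nearest other point."""
    if len(points) < 2:
        raise ValueError("need at least two points")
    distances = []
    for i, (xi, yi) in enumerate(points):
        nearest = min(math.hypot(xi - xj, yi - yj)
                      for j, (xj, yj) in enumerate(points) if j != i)
        distances.append(nearest)
    return distances


def nearest_neighbour_ratio(distances, area):
    """Return (observed mean, expected mean, R) for the given distances and region area."""
    n = len(distances)
    observed = sum(distances) / n
    density = n / area
    expected = 0.5 / math.sqrt(density)
    return observed, expected, observed / expected


# Hypothetical coordinates, for illustration only.
print(nearest_neighbour_distances([(0.0, 0.0), (3.0, 4.0), (3.0, 5.0)]))
# [5.0, 1.0, 1.0]

# Distances from the first example table, region area 50.
table_distances = [1, 0.1, 0.1, 1, 1, 2, 2.7, 1, 1, 1]
observed, expected, ratio = nearest_neighbour_ratio(table_distances, area=50.0)
print(round(observed, 2), round(expected, 6), round(ratio, 4))
# 1.09 1.118034 0.9749
```

The brute-force search compares every pair of points, which is fine for small examples like this one.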
Advantages of Nearest Neighbor over Quadrat Analysis
• No quadrat size problem to be concerned with
• Takes distance into account
• Problems
– Results depend on the size of the study region, because its area enters the density calculation
– Must consider how to define the boundary
• Arbitrary or some natural boundary
– May not account for points just across the boundary, in an adjacent area, that could be the nearest neighbours of points near the edge