Case Study 49: Property Crimes
First M Last([email protected])
For
Professor Beintema
Managerial Statistics (GM533)
Keller School of Management
August 2010
I. Executive summary
Our study examined data provided by various U.S. government agencies on property crime rates in the
fifty U.S. states and eight possible contributing factors, such as per capita income, high school dropout
rate, average precipitation, population density, and urbanization. Our analysis revealed that, of the eight
possible contributing factors, only three (urbanization rate, high school dropout rate, and population
density) were statistically significant predictors of property crime rates. A regression model using these
three variables explained approximately 66% of the state-to-state variation in property crime rates. We
consider this a reasonably strong model; however, accounting for the remaining 34% of the variation in
property crime rates in the U.S. would require further data and evaluation of other possible factors.
II. Introduction
According to the US Department of Justice (2006), property crime includes several criminal
offenses, such as burglary, car and motorcycle theft, larceny-theft, and arson. Property crimes involve
“taking of money or property, but there is no force or threat of force against the victims.” One exception
to the basic rule, however, is arson, which does not involve the taking of property and may involve force
against the victims.
The purpose of this case study is to evaluate the available data, determine which variables
contribute most to property crime rates, and address several conceptions and misconceptions about
the leading causes of property crimes in the U.S. The questions that this study will answer include:
1. Are crime rates higher in urban than rural areas?
2. Does unemployment or education level contribute to property crime rates?
3. Does public assistance contribute to property crime rates?
4. What other factors relate to property crimes?
The study used data that was collected from a “variety of U.S. government sources, including:
the 1988 Uniform Crime Reports, Federal Bureau of Investigation; the Office of Research and Statistics,
Social Security Administration; the Commerce Department, Bureau of Economic Analysis; the National
Center for Education Statistics, U.S. Department of Education; the Bureau of the Census, Department of
Commerce and Geography Division; the Labor Department, Bureau of Labor Statistics; and the National
Climatic Data Center, U.S. Department of Commerce. The data set was originally collected by Louis J.
Moritz, an operations manager.” (Bowerman et al., 2010). A copy of the available data set is attached
in Appendix A. The data consists of the following information for each of the fifty states:
1. Property crime rate per hundred thousand inhabitants
2. Per capita income
3. High school dropout rate
4. Average precipitation in the major city
5. Percentage of public aid recipients
6. Population density
7. Public aid for families with children
8. Percentage of unemployed workers
9. Percentage of the residents living in urban areas
III. Analysis and methods
We used MegaStat to analyze the given data and test hypotheses about it. We ran a multiple
regression analysis on the data to determine which variables were most strongly related to crime
rate. In this scenario, our dependent variable (Y) was the given crime rate for each state, and
the independent variables were the other 8 variables (i.e., per capita income, dropout rate, etc.) given
for each of the states. The MegaStat output is shown in Appendix B and pertinent excerpts are shown
below.
Regression Analysis
R²            0.686
Adjusted R²   0.625     n           50
R             0.828     k           8
Std. Error    754.255   Dep. Var.   CRIMES (Y)
…
Regression output                                                  confidence interval
variables       coefficients   std. error   t (df=41)    p-value    95% lower    95% upper
Intercept        -1,008.0855   1,003.2571      -1.005      .3209  -3,034.2043   1,018.0334
PINCOME (X1)          0.0156       0.0731       0.213      .8323      -0.1320       0.1632
DROPOUT (X2)         73.3997      21.5165       3.411      .0015      29.9463     116.8532
PUBAID (X3)         -49.3649      39.8547      -1.239      .2225    -129.8531      31.1233
DENSITY (X4)         -2.2108       0.7018      -3.150      .0030      -3.6281      -0.7934
KIDS (X5)             0.4108       1.3363       0.307      .7601      -2.2878       3.1095
PRECIP (X6)          -0.5357      10.9622      -0.049      .9613     -22.6744      21.6030
UNEMPLOY (X7)       -57.4497      78.7026      -0.730      .4696    -216.3928     101.4933
URBAN (X8)           65.8552      11.0268       5.972   4.74E-07      43.5862      88.1242
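The t statistics and confidence limits in the table above follow from each coefficient and its standard error. The following is a minimal Python sketch of that arithmetic and of screening the variables at an alpha of .005. The critical value T_CRIT (about 2.0195 for 41 degrees of freedom) is taken from a standard t table and is an assumption of this sketch, as are the function and variable names; the coefficients, standard errors, and p-values are copied from the MegaStat output.

```python
# Coefficient, standard error, and p-value copied from the MegaStat output above.
INITIAL_MODEL = {
    "PINCOME":  (0.0156, 0.0731, 0.8323),
    "DROPOUT":  (73.3997, 21.5165, 0.0015),
    "PUBAID":   (-49.3649, 39.8547, 0.2225),
    "DENSITY":  (-2.2108, 0.7018, 0.0030),
    "KIDS":     (0.4108, 1.3363, 0.7601),
    "PRECIP":   (-0.5357, 10.9622, 0.9613),
    "UNEMPLOY": (-57.4497, 78.7026, 0.4696),
    "URBAN":    (65.8552, 11.0268, 4.74e-07),
}

T_CRIT = 2.0195  # assumed two-sided 95% critical value of t with 41 df (from a t table)
ALPHA = 0.005    # cutoff used in this study


def t_statistic(coef, std_err):
    """t = coefficient / standard error."""
    return coef / std_err


def confidence_interval(coef, std_err, t_crit=T_CRIT):
    """95% interval: coefficient +/- t_crit * standard error."""
    margin = t_crit * std_err
    return coef - margin, coef + margin


def significant_variables(model, alpha=ALPHA):
    """Names of variables whose p-value is below alpha."""
    return [name for name, (_, _, p) in model.items() if p < alpha]


for name, (coef, se, p) in INITIAL_MODEL.items():
    low, high = confidence_interval(coef, se)
    print(f"{name:9s} t = {t_statistic(coef, se):7.3f}  95% CI = [{low:10.4f}, {high:10.4f}]")

print("Significant at alpha = .005:", significant_variables(INITIAL_MODEL))
```

Only DROPOUT, DENSITY, and URBAN pass the screen, which matches the three variables carried into the refined model.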
The summary of this analysis:
1. R^2 = 68.6%: This is the proportion of variation in the dependent variable Y that is explained by
variation in the independent variables Xi. In other words, using this model, almost 69% of the
variation in the crime rate can be attributed to the independent variables X1 through X8.
2. To determine how each independent variable relates to the dependent variable, we examine
the regression coefficient of each independent variable. A coefficient estimates the change in
the crime rate (crimes per 100,000 inhabitants) associated with a one-unit increase in that
variable, holding the other variables constant. A positive coefficient indicates that the crime
rate rises as the variable increases, while a negative coefficient indicates that it falls. Because
the variables are measured in different units (dollars, percentages, persons per square mile),
the sizes of the coefficients cannot be compared directly to rank the variables. For example,
X2 = 73.40 means that each one-percentage-point increase in a state's dropout rate is
associated with about 73.4 more property crimes per 100,000 inhabitants. X7 = -57.45 means
that each one-percentage-point increase in the unemployment rate is associated with about
57.4 fewer property crimes per 100,000 inhabitants, although this estimate is not statistically
significant (p = .4696).
3. Next, we examine the p-value of each independent variable to determine which ones are
statistically significant at an alpha of .005. X2, X4, and X8 are the only independent variables
with p-values below .005, with X8 (urbanization) having the lowest p-value. This indicates that
the evidence of a relationship with crime rate is strongest for urbanization.
4. To refine our model further, we drop the independent variables that are not statistically
significant and re-run the regression analysis with only the variables that have a p-value
below .005.
5. The new regression analysis for the data that uses only significant independent variables (X2, X4,
and X8) yields the results below (also presented in Appendix C).
Regression Analysis
R²            0.656
Adjusted R²   0.633     n           50
R             0.810     k           3
Std. Error    745.822   Dep. Var.   CRIMES (Y)
ANOVA table
Source                    SS   df                MS       F    p-value
Regression   48,778,906.6927    3   16,259,635.5642   29.23   9.92E-11
Residual     25,587,492.1905   46      556,249.8302
Total        74,366,398.8832   49
Regression output                                                  confidence interval
variables       coefficients   std. error   t (df=46)    p-value    95% lower    95% upper
Intercept        -1,052.5531     613.1049      -1.717      .0928  -2,286.6692     181.5630
DROPOUT (X1)         57.7544      15.3153       3.771      .0005      26.9262      88.5826
DENSITY (X2)         -1.9318       0.5270      -3.666      .0006      -2.9926      -0.8710
URBAN (X3)           67.8889       8.4077       8.075   2.30E-10      50.9650      84.8127
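The summary figures in the output above can be re-derived from the ANOVA sums of squares, which is a useful consistency check. The sketch below does so in Python; the function name anova_summary and its argument names are illustrative choices, not part of MegaStat.

```python
import math


def anova_summary(ss_regression, ss_residual, n, k):
    """Derive regression summary statistics from ANOVA sums of squares.

    n is the number of observations and k the number of independent variables.
    """
    ss_total = ss_regression + ss_residual
    df_regression = k
    df_residual = n - k - 1
    ms_regression = ss_regression / df_regression
    ms_residual = ss_residual / df_residual
    r_squared = ss_regression / ss_total
    return {
        "ms_regression": ms_regression,
        "ms_residual": ms_residual,
        "f": ms_regression / ms_residual,
        "r_squared": r_squared,
        "adjusted_r_squared": 1 - (1 - r_squared) * (n - 1) / df_residual,
        "std_error": math.sqrt(ms_residual),
    }


# Sums of squares from the three-variable model above (n = 50 states, k = 3).
refined = anova_summary(48_778_906.6927, 25_587_492.1905, n=50, k=3)
for name, value in refined.items():
    print(f"{name:20s} {value:,.4f}")
```

The derived F statistic, R², Adjusted R², and standard error agree with the reported 29.23, 0.656, 0.633, and 745.822 to the precision shown.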
6. Next, we examine R^2 = 65.6% and Adjusted R^2 = 63.3% for this model to determine how
much of the variation in Y is explained by the three variables retained in the model. We
conclude that the model is useful: it accounts for about two-thirds of the variation in crime
rate (Y), nearly as much as the eight-variable model, and its Adjusted R^2 is slightly higher
(63.3% vs. 62.5%).
7. We further look at the overall F statistic = 29.23 and its associated p-value = 9.92E-11, which
is well below 0.005 (the strictest significance level we consider).
The hypotheses for the overall F-test are: Ho: B1 = B2 = B3 = 0 (i.e., none of the independent
variables is significantly related to Y) vs. Ha: at least one Bi <> 0 (i.e., at least one of the
independent variables is significantly related to Y). Because the p-value is extremely low, we
reject Ho and conclude that there is extremely strong evidence that at least one of the
independent variables is significantly related to Y, so the model as a whole is statistically
significant.
8. Next, we examine the significance of each of the proposed independent variables by looking at
their individual p-values:
X1 p-value = .0005
X2 p-value = .0006
X3 p-value = 2.30E-10
If we select an alpha of 0.005 as a cutoff, each of the p-values is < 0.005, so we conclude that
all 3 independent variables are significantly related to crime rate and should be included in the
model.
9. Now that we’ve determined the fitness of our model, we look at the following information:
a. Scatter plot for each of the independent variables (presented in Appendix D).
b. Descriptive statistics and point estimates:
i. Point estimate: the least squares regression equation:
y-hat = b0 + b1x1 + b2x2 + b3x3
y-hat = -1,052.5531 + 57.7544x1 - 1.9318x2 + 67.8889x3
where x1 = dropout rate, x2 = population density, and x3 = urbanization
ii. Standard error = 745.82
iii. Multiple coefficient of determination (R^2) = 0.656
iv. Conclusion: 65.6% of the variability in crime rate can be explained by changes in
dropout rate, population density, and urbanization
c. Confidence intervals (each holding the other two variables constant):
i. The 95% confidence interval for the dropout rate coefficient is [26.93, 88.58], which
means that each one-percentage-point increase in dropout rate is associated with
an increase of between 26.9 and 88.6 property crimes per 100,000 inhabitants.
ii. The 95% confidence interval for the population density coefficient is [-2.99, -0.87],
which means that each additional person per square mile is associated with a
decrease of between 0.87 and 2.99 property crimes per 100,000 inhabitants.
iii. The 95% confidence interval for the urbanization coefficient is [50.96, 84.81], which
means that each one-percentage-point increase in urbanization is associated with
an increase of between 51.0 and 84.8 property crimes per 100,000 inhabitants.
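As a worked illustration of the point estimate in 9.b, the fitted equation can be applied to a single state. The sketch below uses the Alabama row from Appendix A (dropout rate 30.5, density 80.8, urbanization 60) and compares the prediction with the observed rate of 4,003.1. The function and constant names are illustrative assumptions; the coefficients are those of the three-variable model.

```python
# Coefficients of the three-variable model (y-hat = b0 + b1*x1 + b2*x2 + b3*x3).
B0, B_DROPOUT, B_DENSITY, B_URBAN = -1052.5531, 57.7544, -1.9318, 67.8889


def predict_crime_rate(dropout, density, urban):
    """Predicted property crimes per 100,000 inhabitants."""
    return B0 + B_DROPOUT * dropout + B_DENSITY * density + B_URBAN * urban


# State 1 (Alabama) from Appendix A.
observed = 4003.1
predicted = predict_crime_rate(dropout=30.5, density=80.8, urban=60)
residual = observed - predicted

print(f"predicted = {predicted:.1f}")
print(f"residual  = {residual:.1f}")

# Effect of one more percentage point of urbanization, other inputs unchanged:
# the prediction moves by exactly the URBAN coefficient, in crimes per 100,000.
shift = predict_crime_rate(30.5, 80.8, 61) - predicted
print(f"shift for +1 point URBAN = {shift:.4f}")
```

The prediction is about 4,626 crimes per 100,000 and the residual is about -623, which is smaller in size than the model's standard error of 745.82. The last line shows why the coefficients are read in crimes per 100,000 rather than in percent.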
IV. Conclusions and summary
We first used MegaStat to run a regression analysis on all eight variables that could affect property
crimes so that we could determine which of them contributed significantly to the crime rate. The initial
test revealed that only three factors contributed significantly to crime rate. In order of statistical
significance, these were urbanization, high school dropout rate, and population density.
We then used MegaStat again to run another regression analysis with only these three variables. We
consider the model statistically strong, since 66% of the variation in property crime is explained by the
model. Using our model, we provided and explained the descriptive statistics and the confidence
intervals for the data. Finally, we were able to answer the questions posed by the assignment:
a. Are crime rates higher in urban than rural areas?
According to 9.c.iii above, we are very confident that crime rates are higher in states with a
larger share of residents living in urban areas.
b. Does unemployment or education level contribute to property crime rates?
According to 3 above, unemployment does not appear to contribute significantly to property
crime rates (p = .4696). Education level, as measured by the high school dropout rate, does: a
higher dropout rate is associated with a higher property crime rate.
c. Does public assistance contribute to property crime rates?
According to 3 above, neither public assistance measure (PUBAID, p = .2225; KIDS, p = .7601)
appears to contribute significantly to property crime rates.
d. What other factors relate to property crimes?
According to the available data, the other factor that appears to influence property crimes is
population density: once dropout rate and urbanization are accounted for, a higher density is
associated with a lower property crime rate.
References
Bowerman, O’Connell, Orris, & Murphree (2010). Essentials of Business Statistics, Third Edition. New York: The McGraw-Hill Companies.
Department of Justice, Federal Bureau of Investigation (2006). Property Crime. Retrieved from http://www.fbi.gov/ucr/cius_04/offenses_reported/property_crime/index.html
Appendix A
Data Set
STATE  CRIMES  PINCOME  DROPOUT  PUBAID  DENSITY  KIDS  PRECIP  UNEMPLOY  URBAN
    1  4003.1    12604     30.5     6.5     80.8   114    59.4       7.2     60
    2  4398.8    19514     26.4     4.3      0.9   593    53.2       9.3   64.3
    3  6861.2    14887       30     3.5     30.7   268     7.1       6.3   83.8
    4  3796.9    12172     21.4     5.9       46   190    49.2       7.7   51.5
    5  5705.7    18855     31.5     8.8    181.2   581    17.3       5.3   91.3
    6  5705.7    16417       24     3.7     31.9   317    15.3       6.4   80.6
    7  4642.2    22761     21.8     4.3    663.6   486    44.4         3   78.8
    8  4347.4    17699       29     4.3    341.7   267    41.4       3.2   70.6
    9  7819.9    16546     36.5     4.1    227.8   240    55.2         5   84.6
   10  5661.2    14980       35     6.4    109.2   252    48.6       5.8   62.4
   11  5731.9    16898     15.5     4.9    170.9   481    23.5       3.2   86.5
   12  3738.2    12657     20.4     2.7     12.2   250    11.7       5.8     54
   13  4810.4    17611     22.2    22.2    208.7   309    33.3       6.8   83.3
   14    3770    14721     24.1     3.6    154.6   263    39.1       5.3   64.2
   15  3819.8    14764     13.4       5     50.6   348    38.6       4.5   58.6
   16  4514.7    15905     15.9     3.8     30.5   338    28.6       4.8   66.7
   17  2804.7    12795     32.1       7     93.9   204    43.6       7.9   50.9
   18  5043.3    12193     38.4     8.8       99   167    59.7      10.9   68.7
   19  3420.3    14976     21.2     6.5     38.9   370    43.5       3.8   47.5
   20  4897.9    19316     23.5     5.1    469.9   329    41.8       4.5   80.3
   21  4371.3    20701       24     5.9    752.7   536    43.8       3.3   83.8
   22  5342.7    16387     28.6     8.5    162.2   481      31       7.6   70.7
   23  4024.7    16787     11.3     4.6     54.1   515    26.4         4   66.9
   24  3267.6    10992     34.4    11.1     55.5   119    52.8       8.4   47.3
   25  4292.2    15492     23.9     5.5     74.6   264    33.9       5.7   68.1
   26    4144    12670     15.5     4.5      5.5   362    11.4       6.8   52.9
   27  3866.8    15184     13.3     3.8     20.9   320    30.3       3.6   62.9
   28  5672.5    17440     18.7     2.6      9.6   273     4.2       5.2   85.3
   29  3186.1    19016     25.4     1.7    120.7   407    36.5       2.4   52.2
   30  4712.5    21882     20.3     5.6   1033.9   357    41.9       3.8     89
   31  5948.1    12481     26.8     5.7     12.4   225     8.9       7.8   72.1
   32    5212    19299     33.3       8      378   523    39.3       4.2   84.6
   33  4360.4    14128     30.9       5    132.9   243    42.5       3.6   42.9
   34  2668.9    12720     11.6     3.1      9.6   350    15.4       4.8   48.8
   35  4193.2    15485     20.8     7.5    264.7   298    37.8         6   73.3
   36  5154.6    13269     24.2     4.8     47.2   278    30.9       6.7   67.3
   37    6513    14982     29.2       4     28.8   348    37.4       5.8   67.9
   38  2814.4    12168     18.9     6.1    267.4   347      40       5.1   69.3
   39  4807.7    16793       28       6    940.9   450    41.9       3.1     87
   40  4671.2    12764     32.2     6.3    114.9   186    51.6       4.5   54.1
   41  2467.3    12475     13.1     3.9      9.4   270    17.5       3.9   46.4
   42  3936.7    13659     32.8     6.4    118.9   155    48.5       5.8   60.4
   43  7365.1    14640     34.1     4.5     64.3   169      42       7.3   79.6
   44  5335.5    12013     17.5     3.2     20.6   343    15.3       4.9   84.4
   45  4098.2    15382     17.3     5.6     60.1   469    33.7       2.8   33.8
   46  3877.5    17640     23.4       4    151.5   257    45.2       3.9     66
   47  6646.6    16569     21.9     5.8     69.9   443    38.6       6.2   73.5
   48  2107.4    11658     22.9     8.2     77.8   238    40.7       9.9   36.2
   49  3757.6    15444     16.3     7.7     89.2   473    30.9       4.3   64.2
   50  3653.1    13718     20.4     3.2      4.9   303    13.3       6.3   62.7
Variables
CRIMES    Property crime rate per hundred thousand inhabitants (property crimes include burglary, larceny, theft, and motor vehicle theft); calculated as # of property crimes committed divided by total population/100,000
PINCOME   Per capita income for each state
DROPOUT   High school dropout rate (%, 1987)
PRECIP    Average precipitation in inches in the major city in each state over 1951 - 80
PUBAID    Percentage of public aid recipients (1987)
DENSITY   Population/total square miles
KIDS      Public aid for families with children, dollars per family
UNEMPLOY  Percentage of unemployed workers
URBAN     Percentage of the residents living in urban areas
STATE     Number (1-50) representing the state
 1 = Alabama          26 = Montana
 2 = Alaska           27 = Nebraska
 3 = Arizona          28 = Nevada
 4 = Arkansas         29 = New Hampshire
 5 = California       30 = New Jersey
 6 = Colorado         31 = New Mexico
 7 = Connecticut      32 = New York
 8 = Delaware         33 = North Carolina
 9 = Florida          34 = North Dakota
10 = Georgia          35 = Ohio
11 = Hawaii           36 = Oklahoma
12 = Idaho            37 = Oregon
13 = Illinois         38 = Pennsylvania
14 = Indiana          39 = Rhode Island
15 = Iowa             40 = South Carolina
16 = Kansas           41 = South Dakota
17 = Kentucky         42 = Tennessee
18 = Louisiana        43 = Texas
19 = Maine            44 = Utah
20 = Maryland         45 = Vermont
21 = Massachusetts    46 = Virginia
22 = Michigan         47 = Washington
23 = Minnesota        48 = West Virginia
24 = Mississippi      49 = Wisconsin
25 = Missouri         50 = Wyoming
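For readers without MegaStat, the least squares fit in Section III could be reproduced along the following lines. This is a minimal pure-Python sketch that solves the normal equations (X'X)b = X'y by Gaussian elimination; it is not how MegaStat computes its output. To keep it short it uses only the first ten states of the data set above, so its coefficients will not match Appendices B and C; all function names are illustrative.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Build an augmented matrix so row operations apply to both sides.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def fit_ols(rows, y):
    """Least squares coefficients [b0, b1, ...] for y on the columns of rows."""
    design = [[1.0] + list(r) for r in rows]  # leading 1.0 gives the intercept
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


# (DROPOUT, DENSITY, URBAN) and CRIMES for states 1-10 of the data set.
predictors = [
    (30.5, 80.8, 60.0), (26.4, 0.9, 64.3), (30.0, 30.7, 83.8),
    (21.4, 46.0, 51.5), (31.5, 181.2, 91.3), (24.0, 31.9, 80.6),
    (21.8, 663.6, 78.8), (29.0, 341.7, 70.6), (36.5, 227.8, 84.6),
    (35.0, 109.2, 62.4),
]
crimes = [4003.1, 4398.8, 6861.2, 3796.9, 5705.7,
          5705.7, 4642.2, 4347.4, 7819.9, 5661.2]

coefficients = fit_ols(predictors, crimes)
fitted = [coefficients[0] + sum(b * x for b, x in zip(coefficients[1:], row))
          for row in predictors]
residuals = [yi - fi for yi, fi in zip(crimes, fitted)]
print("coefficients:", [round(b, 4) for b in coefficients])
print("sum of residuals:", round(sum(residuals), 6))
```

A quick sanity check on any least squares fit with an intercept is that the residuals sum to approximately zero; extending the two lists to all fifty states would be the natural next step toward comparing against the MegaStat output.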
Appendix B
Initial MegaStat output for the multiple regression analysis that takes into consideration all given data variables, with crime rate as the dependent variable and the other data as independent variables.
Regression Analysis
R²            0.686
Adjusted R²   0.625     n           50
R             0.828     k           8
Std. Error    754.255   Dep. Var.   CRIMES (Y)
ANOVA table
Source                    SS   df                MS       F    p-value
Regression   51,041,456.2559    8    6,380,182.0320   11.21   3.11E-08
Residual     23,324,942.6273   41      568,901.0397
Total        74,366,398.8832   49
Regression output                                                  confidence interval
variables       coefficients   std. error   t (df=41)    p-value    95% lower    95% upper
Intercept        -1,008.0855   1,003.2571      -1.005      .3209  -3,034.2043   1,018.0334
PINCOME (X1)          0.0156       0.0731       0.213      .8323      -0.1320       0.1632
DROPOUT (X2)         73.3997      21.5165       3.411      .0015      29.9463     116.8532
PUBAID (X3)         -49.3649      39.8547      -1.239      .2225    -129.8531      31.1233
DENSITY (X4)         -2.2108       0.7018      -3.150      .0030      -3.6281      -0.7934
KIDS (X5)             0.4108       1.3363       0.307      .7601      -2.2878       3.1095
PRECIP (X6)          -0.5357      10.9622      -0.049      .9613     -22.6744      21.6030
UNEMPLOY (X7)       -57.4497      78.7026      -0.730      .4696    -216.3928     101.4933
URBAN (X8)           65.8552      11.0268       5.972   4.74E-07      43.5862      88.1242
Appendix C
Refined MegaStat output for multiple regression analysis that takes into consideration independent variables that strongly affect the dependent variable (crime rate).
Regression Analysis
R²            0.656
Adjusted R²   0.633     n           50
R             0.810     k           3
Std. Error    745.822   Dep. Var.   CRIMES (Y)
ANOVA table
Source                    SS   df                MS       F    p-value
Regression   48,778,906.6927    3   16,259,635.5642   29.23   9.92E-11
Residual     25,587,492.1905   46      556,249.8302
Total        74,366,398.8832   49
Regression output                                                  confidence interval
variables       coefficients   std. error   t (df=46)    p-value    95% lower    95% upper
Intercept        -1,052.5531     613.1049      -1.717      .0928  -2,286.6692     181.5630
DROPOUT (X2)         57.7544      15.3153       3.771      .0005      26.9262      88.5826
DENSITY (X4)         -1.9318       0.5270      -3.666      .0006      -2.9926      -0.8710
URBAN (X8)           67.8889       8.4077       8.075   2.30E-10      50.9650      84.8127
Appendix D
Scatter plots for the three independent variables, high school dropout rate (X1), population density (X2), and urbanization (X3), that significantly affect the dependent variable, property crime rate (Y).
[Scatter plot: Crime and Dropout Rate. x-axis: DROPOUT (X1); y-axis: CRIMES (Y). Trend line: f(x) = 71.7172x + 2832.5801, R² = 0.1679]
[Scatter plot: Crime and Population Density. x-axis: DENSITY (X2); y-axis: CRIMES (Y). Trend line: f(x) = 0.3247x + 4506.0253, R² = 0.0037]
[Scatter plot: Crime and Urbanization. x-axis: URBAN (X3); y-axis: CRIMES (Y). Trend line: f(x) = 57.1814x + 737.0098, R² = 0.4572]