NAVAL POSTGRADUATE SCHOOL
Monterey, California
THESIS
VHA MODEL REVIEW
by
Michele L. Williams
March 1990
Thesis Advisor: Laura Johnson
Approved for public release; distribution is unlimited
REPORT DOCUMENTATION PAGE (Form Approved, OMB No. 0704-0188)
Security classification of this page: Unclassified
1a. Report security classification: Unclassified
3. Distribution/availability of report: Approved for public release; distribution is unlimited
6a. Name of performing organization: Naval Postgraduate School
6b. Office symbol: Code 55
6c. Address: Monterey, California 93943-5000
7a. Name of monitoring organization: Naval Postgraduate School
7b. Address: Monterey, California 93943-5000
11. Title: VHA MODEL REVIEW
12. Personal author(s): WILLIAMS, Michele L.
13a. Type of report: Master's Thesis
14. Date of report: 1990 March
15. Page count: 308
16. Supplementary notation: The views expressed in this thesis are those of the author and do not reflect the official policy or position of the Department of Defense or the U.S. Government.
18. Subject terms: VHA, Regression models, median rent, weighted least squares, Analysis of covariance, BAQ
19. Abstract: A regression model is used to predict median rents by the Office of the Secretary of Defense (OSD) to find variable housing allowance (VHA) as a supplement to Basic Allowance for Quarters (BAQ). These allowances are made for service members in the continental United States. It is this model that is reviewed in this thesis. Median rental data taken from the annual VHA survey is used to test this model. From this analysis, the model indicates lack of fit, invalid assumptions and perhaps not even a "reasonable" approach. A more sensible approach is used to propose two other regression models. These models are a Weighted Regression Model which, like the current model, predicts medians; and an Analysis of Covariance model which predicts or analyzes the mean rent. More reasonable predictions of median and mean rent are indicated by these two models respectively.
21. Abstract security classification: Unclassified
22a. Name of responsible individual: Laura Johnson
22b. Telephone: 408-646-2569
DD Form 1473, JUN 86. S/N 0102-LF-014-6603
Approved for public release; distribution is unlimited
VHA Model Review
by
Michele L. Williams
Lieutenant, United States Naval Reserve
BSBA, University of Denver, 1980
Submitted in partial fulfillment of the
requirements for the degree of
MASTER OF SCIENCE IN OPERATIONS RESEARCH
from the
NAVAL POSTGRADUATE SCHOOL
March 1990
Author: Michele L. Williams
Approved By: Laura Johnson, Thesis Advisor
Donald P. Gaver, Second Reader
Peter Purdue, Chairman, Department of Operations Research
ABSTRACT
A regression model is used by the Office of the Secretary
of Defense (OSD) to predict median rents so as to find variable
housing allowance (VHA) as a supplement to Basic Allowance for
Quarters (BAQ). These allowances are made for service members
in the continental United States. It is this model that is
reviewed in this thesis. Median rental data taken from the
annual VHA survey are used to test this model. From this
analysis, the model indicates lack of fit, invalid assumptions
and perhaps not even a "reasonable" approach. A more sensible
approach is used to propose two other regression models.
These models are a Weighted Regression Model which, like
the current model, predicts medians; and an Analysis of
Covariance model which predicts or analyzes the mean rent.
More reasonable predictions of median and mean rent are
indicated by these two models respectively.
THESIS DISCLAIMER
The reader is cautioned that computer programs developed in
this research may not have been exercised for all cases of
interest. While every effort has been made, within the time
available, to ensure that the programs are free of computa-
tional and logic errors, they cannot be considered validated.
Any application of these programs without additional verifica-
tion is at the risk of the user.
TABLE OF CONTENTS
I. INTRODUCTION ..................................... 1
A. BACKGROUND ................................... 1
B. CURRENT VHA COMPUTATIONAL PROCESS .............. 2
C. PROPOSED PLAN TO UPDATE VHA COMPUTATIONAL PROCESS ... 5
D. DATA DESCRIPTION ............................. 7
E. PROBLEMS WITH THE DATA ....................... 9
F. PURPOSE OF THESIS ............................ 9
II. ANALYSIS PROCEDURES .............................. 11
A. ORDINARY LEAST SQUARES REGRESSION .............. 11
B. INITIAL MODELS TESTED USING ORDINARY LEAST SQUARES REGRESSION ... 16
C. WEIGHTED LEAST SQUARES REGRESSION .............. 19
D. ANALYSIS OF COVARIANCE MODEL ................... 21
E. CROSS VALIDATION TECHNIQUES ..................... 22
III. ANALYSIS ......................................... 23
A. ANALYSIS OF CURRENT MODEL ....................... 23
B. ANALYSIS OF PROPOSED MODEL ...................... 27
C. ANALYSIS OF WEIGHTED LEAST SQUARES MODEL ..... 30
D. ANALYSIS OF THE ANALYSIS OF COVARIANCE MODEL ... 32
IV. CONCLUSIONS AND RECOMMENDATIONS ..................... 35
APPENDIX A SCATTER AND RESIDUAL PLOTS ............. 38
APPENDIX B SAS PROGRAM EXAMPLE ................... 111
APPENDIX C TABLES 1 - 14 ........................... 118
LIST OF REFERENCES .................................. 298
INITIAL DISTRIBUTION LIST ........................... 299
I. INTRODUCTION
A. BACKGROUND
VHA, Variable Housing Allowance, is a supplement to the
BAQ, Basic Allowance for Quarters, paid to service members who
live in private housing in the United States. VHA is designed
to aid the service member who is assigned to a "high cost area"
of the United States where the median monthly cost of housing
for a person in the same grade or dependency status exceeds 80%
of the national median for members in the same rank or
dependency status [Ref. 1:p. 2-1]. VHA is computed from the
following equation [Ref. 1:p. 2-2]:
$$\text{VHA} = \left(\text{local median housing costs by paygrade and marital status}\right) - 80\% \text{ of } \left(\text{national median housing costs by paygrade and marital status}\right) \tag{1}$$
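As a small illustration of equation (1), the following Python sketch computes the allowance for one paygrade and marital status category. The function name and the `floor_at_zero` option are assumptions introduced here for illustration; the source does not state how an area below the 80% threshold is treated.

```python
def vha_allowance(local_median, national_median, floor_at_zero=True):
    """Equation (1): local median cost minus 80% of the national median cost.

    Both medians are assumed to refer to the same paygrade and marital
    status category. Flooring at zero is an assumption of this sketch.
    """
    raw = local_median - 0.80 * national_median
    return max(raw, 0.0) if floor_at_zero else raw


# Hypothetical figures: local median $700, national median $600.
example = vha_allowance(700.0, 600.0)
```

With these hypothetical figures the allowance is 700 minus 480, or about $220 per month.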
The law specifies that each member's VHA allowance will be
determined by the actual housing costs currently paid by the
service member [Ref. 1:p. 2-2]. VHA rates are computed by the
Per Diem Travel and Transportation Allowance Committee Staff,
a subset of the Office of the Secretary of Defense (OSD), with
the aid of the Defense Manpower Data Center, DMDC. The basic
process by which the rates are computed is as follows:
1. Distinct areas in which military members reside are determined.
2. Proper sample sizes are determined.
3. Survey samples of housing costs are taken and edited, and median rents are computed for each category of paygrade, house type, number of bedrooms, and marital status.
4. Preliminary VHA rates for each area and dependency status are computed by determining an estimated median rent for each category using the GPX program, which utilizes various regression analysis techniques and smoothing procedures. (GPX is the name of the model developed by OSD.)
5. Preliminary VHA rates are reviewed to ensure that the rates determined by GPX are in line with the cost guidelines set by Congress.
B. CURRENT VHA COMPUTATIONAL PROCESS
The computation of preliminary VHA rates for each area
(MHA - military housing area), paygrade, and dependency status
has developed into an extremely complicated process. Once the
median rents are computed for each category of house type,
number of bedrooms, paygrade, and marital status, a count of
the number of median rents per category is taken [Ref. 1:p. 2-
56]. If the number of counts in each category for a particular
MHA is too small then larger sample sizes are obtained by
incorporating median rent information from the same category
from a close, in geographic terms, MHA. [Ref. 1:p. 2-58] This
information, taken from these close MHA's is then weighted.
The closer, in terms of miles, this MHA is to the original MHA
the more weight is placed on the information from that MHA.
[Ref. 1:p. 2-59] A new vector of median rents, incorporating
the information from the geographically close MHAs and
dimensioned by the four categories above is calculated. [Ref.
1:p. 2-59] The underlying reason for finding this vector of
median rents is to find the underlying relationship between
the total pay of a military member and the amount of rent a
military member will pay [Ref. 1:p. 2-60]. Let $P_{ijkl}$ equal the total
pay for a person in the ith paygrade and the jth dependency
status who has k bedrooms in his or her home and
a house of type l. Let $T_{ijkl}$ equal the median rent for
military members in that same group. Then the current
regression model in use is:
$$\frac{1}{T_{ijkl}} = \frac{A}{P_{ijkl}} + B + \epsilon_{ijkl} \tag{2}$$
where $\epsilon_{ijkl}$ is the error term. Standard linear regression
techniques, which assume that the error is normally distributed,
homoscedastic, and has mean zero, are used to estimate A and B.
This in turn means that the distribution of inverted median
rent is normal and homoscedastic. It is not clear that these
assumptions are in any sense "reasonable". In fact if medians
tend to be normal, then the inverse will certainly not be
normal. Let $\hat{A}$ and $\hat{B}$ denote the regression estimates of A and
B, respectively. The estimates $\hat{A}$ and $\hat{B}$ are used to determine
the estimated median rents, $R_{ijkl}$, through the equation
$$R_{ijkl} = \frac{P_{ijkl}}{\hat{A} + \hat{B} P_{ijkl}} \tag{3}$$
where $R_{ijkl}$ and $P_{ijkl}$ denote the rent and total pay, respectively,
for paygrade, marital status, number of bedrooms and house type
[Ref. 1:p. 2-60]. Generally, a separate A and B are determined
for the enlisted, company grade officer, and field grade
officer ranks. Thus a separate $R_{ijkl}$ is computed for each one
of these three ranks of military personnel. $R_{ijkl}$ is then used
to determine owner equivalency median rents. Owner equivalency
rents are the rent figures assigned to a military member who
owns and does not rent his or her residence. Costs assigned
to owners are thought not to be appropriate for use in
calculating VHA since intangible benefits accrue to owners and
not to renters. These owner equivalency median rents are
weighted according to population percentage of owners and are
then incorporated into the vector of median rents [Ref. 1:p.
2-61]. This new vector of median rents, including both owner
and renter information, still has four dimensions and must then
be aggregated to the paygrade and dependency status level.
[Ref. 1:p. 2-61] After this aggregation, a further smoothing
process and a denormalization process, the VHA rate multipliers
are finally computed by dividing by a weighted average of BAQ
rates [Ref. 1:p. 2-63]. These multipliers are checked and if
an inversion exists, which for example, is when paygrade 02
receives less VHA than paygrade 01, then additional smoothing
across paygrades will take place. If inversions still exist
after the smoothing process has taken place then the entire
computation of VHA multiplier rates begins again from the point
where data from close, in geographic terms, MHAs is used [Ref.
1:p. 2-64]. Median rent information is then taken from these
MHA's and the entire process is run again and again, up to 11
more times until the rate inversions cease to exist. If after
11 more times an inversion still exists then the GPX program
aborts and an inversion in the total population data is
assumed. [Ref. 1:p. 2-64]
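The regression step of the current model, equations (2) and (3), can be sketched as follows. This is only an illustration of fitting the inverse of median rent on the inverse of total pay and then inverting the fitted value; it is not the GPX program, and the function names and the sample figures are hypothetical.

```python
def fit_inverse_model(pay, rent):
    """Least squares fit of equation (2): 1/T = A*(1/P) + B.

    Returns the estimates (a_hat, b_hat), where a_hat multiplies 1/P
    and b_hat is the constant term.
    """
    x = [1.0 / p for p in pay]    # inverted total pay
    y = [1.0 / t for t in rent]   # inverted median rent
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    a_hat = sxy / sxx
    b_hat = y_bar - a_hat * x_bar
    return a_hat, b_hat


def predicted_rent(p, a_hat, b_hat):
    """Equation (3): estimated median rent for total pay p."""
    return p / (a_hat + b_hat * p)


# Hypothetical data generated exactly from A = 2.0, B = 0.001.
pay = [1000.0, 1500.0, 2000.0, 3000.0]
rent = [p / (2.0 + 0.001 * p) for p in pay]
a_hat, b_hat = fit_inverse_model(pay, rent)
```

Because the sample rents are generated without error, the fit returns the generating values of A and B; with survey medians the residuals of this fit are what the assumptions discussed above apply to.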
C. PROPOSED PLAN TO UPDATE VHA COMPUTATIONAL PROCESS
In an effort to get away from the geographical weighting
of data from close proximity MHA's and in an attempt to
simplify the process of computing VHA rates, the Per Diem
Committee is investigating a new method for computing VHA
rates. Under this "new" plan, survey data from each MHA is
placed into various costing bands based on county rental data
from HUD (Department of Housing and Urban Development) in the
following manner. From each county in the United States, HUD
has data for the average rental costs in that county. A
military housing area is placed into a costing band with other
military housing areas which have the same average rental
costs. Therefore if the computed average rental cost for MHA
A is $260.00 and the median rental cost for MHA B is also
$260.00, MHA A and MHA B would be placed in the same costing
band. The computed median rent figure used in this "new"
process is a single figure found by taking a weighted average
of rental costs, based on number of bedrooms and house type,
from the national military population. For example, if 10% of
the national military population resides in one bedroom
apartments, the average rental cost of one bedroom apartments
for that MHA accounts for 10% of the total average rental cost
figure for that county. Initially the bands will be broken
into groups of $45.00 increments. The costing bands begin at
an average rental cost of $260.00 and continue up to $890.00.
There is one further costing band which accounts for the
extremely high average rental cost areas such as Alaska which
are so far above all of the other areas in terms of cost. Thus
there are a total of 15 different costing bands including the
"high" costing band. The idea behind grouping military housing
areas together which have similar average rental costs is to
provide more data points to reliably predict median rental
costs per paygrade and dependency status based on the survey
data. Also using an "outside", other than military, source to
group the data provides a small means of getting away from the
military raising its own VHA rates. The "intent of VHA is not
to reimburse the military member for what he or she pays for
housing costs but to enable the military person to live in
adequate housing in whichever area he or she is assigned".¹
The costing bands will be used for two major purposes. One
purpose is, through the use of an appropriate regression model,
to determine owner equivalency housing costs, and the other
purpose is to provide housing cost data when there is
insufficient data in a category to determine a median rent for
that category. Once this needed data is found it will be
incorporated back into the MHA data, and then, within the MHA,
a median rent figure will be computed for each paygrade and
dependency status. This figure will then be utilized in the
congressionally mandated equation, (1), local median rent - 80%
of national median rental cost, to determine the VHA rates for
that MHA. Of course these VHA rates are then subject to
budgetary constraints and congressional approval.
¹ From a conversation with Debra Davis, DMDC, June 1989.
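The band arithmetic described in this section can be checked with a short Python sketch: $45.00 increments from $260.00 to $890.00 give 14 bands, and the "high" band makes 15. The function names, the half-open band boundaries, and the treatment of areas below $260.00 are assumptions made here for illustration; the source does not specify them.

```python
BAND_START = 260.0   # lowest average rental cost covered by the bands
BAND_END = 890.0     # upper limit of the regular bands
BAND_WIDTH = 45.0    # width of each regular band in dollars

N_REGULAR_BANDS = int((BAND_END - BAND_START) / BAND_WIDTH)  # 14
HIGH_BAND = N_REGULAR_BANDS                                  # index of the "high" band
N_BANDS = N_REGULAR_BANDS + 1                                # 15 in total


def costing_band(avg_rent):
    """Return the band index (0-based) for an MHA's average rental cost."""
    if avg_rent >= BAND_END:
        return HIGH_BAND
    if avg_rent < BAND_START:
        return 0  # assumption: very low cost areas fall in the first band
    return int((avg_rent - BAND_START) // BAND_WIDTH)


def weighted_average_rent(avg_rent_by_category, national_share):
    """Weight each category's average rent by its national population share."""
    return sum(national_share[c] * avg_rent_by_category[c] for c in national_share)
```

Under these assumptions two MHAs with the same average rental cost, such as the $260.00 example above, always receive the same band index.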
D. DATA DESCRIPTION
The data used to determine VHA rates come from data
collected from military members who participate in the VHA
Survey. The VHA Survey is taken every other year. The data
collected from the survey are kept by the Defense Manpower Data
Center which is the repository for all of the data used in the
VHA calculations. The data used in the VHA process consist of
raw survey data taken from each military housing area, and
contain information such as what type of house a military
member lives in, whether it is a single family home, townhouse,
apartment, or mobile home, how many bedrooms the house
contains, whether or not the military member has any dependents
or whether he or she shares the housing costs with another
military member, and the paygrade and service of the military
member. Also contained in the data for each military person
who participates in the survey is the rental cost, utility
costs, and maintenance cost of the housing. Other items such
as social security numbers, whether the member rents or owns
the housing, and other miscellaneous information are also part
of each data record for that particular military person.
The data used in this analysis, taken from the 1989
survey, consist of the paygrade (E1-O9) and dependency status
(having dependents, single, or single and sharing) of the
military member. In addition, the total housing cost for that
member, which consists of the rent plus the maintenance cost
plus the utility and insurance costs, is used. Further
information on the living space for the individual is also
needed, such as the number of bedrooms (1-4) and the type of
living space (detached house, townhouse, apartment, or
mobile home). Additionally, total pay (basic pay + BAQ) has
to be associated with each military member's dependency status
and paygrade in order to perform the regression analysis.
These raw data are edited to reflect only true rental costs not
ownership costs. Thus one data record used in this analysis
consists of information regarding paygrade, house type, number
of bedrooms, dependency status, total housing costs, and total
pay.
From this initial set of data one median rent for each
category of house type, number of bedrooms, marital status,
and paygrade is then computed. Thus data for an individual
costing band which might have consisted of over 50,000 records
is reduced to a data set which contains a maximum of 1104
records which reflects all of the possible combinations of
paygrade, house type, number of bedrooms and dependency status.
SAS was used to extract and edit the raw data, match total
pay to paygrade and dependency status, and compute a median
rent figure for each category of paygrade, dependency status,
number of bedrooms, and house type. (An example of this
program can be found in Appendix B.)
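The reduction from individual records to one median per category can be sketched in Python as below. This is a stand-in written for illustration and is not the SAS program of Appendix B; the record field names are hypothetical. The category counts (23 paygrade codes, 3 dependency statuses, 4 bedroom counts, 4 house types) are taken from the descriptions in this thesis and reproduce the stated maximum of 1104 records.

```python
from collections import defaultdict
from statistics import median

# Category counts as described in the text; their product is the maximum
# number of median-rent records per costing band.
N_PAYGRADES, N_DEPENDENCY, N_BEDROOMS, N_HOUSE_TYPES = 23, 3, 4, 4
MAX_CATEGORIES = N_PAYGRADES * N_DEPENDENCY * N_BEDROOMS * N_HOUSE_TYPES


def median_rent_by_category(records):
    """Group records by the four category variables and return, for each
    category, the median total housing cost and the number of records."""
    groups = defaultdict(list)
    for rec in records:
        key = (rec["paygrade"], rec["dependency"], rec["bedrooms"], rec["house_type"])
        groups[key].append(rec["total_cost"])
    return {key: (median(costs), len(costs)) for key, costs in groups.items()}


# Three hypothetical records in one category and one in another.
sample = [
    {"paygrade": 5, "dependency": "with", "bedrooms": 2, "house_type": "apartment", "total_cost": 450.0},
    {"paygrade": 5, "dependency": "with", "bedrooms": 2, "house_type": "apartment", "total_cost": 500.0},
    {"paygrade": 5, "dependency": "with", "bedrooms": 2, "house_type": "apartment", "total_cost": 610.0},
    {"paygrade": 12, "dependency": "single", "bedrooms": 1, "house_type": "townhouse", "total_cost": 700.0},
]
medians = median_rent_by_category(sample)
```

Keeping the record count alongside each median is a design choice of this sketch; the counts are what later distinguish medians of 20 or 30 data points from medians of only 5.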
E. PROBLEMS WITH THE DATA
There is one major problem associated with the data used
in the VHA computational process. The data used does not
include data from the military members who are in paygrades E5
and above and who share a residence with another person. These
data, which might provide further information and might enable
a more reliable estimate of median rents for an MHA to be
computed, are not being used. This is a policy decision. This
is a major problem in the computation of VHA rates because one
of the basic reasons for the existence of the "costing band"
idea and one of the major problems associated with the current
manner in which VHA rates are calculated, is the sparsity of
data. This policy decision essentially throws away what could
be valuable and informative data and is contradictory to the
purpose of finding "good" estimates of median rents.
F. PURPOSE OF THESIS
The main purpose of this thesis will be to test the
validity of the currently used regression model, equation (2).
The data in its newly proposed format of costing bands will be
used. If the current regression model is not found to be
adequate then the second goal of this thesis is to suggest a
better, more sensible model which will more accurately predict
total housing costs for each costing band. Thus this thesis
will basically consist of two different types of analyses and
will analyze the MHA data from two vantage points. Since there
is no explanation as to why an inverse of rent is predicted
linearly by the inverse of pay (equation 2), a more sensible
regression model will be examined to explain the relationship
between total rent and total pay.
A secondary goal of this thesis will be to test the current
and any proposed regression models not only with the data that
is currently assigned to each costing band but also with
fifteen other costing bands comprising data from the
original costing band plus data from the military members who
are E5 and above who share housing with another person. Thus
thirty costing bands will be formed and a comparison of the
regression models using the data from the original costing
bands and data from the "new" costing bands will be made. This
is important because it may show that the regression models are
better able to predict housing costs with the added information
and this in turn will provide better, more accurate VHA rates.
II. ANALYSIS PROCEDURES
A. ORDINARY LEAST SQUARES REGRESSION
Most of the analysis performed in this thesis employs
simple linear regression (ordinary least squares) to test the
various postulated models.
In ordinary least squares regression, a linear model,
$$Y_i = \beta_0 + \beta_1 X_i + \epsilon_i \tag{4}$$
is used to find the relationship between the $X_i$'s (independent
variables) and the $Y_i$'s (dependent variables). The random error
components, denoted by $\epsilon_i$, are assumed to be normally
distributed independent random variables with mean zero and
constant variance, $\sigma^2$. This relationship, as described by $\beta_0$
and $\beta_1$, is used to further predict or estimate other Y's. Since
$\beta_0$ and $\beta_1$ are fixed and unknown, $b_0$ and $b_1$ are used to denote
the estimates of their values [Ref. 2:p. 11]. With the
utilization of these estimators the least squares regression
fitted values are described by [Ref. 2:p. 11],
$$\hat{Y}_i = b_0 + b_1 X_i . \tag{5}$$
The values for $b_0$ and $b_1$ are determined by minimizing
$$S = \sum_{i=1}^{n} \epsilon_i^2 = \sum_{i=1}^{n} (Y_i - \beta_0 - \beta_1 X_i)^2 \tag{6}$$
By differentiating this equation first with respect to $\beta_0$ and
then with respect to $\beta_1$, setting these results
equal to zero, and solving for $\beta_0$ and $\beta_1$, the values for $b_0$ and
$b_1$ are found by setting the solution for $\beta_0$ equal to $b_0$ and the solution for $\beta_1$
equal to $b_1$. [Ref. 2:p. 13] The rationale behind this
minimization process is to ensure that the predicted ith value
is as "close" as possible (in Euclidean vertical distance) to
the actual ith value for all i. If the model (4) is correct
these estimates have minimum variance among all unbiased
estimates. [Ref. 2:p. 14] Utilizing the method above, the
value for $b_0$ [Ref. 2:p. 14] is
given by
$$b_0 = \bar{Y} - b_1 \bar{X} \tag{7}$$
and the value for $b_1$ [Ref. 2:p. 13] is given by
$$b_1 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n} (X_i - \bar{X})^2} . \tag{8}$$
The sum of the residuals squared divided by the number of
observations, n, minus two is given by
$$s^2 = \frac{\sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2}{n-2} \tag{9}$$
and represents the unbiased estimator of the variance about
the regression, $\sigma^2$, if the model is correct. [Ref. 2:p. 21] If
the postulated model is the true model then $\sigma^2 = \sigma^2_{Y \cdot X}$, the
conditional variance of Y given X. [Ref. 2:p. 23] Thus $s^2$ is
an estimate of $\sigma^2_{Y \cdot X}$ if the model is correct. [Ref. 2:p. 23]
The basic assumptions of ordinary least squares regression
are:
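1. $E(\epsilon_i) = 0$, $V(\epsilon_i) = \sigma^2$.
2. $\epsilon_i$ and $\epsilon_j$ are uncorrelated for $i \neq j$, $\mathrm{Cov}(\epsilon_i, \epsilon_j) = 0$.
3. $\epsilon_i$ is a normally distributed random variable with mean zero and variance $\sigma^2$. Thus, given assumption 2, the $\epsilon_i$'s are independent.
4. $E(Y|X) = \beta_0 + \beta_1 X$, the conditional expectation of Y given X is linear in X.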
If assumptions 1 and 2 hold then ordinary least squares
provides the best (minimum variance) linear unbiased estimates
of $\beta_0$ and $\beta_1$. [Ref. 2:p. 87] If all of the above
assumptions hold then $b_0$ and $b_1$ are the maximum likelihood
estimates of $\beta_0$ and $\beta_1$ and $s^2$ is an unbiased estimate of $\sigma^2$.
[Ref. 2:p. 88]
If the residuals are normally distributed it is then
possible to use the F and t tests to test the significance of
the regression and to test the individual null hypotheses that
$\beta_0$ equals 0 or that $\beta_1$ equals 0. If the null hypothesis is not
rejected and the values for $\beta_0$ and $\beta_1$ are not deemed different
from zero then, of course, there is no significant linear
relationship between the independent variables and the
dependent variables. The t test statistic for the slope is
$$t = \frac{(b_1 - 0)\left\{\sum_{i=1}^{n} (X_i - \bar{X})^2\right\}^{1/2}}{s} \tag{10}$$
and has a Student's t distribution with n-2 degrees of freedom.
[Ref. 2:p. 26] The F test statistic tests the overall
significance of the regression. The F test statistic is
$$F = \frac{b_1 \left\{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})\right\}}{s^2} \tag{11}$$
and has 1 and n-2 degrees of freedom. [Ref. 2:p. 32]
The $R^2$ value measures the "proportion of total variation
about the mean $\bar{Y}$ explained by the regression". [Ref. 2:p. 33]
$R^2$ is the sum of squares due to regression divided by the total
sum of squares, corrected for the mean $\bar{Y}$, and is denoted by
$$R^2 = \frac{\sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2}{\sum_{i=1}^{n} (Y_i - \bar{Y})^2} . \tag{12}$$
Values for $R^2$ fall between 0 and 1. The closer the value of
$R^2$ is to 1 the better the regression equation explains the
variation of the data about $\bar{Y}$.
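Equations (5) through (12) can be collected into one short Python sketch. The function name, the returned dictionary keys, and the sample data are introduced here for illustration only; the analysis in this thesis was carried out in SAS.

```python
import math


def simple_ols(x, y):
    """Simple linear regression of y on x by ordinary least squares."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx                         # equation (8)
    b0 = y_bar - b1 * x_bar                # equation (7)
    fitted = [b0 + b1 * xi for xi in x]    # equation (5)
    residuals = [yi - fi for yi, fi in zip(y, fitted)]
    s2 = sum(e ** 2 for e in residuals) / (n - 2)   # equation (9)
    t = b1 * math.sqrt(sxx) / math.sqrt(s2)         # equation (10)
    f = b1 * sxy / s2                               # equation (11)
    ss_total = sum((yi - y_bar) ** 2 for yi in y)
    r2 = sum((fi - y_bar) ** 2 for fi in fitted) / ss_total  # equation (12)
    return {"b0": b0, "b1": b1, "s2": s2, "t": t, "F": f, "R2": r2,
            "fitted": fitted, "residuals": residuals}


# Hypothetical data lying close to the line Y = 2X.
fit = simple_ols([1.0, 2.0, 3.0, 4.0, 5.0], [2.1, 3.9, 6.2, 7.8, 10.1])
```

For a single independent variable the F statistic of equation (11) equals the square of the t statistic of equation (10), which provides a convenient check on any implementation.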
The "residuals contain all available information on the way
in which the fitted model fails to properly explain the
observed variation in the dependent variable Y" [Ref. 2:p. 34].
Thus careful examination of the residuals will provide
indications as to the adequacy of the proposed model. A
graphic examination of the residuals may provide an indication
that the model is systematically deficient. Also utilizing a
lack of fit test may indicate that the model appears to be
inadequate.
The lack of fit test breaks the residual sum of squares
into a component due to lack of fit and a component due to pure
error, which give the mean square due to lack of fit, $MS_L$, and the mean
square due to pure error, $s_e^2$. [Ref. 2:p. 37] $MS_L$
estimates $\sigma^2$ if the model is correct and $\sigma^2$ plus a bias term if
the model is inadequate. The value for $s_e^2$ estimates $\sigma^2$. [Ref.
2:p. 37] The lack of fit test compares the F ratio $MS_L/s_e^2$ with
the $100(1-\alpha)\%$ point of an F distribution with $(n_r - n_e)$ and $n_e$
degrees of freedom where $n_r$ equals the number of degrees of
freedom associated with the residual sum of squares and $n_e$
equals the number of degrees of freedom associated with the
pure error sum of squares. If the comparison is significant
(i.e., the F ratio is greater than the tabled F value) this
then serves as an indication that the model is inadequate [Ref.
2:p. 37]. If the test is not significant (i.e., the F ratio
value is less than the tabled F value), this is an indication
that "there appears to be no reason to doubt the adequacy of
the model and both pure error and lack of fit mean squares can
be used as estimates of $\sigma^2$". [Ref. 2:p. 37]
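The lack of fit computation can be sketched in Python as follows. The sketch assumes that some X values are repeated, since pure error is measured from repeated observations at the same X; the function name and sample data are hypothetical, and the comparison with the tabled F value is left to the reader.

```python
from collections import defaultdict


def lack_of_fit(x, y):
    """Return (F ratio, lack of fit d.f., pure error d.f.) for a straight-line fit."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    ss_resid = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    n_r = n - 2                                  # residual degrees of freedom

    # Pure error: variation of Y about its own mean at each repeated X.
    groups = defaultdict(list)
    for xi, yi in zip(x, y):
        groups[xi].append(yi)
    ss_pure = 0.0
    n_e = 0                                      # pure error degrees of freedom
    for ys in groups.values():
        g_mean = sum(ys) / len(ys)
        ss_pure += sum((yi - g_mean) ** 2 for yi in ys)
        n_e += len(ys) - 1

    ms_lack = (ss_resid - ss_pure) / (n_r - n_e)  # MS_L
    s2_e = ss_pure / n_e                          # pure error mean square
    return ms_lack / s2_e, n_r - n_e, n_e


# Hypothetical data with two observations at each of three X values.
f_ratio, df_lack, df_pure = lack_of_fit([1, 1, 2, 2, 3, 3], [1, 3, 4, 6, 9, 11])
```

In this small example the pure error sum of squares is 6 on 3 degrees of freedom and the lack of fit sum of squares is 4/3 on 1 degree of freedom, so the F ratio is 2/3.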
By graphically examining the residuals, a scatter plot of
the $e_i$'s versus the $\hat{Y}_i$'s will give an indication as to whether
or not the assumptions of normality, homoscedasticity and
linearity of ordinary least squares have been violated. If the
proposed model is correct, the resulting residuals should
indicate that these assumptions hold. [Ref. 2:p. 141] If the
model is correct a plot of the residuals versus the fitted
values should take the shape of a horizontal band as shown in
Figure 2.1 below [Ref. 2:p. 145]. If the plot of the residuals
takes the shape of a funnel as shown in Figure 2.2 below [Ref.
2:p. 146], the variance, $\sigma^2$, is not constant and is increasing
with x, which indicates the need either for weighted least
squares or a transformation on the observations $Y_i$ before
performing a regression analysis. [Ref. 2:p. 147]
Figure 2.1 Satisfactory Residual Plot [Ref. 2:p. 145]
Figure 2.2 Unsatisfactory Funnel-Shaped Residual Plot [Ref. 2:p. 146]
B. INITIAL MODELS TESTED USING ORDINARY LEAST SQUARES
REGRESSION
The first step in this analysis was to test the model
currently in use, equation (2), to see if it could be used to
predict median rents for each of the thirty costing bands.
The model was tested under several different conditions.
First, the model was run using all of the available data in
each costing band. Next the data was divided by marital status
and within each costing band the model was tested using all of
the data for those military personnel with dependents and then
the model was tested using all of the data for those military
personnel without dependents. The model was tested under
another condition in which the data was broken down further by
paygrades into enlisted, paygrades 1-9, company grade officers,
paygrades 10-19, and field grade officers, paygrades 20-23.
Thus the model was tested within each costing band according
to groupings of the data consisting of enlisted personnel,
company grade officers, and field grade officers. Finally the
current model was tested within each costing band by grouping
the data by a combination of dependency status and paygrade
categories. In this case the data in each costing band was
first broken into groups by dependency status and within each
dependency group, the data was further broken into categories
of enlisted, company grade officer and field grade officer.
For each of the above mentioned conditions in which the
model was tested, the data was plotted as $1/T_{ijkl}$ versus $1/P_{ijkl}$, the
model was tested using ordinary least squares regression
procedures, the residuals were plotted versus the fitted values
of the median rents, $\hat{T}_{ijkl}$, and the residuals were tested for
normality. (These results are given in the next chapter.)
After reviewing the results of the regression procedures,
the initial model did not seem to adequately describe the
relationship between total pay and median rental costs nor did
it serve as an adequate predictor of fitted values for median
rental costs since the assumptions of least squares regression
were violated. Evidence of this includes low $R^2$ values,
non-normality of the residuals, unequal variance of the data, and
an indication of significant lack of fit. This, along with
cross-validation results, is explained in detail in the
analysis portion of this thesis. Therefore a new model was
postulated. The new model was
$$T_{ijkl} = A P_{ijkl} + B + \epsilon_{ijkl} \tag{13}$$
in which all of the variables have the same meaning as in the
previous model. The only difference was that the total pay and
median rental cost vectors were not inverted. This model was
tested in all of the same conditions as the initial model. In
other words the model was first tested using all of the data.
The data was then broken into groups by dependency status and
the regression was run again. The data was next broken into
groups by paygrade and ordinary least squares regression was
used to test the model using this data. Finally the data was
broken into groups by a combination of both paygrade and
dependency status and the model was again tested.
The results of the regression analysis testing this model
again indicated that a systematic deficiency in the model
existed; namely that the residuals exhibited a tendency towards
nonconstant variance and that the residuals were not normally
distributed. The nonconstant variance is explainable by the
fact that different medians from different population sizes
will have different variances. Thus a weighted least squares
approach was attempted in conjunction with this model.
C. WEIGHTED LEAST SQUARES REGRESSION
If a postulated model has been tested using ordinary least
squares procedures and examination of the residuals shows a
nonconstant variance, some type of transformation on Y is
necessary. This transformation changes the $e_i$'s
so that the assumptions of ordinary least squares regression
hold. [Ref. 2:p. 147] Generally a nonconstant variance
among the residuals indicates that some of the observations are
"less reliable" than others. [Ref. 2:p. 108] In this case the
$e_i$'s are normally distributed with mean 0 and variance $\sigma_i^2$
instead of $\sigma^2$. Thus the $e_i$'s have variance $v_i\sigma^2$. To combat
this nonconstant variance term, $v_i\sigma^2$, the entire regression
equation

$$Y_i = b_0 + b_1 X_i + e_i \qquad (14)$$

is multiplied by the weight, $v_i^{-1/2}$. Thus the regression
equation becomes

$$\frac{Y_i}{\sqrt{v_i}} = \frac{b_0}{\sqrt{v_i}} + \frac{b_1 X_i}{\sqrt{v_i}} + \frac{e_i}{\sqrt{v_i}} \qquad (15)$$

Then $E(e_i/\sqrt{v_i}) = 0$ and $V(e_i/\sqrt{v_i}) = E(e_i^2/v_i) = v_i\sigma^2/v_i = \sigma^2$.
Thus $e_i/\sqrt{v_i} \sim N(0, \sigma^2)$. Therefore the assumptions of ordinary
least squares will now hold and ordinary least squares
procedures may now be applied to the transformed regression
equation.
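The equivalence between ordinary least squares on the transformed equation and a weighted fit of the original equation can be checked numerically. The following is a minimal sketch in Python, not the SAS procedure used in this thesis; the function names and the sample numbers are hypothetical and are chosen only for illustration.

```python
import math

def weighted_fit(x, y, v):
    """Fit Y = b0 + b1*X by least squares with weights 1/v_i."""
    w = [1.0 / vi for vi in v]
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    b1 = sxy / sxx
    return ybar - b1 * xbar, b1

def transformed_ols(x, y, v):
    """OLS on the equation after every term is multiplied by v_i**-0.5."""
    z0 = [1.0 / math.sqrt(vi) for vi in v]             # transformed intercept column
    z1 = [xi / math.sqrt(vi) for xi, vi in zip(x, v)]  # transformed X
    yt = [yi / math.sqrt(vi) for yi, vi in zip(y, v)]  # transformed Y
    # Normal equations of the no-intercept regression of yt on (z0, z1).
    a = sum(p * p for p in z0)
    b = sum(p * q for p, q in zip(z0, z1))
    d = sum(q * q for q in z1)
    r0 = sum(p * t for p, t in zip(z0, yt))
    r1 = sum(q * t for q, t in zip(z1, yt))
    det = a * d - b * b
    return (d * r0 - b * r1) / det, (a * r1 - b * r0) / det

# Hypothetical data: v_i is the assumed relative variance of observation i.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [5.1, 7.9, 11.2, 13.8, 17.3]
v = [1.0, 4.0, 1.0, 9.0, 2.0]
print(weighted_fit(x, y, v))
print(transformed_ols(x, y, v))
```

Both functions should return the same intercept and slope up to rounding, since each minimizes the same weighted sum of squared residuals.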
Evidence of nonconstant variance was seen in the residual
plots after OLS regression was applied using the model (13)
for most of the costing bands. This implies, as stated above,
that some of the observations were less reliable than others.
Intuitively this makes sense in this problem since each
observation represents a median cost and not an individual
cost. Thus some observations represent the median of 20 or 30
data points while other observations represent the median of
only 5 data points. This makes the median of only five data
points "less reliable" than a median which represents 20 or
30 data points.
In order to transform the model into one in which the
assumptions of ordinary least squares hold, a weight $v_i^{-1/2}$ must
be found. In this case the necessary weight is $1/s_i$ where

$$s_i = \frac{1.25\,R_i}{1.35\sqrt{n_i}} \qquad (16)$$

This is the Gaussian-based approximation (Kendall and Stuart,
1967) of the standard deviation of the median. [Ref. 3:p. 16]
$R_i$ equals the interquartile range for the ith subset of data
and $n_i$ equals the number of data points comprising that median.
The reason for this is that if $x$ is $N(\mu, \sigma)$ then the median is
approximately $N(\mu, \sigma\sqrt{\pi/(2n)})$. From the normal table, for normal
distributions, $\mathrm{IQR} = 1.35\sigma$; thus

$$s_i = \sqrt{\frac{\pi}{2n_i}}\,\frac{\mathrm{IQR}}{1.35} \approx \frac{1.25\,R_i}{1.35\sqrt{n_i}} \qquad (17)$$
Since the variance of $e_i$ is $\sigma_i^2$ and since $s_i$ is an estimate
of $\sigma_i$, if we transform the $e_i$'s into $e_i/s_i$ the variance of $e_i/s_i$
should approximate 1. The variance of the transformed $e_i$'s is
now estimated to be one and is thus approximately constant.
Accordingly, the predictor will have more nearly constant
variance. Therefore this assumption of ordinary least squares
holds and OLS regression procedures are more appropriately
performed on the transformed model.
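As a worked illustration of the standard deviation of the median and the weight derived from it, the sketch below evaluates $s_i = 1.25 R_i / (1.35\sqrt{n_i})$ for two medians. It is written in Python rather than SAS, and the function names and the numbers (an interquartile range of 135) are hypothetical.

```python
import math

def median_std_dev(iqr, n):
    """Approximate standard deviation of a median: 1.25 * R / (1.35 * sqrt(n))."""
    return 1.25 * iqr / (1.35 * math.sqrt(n))

def wls_weight(iqr, n):
    """Weight 1/s applied to each term of the regression equation."""
    return 1.0 / median_std_dev(iqr, n)

# Two hypothetical medians with the same interquartile range (135)
# but built from 25 and from 5 data points.
for n in (25, 5):
    s = median_std_dev(135.0, n)
    print(n, round(s, 2), round(wls_weight(135.0, n), 4))
```

With the same interquartile range, the median built from 5 points has the larger standard deviation and therefore receives the smaller weight, which is the "less reliable" effect described in this section.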
D. ANALYSIS OF COVARIANCE MODEL
The results of using a weighted least squares approach
with the transformed model, equation (15), indicated that this
was more sensible than using ordinary least squares; however,
another approach also seemed plausible. Analysis of Covariance
(ANCOVA) was used, in which the grand mean rental cost is
adjusted within each group of paygrade, number of bedrooms and
house type by the rental cost which is determined by these
factors. Thus the ANCOVA model would become

$$Y_{ijk} = X_0 B_0 + X_{ijk} B_{ijk} + e_{ijk} \qquad (18)$$

in which the $X_0 B_0$ term is the grand mean and the $X_{ijk} B_{ijk}$ term is the
total pay for each group of number of bedrooms and house type.
The $Y_{ijk}$ term would represent rental cost for the ith person
in the jth type of house with the kth number of bedrooms
in the house. This model differs from the previous model in
that instead of using medians of total pay within groups of
paygrade, house type, bedrooms, and dependency status to
predict median rent, the model used the total pay of each
individual person in a costing band and the deviations caused
by differences in house type and number of bedrooms to predict
rent. Thus, in this case, total pay becomes the continuous
variable and house type and number of bedrooms become the
categorical terms. Paygrade and dependency status were not used
as class variables in this model since total pay adequately
reflected their values. Their inclusion would cause
collinearity to exist among the variables and the regression
estimates would then be unreliable.
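The idea of a continuous covariate (total pay) combined with categorical terms (house type and number of bedrooms) can be sketched as follows. This is a simplified illustration in Python, not the SAS run used in this thesis. It assumes one common slope on total pay and a separate intercept for each combination of house type and number of bedrooms, which may differ from the exact parameterization of equation (18). All names and numbers are hypothetical.

```python
from collections import defaultdict

def ancova_common_slope(records):
    """records: iterable of (cell, total_pay, rent); cell = (house_type, bedrooms).

    Returns (slope, {cell: intercept}) for rent = intercept[cell] + slope * pay.
    """
    cells = defaultdict(list)
    for cell, pay, rent in records:
        cells[cell].append((pay, rent))
    means = {}
    sxy = sxx = 0.0
    for cell, obs in cells.items():
        xbar = sum(p for p, _ in obs) / len(obs)
        ybar = sum(r for _, r in obs) / len(obs)
        means[cell] = (xbar, ybar)
        # Pool the within-cell sums of products and squares.
        sxy += sum((p - xbar) * (r - ybar) for p, r in obs)
        sxx += sum((p - xbar) ** 2 for p, _ in obs)
    slope = sxy / sxx
    intercepts = {c: yb - slope * xb for c, (xb, yb) in means.items()}
    return slope, intercepts

# Hypothetical individuals: rent = cell intercept + 0.25 * total pay.
true_intercepts = {(1, 3): 200.0, (3, 1): 120.0, (3, 2): 150.0}
records = [(cell, pay, a + 0.25 * pay)
           for cell, a in true_intercepts.items()
           for pay in (1200.0, 1800.0, 2400.0)]
slope, intercepts = ancova_common_slope(records)
print(slope, intercepts)
```

Because the hypothetical rents were generated without error, the fit should recover the slope of 0.25 and the three cell intercepts.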
E. CROSS VALIDATION TECHNIQUES
Since the weighted least squares approach with the model
(15) and the ANCOVA approach (18) using all the data, not the
median data, were thought to be the most sensible, a cross
validation technique was used in each case to test the
parameter estimates and the models. For the weighted least
squares model half of the data was used to determine regression
coefficients and these coefficients were then used with the
other half of the data to calculate new fitted values. These
values were then compared to the actual observed values to find
estimates of slope and intercept. The quantity

$$\sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2 \qquad (19)$$

is the residual sum of squares. These residual sums of
squares were compared for each half of the
data within each of the thirty costing bands for the weighted
least squares model. For the ANCOVA model, no provision in SAS
was available for the above described cross validation so the
data for each costing band was randomly divided in half and the
parameter estimates of the coefficients and their standard errors
for each half of the data were compared (see results in the
Analysis chapter).
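The split-half procedure described for the weighted least squares model can be sketched as follows. This is an illustration in Python with hypothetical names and data, not the SAS code used in this thesis. For brevity it fits an unweighted line; weights would be applied in the same way to both halves.

```python
import random

def fit_line(points):
    """Ordinary least squares fit of y = b0 + b1*x to (x, y) pairs."""
    n = len(points)
    xbar = sum(x for x, _ in points) / n
    ybar = sum(y for _, y in points) / n
    b1 = (sum((x - xbar) * (y - ybar) for x, y in points)
          / sum((x - xbar) ** 2 for x, _ in points))
    return ybar - b1 * xbar, b1

def residual_ss(points, b0, b1):
    """Sum of squared differences between observed and fitted values."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in points)

def split_half(points, seed=0):
    """Randomly divide the data in half; fit on one half, predict the other."""
    shuffled = list(points)
    random.Random(seed).shuffle(shuffled)
    half = len(shuffled) // 2
    first, second = shuffled[:half], shuffled[half:]
    b0, b1 = fit_line(first)
    return residual_ss(first, b0, b1), residual_ss(second, b0, b1)

# Hypothetical (total pay, median rent) pairs lying near a line.
data = [(1000.0 + 100.0 * i, 300.0 + 25.0 * i + (5.0 if i % 2 else -5.0))
        for i in range(20)]
rss_fit, rss_holdout = split_half(data)
print(rss_fit, rss_holdout)
```

The first value is the residual sum of squares for the half used to estimate the coefficients; the second applies those same coefficients to the held-out half.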
III. ANALYSIS
A. ANALYSIS OF CURRENT MODEL
The current model, equation (2), was run using OLS
regression procedures with the data from the thirty costing
bands, fifteen of which contained data as specified by the Per
Diem Committee and fifteen which contained the additional data
obtained from those military members who are in paygrades E5
and above and who share their residences. The results of the
regression analysis indicated that this model was suspect
in that it did not adequately fit the data, and therefore
might not produce an adequate prediction of median rent based
on total pay.
Initially the current model, equation (2), was run using
all of the available data within each costing band. The data
was plotted, median rent versus total pay, for each costing
band. A spread in the variance of the data was seen and in
some instances a curve was present, indicating a nonlinear
rather than a linear relationship (See Appendix A). The
regression analysis results as seen in Table 1 (See Appendix
C) showed that in twenty-three out of twenty-eight cases the
model had a significant lack of fit. (The data from the other
two costing bands contain only two data points and regression
analysis is not valid in these two cases.) The residual plots
from each of these regressions also exhibited evidence of
nonconstant variance, which was a further indication that the
model was inadequate. (These residual plots can be seen in
Appendix A.) The regression results from the costing bands
which did not exhibit a significant lack of fit did, however,
have residuals which had a nonconstant variance and were not
normally distributed. Also the $R^2$ values in each of these
cases were extremely low (less than .32), which again served as
an indication that the model explained at most a third of
the variance.
The data within each of the thirty costing bands was then
broken into two groups according to dependency status. The
"zero" group within each costing band contained the data from
those military members who had dependents, and the "one" group
contained the data from those military members who claimed no
dependents. The regression model, equation (2), was run again
using these new groupings of the data. The results of the
regression analysis again indicated that this model was
entirely inappropriate. Although there was not one case of
significant lack of fit, the residual analysis of the data, as
seen in Table 2 (Appendix C), from twenty-six out of twenty-
eight of the costing bands, illustrated that the residuals were
not normally distributed. The residual plots (Appendix A)
again show nonconstant variance. In two cases, the "zero"
labeled data from costing bands 510 and 512, the residuals
were normally distributed with constant variance, there was no
significant lack of fit, and the F test for significance of the
regression indicated a significant regression. Even so, the $R^2$
values in these two cases were less than .500, which indicates
a lot of unexplained
variance. In this instance, with the data broken into groups
by dependency status, the model again was inadequate.
Next the data within each of the thirty costing bands was
broken into groups according to paygrade. Paygrade 1 consisted
of the data from military members who are in paygrades E1 to
E9. Paygrade 2 consisted of the data from military members who
are in paygrades W1-W4, O1E-O3E, and O1-O3. Paygrade 3
consisted of the data from military members in paygrades O4-O7.
Data from paygrades O8 and above are included in the data
for paygrade O7. The model, equation (2), was again tested
using this data. With the data from the costing bands broken
into groups in this manner there were 84 different cases in
which the model was tested. In fifty out of eighty-four cases,
as can be seen in Table 3 (Appendix C), a significant lack of
fit was found. Of the thirty-four cases where there was not
a significant lack of fit, twenty-eight had residuals
which were not normally distributed and had residual plots
which showed evidence of nonconstant variance. The six cases
which showed no evidence of lack of fit, and which had
residuals which were normally distributed, namely costing band
632 paygrade 3, costing band 530 paygrade 2, costing band 590
paygrade 2, costing band 570 paygrade 3, costing band 650
paygrade 3, and costing band 510 paygrade 2, all had $R^2$ values
less than .330. Thus once again there was strong evidence that
even in this case where the data was broken into groupings
according to paygrade the model was inadequate.
To further ensure that the model was tested under all
appropriate conditions, the data was broken into groups first
by dependency status and then further broken into groups by
paygrade. Thus the data from each costing band was broken into
"zero" or "one" groups as defined previously. The "zero" or
"one" groups were then broken into further groupings according
to paygrade. Thus the "zero" group, for example, was broken
into three further groups, paygrade 1, paygrade 2, and paygrade
3, also as previously defined. Therefore each of the twenty-eight
costing bands now has two dependency statuses and, within
each dependency status, three paygrades associated with it.
Thus the model was tested using 168 different sets of data.
The results of the regression analysis, using each of these
different data sets, can be seen in Table 4 (Appendix C). At
an alpha level of .05 three out of the 168 data sets showed
significant lack of fit. Of those data sets which did not show
a significant lack of fit 105 had residuals which were not
normally distributed and which had residual plots which
exhibited nonconstant variance. Of those remaining sixty sets
of data which show no significant lack of fit and normally
distributed residuals, nineteen of them did not have
significant overall regressions according to the F test at an
alpha level of .05. Of the remaining forty-one data sets which
did not show significant lack of fit, which had normally
distributed residuals and residual plots showing constant
variance (Appendix A), and which had regressions which were
significant according to the F test, all had $R^2$ values which
were less than .440. In fact all but four of these remaining
data sets had $R^2$ values which were less than .220. Thus this
analysis indicates once again that the original model was
woefully inadequate and that in none of the cases where the
data was broken into groups according to dependency status, or
by paygrade, or by a combination of both would this model
adequately predict median rent based on total pay. An adequate
model would be one in which there was no lack of fit, the
assumptions of least squares regression would hold, and the $R^2$
values would be high, indicating that the model explains the
variance of the data.
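Two of these criteria, lack of fit and the proportion of variance explained, can be illustrated with a short computation. The sketch below, in Python with hypothetical names and data, computes $R^2$ and the classical pure-error lack-of-fit F statistic for a straight-line fit when some X values are repeated. It assumes this textbook form of the test; the results reported in this thesis were produced in SAS, and the exact procedure is not reproduced here.

```python
from collections import defaultdict

def fit_line(x, y):
    """Ordinary least squares fit of y = b0 + b1*x."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    b1 = (sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
          / sum((a - xbar) ** 2 for a in x))
    return ybar - b1 * xbar, b1

def r2_and_lack_of_fit(x, y):
    """Return (R^2, lack-of-fit F) for the fit of y = b0 + b1*x."""
    b0, b1 = fit_line(x, y)
    ybar = sum(y) / len(y)
    ss_res = sum((b - (b0 + b1 * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - ybar) ** 2 for b in y)
    groups = defaultdict(list)          # replicate Y values at each distinct X
    for a, b in zip(x, y):
        groups[a].append(b)
    ss_pe = sum(sum((b - sum(g) / len(g)) ** 2 for b in g)
                for g in groups.values())     # pure-error sum of squares
    df_pe = len(x) - len(groups)
    df_lof = len(groups) - 2                  # two fitted parameters
    f_lof = ((ss_res - ss_pe) / df_lof) / (ss_pe / df_pe)
    return 1.0 - ss_res / ss_tot, f_lof

# Hypothetical data following a curve, with two replicates at each X.
x = [1, 1, 2, 2, 3, 3]
y = [1.0, 1.2, 4.0, 4.2, 9.0, 9.2]
print(r2_and_lack_of_fit(x, y))
```

The F statistic is referred to the F distribution with (m - 2, n - m) degrees of freedom, where m is the number of distinct X values and n is the number of observations. The two measures answer different questions, so a data set can show a large lack-of-fit statistic and a high $R^2$ at the same time.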
B. ANALYSIS OF PROPOSED MODEL
The proposed model, equation (13), was tested using the
same data from the thirty costing bands as was used to test the
current model, equation (2). The results of the regression
analysis indicated that in certain cases the use of this model
may be more adequate in predicting median rent from total pay;
however it must be used with caution.
This model, equation (13), was also tested using the same
groupings of data as used in testing the current model,
equation (2). Initially, the model was tested using all of the
data within each costing band. As in the previous model, median
rent versus total pay was plotted. The plots indicated an
increase in variance but also a strong linear relationship.
The results of the regression analysis showed that in
all twenty-eight instances, see Table 5, a significant lack of
fit was evidenced. Next the data within each costing band was
broken into groups by dependency status. The data was labeled
with a zero if the military member had dependents and the data
was labeled with a one if the military member had no dependents
or had no dependents and was sharing his or her residence. The
plots of median rent versus total pay for each costing band
indicated an even stronger linear relationship than in the
original plots but they still exhibited evidence of unequal
variance. The results of the regression analysis, see Table
6, showed that in eight out of fifty-six cases a significant
lack of fit was evidenced. Of the remaining forty-eight cases
twelve of these had residuals which were not normally
distributed. The residual plots of these data sets showed that
nonconstant variance was present. The residual plots of the
thirty-six cases which did not have significant lack of fit,
which had residuals which were normally distributed, and which
were significant regressions at the alpha level .05, also
showed some evidence of nonconstant variance. Also, the $R^2$
values were in the .4 to .5 range, with the highest value being
.55. These $R^2$ values are lower than the ones obtained with the
use of the Weighted Least Squares model, seen in the next
section, whose purpose is to reduce or eliminate the
nonconstant variance of the residuals. Thus prediction was
worse for residuals with more variance. See Appendix A. The
data within each costing band was next broken into groups by
paygrade. This procedure was the same as the one used in
testing the current model: paygrade 1 reflected paygrades E1-E9,
paygrade 2 reflected paygrades W1-W4, O1E-O3E, and O1-O3,
and paygrade 3 reflected paygrades O4-O7 with paygrades O8-O10
included in paygrade O7. When the data was broken into
these groups there were many more cases of significant lack of
fit, fifty-six out of eighty-four (see Table 7, Appendix C).
Also because of few data points within each group, the
overall regressions in many instances were not significant.
Finally the data was broken into groups first by dependency
status and then by paygrade. The results of the regression
analysis indicated that while there were only eight cases of
significant lack of fit out of one hundred and sixty-eight
(see Table 8, Appendix C), thirty had residuals which were not
normally distributed and, because of few data points within each
group, some of the data sets did not have significant
regressions at the .05 alpha level. Of the regressions on the
data sets which did fulfill all of the criteria, the $R^2$ values
were low. Thus the model best predicted median rents from total
pay when the data was divided by dependency status; however,
this model must be viewed as possibly inaccurate since the
residual plots indicated evidence of nonconstant variance, and
a better model would predict points in an unbiased fashion.
C. ANALYSIS OF WEIGHTED LEAST SQUARES MODEL
Analysis of the Weighted Least Squares Model, equation
(15), with $Y_i$ = median rent and $X_i$ = total pay for the ith
group, was conducted in the same manner as that of the current
model, equation (2), and that of the proposed model, equation
(13). The only difference here was that initially the data
were randomly divided into two sections in order to use cross
validation procedures. The sum of the squares of the
residuals of the first division of data was compared to the
sum of the squares of the errors of the second division of
data, in which
the parameter estimates from the first set of data were used
to compute the predicted values for the second set of data.
Thus the Weighted Least Squares model was first tested using
one half of all of the data available within each costing band,
next the model was tested by the half of the data which had
been divided into groups by dependency status, then the model
was tested by the half of the data which had been broken into
groups by paygrade within each costing band, and finally the
model was tested with half of the data which had been broken
first into groups according to dependency status and then by
paygrade.
The results of the regression analysis using half of all
of the data within each costing band showed (see Table 9,
Appendix C) that a significant lack of fit existed for each
costing band. When the data was broken into divisions by
dependency status the regression analysis results, see Table
10 (Appendix C), showed that seventeen out of fifty-six cases
exhibited significant lack of fit and that nine out of the
thirty-nine remaining cases did not have normally distributed
residuals. Three out of the remaining thirty cases did not
have regressions which were significant overall and, of the
remaining twenty-seven cases in which all statistical criteria
were met, the $R^2$ values were typically between .44 and .75.
There was no evidence of nonconstant variance in the residual
plots, and the residuals appeared to be normally
distributed in most cases.
When the data was broken into groups by paygrade, only
twenty-five out of a possible eighty four cases, see Table 11
(Appendix C), met all of the criteria of successful regression
in that they did not have significant lack of fit, their
residuals were normally distributed, and their regressions were
significant at the .05 alpha level. The $R^2$ values, however,
ranged from very low to a high of .73. Again the residual
plots appeared to indicate a fairly normal distribution with
little evidence of nonconstant variance.
The results of the regression analysis, when the data was
broken into groups both according to dependency status and
paygrade, see Table 12, showed that better than half, 93 out
of 168, met the criteria for a successful regression and had
$R^2$ values ranging mostly between .4 and .65. There were,
however, very few data points in some categories; thus these
results must be viewed with suspicion. The statistics for lack
of fit, normality of the residuals, and overall significance
of the regression all might have been affected by this small
number of data points. Therefore this model using a weighted
least squares approach, equation (15), performed best when the
data within each costing band was divided according to
dependency status.
The cross validation technique used here proved to be
unsuccessful since only the sums of squares of the residuals
(SSR) were compared, see Table 13 (Appendix C), in the
case where all of the data was used within each costing band.
The differences between the SSR for the first group of data and
the data with predicted values found by employing the parameter
estimates from the first set of data for each costing band were
quite large. This could be due to the lack of fit which was
found or due to the fact that the second group generally had
several more data points than the first group. Either of these
two factors or a combination of both might have accounted for
these tremendous differences.
D. ANALYSIS OF THE ANALYSIS OF COVARIANCE MODEL
The results of the regression analysis on the ANCOVA model
indicated that this model may be the best model discussed thus
far for use in predicting rent based on total pay (see Table
14, Appendix C). All of the regressions were significant and
had $R^2$ values ranging from .42 to .58 with few values above or
below these numbers. The residual plots, normal plots, and
stem and leaf diagrams indicated that the residuals were
normally distributed (See Appendix C). The significance levels
of the normal statistic used to test the normality of the
residuals, however, did not, in most cases, indicate that the
residuals were normally distributed. However, the residuals
were fairly symmetric and the sample size was quite large;
therefore the model should be fairly robust to the lack of
normal fit. The residual plots showed the fairly typical box-like
pattern of randomly distributed data. The stem and leaf
and normal plots provided fairly good support for the
normality of the residuals.
In the case of several of the costing bands there did not
appear to be a significant difference in the least squares
means of the rent pertaining to different house types and
different number of bedrooms. This was particularly true
between house types 1 and 2 (single family home and townhouse)
and also between house types 3 and 4 (apartment or mobile
homes). In some costing bands there also appeared to be no
significant difference between the least squares means of rent,
predominantly between 3 and 4 bedrooms and less
predominantly between 1 and 2 bedrooms. This
indicates that, when there is not a significant difference
between the least squares means for two different types of
housing or two residences with different numbers of bedrooms,
either of the parameter estimates of the two types of housing or
numbers of bedrooms may be used to predict rent. Thus the
ANCOVA model which predicted rent based on the total pay
associated with number of bedrooms and house type may not have
been completely correct in these cases since the mean amount
of rent associated with each type of house or number of
bedrooms may not have been different.
The cross validation technique used here, since GLM does
not provide a vehicle to compute the Sum of Squares of the
Residuals from previously calculated parameter estimates, was
one in which the data was randomly divided into two sections
and after the ANCOVA model was run on both sets of data, the
coefficient of the slope parameter estimate and its standard
error were compared. A comparison of the slope parameter and
its standard error between the two sections of data from each
costing band revealed that the model was not at serious fault
since in both of the sections of the data the slope parameter
estimates were very close and the standard errors were small
and similar (See Table 14).
IV. CONCLUSIONS AND RECOMMENDATIONS
The purpose of this thesis was to test and validate the
current model, equation (2), to see if it could effectively be
used to predict rent based on total pay from the survey data
which had been arranged in a newly devised, simplified format.
If the current model was deemed invalid or suspect, then the
second purpose of this thesis was to propose a better, more
sensible model which would adequately predict rent based on
total pay.
There are two major conclusions from the analysis contained
in this thesis. The first conclusion is that the current
model, equation (2), should not be used to predict median rents
in each paygrade and dependency status when the data is divided
into costing bands in the manner previously described. This
conclusion is justified by the results of the regression
analysis which show that this model is inadequate and may not
accurately predict median rent. The second conclusion is that
both the weighted least squares model and the ANCOVA model are
possible alternative models for use in predicting rent based
on total pay. They are shown to be at least as reasonable as
the current model, if not better. The ANCOVA model may be
preferable for predicting a mean rather than a median rent. Also
the ANCOVA model may be preferable if the model is used to
determine owner equivalency rents. If a median rent figure
must be used in the congressionally mandated formula for the
computation of VHA, the weighted least squares model is
preferable.
The secondary purpose of this thesis was to determine if
the data from military personnel in paygrades E5 and above who
share housing should be used or discarded since these data had
been previously discarded on the basis of a policy decision
without any statistical backing. Curiously enough, there seems
to be no systematic difference across all of the models
investigated in relation to the addition of this data. In some
instances when regression analysis results from the same two
costing bands, one which contained the additional data and one
which did not contain the additional data, were compared, lack
of fit was affected. Also in some cases the significance of
the regression would be affected, or in some cases the $R^2$
values would go up or down. There was, however, no instance in
which, for example, all of the $R^2$ values would go up or all of
the significance of regression statistics would suddenly
increase or decrease for a certain model. The important
consideration here was that the additional data did affect $R^2$
values; it did affect the lack of fit, significance value
statistics, and the normality of residuals. Thus while the
additional data did not have a systematic effect, it did have
an effect and this aspect should not be overlooked when a
decision is made whether or not to include these data when VHA
rates are actually calculated.
There are several recommendations for further analysis.
First, the way in which the data is broken into costing bands
must be investigated. Perhaps a better method or a different
dollar figure could be used to divide the data into costing
bands. If a different method is used and the data contained
in each costing band is different, analysis of each of the
regression models discussed in this paper must be redone. If
the data is put into different costing bands other than the
ones used in this thesis, the models discussed may be more or
less accurate predictors of median rent. In either case the
original data must be investigated and natural breaks in the
data must be discovered in order to achieve the best placement
of data into costing bands. A second area which requires
further analysis concerns the ANCOVA model. The data, before
testing the ANCOVA model, should be divided into groups either
by dependency status or by paygrade. A better fit of the
regression model may be accomplished in either case. Other
models should also be investigated as possible solutions to the
problem. Perhaps instead of the weighted least squares,
another transformation on the data could be devised which may
provide a better model. Since there is an indication of non-
normal errors, perhaps GLIM (Generalized Linear Models) could
be used for more accurate prediction [Ref. 4]. Further
analysis and other models should still be investigated as
possible predictors of median rents for the VHA.
APPENDIX A. SCATTER AND RESIDUAL PLOTS
A. USING DATA SET 540 AS AN EXAMPLE, SCATTER AND RESIDUAL PLOTS
FOR THE CURRENT MODEL.
[Scatter plot not recoverable from the scan. Legend: A = 1 obs., B = 2 obs., etc.]
Figure 1. Data Set 540. 1/Median Rent vs. 1/Total Pay.
[Residual plot not recoverable from the scan. Horizontal axis: predicted value.]
Figure 2. Data Set 540. Residuals vs. Predicted Values.
[Scatter plot not recoverable from the scan.]
Figure 3. Data Set 540. Dependency Status '0'.
1/Median Rent vs. 1/Total Pay.
[Scatter plot not recoverable from the scan.]
Figure 4. Data Set 540. Dependency Status '1'.
1/Median Rent vs. 1/Total Pay.
[Residual plot not recoverable from the scan. Horizontal axis: predicted value.]
Figure 5. Data Set 540. Dependency Status '0'.
Residuals vs. Predicted Values.
[Residual plot not recoverable from the scan. Horizontal axis: predicted value.]
Figure 6. Data Set 540. Dependency Status '1'.
Residuals vs. Predicted Values.
[Scatter plot not recoverable from the scan.]
Figure 7. Data Set 540. Paygrade '1'.
1/Median Rent vs. 1/Total Pay.
[Scatter plot not recoverable from the scan.]
Figure 8. Data Set 540. Paygrade '2'.
1/Median Rent vs. 1/Total Pay.
[Scatter plot not recoverable from the scan.]
Figure 9. Data Set 540. Paygrade '3'.
1/Median Rent vs. 1/Total Pay.
[Residual plot not recoverable from the scan. Horizontal axis: predicted value.]
Figure 10. Data Set 540. Paygrade '1'.
Residuals vs. Predicted Values.
[Residual plot not recoverable from the scan. Horizontal axis: predicted value.]
Figure 11. Data Set 540. Paygrade '2'.
Residuals vs. Predicted Values.
[Residual plot not recoverable from the scan. Horizontal axis: predicted value.]
Figure 12. Data Set 540. Paygrade '3'.
Residuals vs. Predicted Values.
[Scatter plot not recoverable from the scan.]
Figure 13. Data Set 540. Dependency Status '0' and Paygrade '1'.
1/Median Rent vs. 1/Total Pay.
[Scatter plot not recoverable from the scan.]
Figure 14. Data Set 540. Dependency Status '0' and Paygrade '2'.
1/Median Rent vs. 1/Total Pay.
[Scatter plot not recoverable from the scan.]
Figure 15. Data Set 540. Dependency Status '0' and Paygrade '3'.
1/Median Rent vs. 1/Total Pay.
[Residual plot not recoverable from the scan. Horizontal axis: predicted value.]
Figure 16. Data Set 540. Dependency Status '0' and Paygrade '1'.
Residuals vs. Predicted Values.
[Residual plot not recoverable from the scan. Horizontal axis: predicted value.]
Figure 17. Data Set 540. Dependency Status '0' and Paygrade '2'.
Residuals vs. Predicted Values.
Figure 18. Data Set 540. Dependency Status '0' and Paygrade '3'.
Residuals vs. Predicted Values.
Figure 19. Data Set 540. Dependency Status '1' and Paygrade '1'.
1/Median Rent vs. 1/Total Pay.
Figure 20. Data Set 540. Dependency Status '1' and Paygrade '2'.
1/Median Rent vs. 1/Total Pay.
Figure 21. Data Set 540. Dependency Status '1' and Paygrade '3'.
1/Median Rent vs. 1/Total Pay.
Figure 22. Data Set 540. Dependency Status '1' and Paygrade '1'.
Residuals vs. Predicted Values.
Figure 23. Data Set 540. Dependency Status '1' and Paygrade '2'.
Residuals vs. Predicted Values.
Figure 24. Data Set 540. Dependency Status '1' and Paygrade '3'.
Residuals vs. Predicted Values.
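The reciprocal-scale plots above appear to correspond to the regression that the SAS program in Appendix B fits with MODEL IMCOST=ITOTP. As a sketch under that assumption (the symbols for the coefficients, the error term and the residual are notation introduced here, not the thesis's), the fitted relationship and the residuals plotted against predicted values can be written as:

```latex
\begin{align*}
\frac{1}{\text{MCOST}_i} &= \beta_0 + \beta_1\,\frac{1}{\text{TOTP}_i} + \varepsilon_i, \\
\widehat{\text{IMCOST}}_i &= \hat\beta_0 + \hat\beta_1\,\text{ITOTP}_i, \qquad
e_i = \text{IMCOST}_i - \widehat{\text{IMCOST}}_i .
\end{align*}
```

Here MCOST is the median rent for a cell and TOTP the total pay, with IMCOST and ITOTP their reciprocals as named in Appendix B. A predicted rent in dollars would follow by inverting the fitted value.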
B. USING DATA SET 540 AS AN EXAMPLE, SCATTER PLOTS AND RESIDUAL PLOTS FOR THE PROPOSED MODEL.
Figure 25. Data Set 540. Median Rent vs. Total Pay.
Figure 26. Data Set 540. Residuals vs. Predicted Values.
Figure 27. Data Set 540. Dependency Status '0'.
Median Rent vs. Total Pay.
Figure 28. Data Set 540. Dependency Status '1'.
Median Rent vs. Total Pay.
Figure 29. Data Set 540. Dependency Status '0'.
Residuals vs. Predicted Values.
Figure 30. Data Set 540. Dependency Status '1'.
Residuals vs. Predicted Values.
Figure 31. Data Set 540. Paygrade '1'.
Median Rent vs. Total Pay.
Figure 32. Data Set 540. Paygrade '2'.
Median Rent vs. Total Pay.
Figure 33. Data Set 540. Paygrade '3'.
Median Rent vs. Total Pay.
Figure 34. Data Set 540. Paygrade '1'.
Residuals vs. Predicted Values.
Figure 35. Data Set 540. Paygrade '2'.
Residuals vs. Predicted Values.
Figure 36. Data Set 540. Paygrade '3'.
Residuals vs. Predicted Values.
Figure 37. Data Set 540. Dependency Status '0' and Paygrade '1'.
Median Rent vs. Total Pay.
Figure 38. Data Set 540. Dependency Status '0' and Paygrade '2'.
Median Rent vs. Total Pay.
Figure 39. Data Set 540. Dependency Status '0' and Paygrade '3'.
Median Rent vs. Total Pay.
Figure 40. Data Set 540. Dependency Status '0' and Paygrade '1'.
Residuals vs. Predicted Values.
Figure 41. Data Set 540. Dependency Status '0' and Paygrade '2'.
Residuals vs. Predicted Values.
Figure 42. Data Set 540. Dependency Status '0' and Paygrade '3'.
Residuals vs. Predicted Values.
Figure 43. Data Set 540. Dependency Status '1' and Paygrade '1'.
Median Rent vs. Total Pay.
Figure 44. Data Set 540. Dependency Status '1' and Paygrade '2'.
Median Rent vs. Total Pay.
Figure 45. Data Set 540. Dependency Status '1' and Paygrade '3'.
Median Rent vs. Total Pay.
Figure 46. Data Set 540. Dependency Status '1' and Paygrade '1'.
Residuals vs. Predicted Values.
Figure 47. Data Set 540. Dependency Status '1' and Paygrade '2'.
Residuals vs. Predicted Values.
Figure 48. Data Set 540. Dependency Status '1' and Paygrade '3'.
Residuals vs. Predicted Values.
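The rent-scale plots in this section appear to come from the straight-line fit that Appendix B specifies with MODEL MCOST=TOTP. As a sketch under that assumption (the coefficient, error and residual symbols are notation introduced here), the model and the residuals shown against the zero reference line are:

```latex
\begin{align*}
\text{MCOST}_i &= \beta_0 + \beta_1\,\text{TOTP}_i + \varepsilon_i, \\
e_i &= \text{MCOST}_i - \bigl(\hat\beta_0 + \hat\beta_1\,\text{TOTP}_i\bigr).
\end{align*}
```

Residuals that scatter evenly about zero across the range of predicted values are consistent with a constant error variance; a fan shape would suggest otherwise.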
C. USING DATA SET 540 AS AN EXAMPLE, SCATTER PLOTS AND RESIDUAL PLOTS FOR THE WEIGHTED LEAST SQUARES MODEL.
Figure 49. Data Set 540. Residuals vs. Predicted Values.
Figure 50. Data Set 540. Dependency Status '0'.
Median Rent vs. Total Pay.
Figure 51. Data Set 540. Dependency Status '1'.
Median Rent vs. Total Pay.
Figure 52. Data Set 540. Dependency Status '0'.
Residuals vs. Predicted Values.
Figure 53. Data Set 540. Dependency Status '1'.
Residuals vs. Predicted Values.
Figure 54. Data Set 540. Paygrade '1'.
Median Rent vs. Total Pay.
Figure 55. Data Set 540. Paygrade '2'.
Median Rent vs. Total Pay.
Figure 56. Data Set 540. Paygrade '3'.
Median Rent vs. Total Pay.
Figure 57. Data Set 540. Paygrade '1'.
Residuals vs. Predicted Values.
Figure 58. Data Set 540. Paygrade '2'.
Residuals vs. Predicted Values.
Figure 59. Data Set 540. Paygrade '3'.
Residuals vs. Predicted Values.
Figure 60. Data Set 540. Dependency Status '0' and Paygrade '1'.
Median Rent vs. Total Pay.
Figure 61. Data Set 540. Dependency Status '0' and Paygrade '2'.
Median Rent vs. Total Pay.
Figure 62. Data Set 540. Dependency Status '0' and Paygrade '3'.
Median Rent vs. Total Pay.
Figure 63. Data Set 540. Dependency Status '0' and Paygrade '1'.
Residuals vs. Predicted Values.
Figure 64. Data Set 540. Dependency Status '0' and Paygrade '2'.
Residuals vs. Predicted Values.
Figure 65. Data Set 540. Dependency Status '0' and Paygrade '3'.
Residuals vs. Predicted Values.
Figure 66. Data Set 540. Dependency Status '1' and Paygrade '1'.
Median Rent vs. Total Pay.
Figure 67. Data Set 540. Dependency Status '1' and Paygrade '2'.
Median Rent vs. Total Pay.
Figure 68. Data Set 540. Dependency Status '1' and Paygrade '3'.
Median Rent vs. Total Pay.
Figure 69. Data Set 540. Dependency Status '1' and Paygrade '1'.
Residuals vs. Predicted Values.
Figure 70. Data Set 540. Dependency Status '1' and Paygrade '2'.
Residuals vs. Predicted Values.
Figure 71. Data Set 540. Dependency Status '1' and Paygrade '3'.
Residuals vs. Predicted Values.
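The Appendix B listing shows only unweighted PROC REG fits, so the following is a minimal sketch of how a weighted least squares fit of this kind could be run in SAS. It assumes the weight is the cell count NUMB produced by PROC UNIVARIATE in Appendix B; that choice of weight, and the names DATA548, WMCSTHT and WRESID, are hypothetical and not taken from the thesis.

```sas
/* Sketch only: weighted least squares of median rent on total pay.   */
/* Assumes DATA541 holds MCOST, TOTP and NUMB as built in Appendix B. */
PROC REG DATA=DATA541;
  MODEL MCOST=TOTP;
  WEIGHT NUMB;                 /* assumed weight: observations per cell */
  OUTPUT OUT=DATA548 P=WMCSTHT R=WRESID;    /* hypothetical names */
PROC PLOT DATA=DATA548;
  PLOT WRESID*WMCSTHT/VREF=0;  /* residuals vs. predicted values */
RUN;
```

Weighting by cell count gives medians computed from many rent observations more influence than medians computed from a few; a BY statement would repeat the fit within dependency status or paygrade groups.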
D. USING DATA SET 540 AS AN EXAMPLE, STEM AND LEAF, NORMAL PLOTS, AND RESIDUAL PLOTS FOR THE ANCOVA MODEL.
Figure 72. Data Set 540. Residuals vs. Predicted Values.
Figure 73. Data Set 540. Stem and Leaf and Normal Plots.
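An analysis of covariance of this kind could be set up in SAS roughly as follows. This is a sketch under stated assumptions, not the thesis's program: it assumes the grouped data set DATA541K from Appendix B (PG recoded to 1-3, NSHR to 0/1), treats both as classification factors with total pay as the covariate, and uses the hypothetical names DATA549, AMCSTHT and ARESID.

```sas
/* Sketch only: ANCOVA with total pay as the covariate.               */
/* The model terms are an assumption, not taken from the thesis.      */
PROC GLM DATA=DATA541K;
  CLASS PG NSHR;
  MODEL MCOST=PG NSHR TOTP / SOLUTION;
  OUTPUT OUT=DATA549 P=AMCSTHT R=ARESID;    /* hypothetical names */
PROC PLOT DATA=DATA549;
  PLOT ARESID*AMCSTHT/VREF=0;  /* residuals vs. predicted values */
PROC UNIVARIATE DATA=DATA549 PLOT NORMAL;   /* stem and leaf, normal plot */
  VAR ARESID;
RUN;
```

The two follow-up steps mirror the kinds of output captioned in Figures 72 and 73.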
APPENDIX B. SAS PROGRAM EXAMPLE
//XT4 JOB (1668,9999),'WILLIAMS',CLASS=S
//*MAIN SYSTEM=S2,LINES=(99),CARDS=(500)
// EXEC SAS
//WORK DD SPACE=(CYL,(20,2))
//DATAIN DD DISP=SHR,DSN=[illegible].DPDVHA.EDITSR.CCG45.M540
//DATAOUT DD DISP=(OLD,KEEP),DSN=MSS.S1668.EXT
//SYSIN DD *
DATA DATA540;
  INFILE DATAIN;
  INPUT PG 18-19 NSHR 20-21 HT 22-23 BR 24-25 RO 26-27 COST 30-33
        E1 34 E2 35;
  /* Rate tables indexed by pay grade 1-23. A missing value (.) marks */
  /* a figure that is illegible in the source listing.                */
  BW1=269; BW2=269; BW3=282; BW4=304; BW5=349; BW6=388; BW7=420;
  BW8=452; BW9=491; BW10=.; BW11=431; BW12=469; BW13=511; BW14=428;
  BW15=463; BW16=513; BW17=365; BW18=408; BW19=.; BW20=578; BW21=655;
  BW22=680; BW23=755;
  BWO1=150; BWO2=169; BWO3=208; BWO4=212; BWO5=244; BWO6=264; BWO7=292;
  BWO8=342; BWO9=372; BWO10=.; BWO11=338; BWO12=.; BWO13=.; BWO14=318;
  BWO15=370; BWO16=434; BWO17=.; BWO18=319; BWO19=402; BWO20=502;
  BWO21=542; BWO22=562; BWO23=613;
  TP1=1054; TP2=1178; TP3=1238; TP4=.; TP5=1631; TP6=1914; TP7=2238;
  TP8=2590; TP9=3072; TP10=.; TP11=.; TP12=2811; TP13=3321; TP14=2281;
  TP15=2759; TP16=3343; TP17=1815; TP18=2394; TP19=2966; TP20=3628;
  TP21=4321; TP22=5179; TP23=6517;
  /* Screen out unwanted records */
  IF E1 EQ 2 OR E2 EQ 2 THEN DELETE;  /* E1 value unreadable; 2 assumed */
  IF E1 EQ 7 OR E2 EQ 7 THEN DELETE;
  IF E1 GE 8 OR E2 GE 8 THEN DELETE;
  IF NSHR GT 2 THEN DELETE;
  IF NSHR EQ 2 AND PG GT 4 THEN DELETE;
  IF RO EQ 2 THEN DELETE;             /* listing reads "EQZ"; 2 assumed */
  IF COST LT 1 THEN COST=1;           /* keeps 1/COST defined */
  ICOST=1/COST;
DATA DATA540;
  SET DATA540;
  ARRAY BW(23) BW1-BW23;
  ARRAY BWO(23) BWO1-BWO23;
  ARRAY TP(23) TP1-TP23;
  /* Look up BAQ and pay for the record's pay grade; build total pay */
  DO I=1 TO 23;
    IF PG EQ I AND NSHR EQ 0 THEN DO;
      BAQ=BW(I); PAY=TP(I); TTP=TP(I)-BAQ; TOTP=TTP+BAQ; ITOTP=1/TOTP;
    END;
    ELSE IF PG EQ I AND NSHR NE 0 THEN DO;
      BAQ=BWO(I); PAY=TP(I); TTP=PAY-BW(I); TOTP=BAQ+TTP; ITOTP=1/TOTP;
    END;
  END;
DATA DATA540;
  SET DATA540;
PROC SORT DATA=DATA540;
  BY PG NSHR HT BR COST ICOST ITOTP TOTP;
DATA DATAOUT.DATA540;
  SET DATA540;
  KEEP PG NSHR HT BR COST ICOST ITOTP TOTP;
/* Median rent (MCOST), median reciprocal rent (IMCOST), cell size (NUMB) */
PROC UNIVARIATE DATA=DATA540 NOPRINT;
  VAR COST ICOST;
  BY PG NSHR HT BR ITOTP TOTP;
  OUTPUT OUT=DATA541 MEDIAN=MCOST IMCOST N=NUMB;
DATA DATAOUT.DATA541;
  SET DATA541;
  KEEP PG NSHR HT BR MCOST IMCOST ITOTP TOTP NUMB;
/* All cells pooled: scatter plots, normality checks, both regressions */
PROC PLOT DATA=DATA541;
  PLOT MCOST*TOTP;
  PLOT IMCOST*ITOTP;
PROC UNIVARIATE DATA=DATA541 PLOT NORMAL; VAR MCOST;
PROC UNIVARIATE DATA=DATA541 PLOT NORMAL; VAR IMCOST;
PROC REG DATA=DATA541 SIMPLE;
  MODEL MCOST=TOTP;
  OUTPUT OUT=DATA546 P=MCSTHT R=RESID;
  MODEL IMCOST=ITOTP;
  OUTPUT OUT=DATA547 P=IMCSTHT R=RESID;
PROC PLOT DATA=DATA546;
  PLOT RESID*TOTP/VREF=0;
  PLOT RESID*MCSTHT/VREF=0;
PROC PLOT DATA=DATA547;
  PLOT RESID*ITOTP/VREF=0;
  PLOT RESID*IMCSTHT/VREF=0;
PROC UNIVARIATE DATA=DATA546 PLOT NORMAL; VAR RESID;
PROC UNIVARIATE DATA=DATA547 PLOT NORMAL; VAR RESID;
/* Lack-of-fit tests on data sorted by the regressor */
PROC SORT DATA=DATA541 OUT=DATA541A; BY TOTP;
DATA DATAOUT.DATA541A; SET DATA541A;
  KEEP PG NSHR HT BR MCOST IMCOST ITOTP TOTP;
PROC RSREG DATA=DATA541A; MODEL MCOST=TOTP/LACKFIT;
PROC SORT DATA=DATA541 OUT=DATA541B; BY ITOTP;
DATA DATAOUT.DATA541B; SET DATA541B;
  KEEP PG NSHR HT BR MCOST IMCOST ITOTP TOTP NUMB;
PROC RSREG DATA=DATA541B; MODEL IMCOST=ITOTP/LACKFIT;
/* Collapse NSHR to 0/1 and repeat the analysis BY NSHR */
DATA DATA541C; SET DATA541;
  IF NSHR GT 1 THEN NSHR=1;
DATA DATAOUT.DATA541C; SET DATA541C;
  KEEP PG NSHR HT BR MCOST IMCOST ITOTP TOTP NUMB;
PROC SORT DATA=DATA541C OUT=DATA541D; BY NSHR;
DATA DATAOUT.DATA541D; SET DATA541D;
  KEEP PG NSHR HT BR MCOST IMCOST ITOTP TOTP NUMB;
PROC PLOT DATA=DATA541D; PLOT MCOST*TOTP; BY NSHR;
PROC PLOT DATA=DATA541D; PLOT IMCOST*ITOTP; BY NSHR;
PROC UNIVARIATE DATA=DATA541D PLOT NORMAL; VAR MCOST; BY NSHR;
PROC UNIVARIATE DATA=DATA541D PLOT NORMAL; VAR IMCOST; BY NSHR;
PROC REG DATA=DATA541D SIMPLE;
  MODEL MCOST=TOTP;
  OUTPUT OUT=DATA546D P=MCSTHT R=RESID;
  BY NSHR;
PROC REG DATA=DATA541D SIMPLE;
  MODEL IMCOST=ITOTP;
  OUTPUT OUT=DATA547D P=IMCSTHT R=RESID;
  BY NSHR;
PROC PLOT DATA=DATA546D; PLOT RESID*TOTP/VREF=0; BY NSHR;
PROC PLOT DATA=DATA546D; PLOT RESID*MCSTHT/VREF=0; BY NSHR;
PROC PLOT DATA=DATA547D; PLOT RESID*ITOTP/VREF=0; BY NSHR;
PROC PLOT DATA=DATA547D; PLOT RESID*IMCSTHT/VREF=0; BY NSHR;
PROC UNIVARIATE DATA=DATA546D PLOT NORMAL; VAR RESID; BY NSHR;
PROC UNIVARIATE DATA=DATA547D PLOT NORMAL; VAR RESID; BY NSHR;
PROC SORT DATA=DATA541D OUT=DATA541E; BY NSHR TOTP;
DATA DATAOUT.DATA541E; SET DATA541E;
  KEEP PG NSHR HT BR MCOST IMCOST ITOTP TOTP NUMB;
PROC RSREG DATA=DATA541E; MODEL MCOST=TOTP/LACKFIT; BY NSHR;
PROC SORT DATA=DATA541D OUT=DATA541F; BY NSHR ITOTP;
DATA DATAOUT.DATA541F; SET DATA541F;
  KEEP PG NSHR HT BR MCOST IMCOST ITOTP TOTP NUMB;
PROC RSREG DATA=DATA541F; MODEL IMCOST=ITOTP/LACKFIT; BY NSHR;
/* Group the 23 pay grades into three classes and repeat BY PG */
DATA DATA541G; SET DATA541;
  IF PG GE 1 AND PG LE 9 THEN PG=1;
  IF PG GE 10 AND PG LE 19 THEN PG=2;
  IF PG GE 20 AND PG LE 23 THEN PG=3;
DATA DATAOUT.DATA541G; SET DATA541G;
  KEEP PG NSHR HT BR MCOST IMCOST ITOTP TOTP NUMB;
PROC SORT DATA=DATA541G OUT=DATA541H; BY PG;
DATA DATAOUT.DATA541H; SET DATA541H;
  KEEP PG NSHR HT BR MCOST IMCOST ITOTP TOTP NUMB;
PROC PLOT DATA=DATA541H; PLOT MCOST*TOTP; BY PG;
PROC PLOT DATA=DATA541H; PLOT IMCOST*ITOTP; BY PG;
PROC UNIVARIATE DATA=DATA541H PLOT NORMAL; VAR MCOST; BY PG;
PROC UNIVARIATE DATA=DATA541H PLOT NORMAL; VAR IMCOST; BY PG;
PROC REG DATA=DATA541H SIMPLE;
  MODEL MCOST=TOTP;
  OUTPUT OUT=DATA546H P=MCSTHT R=RESID;
  BY PG;
PROC REG DATA=DATA541H SIMPLE;
  MODEL IMCOST=ITOTP;
  OUTPUT OUT=DATA547H P=IMCSTHT R=RESID;
  BY PG;
PROC PLOT DATA=DATA546H; PLOT RESID*TOTP/VREF=0; BY PG;
PROC PLOT DATA=DATA546H; PLOT RESID*MCSTHT/VREF=0; BY PG;
PROC PLOT DATA=DATA547H; PLOT RESID*ITOTP/VREF=0; BY PG;
PROC PLOT DATA=DATA547H; PLOT RESID*IMCSTHT/VREF=0; BY PG;
PROC UNIVARIATE DATA=DATA546H PLOT NORMAL; VAR RESID; BY PG;
PROC UNIVARIATE DATA=DATA547H PLOT NORMAL; VAR RESID; BY PG;
PROC SORT DATA=DATA541H OUT=DATA541I; BY PG TOTP;
DATA DATAOUT.DATA541I; SET DATA541I;
  KEEP PG NSHR HT BR MCOST IMCOST ITOTP TOTP NUMB;
PROC RSREG DATA=DATA541I; MODEL MCOST=TOTP/LACKFIT; BY PG;
DATA DATA541J; SET DATA541H;
/* Sort the copy so equal ITOTP values are adjacent for LACKFIT */
PROC SORT DATA=DATA541J; BY PG ITOTP;
DATA DATAOUT.DATA541J; SET DATA541J;
  KEEP PG NSHR HT BR MCOST IMCOST ITOTP TOTP NUMB;
PROC RSREG DATA=DATA541J; MODEL IMCOST=ITOTP/LACKFIT; BY PG;
/* Apply both groupings and repeat the analysis BY NSHR PG */
DATA DATA541K; SET DATA541;
  IF NSHR GT 1 THEN NSHR=1;
  IF PG GE 1 AND PG LE 9 THEN PG=1;
  IF PG GE 10 AND PG LE 19 THEN PG=2;
  IF PG GE 20 AND PG LE 23 THEN PG=3;
DATA DATAOUT.DATA541K; SET DATA541K;
  KEEP PG NSHR HT BR MCOST IMCOST ITOTP TOTP;
PROC SORT DATA=DATA541K OUT=DATA541L; BY NSHR PG;
DATA DATAOUT.DATA541L; SET DATA541L;
  KEEP PG NSHR HT BR MCOST IMCOST ITOTP TOTP;
PROC PLOT DATA=DATA541L; PLOT MCOST*TOTP; BY NSHR PG;
PROC PLOT DATA=DATA541L; PLOT IMCOST*ITOTP; BY NSHR PG;
PROC UNIVARIATE DATA=DATA541L PLOT NORMAL; VAR MCOST; BY NSHR PG;
PROC UNIVARIATE DATA=DATA541L PLOT NORMAL; VAR IMCOST; BY NSHR PG;
PROC REG DATA=DATA541L SIMPLE;
  MODEL MCOST=TOTP;
  OUTPUT OUT=DATA546L P=MCSTHT R=RESID;
  BY NSHR PG;
PROC REG DATA=DATA541L SIMPLE;
  MODEL IMCOST=ITOTP;
  OUTPUT OUT=DATA547L P=IMCSTHT R=RESID;
  BY NSHR PG;
PROC PLOT DATA=DATA546L; PLOT RESID*TOTP/VREF=0; BY NSHR PG;
PROC PLOT DATA=DATA546L; PLOT RESID*MCSTHT/VREF=0; BY NSHR PG;
PROC PLOT DATA=DATA547L; PLOT RESID*ITOTP/VREF=0; BY NSHR PG;
PROC PLOT DATA=DATA547L; PLOT RESID*IMCSTHT/VREF=0; BY NSHR PG;
PROC UNIVARIATE DATA=DATA546L PLOT NORMAL; VAR RESID; BY NSHR PG;
PROC UNIVARIATE DATA=DATA547L PLOT NORMAL; VAR RESID; BY NSHR PG;
PROC SORT DATA=DATA541L OUT=DATA541M; BY NSHR PG TOTP;
DATA DATAOUT.DATA541M; SET DATA541M;
  KEEP PG NSHR HT BR MCOST IMCOST ITOTP TOTP NUMB;
PROC RSREG DATA=DATA541M; MODEL MCOST=TOTP/LACKFIT; BY NSHR PG;
DATA DATA541N; SET DATA541L;
/* Sort the copy so equal ITOTP values are adjacent for LACKFIT */
PROC SORT DATA=DATA541N; BY NSHR PG ITOTP;
DATA DATAOUT.DATA541N; SET DATA541N;
  KEEP PG NSHR HT BR MCOST IMCOST ITOTP TOTP NUMB;
PROC RSREG DATA=DATA541N; MODEL IMCOST=ITOTP/LACKFIT; BY NSHR PG;
OPTIONS LINESIZE=80;
APPENDIX C
TABLES 1 - 14
[Tables 1 through 14 are not recoverable from the scan.]
LIST OF REFERENCES
1. American Management Systems, VHA Current Process Descriptions, pp. 2-1 to 2-64, February 21, 1989.
2. Draper, N. R., and Smith, H., Applied Regression Analysis, pp. 11-147, John Wiley & Sons, 1981.
3. McGill, R., Tukey, J. W., and Larsen, W. A., The American Statistician, Variations of Box Plots, Vol. 32, No. 1, p. 16, February 1978.
4. McCullagh, P., and Nelder, J., Generalized Linear Models, Chapman & Hall Monographs on Statistics and Applied Probability, 1983.
INITIAL DISTRIBUTION LIST
1. Defense Technical Information Center (2 copies)
   Cameron Station
   Alexandria, Virginia 22304-6145
2. Library, Code 0142 (2 copies)
   Naval Postgraduate School
   Monterey, California 93943-5002
3. Defense Manpower Data Center (2 copies)
   99-100 Pacific St., Suite 155A
   Monterey, California 93940
4. Laura D. Johnson
   Code 55Jo
   Naval Postgraduate School
   Monterey, California 93943-5000
5. Donald P. Gaver
   Code 55Gv
   Naval Postgraduate School
   Monterey, California 93943-5000
6. Michele Williams
   6185 Wild Valley Ct.
   Alexandria, Virginia 22310