(www.interscience.wiley.com) DOI: 10.1002/sim.0000
Models for waiting times in healthcare:
Comparative study using Scottish
administrative data
Arthur Sinko a, Alex Turner b, Silviya Nikolova b∗, Matt Sutton b
The empirical modelling of waiting-time distributions presents a number of challenges due to their strong
positive skewness and multiple mass points. We compare the performance of a number of estimators previously
applied to health outcomes with similar distributional characteristics, to model waiting times. We benchmark
estimator performance using several measures of in-sample and out-of-sample prediction accuracy for different
data samples, years, and frequencies. Using administrative data for the Scottish National Health Service (NHS),
we find considerable consistency across years and frequencies, both when pooling data across ICD-10 chapters and
when estimating separately for each year. Our results show that no model is optimal on all metrics, but generalised
linear models and count data models appear to be the most appropriate for modelling waiting times.
Copyright © 0000 John Wiley & Sons, Ltd.
Keywords: health econometrics, waiting times, modelling
a Economics, School of Social Sciences, Arthur Lewis Bld., Oxford Rd., University of Manchester, Manchester, M13 9PL, U.K.
b Manchester Centre for Health Economics, Jean McFarlane Bld., Oxford Rd., University of Manchester, Manchester, M13 9PL, U.K.
∗ Correspondence to: Manchester Centre for Health Economics, Jean McFarlane Bld., Oxford Rd., University of Manchester, Manchester, M13 9PL, U.K. E-mail: [email protected]
Contract/grant sponsor: This research was supported by NIHR Research Methods Opportunity Funding (RMOFS 2012/08)
1. Introduction
The distributions of many outcomes of interest in health economics are characterised by a strong positive skew,
a heavy right-hand tail, and multiple mass points. In addition, the relationship between the response variable
and covariates is likely to be non-linear, and heteroscedasticity is often present. Healthcare expenditure/cost
data, waiting times for medical treatment, length of in-hospital stay, and number of physician visits, among other health
outcomes, present these distributional idiosyncrasies.
There is a burgeoning body of new estimators for skewed distributions with mass points and a heavy right tail,
but most of these new statistical methods have been applied to modelling healthcare costs only and are yet to find
widespread application in health economics. In this paper we compare models that have been previously applied to
healthcare costs and some generally used extensions as informed by a review of papers on health outcomes with such
distributional characteristics. We evaluate their applicability to modelling waiting times.
The ultimate goal of waiting-times modelling is to recover a functional relationship between waiting times and
relevant covariates and to construct correct statistical inference. Waiting times for elective surgery are intrinsic to
public healthcare systems where healthcare is free at the point of use and prices cannot act as a rationing mechanism
to reconcile limited supply with potentially unlimited demand. In such systems waiting times act as a non-monetary
rationing device by deterring patients with small benefits from demanding treatment [1, 2, 3, 4, 5, 6]. Waiting times
also have important effects on patients. It has been shown that longer waiting times are a major source of dissatisfaction
for patients [3, 7, 8]. They also postpone patients’ benefits and, due to discounting, cause reductions in those benefits [9].
Also, although opinion regarding elective surgery is mixed, some theory and evidence suggest that longer waits
could have detrimental effects on health [9, 10, 11, 12]. In a wider setting, waiting times have also been shown to
be a key determinant of satisfaction with public services [13, 14] and a key indicator of public sector inefficiency
[15, 16, 17, 18].
This paper contributes to the growing literature on modelling health outcomes with skewed distributions, mass
points, and heavy right tails by comparing the performance of a wide range of models on waiting-time data from
Scotland, using them to estimate and predict the distribution of waiting times. We examine the ability of these models to address
the distributional characteristics of the data using a quasi-experimental design involving cross-validation.
Reforms to reduce waiting times through the imposition of maximum thresholds were introduced in Scotland in
2003. This changed the waiting times distribution by reducing the long waiting times as well as the mean and median
waits of elective patients. We use pre- and post-reform waiting-times data for 2002 and 2007, respectively, to analyse the
suitability of different models. We study the waiting times for elective treatment for a wide range of conditions. We
regress elective waits on a range of patient characteristics and measures of disease severity. Comparative performance
is determined using in-sample and out-of-sample prediction measures of goodness of fit, bias, and forecast accuracy
for different data samples, frequencies, and years.
The rest of this paper is structured as follows. The next section summarises related literature, introduces the models
applied to the evaluation of waiting times and, where available, provides details of previous applications to waiting-times
data. The different metrics used for model comparison are explained next. Section 3 describes the data and the choice
of variables used in all regression specifications. Results are then presented, followed by the Discussion and Conclusion
sections.
2. Comparison of Model Performance
2.1. Model Comparison Literature
The bulk of existing studies assessing comparative model performance have utilised healthcare cost data. Much
of the early focus of these studies centred on how to deal with the mass point at zero, and thus much of the early
literature assessed the merits of using two-part and multi-part models over the traditional one-part specification used in
ordinary least squares (OLS) [19]. By reducing skewness, transforming the dependent variable before applying OLS can help
return the distribution to normality. As a result, other papers have chosen to focus solely on the decision of whether
or not to transform the dependent variable [20]. However, policy makers require predictions on the raw scale, which
leads to the tricky problem of re-transformation, a problem made increasingly complex if heteroscedasticity is found
in the errors [19, 21]. Later work encompassed both of these facets by comparing one- and two-part specifications of
the OLS model with one- and two-part specifications of log- and square root-transformed OLS, as well as with one- and
two-part specifications of generalised linear models (GLMs) [22, 23]. By explicitly modelling non-linearity, GLMs can
deal with the skewed nature of health distributions whilst still generating predictions on the raw scale, thus avoiding
the challenges of re-transformation.
Although some studies continued to make use of zero observations, many have chosen to focus only on positive
costs, which has led to the comparison of more complex models. Previous literature compares transformed OLS
models to GLMs with different specifications of the link and distribution functions [24]. Basu and Rathouz [25]
propose an extended estimating equations (EEE) model, a very flexible extension of the GLM, and compare
its performance to its various nested GLM specifications. Both this paper and a later paper [26] find that the EEE
fits better than any of its nested models. Similar to the GLM family of models, parametric survival models such as
the Weibull, exponential and Gompertz models, as well as the more flexible generalised gamma model (GGM), can
also capture non-linearity. Manning et al. [21] recognise this, and compare the GGM (along with some other nested
alternatives) to an OLS regression on log-transformed costs, finding that the GGM potentially provides a more robust
estimator than simpler models. The use of survival models is taken further by Basu et al. [27], who compare the log
transformed OLS, GLM and EEE models to the Cox proportional hazard model, a semi-parametric model in which the
functional form of the baseline hazard is left unspecified and is calculated along with the parameters of the
model in estimation. The flexibility of the GGM can be extended further by using the generalised beta of the
second kind (GB2) distribution. This nests the GGM as a special case, as well as other beta-type distributions
such as the Dagum, Beta-2 and Singh-Maddala, which can be used to capture heavy tails in some distributions [28].
Jones et al. [29] compare the performance of the GB2 model to all of its nested models, finding that the GGM and
Beta-2 models perform the best.
Two recent papers compare a wide range of models of different types. Hill and Miller [30] compare the GGM to
both the EEE and various specifications of the GLM, as well as linear and log-transformed OLS models, using healthcare
expenditure data stratified by insurance type and age. Previous studies had not compared the EEE and GGM models,
as neither model nests the other, although both share the mean and variance of the
gamma model as a special case. Hill and Miller [30] find that the EEE model performs as well as, or better than, the
other models in all of the stratified distributions. Jones [31] provides the most comprehensive comparison of models
to date. He provides a comparison of all of the aforementioned models as well as some extensions. These include a
finite mixture model, which allows the researcher to control for heterogeneity in the data; an exponential conditional mean
(ECM) model estimated by both non-linear least squares and Poisson maximum likelihood; and a modified version of the
GGM, which accounts for additional heteroscedasticity by allowing some of its parameters to be a function of
covariates [21]. He finds that no one model is optimal, with performance differing depending on the criterion considered.
The conclusions from Jones et al. [29] typify the findings across all comparison studies. In general, the appropriate
specifications for modelling costs differ depending on what costs are considered [30]2, on the data sets used [30], on
the sample size [29, 32], and on whether a researcher favours bias or precision [22].
2.2. Waiting Times Literature
Despite its problems in modelling variables with the distributional idiosyncrasies of waiting times, OLS is still
widespread in empirical papers, even if only as a baseline model against which results from more complex
models are compared (see for example [8, 33, 34, 35, 36, 37, 38, 39, 40]).
OLS on log-transformed waits has been a popular choice to account for the skewness of the waiting-times
distribution. This method has been applied primarily to assess the degree of discrimination in waiting times by patient
group. In particular, these studies test whether waiting times systematically differ by condition [41], income [9, 41, 42, 43],
race/ethnicity [44, 45], education [9] and other socioeconomic factors such as age, gender, insurance status, nationality
and employment status [41, 43, 46]. This method has also been applied to test whether greater choice between
2They find that the optimal model varied over different stratifications of the sample by age and whether total expenditure or only expenditure on drugs were considered.
providers, measured by concentration indices, is associated with lower mean waiting times [47], and the impact of
differences in prioritisation policies in Scotland and Norway on the distribution of waits [48].
The use of quantile regression to model health variables, which can help to address skewness by using the median
rather than the mean as the measure of central tendency, is very limited, and even more so with respect to waiting
times. Only one study has used it, to test whether higher socioeconomic status leads to shorter waiting times for elective
surgery in Australia [42], and in particular whether discrimination based on socioeconomic status is greater at longer or
shorter waits. Evidence from published studies suggests that non-linear models are a popular choice in the estimation
of waiting times. Iversen and Luras [49] employ two of the most popular count data models, the Poisson model and the
negative binomial model, as well as a negative binomial model with an added random effect, to test whether general
practitioners (GPs) in Norway with a shortage of patients offered shorter waiting times to attract new patients. Siciliani
and Verzulli [50] use the negative binomial model to test whether higher patient socioeconomic status is associated
with reduced waiting times for elderly Europeans. Finally, Roll et al. [51] use the negative binomial model to test whether
waiting times for outpatient care in Germany differed significantly across individuals based on the type of insurance
and income. The use of duration/survival models has been widespread in the estimation of healthcare costs [31].
However, the application of duration analysis to the modelling of waiting times is a relatively recent phenomenon. Arnesen
et al. [52] use the Cox proportional hazard model to test whether gender and socioeconomic status could explain
variations in waiting time for inpatient surgery in Norway. Dimakou et al. [53] use the Kaplan-Meier estimator to
study the distributional changes in waiting times in the English NHS, using Hospital Episode Statistics (HES) data.
They also employ a range of parametric proportional hazard and accelerated failure time models to analyse the
effects of provider and patient characteristics on waiting times. Laudicella et al. [9] assess the effect of socioeconomic
status on waiting times in England. They do so using the standard Cox proportional hazard model along with its
extended and stratified counterparts. There have been applications of the GGM with additional heteroscedasticity [31]
and the GB2 distribution [29, 31] to estimating healthcare costs, but to our knowledge there have been no applications
to the estimation of waiting times.
Due to their advantages, generalised linear models have emerged as a popular choice for modelling
many health variables, in particular healthcare costs [20, 21, 23, 54]. However, despite their popularity in other areas
of health economics, there are no examples of GLMs being used in the estimation of waiting times.
2.3. Empirical Models
In this paper we focus only on models that have already been applied in previous literature, together with some generally used
extensions, as informed by the review of waiting-time papers and previous comparative studies. Jones [31] provides
an excellent summary of the econometrics behind the majority of the models presented here. As such, where models
overlap, only a brief explanation of each model, and the rationale for its use, is provided. New models are presented in
greater detail. The econometric specification of each model is summarised in Table 1, as well as the methods for
deriving predictions.
2.3.1. Ordinary Least Squares
An ordinary least squares (OLS) regression of the level of waiting times on a set of
regressors, yi = xiβ + εi, provides a good starting point for modelling waiting times. It is computationally inexpensive,
and as it is specified on the original waiting-times scale, difficult re-transformations are avoided. However, standard
linear regression models fail to reflect consistently and reliably the conditional mean of a skewed distribution with
mass points, because of the asymmetry in the response function and/or the inefficiency due to the common failure to
deal with heteroscedasticity.
2.3.2. Regressions on transformed waiting times
Early methods for dealing with skewed, and more generally non-normal,
distributions centred on transformations of the dependent variable [20, 55, 56]. The behaviour of many
outcomes in health can be approximated well by log-normal distributions, and thus taking logarithms returns the
distribution to normality. Even without log-normality, taking logarithms reduces skewness, making the distribution
more symmetric and closer to normality. Assuming normality is achieved after transformation, inference is
valid when applying OLS to the transformed data. However, this produces predictions on the log scale, whereas predicted
waits on the raw scale are usually the ones of interest. The problems with the transformation approach arise from
re-transformation, in particular when heteroscedasticity is present. Under non-normality,
a simple exponential of predictions on the log scale does not provide unbiased predictions on the raw scale. If errors
are homoscedastic, this can be remedied using the smearing estimator [19]. However, under heteroscedasticity these
predictions are biased, and a much more complex smearing estimator is required.
Square root transformations have also been a popular choice to reduce skewness. However, as with the log-transformed
model, the square root-transformed model produces predictions on the transformed rather than the raw
scale, and similar problems with re-transformation exist.
Rather than applying a particular transformation, Box and Cox [57] propose a flexible transformation, which bears their
name, in which the type of transformation is determined during estimation. This was later extended by Chaze [58] to account
for censored data. However, the standard Box-Cox transformation is not applied here as it does not allow for
heteroscedasticity. The EEE model, explained later, performs the same transformation but can take heteroscedasticity
into account, and is thus preferred.
2.4. Quantile Regression
Traditional regression estimation methods, such as OLS, model the mean of the dependent variable conditional on
the covariates, estimating the conditional mean function E[Yi|Xi] = Xiβ. Quantile regression, on the other hand,
allows the estimation of quantiles (or percentiles) of the dependent variable, conditional on covariates. In this case we
estimate the conditional quantile function:

Qτ[Yi|Xi] = Xiβτ    (2.1)

where 0 < τ < 1 defines the quantile of the dependent variable to be estimated, and the parameters of interest, βτ, are
contingent on the value of τ.
Unlike OLS, which uses the mean as the measure of central tendency and minimises the sum of squared residuals
to derive parameter estimates, quantile regression uses the median as the measure of central tendency, and parameter
estimates, βτ, are obtained by minimising the sum of asymmetrically weighted absolute residuals [59]:

min_{βτ} [ Σ_{i: yi ≥ Xiβτ} τ|yi − Xiβτ| + Σ_{i: yi < Xiβτ} (1 − τ)|yi − Xiβτ| ]    (2.2)

This implies weights are equal to τ if residuals are positive, and equal to (1 − τ) when residuals are negative.
Quantile regression can be used to estimate the effects of covariates on different parts of the waiting-times
distribution. This is an obvious benefit over OLS, which can only estimate their effects at the mean and thus gives
a much less comprehensive picture of how waiting times are determined. However, it also has benefits in dealing
with the distributional issues associated with waiting times. Because estimation involves minimising absolute rather than
squared residuals, less weight is placed on extreme values and thus the estimated parameters are less sensitive to heavy
right-hand tails.
As we are only interested in a measure of central tendency of waiting times and not in any particular quantile of the
distribution, we evaluate the regression function at the 50th percentile (τ = 0.5), in which case the quantile regression
is a median regression.
2.5. Non-linear regression methods
As explained previously, although transforming waiting times can correct for the non-normality of the waiting-times
distribution, it creates difficulties if predictions are required on the raw scale, particularly in the presence of
heteroscedasticity. To circumvent this issue, non-linear methods can be employed which directly account for
these distributional aspects.
Non-linearity in these models is specified in the relationship between waiting times and the explanatory variables.
Many of these models assume that the conditional mean of waiting times depends on some function of the exponential
function:

E[yi|xi] = µi = f(exp(xiβ))    (2.3)
As a result, many of these models come under the umbrella of exponential conditional mean (ECM) models. The
exponential function accommodates the skewed distribution of waits and recognises that waiting times take only
non-negative values [31].
These models are outlined extensively in Jones [31], and readers are directed there if they require detailed
econometric specifications.
2.5.1. Non-linear least squares (NLLS) estimation
ECM models can be estimated using non-linear least squares
(NLLS). In this case, the ECM model is specified as the non-linear regression:

E[yi|xi] = exp(x′iβ)    (2.4)

The relevant first-order/moment conditions for this model3 are solved iteratively to give parameter estimates. This
approach uses only the first moment, rather than the full probability distribution, and thus may be more robust than
maximum likelihood. However, it may also be less efficient, depending on the form of the variance function [31].
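The NLLS fit of the ECM in (2.4) can be sketched as follows, using scipy's iterative least-squares solver on simulated data; the data-generating process and coefficient values are illustrative assumptions.

```python
# NLLS estimation of the ECM E[y|x] = exp(x'b) on simulated data;
# names and the data-generating process are illustrative assumptions.
import numpy as np
from scipy.optimize import least_squares

rng = np.random.default_rng(3)
n = 1000
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true = np.array([2.0, 0.5])
y = np.exp(X @ beta_true) * rng.gamma(shape=4.0, scale=0.25, size=n)  # mean-one noise

# Iteratively minimise the sum of squared residuals y - exp(Xb)
res = least_squares(lambda b: y - np.exp(X @ b), x0=np.zeros(2))
beta_hat = res.x
```

Only the conditional mean is used here; no distributional assumption is imposed on the multiplicative error beyond a unit mean, which is what gives NLLS its robustness relative to full maximum likelihood.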
2.5.2. Count data models
Waiting times are a form of count data, as time waited can only take non-negative integer
values, and these integer values arise from counting rather than ranking [60]. The benchmark model for count data is
the Poisson model. This model assumes that the dependent variable, yi, follows a Poisson distribution with mean µi.
The Poisson probability distribution is given by:

P(yi|µi) = e^{−µi} µi^{yi} / yi!    (2.5)

where µi represents the mean of yi. This distribution is characterised by equidispersion, such that the mean is equal
to the variance, i.e. E[y] = var[y] = µ. The Poisson regression model incorporates observed heterogeneity into the
Poisson distribution function, such that E[yi|xi] = var[yi|xi] = µi = exp(xiβ).
In practice, the variance is usually greater than the mean (overdispersion), and thus the Poisson model rarely fits the
data well [61, 62]. To overcome the problems of the Poisson model the negative binomial model can be used. This
directly takes overdispersion into account through the inclusion of an additional parameter. The negative binomial
distribution is given by:

P(yi|µi, νi) = [Γ(yi + νi) / (yi! Γ(νi))] (νi / (νi + µi))^{νi} (µi / (νi + µi))^{yi}    (2.6)
3See [31] for details of these.
where Γ represents the gamma function, νi = 1/α determines the degree of dispersion, and α > 0
defines the overdispersion parameter.
The negative binomial regression model incorporates both observed and unobserved heterogeneity into the
conditional mean such that E[yi|xi] = µi = exp(x′iβ + ei) [61]. The Poisson model forms a special case of the
negative binomial model, the two models coinciding when α = 0, and thus a test of this restriction can be used
for model selection purposes. An extension of this model is the generalised negative binomial model. This
model is semi-parametric in nature as the shape parameter is itself estimated during estimation. Another way to handle
overdispersion is through zero-inflated models. In these models (zero-inflated Poisson and zero-inflated negative
binomial), overdispersion is accounted for by changing the mean structure to explicitly model the production of zero
counts [61]. This approach assumes two latent groups from which zero counts could be generated: an “always zero” group, in which
patients have a zero wait with certainty, and a “sometimes zero” group, where zeros are generated “by chance” as
would usually be assumed in count data models [61]4. Estimation involves a binary model, usually a logit, to
ascertain from which group a zero is generated, and a count model for the positive count data. The Vuong test
[64] can be used to determine whether a zero-inflated model is appropriate. Tests on the parameters of the zero-inflated
negative binomial model can also be used to choose between this model and the zero-inflated variant of the Poisson model5.
Zero-inflated models are special cases of the finite mixture negative binomial and finite mixture Poisson models,
which permit mixing with respect to both zero and positive counts [63]. These more general cases will be explained in
more detail later.
2.5.3. Survival/Duration models: Hazard models
Inpatient waiting times form a type of duration data, where the
“duration” refers to the time between some start date and some end date or “event”. Here the start date represents
admission to the waiting list, defined by the date at which a consultant has recommended a patient for treatment; the
end date refers to admission for treatment; and the duration represents the waiting time6.
Duration analysis involves the estimation of two functions: the survival function and the hazard function. The
survival function, S(t), defines the probability that admission to treatment is later than some specified time, t, or the
probability that a patient remains on the waiting list (untreated) until a given time. The hazard rate is defined as the
probability of admission at time t, given that a patient has remained untreated up until time t, or more simply, the rate at
which patients leave the waiting list [53]. A key characteristic of the hazard rate is the concept of duration dependence,
which determines how the hazard rate changes over time. Survival and hazard functions are mathematically related,
4Hurdle count data models also exist, which assume that zero values are generated from a different process from positive counts. However, these fall under the umbrella of two-part models, which are beyond the scope of this paper. This model is outlined in detail in Deb and Trivedi [63], to which readers are referred for a greater explanation.
5Here α = 0 is tested. If this is rejected then the zero-inflated negative binomial is preferred.
6In this paper we present only a brief outline of survival models. A detailed overview of survival analysis is presented in Jenkins [65] and a shorter overview in Jones [31].
with each hazard function producing a particular survival function and vice versa [53]. This relationship is defined by
the following:

h(t, X) = f(t) / S(t)    (2.7)

where f(t) represents the density of the duration distribution.
However, the aim here is to generate predictions of waiting times using the covariate estimates. This type of analysis
can be carried out using either parametric models, which require the specification of a distribution to model the
hazard rate, or semi-parametric models, where this distribution is estimated within the model.
Specifying parametric models requires assumptions to be made in two areas: (1) the shape of the hazard function,
and (2) whether covariates are time-dependent. The shape of the hazard function specifies how the hazard rate
changes with time, and is determined by the choice of the distribution used to model it. There are six commonly used
parametric duration models: exponential, Weibull, log-normal, log-logistic, Gompertz and the generalised gamma. The
exponential model is the simplest of these and assumes that the hazard rate is constant over time. The Weibull model
generates increased flexibility by allowing hazard rates to be non-constant over time, but restricts the relationship to
be monotonic. The Gompertz model also assumes a monotonic relationship, with hazard rates either increasing
or decreasing with time. The log-logistic and log-normal models further increase flexibility by allowing non-monotonic,
unimodal hazards, in the form of an inverted U-shaped relationship. The generalised gamma model (GGM)
is the most flexible of all parametric hazard models. The GGM includes two shape parameters, and nests the exponential,
Weibull, log-normal, and standard gamma models as special cases. Given this, Wald tests on parameters in the GGM
can be used to select between this more flexible model and its nested alternatives. For non-nested models, model
selection can be carried out using information criteria and log-likelihood values.
A further increase in flexibility can be generated through the use of the generalised beta of the second kind (GB2)
distribution, which introduces a third shape parameter. This third shape parameter can capture excess kurtosis to
account for a heavy tail7. The GB2 distribution nests the GGM as a special case, as well as other 3-parameter distributions
(Burr-Singh-Maddala (BSM), Dagum and Beta-2) and 2-parameter distributions (Lomax and Fisk). Similar to the
GGM, tests on parameter estimates can be used to select between the GB2 and its nested models. Both the GGM and
GB2 models can be extended to account for additional heteroscedasticity. In the GGM this involves assuming
that one of the shape parameters is equal to an exponential of a linear combination of a set of regressors [21], and in
the GB2 it involves allowing the natural log of a shape parameter to vary with the covariates [67]. We estimate the GB2
model but, due to computational expense, do not employ any of its nested models. No code was available for these
7The GB2 model can be estimated by maximum likelihood using Stephen Jenkins’ gb2fit command in Stata [66].
extensions, and thus they were not implemented. The distribution, hazard function and expected duration of each model
are also summarised in Table 1.
The time-dependency of covariates determines whether we use a proportional hazards (PH) model or an
accelerated failure time (AFT) model. PH models assume there is a baseline hazard function that depends on time
but not on the other variables which affect waiting times, which are assumed to be time-invariant. This implies that the
baseline hazard is common across all individuals; the covariates simply scale the hazard function for each individual.
The AFT model assumes that covariates are time-dependent, in particular that the log of survival time, T, is linearly
dependent on the covariates [65]. In an attempt to reduce the restrictive assumptions imposed by parametric models,
Cox [68] developed a semi-parametric approach to survival analysis. Named the Cox proportional hazards (Cox PH)
model, this leaves the distribution of the baseline hazard function unspecified. Estimation is based on the partial
likelihood function: conditioning on the covariates means the baseline hazard can be factored out of the partial
likelihood, so the regression coefficients can be estimated without specifying it.
2.6. Generalised linear models (GLM)
2.6.1. Traditional parametric GLM
Ordinary least squares has the restrictive assumption of linearity in parameters,
such that the expected value of the outcome of interest must be a linear function of the regressors, i.e. it estimates
the conditional mean function E[yi|xi] = µi = x′iβ. Generalised linear models (GLMs) combat this by allowing the
dependent variable to have distributions other than the normal distribution. GLMs specify the conditional mean
function directly:

E[yi|xi] = µi = f(x′iβ)    (2.8)

This allows the conditional mean to depend on the regressors non-linearly. The non-linearity is determined by
the first component of the GLM: the link function, g(·), where:

g(µi) = x′iβ ⇒ µi = g^{−1}(x′iβ) = f(x′iβ)    (2.9)
The most frequently used link functions are the identity link, where, as in OLS, the mean depends on the covariates
additively, and the log link, where the mean is a multiplicative function of the covariates [31]. However, other
specifications of the link function are available, namely the logit, probit, cloglog, negative binomial, log-log, and
log-complement functions, as well as any power and odds-power functions [69].
The second component of the GLM is the distribution function. This function specifies the relationship between the
conditional variance and the mean:
Var[yi|xi] = ν(µi)    (2.10)
This implies that the GLM restricts the conditional variance to be a function of the mean. The choice of the distribution is at the discretion of the researcher, but is restricted to those which belong to the linear exponential family [70]. The various specifications for the link and distribution functions can be combined freely, but not all combinations make sense (see [69] for further details). The appropriate combination is not always clear given the sample data, but the choice of link for a given distribution can be guided by the canonical parameterisation of the GLM [70]. The most appropriate link function can be selected using the Pregibon link test [71], described later, and the appropriate distribution using a modified version of the [72] test presented in [20].
GLMs are estimated based on quasi-score functions or classical “estimating equations” (see [31] for an explanation).
Given that GLMs use only the linear exponential family of distributions, the GLM estimator has the pseudo- or quasi-
maximum likelihood property and thus estimates are consistent as long as the mean function is correctly specified
[73].
The GLM has advantages over other methods which deal with the non-normality of waiting times. Firstly, given that the mean function, μ(x), is transformed rather than the dependent variable, predictions are generated on the raw waiting-times scale, which avoids the re-transformation problems faced by the log and square-root models. Also, GLMs inherently take into account heteroscedasticity through the choice of the distribution function.

We estimate the GLMs using the same combinations of link and distribution functions used in Jones [31], namely square root-gamma, log-gamma, the Poisson and log-normal8.
2.6.2. Semiparametric and nonparametric GLMs

As a result of the problems of selecting the optimal distribution and corresponding link function, Basu and Rathouz [25] developed the extended estimating equations (EEE) estimator. This semiparametric estimator uses the Box-Cox transformation for the link function, which includes the log and power links as special cases. This is combined with a general power function for the variance which nests all of the common GLM distributions. The additional parameters in each function are estimated along with the regression coefficients by quasi-maximum likelihood using extended estimating equations [31]9.
Chiou and Muller [75] introduce even more flexibility into the model by leaving both the link and variance functions unspecified. The model is estimated using a three-stage approach, which involves estimating the link and variance functions non-parametrically, before substituting these into the quasi-maximum likelihood function. They find that the resulting
8We note that the GLM with the Poisson family is equivalent to estimating a Poisson regression for count data, and thus results will be very similar to those from a Poisson regression. Results could differ slightly due to differences in starting values and convergence criteria of the algorithms [74].
9This can be estimated using Anirban Basu's pglm command in Stata.
parameter estimates are asymptotically efficient in comparison to both the quasi-maximum likelihood estimator and
the standard GLM estimator (when both the link and variance functions are treated as known).
2.7. Finite mixture models
Many variables of interest in health economics are characterised by bi-modal or multi-modal distributions. These are not dealt with well by the models outlined previously. Mixture models are well suited to modelling outcomes whose values, or the effect of covariates on those values, differ systematically between groups of individuals.
In finite mixture models individuals are assumed to be heterogeneous across latent classes j = 1, …, C, but homogeneous within these classes, conditional on covariates10. The effect of covariates, x_i, on the outcome of interest, y_i, is assumed to vary over the j classes. The probability that individual i belongs to class j is given by π_ij, where 0 < π_ij < 1 and Σ_{j=1}^{C} π_ij = 1. The density of y_i, conditional on covariates, x_i, but unconditional on class membership, is given by:

f(y_i | x_i; π_i1, …, π_iC; β_1, …, β_C) = Σ_{j=1}^{C} π_ij f_j(y_i | x_i; β_j)    (2.11)

i.e. the weighted average of the densities for each class, with the weights equal to the probabilities of being in each class. Class probabilities are estimated along with the parameter estimates, and can be treated as either fixed or variable across members of the class.
Post-estimation, each individual is assigned a posterior probability of belonging to each class, which depends on the relative contribution of that class to the individual's likelihood function (see [31] for more details). Each individual is then assigned to the class with the highest posterior probability. The outcome variable can then be predicted separately for each class.
Although the number of available distributions for mixture models is large, the majority of applications in health economics have used either a mix of gammas or a mix of negative binomials. Finite mixture gamma models have found widespread application in the modelling of continuous variables such as healthcare costs (for example [31, 32, 76]), although a mixture of log-normal densities has also been used to model this variable [77]. Finite mixture negative binomial models and their extensions have been widely used to model measures of healthcare utilisation such as the number of GP visits and visits to private/public sector specialists (see for example [63, 76, 77, 78, 79, 80, 81]). Mixing of other distributions has seen greater popularity outside of these two healthcare variables. For example, a finite mixture normal model was used to estimate the effect of prenatal care on birth weight [82].
Given that waiting time is a count variable, the FMM we estimate here will utilise a mix of negative binomials. Negative binomials are used to model count data in the case of overdispersion, usually generated by a large number of

10Here we present finite mixture models as latent class models. However, these models can also be used in the case where membership to classes is observed, for example in a two-part model, where positive and zero waits are modelled separately, or multi-part models, applied to different categories of waiting times, separated by primary condition or patient characteristics.
zero values. As zero waits are relatively infrequent in our sample (0.39% for the 2002 sample), it is possible that the
equidispersion property of the Poisson distribution will be satisfied. Thus, an FMM with a mixture of Poisson densities
will also be estimated.
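To make the latent-class machinery concrete, here is a minimal EM sketch for a two-component Poisson mixture on simulated counts. It is intercept-only; the FMMs estimated in the paper also let covariates shift each component's mean. All numbers are illustrative assumptions.

```python
# Minimal EM sketch for a 2-component Poisson finite mixture (intercept-only).
import numpy as np
from scipy.stats import poisson

rng = np.random.default_rng(1)
# Simulated waits from two latent classes, e.g. short vs long waiters
y = np.concatenate([rng.poisson(3, 600), rng.poisson(15, 400)])

pi, lam = np.array([0.5, 0.5]), np.array([2.0, 10.0])  # starting values
for _ in range(200):
    # E-step: posterior probability of each class for each observation
    dens = pi * poisson.pmf(y[:, None], lam)          # shape (n, 2)
    post = dens / dens.sum(axis=1, keepdims=True)
    # M-step: update class shares and class means
    pi = post.mean(axis=0)
    lam = (post * y[:, None]).sum(axis=0) / post.sum(axis=0)

# Assign each observation to its most probable latent class, as in the text
classes = post.argmax(axis=1)
```

The posterior matrix `post` is exactly the per-individual class probability used for assignment and class-specific prediction in the text above.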
3. Data
We use the Scottish Morbidity Record 01 (SMR01) data set. It records detailed information on all admissions to acute
hospitals including patient characteristics such as waiting time, age, number of co-morbidity conditions, and disease
type. These data collect information on the distribution ofwaiting times only for patients “admitted for treatment from
the waiting list”. As such, it measures the full duration of waiting for patients who were treated.
We extract a subset of patients from the full-year population who were admitted for elective procedures. We next restrict our attention to only the first hospital stay for each patient in each year. We lose, respectively, 33.3% and 35% of the sample for years 2002 and 2007. We also exclude observations where the waiting time is longer than two years, as these are most likely coding errors. This restricts the sample by an additional 2.3% and 0.4% for years 2002 and 2007. We exclude from the analysis pregnancy and conditions originating in the perinatal period (ICD-10 chapters 15 and 16) because of the small number of observations. We also omit external causes of morbidity and mortality and codes for special purposes (ICD-10 chapters 20 and 22) because the same ICD-10 code can be used to describe more than one medical condition with different severity. Finally, we disregard observations with missing data on waiting times. This omits 2.2% and 2.6% from the original 2002 and 2007 samples. As a result, our analytical sample has 657,443 observations in total, with 321,929 patient observations for 2002 and 335,514 for 2007. In preparing our data for analysis we follow Janulevicuite et al. [48].
4. Statistical Analysis
4.1. Baseline specification
We use a linear additive specification to model the relationship between waiting times and regressors. For simplicity we use a parsimonious set of explanatory variables. These include a dummy for gender, age as a categorical variable, and the Charlson index. We construct the index using information on the primary diagnosis and other medical conditions [83]. It places disease conditions into 17 “Charlson categories” based on their ICD-10 codes. Weights are assigned to each category which are increasing in severity. The sum of these weights across all conditions for a particular patient provides the comorbidity index for that patient. Where appropriate we use robust standard errors.
4.2. Quasi-Experimental Design

Following Jones [31] we assess model performance using both in-sample and out-of-sample metrics. “In-sample” metrics are generated by estimating the model on the full sample. The out-of-sample predictions are estimated using v artificially constructed subsamples generated in the following way. We first estimate model parameters using all data excluding the subsample of interest. Second, these parameters, along with the explanatory variables, are used to predict the model outcome for the excluded subsample. The first and second steps are repeated v times to cover the original data set. The subsamples are drawn randomly from the original data without replacement. This cross-validation technique avoids the problem of in-sample overfitting.
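The v-fold scheme above can be sketched in a few lines, shown here with a plain OLS learner on simulated data (the learner and data are illustrative assumptions, not the paper's implementation):

```python
# Sketch of the v-fold out-of-sample prediction scheme described above.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))
y = np.exp(1.0 + X @ np.array([0.3, 0.2, 0.1]) + rng.normal(0, 0.5, 1000))

v = 5
oos_pred = np.empty_like(y)
# Fit on all data excluding one subsample, predict that subsample, and repeat
# v times so the folds cover the original data set without replacement.
for train_idx, test_idx in KFold(n_splits=v, shuffle=True, random_state=0).split(X):
    fold_model = LinearRegression().fit(X[train_idx], y[train_idx])
    oos_pred[test_idx] = fold_model.predict(X[test_idx])
```

Every observation receives exactly one prediction from a model that never saw it, which is what makes the out-of-sample metrics below honest.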
The model comparison measures used in this study focus only on assessing how well models estimate in-sample and out-of-sample E[y|x]. Often a researcher's primary interest is to estimate the marginal effect, i.e. ∂E[y|x]/∂x for a covariate x. However, for nonlinear specifications, to the best of our knowledge, there is no good way to compare marginal effects across different models. Instead we compare E[y|x], the correct estimation of which is a necessary condition for unbiased estimation of marginal effects [27].
4.3. Evaluation of Performance
4.3.1. Pre-comparison model selection

Some of the models we consider are not estimated in levels. The process of retransformation depends heavily on the distribution of the error terms. The retransformation methods available quite often rely on the assumption of normality. Normality of the waiting time distribution, and thus of the errors, is tested using both the Shapiro-Wilk test [84] and the D'Agostino test [85]. If normality is rejected, the re-transformation method depends on the presence of heteroscedasticity. Under homoscedasticity, raw-scale predictions can be consistently estimated from the log model using the Duan smearing estimator11. A similar re-transformation can be carried out for the square root model. This involves adding the mean residual from the square root model to the squared prediction. If heteroscedasticity is present, consistent re-transformation requires knowledge of its form. As our regressors are not binary in nature, the process is too complex and thus Duan's smearing estimator will still be applied, recognising that this may itself lead to biased predictions. The test for heteroscedasticity depends on the normality of the distribution. Given the normality assumption of the standard Breusch-Pagan test, non-normality renders this test inconsistent. If non-normality is found, a heteroscedasticity test which relaxes the normality assumption for the errors (by specifying an i.i.d. distribution) is used.
One benefit of flexible models, such as count and duration models, is the ability to choose between more general specifications and their nested alternatives. In order to evaluate the competing parametric hazard models, we test the restrictions imposed by each nested model of the GGM distribution using Wald tests. If κ = 0 is not rejected,

11Transformed predictions using this smearing estimator can be produced using Christopher Baum's levpredict command in Stata.
the log-normal model is optimal. If κ = 1 is not rejected, the Weibull model is optimal. If ln(p) = 0 is not rejected, p is not significantly different from 1 and the standard gamma model is optimal. If κ = p = 1 is not rejected, the exponential model is optimal. If all restrictions are rejected, the GGM model is optimal. We report the results of these tests for each frequency-year12 for the full aggregate samples13 (see Tables A1 and A2 in [86]). We also report the rates of rejection for each restriction across all ICD-10 chapters at a 5 percent significance level for each frequency-year (Tables A4 and A5 in [86]). To select between non-nested models, we use the Akaike and Bayesian Information Criteria as well as the log-likelihood value for both the full and constant-only models.
We also test for the optimal count model specification. A formal test of α = 0 from the negative binomial model is carried out to ascertain whether overdispersion is present and thus whether the negative binomial model should be preferred to the Poisson model. The Vuong test is then carried out to determine whether the zero-inflated variant of the model should be used. A test of α = 0 from the zero-inflated negative binomial model is conducted to choose between this model and the zero-inflated Poisson model14. We report the results of these tests for both the aggregate sample and individual chapters for each frequency-year (Tables B1-B4 in [86]). We also report the percentage of chapters for which each test was rejected for each frequency-year (Table B5 in [86]).
4.3.2. In-Sample Tests

We follow Jones [31] in measuring comparative performance using both specification tests and goodness-of-fit measures. We use the Pregibon link and Pearson tests to test for model misspecification. The Pregibon test tests whether (Xβ)² has any explanatory power in addition to Xβ15. The Pearson test checks whether there is a significant correlation between the fitted residuals and Xβ. Significance in both cases implies model misspecification.
We employ a range of goodness-of-fit measures for the estimation sample. These include the R² from an auxiliary regression of actual waiting times on predicted values, the root mean squared error (RMSE), the mean absolute prediction error (MAPE), and the mean prediction error (MPE), where all predictions are calculated on the raw scale. The formulae used in the calculation of each of these measures are given below:
RMSE = sqrt( (1/n) Σ_{i=1}^{n} (y_i − ŷ_i)² )    (4.12)

MAPE = (1/n) Σ_{i=1}^{n} |y_i − ŷ_i|    (4.13)
12Here frequency relates to either weekly or daily waits, and the year to either 2002 or 2007.
13These are the samples which pool data across all chapters for each year.
14Given that the variables which are thought to affect the latent class of zero observations are unknown, if a zero-inflated model is required, we will include the same variables in the logit model as we do to model the count data. If a negative binomial is the preferred model, the generalised form of the negative binomial will also be estimated.
15If it has, then this implies that the true non-linearity in the relationship between the waiting times and the regressors is greater than assumed in the model being estimated.
MPE = (1/n) Σ_{i=1}^{n} (y_i − ŷ_i)    (4.14)

R² = 1 − [ Σ_{i=1}^{n} (y_i − (α̂ + β̂ŷ_i))² ] / [ Σ_{i=1}^{n} (y_i − ȳ)² ]    (4.15)
where α̂ and β̂ represent the parameter estimates from the auxiliary regression. Note that for non-linear models the MPE can be interpreted as a measure of bias.
These metrics are calculated for both the separate ICD-chapter samples16 and the aggregate sample (see Appendix C in [86]) for each frequency-year. To be parsimonious we only fully describe results for one ICD-10 chapter, using the largest sample size and R² as the choice criteria. We then check the consistency of these results across the other chapters and the aggregate sample. This is done by counting the number of occasions a model performs best on each metric, and calculating the average rank of each model across chapters for each metric. For the link and Pearson tests, a model receives a count for a chapter if it passes each test, rather than for a top rank based on the greatest p-value. Average ranks are then based on the proportion of all chapters for which a model passed each test17.
4.3.3. Out-of-Sample Tests

The Copas test is used as the sole specification test for the validation sample. Heavily parameterized models have the potential to over-fit samples of data, leading to poor out-of-sample forecast accuracy [31]. The Copas test provides a good measure of out-of-sample performance and guards against over-fitting when models are used for prediction purposes [54, 87]. The Copas test involves running a simple OLS regression of y = α₀ + α₁ŷ + ε, where ŷ represents the predictions from v-fold cross-validation. If there is no overfitting then we would expect α₀ = 0 and α₁ = 1. A deviation from these values represents evidence of overfitting, and more generally, poor predictive power. As in Jones [31] we use a less restrictive specification of the test, i.e. H₀: α₁ = 1, as well as a test of the more restrictive null, H₀: α₀ = 0, α₁ = 1.
For the validation sample, we employ the same goodness-of-fit measures as for the estimation sample, but omit the R² value. In this case, the MPE measures the degree of bias of predicted values within the forecast sample, and can be interpreted as a measure of the accuracy of predictions at an aggregate level. The MAPE can be interpreted as a measure of the accuracy of individual predictions [29].

The results for separate ICD-10 chapters are available upon request, while count and average rank tables are presented in Appendix D in [86].
16Available upon request.
17The smaller the rank, the better the model.
5. Results and Discussion
5.1. Distributional characteristics of waiting times levels and residuals
In this subsection we discuss the distributions of waiting times and error terms for the levels, log and square root regressions. The sample distributions, defined in terms of weekly and daily frequencies, and for both 2002 and 2007, illustrate the challenges in modelling waiting times. Sample statistics are presented in Table 2. Mean waits are considerably greater than median waits, and the distribution is skewed to the right. The kurtosis of the distribution is significantly larger than three, indicative of a heavy right-hand tail. Non-normality is confirmed by both histograms and formal tests for levels, log and square root transformations for weekly and daily frequencies18 (Figures 1 and 2). The residuals from linear regressions on the level, log and square root transformations of waiting times (Figures 3 and 4) exhibit the same non-normal behaviour.
5.2. Pre-comparison model selection
Given the non-normality of the errors, simple exponentiated coefficients from the log-transformed model provide biased estimates of the true coefficients on the raw scale. To account for this we employ Duan's smearing estimator, noting that it may result in bias due to heteroscedasticity, though considerably less than if a naive exponential retransformation were used. Also, given non-normality and heteroscedasticity, the additive retransformation is used to generate raw-scale predictions for the square-root transformed model.
Results from formal tests for duration model selection indicate that models which allow a more flexible relationship between the hazard function and time perform better. As such, the exponential model, which assumes the hazard rate is constant over time, performs poorly across all metrics for both weekly and daily waits. For weekly waiting times, the log-normal model also performs well, with estimated statistics being close to those of the generalised gamma, but the Weibull model performs poorly. For daily waits, we observe the opposite pattern, with the Weibull model performing well and the log-normal model performing poorly (see Section A in [86]).
For count data models, results reject the Poisson model in favour of the negative binomial model. This is consistent across all chapters in each frequency-year. The Vuong test suggests that zero-inflation is not required for weekly waits in any year for the aggregate samples. This story is identical for the separate ICD-10 chapters, apart from chapter 13 in 2002, where zero-inflation is needed. For daily waits the results differ, with the zero-inflated model being appropriate for the aggregate sample in 2002, and for 50% and 23.5% of the ICD chapters for years 2002 and 2007 respectively. This difference is expected given how weekly waits were defined: a weekly wait of zero was assigned if and only if the daily wait also took a value of zero, and thus there is an identical number of zero observations for both definitions.
18The Shapiro-Wilk and D'Agostino formal tests are not reported; they are available upon request.
However, aggregating waiting times to a weekly level increases the number of observations for each positive value, making zero values less important (see Tables B1 – B5 in [86]).
5.3. In-Sample Results
ICD-10 chapter 2 (Neoplasms/Cancer) has the largest sample size in 2002 and the third largest sample size in 2007. We use this chapter to demonstrate model comparison in detail. The relative performance of models for this chapter in terms of the MPE, MAPE, RMSE and R² for both 2002 and 2007, as well as the Pregibon link and Pearson correlation tests, are presented in Tables 3 and 4 for the weekly and daily frequencies respectively. The best performing model(s) for each metric are highlighted in bold19. Relative performance of models is very similar in each year and frequency of waiting times for the MPE, MAPE, RMSE and R² metrics. Consistent with findings in Veazie et al. [22] and Jones et al. [29], results for this chapter imply a trade-off between bias, as measured by the MPE, and precision, as measured by the MAPE, since no model performs best on both of these metrics for any frequency-year. This result holds for all ICD chapters and aggregate samples.
OLS on level waits, OLS on square root waits, the ECM model estimated by Poisson maximum likelihood, and the GLM with a log link and Poisson distribution perform the best on the MPE for all frequency-years, indicating that these models provide the least biased estimates for this chapter. Results from the count and average rank tables confirm the consistency of these findings across ICD chapters, with these models being the only ones to produce the lowest MPE for at least one chapter, and the four best placed models in terms of average rank. These are also the four best performing models in the aggregate samples for all frequency-years. Count data models and other GLM specifications also perform well across all chapters, shown by their moderately high average ranks. The Cox proportional hazards model performs particularly badly on this metric, with the average error being almost as large as the mean wait in each frequency-year. Estimates from all the parametric duration models, log-OLS, and quantile regression models, as well as the EEE when it converged, are also heavily biased. This is again consistent across chapters, with these models providing the poorest average ranks, and consistent with the results from the aggregate sample.
A different pattern is observed for the MAPE, where quantile regression performs the best and parametric duration models exhibit similarly good performance in all frequency-years for the cancer chapter. Again consistency is found across chapters, with these models providing clearly superior average ranks, and also being the only models to perform best for at least one chapter for the vast majority of frequency-years. These are also the best performing models in the aggregate samples. These results indicate that although estimated waits from these models are biased, they are the most accurate. The EEE, log-OLS, and Cox PH models again perform poorly for the cancer chapter, across chapters, and in the aggregate sample, although the EEE does provide the most accurate in-sample predictions for chapter 16 for
19Results are missing for non-convergence cases.
both weekly and daily waits in 200220. Count models and all GLM specifications again perform well across chapters for all frequency-years, indicated by their good average ranks and relative performance in aggregate samples.
In terms of goodness of fit, the ECM model estimated by NLLS and the GLM with a log link and normal distribution achieve the highest R² and lowest RMSE for all frequency-years, although all other GLM specifications, OLS on level, log, and square root waits, and all count data models generate similar levels of performance. Results from the best-count calculations indicate results for the cancer chapter and across chapters are similar: the GLM log-normal, the ECM model estimated by NLLS, and the GLM with a log link and normal distribution all generate either the highest R² or lowest RMSE for at least one chapter for all frequency-years, and the zero-inflated Poisson model meets this criterion for three out of the four frequency-years. These models also perform the best on average ranks and in the aggregate samples. In general, both the Cox PH and parametric duration models perform very poorly for chapter 2, across chapters, and in the estimation sample. However, surprisingly, the generalised gamma model is found to be the best fitting model in some chapters despite its poor performance elsewhere.
The Cox proportional hazards model, the GLM log-normal, and the GLM sqrt-gamma are the only models to pass the Pregibon link test for the daily frequency in 2002, indicating that they are the only models to capture the true non-linearity between waiting times and the regressors in 2002. This is surprising given that other models, such as the count models, perform consistently well on the other metrics. The story is similar across other frequency-years, but all forms of the negative binomial model also pass the link test there. However, these results are not totally indicative of the results across other chapters. Although the models which pass the link test for chapter 2 tend to be those for which the test was passed for the most other chapters, all models but the ECM model estimated both by NLLS and Poisson ML, the GLM log-Poisson, and the GLM sqrt-gamma pass the link test for over half of the ICD-10 chapters for all frequency-years21. When the EEE model converged and standard errors were produced, this model also performed well on the link test, passing for at least half of all chapters for all frequency-years.
Furthermore, for chapter 2, all parametric duration models and a finite mixture of negative binomials also fail the Pearson test for all frequency-years, indicating further misspecification22. The story is consistent across chapters, with these models passing the Pearson test for less than half of the 18 chapters studied. OLS on transformed waits also consistently performs poorly across chapters. Also, despite its good performance on the link test, the Cox proportional hazards model fails the Pearson test for all frequency-years for the cancer chapter. Looking at results across all chapters, frequencies and years, OLS on level waits, the ECM estimated by both Poisson ML and NLLS, all count data models, and all GLM specifications excluding the GLM sqrt-gamma perform exceptionally well on the Pearson test,
20This chapter relates to diseases of the genitourinary system.
21Also, the GLM sqrt-gamma only fails to pass for over half of chapters for the daily frequency in 2002, and all models pass the link test for over half of the chapters for the weekly frequency in 2007.
22The quantile regression model and OLS on both square root and logged waits also fail both specification tests for all frequency-years apart from the daily frequency in 2002.
with the test indicating misspecification for either one or zero chapters. OLS on transformed waits and both survival models perform very poorly on this metric for all frequency-years. Results are similar when considering the results from aggregate data.
In general, full-sample results indicate that generalised linear models with log links and count data models are the most appropriate for the modelling of waiting times data. Allowing more flexibility in the count models, either through zero-inflation or using the generalised negative binomial, does not significantly improve performance. However, introducing additional flexibility into the GLM through use of the EEE has large negative effects on performance. The EEE, quantile regression, and all duration models (parametric and Cox PH) perform poorly across most of the metrics and thus are not appropriate for modelling waiting times with this dataset23. Also notable is that the OLS model performs well across the majority of criteria, a finding common in many of the model comparisons using healthcare cost data. Performance is similar when OLS is applied to square-root transformed waits. However, model performance is reduced when using the logarithm of waits as the dependent variable, especially in terms of bias and accuracy.
5.4. Validation Sample Results
Results from the validation forecasts largely confirm those from the in-sample estimations. OLS on both the level and square-root transformed waits, the GLM log-Poisson, the ECM estimated by Poisson ML, and all count data models are amongst the models generating out-of-sample predictions with very little bias for chapter 2. However, count and average rank results indicate that the relative performance of count models has improved. The negative binomial and its zero-inflated and generalised extensions generate the lowest MPE for some chapters in some frequency-years. We also find that the square-root transformed OLS model no longer generates the lowest MPE for any chapter in any frequency-year, and its average ranking deteriorates. This change is mirrored in the aggregate results. The Cox PH, generalised gamma, and quantile regression models consistently perform poorly on this metric across all chapters and frequency-years.
Quantile regression and the parametric duration models produce the lowest MAPE for chapter 2 for all frequency-years. This performance is mirrored across chapters, with these models being the only ones that generate the lowest MAPE for chapters in any frequency-year, and they have by far the best average rank figures. This indicates that good in-sample accuracy has carried through to good out-of-sample accuracy. As before, the Cox PH, log-transformed OLS, and EEE models generate predictions with substantial inaccuracy, both for chapter 2 and across chapters, indicated by their poor average ranks for each frequency-year.
The ECM model estimated by NLLS and the GLM log-normal again produce the best predictions for the cancer chapter, as measured by the RMSE. This is consistent across all frequencies and years. All other GLM specifications,
23The disadvantages of the EEE increase further when taking into account the problems with convergence.
OLS on level, log, and square root waits, and all count data models also perform well on this metric. Count and rank results indicate this performance carries through across all chapters. Both the semi-parametric Cox model and the parametric duration models perform particularly badly on this metric in all frequency-years, generating very poor average ranks.
The Copas test was used as a test for overfitting and, more generally, of out-of-sample forecast accuracy. As expected, the Copas test indicates that there is no overfitting when the OLS regression on level waits, the simplest of all the models, is considered for the cancer chapter. In addition, more complex models such as the generalised and zero-inflated count data models pass the Copas test, despite being the most susceptible to overfitting the data. All count data and GLM specifications with log links pass the Copas test in each frequency-year. The quantile regression, all duration models, and the log- and square-root-transformed OLS models are found to overfit the data. This indicates that they are not appropriate for modelling waiting times in this dataset. This finding is consistent across chapters: models that pass the Copas test for the cancer chapter do so for over two-thirds of all chapters in each frequency-year, and models that fail overfit the data for the majority of other chapters in each frequency-year.
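The shrinkage regression underlying the Copas test can be sketched as follows: realised outcomes are regressed on out-of-sample predictions, and overfitting shows up as a slope below one (the χ² statistics in Tables 5 and 6 test β1 = 1 and (β0, β1) = (0, 1)). The simulated series below are our own illustration, not the paper's data:

```python
import numpy as np

def copas_coefficients(y_new, y_hat_new):
    """Regress realised outcomes on out-of-sample predictions:
    y = b0 + b1 * y_hat + u. Overfitting shrinks b1 below 1."""
    X = np.column_stack([np.ones_like(y_hat_new), y_hat_new])
    (b0, b1), *_ = np.linalg.lstsq(X, y_new, rcond=None)
    return b0, b1

rng = np.random.default_rng(0)
y_hat = rng.gamma(2.0, 5.0, size=500)       # hypothetical skewed forecasts
y = y_hat + rng.normal(0.0, 2.0, size=500)  # well-calibrated outcomes
b0, b1 = copas_coefficients(y, y_hat)       # slope near 1: no overfitting
```

In practice the hypothesis tests on (β0, β1) would be run with the usual regression standard errors; only the point estimates are sketched here.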
6. Conclusions
We estimate a wide range of models used in the analysis of inpatient waiting times and other skewed health outcomes using Scottish waiting times data. Tests on the parameter estimates of a generalised gamma model and information criteria are used to evaluate the duration models. We also run tests to determine the optimal count data model and whether zero-inflation is needed. Comparative performance was determined using a quasi-experimental design. Specification tests and measures of bias, accuracy, and goodness of fit were used to determine in-sample performance. V-fold cross-validation techniques were then used to generate a set of out-of-sample forecasts. The same measures of bias, accuracy, and fit were calculated on this sample, and the Copas test was run to check for overfitting.
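The V-fold cross-validation step can be sketched as follows, assuming a simple one-regressor OLS model and simulated data (purely illustrative):

```python
import numpy as np

def vfold_predictions(x, y, v=5, seed=0):
    """V-fold cross-validation for a simple OLS fit: each observation's
    prediction comes from a model estimated without its own fold."""
    rng = np.random.default_rng(seed)
    folds = rng.permutation(len(y)) % v        # random fold labels 0..v-1
    y_hat = np.empty(len(y))
    for k in range(v):
        train, test = folds != k, folds == k
        X = np.column_stack([np.ones(train.sum()), x[train]])
        beta, *_ = np.linalg.lstsq(X, y[train], rcond=None)
        y_hat[test] = beta[0] + beta[1] * x[test]
    return y_hat

rng = np.random.default_rng(1)
x = rng.uniform(0.0, 10.0, 200)
y = 2.0 + 3.0 * x + rng.normal(0.0, 0.5, 200)  # hypothetical linear data
y_hat = vfold_predictions(x, y)                # out-of-sample predictions
oos_rmse = np.sqrt(np.mean((y - y_hat) ** 2))
```

Because every observation is predicted from a model that never saw it, the out-of-sample measures computed on `y_hat` are not flattered by overfitting.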
The results suggest that a wide range of models may be appropriate for the modelling of waiting times. In particular, count data models generate unbiased predictions with reasonable accuracy, capture well the non-linearity in the relationship between waiting times and the regressors, and do not overfit the data. Results are similar for generalised linear models. We also find that a simple linear OLS specification performs well on the majority of metrics, calling into question the need to employ more advanced and computationally intensive techniques.
References
[1] Iversen T. A theory of hospital waiting lists. Journal of Health Economics 1993; 12(1):55–71.
[2] Iversen T. The effect of a private sector on the waiting time in a national health service. Journal of Health Economics 1997; 16(4):381–396.
[3] Lindsay C, Feigenbaum B. Rationing by waiting lists. The American Economic Review 1984: 404–417.
[4] Martin S, Smith P. Rationing by waiting lists: an empirical investigation. Journal of Public Economics 1999; 71(1):141–164.
[5] Cullis J, Jones P, Propper C. Waiting lists and medical treatment: analysis and policies. Handbook of Health Economics 2000; 1:1201–1249.
[6] Dusheiko M, Gravelle H, Jacobs R. The effect of practice budgets on patient waiting times: allowing for selection bias. Health Economics 2004; 13(10):941–958.
[7] Propper C. The disutility of time spent on the United Kingdom's National Health Service waiting lists. Journal of Human Resources 1995: 677–700.
[8] Siciliani L, Hurst J. Tackling excessive waiting times for elective surgery: a comparative analysis of policies in 12 OECD countries. Health Policy 2005; 72(2):201–215.
[9] Laudicella M, Siciliani L, Cookson R. Waiting times and socioeconomic status: Evidence from England. Social Science & Medicine 2012.
[10] Gravelle H, Siciliani L. Ramsey waits: Allocating public health service resources when there is rationing by waiting. Journal of Health Economics 2008; 27(5):1143.
[11] Appleby J, Boyle S, Devlin N, Harley M, Harrison A, Thorlby R. Do English NHS waiting time targets distort treatment priorities in orthopaedic surgery? Journal of Health Services Research and Policy 2005; 10(3):167–172.
[12] Nikolova S, Harrison M, Sutton M. Does waiting time influence the effectiveness of surgery? Evidence from the national PROMs dataset. University of Manchester working paper 2014.
[13] Sanmartin C, Berthelot J, McIntosh C, et al. Determinants of unacceptable waiting times for specialized services in Canada. Healthcare Policy 2007; 2(3):e140.
[14] Cutler D. Equality, efficiency, and market fundamentals: the dynamics of international medical-care reform. Journal of Economic Literature 2002; 40(3):881–906.
[15] Cullis J, Jones P. Inpatient waiting: a discussion and policy proposal. British Medical Journal (Clinical research ed.) 1983; 287(6403):1483.
[16] Cullis J, Jones P. National health service waiting lists: A discussion of competing explanations and a policy proposal. Journal of Health Economics 1985; 4(2):119–135.
[17] Smith P. Performance management in British health care: will it deliver? Health Affairs 2002; 21(3):103–115.
[18] Oliver A. The English National Health Service: 1979-2005. Health Economics 2005; 14(S1):S75–S99.
[19] Duan N, Manning W, Morris C, Newhouse J. A comparison of alternative models for the demand for medical care. Journal of Business & Economic Statistics 1983; 1(2):115–126.
[20] Manning W, Mullahy J. Estimating log models: to transform or not to transform? Journal of Health Economics 2001; 20(4):461–494.
[21] Manning W, Basu A, Mullahy J. Generalized modeling approaches to risk adjustment of skewed outcomes data. Journal of Health Economics 2005; 24(3):465–488.
[22] Veazie P, Manning W, Kane R. Improving risk adjustment for Medicare capitated reimbursement using nonlinear models. Medical Care 2003; 41(6):741–752.
[23] Buntin M, Zaslavsky A. Too much ado about two-part models and transformation? Comparing methods of modeling Medicare expenditures. Journal of Health Economics 2004; 23(3):525–542.
[24] Montez-Rath M, Christiansen C, Ettner S, Loveland S, Rosen A. Performance of statistical models to predict mental health and substance abuse cost. BMC Medical Research Methodology 2006; 6(1):53.
[25] Basu A, Rathouz P. Estimating marginal and incremental effects on health outcomes using flexible link and variance function models. Biostatistics 2005; 6(1):93–109.
[26] Basu A, Arondekar B, Rathouz P. Scale of interest versus scale of estimation: comparing alternative estimators for the incremental costs of a comorbidity. Health Economics 2006; 15(10):1091–1107.
[27] Basu A, Manning W, Mullahy J. Comparing alternative models: log vs Cox proportional hazard? Health Economics 2004; 13(8):749–765.
[28] Mullahy J. Econometric modeling of health care costs and expenditures: a survey of analytical issues and related policy considerations. Medical Care 2009; 47(7 Supplement 1):S104–S108.
[29] Jones A, Lomas J, Rice N, et al. Applying beta-type size distributions to healthcare cost regressions. Health, Econometrics and Data Group (HEDG) Working Papers 2011.
[30] Hill S, Miller G. Health expenditure estimation and functional form: applications of the generalized gamma and extended estimating equations models. Health Economics 2010; 19(5):608–627.
[31] Jones A. Models for health care. University of York, Centre for Health Economics, 2010.
[32] Deb P, Burgess J. A quasi-experimental comparison of econometric models for health care expenditures. Hunter College Department of Economics Working Papers 2003; 212.
[33] Johar M, Keane M, Jones G, Savage E, Stavrunova O. Differences in waiting times for elective admissions in NSW public hospitals: A decomposition analysis by non-clinical factors. Technical Report, CHERE Working Paper 2010.
[34] Johar M, Jones G, Keane M, Savage E, Stavrunova O. Waiting times for elective surgery and the decision to buy private health insurance. Health Economics 2011; 20(S1):68–86.
[35] Reyes J. Do female physicians capture their scarcity value? The case of ob/gyns. Technical Report, National Bureau of Economic Research 2006.
[36] Carlsen F, Kaarboe O. Notatserie i helseøkonomi. Technical Report.
[37] Cooper Z, McGuire A, Jones S, Le Grand J. Equity, waiting times, and NHS reforms: retrospective study. British Medical Journal 2009; 339.
[38] Pell J, Pell A, Norrie J, Ford I, Cobbe S. Effect of socioeconomic deprivation on waiting time for cardiac surgery: retrospective cohort study. British Medical Journal 2000; 320(7226):15.
[39] Hauck K, Street A. Do targets matter? A comparison of English and Welsh national health priorities. Health Economics 2007; 16(3):275–290.
[40] Sloan F, Lorant J. The role of waiting time: Evidence from physicians' practices. The Journal of Business 1977; 50(4):486–507.
[41] Tinghog G, Andersson D, Tinghog P, Lyttkens C. Horizontal inequality in rationing by waiting lists. International Journal of Health Services 2010.
[42] Sharma A, Siciliani L, Harris A. Waiting times and socioeconomic status: does sample selection matter? Monash University, Business and Economics, Centre for Health Economics, 2011.
[43] Monstad K, Engesæter L, Espehaug B. Waiting time and socioeconomic status - an individual level analysis. Technical Report 2010.
[44] James C, Bourgeois F, Shannon M. Association of race/ethnicity with emergency department wait times. Pediatrics 2005; 115(3):e310–e315.
[45] Park C, Lee M, Epstein A. Variation in emergency department wait times for children by race/ethnicity and payment source. Health Services Research 2009; 44(6):2022–2039.
[46] Johar M, Jones G, Keane M, Savage E, Stavrunova O. Discrimination in a universal health system: Explaining socioeconomic waiting time gaps. School of Finance and Economics, University of Technology, Sydney Working Paper Series 2011.
[47] Siciliani L, Martin S. An empirical analysis of the impact of choice on waiting times. Health Economics 2007; 16(8):763–779.
[48] Janulevicuite J, Askildsen E, Kaarboe O, Holmas T, Sutton M. The impact of different prioritisation policies on waiting times: Case studies of Norway and Scotland. Social Science and Medicine 2013; 97(November):1–6.
[49] Iversen T, Luras H. The interaction between patient shortage and patients' waiting time. Technical Report, Oslo University, Health Economics Research Programme 2009.
[50] Siciliani L, Verzulli R. Waiting times and socioeconomic status among elderly Europeans: evidence from SHARE. Health Economics 2009; 18(11):1295–1306.
[51] Roll K, Stargardt T, Schreyogg J. Effect of type of insurance and income on waiting time for outpatient care. The Geneva Papers on Risk and Insurance - Issues and Practice 2012.
[52] Arnesen K, Erikssen J, Stavem K. Gender and socioeconomic status as determinants of waiting time for inpatient surgery in a system with implicit queue management. Health Policy 2002; 62(3):329–341.
[53] Dimakou S, Parkin D, Devlin N, Appleby J. Identifying the impact of government targets on waiting times in the NHS. Health Care Management Science 2009; 12(1):1–10.
[54] Blough D, Madden C, Hornbrook M. Modeling risk using generalized linear models. Journal of Health Economics 1999; 18(2):153–171.
[55] Manning W, et al. The logged dependent variable, heteroscedasticity, and the retransformation problem. Journal of Health Economics 1998; 17(3):283–296.
[56] Mullahy J. Much ado about two: reconsidering retransformation and the two-part model in health econometrics. Journal of Health Economics 1998; 17(3):247–281.
[57] Box G, Cox D. An analysis of transformations. Journal of the Royal Statistical Society, Series B (Methodological) 1964: 211–252.
[58] Chaze J. Assessing household health expenditure with Box–Cox censoring models. Health Economics 2005; 14(9):893–907.
[59] Koenker R, Hallock K. Quantile regression: An introduction. Journal of Economic Perspectives 2001; 15(4):43–56.
[60] Cameron A, Trivedi P. Regression Analysis of Count Data, vol. 30. Cambridge University Press, 1998.
[61] Long J. Regression Models for Categorical and Limited Dependent Variables, vol. 7. Sage Publications, 1997.
[62] Maddala G. Limited-Dependent and Qualitative Variables in Econometrics, vol. 3. Cambridge University Press, 1986.
[63] Deb P, Trivedi P. Demand for medical care by the elderly: a finite mixture approach. Journal of Applied Econometrics 1997; 12:313–336.
[64] Vuong Q. Likelihood ratio tests for model selection and non-nested hypotheses. Econometrica 1989: 307–333.
[65] Jenkins S. Survival analysis. Unpublished manuscript, Institute for Social and Economic Research, University of Essex, Colchester, UK 2005.
[66] Jenkins S. Distributionally-sensitive inequality indices and the GB2 income distribution. Review of Income and Wealth 2009; 55(2):392–398.
[67] Sun J, Frees E, Rosenberg M. Heavy-tailed longitudinal data modeling using copulas. Insurance: Mathematics and Economics 2008; 42(2):817–830.
[68] Cox D. Regression models and life-tables. Journal of the Royal Statistical Society, Series B (Methodological) 1972: 187–220.
[69] McCullagh P, Nelder J. Generalized Linear Models, vol. 37. Chapman & Hall/CRC, 1989.
[70] Nelder J, Wedderburn R. Generalized linear models. Journal of the Royal Statistical Society, Series A (General) 1972: 370–384.
[71] Pregibon D. Goodness of link tests for generalized linear models. Applied Statistics 1980.
[72] Park R. Estimation with heteroscedastic error terms. Econometrica 1966: 888.
[73] Gourieroux C, Monfort A, Trognon A. Pseudo maximum likelihood methods: Theory. Econometrica 1984: 681–700.
[74] Analyzing count data. URL http://www.ats.ucla.edu/stat/stata/library/count.htm.
[75] Chiou J, Muller H. Quasi-likelihood regression with unknown link and variance functions. Journal of the American Statistical Association 1998; 93(444):1376–1387.
[76] Shah N, Craig B, Banerjee R, Tulledge-Scheitel S, Naessens J. Evaluating health plan benefit change: A finite mixture model approach. Available at SSRN 995052 2007.
[77] Deb P, Holmes A. Estimates of use and costs of behavioural health care: a comparison of standard and finite mixture models. Health Economics 2000; 9(6):475–489.
[78] Deb P, Trivedi P. The structure of demand for health care: latent class versus two-part models. Journal of Health Economics 2002; 21(4):601–625.
[79] Jimenez-Martin S, Labeaga J, Martinez-Granado M. Latent class versus two-part models in the demand for physician services across the European Union. Health Economics 2002; 11(4):301–321.
[80] Bago d'Uva T. Latent class models for utilisation of health care. Health Economics 2006; 15(4):329–343.
[81] Lourenco O, Ferreira P. Utilization of public health centres in Portugal: effect of time costs and other determinants. Finite mixture models applied to truncated samples. Health Economics 2005; 14(9):939–953.
[82] Conway K, Deb P. Is prenatal care really ineffective? Or, is the "devil" in the distribution? Journal of Health Economics 2005; 24(3):489–513.
[83] Quan H, Sundararajan V, Halfon P, Fong A, Burnand B, Luthi J, Saunders L, Beck C, Feasby T, Ghali W. Coding algorithms for defining comorbidities in ICD-9-CM and ICD-10 administrative data. Medical Care 2005: 1130–1139.
[84] Shapiro S, Wilk M. An analysis of variance test for normality (complete samples). Biometrika 1965; 52(3/4):591–611.
[85] D'Agostino R, Belanger A, D'Agostino Jr R. A suggestion for using powerful and informative tests of normality. The American Statistician 1990; 44(4):316–321.
[86] Sinko A, Turner A, Nikolova S, Sutton M. Appendix for "Models for waiting times in healthcare: Comparative study using Scottish administrative data". Technical Report 2014.
[87] Copas J. Regression, prediction and shrinkage. Journal of the Royal Statistical Society, Series B (Methodological) 1983: 311–354.
7. Figures
[Figure: six histogram panels, each with vertical axis "Density": Weekly 2002 (0-100 weeks); Daily 2002 (0-800 days); Log Weekly 2002 (0-5 log weeks); Log Daily 2002 (0-8 log days); Root Weekly 2002 (0-10 root weeks); Root Daily 2002 (0-30 root days).]

Figure 1. HISTOGRAM PLOTS OF LEVEL, LOG AND ROOT WAITING TIMES: 2002
[Figure: six histogram panels, each with vertical axis "Density": Weekly 2007 (0-100 weeks); Daily 2007 (0-800 days); Log Weekly 2007 (0-5 log weeks); Log Daily 2007 (0-8 log days); Root Weekly 2007 (0-10 root weeks); Root Daily 2007 (0-30 root days).]

Figure 2. HISTOGRAM PLOTS OF LEVEL, LOG AND ROOT WAITING TIMES: 2007
[Figure: six histogram panels of OLS residuals, each with vertical axis "Density": Weekly 2002 (0 to 100); Daily 2002 (0 to 800); Log Weekly 2002 (-2 to 3); Log Daily 2002 (-4 to 4); Root Weekly 2002 (-5 to 10); Root Daily 2002 (-10 to 20).]

Figure 3. HISTOGRAM PLOTS OF OLS RESIDUALS FROM LEVEL, LOG AND ROOT REGRESSIONS: 2002
[Figure: six histogram panels of OLS residuals, each with vertical axis "Density": Weekly 2007 (0 to 100); Daily 2007 (0 to 800); Log Weekly 2007 (-2 to 3); Log Daily 2007 (-4 to 4); Root Weekly 2007 (-2 to 8); Root Daily 2007 (-10 to 20).]

Figure 4. HISTOGRAM PLOTS OF OLS RESIDUALS FROM LEVEL, LOG AND ROOT REGRESSIONS: 2007
8. Tables
Table 1. SUMMARY OF MODEL ESTIMATORS AND METHOD OF GENERATING PREDICTIONS

Each entry gives the estimating equation and the prediction µ(x).

Ordinary Least Squares (OLS)
Linear OLS: Y = x′β + ε; µ(x) = x′β.
Log OLS (with Duan smearing estimator): ln(Y) = x′β + ε, E[exp(ε)] = constant; µ(x) = exp(x′β)·s, with s = N⁻¹ Σᵢ exp(ε̂ᵢ).
Square-root OLS (with additive smearing estimator): √Y = x′β + ε, E[ε²] = constant; µ(x) = (x′β)² + s, with s = N⁻¹ Σᵢ ε̂ᵢ².

Survival/Duration Models (λᵢ = exp(xᵢβ) throughout)
Exponential model: h(t, x) = λᵢ; Ê[tᵢ] = exp(−xᵢβ).
Weibull model: h(t, x) = p λᵢ t^(p−1); Ê[tᵢ] = (1/λᵢ)^(1/p) Γ(1 + 1/p).
Log-logistic model: h(t, x) = λ^(1/γ) t^(1/γ − 1) / (γ[1 + (λt)^(1/γ)]); E[T] = (1/λ) · γπ/sin(γπ).
Log-normal model: h(t, x) = [(1/(tσ√(2π))) exp(−(ln t − µ)²/(2σ²))] / [1 − Φ((ln t − µ)/σ)], with µ = xᵢβ; prediction formulae N/A.
Gompertz model: h(t, x) = λᵢ exp(γt); prediction formulae N/A.
Generalised gamma model: f(t) = p λ (λt)^(pκ−1) exp(−(λt)^p) / Γ(κ); prediction formulae N/A.
Generalised beta of the second kind (GB2): f(y) = a y^(ap−1) / (b^(ap) B(p, q) [1 + (y/b)^a]^(p+q)), where B(u, v) = Γ(u)Γ(v)/Γ(u + v); Ê(y) = b Γ(p + 1/a) Γ(q − 1/a) / (Γ(p)Γ(q)).
Cox proportional hazards model: h(yᵢ|xᵢ) = h₀(yᵢ)λᵢ; µ(x) = x′β.

Generalised Linear Models (GLMs)
Log-Poisson: ln(µ(x)) = x′β, V(Y|X = x) = µ(x); µ(x) = exp(x′β).
Log-gamma: ln(µ(x)) = x′β, V(Y|X = x) = θ₁(µ(x))²; µ(x) = exp(x′β).
Log-normal: ln(µ(x)) = x′β, V(Y|X = x) = θ₁; µ(x) = exp(x′β).
Square root-gamma: √(µ(x)) = x′β, V(Y|X = x) = θ₁(µ(x))²; µ(x) = (x′β)².
Extended Estimating Equations (EEE): x′ᵢβ = g(µᵢ; λ); µ(x) = (x′β·λ + 1)^(1/λ).

Count Data Models
Poisson: P_p(Yᵢ = yᵢ|xᵢ) = e^(−µᵢ) µᵢ^(yᵢ) / yᵢ!; µ_p(x) = exp(x′β).
Negative binomial: P_nb(Yᵢ = yᵢ|xᵢ) = [Γ(θ + yᵢ)/(Γ(yᵢ + 1)Γ(θ))] θ^θ µᵢ^(yᵢ) / (µᵢ + θ)^(θ+yᵢ); µ_nb(x) = exp(x′β).
Zero-inflated Poisson: P(yᵢ = 0) = p₀(γ′zᵢ) + P_p(0|xᵢ)(1 − p₀(γ′zᵢ)); P(yᵢ > 0) = P_p(yᵢ|xᵢ)(1 − p₀(γ′zᵢ)); µ(xᵢ, zᵢ) = µ_p(1 − p₀).
Zero-inflated negative binomial: as above with P_nb in place of P_p; µ(xᵢ, zᵢ) = µ_nb(1 − p₀).
Generalised negative binomial: P(Yᵢ = yᵢ) = [n/(n + βyᵢ)] C(n + βyᵢ, yᵢ) α^(yᵢ) (1 − α)^(n+βyᵢ−yᵢ); µ = nα/(1 − αβ).

Finite Mixture Models (FMM)
Poisson: P(Yᵢ = yᵢ|xᵢ) = Σₖ πₖ P_p(yᵢ|xᵢ; θₖ); µ = Σₖ πₖ µ_p(θₖ).
Negative binomial: P(Yᵢ = yᵢ|xᵢ) = Σₖ πₖ P_nb(yᵢ|xᵢ; θₖ); µ = Σₖ πₖ µ_nb(θₖ).

Others
Quantile regression (median): Q_τ(yᵢ|xᵢ) = xᵢβ_τ; µ(x) = x′β_τ.
ECM - NLLS: E[yᵢ|xᵢ] = exp(xᵢβ); µ(x) = exp(x′β).
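The retransformation used by the log OLS model above (Duan's smearing estimator) can be sketched as follows, on simulated lognormal data; the regressor, coefficients, and sample are purely illustrative:

```python
import numpy as np

def log_ols_predictions(x, y):
    """Log OLS with Duan's smearing retransformation: fit
    ln(y) = b0 + b1*x + e, then predict exp(b0 + b1*x) * mean(exp(e_hat))."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, np.log(y), rcond=None)
    resid = np.log(y) - X @ beta
    s = np.exp(resid).mean()      # smearing factor N^-1 * sum_i exp(e_hat_i)
    return np.exp(X @ beta) * s   # predictions on the level scale

rng = np.random.default_rng(2)
x = rng.uniform(0.0, 2.0, 1000)
y = np.exp(1.0 + 0.5 * x + rng.normal(0.0, 0.3, 1000))  # lognormal "waits"
pred = log_ols_predictions(x, y)
```

Without the smearing factor, exp(x′β̂) would systematically underpredict the level of y, since E[exp(ε)] > 1 for a mean-zero error.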
Table 2. CHARACTERISTICS OF THE 2002 AND 2007 SAMPLES

                           2002                  2007
                      Daily     Weekly      Daily     Weekly
N                    321922     321922     335487     335487
Mean               79.39171   11.75959   62.96882   9.413593
Median                   42          6         44          7
Standard deviation  99.7035    14.2399   71.15008   10.16192
Skewness           2.272736   2.273856   3.479698   3.481343
Kurtosis           9.129328   9.131606   21.89617   21.91007
Maximum                 728        104        728        104
99th percentile         448         64        373         54
95th percentile         305         44        169         25
90th percentile         213         31        125         18
75th percentile         100         15         87         13
25th percentile          15          3         19          7
10th percentile           5          1          7          1
5th percentile            2          1          3          1
1st percentile            1          1          1          1
Minimum                   0          0          0          0
Table 3. RESULTS FOR CHAPTER 2: CANCER, WEEKLY FREQUENCY, IN-SAMPLE PREDICTION

Model  MPE  MAPE  RMSE  LINK test (p-value)  Pearson test (p-value)  R2

Year 2002
OLS  -6.82e-08  5.142  8.957  0  1.000  0.113
OLS on ln(y)  -0.474  5.394  8.987  2.22e-16  0  0.114
OLS on √y  8.99e-08  5.182  8.981  0  0  0.115
Cox PH  6.026  6.073  11.388  0.089  0  0.108
Generalized Gamma (PH)  2.985  4.590  9.615  4.44e-16  0  0.111
Exponential (PH)  2.057  4.602  9.258  0.0246  0  0.114
Gompertz (PH)  2.627  4.572  9.453  0.103  0  0.113
Log-logistic (PH)  2.671  4.557  9.453  0  0  0.113
Log-normal (PH)  2.611  4.567  9.447  7.82e-13  0  0.113
Weibull (PH)  2.045  4.6035  9.255  0.025  0  0.114
NLLS  0.025  5.107  8.932  0  0.291  0.118
GLM log-Poisson  -3.9e-08  5.120  8.935  0  0.375  0.117
GLM log-gamma  0.008  5.129  8.952  0.0995  0.165  0.114
GLM log-normal  0.025  5.107  8.932  1.72e-5  0.291  0.118
GLM sqrt-gamma  0.012  5.139  8.967  0.075  0.119  0.111
EEE  .  .  .  .  .  .
FMM - negative binomial  0.246  5.065  8.980  1.10e-07  0  0.114
Quantile regression (median)  2.584  4.553  9.411  .  0  0.105
ECM - Poisson ML  -1.22e-07  5.1204  8.935  0  0.375  0.117
Zero-inflated poisson  -1.4e-5  5.120  8.935  3.21e-06  0.376  0.117
Negative Binomial  0.009  5.125  8.947  0.001  0.119  0.115
Zero-inflated negative binomial  0.009  5.125  8.947  0.007  0.119  0.115
Generalized Negative Binomial  0.009  5.125  8.947  0.001  0.119  0.115

Year 2007
OLS  -3.5e-09  4.009  7.072  3.1e-15  1.000  0.110
OLS on ln(y)  -0.696  4.316  7.111  5.73e-12  7.68e-14  0.110
OLS on √y  -1.1e-07  4.0367  7.081  0  4.7e-28  0.111
Cox PH  5.176  5.214  9.234  0.740  0  0.107
Generalized Gamma (PH)  2.160  3.690  7.482  2.0e-11  0  0.107
Exponential (PH)  1.775  3.689  7.348  0.439  0  0.108
Gompertz (PH)  2.044  3.690  7.436  0.581  0  0.108
Log-logistic (PH)  1.934  3.668  7.382  1.8e-15  0  0.109
Log-normal (PH)  1.923  3.679  7.392  1.1e-07  0  0.108
Weibull (PH)  1.503  3.703  7.278  0.190  0  0.108
NLLS  0.013  3.985  7.058  2.2e-08  0.427  0.113
GLM log-Poisson  -4.7e-08  3.996  7.061  0  0.606  0.113
GLM log-gamma  -0.001  4.009  7.074  0.570  0.914  0.109
GLM log-normal  0.014  3.985  7.058  3.5e-06  0.427  0.113
GLM sqrt-gamma  -4.0e-4  4.017  7.081  0.948  0.986  0.108
FMM - negative binomial  0.139  3.987  7.083  2.8e-06  4.9e-28  0.111
Quantile regression (median)  1.970  3.661  7.376  .  0  0.108
ECM - Poisson ML  -3.9e-08  3.996  7.061  0  0.606  0.113
Zero-inflated poisson  8.4e-06  3.996  7.061  0.001  0.599  0.113
Negative Binomial  9.9e-4  4.004  7.070  0.734  0.745  0.110
Zero-inflated negative binomial  9.9e-4  4.004  7.070  0.788  0.745  0.110
Generalized Negative Binomial  9.9e-4  4.004  7.070  0.734  0.745  0.110
Table 4. RESULTS FOR CHAPTER 2: CANCER, DAILY FREQUENCY, IN-SAMPLE PREDICTION

Model  MPE  MAPE  RMSE  LINK test (p-value)  Pearson test (p-value)  R2

Year 2002
OLS  -1.4e-07  36.067  62.750  0  1.000  0.113
OLS on ln(y)  -2.272  36.937  62.835  3.3e-11  0.037  0.112
OLS on √y  1.3e-07  36.341  62.933  0  0  0.115
Cox PH  42.874  42.897  79.377  0.361  0  0.107
Generalized Gamma (PH)  19.615  32.063  66.568  1.5e-06  0  0.113
Exponential (PH)  13.471  32.434  64.682  0.158  0  0.114
Gompertz (PH)  18.595  32.151  66.421  0.501  0  0.112
Log-logistic (PH)  19.754  32.055  66.575  2.5e-10  0  0.112
Log-normal (PH)  21.072  32.146  67.161  1.5e-08  0  0.111
Weibull (PH)  17.166  32.102  65.752  0.080  0  0.114
NLLS  0.195  35.776  62.566  0  0.277  0.119
GLM log-Poisson  1.60e-07  35.877  62.591  0  0.365  0.118
GLM log-gamma  0.046  35.973  62.734  0.352  0.264  0.114
GLM log-normal  0.195  35.776  62.566  5.0e-5  0.277  0.119
GLM sqrt-gamma  0.067  36.052  62.851  0.256  0.211  0.111
EEE  42.581  42.593  79.084  .  0  0.112
FMM - negative binomial  1.292  35.656  62.890  8.0e-09  3.4e-35  0.114
Quantile regression (median)  18.517  32.037  66.084  0  0  0.109
ECM - Poisson ML  1.7e-07  35.878  62.591  0  0.365  0.118
Zero-inflated poisson  4.6e-4  35.878  62.591  9.4e-06  0.361  0.118
Negative Binomial  0.048  35.968  62.728  0.147  0.245  0.114
Zero-inflated negative binomial  0.048  35.968  62.728  0.297  0.245  0.114
Generalized Negative Binomial  0.048  35.968  62.728  0.147  0.245  0.114

Year 2007
OLS  3.7e-07  28.110  49.548  1.3e-15  1.000  0.111
OLS on ln(y)  -2.998  29.112  49.760  9.0e-07  7.8e-22  0.108
OLS on √y  -2.9e-07  28.304  49.619  0  5.3e-28  0.111
Cox PH  36.648  36.671  64.183  0.297  0  0.106
Generalized Gamma (PH)  14.496  25.835  52.090  0.013  0  0.108
Exponential (PH)  11.504  25.946  51.291  0.121  0  0.108
Gompertz (PH)  14.312  25.942  52.189  0.250  0  0.108
Log-logistic (PH)  14.472  25.806  52.010  2.5e-08  0  0.108
Log-normal (PH)  15.821  25.918  52.545  2.3e-4  0  0.106
Weibull (PH)  12.830  25.890  51.656  0.244  0  0.108
NLLS  0.109  27.911  49.446  1.1e-10  0.402  0.114
GLM log-Poisson  2.4e-07  28.002  49.470  0  0.599  0.113
GLM log-gamma  -0.027  28.127  49.584  0.286  0.612  0.109
GLM log-normal  0.109  27.911  49.446  8.8e-06  0.402  0.114
GLM sqrt-gamma  -0.019  28.182  49.634  0.698  0.706  0.107
FMM - negative binomial  0.280  28.263  49.669  0  2.2e-33  0.110
Quantile regression (median)  13.147  25.763  51.501  0  0  0.108
ECM - Poisson ML  -6.5e-08  28.002  49.470  0  0.599  0.113
Zero-inflated poisson  7.7e-5  28.005  49.472  0.003  0.608  0.113
Negative Binomial  -0.023  28.122  49.580  0.170  0.673  0.109
Zero-inflated negative binomial  -0.023  28.122  49.580  0.321  0.673  0.109
Generalized Negative Binomial  -0.023  28.122  49.579  0.170  0.673  0.109
Table 5. RESULTS FOR CHAPTER 2: CANCER, COPAS RESULTS, WEEKLY FREQUENCY

Model  β1  β0  MPE  APE  RMSE  χ2(1) (p-value)  χ2(2) (p-value)

Year 2002
OLS  0.996  0.024  -4.8e-5  5.144  8.961  0.807  0.971
OLS on ln(y)  1.250  -2.252  -0.473  5.400  8.995  0  0
OLS on √y  1.306  -2.035  -0.001  5.185  8.984  0  0
Cox PH  -7.166  11.040  6.026  6.072  11.387  0  0
Generalized Gamma (PH)  2.141  -1.235  2.985  4.591  9.616  0  0
Exponential (PH)  1.468  -0.107  2.057  4.604  9.261  0  0
Gompertz (PH)  1.771  -0.500  2.627  4.574  9.455  0  0
Log-logistic (PH)  1.686  -0.084  2.671  4.559  9.455  0  0
Log-normal (PH)  1.765  -0.503  2.612  4.569  9.448  0  0
Weibull (PH)  1.464  -0.105  2.045  4.605  9.258  0  0
NLLS  0.981  0.154  0.025  5.109  8.937  0.175  0.348
GLM log-Poisson  1.009  -0.062  -1.6e-4  5.123  8.940  0.527  0.819
GLM log-gamma  1.017  -0.106  0.008  5.132  8.956  0.257  0.518
GLM log-normal  0.981  0.154  0.025  5.110  8.937  0.175  0.348
GLM sqrt-gamma  1.021  -0.125  0.012  5.142  8.971  0.181  0.397
EEE  -6.1e-15  6.665  -4.0e+12  4.0e+12  3.2e+13  .  0
FMM - Poisson  -8.4e-34  6.641  -4.4e+29  4.4e+29  9.0e+14  .  0
FMM - negative binomial  1.258  -1.405  0.244  5.068  8.983  0  0
Quantile regression (median)  1.437  0.817  2.589  4.557  9.417  0  0
ECM - Poisson ML  1.009  -0.061  4.1e-5  5.123  8.940  0.535  0.825
Zero-inflated poisson  1.009  -0.062  9.8e-5  5.123  8.940  0.531  0.822
Negative Binomial  1.020  -0.123  0.009  5.128  8.951  0.188  0.413
Zero-inflated negative binomial  1.020  -0.120  0.009  5.128  8.952  0.196  0.425
Generalized Negative Binomial  1.020  -0.121  0.009  5.128  8.952  0.194  0.422

Year 2007
OLS  0.996  0.023  -1.9e-4  4.011  7.075  0.793  0.966
OLS on ln(y)  1.127  -1.518  -0.697  4.318  7.094  0  0
OLS on √y  1.201  -1.154  -0.001  4.039  7.083  0  0
Cox PH  -5.727  8.982  5.176  5.214  9.233  0  0
Generalized Gamma (PH)  1.672  -0.280  2.160  3.691  7.483  0  0
Exponential (PH)  1.434  0.035  1.775  3.690  7.351  0  0
Gompertz (PH)  1.580  -0.127  2.044  3.691  7.438  0  0
Log-logistic (PH)  1.392  0.422  1.934  3.670  7.383  0  0
Log-normal (PH)  1.496  0.006  1.923  3.680  7.394  0  0
Weibull (PH)  1.351  -0.002  1.503  3.705  7.281  0  0
NLLS  0.984  0.107  0.014  3.987  7.063  0.277  0.520
GLM log-Poisson  1.003  -0.020  2.6e-5  3.998  7.065  0.824  0.976
GLM log-gamma  0.994  0.032  -0.001  4.011  7.078  0.709  0.932
GLM log-normal  0.984  0.107  0.014  3.987  7.062  0.278  0.522
GLM sqrt-gamma  0.996  0.023  -4.2e-4  4.019  7.085  0.794  0.967
EEE  .  .  .  .  .
FMM - Poisson  -4.0e-25  5.744  -1.6e+21  1.6e+21  7.8e+16  .  0
FMM - negative binomial  6.4e-28  5.742  -2.0e+23  2.0e+23  566405.6  .  0
Quantile regression (median)  1.314  0.755  1.946  3.666  7.373  0  0
ECM - Poisson ML  1.004  -0.022  -6.4e-06  3.998  7.065  0.801  0.969
Zero-inflated poisson  1.004  -0.023  3.7e-4  3.998  7.065  0.795  0.967
Negative Binomial  1.001  -0.003  9.1e-4  4.006  7.074  0.965  0.999
Zero-inflated negative binomial  1.001  -0.006  0.001  4.006  7.073  0.938  0.997
Generalized Negative Binomial  1.001  -0.006  0.001  4.006  7.073  0.931  0.996
Table 6. RESULTS FOR CHAPTER 2: CANCER, COPAS RESULTS, DAILY FREQUENCY

Model  β1  β0  MPE  APE  RMSE  χ2(1) (p-value)  χ2(2) (p-value)

Year 2002
OLS  0.997  0.142  9.2e-4  36.081  62.776  0.828  0.977
OLS on ln(y)  0.970  -0.880  -2.251  36.921  62.923  0  0
OLS on √y  1.315  -13.738  -0.010  36.357  62.951  0  0
Cox PH  -44.873  72.122  42.874  42.897  79.377  0  0
Generalized Gamma (PH)  1.732  1.902  19.615  32.074  66.583  0  0
Exponential (PH)  1.462  -0.542  13.471  32.443  64.699  0  0
Gompertz (PH)  1.870  -3.330  18.595  32.159  66.433  0  0
Log-logistic (PH)  1.687  3.235  19.754  32.066  66.585  0  0
Log-normal (PH)  1.835  2.090  21.072  32.155  67.172  0  0
Weibull (PH)  1.640  0.132  17.166  32.112  65.768  0  0
NLLS  0.980  1.047  0.194  35.795  62.601  0.169  0.328
GLM log-Poisson  1.010  -0.416  3.7e-4  35.895  62.622  0.517  0.811
GLM log-gamma  1.013  -0.539  0.045  35.989  62.761  0.374  0.668
GLM log-normal  0.980  1.047  0.196  35.795  62.602  0.170  0.328
GLM sqrt-gamma  1.016  -0.613  0.068  36.069  62.878  0.309  0.584
EEE  -1.6e-08  43.590  -4842513  4854459  3.6e+07  .  0
FMM - Poisson  .  .  .  .  .
FMM - negative binomial  1.222  -8.089  1.292  35.667  62.911  5.39e-34  4.8e-36
Quantile regression (median)  1.592  3.751  18.535  32.054  66.106  0  0
ECM - Poisson ML  1.010  -0.427  0.002  35.894  62.621  0.505  0.801
Zero-inflated poisson  1.009  -0.414  -9.0e-4  35.896  62.622  0.521  0.814
Negative Binomial  1.014  -0.555  0.047  35.985  62.757  0.359  0.651
Zero-inflated negative binomial  1.014  -0.554  0.050  35.983  62.758  0.358  0.649
Generalized Negative Binomial  1.014  -0.551  0.049  35.985  62.758  0.361  0.652

Year 2007
OLS  0.996  0.137  -0.001  28.122  49.570  0.810  0.971
OLS on ln(y)  0.867  2.342  -2.998  29.176  49.877  0  0
OLS on √y  1.201  -7.478  -0.008  28.316  49.632  0  0
Cox PH  -35.846  58.783  36.648  36.671  64.182  0  0
Generalized Gamma (PH)  1.474  3.558  14.496  25.844  52.102  0  0
Exponential (PH)  1.425  0.428  11.503  25.955  51.306  0  0
Gompertz (PH)  1.652  -0.843  14.312  25.949  52.199  0  0
Log-logistic (PH)  1.394  5.377  14.471  25.816  52.018  0  0
Log-normal (PH)  1.512  4.685  15.821  25.927  52.553  0  0
Weibull (PH)  1.488  0.768  12.829  25.899  51.671  0  0
NLLS  0.983  0.729  0.109  27.926  49.474  0.266  0.496
GLM log-Poisson  1.004  -0.146  -2.2e-4  28.0157  49.496  0.799  0.968
GLM log-gamma  0.988  0.406  -0.027  28.140  49.607  0.452  0.750
GLM log-normal  0.983  0.750  0.109  27.928  49.477  0.250  0.475
GLM sqrt-gamma  0.990  0.345  -0.020  28.195  49.657  0.531  0.820
EEE  17.600  -16.243  34.210  34.491  62.623  0  0
FMM - Poisson  .  .  .  .  .
FMM - negative binomial  1.221  -7.708  0.4336  28.197  49.669  2.0e-31  8.3e-31
Quantile regression (median)  1.325  5.314  13.150  25.783  51.515  0  0
ECM - Poisson ML  1.004  -0.140  -9.8e-4  28.017  49.497  0.808  0.971
Zero-inflated poisson  1.004  -0.136  -0.002  28.019  49.498  0.815  0.973
Negative Binomial  0.990  0.357  -0.023  28.134  49.601  0.509  0.801
Zero-inflated negative binomial  0.989  0.378  -0.022  28.136  49.605  0.487  0.782
Generalized Negative Binomial  0.989  0.370  -0.023  28.136  49.604  0.495  0.790