(www.interscience.wiley.com) DOI: 10.1002/sim.0000
Models for waiting times in healthcare:
Comparative study using Scottish
administrative data
Arthur Sinko a, Alex Turner b, Silviya Nikolova b∗, Matt Sutton b
The empirical modelling of waiting-time distributions presents a number of challenges due to their strong
positive skewness and multiple mass points. We compare the performance of a number of estimators previously
applied to health outcomes with similar distributional characteristics, to model waiting times. We benchmark
estimator performance using several measures of in-sample and out-of-sample prediction accuracy for different
data samples, years, and frequencies. Using administrative data for the Scottish National Health Service (NHS),
we find considerable consistency across years and frequencies, both when pooling data across ICD-10 chapters and
when estimating separately for each year. Our results show that no model is optimal on all metrics, but generalised
linear models and count data models appear to be the most appropriate for modelling waiting times.
Copyright © 0000 John Wiley & Sons, Ltd.
Keywords: health econometrics, waiting times, modelling
a Economics, School of Social Sciences, Arthur Lewis Bld., Oxford Rd., University of Manchester, Manchester, M13 9PL, U.K.
b Manchester Centre for Health Economics, Jean McFarlane Bld., Oxford Rd., University of Manchester, Manchester, M13 9PL, U.K.
∗ Correspondence to: Manchester Centre for Health Economics, Jean McFarlane Bld., Oxford Rd., University of Manchester, Manchester, M13 9PL, U.K. E-mail: [email protected]
Contract/grant sponsor: This research was supported by NIHR Research Methods Opportunity Funding (RMOFS 2012/08)
1. Introduction
The distributions of many outcomes of interest in health economics are characterised by a strong positive skew,
a heavy right-hand tail, and multiple mass points. In addition, the relationship between the response variable
and covariates is likely to be non-linear, and heteroscedasticity is often present. Healthcare expenditure/cost
data, waiting times for medical treatment, length of in-hospital stay, and number of physician visits, among other health
outcomes, present these distributional idiosyncrasies.
There is a burgeoning body of new estimators for skewed distributions with mass points and a heavy right tail,
but most of these new statistical methods have been applied to modelling healthcare costs only and are yet to find
widespread application in health economics. In this paper we compare models that have been previously applied to
healthcare costs and some generally used extensions as informed by a review of papers on health outcomes with such
distributional characteristics. We evaluate their applicability to modelling waiting times.
The ultimate goal of waiting-times modelling is to recover a functional relationship between waiting times and
relevant covariates and to construct correct statistical inference. Waiting times for elective surgery are intrinsic to
public healthcare systems where healthcare is free at the point of use and prices cannot act as a rationing mechanism
to reconcile limited supply with potentially unlimited demand. In such systems waiting times act as a non-monetary
rationing device by deterring patients with small benefits from demanding treatment [1, 2, 3, 4, 5, 6]. Waiting times
also have important effects on patients. It has been shown that longer waiting times are a major source of dissatisfaction
for patients [3, 7, 8]. They also postpone patients’ benefits and, due to discounting, cause reductions in those benefits [9].
Also, although opinion regarding elective surgery is mixed, some theory and evidence suggest that longer waits
could have detrimental effects on health [9, 10, 11, 12]. In a wider setting, waiting times have also been shown to
be a key determinant of satisfaction with public services [13, 14] and a key indicator of public sector inefficiency
[15, 16, 17, 18].
This paper contributes to the growing literature on modelling health outcomes with skewed distributions, mass
points, and heavy right tails by comparing the performance of a wide range of models on waiting-time data from
Scotland, using them to estimate and predict the distribution of waiting times. We examine the ability of these models to address
the distributional characteristics of the data using a quasi-experimental design involving cross-validation.
Reforms to reduce waiting times through the imposition of maximum thresholds were introduced in Scotland in
2003. This changed the waiting times distribution by reducing the long waiting times as well as the mean and median
waits of elective patients. We use pre- and post-reform waiting-times data for 2002 and 2007, respectively, to analyse the
suitability of different models. We study the waiting times for elective treatment for a wide range of conditions. We
regress elective waits on a range of patient characteristics and measures of disease severity. Comparative performance
is determined using in-sample and out-of-sample prediction measures of goodness of fit, bias, and forecast accuracy
for different data samples, frequencies, and years.
The rest of this paper is structured as follows. The next section summarises related literature, introduces the models
applied to the evaluation of waiting times and, where available, provides details of previous applications to waiting-times
data. The different metrics used for model comparison are explained next. Section 3 describes the data and the choice
of variables used in all regression specifications. Results are then presented, followed by the Discussion and Conclusion
sections.
2. Comparison of Model Performance
2.1. Model Comparison Literature
The bulk of existing studies assessing comparative model performance have utilised healthcare cost data. Much
of the early focus of these studies centred on how to deal with the mass point at zero, and thus much of the early
literature assessed the merits of using two-part and multi-part models over the traditional one-part specification used in
ordinary least squares (OLS) [19]. By reducing skewness, transforming the dependent variable before applying OLS can help
return the distribution to normality. As a result, other papers have chosen to focus solely on the decision of whether
or not to transform the dependent variable [20]. However, policy makers require predictions on the raw scale, which
leads to the tricky problem of re-transformation, a problem made increasingly complex if heteroscedasticity is found
in the errors [19, 21]. Later work encompassed both of these facets by comparing one- and two-part specifications of
the OLS model with one- and two-part specifications of log- and square root-transformed OLS, as well as with one- and
two-part specifications of generalised linear models (GLMs) [22, 23]. By explicitly modelling non-linearity, GLMs can
deal with the skewed nature of health distributions whilst still generating predictions on the raw scale, thus avoiding
the challenges of re-transformation.
Although some studies continued to make use of zero observations, many have chosen to focus only on positive
costs, which has led to the comparison of more complex models. Previous literature compares transformed OLS
models to GLMs with different specifications of the link and distribution functions [24]. Basu and Rathouz [25]
propose an extended estimating equations (EEE) model, a very flexible extension of the GLM, and compare
its performance to its various nested GLM specifications. Both this paper and a later paper [26] find that the EEE
fits better than any of its nested models. Similar to the GLM family of models, parametric survival models such as
the Weibull, exponential and Gompertz models, as well as the more flexible generalised gamma model (GGM), can
also capture non-linearity. Manning et al. [21] recognise this, and compare the GGM (along with some other nested
alternatives) to an OLS regression on log-transformed costs, finding that the GGM potentially provides a more robust
estimator than simpler models. The use of survival models is taken further by Basu et al. [27], who compare the log
transformed OLS, GLM and EEE models to the Cox proportional hazard model, a semi-parametric model in which the
functional form of the baseline hazard is left unspecified and is calculated along with the parameters of the
model in estimation. The flexibility of the GGM can be extended further by using the generalised beta of the
second kind (GB2) distribution. This nests the GGM as a special case, as well as other beta-type distributions
such as the Dagum, Beta-2 and Singh-Maddala, which can be used to capture heavy tails in some distributions [28].
Jones et al. [29] compare the performance of the GB2 model to all of its nested models, finding that the GGM and
Beta-2 models perform the best.
Two recent papers compare a wide range of models of different types. Hill and Miller [30] compare the GGM to
both the EEE and various specifications of the GLM, as well as linear and log-transformed OLS models, using healthcare
expenditure data stratified by insurance type and age. Previous studies had not compared the EEE and GGM models,
as neither model nests the other, although both share the mean and variance of the
gamma model as a special case. Hill and Miller [30] find that the EEE model performs as well as, or better than, the
other models in all of the stratified distributions. Jones [31] provides the most comprehensive comparison of models
to date. He provides a comparison of all of the aforementioned models as well as some extensions. These include a
finite mixture model, which allows the researcher to control for heterogeneity in the data; an exponential conditional mean
(ECM) model estimated by both non-linear least squares and Poisson maximum likelihood; and a modified version of the
GGM, which accounts for additional heteroscedasticity by allowing some of its parameters to be a function of
covariates [21]. He finds that no one model is optimal, with performance differing depending on the criterion considered.
The conclusions from Jones et al. [29] typify the findings across all comparison studies. In general, the appropriate
specifications for modelling costs differ depending on what costs are considered [30]2, on the data sets used [30], on
the sample size [29, 32], and on whether a researcher favours bias or precision [22].
2.2. Waiting Times Literature
Despite its problems in modelling variables with the distributional idiosyncrasies of waiting times, OLS is still
widespread in empirical papers, even if only as a baseline model against which results from more complex
models are compared (see for example [8, 33, 34, 35, 36, 37, 38, 39, 40]).
OLS on log-transformed waits has been a popular choice to account for the skewness of the waiting-times
distribution. This method has been applied primarily to assess the degree of discrimination in waiting times by patient
group. In particular, these studies test whether waiting times systematically differ by condition [41], income [9, 41, 42, 43],
race/ethnicity [44, 45], education [9] and other socioeconomic factors such as age, gender, insurance status, nationality
and employment status [41, 43, 46]. This method has also been applied to test whether greater choice between
2They find that the optimal model varied over different stratifications of the sample by age and whether total expenditure or only expenditure on drugs were considered.
providers, measured by concentration indices, is associated with lower mean waiting times [47], and the impact of
differences in prioritisation policies in Scotland and Norway on the distribution of waits [48].
The use of quantile regression to model health variables, which can help to address skewness by using the median
rather than the mean as the measure of central tendency, is very limited, and even more so with respect to waiting
times. Only one study has used it, to test whether higher socioeconomic status leads to shorter waiting times for elective
surgery in Australia [42], and in particular whether discrimination based on socioeconomic status is greater at longer or
shorter waits. Evidence from published studies suggests that non-linear models are a popular choice in the estimation
of waiting times. Iversen and Luras [49] employ two of the most popular count data models, the Poisson model and the
negative binomial model, as well as a negative binomial model with an added random effect, to test whether general
practitioners (GPs) in Norway with a shortage of patients offered shorter waiting times to attract new patients. Siciliani
and Verzulli [50] use the negative binomial model to test whether higher patient socioeconomic status is associated
with reduced waiting times for elderly Europeans. Finally, Roll et al. [51] use the negative binomial model to test whether
waiting times for outpatient care in Germany differed significantly across individuals based on the type of insurance
and income. The use of duration/survival models has been widespread in the estimation of healthcare costs [31].
However, the application of duration analysis to the modelling of waiting times is a relatively recent phenomenon. Arnesen
et al. [52] use the Cox proportional hazard model to test whether gender and socioeconomic status could explain
variations in waiting time for inpatient surgery in Norway. Dimakou et al. [53] use the Kaplan-Meier estimator to
study the distributional changes in waiting times in the English NHS, using Hospital Episode Statistics (HES) data.
They also employ a range of parametric proportional hazard and accelerated failure time models to analyse the
effects of provider and patient characteristics on waiting times. Laudicella et al. [9] assess the effect of socioeconomic
status on waiting times in England. They do so using the standard Cox proportional hazard model along with its
extended and stratified counterparts. There have been applications of the GGM with additional heteroscedasticity [31]
and the GB2 distribution [29, 31] to estimating healthcare costs, but to our knowledge there have been no applications
to the estimation of waiting times.
Due to their advantages, generalised linear models have emerged as a popular choice for modelling
many health variables, in particular healthcare costs [20, 21, 23, 54]. However, despite their popularity in other areas
of health economics, there are no examples of GLMs being used in the estimation of waiting times.
2.3. Empirical Models
In this paper we focus only on models that have already been applied in previous literature, together with some generally used
extensions, as informed by the review of waiting-time papers and previous comparative studies. Jones [31] provides
an excellent summary of the econometrics behind the majority of the models presented here. As such, where models
overlap, only a brief explanation of each model, and the rationale for its use, is provided. New models are presented in
greater detail. The econometric specification of each model is summarised in Table 1, as well as the methods for
deriving predictions.
2.3.1. Ordinary Least Squares
An ordinary least squares (OLS) regression of the level of waiting times on a set of
regressors, yi = xiβ + εi, provides a good starting point for modelling waiting times. It is computationally inexpensive,
and as it is specified on the original waiting-times scale, difficult re-transformations are avoided. However, standard
linear regression models fail to reflect consistently and reliably the conditional mean of a skewed distribution with
mass points, because of the asymmetry in the response function and/or the inefficiency due to the common failure to
deal with heteroscedasticity.
2.3.2. Regressions on transformed waiting times
Early methods for dealing with skewed, and more generally non-normal,
distributions centred on transformations of the dependent variable [20, 55, 56]. The behaviour of many
outcomes in health can be approximated well by log-normal distributions, and thus taking logarithms returns the
distribution to normality. Even without log-normality, taking logarithms reduces skewness, making the distribution
more symmetric and closer to normality. Assuming normality is achieved after transformation, inference is
valid when applying OLS to the transformed data. However, this produces predictions on the log scale, whereas predicted
waits on the raw scale are usually the ones of interest. The problems with the transformation approach arise from
re-transformation, in particular when heteroscedasticity is present. Under non-normality,
a simple exponential of predictions on the log scale does not provide unbiased predictions on the raw scale. If errors
are homoscedastic, this can be remedied using the smearing estimator [19]. However, under heteroscedasticity these
predictions are biased, and a much more complex smearing estimator is required.
Square root transformations have also been a popular choice to reduce skewness. However, as with the log-transformed
model, the square root-transformed model produces predictions on the transformed rather than the raw
scale, and similar problems with re-transformation exist.
Rather than applying a particular transformation, Box and Cox [57] propose a flexible transformation, which bears their
name, in which the type of transformation is determined during estimation. This was later extended by Chaze [58] to account
for censored data. However, the standard Box-Cox transformation is not applied here as it does not allow for
heteroscedasticity. The EEE model, explained later, performs the same transformation but can take heteroscedasticity
into account, and is thus preferred.
2.4. Quantile Regression
Traditional regression estimation methods, such as OLS, model the mean of the dependent variable conditional on
the covariates, estimating the conditional mean function E[Yi|Xi] = Xiβ. Quantile regression, on the other hand,
allows the estimation of quantiles (or percentiles) of the dependent variable, conditional on covariates. In this case we
estimate the conditional quantile function:

Qτ[Yi|Xi] = Xiβτ    (2.1)

where 0 < τ < 1 defines the quantile of the dependent variable to be estimated, and the parameters of interest, βτ, are
contingent on the value of τ.
Unlike OLS, which uses the mean as the measure of central tendency and minimises the sum of squared residuals
to derive parameter estimates, quantile regression uses the median as the measure of central tendency, and parameter
estimates, βτ, are obtained by minimising the sum of asymmetrically weighted absolute residuals [59]:

min_{βτ} [ Σ_{i: yi ≥ Xiβτ} τ|yi − Xiβτ| + Σ_{i: yi < Xiβτ} (1 − τ)|yi − Xiβτ| ]    (2.2)

This implies weights are equal to τ if residuals are positive, and equal to (1 − τ) when residuals are negative.
Quantile regression can be used to estimate the effects of covariates on different parts of the waiting-times
distribution. This is an obvious benefit over OLS, which can only estimate their effects at the mean and thus gives
a much less comprehensive picture of how waiting times are determined. However, it also has benefits in dealing
with the distributional issues associated with waiting times. Because estimation involves minimising absolute rather than
squared residuals, less weight is placed on extreme values and thus the estimated parameters are less sensitive to heavy
right-hand tails.
As we are only interested in a measure of central tendency of waiting times and not in any particular quantile of the
distribution, we evaluate the regression function at the 50th percentile (τ = 0.5), in which case the quantile regression
is a median regression.
2.5. Non-linear regression methods
As explained previously, although transforming waiting times can correct for the non-normality of the waiting-times
distribution, it creates difficulties if predictions are required on the raw scale, particularly in the presence of
heteroscedasticity. To circumvent this issue, non-linear methods can be employed which directly account for
these distributional aspects.
Non-linearity in these models is specified in the relationship between waiting times and the explanatory variables.
Many of these models assume that the conditional mean of waiting times depends on some function of the exponential
function:

E[yi|xi] = µi = f(exp(xiβ))    (2.3)
As a result, many of these models come under the umbrella of exponential conditional mean (ECM) models. The
exponential function accommodates the skewed distribution of waits and recognises that waiting times take only
non-negative values [31].
These models are outlined extensively in Jones [31], and readers are directed there if they require detailed
econometric specifications.
2.5.1. Non-linear least squares (NLLS) estimation
ECM models can be estimated using non-linear least squares
(NLLS). In this case, the ECM model is specified as the non-linear regression:

E[yi|xi] = exp(x′iβ)    (2.4)

The relevant first-order/moment conditions for this model3 are solved iteratively to give parameter estimates. This
approach uses only the first moment, rather than the full probability distribution, and thus may be more robust than
maximum likelihood. However, it may also be less efficient, depending on the form of the variance function [31].
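The NLLS fit of the ECM in (2.4) can be sketched as follows, using scipy's iterative least-squares solver on simulated data; the data-generating process and coefficient values are illustrative assumptions.

```python
# NLLS estimation of the ECM E[y|x] = exp(x'b) on simulated data;
# names and the data-generating process are illustrative assumptions.
import numpy as np
from scipy.optimize import least_squares

rng = np.random.default_rng(3)
n = 1000
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true = np.array([2.0, 0.5])
y = np.exp(X @ beta_true) * rng.gamma(shape=4.0, scale=0.25, size=n)  # mean-one noise

# Iteratively minimise the sum of squared residuals y - exp(Xb)
res = least_squares(lambda b: y - np.exp(X @ b), x0=np.zeros(2))
beta_hat = res.x
```

Only the conditional mean is used here; no distributional assumption is imposed on the multiplicative error beyond a unit mean, which is what gives NLLS its robustness relative to full maximum likelihood.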
2.5.2. Count data models
Waiting times are a form of count data, as time waited can only take non-negative integer
values, and these integer values arise from counting rather than ranking [60]. The benchmark model for count data is
the Poisson model. This model assumes that the dependent variable, yi, follows a Poisson distribution with mean µi.
The Poisson probability distribution is given by:

P(yi|µi) = e^{−µi} µi^{yi} / yi!    (2.5)

where µi represents the mean of yi. This distribution is characterised by equidispersion, such that the mean is equal
to the variance, i.e. E[y] = var[y] = µ. The Poisson regression model incorporates observed heterogeneity into the
Poisson distribution function, such that E[yi|xi] = var[yi|xi] = µi = exp(xiβ).
In practice, the variance is usually greater than the mean (overdispersion), and thus the Poisson model rarely fits the
data well [61, 62]. To overcome the problems of the Poisson model the negative binomial model can be used. This
directly takes overdispersion into account through the inclusion of an additional parameter. The negative binomial
distribution is given by:

P(yi|µi, νi) = [Γ(yi + νi) / (yi! Γ(νi))] (νi / (νi + µi))^{νi} (µi / (νi + µi))^{yi}    (2.6)
3See [31] for details of these.
where Γ represents the gamma function, νi = 1/α determines the degree of dispersion, and α > 0
defines the overdispersion parameter.
The negative binomial regression model incorporates both observed and unobserved heterogeneity into the
conditional mean such that E[yi|xi] = µi = exp(x′iβ + ei) [61]. The Poisson model forms a special case of the
negative binomial model, the two models coinciding when α = 0, and thus a test of this restriction can be used
for model selection purposes. An extension of this model is the generalised negative binomial model. This
model is semi-parametric in nature as the shape parameter is itself estimated during estimation. Another way to handle
overdispersion is through zero-inflated models. In these models (zero-inflated Poisson and zero-inflated negative
binomial), overdispersion is accounted for by changing the mean structure to explicitly model the production of zero
counts [61]. This approach assumes two latent groups from which zero counts could be generated: an “always zero” group, in which
patients have a zero wait with certainty, and a “sometimes zero” group, where zeros are generated “by chance” as
would usually be assumed in count data models [61]4. Estimation involves a binary model, usually a logit, to
ascertain from which group a zero is generated, and a count model for the positive count data. The Vuong test
[64] can be used to determine whether a zero-inflated model is appropriate. Tests on the parameters of the zero-inflated
negative binomial model can also be used to choose between this model and the zero-inflated variant of the Poisson model5.
Zero-inflated models are special cases of the finite mixture negative binomial and finite mixture Poisson models,
which permit mixing with respect to both zero and positive counts [63]. These more general cases will be explained in
more detail later.
2.5.3. Survival/Duration models: Hazard models
Inpatient waiting times form a type of duration data, where the
“duration” refers to the time between some start date and some end date or “event”. Here the start date represents
admission to the waiting list, defined by the date at which a consultant has recommended a patient for treatment; the
end date refers to admission for treatment; and the duration represents the waiting time6.
Duration analysis involves the estimation of two functions: the survival function and the hazard function. The
survival function, S(t), defines the probability that admission to treatment is later than some specified time, t, or the
probability that a patient remains on the waiting list (untreated) until a given time. The hazard rate is defined as the
probability of admission at time t, given that a patient has remained untreated up until time t, or more simply, the rate at
which patients leave the waiting list [53]. A key characteristic of the hazard rate is the concept of duration dependence,
which determines how the hazard rate changes over time. Survival and hazard functions are mathematically related,
4Hurdle count data models also exist, which assume that zero values are generated from a different process from positive counts. However, these fall under the umbrella of two-part models, which are beyond the scope of this paper. This model is outlined in detail in Deb and Trivedi [63], to which readers are referred for a greater explanation.
5Here α = 0 is tested. If this is rejected then the zero-inflated negative binomial is preferred.
6In this paper we present only a brief outline of survival models. A detailed overview of survival analysis is presented in Jenkins [65] and a shorter overview in Jones [31].
with each hazard function producing a particular survival function and vice versa [53]. This relationship is defined by
the following:

h(t, X) = f(t) / S(t)    (2.7)

where f(t) represents the density of the duration distribution.
However, the aim here is to generate predictions of waiting times using the covariate estimates. This type of analysis
can be carried out using either parametric models, which require the specification of a distribution to model the
hazard rate, or semi-parametric models, where this distribution is estimated within the model.
Specifying parametric models requires assumptions to be made in two areas: (1) the shape of the hazard function,
and (2) whether covariates are time-dependent. The shape of the hazard function specifies how the hazard rate
changes with time, and is determined by the choice of the distribution used to model it. There are six commonly used
parametric duration models: exponential, Weibull, log-normal, log-logistic, Gompertz and the generalised gamma. The
exponential model is the simplest of these and assumes that the hazard rate is constant over time. The Weibull model
generates increased flexibility by allowing hazard rates to be non-constant over time, but restricts the relationship to
be monotonic. The Gompertz model also assumes a monotonic relationship, with hazard rates either increasing
or decreasing with time. The log-logistic and log-normal models further increase flexibility by allowing non-monotonic,
unimodal hazards, in the form of an inverted U-shaped relationship. The generalised gamma model (GGM)
is the most flexible of all parametric hazard models. The GGM includes two shape parameters, and nests the exponential,
Weibull, log-normal, and standard gamma models as special cases. Given this, Wald tests on parameters in the GGM
can be used to select between this more flexible model and its nested alternatives. For non-nested models, model
selection can be carried out using information criteria and log-likelihood values.
A further increase in flexibility can be generated through the use of the generalised beta of the second kind (GB2)
distribution, which introduces a third shape parameter. This third shape parameter can capture excess kurtosis to
account for a heavy tail7. The GB2 distribution nests the GGM as a special case, as well as other 3-parameter distributions
(Burr-Singh-Maddala (BSM), Dagum and Beta-2) and 2-parameter distributions (Lomax and Fisk). Similar to the
GGM, tests on parameter estimates can be used to select between the GB2 and its nested models. Both the GGM and
GB2 models can be extended to account for additional heteroscedasticity. In the GGM this involves assuming
that one of the shape parameters is equal to an exponential of a linear combination of a set of regressors [21], and in
the GB2 it involves allowing the natural log of a shape parameter to vary with the covariates [67]. We estimate the GB2
model but, due to computational expense, do not employ any of its nested models. No code was available for these
7The GB2 model can be estimated by maximum likelihood using Stephen Jenkins’ gb2fit command in Stata [66].
extensions, and thus they were not implemented. The distribution, hazard function and expected duration of each model
are also summarised in Table 1.
The time-dependency of covariates determines whether we use a proportional hazards (PH) model or an
accelerated failure time (AFT) model. PH models assume there is a baseline hazard function that depends on time
but not on the other variables which affect waiting times, which are assumed to be time-invariant. This implies that the
baseline hazard is common across all individuals; the covariates simply scale the hazard function for each individual.
The AFT model assumes that covariates are time-dependent, in particular that the log of survival time, T, is linearly
dependent on the covariates [65]. In an attempt to reduce the restrictive assumptions imposed by parametric models,
Cox [68] developed a semi-parametric approach to survival analysis. Named the Cox proportional hazards (Cox PH)
model, this leaves the distribution of the baseline hazard function unspecified. Estimation is based on the partial
likelihood function: conditioning on the covariates means the baseline hazard can be factored out of the partial
likelihood, so the regression coefficients can be estimated without specifying it.
2.6. Generalised linear models (GLM)
2.6.1. Traditional parametric GLM
Ordinary least squares has the restrictive assumption of linearity in parameters,
such that the expected value of the outcome of interest must be a linear function of the regressors, i.e. it estimates
the conditional mean function E[yi|xi] = µi = x′iβ. Generalised linear models (GLMs) combat this by allowing the
dependent variable to have distributions other than the normal distribution. GLMs specify the conditional mean
function directly:

E[yi|xi] = µi = f(x′iβ)    (2.8)

This allows the conditional mean to depend on the regressors non-linearly. The non-linearity is determined by
the first component of the GLM: the link function, g(·), where:

g(µi) = x′iβ ⇒ µi = g^{−1}(x′iβ) = f(x′iβ)    (2.9)
The most frequently used link functions are the identity link, where, as in OLS, the mean depends on the covariates
additively, and the log link, where the mean is a multiplicative function of the covariates [31]. However, other
specifications of the link function are available, namely the logit, probit, cloglog, negative binomial, log-log, and
log-complement functions, as well as any power and odds-power functions [69].
The second component of the GLM is the distribution function. This function specifies the relationship between the
conditional variance and the mean:
Var[yi|xi] = ν(µi)    (2.10)
This implies that the GLM restricts the conditional variance to be a function of the mean. The choice of the distribution is at the discretion of the researcher, but is restricted to those which belong to the linear exponential family [70]. The various specifications for the link and distribution functions can be combined freely, but not all combinations make sense (see [69] for further details). The appropriate combination is not always clear given the sample data, but the choice of link for a given distribution can be guided by the canonical parameterisation of the GLM [70]. The most appropriate link function can be selected using the Pregibon link test [71], described later, and the appropriate distribution using a modified version of the [72] test presented in [20].
GLMs are estimated based on quasi-score functions or classical “estimating equations” (see [31] for an explanation).
Given that GLMs use only the linear exponential family of distributions, the GLM estimator has the pseudo- or quasi-
maximum likelihood property and thus estimates are consistent as long as the mean function is correctly specified
[73].
The GLM has advantages over other methods which deal with the non-normality of waiting times. Firstly, given that the mean function, μ(x), is transformed rather than the dependent variable, predictions are generated on the raw waiting-times scale, which avoids the re-transformation problems faced by the log and square-root models. Also, GLMs inherently take into account heteroscedasticity through the choice of the distribution function.

We estimate the GLMs using the same combinations of link and distribution functions used in Jones [31], namely square root-gamma, log-gamma, the Poisson and log-normal8.
2.6.2. Semiparametric and nonparametric GLMs

As a result of the problems of selecting the optimal distribution and corresponding link function, Basu and Rathouz [25] developed the extended estimating equations (EEE) estimator. This semiparametric estimator uses the Box-Cox transformation for the link function, which includes the log and power links as special cases. This is combined with a general power function for the variance which nests all of the common GLM distributions. The additional parameters in each function are estimated along with the regression coefficients by quasi-maximum likelihood using extended estimating equations [31]9.
Chiou and Muller [75] introduce even more flexibility into the model by leaving both the link and variance functions unspecified. The model is estimated using a three-stage approach, which involves estimating the link and variance functions non-parametrically, before substituting these into the quasi-maximum likelihood function. They find that the resulting
8We note that the GLM with the Poisson family is equivalent to estimating a Poisson regression for count data, and thus results will be very similar to those from a Poisson regression. Results could differ slightly due to differences in starting values and convergence criteria of the algorithms [74].
9This can be estimated using Anirban Basu's pglm command in Stata.
parameter estimates are asymptotically efficient in comparison to both the quasi-maximum likelihood estimator and
the standard GLM estimator (when both the link and variance functions are treated as known).
2.7. Finite mixture models
Many variables of interest in health economics are characterised by bi-modal or multi-modal distributions. These are not dealt with well by the models outlined previously. Mixture models are well suited to modelling outcomes whose values, or the effect of covariates on those values, differ systematically between groups of individuals.
In finite mixture models individuals are assumed to be heterogeneous across latent classes j = 1, …, C, but homogeneous within these classes, conditional on covariates10. The effect of covariates, x_i, on the outcome of interest, y_i, is assumed to vary over the j classes. The probability that individual i belongs to class j is given by π_ij, where 0 < π_ij < 1 and Σ_{j=1}^{C} π_ij = 1. The density of y_i, conditional on covariates, x_i, but unconditional on class membership, is given by:

f(y_i | x_i; π_i1, …, π_iC; β_1, …, β_C) = Σ_{j=1}^{C} π_ij f_j(y_i | x_i; β_j)    (2.11)

i.e. the weighted average of the densities for each class, with the weights equal to the probabilities of being in each class. Class probabilities are estimated along with the parameter estimates, and can be treated as either fixed or variable across members of the class.
Post-estimation, each individual is assigned a posterior probability of belonging to each class, which depends on the relative contribution of that class to the individual's likelihood function (see [31] for more details). Each individual is then assigned to the class with the highest posterior probability. The outcome variable can then be predicted separately for each class.
Although the number of available distributions for mixture models is large, the majority of applications in health economics have used either a mix of gammas or a mix of negative binomials. Finite mixture gamma models have found widespread application in the modelling of continuous variables such as healthcare costs (for example [31, 32, 76]), although a mixture of log-normal densities has also been used to model this variable [77]. Finite mixture negative binomial models and their extensions have been widely used to model measures of healthcare utilisation such as the number of GP visits and visits to private/public sector specialists (see for example [63, 76, 77, 78, 79, 80, 81]). Mixing of other distributions has seen greater popularity outside of these two healthcare variables. For example, a finite mixture normal model was used to estimate the effect of prenatal care on birth weight [82].
Given that waiting time is a count variable, the FMM we estimate here will utilise a mix of negative binomials. Negative binomials are used to model count data in the case of overdispersion, usually generated by a large number of

10Here we present finite mixture models as latent class models. However, these models can also be used in the case where membership to classes is observed, for example in a two-part model, where positive and zero waits are modelled separately, or multi-part models, applied to different categories of waiting times, separated by primary condition or patient characteristics.
zero values. As zero waits are relatively infrequent in our sample (0.39% for the 2002 sample), it is possible that the
equidispersion property of the Poisson distribution will be satisfied. Thus, an FMM with a mixture of Poisson densities
will also be estimated.
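To make the latent-class machinery concrete, here is a minimal EM sketch for a two-component Poisson mixture on simulated counts. It is intercept-only; the FMMs estimated in the paper also let covariates shift each component's mean. All numbers are illustrative assumptions.

```python
# Minimal EM sketch for a 2-component Poisson finite mixture (intercept-only).
import numpy as np
from scipy.stats import poisson

rng = np.random.default_rng(1)
# Simulated waits from two latent classes, e.g. short vs long waiters
y = np.concatenate([rng.poisson(3, 600), rng.poisson(15, 400)])

pi, lam = np.array([0.5, 0.5]), np.array([2.0, 10.0])  # starting values
for _ in range(200):
    # E-step: posterior probability of each class for each observation
    dens = pi * poisson.pmf(y[:, None], lam)          # shape (n, 2)
    post = dens / dens.sum(axis=1, keepdims=True)
    # M-step: update class shares and class means
    pi = post.mean(axis=0)
    lam = (post * y[:, None]).sum(axis=0) / post.sum(axis=0)

# Assign each observation to its most probable latent class, as in the text
classes = post.argmax(axis=1)
```

The posterior matrix `post` is exactly the per-individual class probability used for assignment and class-specific prediction in the text above.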
3. Data
We use the Scottish Morbidity Record 01 (SMR01) data set. It records detailed information on all admissions to acute
hospitals including patient characteristics such as waiting time, age, number of co-morbidity conditions, and disease
type. These data collect information on the distribution ofwaiting times only for patients “admitted for treatment from
the waiting list”. As such, it measures the full duration of waiting for patients who were treated.
We extract a subset of patients from the full-year population who were admitted for elective procedures. We next restrict our attention to only the first hospital stay for each patient in each year. We lose, respectively, 33.3% and 35% of the sample for years 2002 and 2007. We also exclude observations where the waiting time is longer than two years, as these are most likely coding errors. This restricts the sample by an additional 2.3% and 0.4% for years 2002 and 2007. We exclude from the analysis pregnancy and conditions originating in the perinatal period (ICD-10 chapters 15 and 16) because of the small number of observations. We also omit external causes of morbidity and mortality and codes for special purposes (ICD-10 chapters 20 and 22) because the same ICD-10 code can be used to describe more than one medical condition with different severity. Finally, we disregard observations with missing data on waiting times. This omits 2.2% and 2.6% from the original 2002 and 2007 samples. As a result, our analytical sample has 657,443 observations in total, with 321,929 patient observations for 2002 and 335,514 for 2007. In preparing our data for analysis we follow Janulevicuite et al. [48].
4. Statistical Analysis
4.1. Baseline specification
We use a linear additive specification to model the relationship between waiting times and regressors. For simplicity we use a parsimonious set of explanatory variables. These include a dummy for gender, age as a categorical variable, and the Charlson index. We construct the index using information on the primary diagnosis and other medical conditions [83]. It places disease conditions into 17 “Charlson categories” based on their ICD-10 codes. Weights are assigned to each category which are increasing in severity. The sum of these weights across all conditions for a particular patient provides the comorbidity index for that patient. Where appropriate we use robust standard errors.
4.2. Quasi-Experimental Design

Following Jones [31] we assess model performance using both in-sample and out-of-sample metrics. “In-sample” metrics are generated by estimating the model on the full sample. The out-of-sample predictions are estimated using v artificially constructed subsamples generated in the following way. We first estimate model parameters using all data excluding the subsample of interest. Second, these parameters, along with the explanatory variables, are used to predict the model outcome for the excluded subsample. The first and second steps are repeated v times to cover the original data set. The subsamples are drawn randomly from the original data without replacement. This cross-validation technique avoids the problem of in-sample overfitting.
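The v-fold scheme above can be sketched in a few lines, shown here with a plain OLS learner on simulated data (the learner and data are illustrative assumptions, not the paper's implementation):

```python
# Sketch of the v-fold out-of-sample prediction scheme described above.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))
y = np.exp(1.0 + X @ np.array([0.3, 0.2, 0.1]) + rng.normal(0, 0.5, 1000))

v = 5
oos_pred = np.empty_like(y)
# Fit on all data excluding one subsample, predict that subsample, and repeat
# v times so the folds cover the original data set without replacement.
for train_idx, test_idx in KFold(n_splits=v, shuffle=True, random_state=0).split(X):
    fold_model = LinearRegression().fit(X[train_idx], y[train_idx])
    oos_pred[test_idx] = fold_model.predict(X[test_idx])
```

Every observation receives exactly one prediction from a model that never saw it, which is what makes the out-of-sample metrics below honest.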
The model comparison measures used in this study focus only on assessing how well models estimate in-sample and out-of-sample E[y|x]. Often a researcher's primary interest is to estimate the marginal effect, i.e. ∂E[y|x]/∂x for a covariate x. However, for nonlinear specifications, to the best of our knowledge, there is no good way to compare marginal effects across different models. Instead we compare E[y|x], the correct estimation of which is a necessary condition for unbiased estimation of marginal effects [27].
4.3. Evaluation of Performance
4.3.1. Pre-comparison model selection

Some of the models we consider are not estimated in levels. The process of retransformation depends heavily on the distribution of the error terms. The retransformation methods available quite often rely on the assumption of normality. Normality of the waiting time distribution, and thus of the errors, is tested using both the Shapiro-Wilk test [84] and the D'Agostino test [85]. If normality is rejected, the re-transformation method depends on the presence of heteroscedasticity. Under homoscedasticity, raw-scale predictions can be consistently estimated from the log model using the Duan smearing estimator11. A similar re-transformation can be carried out for the square root model. This involves adding the mean residual from the square root model to the squared prediction. If heteroscedasticity is present, consistent re-transformation requires knowledge of its form. As our regressors are not binary in nature, the process is too complex and thus Duan's smearing estimator will still be applied, recognising that this may itself lead to biased predictions. The test for heteroscedasticity depends on the normality of the distribution. Given the normality assumption of the standard Breusch-Pagan test, non-normality renders this test inconsistent. If non-normality is found, a heteroscedasticity test which relaxes the normality assumption for the errors (by specifying an i.i.d. distribution) is used.
One benefit of flexible models, such as count and duration models, is the ability to choose between more general specifications and their nested alternatives. In order to evaluate the competing parametric hazard models, we test the restrictions imposed by each nested model of the GGM distribution using Wald tests. If κ = 0 is not rejected,

11Transformed predictions using this smearing estimator can be produced using Christopher Baum's levpredict command in Stata.
the log-normal model is optimal. If κ = 1 is not rejected, the Weibull model is optimal. If ln(p) = 0 is not rejected, p is not significantly different from 1 and the standard gamma model is optimal. If κ = p = 1 is not rejected, the exponential model is optimal. If all restrictions are rejected, the GGM model is optimal. We report the results of these tests for each frequency-year12 for the full aggregate samples13 (see Tables A1 and A2 in [86]). We also report the rates of rejection for each restriction across all ICD-10 chapters at a 5 percent significance level for each frequency-year (Tables A4 and A5 in [86]). To select between non-nested models, we use the Akaike and Bayesian Information Criteria as well as the log-likelihood value for both the full and constant-only models.
We also test for the optimal count model specification. A formal test of α = 0 from the negative binomial model is carried out to ascertain whether overdispersion is present and thus whether the negative binomial model should be preferred to the Poisson model. The Vuong test is then carried out to determine whether the zero-inflated variant of the model should be used. A test of α = 0 from the zero-inflated negative binomial model is conducted to choose between this model and the zero-inflated Poisson model14. We report the results of these tests for both the aggregate sample and individual chapters for each frequency-year (Tables B1-B4 in [86]). We also report the percentage of chapters for which each test was rejected for each frequency-year (Table B5 in [86]).
4.3.2. In-Sample Tests

We follow Jones [31] in measuring comparative performance using both specification tests and goodness-of-fit measures. We use the Pregibon link and Pearson tests to test for model misspecification. The Pregibon test tests whether (Xβ)² has any explanatory power in addition to Xβ15. The Pearson test checks whether there is a significant correlation between the fitted residuals and Xβ. Significance in both cases implies model misspecification.
We employ a range of goodness-of-fit measures for the estimation sample. These include the R² from an auxiliary regression of actual waiting times on predicted values, the root mean squared error (RMSE), the mean absolute prediction error (MAPE), and the mean prediction error (MPE), where all predictions are calculated on the raw scale. The formulae used in the calculation of each of these measures are given below:
RMSE = sqrt( (1/n) Σ_{i=1}^{n} (y_i − ŷ_i)² )    (4.12)

MAPE = (1/n) Σ_{i=1}^{n} |y_i − ŷ_i|    (4.13)
12Here frequency relates to either weekly or daily waits, and the year to either 2002 or 2007.
13These are the samples which pool data across all chapters for each year.
14Given that the variables which are thought to affect the latent class of zero observations are unknown, if a zero-inflated model is required, we will include the same variables in the logit model as we do to model the count data. If a negative binomial is the preferred model, the generalised form of the negative binomial will also be estimated.
15If it has, then this implies that the true non-linearity in the relationship between the waiting times and the regressors is greater than assumed in the model being estimated.
MPE = (1/n) Σ_{i=1}^{n} (y_i − ŷ_i)    (4.14)

R² = 1 − [ Σ_{i=1}^{n} (y_i − (α̂ + β̂ŷ_i))² ] / [ Σ_{i=1}^{n} (y_i − ȳ)² ]    (4.15)
where α̂ and β̂ represent the parameter estimates from the auxiliary regression. Note that for non-linear models the MPE can be interpreted as a measure of bias.
These metrics are calculated for both the separate ICD-chapter samples16 and the aggregate sample (see Appendix C in [86]) for each frequency-year. To be parsimonious we only fully describe results for one ICD-10 chapter, using the largest sample size and R² as the choice criteria. We then check the consistency of these results across the other chapters and the aggregate sample. This is done by counting the number of occasions a model performs best on each metric, and calculating the average rank of each model across chapters for each metric. For the link and Pearson tests, a model receives a count for a chapter if it passes each test, rather than for a top rank based on the greatest p-value. Average ranks are then based on the proportion of all chapters for which a model passed each test17.
4.3.3. Out-of-Sample Tests

The Copas test is used as the sole specification test for the validation sample. Heavily parameterized models have the potential to over-fit samples of data, leading to poor out-of-sample forecast accuracy [31]. The Copas test provides a good measure of out-of-sample performance and guards against over-fitting when models are used for prediction purposes [54, 87]. The Copas test involves running a simple OLS regression of y = α₀ + α₁ŷ + ε, where ŷ represents the predictions from v-fold cross-validation. If there is no overfitting then we would expect α₀ = 0 and α₁ = 1. A deviation from these values represents evidence of overfitting, and more generally, poor predictive power. As in Jones [31] we use a less restrictive specification of the test, i.e. H₀: α₁ = 1, as well as a test of the more restrictive null, H₀: α₀ = 0, α₁ = 1.
For the validation sample, we employ the same goodness-of-fit measures as for the estimation sample, but omit the R² value. In this case, the MPE measures the degree of bias of predicted values within the forecast sample, and can be interpreted as a measure of the accuracy of predictions at an aggregate level. The MAPE can be interpreted as a measure of the accuracy of individual predictions [29].

The results for separate ICD-10 chapters are available upon request, while count and average rank tables are presented in Appendix D in [86].
16Available upon request.
17The smaller the rank, the better the model.
5. Results and Discussion
5.1. Distributional characteristics of waiting times levels and residuals
In this subsection we discuss the distributions of waiting times and error terms for the levels, log and square root regressions. The sample distributions, defined in terms of weekly and daily frequencies, and for both 2002 and 2007, illustrate the challenges in modelling waiting times. Sample statistics are presented in Table 2. Mean waits are considerably greater than median waits, and the distribution is skewed to the right. The kurtosis of the distribution is significantly larger than three, indicative of a heavy right-hand tail. Non-normality is confirmed by both histograms and formal tests for levels, log and square root transformations for weekly and daily frequencies18 (Figures 1 and 2). The residuals from linear regressions on the level, log and square root transformations of waiting times (Figures 3 and 4) exhibit the same non-normal behaviour.
5.2. Pre-comparison model selection
Given the non-normality of the errors, simple exponentiated coefficients from the log-transformed model provide biased estimates of the true coefficients on the raw scale. To account for this we employ Duan's smearing estimator, noting that it may result in bias due to heteroscedasticity, though considerably less than if a naive exponential retransformation were used. Also, given non-normality and heteroscedasticity, the additive retransformation is used to generate raw-scale predictions for the square-root transformed model.
Results from formal tests for duration model selection indicate that models which allow a more flexible relationship between the hazard function and time perform better. As such, the exponential model, which assumes the hazard rate is constant over time, performs poorly across all metrics for both weekly and daily waits. For weekly waiting times, the log-normal model also performs well, with estimated statistics being close to those of the generalised gamma, but the Weibull model performs poorly. For daily waits, we observe the opposite pattern, with the Weibull model performing well and the log-normal model performing poorly (see Section A in [86]).
For count data models, results reject the Poisson model in favour of the negative binomial model. This is consistent across all chapters in each frequency-year. The Vuong test suggests that zero-inflation is not required for weekly waits in any year for the aggregate samples. This story is identical for the separate ICD-10 chapters, apart from chapter 13 in 2002, where zero-inflation is needed. For daily waits the results differ, with the zero-inflated model being appropriate for the aggregate sample in 2002, and for 50% and 23.5% of the ICD chapters for years 2002 and 2007 respectively. This difference is expected given how weekly waits were defined: a weekly wait of zero was assigned if and only if the daily wait also took a value of zero, and thus there is an identical number of zero observations for both definitions.
18The Shapiro-Wilk and D'Agostino formal tests are not reported; they are available upon request.
However, aggregating waiting times to a weekly level increases the number of observations for each positive value, making zero values less important (see Tables B1 – B5 in [86]).
5.3. In-Sample Results
ICD-10 chapter 2 (Neoplasms/Cancer) has the largest sample size in 2002 and the third largest sample size in 2007. We use this chapter to demonstrate model comparison in detail. The relative performance of models for this chapter in terms of the MPE, MAPE, RMSE and R² for both 2002 and 2007, as well as the Pregibon link and Pearson correlation tests, are presented in Tables 3 and 4 for the weekly and daily frequencies respectively. The best performing model(s) for each metric are highlighted in bold19. Relative performance of models is very similar in each year and frequency of waiting times for the MPE, MAPE, RMSE and R² metrics. Consistent with findings in Veazie et al. [22] and Jones et al. [29], results for this chapter imply a trade-off between bias, as measured by the MPE, and precision, as measured by the MAPE, since no model performs best on both of these metrics for any frequency-year. This result holds for all ICD chapters and aggregate samples.
OLS on level waits, OLS on square root waits, the ECM model estimated by Poisson maximum likelihood, and the GLM with a log link and Poisson distribution perform the best on the MPE for all frequency-years, indicating that these models provide the least biased estimates for this chapter. Results from the count and average rank tables confirm the consistency of these findings across ICD chapters, with these models being the only ones to produce the lowest MPE for at least one chapter, and the four best placed models in terms of average rank. These are also the four best performing models in the aggregate samples for all frequency-years. Count data models and other GLM specifications also perform well across all chapters, shown by their moderately high average ranks. The Cox proportional hazards model performs particularly badly on this metric, with the average error being almost as large as the mean wait in each frequency-year. Estimates from all the parametric duration models, log-OLS, and quantile regression models, as well as the EEE when it converged, are also heavily biased. This is again consistent across chapters, with these models providing the poorest average ranks, and consistent with the results from the aggregate sample.
A different pattern is observed for the MAPE, where quantile regression performs the best and parametric duration models exhibit similarly good performance in all frequency-years for the cancer chapter. Again consistency is found across chapters, with these models providing clearly superior average ranks, and also being the only models to perform best for at least one chapter for the vast majority of frequency-years. These are also the best performing models in the aggregate samples. These results indicate that although estimated waits from these models are biased, they are the most accurate. The EEE, log-OLS, and Cox PH models again perform poorly for the cancer chapter, across chapters, and in the aggregate sample, although the EEE does provide the most accurate in-sample predictions for chapter 16 for
19Results are missing for non-convergence cases.
both weekly and daily waits in 200220. Count models and all GLM specifications again perform well across chapters for all frequency-years, indicated by their good average ranks and relative performance in aggregate samples.
In terms of goodness of fit, the ECM model estimated by NLLS and the GLM with a log link and normal distribution achieve the highest R² and lowest RMSE for all frequency-years, although all other GLM specifications, OLS on level, log, and square root waits, and all count data models generate similar levels of performance. Results from the best-count calculations indicate results for the cancer chapter and across chapters are similar: the GLM log-normal, the ECM model estimated by NLLS, and the GLM with a log link and normal distribution all generate either the highest R² or lowest RMSE for at least one chapter for all frequency-years, and the zero-inflated Poisson model meets this criterion for three out of the four frequency-years. These models also perform the best on average ranks and in the aggregate samples. In general, both the Cox PH and parametric duration models perform very poorly for chapter 2, across chapters, and in the estimation sample. However, surprisingly, the generalised gamma model is found to be the best fitting model in some chapters despite its poor performance elsewhere.
The Cox proportional hazards model, the GLM log-normal, and the GLM sqrt-gamma are the only models to pass the Pregibon link test for the daily frequency in 2002, indicating that they are the only models to capture the true non-linearity between waiting times and the regressors in 2002. This is surprising given that other models, such as the count models, perform consistently well on the other metrics. The story is similar across other frequency-years, but all forms of the negative binomial model also pass the link test there. However, these results are not totally indicative of the results across other chapters. Although the models which pass the link test for chapter 2 tend to be those for which the test was passed for the most other chapters, all models but the ECM model estimated both by NLLS and Poisson ML, the GLM log-Poisson, and the GLM sqrt-gamma pass the link test for over half of the ICD-10 chapters for all frequency-years21. When the EEE model converged and standard errors were produced, this model also performed well on the link test, passing for at least half of all chapters for all frequency-years.
Furthermore, for chapter 2, all parametric duration models and a finite mixture of negative binomials also fail the Pearson test for all frequency-years, indicating further misspecification22. The story is consistent across chapters, with these models passing the Pearson test for less than half of the 18 chapters studied. OLS on transformed waits also consistently performs poorly across chapters. Also, despite its good performance on the link test, the Cox proportional hazards model fails the Pearson test for all frequency-years for the cancer chapter. Looking at results across all chapters, frequencies and years, OLS on level waits, the ECM estimated by both Poisson ML and NLLS, all count data models, and all GLM specifications excluding the GLM sqrt-gamma perform exceptionally well on the Pearson test,
20This chapter relates to diseases of the genitourinary system.
21Also, the GLM sqrt-gamma only fails to pass for over half of chapters for the daily frequency in 2002, and all models pass the link test for over half of the chapters for the weekly frequency in 2007.
22The quantile regression model and OLS on both square root and logged waits also fail both specification tests for all frequency-years apart from the daily frequency in 2002.
with the test indicating misspecification for either one or zero chapters. OLS on transformed waits and both survival models perform very poorly on this metric for all frequency-years. Results are similar when considering the results from aggregate data.
In general, full-sample results indicate that generalised linear models with log links and count data models are the most appropriate for the modelling of waiting times data. Allowing more flexibility in the count models, either through zero-inflation or using the generalised negative binomial, does not significantly improve performance. However, introducing additional flexibility into the GLM through use of the EEE has large negative effects on performance. The EEE, quantile regression, and all duration models (parametric and Cox PH) perform poorly across most of the metrics and thus are not appropriate for modelling waiting times with this dataset23. Also notable is that the OLS model performs well across the majority of criteria, a finding common in many of the model comparisons using healthcare cost data. Performance is similar when OLS is applied to square-root transformed waits. However, model performance is reduced when using the logarithm of waits as the dependent variable, especially in terms of bias and accuracy.
5.4. Validation Sample Results
Results from the validation forecasts largely confirm those from the in-sample estimations. OLS on both the level and square-root transformed waits, the GLM log-Poisson, the ECM estimated by Poisson ML, and all count data models are amongst the models generating out-of-sample predictions with very little bias for chapter 2. However, count and average rank results indicate that the relative performance of count models has improved. The negative binomial and its zero-inflated and generalised extensions generate the lowest MPE for some chapters in some frequency-years. We also find that the square-root transformed OLS model no longer generates the lowest MPE for any chapter in any frequency-year, and its average ranking deteriorates. This change is mirrored in the aggregate results. The Cox PH, generalised gamma, and quantile regression models consistently perform poorly on this metric across all chapters and frequency-years.
Quantile regression and the parametric duration models produce the lowest MAPE for chapter 2 for all frequency-years. This performance is mirrored across chapters, with these models being the only ones that generate the lowest MAPE for chapters in any frequency-year, and they have by far the best average rank figures. This indicates that good in-sample accuracy has carried through to good out-of-sample accuracy. As before, the Cox PH, log-transformed OLS, and EEE models generate predictions with substantial inaccuracy, both for chapter 2 and across chapters, indicated by their poor average ranks for each frequency-year.
The ECM model estimated by NLLS and the GLM log-normal again produce the best predictions for the cancer chapter, as measured by the RMSE. This is consistent across all frequencies and years. All other GLM specifications,
23The disadvantages of the EEE increase further when taking into account the problems with convergence.
OLS on level, log, and square root waits, and all count data models also perform well on this metric. Count and rank results indicate this performance carries through across all chapters. Both the semi-parametric Cox model and the parametric duration models perform particularly badly on this metric in all frequency-years, generating very poor average ranks.
The Copas test was used as a test for overfitting and, more generally, of out-of-sample forecast accuracy. As expected, the Copas test indicates that there is no overfitting when the OLS regression on level waits, the simplest of all the models, is considered for the cancer chapter. In addition, more complex models such as the generalised and zero-inflated count data models pass the Copas test, despite being the most susceptible to overfitting the data. All count data and GLM specifications with log links pass the Copas test in each frequency-year. The quantile regression, all duration models, and the log- and square-root-transformed OLS models are found to overfit the data. This indicates that they are not appropriate for modelling waiting times in this dataset. This finding is consistent across chapters: models that pass the Copas test for the cancer chapter do so for over two-thirds of all chapters in each frequency-year, and models that fail overfit the data for the majority of other chapters in each frequency-year.
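The shrinkage regression underlying the Copas test can be sketched as follows: realised outcomes are regressed on out-of-sample predictions, and overfitting shows up as a slope below one (the χ² statistics in Tables 5 and 6 test β1 = 1 and (β0, β1) = (0, 1)). The simulated series below are our own illustration, not the paper's data:

```python
import numpy as np

def copas_coefficients(y_new, y_hat_new):
    """Regress realised outcomes on out-of-sample predictions:
    y = b0 + b1 * y_hat + u. Overfitting shrinks b1 below 1."""
    X = np.column_stack([np.ones_like(y_hat_new), y_hat_new])
    (b0, b1), *_ = np.linalg.lstsq(X, y_new, rcond=None)
    return b0, b1

rng = np.random.default_rng(0)
y_hat = rng.gamma(2.0, 5.0, size=500)       # hypothetical skewed forecasts
y = y_hat + rng.normal(0.0, 2.0, size=500)  # well-calibrated outcomes
b0, b1 = copas_coefficients(y, y_hat)       # slope near 1: no overfitting
```

In practice the hypothesis tests on (β0, β1) would be run with the usual regression standard errors; only the point estimates are sketched here.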
6. Conclusions
We estimate a wide range of models used in the analysis of inpatient waiting times and other skewed health outcomes using Scottish waiting times data. Tests on the parameter estimates of a generalised gamma model and information criteria are used to evaluate the duration models. We also run tests to determine the optimal count data model and whether zero-inflation is needed. Comparative performance was determined using a quasi-experimental design. Specification tests and measures of bias, accuracy, and goodness of fit were used to determine in-sample performance. V-fold cross-validation techniques were then used to generate a set of out-of-sample forecasts. The same measures of bias, accuracy, and fit were calculated on this sample, and the Copas test was run to check for overfitting.
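The V-fold cross-validation step can be sketched as follows, assuming a simple one-regressor OLS model and simulated data (purely illustrative):

```python
import numpy as np

def vfold_predictions(x, y, v=5, seed=0):
    """V-fold cross-validation for a simple OLS fit: each observation's
    prediction comes from a model estimated without its own fold."""
    rng = np.random.default_rng(seed)
    folds = rng.permutation(len(y)) % v        # random fold labels 0..v-1
    y_hat = np.empty(len(y))
    for k in range(v):
        train, test = folds != k, folds == k
        X = np.column_stack([np.ones(train.sum()), x[train]])
        beta, *_ = np.linalg.lstsq(X, y[train], rcond=None)
        y_hat[test] = beta[0] + beta[1] * x[test]
    return y_hat

rng = np.random.default_rng(1)
x = rng.uniform(0.0, 10.0, 200)
y = 2.0 + 3.0 * x + rng.normal(0.0, 0.5, 200)  # hypothetical linear data
y_hat = vfold_predictions(x, y)                # out-of-sample predictions
oos_rmse = np.sqrt(np.mean((y - y_hat) ** 2))
```

Because every observation is predicted from a model that never saw it, the out-of-sample measures computed on `y_hat` are not flattered by overfitting.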
The results suggest that a wide range of models may be appropriate for the modelling of waiting times. In particular, count data models generate unbiased predictions with reasonable accuracy, capture well the non-linearity in the relationship between waiting times and the regressors, and do not overfit the data. Results are similar for generalised linear models. We also find that a simple linear OLS specification performs well on the majority of metrics, calling into question the need to employ more advanced and computationally intensive techniques.
References
[1] Iversen T. A theory of hospital waiting lists. Journal of Health Economics 1993; 12(1):55–71.
[2] Iversen T. The effect of a private sector on the waiting time in a national health service. Journal of Health Economics 1997; 16(4):381–396.
[3] Lindsay C, Feigenbaum B. Rationing by waiting lists. The American Economic Review 1984: 404–417.
[4] Martin S, Smith P. Rationing by waiting lists: an empirical investigation. Journal of Public Economics 1999; 71(1):141–164.
[5] Cullis J, Jones P, Propper C. Waiting lists and medical treatment: analysis and policies. Handbook of Health Economics 2000; 1:1201–1249.
[6] Dusheiko M, Gravelle H, Jacobs R. The effect of practice budgets on patient waiting times: allowing for selection bias. Health Economics 2004; 13(10):941–958.
[7] Propper C. The disutility of time spent on the United Kingdom's National Health Service waiting lists. Journal of Human Resources 1995: 677–700.
[8] Siciliani L, Hurst J. Tackling excessive waiting times for elective surgery: a comparative analysis of policies in 12 OECD countries. Health Policy 2005; 72(2):201–215.
[9] Laudicella M, Siciliani L, Cookson R. Waiting times and socioeconomic status: Evidence from England. Social Science & Medicine 2012.
[10] Gravelle H, Siciliani L. Ramsey waits: Allocating public health service resources when there is rationing by waiting. Journal of Health Economics 2008; 27(5):1143.
[11] Appleby J, Boyle S, Devlin N, Harley M, Harrison A, Thorlby R. Do English NHS waiting time targets distort treatment priorities in orthopaedic surgery? Journal of Health Services Research and Policy 2005; 10(3):167–172.
[12] Nikolova S, Harrison M, Sutton M. Does waiting time influence the effectiveness of surgery? Evidence from the national PROMs dataset. University of Manchester working paper 2014.
[13] Sanmartin C, Berthelot J, McIntosh C, et al. Determinants of unacceptable waiting times for specialized services in Canada. Healthcare Policy 2007; 2(3):e140.
[14] Cutler D. Equality, efficiency, and market fundamentals: the dynamics of international medical-care reform. Journal of Economic Literature 2002; 40(3):881–906.
[15] Cullis J, Jones P. Inpatient waiting: a discussion and policy proposal. British Medical Journal (Clinical research ed.) 1983; 287(6403):1483.
[16] Cullis J, Jones P. National health service waiting lists: A discussion of competing explanations and a policy proposal. Journal of Health Economics 1985; 4(2):119–135.
[17] Smith P. Performance management in British health care: will it deliver? Health Affairs 2002; 21(3):103–115.
[18] Oliver A. The English National Health Service: 1979-2005. Health Economics 2005; 14(S1):S75–S99.
[19] Duan N, Manning W, Morris C, Newhouse J. A comparison of alternative models for the demand for medical care. Journal of Business & Economic Statistics 1983; 1(2):115–126.
[20] Manning W, Mullahy J. Estimating log models: to transform or not to transform? Journal of Health Economics 2001; 20(4):461–494.
[21] Manning W, Basu A, Mullahy J. Generalized modeling approaches to risk adjustment of skewed outcomes data. Journal of Health Economics 2005; 24(3):465–488.
[22] Veazie P, Manning W, Kane R. Improving risk adjustment for Medicare capitated reimbursement using nonlinear models. Medical Care 2003; 41(6):741–752.
[23] Buntin M, Zaslavsky A. Too much ado about two-part models and transformation? Comparing methods of modeling Medicare expenditures. Journal of Health Economics 2004; 23(3):525–542.
[24] Montez-Rath M, Christiansen C, Ettner S, Loveland S, Rosen A. Performance of statistical models to predict mental health and substance abuse cost. BMC Medical Research Methodology 2006; 6(1):53.
[25] Basu A, Rathouz P. Estimating marginal and incremental effects on health outcomes using flexible link and variance function models. Biostatistics 2005; 6(1):93–109.
[26] Basu A, Arondekar B, Rathouz P. Scale of interest versus scale of estimation: comparing alternative estimators for the incremental costs of a comorbidity. Health Economics 2006; 15(10):1091–1107.
[27] Basu A, Manning W, Mullahy J. Comparing alternative models: log vs Cox proportional hazard? Health Economics 2004; 13(8):749–765.
[28] Mullahy J. Econometric modeling of health care costs and expenditures: a survey of analytical issues and related policy considerations. Medical Care 2009; 47(7 Supplement 1):S104–S108.
[29] Jones A, Lomas J, Rice N, et al. Applying beta-type size distributions to healthcare cost regressions. Health, Econometrics and Data Group (HEDG) Working Papers 2011.
[30] Hill S, Miller G. Health expenditure estimation and functional form: applications of the generalized gamma and extended estimating equations models. Health Economics 2010; 19(5):608–627.
[31] Jones A. Models for health care. University of York, Centre for Health Economics, 2010.
[32] Deb P, Burgess J. A quasi-experimental comparison of econometric models for health care expenditures. Hunter College Department of Economics Working Papers 2003; 212.
[33] Johar M, Keane M, Jones G, Savage E, Stavrunova O. Differences in waiting times for elective admissions in NSW public hospitals: A decomposition analysis by non-clinical factors. Technical Report, CHERE Working Paper 2010.
[34] Johar M, Jones G, Keane M, Savage E, Stavrunova O. Waiting times for elective surgery and the decision to buy private health insurance. Health Economics 2011; 20(S1):68–86.
[35] Reyes J. Do female physicians capture their scarcity value? The case of ob/gyns. Technical Report, National Bureau of Economic Research 2006.
[36] Carlsen F, Kaarboe O. Notatserie i helseøkonomi. Technical Report.
[37] Cooper Z, McGuire A, Jones S, Le Grand J. Equity, waiting times, and NHS reforms: retrospective study. British Medical Journal 2009; 339.
[38] Pell J, Pell A, Norrie J, Ford I, Cobbe S. Effect of socioeconomic deprivation on waiting time for cardiac surgery: retrospective cohort study. British Medical Journal 2000; 320(7226):15.
[39] Hauck K, Street A. Do targets matter? A comparison of English and Welsh national health priorities. Health Economics 2007; 16(3):275–290.
[40] Sloan F, Lorant J. The role of waiting time: Evidence from physicians' practices. The Journal of Business 1977; 50(4):486–507.
[41] Tinghog G, Andersson D, Tinghog P, Lyttkens C. Horizontal inequality in rationing by waiting lists. International Journal of Health Services 2010.
[42] Sharma A, Siciliani L, Harris A. Waiting times and socioeconomic status: does sample selection matter? Monash University, Business and Economics, Centre for Health Economics, 2011.
[43] Monstad K, Engesæter L, Espehaug B. Waiting time and socioeconomic status - an individual level analysis. Technical Report 2010.
[44] James C, Bourgeois F, Shannon M. Association of race/ethnicity with emergency department wait times. Pediatrics 2005; 115(3):e310–e315.
[45] Park C, Lee M, Epstein A. Variation in emergency department wait times for children by race/ethnicity and payment source. Health Services Research 2009; 44(6):2022–2039.
[46] Johar M, Jones G, Keane M, Savage E, Stavrunova O. Discrimination in a universal health system: Explaining socioeconomic waiting time gaps. School of Finance and Economics, University of Technology, Sydney Working Paper Series 2011.
[47] Siciliani L, Martin S. An empirical analysis of the impact of choice on waiting times. Health Economics 2007; 16(8):763–779.
[48] Janulevicuite J, Askildsen E, Kaarboe O, Holmas T, Sutton M. The impact of different prioritisation policies on waiting times: Case studies of Norway and Scotland. Social Science and Medicine 2013; 97(November):1–6.
[49] Iversen T, Luras H. The interaction between patient shortage and patients' waiting time. Technical Report, Oslo University, Health Economics Research Programme 2009.
[50] Siciliani L, Verzulli R. Waiting times and socioeconomic status among elderly Europeans: evidence from SHARE. Health Economics 2009; 18(11):1295–1306.
[51] Roll K, Stargardt T, Schreyogg J. Effect of type of insurance and income on waiting time for outpatient care. The Geneva Papers on Risk and Insurance - Issues and Practice 2012.
[52] Arnesen K, Erikssen J, Stavem K. Gender and socioeconomic status as determinants of waiting time for inpatient surgery in a system with implicit queue management. Health Policy 2002; 62(3):329–341.
[53] Dimakou S, Parkin D, Devlin N, Appleby J. Identifying the impact of government targets on waiting times in the NHS. Health Care Management Science 2009; 12(1):1–10.
[54] Blough D, Madden C, Hornbrook M. Modeling risk using generalized linear models. Journal of Health Economics 1999; 18(2):153–171.
[55] Manning W, et al. The logged dependent variable, heteroscedasticity, and the retransformation problem. Journal of Health Economics 1998; 17(3):283–296.
[56] Mullahy J. Much ado about two: reconsidering retransformation and the two-part model in health econometrics. Journal of Health Economics 1998; 17(3):247–281.
[57] Box G, Cox D. An analysis of transformations. Journal of the Royal Statistical Society, Series B (Methodological) 1964: 211–252.
[58] Chaze J. Assessing household health expenditure with Box–Cox censoring models. Health Economics 2005; 14(9):893–907.
[59] Koenker R, Hallock K. Quantile regression: An introduction. Journal of Economic Perspectives 2001; 15(4):43–56.
[60] Cameron A, Trivedi P. Regression Analysis of Count Data, vol. 30. Cambridge University Press, 1998.
[61] Long J. Regression Models for Categorical and Limited Dependent Variables, vol. 7. Sage Publications, 1997.
[62] Maddala G. Limited-Dependent and Qualitative Variables in Econometrics, vol. 3. Cambridge University Press, 1986.
[63] Deb P, Trivedi P. Demand for medical care by the elderly: a finite mixture approach. Journal of Applied Econometrics 1997; 12:313–336.
[64] Vuong Q. Likelihood ratio tests for model selection and non-nested hypotheses. Econometrica 1989: 307–333.
[65] Jenkins S. Survival analysis. Unpublished manuscript, Institute for Social and Economic Research, University of Essex, Colchester, UK 2005.
[66] Jenkins S. Distributionally-sensitive inequality indices and the GB2 income distribution. Review of Income and Wealth 2009; 55(2):392–398.
[67] Sun J, Frees E, Rosenberg M. Heavy-tailed longitudinal data modeling using copulas. Insurance: Mathematics and Economics 2008; 42(2):817–830.
[68] Cox D. Regression models and life-tables. Journal of the Royal Statistical Society, Series B (Methodological) 1972: 187–220.
[69] McCullagh P, Nelder J. Generalized Linear Models, vol. 37. Chapman & Hall/CRC, 1989.
[70] Nelder J, Wedderburn R. Generalized linear models. Journal of the Royal Statistical Society, Series A (General) 1972: 370–384.
[71] Pregibon D. Goodness of link tests for generalized linear models. Applied Statistics 1980.
[72] Park R. Estimation with heteroscedastic error terms. Econometrica 1966: 888.
[73] Gourieroux C, Monfort A, Trognon A. Pseudo maximum likelihood methods: Theory. Econometrica 1984: 681–700.
[74] Analyzing count data. URL http://www.ats.ucla.edu/stat/stata/library/count.htm.
[75] Chiou J, Muller H. Quasi-likelihood regression with unknown link and variance functions. Journal of the American Statistical Association 1998; 93(444):1376–1387.
[76] Shah N, Craig B, Banerjee R, Tulledge-Scheitel S, Naessens J. Evaluating health plan benefit change: A finite mixture model approach. Available at SSRN 995052 2007.
[77] Deb P, Holmes A. Estimates of use and costs of behavioural health care: a comparison of standard and finite mixture models. Health Economics 2000; 9(6):475–489.
[78] Deb P, Trivedi P. The structure of demand for health care: latent class versus two-part models. Journal of Health Economics 2002; 21(4):601–625.
[79] Jimenez-Martin S, Labeaga J, Martinez-Granado M. Latent class versus two-part models in the demand for physician services across the European Union. Health Economics 2002; 11(4):301–321.
[80] Bago d'Uva T. Latent class models for utilisation of health care. Health Economics 2006; 15(4):329–343.
[81] Lourenco O, Ferreira P. Utilization of public health centres in Portugal: effect of time costs and other determinants. Finite mixture models applied to truncated samples. Health Economics 2005; 14(9):939–953.
[82] Conway K, Deb P. Is prenatal care really ineffective? Or, is the "devil" in the distribution? Journal of Health Economics 2005; 24(3):489–513.
[83] Quan H, Sundararajan V, Halfon P, Fong A, Burnand B, Luthi J, Saunders L, Beck C, Feasby T, Ghali W. Coding algorithms for defining comorbidities in ICD-9-CM and ICD-10 administrative data. Medical Care 2005: 1130–1139.
[84] Shapiro S, Wilk M. An analysis of variance test for normality (complete samples). Biometrika 1965; 52(3/4):591–611.
[85] D'Agostino R, Belanger A, D'Agostino Jr R. A suggestion for using powerful and informative tests of normality. The American Statistician 1990; 44(4):316–321.
[86] Sinko A, Turner A, Nikolova S, Sutton M. Appendix for "Models for waiting times in healthcare: Comparative study using Scottish administrative data". Technical Report 2014.
[87] Copas J. Regression, prediction and shrinkage. Journal of the Royal Statistical Society, Series B (Methodological) 1983: 311–354.
7. Figures
[Figure: six histogram panels, each with vertical axis "Density": Weekly 2002 (0-100 weeks); Daily 2002 (0-800 days); Log Weekly 2002 (0-5 log weeks); Log Daily 2002 (0-8 log days); Root Weekly 2002 (0-10 root weeks); Root Daily 2002 (0-30 root days).]

Figure 1. HISTOGRAM PLOTS OF LEVEL, LOG AND ROOT WAITING TIMES: 2002
[Figure: six histogram panels, each with vertical axis "Density": Weekly 2007 (0-100 weeks); Daily 2007 (0-800 days); Log Weekly 2007 (0-5 log weeks); Log Daily 2007 (0-8 log days); Root Weekly 2007 (0-10 root weeks); Root Daily 2007 (0-30 root days).]

Figure 2. HISTOGRAM PLOTS OF LEVEL, LOG AND ROOT WAITING TIMES: 2007
[Figure: six histogram panels of OLS residuals, each with vertical axis "Density": Weekly 2002 (0 to 100); Daily 2002 (0 to 800); Log Weekly 2002 (-2 to 3); Log Daily 2002 (-4 to 4); Root Weekly 2002 (-5 to 10); Root Daily 2002 (-10 to 20).]

Figure 3. HISTOGRAM PLOTS OF OLS RESIDUALS FROM LEVEL, LOG AND ROOT REGRESSIONS: 2002
[Figure: six histogram panels of OLS residuals, each with vertical axis "Density": Weekly 2007 (0 to 100); Daily 2007 (0 to 800); Log Weekly 2007 (-2 to 3); Log Daily 2007 (-4 to 4); Root Weekly 2007 (-2 to 8); Root Daily 2007 (-10 to 20).]

Figure 4. HISTOGRAM PLOTS OF OLS RESIDUALS FROM LEVEL, LOG AND ROOT REGRESSIONS: 2007
8. Tables
Table 1. SUMMARY OF MODEL ESTIMATORS AND METHOD OF GENERATING PREDICTIONS

Each entry gives the estimating equation and the prediction µ(x).

Ordinary Least Squares (OLS)
Linear OLS: Y = x′β + ε; µ(x) = x′β.
Log OLS (with Duan smearing estimator): ln(Y) = x′β + ε, E[exp(ε)] = constant; µ(x) = exp(x′β)·s, with s = N⁻¹ Σᵢ exp(ε̂ᵢ).
Square-root OLS (with additive smearing estimator): √Y = x′β + ε, E[ε²] = constant; µ(x) = (x′β)² + s, with s = N⁻¹ Σᵢ ε̂ᵢ².

Survival/Duration Models (λᵢ = exp(xᵢβ) throughout)
Exponential model: h(t, x) = λᵢ; Ê[tᵢ] = exp(−xᵢβ).
Weibull model: h(t, x) = p λᵢ t^(p−1); Ê[tᵢ] = (1/λᵢ)^(1/p) Γ(1 + 1/p).
Log-logistic model: h(t, x) = λ^(1/γ) t^(1/γ − 1) / (γ[1 + (λt)^(1/γ)]); E[T] = (1/λ) · γπ/sin(γπ).
Log-normal model: h(t, x) = [(1/(tσ√(2π))) exp(−(ln t − µ)²/(2σ²))] / [1 − Φ((ln t − µ)/σ)], with µ = xᵢβ; prediction formulae N/A.
Gompertz model: h(t, x) = λᵢ exp(γt); prediction formulae N/A.
Generalised gamma model: f(t) = p λ (λt)^(pκ−1) exp(−(λt)^p) / Γ(κ); prediction formulae N/A.
Generalised beta of the second kind (GB2): f(y) = a y^(ap−1) / (b^(ap) B(p, q) [1 + (y/b)^a]^(p+q)), where B(u, v) = Γ(u)Γ(v)/Γ(u + v); Ê(y) = b Γ(p + 1/a) Γ(q − 1/a) / (Γ(p)Γ(q)).
Cox proportional hazards model: h(yᵢ|xᵢ) = h₀(yᵢ)λᵢ; µ(x) = x′β.

Generalised Linear Models (GLMs)
Log-Poisson: ln(µ(x)) = x′β, V(Y|X = x) = µ(x); µ(x) = exp(x′β).
Log-gamma: ln(µ(x)) = x′β, V(Y|X = x) = θ₁(µ(x))²; µ(x) = exp(x′β).
Log-normal: ln(µ(x)) = x′β, V(Y|X = x) = θ₁; µ(x) = exp(x′β).
Square root-gamma: √(µ(x)) = x′β, V(Y|X = x) = θ₁(µ(x))²; µ(x) = (x′β)².
Extended Estimating Equations (EEE): x′ᵢβ = g(µᵢ; λ); µ(x) = (x′β·λ + 1)^(1/λ).

Count Data Models
Poisson: P_p(Yᵢ = yᵢ|xᵢ) = e^(−µᵢ) µᵢ^(yᵢ) / yᵢ!; µ_p(x) = exp(x′β).
Negative binomial: P_nb(Yᵢ = yᵢ|xᵢ) = [Γ(θ + yᵢ)/(Γ(yᵢ + 1)Γ(θ))] θ^θ µᵢ^(yᵢ) / (µᵢ + θ)^(θ+yᵢ); µ_nb(x) = exp(x′β).
Zero-inflated Poisson: P(yᵢ = 0) = p₀(γ′zᵢ) + P_p(0|xᵢ)(1 − p₀(γ′zᵢ)); P(yᵢ > 0) = P_p(yᵢ|xᵢ)(1 − p₀(γ′zᵢ)); µ(xᵢ, zᵢ) = µ_p(1 − p₀).
Zero-inflated negative binomial: as above with P_nb in place of P_p; µ(xᵢ, zᵢ) = µ_nb(1 − p₀).
Generalised negative binomial: P(Yᵢ = yᵢ) = [n/(n + βyᵢ)] C(n + βyᵢ, yᵢ) α^(yᵢ) (1 − α)^(n+βyᵢ−yᵢ); µ = nα/(1 − αβ).

Finite Mixture Models (FMM)
Poisson: P(Yᵢ = yᵢ|xᵢ) = Σₖ πₖ P_p(yᵢ|xᵢ; θₖ); µ = Σₖ πₖ µ_p(θₖ).
Negative binomial: P(Yᵢ = yᵢ|xᵢ) = Σₖ πₖ P_nb(yᵢ|xᵢ; θₖ); µ = Σₖ πₖ µ_nb(θₖ).

Others
Quantile regression (median): Q_τ(yᵢ|xᵢ) = xᵢβ_τ; µ(x) = x′β_τ.
ECM - NLLS: E[yᵢ|xᵢ] = exp(xᵢβ); µ(x) = exp(x′β).
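The retransformation used by the log OLS model above (Duan's smearing estimator) can be sketched as follows, on simulated lognormal data; the regressor, coefficients, and sample are purely illustrative:

```python
import numpy as np

def log_ols_predictions(x, y):
    """Log OLS with Duan's smearing retransformation: fit
    ln(y) = b0 + b1*x + e, then predict exp(b0 + b1*x) * mean(exp(e_hat))."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, np.log(y), rcond=None)
    resid = np.log(y) - X @ beta
    s = np.exp(resid).mean()      # smearing factor N^-1 * sum_i exp(e_hat_i)
    return np.exp(X @ beta) * s   # predictions on the level scale

rng = np.random.default_rng(2)
x = rng.uniform(0.0, 2.0, 1000)
y = np.exp(1.0 + 0.5 * x + rng.normal(0.0, 0.3, 1000))  # lognormal "waits"
pred = log_ols_predictions(x, y)
```

Without the smearing factor, exp(x′β̂) would systematically underpredict the level of y, since E[exp(ε)] > 1 for a mean-zero error.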
Table 2. CHARACTERISTICS OF THE 2002 AND 2007 SAMPLES

                           2002                  2007
                      Daily     Weekly      Daily     Weekly
N                    321922     321922     335487     335487
Mean               79.39171   11.75959   62.96882   9.413593
Median                   42          6         44          7
Standard deviation  99.7035    14.2399   71.15008   10.16192
Skewness           2.272736   2.273856   3.479698   3.481343
Kurtosis           9.129328   9.131606   21.89617   21.91007
Maximum                 728        104        728        104
99th percentile         448         64        373         54
95th percentile         305         44        169         25
90th percentile         213         31        125         18
75th percentile         100         15         87         13
25th percentile          15          3         19          7
10th percentile           5          1          7          1
5th percentile            2          1          3          1
1st percentile            1          1          1          1
Minimum                   0          0          0          0
Table 3. RESULTS FOR CHAPTER 2: CANCER, WEEKLY FREQUENCY, IN-SAMPLE PREDICTION

Model  MPE  MAPE  RMSE  LINK test (p-value)  Pearson test (p-value)  R2

Year 2002
OLS  -6.82e-08  5.142  8.957  0  1.000  0.113
OLS on ln(y)  -0.474  5.394  8.987  2.22e-16  0  0.114
OLS on √y  8.99e-08  5.182  8.981  0  0  0.115
Cox PH  6.026  6.073  11.388  0.089  0  0.108
Generalized Gamma (PH)  2.985  4.590  9.615  4.44e-16  0  0.111
Exponential (PH)  2.057  4.602  9.258  0.0246  0  0.114
Gompertz (PH)  2.627  4.572  9.453  0.103  0  0.113
Log-logistic (PH)  2.671  4.557  9.453  0  0  0.113
Log-normal (PH)  2.611  4.567  9.447  7.82e-13  0  0.113
Weibull (PH)  2.045  4.6035  9.255  0.025  0  0.114
NLLS  0.025  5.107  8.932  0  0.291  0.118
GLM log-Poisson  -3.9e-08  5.120  8.935  0  0.375  0.117
GLM log-gamma  0.008  5.129  8.952  0.0995  0.165  0.114
GLM log-normal  0.025  5.107  8.932  1.72e-5  0.291  0.118
GLM sqrt-gamma  0.012  5.139  8.967  0.075  0.119  0.111
EEE  .  .  .  .  .  .
FMM - negative binomial  0.246  5.065  8.980  1.10e-07  0  0.114
Quantile regression (median)  2.584  4.553  9.411  .  0  0.105
ECM - Poisson ML  -1.22e-07  5.1204  8.935  0  0.375  0.117
Zero-inflated poisson  -1.4e-5  5.120  8.935  3.21e-06  0.376  0.117
Negative Binomial  0.009  5.125  8.947  0.001  0.119  0.115
Zero-inflated negative binomial  0.009  5.125  8.947  0.007  0.119  0.115
Generalized Negative Binomial  0.009  5.125  8.947  0.001  0.119  0.115

Year 2007
OLS  -3.5e-09  4.009  7.072  3.1e-15  1.000  0.110
OLS on ln(y)  -0.696  4.316  7.111  5.73e-12  7.68e-14  0.110
OLS on √y  -1.1e-07  4.0367  7.081  0  4.7e-28  0.111
Cox PH  5.176  5.214  9.234  0.740  0  0.107
Generalized Gamma (PH)  2.160  3.690  7.482  2.0e-11  0  0.107
Exponential (PH)  1.775  3.689  7.348  0.439  0  0.108
Gompertz (PH)  2.044  3.690  7.436  0.581  0  0.108
Log-logistic (PH)  1.934  3.668  7.382  1.8e-15  0  0.109
Log-normal (PH)  1.923  3.679  7.392  1.1e-07  0  0.108
Weibull (PH)  1.503  3.703  7.278  0.190  0  0.108
NLLS  0.013  3.985  7.058  2.2e-08  0.427  0.113
GLM log-Poisson  -4.7e-08  3.996  7.061  0  0.606  0.113
GLM log-gamma  -0.001  4.009  7.074  0.570  0.914  0.109
GLM log-normal  0.014  3.985  7.058  3.5e-06  0.427  0.113
GLM sqrt-gamma  -4.0e-4  4.017  7.081  0.948  0.986  0.108
FMM - negative binomial  0.139  3.987  7.083  2.8e-06  4.9e-28  0.111
Quantile regression (median)  1.970  3.661  7.376  .  0  0.108
ECM - Poisson ML  -3.9e-08  3.996  7.061  0  0.606  0.113
Zero-inflated poisson  8.4e-06  3.996  7.061  0.001  0.599  0.113
Negative Binomial  9.9e-4  4.004  7.070  0.734  0.745  0.110
Zero-inflated negative binomial  9.9e-4  4.004  7.070  0.788  0.745  0.110
Generalized Negative Binomial  9.9e-4  4.004  7.070  0.734  0.745  0.110
Table 4. RESULTS FOR CHAPTER 2: CANCER, DAILY FREQUENCY, IN-SAMPLE PREDICTION

Model  MPE  MAPE  RMSE  LINK test (p-value)  Pearson test (p-value)  R2

Year 2002
OLS  -1.4e-07  36.067  62.750  0  1.000  0.113
OLS on ln(y)  -2.272  36.937  62.835  3.3e-11  0.037  0.112
OLS on √y  1.3e-07  36.341  62.933  0  0  0.115
Cox PH  42.874  42.897  79.377  0.361  0  0.107
Generalized Gamma (PH)  19.615  32.063  66.568  1.5e-06  0  0.113
Exponential (PH)  13.471  32.434  64.682  0.158  0  0.114
Gompertz (PH)  18.595  32.151  66.421  0.501  0  0.112
Log-logistic (PH)  19.754  32.055  66.575  2.5e-10  0  0.112
Log-normal (PH)  21.072  32.146  67.161  1.5e-08  0  0.111
Weibull (PH)  17.166  32.102  65.752  0.080  0  0.114
NLLS  0.195  35.776  62.566  0  0.277  0.119
GLM log-Poisson  1.60e-07  35.877  62.591  0  0.365  0.118
GLM log-gamma  0.046  35.973  62.734  0.352  0.264  0.114
GLM log-normal  0.195  35.776  62.566  5.0e-5  0.277  0.119
GLM sqrt-gamma  0.067  36.052  62.851  0.256  0.211  0.111
EEE  42.581  42.593  79.084  .  0  0.112
FMM - negative binomial  1.292  35.656  62.890  8.0e-09  3.4e-35  0.114
Quantile regression (median)  18.517  32.037  66.084  0  0  0.109
ECM - Poisson ML  1.7e-07  35.878  62.591  0  0.365  0.118
Zero-inflated poisson  4.6e-4  35.878  62.591  9.4e-06  0.361  0.118
Negative Binomial  0.048  35.968  62.728  0.147  0.245  0.114
Zero-inflated negative binomial  0.048  35.968  62.728  0.297  0.245  0.114
Generalized Negative Binomial  0.048  35.968  62.728  0.147  0.245  0.114

Year 2007
OLS  3.7e-07  28.110  49.548  1.3e-15  1.000  0.111
OLS on ln(y)  -2.998  29.112  49.760  9.0e-07  7.8e-22  0.108
OLS on √y  -2.9e-07  28.304  49.619  0  5.3e-28  0.111
Cox PH  36.648  36.671  64.183  0.297  0  0.106
Generalized Gamma (PH)  14.496  25.835  52.090  0.013  0  0.108
Exponential (PH)  11.504  25.946  51.291  0.121  0  0.108
Gompertz (PH)  14.312  25.942  52.189  0.250  0  0.108
Log-logistic (PH)  14.472  25.806  52.010  2.5e-08  0  0.108
Log-normal (PH)  15.821  25.918  52.545  2.3e-4  0  0.106
Weibull (PH)  12.830  25.890  51.656  0.244  0  0.108
NLLS  0.109  27.911  49.446  1.1e-10  0.402  0.114
GLM log-Poisson  2.4e-07  28.002  49.470  0  0.599  0.113
GLM log-gamma  -0.027  28.127  49.584  0.286  0.612  0.109
GLM log-normal  0.109  27.911  49.446  8.8e-06  0.402  0.114
GLM sqrt-gamma  -0.019  28.182  49.634  0.698  0.706  0.107
FMM - negative binomial  0.280  28.263  49.669  0  2.2e-33  0.110
Quantile regression (median)  13.147  25.763  51.501  0  0  0.108
ECM - Poisson ML  -6.5e-08  28.002  49.470  0  0.599  0.113
Zero-inflated poisson  7.7e-5  28.005  49.472  0.003  0.608  0.113
Negative Binomial  -0.023  28.122  49.580  0.170  0.673  0.109
Zero-inflated negative binomial  -0.023  28.122  49.580  0.321  0.673  0.109
Generalized Negative Binomial  -0.023  28.122  49.579  0.170  0.673  0.109
Table 5. RESULTS FOR CHAPTER 2: CANCER, COPAS RESULTS, WEEKLY FREQUENCY

Model  β1  β0  MPE  APE  RMSE  χ2(1) (p-value)  χ2(2) (p-value)

Year 2002
OLS  0.996  0.024  -4.8e-5  5.144  8.961  0.807  0.971
OLS on ln(y)  1.250  -2.252  -0.473  5.400  8.995  0  0
OLS on √y  1.306  -2.035  -0.001  5.185  8.984  0  0
Cox PH  -7.166  11.040  6.026  6.072  11.387  0  0
Generalized Gamma (PH)  2.141  -1.235  2.985  4.591  9.616  0  0
Exponential (PH)  1.468  -0.107  2.057  4.604  9.261  0  0
Gompertz (PH)  1.771  -0.500  2.627  4.574  9.455  0  0
Log-logistic (PH)  1.686  -0.084  2.671  4.559  9.455  0  0
Log-normal (PH)  1.765  -0.503  2.612  4.569  9.448  0  0
Weibull (PH)  1.464  -0.105  2.045  4.605  9.258  0  0
NLLS  0.981  0.154  0.025  5.109  8.937  0.175  0.348
GLM log-Poisson  1.009  -0.062  -1.6e-4  5.123  8.940  0.527  0.819
GLM log-gamma  1.017  -0.106  0.008  5.132  8.956  0.257  0.518
GLM log-normal  0.981  0.154  0.025  5.110  8.937  0.175  0.348
GLM sqrt-gamma  1.021  -0.125  0.012  5.142  8.971  0.181  0.397
EEE  -6.1e-15  6.665  -4.0e+12  4.0e+12  3.2e+13  .  0
FMM - Poisson  -8.4e-34  6.641  -4.4e+29  4.4e+29  9.0e+14  .  0
FMM - negative binomial  1.258  -1.405  0.244  5.068  8.983  0  0
Quantile regression (median)  1.437  0.817  2.589  4.557  9.417  0  0
ECM - Poisson ML  1.009  -0.061  4.1e-5  5.123  8.940  0.535  0.825
Zero-inflated poisson  1.009  -0.062  9.8e-5  5.123  8.940  0.531  0.822
Negative Binomial  1.020  -0.123  0.009  5.128  8.951  0.188  0.413
Zero-inflated negative binomial  1.020  -0.120  0.009  5.128  8.952  0.196  0.425
Generalized Negative Binomial  1.020  -0.121  0.009  5.128  8.952  0.194  0.422

Year 2007
OLS  0.996  0.023  -1.9e-4  4.011  7.075  0.793  0.966
OLS on ln(y)  1.127  -1.518  -0.697  4.318  7.094  0  0
OLS on √y  1.201  -1.154  -0.001  4.039  7.083  0  0
Cox PH  -5.727  8.982  5.176  5.214  9.233  0  0
Generalized Gamma (PH)  1.672  -0.280  2.160  3.691  7.483  0  0
Exponential (PH)  1.434  0.035  1.775  3.690  7.351  0  0
Gompertz (PH)  1.580  -0.127  2.044  3.691  7.438  0  0
Log-logistic (PH)  1.392  0.422  1.934  3.670  7.383  0  0
Log-normal (PH)  1.496  0.006  1.923  3.680  7.394  0  0
Weibull (PH)  1.351  -0.002  1.503  3.705  7.281  0  0
NLLS  0.984  0.107  0.014  3.987  7.063  0.277  0.520
GLM log-Poisson  1.003  -0.020  2.6e-5  3.998  7.065  0.824  0.976
GLM log-gamma  0.994  0.032  -0.001  4.011  7.078  0.709  0.932
GLM log-normal  0.984  0.107  0.014  3.987  7.062  0.278  0.522
GLM sqrt-gamma  0.996  0.023  -4.2e-4  4.019  7.085  0.794  0.967
EEE  .  .  .  .  .
FMM - Poisson  -4.0e-25  5.744  -1.6e+21  1.6e+21  7.8e+16  .  0
FMM - negative binomial  6.4e-28  5.742  -2.0e+23  2.0e+23  566405.6  .  0
Quantile regression (median)  1.314  0.755  1.946  3.666  7.373  0  0
ECM - Poisson ML  1.004  -0.022  -6.4e-06  3.998  7.065  0.801  0.969
Zero-inflated poisson  1.004  -0.023  3.7e-4  3.998  7.065  0.795  0.967
Negative Binomial  1.001  -0.003  9.1e-4  4.006  7.074  0.965  0.999
Zero-inflated negative binomial  1.001  -0.006  0.001  4.006  7.073  0.938  0.997
Generalized Negative Binomial  1.001  -0.006  0.001  4.006  7.073  0.931  0.996
Table 6. RESULTS FOR CHAPTER 2: CANCER, COPAS RESULTS, DAILY FREQUENCY

Model  β1  β0  MPE  APE  RMSE  χ2(1) (p-value)  χ2(2) (p-value)

Year 2002
OLS  0.997  0.142  9.2e-4  36.081  62.776  0.828  0.977
OLS on ln(y)  0.970  -0.880  -2.251  36.921  62.923  0  0
OLS on √y  1.315  -13.738  -0.010  36.357  62.951  0  0
Cox PH  -44.873  72.122  42.874  42.897  79.377  0  0
Generalized Gamma (PH)  1.732  1.902  19.615  32.074  66.583  0  0
Exponential (PH)  1.462  -0.542  13.471  32.443  64.699  0  0
Gompertz (PH)  1.870  -3.330  18.595  32.159  66.433  0  0
Log-logistic (PH)  1.687  3.235  19.754  32.066  66.585  0  0
Log-normal (PH)  1.835  2.090  21.072  32.155  67.172  0  0
Weibull (PH)  1.640  0.132  17.166  32.112  65.768  0  0
NLLS  0.980  1.047  0.194  35.795  62.601  0.169  0.328
GLM log-Poisson  1.010  -0.416  3.7e-4  35.895  62.622  0.517  0.811
GLM log-gamma  1.013  -0.539  0.045  35.989  62.761  0.374  0.668
GLM log-normal  0.980  1.047  0.196  35.795  62.602  0.170  0.328
GLM sqrt-gamma  1.016  -0.613  0.068  36.069  62.878  0.309  0.584
EEE  -1.6e-08  43.590  -4842513  4854459  3.6e+07  .  0
FMM - Poisson  .  .  .  .  .
FMM - negative binomial  1.222  -8.089  1.292  35.667  62.911  5.39e-34  4.8e-36
Quantile regression (median)  1.592  3.751  18.535  32.054  66.106  0  0
ECM - Poisson ML  1.010  -0.427  0.002  35.894  62.621  0.505  0.801
Zero-inflated poisson  1.009  -0.414  -9.0e-4  35.896  62.622  0.521  0.814
Negative Binomial  1.014  -0.555  0.047  35.985  62.757  0.359  0.651
Zero-inflated negative binomial  1.014  -0.554  0.050  35.983  62.758  0.358  0.649
Generalized Negative Binomial  1.014  -0.551  0.049  35.985  62.758  0.361  0.652

Year 2007
OLS  0.996  0.137  -0.001  28.122  49.570  0.810  0.971
OLS on ln(y)  0.867  2.342  -2.998  29.176  49.877  0  0
OLS on √y  1.201  -7.478  -0.008  28.316  49.632  0  0
Cox PH  -35.846  58.783  36.648  36.671  64.182  0  0
Generalized Gamma (PH)  1.474  3.558  14.496  25.844  52.102  0  0
Exponential (PH)  1.425  0.428  11.503  25.955  51.306  0  0
Gompertz (PH)  1.652  -0.843  14.312  25.949  52.199  0  0
Log-logistic (PH)  1.394  5.377  14.471  25.816  52.018  0  0
Log-normal (PH)  1.512  4.685  15.821  25.927  52.553  0  0
Weibull (PH)  1.488  0.768  12.829  25.899  51.671  0  0
NLLS  0.983  0.729  0.109  27.926  49.474  0.266  0.496
GLM log-Poisson  1.004  -0.146  -2.2e-4  28.0157  49.496  0.799  0.968
GLM log-gamma  0.988  0.406  -0.027  28.140  49.607  0.452  0.750
GLM log-normal  0.983  0.750  0.109  27.928  49.477  0.250  0.475
GLM sqrt-gamma  0.990  0.345  -0.020  28.195  49.657  0.531  0.820
EEE  17.600  -16.243  34.210  34.491  62.623  0  0
FMM - Poisson  .  .  .  .  .
FMM - negative binomial  1.221  -7.708  0.4336  28.197  49.669  2.0e-31  8.3e-31
Quantile regression (median)  1.325  5.314  13.150  25.783  51.515  0  0
ECM - Poisson ML  1.004  -0.140  -9.8e-4  28.017  49.497  0.808  0.971
Zero-inflated poisson  1.004  -0.136  -0.002  28.019  49.498  0.815  0.973
Negative Binomial  0.990  0.357  -0.023  28.134  49.601  0.509  0.801
Zero-inflated negative binomial  0.989  0.378  -0.022  28.136  49.605  0.487  0.782
Generalized Negative Binomial  0.989  0.370  -0.023  28.136  49.604  0.495  0.790