Cost assessment for PR19: a consultation on

econometric cost modelling

Appendix 1 – Modelling results

March 2018

Summary
1 Water models
1.1 Water resources models
1.2 Water treatment models
1.3 Water resources plus
1.4 Treated water distribution models
1.5 Network plus water models
1.6 Wholesale water models
2 Wastewater models
2.1 Bioresources models
2.2 Sewage treatment models
2.3 Bioresources plus models
2.4 Sewage collection models
2.5 Network plus wastewater models
2.6 Wholesale wastewater models
3 Retail models
3.1 Bad debt models
3.2 Totex less bad debt models
3.3 Total expenditure models
4 Enhancement expenditure models
4.1 Meeting lead standards costs
4.2 Water new developments and new connections
4.3 First time sewerage costs
4.4 Sewage growth

Summary

This appendix presents the econometric cost models of this consultation. It includes models proposed by us and models proposed by 13 water companies. This appendix supplements Cost assessment for PR19: consultation on econometric modelling.

All the econometric models in this appendix are presented in a fixed template. The table below provides a glossary for the statistical diagnostics used in the templates.

We have published a set of ‘do’ files with code to run all our models in this appendix in Stata. We have also published a set of Excel spreadsheets with the underlying data. We discuss the source of data in the main document.

The remainder of this appendix is structured as follows:

section 1 presents the modelling results for wholesale water activities;

section 2 presents the modelling results for wholesale wastewater activities;

section 3 presents the modelling results for retail expenditure; and

section 4 presents the modelling results for enhancement expenditure.

A simple glossary of statistical diagnostics in our templates

P-value of an estimated coefficient

The p-value gives the probability of observing the estimated coefficient (or one more extreme) if the true value were in fact zero. A lower value indicates a lower probability of observing the estimated coefficient if the true value were zero, and can thus be interpreted as giving a higher degree of confidence that the true value is not zero – i.e. that there is a relationship between the dependent and explanatory variables.

In practice, the p-value indicates our confidence in the estimated coefficient. The lower the p-value, the more confident we are in the value of the estimated coefficient.

Technical comment: due to the panel nature of the data, p-values in this appendix are based on cluster robust standard errors.

Asterisks for estimated coefficients

Next to the estimated coefficients we use a common asterisk notation to indicate their statistical significance.

*** indicates 1% significance level

** indicates 5% significance level

* indicates 10% significance level

The more stars, the more confident we are in the value of the estimated coefficient.

No star indicates a lower level of statistical significance (i.e. there is less confidence in the value of the estimated coefficient). However, there is a wide range of confidence levels in this category. As we say in section 2.1 of the consultation, statistical significance of 80% and even 70% may be deemed valid in practical work.

R2 adjusted

The adjusted R-squared measures how accurately the model fits the data. It measures the proportion of variation in the dependent variable (in our case, variation in costs) that can be explained by the model.

The statistic ranges from 0 to 1. The higher the value, the better the model fits.

Importantly, R2 measures should only be used to compare models with the same dependent variable.

Variance Inflation Factor (VIF)

Used to detect multicollinearity. High collinearity means that we cannot estimate the coefficients with confidence – their variance is high and statistical significance low. As a consequence, the individual coefficient estimates are imprecise and unstable. As a rule of thumb, a VIF>4 indicates medium risk and a VIF>10 indicates harmful collinearity.

An exception to this rule is when the model includes a variable and its quadratic term. In such cases the VIF becomes high due to the correlation between these two related terms. But while the high collinearity may impair our ability to accurately estimate the impact of the individual terms on the dependent variable, it should not impair our ability to accurately estimate their collective impact. Since these two terms always move together, the collective impact is what is important.

Reset test

Regression specification error test. Used to detect an inadequate functional form. Particularly powerful for detecting if the model is missing non-linear terms.

The higher the p-value, the more confident we are that the functional form is adequate.
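
As an illustration of the glossary above, the following Python sketch shows how three of the diagnostics could be computed. It is not taken from the published Stata files. The function names are hypothetical, NumPy is assumed to be available, the adjusted R-squared formula assumes a model with an intercept, and the VIF is computed with the standard definition 1 / (1 - R2) from regressing each explanatory variable on the others.

```python
import numpy as np


def significance_stars(p_value):
    """Map a p-value to the asterisk notation used in the templates.

    The published p-values are rounded, so a value shown as 0.100 or 0.050
    may sit on either side of a threshold.
    """
    if p_value < 0.01:
        return "***"
    if p_value < 0.05:
        return "**"
    if p_value < 0.10:
        return "*"
    return ""


def adjusted_r_squared(r_squared, n_obs, n_regressors):
    """Adjusted R-squared, assuming an intercept plus n_regressors slopes."""
    return 1.0 - (1.0 - r_squared) * (n_obs - 1) / (n_obs - n_regressors - 1)


def variance_inflation_factors(X):
    """VIF of each column of X (explanatory variables only, no constant)."""
    X = np.asarray(X, dtype=float)
    n_obs, n_vars = X.shape
    vifs = []
    for j in range(n_vars):
        target = X[:, j]
        # Regress variable j on a constant and all other explanatory variables.
        others = np.column_stack([np.ones(n_obs), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, target, rcond=None)
        residuals = target - others @ coef
        total = np.sum((target - target.mean()) ** 2)
        r2 = 1.0 - (residuals @ residuals) / total
        vifs.append(1.0 / (1.0 - r2))
    return np.array(vifs)
```

With a single explanatory variable there is nothing to be collinear with, so the VIF is 1, which is consistent with the value of 1.000 reported for single-driver models in the templates.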

1 Water models

1.1 Water resources models

Template 1. Water resources models proposed by Ofwat

Description of dependent variable

Water resources base costs excluding abstraction charges and items described in section 3 of the main consultation document.

All monetary values have been inflated to 2016-17 prices using the CPIH
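
A minimal sketch of how a cost could be restated in 2016-17 prices with a price index is given below. The function name and the index values are hypothetical and are not actual CPIH figures; the published spreadsheets remain the source for the data actually used.

```python
def to_base_year_prices(nominal_cost, index_in_cost_year, index_in_base_year):
    """Restate a nominal cost in base-year prices using a price index."""
    return nominal_cost * index_in_base_year / index_in_cost_year


# Hypothetical index values for illustration only (not actual CPIH figures).
cost_2014_15_nominal = 100.0
index_2014_15 = 98.0
index_2016_17 = 102.0

cost_in_2016_17_prices = to_base_year_prices(
    cost_2014_15_nominal, index_2014_15, index_2016_17
)
```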

Comments on models

We have used the number of connected properties as a scale variable in our water resources models. We considered that the volume of water, while perhaps a more intuitive scale driver for a water resources business, suffered from endogeneity. It is to an extent under management control and can provide a perverse incentive on water efficiency.

We use average pumping head to account for energy costs, which are an important component of water resources costs.

We considered other factors. We expected a positive coefficient to the number of sources per property and a negative coefficient for the proportion of water from impounding reservoirs. However, the model did not generate our expected results for these variables. A number of companies present models with a positive coefficient on the proportion of water from reservoirs. We question whether this is the expected sign in a water resources model.

Our simple models explain close to 90% of the variation in water resources costs.

Consultation model ID | OWR1 | OWR2
Dependent variable | ln (water resources base costs)
ln (connected properties) | 1.026*** (0.000) | 1.069*** (0.000)
ln (average pumping head water resources) |  | 0.163 (0.139)
Constant | 1.938** (0.019) | 0.808 (0.460)
R2 adjusted | 0.889 | 0.894
VIF (max) | 1.000 | 1.259
Reset test | 0.562 | 0.62
Estimation method | OLS | OLS
N (sample size) | 107 | 107
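
Because both the dependent variable and the scale driver are in logs, the coefficient on connected properties can be read as an elasticity. The Python sketch below uses the OWR1 coefficients reported in Template 1 to compare predicted costs for two companies of different size. The function names are hypothetical, and the units of the underlying data are not restated here, so only the ratio of predicted costs (which does not depend on the units) is computed.

```python
import math

# Coefficients of model OWR1 as reported in Template 1.
OWR1_CONSTANT = 1.938
OWR1_LN_PROPERTIES = 1.026


def predicted_ln_cost(connected_properties):
    """Predicted ln(cost) from OWR1 for a given number of connected properties."""
    return OWR1_CONSTANT + OWR1_LN_PROPERTIES * math.log(connected_properties)


def cost_ratio(properties_a, properties_b):
    """Ratio of predicted cost of company b to company a."""
    return math.exp(predicted_ln_cost(properties_b) - predicted_ln_cost(properties_a))


# Doubling the number of connected properties scales predicted cost by
# 2 ** 1.026, i.e. slightly more than double.
ratio_when_doubling = cost_ratio(1000.0, 2000.0)
```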

Template 2. Water resources models proposed by Anglian Water

Description of dependent variable

Log of water resources base costs excluding rates and abstraction charges

Acronyms used in explanatory variables

APH = average pumping head

DI = distribution input

WTW = water treatment works

Comments on models (Anglian Water)

Model 1 takes the Water Resources’ operational parameters as the causation factors. For a single AMP, these causation factors are exogenous. Models 2, 3 and 4 are based on demographic and geographic factors. These are the most fundamental: the causation factors are completely exogenous to WaSCs and WoCs.

All four models are described in detail in our Cost Modelling report – Phase 2, published in March 2018 at http://www.anglianwater.co.uk/about-us/thinking-about-our-future/.

Consultation model ID ANHWR1 ANHWR2 ANHWR3 ANHWR4

Company’s model ID 1 2 3 4

Dependent variable ln (Water resources botex less rates and abstraction costs)

Ln(DI from impounding reservoirs) Ml/d

0.0007 ** (0.046)

Ln(DI from pumped storage reservoirs) Ml/d

-0.00004 (0.797)

Ln(DI from rivers) Ml/d

0.0004 (0.107)

Ln(DI from boreholes) Ml/d

0.0007 ** (0.015)

0.220 *** (0.000)

Ln(DI from rivers & reservoirs) Ml/d

0.377 *** (0.000)

Ln(average DI from surface WTW) Ml/d

0.344 *** (0.000)

Ln(average DI from borehole WTW) Ml/d

-0.104 (0.105)

Ln(APH x *DI) Unit: (Ml/d)*m hd

0.641 *** (0.000)

0.184 * (0.082)

Ln(Number of sources) 0.173 ** (0.041)

0.268 *** (0.000)

0.429 *** (0.000)

0.101 (0.156)

Ln(Reservoir capacity ) Unit: Ml

0.140 *** (0.000)

0.304 *** (0.000)

0.209 *** (0.000)

0.164 *** (0.000)

% DI from groundwater -1.350 ***

(0.000)

% DI from rivers & pumped storage reservoirs

-1.145 *** (0.004)

Volume abstracted/maximum licenced volume

1.427 *** (0.000)

1.420 *** (0.000)

% population in sparse areas (<600 people per sqkm)

-0.284 (0.151)

% population in dense areas (>4000 people per sqkm)

0.686 ** (0.031)

Constant -7.04 *** (0.000)

-3.31 *** (0.000)

-4.90 *** (0.000)

-3.11 *** (0.000)

R2 adjusted 0.902 0.902 0.905 0.912

Reset test 0.697 0.000 0.086 0.602

VIF (max) 9.77 5.19 8.14 4.77

Method OLS OLS OLS OLS

N (sample size) 107 107 107 107

Template 3. Water resources models proposed by Southern Water

Description of dependent variable

Modelled water resources OPEX + modelled water resources base CAPEX

modelled OPEX is total OPEX less third party services, abstraction charges and local authority rates

modelled base CAPEX is maintenance expenditure in infrastructure and non-infrastructure less grants and contributions.

All costs are unsmoothed and deflated to 2016/17 prices using CPIH.

Comments on models (Southern Water)

Water resources models in particular tend to be comparatively less robust to statistical sensitivities and alternative model specifications, and predict relatively wide efficiency ranges across the industry. The models presented are robust to the inclusion of abstraction charges, but individual companies’ performances are sensitive to the modelled cost.

The models control for scale (using unit costs), geological factors, capacity and source type:

Model 7 models cost per connected property, whereas models 8-9 model cost per population served

All models control for sources over DI and reservoir capacity

Models 7-8 control for the proportion of DI from reservoirs and model 9 controls for the proportion of DI from boreholes and the proportion of DI from rivers

Model 7 is identical to Yorkshire Water model 14 (Ofwat comment)

Consultation model ID | SRNWR1 | SRNWR2 | SRNWR3
Company’s model ID | 7 | 8 | 9
Dependent variable | ln (Resources BOTEX per thousand connected properties)
Sources over DI | 0.646*** (0.003) | 0.740*** (0.001) | 0.708*** (0.008)
Reservoir capacity (log) | 0.040 (0.17) | 0.044* (0.095) | 0.046* (0.083)
Proportion of DI from reservoirs | 0.731** (0.043) | 0.749** (0.04) |
Proportion of DI from boreholes |  |  | -0.734** (0.03)
Proportion of DI from rivers |  |  | -0.793* (0.072)
Constant | -5.428*** (0.000) | -6.296*** (0.000) | -5.563*** (0.000)
R2 adjusted | 0.382 | 0.404 | 0.403
Reset test | 0.069 | 0.201 | 0.172
VIF (max) | 1.499 | 1.499 | 2.728
Method | OLS | OLS | OLS
N (sample size) | 102 | 102 | 102

Template 4. Water resources models proposed by Yorkshire Water

Description of dependent variable

Water resources base costs = operating expenditure less abstraction charges, third party services and local authority rates + capital maintenance expenditure net of grants and contributions (G&C)

The dependent variables are deflated using CPIH to 2016/17 prices. No smoothing was undertaken.

Comments on models (Yorkshire Water)

Water resources models tend to predict a wider efficiency range and have poor statistical properties compared to other models.

Our dependent variable excludes G&C, consistent with the PR14 approach. However, given the lack of a split of G&C between capital maintenance and enhancement expenditure, we have also modelled CAPEX on a gross basis. The statistical performance of the models is broadly consistent with and without G&C.

Model 15 was proposed also by South Staffs Water, which made the comment:

The water resources models are not as robust as aggregate or network plus models and therefore the range of modelled costs compared to actual costs is wider. For this reason, we think that it would be inappropriate to set the water resources price control on the basis of a water resources model alone; the price control should take a wider set of information into account.

Consultation model ID | YKYWR1 | YKYSSCWR2 | YKYWR3 | YKYWR4
Company’s model ID | 14 | 15 | 16 | 17
Dependent variable | ln (resources BOTEX per thousand connected properties); ln (resources BOTEX per thousand population)
Sources over DI (number / (Ml/d)) | 0.646*** (0.003) | 0.535** (0.025) | 0.740*** (0.001) | 0.708*** (0.008)
Reservoir capacity (Ml) (log) | 0.0400 (0.17) | 0.049* (0.08) | 0.0440* (0.095) | 0.046* (0.083)
% DI from reservoirs | 0.731** (0.043) |  | 0.749** (0.04) |
% DI from boreholes |  | -0.652* (0.052) |  | -0.734** (0.03)
% DI from rivers |  | -0.858** (0.039) |  | -0.793* (0.072)
Constant | -5.428*** (0.000) | -4.775*** (0.000) | -6.296*** (0.000) | -5.563*** (0.000)
R2 adjusted | 0.382 | 0.389 | 0.404 | 0.403
Reset test | 0.069 | 0.038 | 0.201 | 0.172
VIF (max) | 1.499 | 2.728 | 1.499 | 2.728
Method | OLS | OLS | OLS | OLS
N (sample size) | 102 | 102 | 102 | 102

Template 5. Water resources models proposed by Bristol Water

Description of dependent variable

The dependent variable is Botex per connected property.

Botex = (total opex – business rates – third party costs) + capital maintenance

Comments on models (Bristol Water)

The models presented in this pro forma are based on the Master Wholesale Cost data file dated 27th February 2017, reflecting the latest updates and amendments to the data.

Capital maintenance costs have been smoothed on a three year rolling-average basis, therefore four years of data have been modelled (2014-2017). Botex costs have been calculated on a unit cost basis by dividing cost information by the sum of Total non-household connected properties at year end and Total household connected properties at year end also from the six-year wholesale cost data set.

A full description of the work undertaken to arrive at these models is set out in a report by NERA: ‘Comparative Benchmarking Assessment to Support Preparation of Bristol Water’s AMP7 Business Plan’ (December 2017).
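
The effect of the smoothing described above on the sample can be sketched as follows. This is an illustration only, with a hypothetical function name and made-up cost figures; it shows why a three-year rolling average over a six-year data set leaves four modelled years (2014-2017).

```python
def rolling_mean(values, window=3):
    """Trailing rolling average; the first window - 1 observations are dropped."""
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]


years = [2012, 2013, 2014, 2015, 2016, 2017]
# Hypothetical capital maintenance costs for one company (illustrative only).
capital_maintenance = [10.0, 12.0, 8.0, 11.0, 13.0, 9.0]

smoothed = rolling_mean(capital_maintenance, window=3)
smoothed_years = years[2:]  # each smoothed value is labelled by its final year
```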

Consultation model ID BRLWR1 BRLWR2 BRLWR3

Company’s model ID 4 5 6

Dependent variable Ln(water resource botex per property)

Ln(length of raw mains and conveyors/DI)

0.134* (0.053)

0.109* (0.078)

Year15 dummy -0.039* (0.084)

-0.039 (0.108)

-0.038 (0.107)

Year16 dummy -0.029 (0.569)

-0.035 (0.498)

-0.034 (0.492)

Year17 dummy -0.047 (0.448)

-0.063 (0.305)

-0.063 (0.297)

% of water from reservoirs 0.680** (0.026)

0.673** (0.027)

0.624*** (0.001)

Ln(number of sources/DI) 0.167** (0.025)

-0.034 (0.601)

0.0007 (0.993)

% of water from boreholes -0.224 (0.383)

0.119 (0.657)

Ln(average pumping head resources)

0.180* (0.067)

0.171 (0.113)

Constant -4.006*** -5.013*** -4.872***

R2 adjusted 0.45 0.55 0.54

Reset test 0.43 0.44 0.45

VIF (max) 2.21 3.11 1.51

Method OLS OLS OLS

N (sample size) 68 68 68

1.2 Water treatment models

Template 6. Water treatment models proposed by Ofwat

Description of dependent variable

Water treatment base costs excluding items described in section 3 of the main consultation document.

Description of selected explanatory variables

Treated water = total water treated at all ground and surface water works

% boreholes = the percentage of distribution input (DI) coming from boreholes, artificial recharge and aquifer storage and recovery water supply schemes

% of water treated in WTW levels 3-6 = the percentage of water treated in water treatment works with complexity levels 3 to 6

All monetary values have been inflated to 2016-17 prices using the CPIH

Comments on models

We considered connected customers and total treated water as scale variables in our water treatment models.

For treatment complexity we used one of two factors: the percent of distribution input from boreholes, which is typically cheaper to treat relative to surface sources, or the proportion of water treated in works of complexity 3 to 6. We considered that treatment works levels 3-6 provided a better representation of the more complex works than treatment works levels 4-6. Although level 3 does include traditional treatment methods, there are significant three-stage treatment works that would fall into this category, and the boundary between levels 2 and 3 represents a clearer divide between ‘basic’ and ‘complex’ than the boundary between levels 3 and 4.

In some models we control for pumping costs. In some we control for economies of scale at the treatment works level by including a density variable. All coefficients are robust and meet expectations.

Consultation model ID OWT1 OWT2 OWT3 OWT4 OWT5 OWT6 OWT7 OWT8 OWT9 OWT10

Dependent variable ------------------ ln (water treatment base costs) ------------------

ln (connected properties) .947*** (0.000)

.941*** (0.000)

.949*** (0.000)

.985*** (0.000)

.972*** (0.000)

ln (total water treated) .923*** (0.000)

.913*** (0.000)

.919*** (0.000)

.968*** (0.000)

.962*** (0.000)

% of DI coming from boreholes

-.006* (0.065)

-0.005 (0.109)

-.004** (0.039)

-0.003 (0.104)

-.004** (0.050)

-0.003 (0.151)

% of water treated in WTW levels 3-6

.008*** (0.008)

.006** (0.014)

.008*** (0.000)

.007*** (0.000)

ln (average pumping head for water treatment)

.217*** (0.004)

.200*** (0.004)

.200*** (0.007)

.187*** (0.006)

.156*** (0.009)

.129** (0.013)

.188** (0.019)

.156** (0.030)

ln (weighted average density)

-.157** (0.015)

-.203*** (0.003)

-0.117 (0.142)

-.173** (0.033)

Constant 4.47*** 11.7*** 3.98*** 11.2*** 3.08*** 10.6*** 3.8*** 11.8*** 4.5*** 12.2***

R2 adjusted 0.862 0.865 0.907 0.903 0.91 0.904 0.92 0.922 0.912 0.915

VIF (max) 1.061 1.081 1.101 1.119 1.127 1.142 1.276 1.286 1.264 1.304

Reset test 0.769 0.885 0.076 0.174 0.551 0.294 0.007 0 0.016 0.024

Method OLS OLS OLS OLS OLS OLS OLS OLS OLS OLS

N (sample size) 107 107 107 107 107 107 107 107 107 107
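
The percentage variables in these models enter in levels while costs are in logs, so their coefficients can be read as approximate proportional effects. The sketch below is an illustration under the assumption that the percentage variables are measured on a 0 to 100 scale (which the size of the coefficients suggests but which is not stated above). The function name is hypothetical.

```python
import math


def percent_change_in_cost(coefficient, change_in_percentage_points):
    """Implied % change in cost in a log-linear model when a percentage
    variable changes by the given number of percentage points."""
    return (math.exp(coefficient * change_in_percentage_points) - 1.0) * 100.0


# A coefficient of 0.008 on the % of water treated in WTW levels 3-6:
# effect of a 10 percentage point increase in that share.
complexity_effect = percent_change_in_cost(0.008, 10)

# A coefficient of -0.006 on the % of DI coming from boreholes:
# effect of a 10 percentage point increase in that share.
borehole_effect = percent_change_in_cost(-0.006, 10)
```

Under these assumptions the first effect is a cost increase of roughly 8% and the second a cost reduction of roughly 6%.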

1.3 Water resources plus

Template 7. Water resources plus models proposed by Ofwat

Description of dependent variables

Water resources plus is composed of water resources, raw water distribution and water treatment.

We excluded abstraction charges and cost items described in section 3 of the main consultation document.

All monetary values have been inflated to 2016-17 prices using the CPIH.

Comments on models

We use connected properties as a scale variable. This scale variable has advantages in more aggregate models as it is completely exogenous and captures more dimensions than other scale variables. It captures both the volume of water and the size of the network. As such, it is a sort of composite variable of water volume and network length.

Model 2 shows a lower effect of properties than that of model 1. This is due to the inclusion of the number of sources, which also accounts for the effect of scale.

We use the same variables described above for water treatment with the addition of average pumping head for water resources plus.

Similarly to water treatment models, the effect of the percent of water treated at complexity levels 3-6 is higher than that of the percent of water coming from boreholes. The weighted average density has a similar effect on costs to that observed in water treatment.

Consultation model ID OWRP1 OWRP2 OWRP3 OWRP4 OWRP5 OWRP6 OWRP7 OWRP8

Dependent variable ---------------- ln (water resources plus base costs) ----------------

ln (connected properties)

0.996*** (0.000)

0.639*** (0.000)

1.002*** (0.000)

0.991*** (0.000)

1.020*** (0.000)

1.017*** (0.000)

1.024*** (0.000)

1.023*** (0.000)

% of water treated in water treatments in complexity levels 3-6

0.009*** (0.001)

0.007** (0.048)

0.008*** (0.007)

% of DI from boreholes

-.009*** (0.001)

-.009*** (0.001)

-0.005** (0.016)

-0.004** (0.026)

-0.004** (0.023)

ln (weighted average density)

-0.182** (0.022)

-0.152* (0.053)

-0.124* (0.100)

-0.067 (0.377)

ln (distribution input per source)

-.350*** (0.000)

ln (number of sources)

0.356*** (0.001)

ln (average pumping head for water resources plus)

0.297* (0.063)

0.332** (0.015)

0.179 (0.233)

0.278* (0.069)

Constant 4.989*** (0.000) 7.682*** (0.000) 4.353*** (0.000) 5.266*** (0.000) 1.788** (0.025) 2.433*** (0.010) 3.024*** (0.005) 3.042*** (0.009)

R2 adjusted 0.935 0.935 0.936 0.925 0.934 0.935 0.94 0.937

VIF (max) 1.819 5.163 1.113 1.172 1.284 1.231 1.838 1.576

Reset test 0.02 0.119 0.005 0.65 0.782 0.011 0.051 0.022

Estimation method OLS OLS OLS OLS OLS OLS OLS OLS

N (sample size) 107 107 107 107 107 107 107 107

Template 8. Water resources plus models proposed by Wessex Water

Description of dependent variable

Water treatment and resources botex smoothed = Opex + IRE + average MNI over period – third party costs – local authority rates – abstraction charges

Comments on models (Wessex Water)

Variation 1 (v1) models are the exogenous variations of our water treatment & resource models. Variation 2 (v2) models are the endogenous variations of our water treatment & resource models.

All models below provide very similar results with unsmoothed expenditure.

Consultation model ID WSXWRP1 WSXWRP2 WSXWRP3 WSXWRP4

Company’s model ID 2v1 2v2 4v1 4v2

Dependent variable Ln(Smoothed WT&R Botex) Ln(Smoothed unit WT&R Botex per DI)

Distribution Input 0.965*** (0.000)

0.994*** (0.000)

Measure of highly dense areas (% area with >6000 people per sqkm)

-0.545** (0.017)

-0.565** (0.013)

Proportion of DI from groundwater sources

-0.296* (0.099)

-0.255 (0.165)

Average Source Size -0.167 (0.587)

-0.179 (0.585)

Proportion of water treated W4+

0.016 (0.211)

0.016 (0.219)

Average pumping head 0.261*** (0.008)

0.411*** (0.004)

0.280*** (0.008)

0.414*** (0.001)

Constant -2.948*** (0.000)

-4.899*** (0.000)

-3.249*** (0.000)

-4.957*** (0.000)

R2 adjusted 0.95 0.94 0.59 0.44

VIF (max) 1.67 12.73 1.67 10.26

Reset test 0.001 0.013 0.000 0.167

Method OLS OLS OLS OLS

N (sample size) 102 102 102 102

1.4 Treated water distribution models

Template 9. Treated water distribution models proposed by Ofwat

Description of dependent variables

Treated water distribution base costs excluding cost items described in section 3 of the main consultation document.

All monetary values have been inflated to 2016-17 prices using the CPIH.

Comments on models

We considered two scale variables in our distribution models: number of properties (models 1-4) and length of mains (models 5-8).

When using length of mains as a scale variable, we have also included a density variable. This is to account for the fact that a company that serves a larger population per km of mains may incur higher distribution costs. As expected, the coefficient of the density variable is positive, albeit quite large. We present the same models with the weighted average density driver, which produces more sensible values for the estimated coefficient. The coefficient also captures the increased cost of working in highly dense/urban areas.

Other cost drivers we included are the number of booster pumping stations, service reservoirs and water towers per length of main, to account for network complexity.

The variables percent of mains length laid after 1981 and percent of mains length refurbished and relined were included as additional drivers of maintenance costs.

Consultation model ID OTWD1 OTWD2 OTWD3 OTWD4 OTWD5 OTWD6 OTWD7 OTWD8

Dependent variable ------------------ ln (treated water distribution base costs) ------------------

ln (connected properties)

1.086*** (0.000)

1.121*** (0.000)

1.122*** (0.000)

1.157*** (0.000)

ln (lengths of main)

1.106*** (0.000)

1.156*** (0.000)

1.080*** (0.000)

1.124*** (0.000)

% of mains length refurbished and relined

0.465*** (0.000)

0.478*** (0.000)

0.475*** (0.000)

0.488*** (0.000)

0.449*** (0.000)

0.483*** (0.000)

0.446*** (0.002)

0.486*** (0.001)

ln (booster pumping stations per lengths of main)

0.308** (0.030)

0.296*** (0.008)

0.394*** (0.000)

0.310** (0.026)

ln (service reservoirs and water towers per lengths of main)

0.242** (0.045)

0.242** (0.015)

0.258** (0.019)

0.144 (0.411)

% of mains lengths laid post 1981

-.012*** (0.005)

-.012*** (0.002)

-.011*** (0.001)

-.012*** (0.000)

-0.013** (0.019)

-.014*** (0.006)

ln (density)

1.286*** (0.000)

1.191*** (0.000)

ln (weighted average density)

0.296*** (0.001)

0.253*** (0.009)

Constant 4.102*** 3.340*** 3.974*** 3.271*** 3.823*** 3.200*** 7.161*** 6.359***

R2 adjusted 0.966 0.964 0.976 0.975 0.977 0.975 0.961 0.96

VIF (max) 1.132 1.164 1.238 1.266 2.386 2.066 2.913 2.307

Reset test 0.632 0.511 0.033 0.079 0.042 0.064 0.001 0

Estimation method OLS OLS OLS OLS OLS OLS OLS OLS

N (sample size) 107 107 107 107 107 107 107 107

Template 10. Treated water distribution models proposed by Thames Water

Description of dependent variable

Water Distribution totex (including enhancement) net of grants and contributions

Description of selected explanatory variables

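\[
\%\,\text{Mains\_320mm\_450mm} = \frac{\text{Potable water mains 320mm--450mm}}{\text{Length of mains}} \times 100\%
\]

\[
\%\,\text{Mains\_450mm\_610mm} = \frac{\text{Potable water mains 450mm--610mm}}{\text{Length of mains}} \times 100\%
\]

\[
\%\,\text{Mains\_320mm\_610mm} = \frac{\text{Potable water mains 320mm--450mm} + \text{Potable water mains 450mm--610mm}}{\text{Length of mains}} \times 100\%
\]

\[
\%\,\text{Mains\_pre1880} = \frac{\text{Total length of mains laid or structurally refurbished pre-1880}}{\text{Length of mains}} \times 100\%
\]

\[
\%\,\text{Mains\_1921\_1940} = \frac{\text{Total length of mains laid or structurally refurbished between 1921 and 1940}}{\text{Length of mains}} \times 100\%
\]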

Comments on models (Thames Water)

We consider that a translog model is more appropriate because:

the F-tests supports the translog and most of the interaction and square terms in the translog are statistically significant

the estimated coefficients of regional wage are more sensible

the other cost drivers (e.g. average pumping head distribution, diameter, age, etc.) become statistically significant, and

there is no evidence of misspecification (see model M4, for example)

The predicted vs. actual cost difference shows a significant improvement when using a translog functional form compared to the Cobb-Douglas (although the improvement in the difference could be in part explained by the inclusion of more explanatory factors, e.g. squares and interactions).
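The difference between the two functional forms is the set of squared and interaction terms in the logged drivers. The sketch below builds those terms for two drivers. It assumes the logs are centred on their sample means, which the comments above do not state, and the function name and input values are hypothetical.

```python
import math


def translog_terms(mains_km, density):
    """Build translog regressors from two drivers, one row per observation.

    Logs are centred on their sample means (an assumption), so the
    first-order coefficients can be read as elasticities at the sample mean.
    """
    ln_m = [math.log(v) for v in mains_km]
    ln_d = [math.log(v) for v in density]
    mean_m = sum(ln_m) / len(ln_m)
    mean_d = sum(ln_d) / len(ln_d)
    rows = []
    for m, d in zip(ln_m, ln_d):
        cm, cd = m - mean_m, d - mean_d
        rows.append({
            "ln_mains": cm,                    # Cobb-Douglas term
            "ln_density": cd,                  # Cobb-Douglas term
            "ln_mains_sq": cm ** 2,            # translog only
            "ln_density_sq": cd ** 2,          # translog only
            "ln_mains_x_ln_density": cm * cd,  # translog only
        })
    return rows


# Hypothetical mains lengths (km) and property densities.
rows = translog_terms([5_000.0, 12_000.0, 31_000.0], [40.0, 55.0, 110.0])
```

A Cobb-Douglas model keeps only the first two columns, so adding the last three is what allows economies of scale and density to vary across companies.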

A negative estimated coefficient on regional wages was found when we used water delivered as a scale variable. We therefore used length of mains as the scale variable.

None of the models shows a statistically significant effect of regional wages. However, including regional wages in the models helps to mitigate the more serious problem of omitted variable bias and avoids any pre-adjustment of costs.

Average pumping head shows a strong and stable significant effect across all specifications. There is also some indication that the diameter and age of the mains are important drivers of distribution costs. By controlling for these factors, the models provide some evidence that important drivers have not been omitted.

The time trend estimate shows an average cost reduction of about 2% per annum across all companies over the period 2011-12 to 2016-17, although it is statistically significant in only some of the models.
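Because the dependent variable is in logs, a coefficient on the time trend converts to an annual percentage change through exp(coefficient) - 1. The sketch below applies that conversion to the five Time coefficients reported in the table for this template, taken in the order they are listed. The function name is hypothetical.

```python
import math


def annual_pct_change(time_coef: float) -> float:
    """Convert a log-linear time trend coefficient to a % change per year."""
    return (math.exp(time_coef) - 1.0) * 100.0


# Time coefficients as listed in the results table (TMSTWD1 to TMSTWD5).
time_coefs = [-0.024, -0.026, -0.022, -0.024, -0.022]
pct_changes = [annual_pct_change(c) for c in time_coefs]
average_pct_change = sum(pct_changes) / len(pct_changes)
```

Every converted value lies between -2% and -3% per year, which is consistent with the reduction of roughly 2% per annum described above.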

Consultation model ID TMSTWD1 TMSTWD2 TMSTWD3 TMSTWD4 TMSTWD5

Company’s model ID 2 3 4 5 6

Dependent variable Ln(Totex Distribution)

Ln(Mains) 0.868** (0.000)

0.934*** (0.000)

0.968*** (0.000)

0.951*** (0.000)

0.968*** (0.000)

Ln(Property Density) 0.632*** (0.005)

0.460* (0.082)

0.572** (0.018)

0.567** (0.031)

0.572** (0.018)

Ln(Mains)_SQ -0.246** (0.022)

-0.192 (0.120)

-0.161 (0.189)

-0.180 (0.133)

-0.161 (0.189)

Ln(Property Density)_SQ 2.967*** (0.009)

2.931*** (0.007)

4.016*** (0.001)

4.238*** (0.001)

4.016*** (0.001)

Ln(Mains)Ln(Density) 0.999*** (0.008)

0.651*** (0.001)

0.725*** (0.003)

0.577*** (0.004)

0.725*** (0.003)

Ln(Regional Wage_water_2soc) 0.571 (0.548)

0.951 (0.378)

0.633 (0.657)

0.571 (0.735)

0.633 (0.657)

Ln(average pumping head distribution) 0.228*** (0.007)

0.154** (0.019)

0.143** (0.032)

0.159** (0.018)

0.144** (0.032)

Time -0.024** (0.038)

-0.026** (0.039)

-0.022 (0.212)

-0.024 (0.257)

-0.022 (0.212)

% mains_320mm_450mm 0.432* (0.079)

% mains_450mm_610mm 0.197 (0.153)

0.268* (0.092)

0.133 (0.395)

0.267* (0.092)

% mains320_610mm

% mains_pre_1880 -0.036 (0.193)

-0.019 (0.498)

-0.036 (0.193)

% mains_1921_1940 0.098 (0.343)

0.132 (0.309)

0.098 (0.343)

Constant 4.686*** (0.000)

4.923*** (0.000)

4.693*** (0.000)

4.813*** (0.000)

4.693*** (0.000)

R2 adjusted 0.966 0.964 0.966 0.969 0.966

Reset test 0.002 0.001 0.003 0.201

VIF (max) 3.58 4.91 4.56 5.54

Method OLS OLS OLS RE OLS

N (sample size) 106 106 106 106 106

Template 11. Treated water distribution models proposed by Wessex Water

Description of dependent variable

Water treatment and resources botex = Opex + IRE + average MNI over period – third party costs – local authority rates – abstraction charges

Comments on models (Wessex Water)

The main cost driver is the number of connected properties. The main issue we faced was how to model density. Our preferred approach was to use the number of service reservoirs normalised by property numbers as a measure of density. We submitted aggregate and unit cost models based on the numbers of properties supplied, the normalised number of service reservoirs and average pumping head.

All models below provide very similar results with unsmoothed expenditure.

Consultation model ID WSXTWD1 WSXTWD2

Company’s model ID 2 4

Dependent variable Ln(WD botex smoothed) Ln(WD botex per property smoothed)

Connected Properties 1.087*** (0.000)

Service Reservoirs / 100k properties -3.015*** (0.000)

-2.930*** (0.000)

Service Reservoirs / 100k properties^2 0.557*** (0.000)

0.533*** (0.000)

Average pumping head 0.150* (0.052)

0.190 (0.146)

Constant -0.090 (0.979)

0.623 (0.487)

R2 adjusted 0.97 0.44

VIF (max) 111.25 110.25

Reset test 0.485 0.037

Method OLS OLS

N (sample size) 102 102

1.5 Network plus water models

Template 12. Network plus water models proposed by Ofwat

Description of dependent variable

Network plus water base costs excluding cost items described in section 3 of the main consultation document.

All monetary values have been inflated to 2016-17 prices using the CPIH.
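Restating costs in a common price base means scaling each year's outturn cost by the ratio of the base-year index to the index for the year in which the cost was incurred. The sketch below illustrates the arithmetic. The index values are hypothetical placeholders, not published CPIH figures, and the function name is illustrative.

```python
def to_base_year_prices(nominal_cost: float,
                        index_in_cost_year: float,
                        index_in_base_year: float) -> float:
    """Restate a nominal cost in base-year prices using a price index."""
    if index_in_cost_year <= 0:
        raise ValueError("price index must be positive")
    return nominal_cost * index_in_base_year / index_in_cost_year


# Hypothetical index values, for illustration only (not actual CPIH data).
price_index = {"2014-15": 99.0, "2016-17": 101.5}

cost_2014_15 = 200.0  # hypothetical nominal cost
cost_in_2016_17_prices = to_base_year_prices(
    cost_2014_15, price_index["2014-15"], price_index["2016-17"]
)
```

A cost already incurred in the base year is left unchanged, because the ratio of the two index values is then 1.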

Comments on models

We considered two scale variables in our network plus models: connected properties (models 1-4) and length of mains (models 5-8).

When using length of mains as a scale variable, we have also included a density variable. This is to account for the fact that a company that serves a larger population per km of mains may incur higher distribution costs. As expected, the coefficient of the density variable is positive, albeit quite large. We present the same models with the weighted average density driver, which produces more sensible values for the estimated coefficient. The coefficient also captures increased cost of working in highly dense/urban areas.

The rationale for all explanatory variables in our network plus models can be found in our comments on the water treatment and treated water distribution models.

Consultation model ID ONPW1 ONPW2 ONPW3 ONPW4 ONPW5 ONPW6 ONPW7 ONPW8

Dependent variable ------------------ Ln(network plus water base costs) ------------------

ln (connected properties)

1.022*** (0.000)

1.044*** (0.000)

1.064*** (0.000)

1.084*** (0.000)

ln (lengths of main) 1.046*** 1.037*** 1.044*** 1.062***

(0.000) (0.000) (0.000) (0.000)

% of mains length refurbished and relined

0.16 (0.125)

0.163 (0.114)

0.264*** (0.006)

0.265*** (0.006)

0.166 (0.145)

0.232** (0.038)

0.172 (0.166)

0.257** (0.045)

ln (booster pumping stations per lengths of main)

0.278 (0.110)

0.256** (0.037)

0.246* (0.081)

0.416*** (0.007)

ln (service reservoirs and water towers per lengths of main)

0.335** (0.015)

0.337*** (0.009)

0.036 (0.836)

0.237 (0.153)

% of mains length laid or refurbished after 1981

-0.008* (0.065)

-0.006 (0.158)

-0.008* (0.064)

-0.006 (0.123)

-0.010** (0.047)

-0.009 (0.108)

ln (average pumping head for water treatment)

0.084* (0.069)

0.090*** (0.008)

0.091*** (0.008)

0.111*** (0.004)

% of water treated in water treatments in complexity levels 3-6

0.004* (0.050)

0.003* (0.057)

0.002 (0.201)

0.002 (0.303)

ln (density) 1.028*** (0.000)

1.064*** (0.000)

ln (weighted average density)

0.221*** (0.007)

0.229** (0.014)

Constant 5.266*** 5.114*** 4.780*** 4.774*** 5.124*** 5.736*** 7.074*** 7.733***

(0.000) (0.000) (0.000) (0.000) (0.000) (0.000) (0.000) (0.000)

R2 adjusted 0.971 0.975 0.967 0.97 0.975 0.971 0.967 0.96

VIF (max) 1.532 1.543 1.189 1.306 3.159 2.557 2.883 2.337

Reset test 0.057 0.047 0.355 0.08 0.024 0.013 0 0

Estimation method OLS OLS OLS OLS OLS OLS OLS OLS

N (sample size) 107 107 107 107 107 107 107 107

Template 13. Network plus water models proposed by Anglian Water

Description of dependent variable

Water Network plus base costs excluding local authority rates

Acronyms used in explanatory variables

APH = average pumping head

DI = distribution input

Comments on models (Anglian Water)

The models follow the form developed by the CMA for the 2015 Bristol Determination. The same approach as taken by the CMA was followed in choosing which models to report.

All models are described in detail in our Cost Modelling report – Phase 2, published in March 2018.

Consultation model ID ANHNPW1 ANHNPW2 ANHNPW3 ANHNPW4 ANHNPW5

Company’s model ID 1 2 3 4 5

Dependent variable Water N+ botex: Log Unit Cost Water N+ botex: Log Aggregate

Ln(Water Delivered /Properties)

0.642* (0.064)

0.632*** (0.000)

0.800** (0.017)

0.778 ** (0.027)

0.637 *** (0.000)

Ln(Aggregate length of potable mains)

0.433*** (0.000)

0.456*** (0.000)

-0.482*** (0.000)

-0.603*** (0.000)

-0.573*** (0.000)

Ln(Regional wages) 0.345 (0.563)

0.284 (0.582)

0.914 (0.141)

0.257 (0.663)

0.302 (0.557)

Ln(Aggregate length of potable mains)

1.049*** (0.000)

1.049*** (0.000)

1.030*** (0.000)

% DI from rivers 0.028 (0.777)

% DI from reservoirs 0.307*** (0.001)

Ln(Average Pumping Head) -0.018 (0.801)

0.021 (0.77)

0.107 (0.209)

0.076 (0.381)

0.075 (0.365)

% Water delivered to Non Household customers

-0.778 (0.137)

-0.730* (0.065)

-0.388 (0.475)

-0.572 (0.278)

-0.515 (0.23)

% DI treated using multiple treatment approaches

0.304*** (0.003)

0.285*** (0.004)

0.255** (0.015)

0.258*** (0.01)

Time dummy 1st year -0.231** (0.014)

-0.115* (0.084)

-0.277*** (0.003)

-0.233* (0.012)

-0.116* (0.083)

Time dummy 2nd year -0.127 (0.115)

-0.091 (0.14)

-0.176** (0.025)

-0.130 (0.102)

-0.090 (0.143)

Time dummy 3rd year -0.064 (0.424)

-0.048 (0.43)

-0.094 (0.225)

-0.060 (0.447)

-0.048 (0.431)

Time dummy 4th year -0.110 (0.136)

-0.038 (0.517)

-0.127* (0.074)

-0.109 (0.136)

-0.037 (0.529)

Time dummy 5th year -0.124* (0.095)

-0.135* (0.059)

-0.122 * (0.095)

Time dummy 6th year -0.168** (0.022)

-0.163** (0.020)

-0.165** (0.022)

Constant 3.179 (0.105)

3.050* (0.065)

-6.364*** (0.002)

-4.249** (0.031)

-4.418*** (0.01)

R2 adjusted 0.203 0.433 0.963 0.960 0.972

Reset test 0.012 0.002 .005 0.000 0.000

VIF (max) 6.24 3.78 7.29 6.27 3.78

Method OLS OLS OLS OLS OLS

N (sample size) 125 89 125 125 89

Template 14. Network plus water models proposed by Southern Water

Description of dependent variable

Network plus base costs = modelled OPEX plus modelled base CAPEX.

Modelled OPEX is total OPEX less third party services, abstraction charges and local authority rates.

Modelled base CAPEX is maintenance expenditure in infrastructure and non-infrastructure less grants and contributions.

All costs are unsmoothed and deflated to 2016/17 prices using CPIH.

Comments on models (Southern Water)

Given that Network+ costs are a large proportion of wholesale costs, the models are similar in terms of variable selection:

Model 4 controls for length of mains as a scale driver, whilst models 5-6 use connected properties

Model 4 uses a simple (linear) measure of density, whilst models 5-6 use a translog relationship.

Models 4-5 use proportion of mains laid before 1980 and proportion of mains renewed/relined as maintenance drivers, whereas model 6 uses the proportion of mains renewed/relined only.

Consultation model ID SRNNPW1 SRNNPW2 SRNNPW3

Company’s model ID 4 5 6

Dependent variable Network+ BOTEX (log)

Connected properties (‘000s) (log) 1.053*** (0.000)

1.039*** (0.000)

Length of mains (km) (log) 1.094*** (0)

Properties over mains (‘000s / km) (log)

0.515** (0.044)

Properties over area (‘000s / km2) (log, demeaned)

-0.187* (0.05)

-0.173* (0.059)

Properties over area (‘000s / km2) (log, demeaned) squared

0.306*** (0.006)

0.362*** (0.005)

Proportion of water treated at complexity band 4 and above (%)

0.356** (0.028)

0.394* (0.058)

0.404* (0.091)

Proportion of mains renewed/relined (%)

28.83** (0.036)

24.34* (0.087)

22.72* (0.096)

Proportion of mains laid before 1980 (%)

1.005* (0.059)

0.570 (0.257)

Year 2016 dummy -0.0776** (0.012)

-0.0854*** (0.007)

Constant -5.502*** (0.000)

-3.509*** (0.000)

-3.067*** (0.000)

R2 adjusted 0.956 0.958 0.956

RESET Test 0.112 0.638 0.286

VIF (max) 1.232 1.276 1.267

Method OLS OLS OLS

N (sample size) 102 102 102

Template 15. Network plus water models proposed by Severn Trent Water

Description of dependent variable

All models relate to water network plus base costs: operating expenditure (less third party costs and council tax) + maintenance expenditure (infra and non-infra) gross of grants and contributions.

Description of selected explanatory variables

Length Length of potable and raw water distribution mains

Density(weighted) Ofwat’s weighted average density from the "Constructed data" folder on sharepoint

Density(Props. per km squared)

Number of properties per km squared. The area data used in the denominator is from Ofwat’s Masterfile with the exception of Southern Water, whose data in that file represents the waste boundaries. We have replaced this with data from our own GIS database.

WTW Water treatment works

GW/SW ratio The ratio of the number of ground to surface water works

Relined&renewed (km)

Length of mains relined or renewed

Prop. bands 4-6 Proportion of water treated in sites of categories 4-6

Comments on models (Severn Trent Water)

Model 1 is a Cobb-Douglas specification. A priori, we would expect small positive coefficients on the pumping head, mains repair, and treatment complexity variables. Our expectations have been met in this model. Given that ground water is usually significantly cheaper to treat than surface water, we expect (and see) a small negative coefficient on the variable measuring the ratio of ground to surface water sites. An increase in the scale of operations at a company (production and maintenance), while keeping the average capacity of treatment works unchanged, should result in a broadly equal proportionate change in costs. Therefore, we would also expect the sum of the coefficients on the length, no. of works and mains repair variables to be around 1 (assuming the average company in the industry is operating at constant returns to scale). This is indeed the case in this model.

Model 2 extends model 1 by adding non-linear terms in the length and density variables in order to allow economies of scale/density to vary with firm characteristics. All other variables are retained. While the pumping head and number of works variables are insignificant, they are correctly signed and theoretically important and so they are retained in the model to try to account for energy costs and economies of scale at the asset level. The positive and significant non-linear density term indicates diseconomies of density, which might be expected given that treated water distribution is the single largest element of network plus expenditure and costs tend to increase with density in this area. In the presence of constant returns to scale at the industry average, we would expect the coefficients on the length, no. of works and mains repair (relined/renewed) variables to sum to around 1. These expectations are met almost exactly in the model.

Model 3 extends model 1 by adding non-linear terms in the length and no. of works variables. Previous Severn Trent research suggests an increase in cost elasticity as capacity declines/no. of works increases. However, given that a non-linear term on the "no. of works" variable in a model would be likely to capture both asset level and firm level economies of scale, we chose to re-scale it and express it as the number of works per property. This allows it to more clearly represent differences in asset capacity. The positive coefficient on this variable is as expected. Re-scaling the WTW coefficient also changes our expectations for the magnitude of the length coefficient; this and the mains repair variable are now expected to sum to around 1 in the presence of constant returns to scale. The sum of these scale features comes in only slightly higher than 1.

Model 4 presents a Cobb-Douglas form model with distribution input as the main scale driver. All of the other variables that have been discussed are also included and each coefficient conforms to expectations. However, due to the greater correlation between Ofwat’s new weighted density measure and the distribution input variable, the standard errors are somewhat inflated and more variables are statistically insignificant. The sum of the three scale related variables comes to 1.06. This is close to our expectations.

Model 5 extends model 4 by adding non-linear terms in the distribution input and no. of treatment works variables. For the same reasons as in model 3, we re-scale the treatment works variable and express it as the number of works per property. These non-linear terms add little to the model: both are highly insignificant, and the treatment works variable has a sign we would not expect in the OLS version of the model. However, each of the other coefficients remains of a logical sign and magnitude. The sum of the coefficients on the distribution input and mains repair (relined/renewed) variables comes to 1.08.

Model 6 above builds on Ofwat’s PR14 approach, using a similar density measure, but adds some extra logical variables and addresses water treatment cost drivers differently. Our prior expectations are for small positive coefficients on the pumping head, treatment (prop. bands 4-6), relined/renewed and no. of works variables. These expectations have been met, with most of these coefficients statistically significant; only the average pumping head variable is insignificant, but given the small sample size we place more weight on its magnitude than on its insignificance. We would also expect the coefficients on the length, no. of works and relined/renewed variables to sum to around 1 in the presence of constant returns to scale. The condition is almost met, with these coefficients summing to 1.06.

Model 7 presents a simpler Cobb-Douglas style version of model 6. The coefficients on each individual variable are very close to expectations, but some of the variables are insignificant. However, we take confidence from the fact that these coefficients are appropriately signed and of a logical magnitude. Given their theoretical importance, they are retained in the model.

As with the other models, the sum of the coefficients on the length, no. of works and mains repair variables comes to around 1 (1.1). This is in line with our expectations.
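The returns-to-scale check described in these comments is a sum of the coefficients on the scale-related variables. The sketch below reproduces it for three of the models. The coefficients are read from the results table for this template, but the table is flattened in this transcript, so the assignment of values to models is an assumption, chosen because it reproduces the sums quoted in the comments. The dictionary keys are hypothetical labels.

```python
# Coefficients on the scale-related variables. The mapping of table values
# to models is an assumption inferred from the sums quoted in the comments.
scale_coefs = {
    "SVTNPW4": {"ln_distribution_input": 0.72,
                "ln_number_of_wtw": 0.25,
                "ln_relined_renewed": 0.09},
    "SVTNPW5": {"ln_distribution_input": 0.98,
                "ln_relined_renewed": 0.10},
    "SVTNPW6": {"ln_length": 0.85,
                "ln_number_of_wtw": 0.10,
                "ln_relined_renewed": 0.11},
}


def scale_elasticity(coefs: dict) -> float:
    """Sum the scale-related coefficients; about 1 means constant returns."""
    return sum(coefs.values())


sums = {model: round(scale_elasticity(c), 2) for model, c in scale_coefs.items()}
```

Under this reading the sums are 1.06, 1.08 and 1.06, matching the figures quoted for models 4, 5 and 6.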

Consultation model ID SVTNPW1 SVTNPW2 SVTNPW3 SVTNPW4 SVTNPW5 SVTNPW6 SVTNPW7

Company’s model ID 1 2 3 4 5 6 7

Dependent variable ---------------------- Ln(Water N+ botex) ----------------------

Ln(Length) .57*** (.001)

.86*** (0.000)

.99*** (.000)

.85*** (.00)

.80*** (.00)

Ln(Distribution Input) .72*** (.00)

.98*** (.000)

Ln(Dist. Input)^2 .01 (.74)

Ln(number of WTW) .37** (.029)

.07 (.48)

.25* (.06)

.25** (.032)

.1** (.036)

.19* (.052)

Ln(WTW per property) .27** (.014)

Ln(WTW per prop)^2 .19* (.09)

-.02 (.85)

Prop. bands 4-6 .31* (.059)

.39** (.02)

.16 (.25)

.12 (.43)

GW/SW works ratio -.03** (.017)

-.02** (.045)

-.02* (.08)

-.03*** (.00)

-.03*** (.002)

-.02*** (.00)

-.03*** (.00)

Ln(Average pumping head)

.05 (.19)

.1 (.41)

.12 (.47)

.16 (.3)

.17 (.26)

.13 (.14)

.09 (.55)

Ln(Relined & Renewed) .09** (.04)

.07** (.029)

.07 (.12)

.09** (.017)

.1*** (.007)

.11*** (.00)

.12*** (.00)

Ln(Density) .31* (.09)

.39*** (.00)

.33 (.01)

.08 (.31)

.08 (.25)

Ln(Density – props per km^2)

.59*** (.00)

.47*** (.00)

Ln(Length)^2 -.05 (.12)

.02 (.63)

-.03 (.12)

Ln(Density)^2 .19*** (.00)

.37*** (.00)

Ln(Density) X Ln(Length) .12*** (.009)

Prop. bands 4-6 .26* (.056)

.24** (.014)

.12 (.47)

Dummy 2012 -.05** (.04)

-.03 (.55)

-.04 (.42)

-.09* (.057)

-.11** (.027)

-.06 (.194)

-.08 (.13)

Dummy 2013 -.02** (.05)

-.01 (.91)

-.02 (.79)

-.04 (.48)

-.05 (.42)

-.03 (.52)

-.05 (.33)

Dummy 2014 -.08** (.04)

-.07 (.13)

-.07* (.1)

-.1** (.015)

-.11** (.011)

-.08** (.046)

-.1** (.03)

Dummy 2015 -.1** (.039)

-.09* (.054)

-.1** (.05)

-.1** (.02)

-.12** (.021)

-.09** (.049)

-.1** (.04)

Dummy 2016 -.13** (.02)

-.13*** (.00)

-.14*** (.00)

-.14*** (.00)

-.14*** (.00)

-.14*** (.00)

-.13*** (.00)

Constant 4.93*** 4.91*** 5.16*** 5.25*** 5.02*** 5.22***

R2 adjusted .97 .98 .96 .97 .97 .98 .97

Reset test 0 .41 0.17 .009 0.41 0.00 0.01 0.23

VIF(max) 16.6 24 6.3 19.6 5.7 15.2 11.3

Method OLS OLS OLS OLS OLS OLS OLS

N (sample size) 107 107 107 107 107 107 107

Template 16. Network plus water models proposed by South West Water

Description of dependent variable

Modelled OPEX = network+ OPEX – network+ third party – network+ abstraction charges – network+ local authority rates

Modelled base CAPEX = network+ maintenance infra + network+ maintenance non-infra – network+ grants and contributions

Modelled BOTEX = Modelled OPEX + Modelled base CAPEX

Modelled BOTEX+ (growth) = modelled BOTEX + network+ additions to the supply and demand balance + network+ new developments and growth + network+ metering expenditure + network+ resilience

Modelled TOTEX = modelled BOTEX + network+ other capital expenditure infra + network+ other capital expenditure non-infra + network+ infrastructure network reinforcement

Unsmoothed net costs from 2011/12 to 2016/17
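The cost aggregates defined above nest inside one another, which the following sketch makes explicit. The class, its field names and the figures are hypothetical; only the arithmetic follows the definitions as written.

```python
from dataclasses import dataclass


@dataclass
class NetworkPlusCosts:
    """One company-year of network+ cost lines (field names are hypothetical)."""
    opex: float
    third_party: float
    abstraction_charges: float
    local_authority_rates: float
    maintenance_infra: float
    maintenance_non_infra: float
    grants_and_contributions: float
    supply_demand_balance: float
    new_developments_and_growth: float
    metering: float
    resilience: float
    other_capex_infra: float
    other_capex_non_infra: float
    network_reinforcement: float

    def modelled_opex(self) -> float:
        return (self.opex - self.third_party
                - self.abstraction_charges - self.local_authority_rates)

    def modelled_base_capex(self) -> float:
        return (self.maintenance_infra + self.maintenance_non_infra
                - self.grants_and_contributions)

    def modelled_botex(self) -> float:
        return self.modelled_opex() + self.modelled_base_capex()

    def modelled_botex_plus_growth(self) -> float:
        return (self.modelled_botex() + self.supply_demand_balance
                + self.new_developments_and_growth + self.metering
                + self.resilience)

    def modelled_totex(self) -> float:
        # As written in the definition above: BOTEX plus other capital
        # expenditure (infra and non-infra) and network reinforcement.
        return (self.modelled_botex() + self.other_capex_infra
                + self.other_capex_non_infra + self.network_reinforcement)


# Hypothetical figures, for illustration only.
example = NetworkPlusCosts(
    opex=100.0, third_party=5.0, abstraction_charges=3.0,
    local_authority_rates=7.0, maintenance_infra=30.0,
    maintenance_non_infra=20.0, grants_and_contributions=4.0,
    supply_demand_balance=2.0, new_developments_and_growth=6.0,
    metering=1.5, resilience=0.5, other_capex_infra=3.0,
    other_capex_non_infra=2.0, network_reinforcement=1.0,
)
```

With these illustrative inputs, modelled OPEX is 85, modelled base CAPEX is 46 and modelled BOTEX is 131. BOTEX+ (growth) and TOTEX then each add their own enhancement lines to BOTEX.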

Comments on models (South West Water)

We have adopted the same approach to modelling network plus wholesale water costs as for aggregate BOTEX, as there were no resources-specific drivers in our models. We have not, at this stage, examined the appropriateness of different estimation approaches. We do note, however, that some models seem more robust than others and clearly this will have implications for identifying relative efficiency.

See our aggregate wholesale water BOTEX submission for a more detailed review of the drivers considered, which were:

Scale (properties)

Density/sparsity (mains per property)

Source type

Maintenance

Each of the network+ water models we have developed captures each of these key cost drivers for companies. Our network+ water models differ in the way in which density and sparsity are captured.

To explore the impact of density and sparsity on water costs we considered both a trans-log specification using mains over properties, in addition to considering a simpler model with a single log-linear term. Models 1, 3 and 5 model a trans-log ‘u shape’ relationship between cost and population density/sparsity. Models 2, 4, and 6 use only the log of mains over connected properties, capturing only the impact of sparsity.
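With a linear and a squared term in the log of mains per property, the elasticity of cost with respect to sparsity varies with sparsity itself, and the bottom of the 'u shape' sits where that elasticity is zero. The sketch below assumes the squared term enters as the square of the demeaned log, which is one reading of "squared and demeaned". It uses the first coefficients listed for the two terms in the results table for this template, taken here to belong to SWBNPW1. The function names are hypothetical.

```python
def sparsity_elasticity(b_linear: float, b_squared: float,
                        ln_ratio: float, ln_ratio_mean: float) -> float:
    """Elasticity of cost w.r.t. mains per property at a given log ratio.

    Assumes cost is modelled as b_linear * x + b_squared * (x - mean)^2,
    where x = ln(mains / connected properties).
    """
    return b_linear + 2.0 * b_squared * (ln_ratio - ln_ratio_mean)


def turning_point_offset(b_linear: float, b_squared: float) -> float:
    """Distance of the cost-minimising log ratio from the sample mean."""
    return -b_linear / (2.0 * b_squared)


# First listed coefficients for the linear and squared terms (assumed SWBNPW1).
b_linear, b_squared = 0.379, 1.737
offset = turning_point_offset(b_linear, b_squared)
elasticity_at_mean = sparsity_elasticity(b_linear, b_squared, 0.0, 0.0)
```

Under these assumptions the elasticity at the sample mean equals the linear coefficient, and the turning point lies slightly below the mean log ratio because both coefficients are positive.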

We have extended our aggregate BOTEX modelling to models controlling for BOTEX + growth enhancement and TOTEX (see discussion in wholesale water models). As can be seen from the efficiency range charts, modelling BOTEX+ (growth) or TOTEX does not substantially broaden the efficiency ranges. As for wholesale water models, we would recommend that BOTEX+ (growth) and TOTEX modelling approaches are explored to the fullest possible extent at PR19.

All of the BOTEX models estimate statistically significant coefficients which are supported from an operational and economic perspective. The relationship between cost and cost drivers in BOTEX+ (growth) and TOTEX models is broadly similar to that estimated in BOTEX models, although not all coefficients pass statistical significance tests. As with aggregate modelling, the large coefficient on the proportion of mains relined or renewed is a result of the small values of the underlying data.

While some trans-log models do fail the RESET test, we would note that they lead to narrower efficiency ranges than a log-linear model when modelling BOTEX. Given the operational justification for this specification, we would recommend the exploration of models which control for a ‘u-shape’ relationship between cost and density.

Given our focus on modelling what we consider to be key industry drivers of cost, we have not explored estimation approaches beyond OLS with robust standard errors. We will be considering the most appropriate estimation approaches as part of our consultation response.

All models are broadly robust from a statistical perspective, with the exception of the RESET test for some models.

Adjusted R2 is sufficiently high.

VIF (a measure of collinearity) is well below the ‘rule of thumb’ threshold of 10. Note that for trans-log specifications we demean the squared term to minimise collinearity, with no impact on coefficients or predictions.

We find mixed evidence from the RESET test on whether the model would be improved by the addition of polynomial terms, i.e. given the control variables, whether the model is mis-specified. This is despite including trans-log terms.
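The effect of demeaning on collinearity can be seen in a small numerical sketch. With only two regressors, the VIF equals 1 / (1 - r^2), where r is their correlation. The data below are hypothetical, and the example assumes the variable is demeaned before it is squared.

```python
def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def vif_two_regressors(xs, ys):
    """VIF when the model has exactly two regressors: 1 / (1 - r^2)."""
    r = pearson(xs, ys)
    return 1.0 / (1.0 - r ** 2)


# Hypothetical values of a logged driver, for illustration only.
x = [2.0, 2.3, 2.5, 2.8, 3.1, 3.4]
x_mean = sum(x) / len(x)

raw_sq = [v ** 2 for v in x]                 # squared without demeaning
centred_sq = [(v - x_mean) ** 2 for v in x]  # demeaned, then squared

vif_raw = vif_two_regressors(x, raw_sq)
vif_centred = vif_two_regressors(x, centred_sq)
```

In this illustration the raw squared term is almost perfectly correlated with the level, giving a VIF far above the threshold of 10, while the demeaned version gives a VIF close to 1.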

Consultation model ID SWBNPW1 SWBNPW2 SWBNPW3 SWBNPW4 SWBNPW5 SWBNPW6

Company’s model ID 1 2 3 4 5 6

Dependent variable Network+ BOTEX (ln) Network+ BOTEX+ (growth) (ln) Network+ TOTEX (growth) (ln)

Properties (ln) 1.025*** (0.000)

1.046*** (0.000)

1.046*** (0.000)

1.060*** (0.000)

1.060*** (0.000)

1.078*** (0.000)

Proportion of mains renewed or relined (%)

32.88*** (0.000)

31.81*** (0.000)

24.77*** (0.005)

24.66*** (0.005)

25.53*** (0.006)

25.39*** (0.008)

Mains over connected properties (ln)

0.379*** (0.000)

0.352** (0.015)

0.390*** (0.000)

0.369*** (0.002)

0.512*** (0.000)

0.484*** (0.000)

Mains over connected properties (ln squared and demeaned)

1.737*** (0.000)

1.072*** (0.001)

1.440*** (0.000)

Proportion of treated surface water (%)

0.163** (0.033)

0.146* (0.069)

0.0548 (0.555)

0.0706 (0.441)

0.0221 (0.821)

0.0433 (0.659)

Properties growth (%) 0.272** (0.012)

0.399*** (0.002)

0.290*** (0.008)

0.462*** (0.002)

Constant -3.819*** (0.000)

-3.797*** (0.000)

-3.899*** (0.000)

-3.984*** (0.000)

-4.244*** (0.000)

-4.359*** (0.000)

R2 adjusted 0.956 0.944 0.950 0.946 0.948 0.942

Reset test 0.286 0.767 0.000 0.025 0.001 0.001

VIF(max) 1.312 1.277 1.337 1.331 1.337 1.331

Method OLS OLS OLS OLS OLS OLS

N (sample size) 102 102 102 102 102 102

Template 17. Network plus water models proposed by Thames Water

Description of dependent variable

Network plus totex = opex + capex (maintenance + enhancement) net of grants & contributions

Description of selected explanatory variables

APH Network=Avg_pmphd_R_T_DN=Avg_PMHD Raw+Avg_PMHD Treatment+Avg_PMHD Distribution

% DI from boreholes= Proportion of distribution input derived from boreholes, excluding managed aquifer recharge (MAR) water supply schemes

Comments on models (Thames Water)

All our network plus (raw + treatment + distribution) models are totex unsmoothed.

We have used the functional form found in Water Distribution as a starting point. We are exploring different functional forms (Cobb-Douglas and Translog) in network plus with different scale variables to determine if having flexible economies of scale/density is appropriate at the network plus level.

There is a potential issue with the way companies have allocated average pumping head. As shown in the Water Distribution analysis, average pumping head is an important and statistically significant driver in the cost functions. When we bring this variable into the network plus models (i.e. raw, treatment and distribution), its calculation may be affected by companies misreading the definition of the variable.

This allocation issue may be affecting the performance of average pumping head (in particular in raw and treatment), so that the models yield no statistical evidence that it is an important driver.

Length of mains appears strongly significant, along with its interaction term with property density. Similarly, property density appears to be another relevant driver of costs.

The age of the mains between 1921 and 1940 seems to be quite relevant in terms of its level of significance and the stability of the magnitude estimated across all the models tested.

The coefficient of regional wages is sensible although not statistically significant. Including sensible estimates of regional wages in the models helps mitigate serious issues such as omitted variable bias, and is preferred to pre-adjusting costs.

By using the proportions of distribution input from impounding reservoirs and boreholes as the main representation of water treatment costs at the network plus level, models M3 and M4 provide good evidence of no omitted variable issues. However, these variables are not statistically significant under cluster-robust standard errors.

Consultation model ID TMSNPW1 TMSNPW2 TMSNPW3 TMSNPW4

Company’s model ID M3 M4 M5 M9

Dependent variable -------------- Ln(Totex Water NetworkPlus) --------------

Ln(Length of potable and raw water mains)

1.054*** (0.000)

1.062*** (0.000)

1.069*** (0.000)

1.054*** (0.000)

Ln(Property Density) 0.373 (0.105)

0.419* (0.089)

0.363 (0.122)

0.373*** (0.004)

Ln(Length of potable and raw water mains)^2

-0.023 (0.815)

-0.001 (0.991)

-0.0152 (0.883)

-0.023 (0.626)

Ln(Property Density)^2 1.775 (0.225)

2.138 (0.191)

3.014* (0.051)

1.775** (0.030)

Ln(Length of potable and raw water mains)*Ln(Density)

0.496** (0.020)

0.480** (0.026)

0.426** (0.043)

0.496*** (0.000)

Ln(APH Network) 0.046 (0.727)

0.0004 (0.998)

0.177 (0.177)

0.046 (0.510)

Ln(Regional Wage_water_2soc) 0.699 (0.641)

0.407 (0.819)

0.563 (0.716)

0.699 (0.339)

time -0.020 (0.329)

-0.014 (0.527)

-0.017 (0.416)

-0.020 (0.119)

% mains laid between 1921 and 1940

0.211* (0.057)

0.217* (0.089)

0.287*** (0.004)

0.211*** (0.000)

% DI from impounding reservoirs 0.125 (0.144)

0.1287 (0.172)

0.074 (0.417)

0.125*** (0.001)

% DI from boreholes 0.098 (0.247)

0.116 (0.173)

0.016 (0.865)

0.098** (0.024)

Constant 4.997*** (0.000)

4.852*** (0.000)

4.989*** (0.000)

4.997*** (0.000)

R2 adjusted 0.970 0.968 0.971

Reset test 0.061 0.221

VIF (max) 5.75 6.00

Method OLS OLS RE RE

N (sample size) 106 88 106 106

Template 18. Network plus water models proposed by Welsh Water

Description of dependent variable

Network Plus includes costs for Raw Water Distribution, Water Treatment and Treated Water Distribution

Water Network Plus Botex = (Total Operating Expenditure – Third Party Services – Abstraction Charges – Local authority rates) + (Maintaining the long term capability of the assets infra + Maintaining the long term capability of the assets non-infra)

Values rebased to 2016/17 using CPIH.
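
The rebasing step can be illustrated with a minimal sketch. The index values below are hypothetical placeholders, not published CPIH figures, and the function name `rebase_to_base_year` is an assumption made for illustration only.

```python
# Hypothetical CPIH index values by financial year (placeholders, not published data).
CPIH_INDEX = {"2014/15": 99.0, "2015/16": 99.6, "2016/17": 101.0}


def rebase_to_base_year(nominal_cost, year, base_year="2016/17", index=CPIH_INDEX):
    """Express a nominal cost in base-year prices by scaling with the index ratio."""
    return nominal_cost * index[base_year] / index[year]


# A cost of 100 incurred in 2014/15, expressed in 2016/17 prices.
cost_2016_17_prices = rebase_to_base_year(100.0, "2014/15")
# A cost already in base-year prices is left unchanged.
unchanged = rebase_to_base_year(100.0, "2016/17")

print(round(cost_2016_17_prices, 2), round(unchanged, 2))
```

Costs from earlier years are scaled up when the index has risen, so that all years are compared in the same price base.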

Comments on models (Welsh Water)

The models aim to capture key cost drivers for the industry. The models include a scale variable, the number of connected properties, alongside variables to capture density and sparsity, treatment complexity and drivers of maintenance.

To capture both density and sparsity, the models include properties over length of mains alongside a squared term. In this way the two variables capture the U-shape relationship between costs and density and sparsity. This variable allows the impact of density on costs to vary according to how dense the company is.
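
A minimal sketch of how the linear and squared terms combine is given here. It assumes that the squared regressor is the square of the demeaned log density x, so that the elasticity of cost with respect to density is b1 + 2*b2*x. The coefficients are those reported for WSHNPW1 in this template; the function names and the values of x are illustrative assumptions, not company data.

```python
# Coefficients reported for WSHNPW1 in this template.
B_LINEAR = -0.304   # demeaned ln(properties over mains)
B_SQUARED = 1.836   # squared demeaned ln(properties over mains)


def density_elasticity(x, b1=B_LINEAR, b2=B_SQUARED):
    """Elasticity of cost with respect to density at demeaned log density x.

    Assumes the squared regressor is x**2, so the derivative of
    b1*x + b2*x**2 with respect to x is b1 + 2*b2*x.
    """
    return b1 + 2.0 * b2 * x


def turning_point(b1=B_LINEAR, b2=B_SQUARED):
    """Demeaned log density at which the elasticity changes sign (bottom of the U)."""
    return -b1 / (2.0 * b2)


for x in (-0.2, 0.0, 0.2):  # illustrative values of x, not company data
    print(f"x = {x:+.1f}: elasticity = {density_elasticity(x):+.3f}")
print(f"turning point at x = {turning_point():.3f}")
```

Under these assumptions the elasticity is negative for companies below the turning point (costs fall as density rises) and positive above it, which is the U-shape described in the comments.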

The estimated elasticities of density on costs are reasonable and of the right order across the industry. The model is robust to the removal of the sparsest companies; however, the coefficient on the squared term becomes less significant when the densest company is removed.

The models are broadly robust to using alternative modelled costs (e.g. including abstraction charges, excluding grants and contributions etc), and to alternative estimation techniques such as random effects.

These models have been produced with South West and Bournemouth combined.

Consultation model ID WSHNPW1 WSHNPW2

Company’s model ID 3 4

Dependent variable Ln(Network plus Botex)

Ln(Connected Properties) (,000) 1.002*** (0.000) 1.010*** (0.000)

Ln (Properties over Mains), demeaned (,000/km) -0.304* (0.081) -0.296* (0.085)

Ln (Properties over Mains)^2, demeaned (,000/km) 1.836*** (0.000) 2.075*** (0.000)

% mains renewed and relined 31.91*** (0.002) 32.90*** (0.003)

% of water treated at complexity band 2 and below -0.532** (0.011) -0.587*** (0.004)

% of water treated at complexity band 5 and above 0.125 (0.318)

Constant -2.469*** -2.492***

R2 adjusted 0.970 0.970

VIF (max) 1.715 1.261

Reset test 0.036 0.076

Estimation method OLS OLS

N (sample size) 102 102

Template 19. Network plus water models proposed by Yorkshire Water

Description of dependent variable

Network plus base costs = operating expenditure less abstraction charges, third party services and local authority rates + capital maintenance expenditure net of grants and contributions (G&C)

The dependent variables are deflated using CPIH to 2016/17 prices. No smoothing was undertaken.

Comments on models (Yorkshire Water)

The network+ models use a variety of scale variables and density measures. These models are generally robust to alternative modelled costs and estimation techniques, and produce reasonably compact efficiency ranges.

Our dependent variable excludes G&C, consistent with the PR14 approach. However, given the lack of a split of G&C between capital maintenance and enhancement expenditure, we have also modelled CAPEX on a gross basis. The statistical performance of the models is broadly consistent with and without G&C.

Consultation model ID YKYNPW1 YKYNPW2 YKYNPW3

Company’s model ID 11 12 13

Dependent variable Network + BOTEX (log)

Connected properties (‘000s) (log) 1.030*** (0.000)

Population served (‘000s) (log) 1.012*** (0.000)

Length of mains (km) (log) 1.026*** (0.000)

% of mains renewed/relined 32.75** (0.01) 30.16** (0.018) 35.86*** (0.005)

% of mains laid before 1980 0.791* (0.096) 0.623 (0.18)

% of DI from reservoirs 0.327** (0.019) 0.341** (0.012)

% of DI from rivers 0.217 (0.361) 0.220 (0.342)

% of water treated at complexity band 1 and below -0.692** (0.015)

Properties over area (‘000s / km2) (log, demeaned) -0.123 (0.169) -0.188** (0.041)

Properties over area (‘000s / km2) (log, demeaned) squared 0.279** (0.02) 0.246** (0.028)

Properties over mains (‘000s / km) (log) 0.719** (0.039)

Constant -3.437*** (0.000) -4.004*** (0.000) -3.410*** (0.002)

R2 adjusted 0.958 0.960 0.952

Reset test 0.560 0.475 0.0615

VIF (max) 1.569 1.569 1.275

Estimation method OLS OLS OLS

N (sample size) 102 102 102

Template 20. Network plus water models proposed by Bristol Water

Description of dependent variable

The dependent variable is Botex per connected property.

Botex = (total opex – business rates – third party costs) + capital maintenance

Comments on models (Bristol Water)

The models and corresponding coefficients presented in this pro forma are based on cost information for 17 companies (data for Bournemouth and South West Water have been appropriately combined). Regressions were run in reference to the Master Wholesale Cost data file dated 27th February 2017, reflecting the latest updates and amendments to the data.

Capital maintenance costs have been smoothed on a three-year rolling-average basis, so four years of data have been modelled (2014-2017). Botex costs have been calculated on a unit cost basis by dividing costs by the sum of total non-household connected properties at year end and total household connected properties at year end, both taken from the six-year wholesale cost data set.
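
The loss of two years under three-year smoothing can be seen in a minimal sketch. It assumes a trailing window and years labelled by financial year end; the cost figures are hypothetical and the function name `rolling_mean` is illustrative.

```python
def rolling_mean(values, window=3):
    """Trailing rolling average; returns one value per complete window."""
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]


years = [2012, 2013, 2014, 2015, 2016, 2017]
# Hypothetical capital maintenance costs, for illustration only.
capital_maintenance = [30.0, 36.0, 33.0, 39.0, 42.0, 36.0]

smoothed = rolling_mean(capital_maintenance)
smoothed_years = years[2:]  # the first two years have no complete window

for year, value in zip(smoothed_years, smoothed):
    print(year, value)
```

Six years of raw data therefore yield four smoothed observations per company, which is consistent with the sample size of 68 reported for 17 companies.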

A full description of the work undertaken to arrive at these models is set out in a report by NERA: ‘Comparative Benchmarking Assessment to Support Preparation of Bristol Water’s AMP7 Business Plan’ (December 2017).

Consultation model ID BRLNPW1 BRLNPW2 BRLNPW3

Company’s model ID 7 8 9

Dependent variable Ln(network plus botex per property)

Ln(DI/connected property) Ml/d per ‘000 connected property 0.793* (0.063) 1.050** (0.024) 0.973*** (0.003)

Ln(length of mains/ connected property) Km/‘000 connected property 0.607*** (0.008) 0.494** (0.011) 0.461* (0.076)

Share of water treated at level 5 and above (%) 0.257 (0.144) 0.143 (0.364)

Length of mains laid pre-1940/Total length of mains (%) 0.805** (0.012) 0.487 (0.152) 1.197*** (0.000)

Length of renewed and relined mains/Total length of mains (%) 18.12* (0.093) 15.64 (0.157) 34.12** (0.011)

Year15 0.0003 (0.990) 0.003 (0.912) 0.024 (0.468)

Year16 0.024 (0.521) 0.029 (0.432) 0.072 (0.138)

Year17 0.024 (0.461) 0.032 (0.301) 0.0767** (0.046)

Surface water treated / Total water treated (%) 0.526*** (0.006)

Share of water from reservoirs (%) 0.206* (0.059)

Ln(number of sources / DI) 0.113 (0.325)

Ln(average pumping head network) 0.140 (0.294)

Constant -3.733*** -3.881*** -3.437***

R2 adjusted 0.53 0.65 0.70

Reset test 0.67 0.54 0.15

VIF (max) 1.48 1.84 2.94

Estimation method OLS OLS OLS

N (sample size) 68 68 68

Template 21. Network plus water models proposed by South East Water

Description of dependent variables

Modelled OPEX = OPEX – third party – abstraction charges – local authority rates

Modelled base CAPEX = maintenance infra + maintenance non-infra – (grants and contributions)

Modelled BOTEX = Modelled OPEX + Modelled base CAPEX

Costs are modelled on an outturn basis and are unsmoothed.
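
The three definitions above can be sketched as a single calculation. The function and argument names are illustrative assumptions and the figures in the example are hypothetical.

```python
def modelled_botex(opex, third_party, abstraction_charges, local_authority_rates,
                   maintenance_infra, maintenance_non_infra, grants_and_contributions):
    """Combine the modelled OPEX and modelled base CAPEX definitions into modelled BOTEX."""
    modelled_opex = opex - third_party - abstraction_charges - local_authority_rates
    modelled_base_capex = (maintenance_infra + maintenance_non_infra
                           - grants_and_contributions)
    return modelled_opex + modelled_base_capex


# Hypothetical figures for one company-year, for illustration only.
example_botex = modelled_botex(
    opex=100.0, third_party=5.0, abstraction_charges=3.0, local_authority_rates=7.0,
    maintenance_infra=40.0, maintenance_non_infra=30.0, grants_and_contributions=4.0,
)
print(example_botex)
```

In this example modelled OPEX is 85 and modelled base CAPEX is 66, giving modelled BOTEX of 151.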

Comments on models (South East Water)

The model coefficients are broadly robust to alternative modelled costs and estimation techniques.

The number of treatment plants is a material driver of costs for the same reason as the number of sources noted above and is captured in the models by the number of treatment plants per scale driver. We have modelled this variable in levels and logs with models in levels tending to have marginally superior statistical properties.

The coefficient on the proportion of mains relined/renewed variable is large only because the variable itself takes small values: a maximum of 0.012, a minimum of 0.000207 and a mean of 0.004. The magnitude of the cost adjustment is therefore limited. A coefficient of 25 would imply an estimated elasticity range of approximately 0 to 0.3.
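
The arithmetic behind the stated range can be checked as follows, assuming the variable enters the log-cost model in levels so that the implied elasticity is the coefficient multiplied by the level of the variable. The function name is illustrative.

```python
def semi_log_elasticity(coefficient, level):
    """Elasticity implied by a level regressor in a log-cost model: coefficient * level."""
    return coefficient * level


COEFFICIENT = 25.0  # the illustrative coefficient quoted in the comments
LEVELS = {"minimum": 0.000207, "mean": 0.004, "maximum": 0.012}

elasticities = {name: semi_log_elasticity(COEFFICIENT, level)
                for name, level in LEVELS.items()}

for name, value in elasticities.items():
    print(f"{name}: {value:.3f}")
```

The implied elasticity is close to zero at the minimum, about 0.1 at the mean and about 0.3 at the maximum.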

Consultation model ID SEWNPW1 SEWNPW2 SEWNPW3 SEWNPW4 SEWNPW5

Company’s model ID 9 10 11 12 13

Dependent variable Network + BOTEX (log)

Connected properties (‘000s) (log) 1.097*** (0.000) 1.101*** (0.000) 1.089*** (0.000)

Population served (‘000s) (log) 1.095*** (0.000)

DI (Ml/d) (log) 1.057*** (0.000)

Proportion of area with more than 4000 people per km2 (%) 0.452** (0.028) 0.326* (0.057) 0.689*** (0.004) 0.695*** (0.006) 0.689*** (0.002)

Proportion of area with less than 600 people per km2 (%) 0.757*** (0.005) 0.607** (0.012) 0.633*** (0.006) 0.658*** (0.006) 0.530*** (0.005)

% of DI treated at complexity band 3 and above (%) 0.262 (0.277) 0.322 (0.172) 0.429* (0.05) 0.383* (0.078) 0.587*** (0.000)

Proportion of mains renewed/relined (%) 22.66** (0.036) 23.71** (0.024) 21.96** (0.041) 22.95** (0.035) 21.34** (0.047)

Number of treatment works over connected properties (number / ‘000s) (log) 0.0926 (0.126)

Number of treatment works over DI (number / (Ml/d)) (log) 0.0667 (0.245)

Number of treatment works over DI (number / (Ml/d)) 1.437** (0.023)

Year 2016 dummy -0.079*** (0.002) -0.061** (0.012) -0.080*** (0.003) -0.079*** (0.004) -0.083*** (0.001)

Constant -4.616*** (0.000) -2.802*** (0.000) -3.589*** (0.000) -3.710*** (0.000) -4.039*** (0.000)

R2 adjusted 0.960 0.963 0.961 0.961 0.964

Reset test 0.736 0.373 0.846 0.799 0.963

VIF (max) 2.063 2.000 2.422 2.582 2.608

Estimation method OLS OLS OLS OLS OLS

N (sample size) 102 102 102 102 102

Template 22. Network plus water models proposed by South Staffs Water

Description of dependent variable

Modelled OPEX = [OPEX] – [third party] – [abstraction charges] – [local authority rates]

Modelled base CAPEX = [maintenance infra] + [maintenance non-infra] – [grants and contributions]

Modelled BOTEX = [Modelled OPEX] + [Modelled base CAPEX]

Costs are deflated to 2016/17 base prices using CPIH and are modelled on an unsmoothed basis.

Comments on models (South Staffs Water)

The coefficients are generally robust to alternative modelled costs and estimation techniques.

The coefficient on the proportion of mains relined/renewed variable is large only because the variable itself takes small values: a maximum of 0.012, a minimum of 0.000207 and a mean of 0.004. The magnitude of the cost adjustment is therefore limited. A coefficient of 25 would imply an estimated elasticity range of approximately 0 to 0.3.

Average pumping head is a known driver of power expenditure, yet the driver was often insignificant and/or had a counter-intuitive sign. This may be due to data problems with this variable, or because its effect is reduced through the inclusion of other cost drivers. We note, however, that there remains a very strong correlation between average pumping head, distribution input and power costs when modelled separately. Modelling power expenditure separately as a function of average pumping head may be more appropriate, but we appreciate that the consultation may give us the opportunity to study what other companies have observed in this area.

Models containing Ofwat’s density and sparsity measures were considered. Although these models performed reasonably well in statistical diagnostic tests, company performances were sensitive to the choice of the threshold. As a robust operational rationale for choosing a particular threshold could not be identified, the models presented include simple density drivers only.

Consultation model ID SSCNPW1 SSCNPW2

Company’s model ID Model 1 Model 2

Dependent variable Network+ BOTEX (log)

Length of mains (km) (log) 1.069*** (0.000) 1.094*** (0.000)

Properties over mains (‘000s / km) (log) 0.577* (0.074) 0.515** (0.044)

% water treated at complexity band 4 and above 0.344 (0.109) 0.356** (0.028)

% of mains renewed/relined 27.53* (0.056) 28.83** (0.036)

% of mains laid before 1980 1.005* (0.059)

Constant -4.446*** (0.000) -5.502*** (0.000)

R2 adjusted 0.948 0.956

Reset test 0.821 0.112

VIF (max) 1.164 1.232

Estimation method OLS OLS

N (sample size) 102 102

1.6 Wholesale water models

Template 23. Wholesale water models proposed by Ofwat

Description of dependent variables

Wholesale water base costs excluding cost items described in section 3 of the main consultation document.

All monetary values have been inflated to 2016-17 prices using the CPIH.

Comments on models

We considered two scale variables in our wholesale models: connected properties (models 1-6) and length of mains (models 7-12).

When using length of mains as a scale variable, we have also included a density variable. This is to account for the fact that a company that serves a larger population per km of mains may incur higher distribution costs. As expected, the coefficient of the density variable is positive, albeit quite large. We present the same models with the weighted average density driver, which produces more sensible values for the estimated coefficient. The coefficient also captures increased cost of working in highly dense/urban areas.

The rationale for all explanatory variables in our wholesale water models can be found in our comments on the water treatment and treated water distribution models. All coefficients are reasonably robust and meet expectations.

Consultation model ID OWW 1 OWW 2 OWW 3 OWW 4 OWW 5 OWW 6 OWW 7 OWW 8 OWW 9 OWW 10 OWW 11 OWW 12

Dependent variable ln (wholesale water base costs)

ln (connected properties) 1.109*** (0.000) 1.078*** (0.000) 1.114*** (0.000) 1.053*** (0.000) 1.037*** (0.000) 1.081*** (0.000)

ln (lengths of main) 1.114*** (0.000) 1.072*** (0.000) 1.114*** (0.000) 1.086*** (0.000) 1.031*** (0.000) 1.082*** (0.000)

% mains length refurbished and relined 0.177 (0.126) 0.185* (0.073) 0.191* (0.071) 0.286** (0.014) 0.247** (0.014) 0.276*** (0.006) 0.210* (0.067) 0.174 (0.122) 0.197* (0.071) 0.184 (0.146) 0.13 (0.301) 0.165 (0.173)

ln (booster pumping stations per lengths of main) 0.280** (0.041) 0.392*** (0.006) 0.320* (0.051) 0.353** (0.049)

ln (service reservoirs and water towers per lengths of main) 0.202** (0.029) 0.336*** (0.006) 0.183 (0.162) 0.165 (0.360)

% of lengths of mains laid or refurbished 1981 -0.007* (0.088) -0.007 (0.116) -0.007 (0.106) -0.005 (0.101) -0.005 (0.197) -0.006 (0.178) -0.008* (0.058) -0.006 (0.136) -0.007* (0.098) -0.009* (0.067) -0.007 (0.183) -0.008* (0.094)

% of water treated in water treatments in complexity levels 3-6 0.004 (0.185) 0.003 (0.130) 0.004** (0.030)

ln (average pumping head for water resources plus) 0.272*** (0.007) 0.170* (0.078) 0.199** (0.037) 0.231** (0.011) 0.172* (0.067) 0.196** (0.038) 0.252** (0.031) 0.207* (0.092) 0.231* (0.065)

ln (density) 0.918*** (0.000) 1.148*** (0.000) 1.071*** (0.000)

ln (weighted average density) 0.248*** (0.001) 0.330*** (0.000) 0.290*** (0.001)

Constant 2.287*** 4.324*** 3.394*** 3.696*** 5.780*** 4.840*** 3.249*** 4.244*** 3.508*** 5.614*** 7.216*** 6.139***

R2 adjusted 0.972 0.976 0.975 0.963 0.974 0.973 0.973 0.976 0.974 0.968 0.971 0.969

VIF (max) 1.31 1.725 1.605 1.234 1.254 1.306 1.46 2.829 2.287 1.641 3.182 2.533

Reset test 0.372 0.145 0.684 0.046 0.047 0.161 0.346 0.162 0.476 0.025 0.021 0.019

Estimation method OLS OLS OLS OLS OLS OLS OLS OLS OLS OLS OLS OLS

N (sample size) 107 107 107 107 107 107 107 107 107 107 107 107

Template 24. Wholesale water models proposed by Southern Water

Description of dependent variable

Wholesale water base costs = (OPEX less third party services, abstraction charges and local authority rates) + (maintenance capital expenditure infra and non-infra less grants and contributions).

All costs are unsmoothed and deflated to 2016-17 prices using CPIH.

Comments on models (Southern Water)

The three BOTEX models vary with the scale driver, density driver and maintenance drivers used:

Model 1 uses length of mains as the scale driver, whilst models 2-3 use connected properties

Model 1 uses a simple (linear) density measure. Because length of mains also captures aspects of sparsity, the positive coefficient on density is to be expected. Models 2-3 estimate a translog density relationship, using properties over mains and properties over area respectively.

Models 1 and 3 control for the proportion of mains renewed/relined and the proportion of mains laid before 1980. Model 2 controls for the proportion of mains renewed/relined only
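
The remark that length of mains also captures sparsity can be illustrated by rewriting model 1 in terms of mains and properties. Since ln(properties over mains) = ln(properties) - ln(mains), a specification a*ln(mains) + b*ln(properties over mains) is equivalent to (a - b)*ln(mains) + b*ln(properties). The sketch below applies this identity to the coefficients reported for SRNWW1 in this template; it illustrates the algebra only and is not part of Southern Water's submission, and the function name is an assumption.

```python
def implied_elasticities(scale_coef, density_coef):
    """Rewrite a*ln(mains) + b*ln(properties/mains) as (a-b)*ln(mains) + b*ln(properties)."""
    return {"mains": scale_coef - density_coef, "properties": density_coef}


# Coefficients reported for SRNWW1: length of mains (log) and properties over mains (log).
srnww1 = implied_elasticities(scale_coef=1.096, density_coef=0.502)

for driver, value in srnww1.items():
    print(f"{driver}: {value:.3f}")
```

Holding the number of properties fixed, additional mains still raise modelled costs (0.594), and holding mains fixed, additional properties raise them too (0.502), so a positive density coefficient is consistent with both effects.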

Consultation model ID SRNWW1 SRNWW2 SRNWW3

Company’s model ID 1 2 3

Dependent variable BOTEX (log)

Connected properties (‘000s) (log) 1.028*** (0.000) 1.070*** (0.000)

Length of mains (km) (log) 1.096*** (0.000)

Properties over mains (‘000s / km) (log) 0.502** (0.034)

Properties over mains (‘000s / km) (log, demeaned) -0.0817 (0.512)

Properties over mains (‘000s / km) (log, demeaned) squared 1.313*** (0.007)

Properties over area (‘000s / km2) (log, demeaned) -0.155* (0.092)

Properties over area (‘000s / km2) (log, demeaned) squared 0.238* (0.055)

Sources over DI 0.760*** (0.000) 0.344* (0.081)

% DI from reservoirs 0.185* (0.067)

% of water treated at complexity band 4 and above 0.409*** (0.007) 0.486*** (0.009)

% of mains renewed/relined 28.86** (0.035) 29.98*** (0.003) 21.89 (0.119)

% of mains laid before 1980 0.926* (0.059) 0.438 (0.379)

Year 2016 dummy -0.0589** (0.027) -0.0803*** (0.006)

Constant -5.440*** (0.000) -3.441*** (0.000) -3.531*** (0.000)

R2 adjusted 0.962 0.977 0.964

Reset test 0.215 0.608 0.743

VIF (max) 1.232 2.259 1.471

Method OLS OLS OLS

N (sample size) 102 102 102

Template 25. Wholesale water models proposed by South West Water

Description of dependent variable

Wholesale BOTEX = (OPEX – third party – abstraction charges – local authority rates) + (maintenance infra + maintenance non-infra – grants and contributions)

Wholesale BOTEX+ (growth) = wholesale BOTEX + additions to the supply and demand balance + new developments and growth + metering expenditure + resilience

Wholesale TOTEX = Wholesale BOTEX + other capital expenditure infra + other capital expenditure non-infra + infrastructure network reinforcement

Unsmoothed net costs from 2011/12 to 2016/17

Comments on models (South West Water)

We have focused on capturing the key drivers of costs in wholesale water that are operationally robust and statistically valid.

The key drivers we have focused on for aggregate modelling are:

Scale (properties): Properties represents the most appropriate scale driver for aggregate water costs, as it simultaneously captures the volume of water that requires treatment and the size of the network as captured by the number of connections.

Density/sparsity (mains per property): there are increased costs associated with operating in densely populated urbanised areas (traffic congestion, congested underground utilities, etc.) and in sparsely populated areas (increased travel costs, leakage control, pumping costs). We selected this measure as it most directly relates to the operational relationship with maintenance costs. In addition, it allowed the modelling of a u-shape relationship, whereby the costs of operating in areas of more extreme population density and sparsity are accounted for within our trans-log models.

Source type/treatment process: the type of source determines the resource costs and the quality of the source water, which in turn determines the required complexity of water treatment. The proportion of distribution input that comes from surface water is outside of management control, as the source types available to companies are determined by local geological factors (while the type of treatment process, in contrast, lies partially within management control).

Maintenance: the costs associated with maintaining and repairing assets.

Our models differ across 2 key parameters: density/sparsity and source water quality.

Models 1, 3 and 5 model a trans-log ‘u-shape’ relationship between cost and population density/sparsity. Models 2, 4 and 6 use only the log of mains over connected properties, capturing only the impact of sparsity.

We have extended our BOTEX modelling to models controlling for BOTEX + growth enhancement and TOTEX. We have used the same BOTEX drivers as in our aggregate BOTEX models, as the regional operating characteristics increasing or decreasing BOTEX are also likely to affect the cost of delivering enhancement solutions. In addition, we have augmented our models with a driver for growth—the percentage increase in properties—to capture the impact of an increase in customer volumes on: growth enhancement directly; ongoing OPEX and capital maintenance costs; and delivery of programmes recorded under quality enhancement. We were not able to include direct measures of differences in the amount of quality enhancement within our econometric modelling.

While these models do not include a quality enhancement specific driver, they do meet many of the statistical criteria set out by Ofwat (see below). As can be seen from the efficiency range charts, while modelling BOTEX+ (growth) does not widen the efficiency ranges, including quality enhancement to model TOTEX does lead to somewhat broader efficiency ranges.

We would recommend that BOTEX+ (growth) and TOTEX modelling approaches are explored to the fullest possible extent at PR19. Benchmarking companies based on their TOTEX spend plays an important role in capturing the synergies between OPEX and CAPEX spend and ensuring that companies are rewarded for innovative solutions that reduce costs overall rather than in one particular area.

There is a broader range between model specifications than across cost categories. The models which include trans-log mains over connected properties have the narrowest efficiency ranges.

All of the BOTEX models estimate statistically significant coefficients that meet expectations. The same holds for the BOTEX+ (growth) and TOTEX models, although some coefficients are less significant.

Consultation model ID SWBWW1 SWBWW2 SWBWW3 SWBWW4 SWBWW5 SWBWW6

Company’s model ID 1 2 3 4 5 6

Dependent variable Ln(wholesale water botex) Ln(wholesale water botex+growth) Ln(wholesale water totex)

Properties (ln) 1.027*** (0.000) 1.046*** (0.000) 1.049*** (0.000) 1.060*** (0.000) 1.058*** (0.000) 1.073*** (0.000)

% of mains renewed or relined 33.70*** (0.000) 32.74*** (0.000) 27.19*** (0.000) 27.11*** (0.002) 26.19*** (0.005) 26.07*** (0.005)

Mains over connected properties (ln) 0.374*** (0.000) 0.350*** (0.010) 0.371*** (0.000) 0.355*** (0.001) 0.496*** (0.000) 0.473*** (0.000)

Mains over connected properties (ln squared and demeaned) 1.543*** (0.000) 0.856*** (0.003) 1.205*** (0.000)

% of treated surface water 0.193*** (0.010) 0.178** (0.020) 0.113 (0.212) 0.126 (0.158) 0.0782 (0.413) 0.0960 (0.312)

Properties growth (%) 0.285*** (0.007) 0.387*** (0.002) 0.308*** (0.005) 0.452*** (0.002)

Constant -3.729*** (0.000) -3.710*** (0.000) -3.820*** (0.000) -3.888*** (0.000) -4.129*** (0.000) -4.225*** (0.000)

R2 adjusted 0.962 0.953 0.954 0.952 0.952 0.947

RESET Test 0.335 0.860 0.013 0.019 0.001 0.002

VIF (max) 1.312 1.277 1.337 1.331 1.337 1.331

Method OLS OLS OLS OLS OLS OLS

N (sample size) 102 102 102 102 102 102

Template 26. Wholesale water models proposed by Welsh Water

Description of dependent variable

Water Botex = (Total Operating Expenditure – Third Party Services – Abstraction Charges – Local authority rates) + (Maintaining the long term capability of the assets infra + Maintaining the long term capability of the assets non-infra)

Values rebased to 2016/17 using CPIH.

Comments on models (Welsh Water)

The submitted botex models aim to capture key cost drivers for the industry. The models include a scale variable, the number of connected properties, alongside variables to capture density and sparsity, treatment complexity, drivers of maintenance and the size of the sources.

To capture both density and sparsity, the models include properties over length of mains alongside a squared term. In this way, the two variables capture the U-shape relationship between costs and density and sparsity. This variable allows the impact of density on costs to vary according to how dense the company is. The density variables have been demeaned (the sample mean value of the variable is subtracted from each observation) in order to eliminate collinearity between the linear and quadratic density term.
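
The effect of demeaning on collinearity can be illustrated with a small sketch. The log density values are hypothetical and the helper `pearson` is written out here for illustration; the point is only that a variable with all-positive values is almost perfectly correlated with its own square, whereas the demeaned variable is much less so.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


# Hypothetical log density values, for illustration only.
log_density = [3.8, 3.9, 4.0, 4.1, 4.2, 4.4]
mean_log_density = sum(log_density) / len(log_density)
demeaned = [x - mean_log_density for x in log_density]

raw_corr = pearson(log_density, [x ** 2 for x in log_density])
demeaned_corr = pearson(demeaned, [z ** 2 for z in demeaned])

print(f"raw: {raw_corr:.3f}, demeaned: {demeaned_corr:.3f}")
```

Demeaning does not change the fitted quadratic relationship; it only re-centres the variable, so that the linear coefficient is read as the elasticity at the sample mean.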

One of the two variables has a coefficient which does not show up as significant. This is not considered an issue as the two variables work in conjunction with each other and the other variable is highly significant.

The estimated elasticities of sparsity/density on costs are reasonable and of the right order across the industry. The model is robust to the removal of the most sparse and dense companies.

The models also appear to be robust to using alternative modelled costs (e.g. including abstraction charges, excluding grants and contributions etc) and to alternative estimation techniques such as random effects.

These models have been produced with South West and Bournemouth combined.

Consultation model ID WSHWW1 WSHWW2

Company’s model ID 1 2

Dependent variable Ln(Water Botex)

Ln(Connected Properties) (,000) 0.986*** (0.000) 1.025*** (0.000)

Ln (Properties over Mains), demeaned (,000/km) -0.0250 (0.877) -0.0625 (0.62)

Ln (Properties over Mains)^2, demeaned (,000/km) 1.125** (0.019) 1.605*** (0.000)

Number of Sources/DI (nr/Ml/D) 0.659*** (0.002) 0.628*** (0.000)

% mains renewed and relined 29.25*** (0.004)

% water treated at complexity band 2 and below -0.709*** (0.000) -0.837*** (0.000)

% water treated at complexity band 5 and above 0.213 (0.194)

Year 2016 Dummy -0.110*** (0.006) -0.0559** (0.019)

Constant -2.186*** -2.518***

Adjusted R-squared 0.974 0.978

VIF (max) 2.046 2.052

Reset test 0.594 0.292

Method OLS OLS

N (sample size) 102 102

Template 27. Wholesale water models proposed by Yorkshire Water

Description of dependent variables

Wholesale water base costs = operating expenditure less abstraction charges, third party services and local authority rates + capital maintenance expenditure net of grants and contributions (G&C)

Wholesale water totex (growth) costs = wholesale water base costs + growth enhancement expenditure.

Modelled Growth enhancement expenditure = expenditure of supply side enhancement to the supply/demand balance (peak) + supply side enhancement to the supply/demand balance (average) + demand side enhancement to the supply/demand balance (peak) + demand side enhancement to the supply/demand balance (average) + resilience + new developments + metering for optants + metering for meters introduced by companies + metering for non-household and other.

The dependent cost variables are deflated using CPIH to 2016/17 prices. No smoothing was undertaken.

Comments on models (Yorkshire Water)

The aggregate BOTEX models use a variety of scale variables and density measures. These models are generally robust to alternative modelled costs and estimation techniques, and produce reasonably compact efficiency ranges.

The statistical performance of the models is broadly consistent with and without G&C.

Consultation model ID YKYWW1 YKYWW2 YKYWW3 YKYWW4 YKYWW5 YKYWW6

Company’s model ID 5 6 7 8 9 10

Dependent variable Water BOTEX (log)

Length of mains (km) (log) 1.032*** (0.000) 1.047*** (0.000) 1.060*** (0.000)

Connected properties (‘000s) (log) 1.012*** (0.000) 1.030*** (0.000)

Population served (‘000s) (log) 1.026*** (0.000)

Properties over mains (‘000s / km) (log) 0.932*** (0.000) 0.984*** (0.001) 0.923*** (0.000)

Properties over mains (‘000s / km) (log, demeaned) -0.131 (0.368) -0.0890 (0.465) -0.302** (0.016)

Properties over mains (‘000s / km) (log, demeaned) squared 1.236*** (0.009) 1.320*** (0.006) 1.041** (0.021)

Sources over DI (number / (Ml/d)) 0.708*** (0.007) 0.765** (0.010) 0.694*** (0.003) 0.518*** (0.001) 0.754*** (0.000) 0.682*** (0.000)

% of mains renewed/relined 28.79** (0.019) 32.05*** (0.002) 35.40*** (0.001) 32.21*** (0.002) 28.02*** (0.006)

% of mains laid before 1980 0.800** (0.029) 0.904** (0.014)

% DI from reservoirs 0.529*** (0.002) 0.526*** (0.004) 0.548*** (0.000) 0.174 (0.103) 0.186* (0.059) 0.239** (0.013)

Proportion of DI from rivers (%) 0.236 (0.308) 0.229 (0.336) 0.332 (0.127)

% of water treated in band 1 and below -0.892*** (0.000)

% of water treated in band 2 and below -0.754*** (0.000) -0.632*** (0.000)

Constant -3.596*** (0.000) -3.217*** (0.000) -4.113*** (0.000) -2.594*** (0.000) -2.718*** (0.000) -3.508*** (0.000)

R2 adjusted 0.962 0.963 0.969 0.975 0.977 0.977

Reset test 0.512 0.727 0.107 0.737 0.639 0.508

VIF (max) 2.415 2.392 2.416 2.022 2.257 2.251

Method OLS OLS OLS OLS OLS OLS

N (sample size) 102 102 102 102 102 102

Template 28. Wholesale water plus models proposed by Yorkshire Water

Consultation model ID YKYWW7 YKYWW8 YKYWW9 YKYWW10

Company’s model ID 1 2 3 4

Dependent variable TOTEX (growth) (log)

Length of mains (km) (log) 1.048*** (0.000) 1.047*** (0.000) 1.052*** (0.000) 1.055*** (0.000)

Properties over mains (‘000s / km) (log) 0.810*** (0.000) 0.796*** (0.000) 0.891*** (0.000) 0.836*** (0.000)

Sources over DI (number / (Ml/d)) 0.494*** (0.001) 0.439*** (0.002) 0.597*** (0.005) 0.442** (0.012)

% of mains renewed/relined 20.13* (0.097) 21.65** (0.05)

% of mains laid before 1980 0.905** (0.02) 0.824** (0.039)

% of DI from reservoirs 0.396*** (0.003) 0.406*** (0.002) 0.399** (0.021) 0.419*** (0.008)

Enhancement to the supply/demand balance over DI 2.034** (0.019) 1.961** (0.018) 1.912* (0.1) 1.741 (0.156)

New properties over connected properties 0.134 (0.387) 0.297** (0.045)

Constant -3.874*** (0.000) -3.935*** (0.000) -3.229*** (0.000) -3.583*** (0.000)

R2 adjusted 0.963 0.963 0.959 0.962

Reset test 0.761 0.720 0.289 0.154

VIF (max) 1.643 1.829 1.563 1.839

Method OLS OLS OLS OLS

N (sample size) 102 102 102 102

Template 29. Wholesale water models proposed by Affinity Water

Description of dependent variable

The dependent variable of the models presented in this template is total smoothed botex per connected property. This includes operating and capital maintenance costs across all the wholesale value chain for the water service.

Operating costs include all operating expenditure except for local authority rates and third party services. Capital maintenance costs are based on the capex category “maintaining the long term capability of assets”, including both infrastructure and non-infrastructure costs.

In order to mitigate the effects of “lumpy” capital investments, and following recent precedent from the CMA, we have smoothed companies’ capital maintenance on a 3 year rolling-average basis. Therefore, our models are based on four years of data (2014 to 2017).
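As a rough illustration of the smoothing step, the sketch below applies a three-year rolling average to a six-year capital maintenance series. It assumes a trailing window and years labelled 2012 to 2017; the figures and the function name are invented for illustration and are not taken from Ofwat's dataset. A three-year window over six years leaves four smoothed observations, which is consistent with the 2014 to 2017 sample described above.

```python
def rolling_average(series, window=3):
    """Trailing rolling mean: one value for each year with a full window."""
    years = sorted(series)
    smoothed = {}
    for i in range(window - 1, len(years)):
        span = years[i - window + 1 : i + 1]
        smoothed[years[i]] = sum(series[y] for y in span) / window
    return smoothed


# Hypothetical capital maintenance spend (GBP m); NOT real company data.
capex = {2012: 30.0, 2013: 45.0, 2014: 24.0, 2015: 36.0, 2016: 60.0, 2017: 27.0}

smoothed = rolling_average(capex)
for year in sorted(smoothed):
    print(year, smoothed[year])
```

The lumpy 2016 value of 60 is pulled down to 40 once averaged with the two preceding years, which is the intended effect of smoothing.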

Comments on models (Affinity Water)

The models presented in this template are estimated using data from Ofwat’s wholesale water cost assessment dataset from October 2017, which compiles cost and driver data for all companies in England and Wales. Our dataset includes a total of 17 companies, since we have combined the data for Bournemouth and South West Water to treat them as a single merged company.

We have selected the models presented in this template using an innovative tool based on a Monte Carlo simulation. This tool randomly generates and runs a total of 12,000 econometric models based on different combinations of the available cost drivers. We have then selected our initial set of preferred models based on the following filtering criteria:

The models pass the Ramsey RESET test at the 5% significance level

The adjusted R-squared is higher than 0.4

The coefficients of certain cost drivers have an intuitive sign. In particular, we consider that water delivered per property, distribution input per property, and the share of water treated at level 4 (or 5) and above should have a positive relationship with base expenditure per property.

A total of 385 models out of the 12,000 satisfied the above criteria. We have then applied further filtering criteria to narrow down this set of potential models:

We have excluded all models which included leakage or distribution input/property as an explanatory variable. The reason is that leakage is a driver that can be managed by the company to some extent, and including it therefore risks endogeneity bias (as was highlighted by the CMA in its 2015 Final Determination for Bristol Water, page A4(2)-28).

We have only included models which contain at least one variable capturing each of the following four effects on companies’ costs: (1) population density, (2) network density, (3) water treatment complexity, and (4) the company’s mix of sources. We consider that these are key cost drivers for Affinity Water and for the water industry in general, and if they had been omitted from models, some form of off-model adjustment (e.g. a special factor adjustment) would be required to control for their effect.

A total of 14 models out of the 385 models satisfied these additional criteria. We have then estimated the VIF statistic for each of these 14 models, and selected the top 4 models with the lowest VIF. This is an objective method for minimising the risk that the models are distorted due to the effects of multicollinearity.

This innovative model selection method has the advantage of allowing us to assess a large number of possible models in a systematic and objective way, ensuring our selected models satisfy key statistical standards from the perspective of the industry as a whole. However, it is a mechanistic method which involves limited expert judgement, and as such does not guarantee that these are the best possible models for explaining water industry costs. Rather, they provide a starting point for developing models that can be applied in the PR19 review.
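The filtering sequence described above can be sketched as follows. This is only an illustration of the selection logic, not Affinity Water's tool: it assumes each candidate model has already been estimated and its diagnostics stored, and the class, field names, driver names and candidate values are all hypothetical. Passing the RESET test at the 5% level is read here as a p-value above 0.05.

```python
from dataclasses import dataclass

# Hypothetical labels for the four cost-driver groups named in the text.
REQUIRED_GROUPS = frozenset({
    "population density", "network density",
    "treatment complexity", "source mix",
})
# Hypothetical names for the drivers excluded on endogeneity grounds.
EXCLUDED_DRIVERS = frozenset({"leakage", "di_per_property"})


@dataclass(frozen=True)
class Candidate:
    name: str
    drivers: frozenset   # explanatory variables in the model
    groups: frozenset    # cost-driver groups those variables cover
    reset_p: float       # p-value of the Ramsey RESET test
    adj_r2: float        # adjusted R-squared
    signs_ok: bool       # key drivers have the expected positive sign
    vif_max: float       # largest variance inflation factor


def passes_initial_filter(c):
    # Assumption: "passes RESET at 5%" means the p-value exceeds 0.05.
    return c.reset_p > 0.05 and c.adj_r2 > 0.4 and c.signs_ok


def passes_further_filter(c):
    no_excluded = not (c.drivers & EXCLUDED_DRIVERS)
    return no_excluded and REQUIRED_GROUPS <= c.groups


def select_models(candidates, keep=4):
    """Apply both filters, then keep the models with the lowest max VIF."""
    shortlist = [c for c in candidates
                 if passes_initial_filter(c) and passes_further_filter(c)]
    return sorted(shortlist, key=lambda c: c.vif_max)[:keep]


# Invented candidates, for illustration only.
candidates = [
    Candidate("A", frozenset({"mains_per_property"}), REQUIRED_GROUPS, 0.41, 0.57, True, 3.5),
    Candidate("B", frozenset({"leakage"}), REQUIRED_GROUPS, 0.60, 0.70, True, 2.0),
    Candidate("C", frozenset({"mains_per_property"}), REQUIRED_GROUPS, 0.03, 0.65, True, 1.8),
    Candidate("D", frozenset({"population_per_property"}), REQUIRED_GROUPS, 0.48, 0.45, True, 2.9),
]
print([c.name for c in select_models(candidates)])
```

In this toy set, candidate B is dropped because it includes leakage and candidate C because it fails the RESET test, so the low VIF of either model never comes into play.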

Consultation model ID AFWWW1 AFWWW2 AFWWW3 AFWWW4

Company’s model ID 1 2 3 4

Dependent variable Ln (total smoothed botex per property)

Ln (length of mains/ connected properties) (km/000s) 0.939*** (0.001) 0.966*** (0.005) 0.978*** (0.005) 0.988** (0.001)

Ln (population/ connected properties) 3.219*** (0.001) 2.807** (0.023) 2.760** (0.025) 3.639*** (0.001)

% of water treated at level 4 or above 0.231* (0.065) 0.357* (0.051) 0.367** (0.041) 0.266 (0.104)

% of water from reservoirs 0.351*** (0.001)

% of water from boreholes -0.026 (0.453)

Surface water treated/ Total water treated 0.281** (0.018)

Ln (water treatment works/ DI) -0.062 (0.385) -0.077 (0.270)

year15 dummy -0.023 (0.302) -0.026 (0.227) -0.026 (0.222) -0.022 (0.375)

year16 dummy -0.018 (0.634) -0.032 (0.454) -0.033 (0.443) -0.021 (0.622)

year17 dummy -0.022 (0.579) -0.012 (0.792) -0.021 (0.625) -0.006 (0.890)

Constant -7.443*** -7.274*** -7.322*** -7.975***

Adjusted R-squared 0.57 0.41 0.42 0.53

VIF (max) 3.53 3.56 3.52 3.90

Reset test 0.41 0.48 0.37 0.57

Method OLS OLS OLS OLS

N (sample size) 68 68 68 68

Template 30. Wholesale water models proposed by Bristol Water

Description of dependent variable

The dependent variable is Botex per connected property.

Botex = (total opex – business rates – third party costs) + capital maintenance

Comments on models (Bristol Water)

The models and corresponding coefficients presented in this pro forma are based on cost information for 17 companies (data for Bournemouth and South West Water have been appropriately combined). Regressions were run in reference to the Master Wholesale Cost data file dated 27th February 2017, reflecting the latest updates and amendments to the data.

Capital maintenance costs have been smoothed on a three year rolling-average basis, therefore four years of data have been modelled (2014-2017). Botex costs have been calculated on a unit cost basis by dividing cost information by the sum of Total non-household connected properties at year end and Total household connected properties at year end also from the six-year wholesale cost data set.
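A minimal sketch of the unit cost calculation described above follows. The function and argument names are hypothetical, the input figures are invented, and capital maintenance is assumed to have been smoothed beforehand.

```python
def botex_per_property(total_opex, business_rates, third_party_costs,
                       capital_maintenance, household_properties,
                       non_household_properties):
    """Botex divided by total connected properties at year end."""
    botex = (total_opex - business_rates - third_party_costs) + capital_maintenance
    connected_properties = household_properties + non_household_properties
    return botex / connected_properties


# Invented inputs: costs in GBP m, properties in thousands.
unit_cost = botex_per_property(
    total_opex=100.0,
    business_rates=10.0,
    third_party_costs=5.0,
    capital_maintenance=40.0,   # assumed already smoothed over three years
    household_properties=450.0,
    non_household_properties=50.0,
)
print(unit_cost)  # GBP m per thousand connected properties
```

The regressions in this template use the natural log of this unit cost as the dependent variable.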

A full description of the work undertaken to arrive at these models is set out in a report by NERA: ‘Comparative Benchmarking Assessment to Support Preparation of Bristol Water’s AMP7 Business Plan’ (December 2017).

Consultation model ID BRLWW1 BRLWW2 BRLWW3

Company’s model ID 1 2 3

Dependent variable Ln(total botex per property aggregate)

Ln(DI/ ‘000 connected property) 0.718 (0.116) 0.834** (0.0160) 0.753* (0.055)

Ln(length of mains/ ‘000 connected property) 0.279 (0.231) 0.346* (0.097) 0.454*** (0.009)

Ln(length of raw mains and conveyors/DI) (km per Ml/d) 0.041 (0.624)

Share of water treated at level 5 and above (%) 0.354** (0.017) 0.193 (0.222)

Length of mains laid pre-1940/Total length of main (%) 0.270 (0.358) 0.987*** (0.006) 0.439 (0.140)

Length of renewed and relined mains/Total length of mains (%) 11.11 (0.278) 32.36** (0.021) 17.71* (0.090)

Ln(average pumping head aggregate) 0.228 (0.146) 0.026 (0.806) 0.102 (0.559)

Year15 -0.015 (0.460) 0.017 (0.583) -0.003 (0.892)

Year16 -0.004 (0.911) 0.062 (0.220) 0.023 (0.590)

Year17 -0.012 (0.699) 0.063* (0.092) 0.019 (0.583)

Surface water treated / Total water treated (%) 0.541** (0.020)

Share of water from reservoirs (%) 0.051 (0.753) 0.272** (0.027)

Ln(number of sources / DI) 0.132 (0.172)

Constant -3.718*** -3.131*** -3.699***

R2 adjusted 0.61 0.73 0.67

Reset test 0.94 0.24 0.74

VIF (max) 1.99 3.66 1.95

Method OLS OLS OLS

N (sample size) 68 68 68

Template 31. Wholesale water models proposed by South East Water

Description of dependent variable

Modelled BOTEX = Modelled OPEX + Modelled base CAPEX

Modelled OPEX = OPEX – third party – abstraction charges – local authority rates

Modelled base CAPEX = maintenance infra + maintenance non-infra – (grants and contributions)

Costs are modelled on an outturn basis and unsmoothed.

Comments on models (South East Water)

The coefficients are broadly robust to alternative modelled costs and estimation techniques.

It is important to capture the impact of the number of sources on expenditure, as the number of sources drives a number of real costs such as employment costs (travel time), maintenance costs and capital costs: each source requires control systems, pumps, borehole maintenance, monitors, chemical deliveries etc. The number of sources over DI variable has a statistically significant coefficient and is of the expected sign.

The coefficient on the proportion of mains relined/renewed variable is large only because the proportion of mains relined/renewed is a small variable which takes a maximum value of 0.012 and a minimum value of 0.000207 with a mean of 0.004. The magnitude of the cost adjustment is therefore limited. A coefficient of 25 would imply an estimated elasticity range of approximately 0 to 0.3.
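The arithmetic behind the quoted elasticity range can be checked with a short sketch. It assumes the usual reading of a model with logged cost and an unlogged share variable, in which the implied elasticity at a given value of the variable is the coefficient multiplied by that value. The function name is hypothetical; the coefficient of 25 and the minimum, mean and maximum values are those quoted in the paragraph above.

```python
def implied_elasticity(coefficient, value):
    """Elasticity of cost with respect to x when ln(cost) = ... + coefficient * x."""
    return coefficient * value


coefficient = 25
for label, share in [("minimum", 0.000207), ("mean", 0.004), ("maximum", 0.012)]:
    print(label, implied_elasticity(coefficient, share))
```

At the maximum share of 0.012 the implied elasticity is about 0.3, at the mean about 0.1, and at the minimum close to zero, matching the stated range.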

The TOTEX (growth) models were developed by including growth enhancement cost and corresponding drivers in BOTEX models. Since each enhancement activity is a small part of TOTEX (growth), and given a relatively small dataset, these enhancement drivers end up statistically insignificant and sometimes have an unintuitive sign. The coefficient on the proportion of mains relined/renewed variable is large only because the proportion of mains relined/renewed is a (numerically) small variable that takes a maximum value of 0.012 and a minimum value of 0.000207 with a mean of 0.004. The magnitude of the cost adjustment is therefore limited. A coefficient of 25 would imply an estimated elasticity range of approximately 0 to 0.3. Some coefficients are narrowly insignificant at the 10% level. Given that these coefficients are of the correct sign from an operational and economic perspective, we deemed it appropriate to consider them. The coefficients are broadly robust to alternative modelled costs (e.g. including abstraction charges) and alternative estimation approaches such as Random Effects.

Consultation model ID SEWWW1 SEWWW2 SEWWW3 SEWWW4

Company’s model ID 5 6 7 8

Dependent variable Water BOTEX (log)

Connected properties (‘000s) (log) 1.088*** (0.000) 1.065*** (0.000)

Population served (‘000s) (log) 1.084*** (0.000)

Distribution input (Ml/d) (log) 1.046*** (0.000)

% of area with more than 4000 people per km2 0.547*** (0.000) 0.367*** (0.009) 0.243* (0.07) 0.615*** (0.000)

% of area with less than 600 people per km2 0.428*** (0.003) 0.512*** (0.001) 0.367** (0.014) 0.413*** (0.005)

% of DI treated at complexity band 3 and above 0.608*** (0.000) 0.563*** (0.000) 0.617*** (0.000) 0.581*** (0.000)

% of mains renewed/relined 24.38** (0.012) 22.03** (0.037) 23.01** (0.032)

Sources over DI (number / (Ml/d)) 0.669*** (0.001) 0.592*** (0.004) 0.578*** (0.003) 0.709*** (0.001)

Year 2016 dummy -0.072*** (0.003) -0.076*** (0.001) -0.058*** (0.008) -0.113*** (0.005)

Constant -3.862*** (0.000) -4.649*** (0.000) -2.843*** (0.000) -3.577*** (0.000)

R2 adjusted 0.973 0.971 0.973 0.970

Reset test 0.950 0.994 0.808 0.835

VIF (max) 3.000 3.044 2.970 2.996

Estimation method OLS OLS OLS OLS

N (sample size) 102 102 102 102

Template 32. Wholesale water plus models proposed by South East Water

Consultation model ID SEWWW5 SEWWW6 SEWWW7 SEWWW8

Company’s model ID 1 2 3 4

Dependent variable TOTEX (growth) (log)

Connected properties (‘000s) (log) 1.123*** (0.000)

DI (Ml/d) (log) 1.073*** (0.000) 1.065*** (0.000) 1.067*** (0.000)

Proportion of area with more than 4000 people per km2 (%) 0.680*** (0.000) 0.350** (0.029) 0.389** (0.011) 0.380*** (0.006)

% area with less than 600 people per km2 0.554*** (0.000) 0.470** (0.014) 0.536*** (0.002) 0.520*** (0.001)

% water treated at complexity band 3 and above 0.584*** (0.000) 0.637*** (0.000) 0.511*** (0.000) 0.600*** (0.000)

Sources over DI (number / (Ml/d)) (log) 0.170*** (0.000) 0.140** (0.031) 0.127** (0.04) 0.117** (0.05)

% of mains renewed/relined 21.90** (0.02) 21.15* (0.081) 17.37 (0.124) 20.37* (0.076)

New mains over length of mains (%) 0.171 (0.29) 0.268 (0.187) 0.281 (0.185)

Enhancement to the supply/demand balance over DI 1.832* (0.1) 1.884 (0.113)

Constant -3.606*** (0.000) -2.644*** (0.000) -2.488*** (0.000) -2.674*** (0.000)

R2 adjusted 0.977 0.972 0.973 0.974

Reset test 0.538 0.247 0.0881 0.195

VIF (max) 2.524 2.498 2.580 2.599

Estimation method OLS OLS OLS OLS

N (sample size) 102 102 102 102

Template 33. Wholesale water models proposed by South Staffs Water

Description of dependent variable

Modelled OPEX = [OPEX] – [third party] – [abstraction charges] – [local authority rates]

Modelled base CAPEX = [maintenance infra] + [maintenance non-infra] – [grants and contributions]

Modelled BOTEX = [Controllable OPEX] + [Controllable base CAPEX]

Costs are deflated to 2016/17 base prices using CPI-H and modelled on an unsmoothed basis.

Comments on models (South Staffs Water)

The coefficients are generally robust to alternative modelled costs and estimation techniques.

The coefficient on the proportion of mains relined/renewed variable is large only because it is a small variable which takes a maximum value of 0.012 and a minimum value of 0.000207 with a mean of 0.004. The magnitude of the cost adjustment is therefore limited. A coefficient of 25 would imply an estimated elasticity range of approximately 0 to 0.3.

Average pumping head is a known driver of power expenditure, yet the driver was often insignificant and/or had a counter-intuitive sign. This may be due to data problems with this variable or because its effect is reduced through the inclusion of other cost drivers. We note however that there remains a very strong correlation between average pumping head, distribution input and power costs when modelled separately. Modelling power expenditure separately as a function of average pumping head may be more appropriate, but we appreciate that the consultation may give us the opportunity to study what other companies have observed in this area.

Models containing Ofwat’s density and sparsity measures were considered. Although these models performed reasonably well in statistical diagnostic tests, company performances were sensitive to the choice of the threshold. As a robust operational rationale for choosing a particular threshold could not be identified, the models presented include simple density drivers only.

Consultation model ID SSCWW1 SSCWW2

Company’s model ID 1 2

Dependent variable Water BOTEX (log)

Length of mains (km) (log) 1.048*** (0.000) 1.029*** (0.000)

Properties over mains (‘000s / km) (log) 1.051*** (0.000) 1.013*** (0.000)

% of water treated at complexity band 2 and below -0.649*** (0.002) -0.540** (0.011)

% of DI from reservoirs 0.335*** (0.005) 0.360** (0.011)

Sources over DI (number / (Ml/d)) 0.968*** (0.000) 0.905*** (0.001)

% of mains renewed/relined 29.39*** (0.005)

% of mains laid before 1980 0.402 (0.262)

Constant -2.864*** (0.000) -2.924*** (0.002)

R2 adjusted 0.972 0.967

Reset test 0.253 0.339

VIF (max) 1.992 2.310

Estimation method OLS OLS

N (sample size) 102 102

2 Wastewater models

2.1 Bioresources models

Template 34. Bioresources models proposed by Ofwat

Description of dependent variable

Bioresources base costs excluding cost items described in section 4 of the main consultation document.

Comments on models

We use properties or sludge produced as a scale variable. For a vertically separated bioresources provider, sludge produced is not under management control (unlike sludge disposed).

To account for disposal costs we use the percent of sludge disposed to farmland. To account for transport costs we use the percent of intersiting work done by tanker or truck. In model 3 we add total intersiting work (by all forms of transport) to distinguish vehicle transport (tankers and trucks) from pipe transport.

All estimated coefficients have the expected sign and a plausible magnitude. The percent of total intersiting work in model 3 does not seem to improve the model.

All monetary values have been inflated to 2016-17 prices using the CPIH.

Consultation model ID OBR 1 OBR 2 OBR 3

Dependent variable --------- ln (bioresources base costs) ---------

ln (properties) 1.002*** (0.000)

ln (sludge produced) 0.940*** (0.000) 0.912*** (0.000)

% intersiting work done by truck and tanker 0.020*** (0.003) 0.017*** (0.010) 0.019*** (0.008)

% of sludge disposed to farmland -0.021** (0.021) -0.018** (0.026) -0.018** (0.025)

ln (intersiting work) 0.061 (0.437)

Constant 3.167** (0.017) 13.261*** (0.000) 12.802*** (0.000)

R2 adjusted 0.862 0.878 0.88

VIF (max) 2.536 2.47 2.671

Reset test 0.011 0.003 0.002

Estimation method OLS OLS OLS

N (sample size) 60 60 60

Template 35. Bioresources models proposed by Anglian Water

Description of dependent variable

Natural log of Bioresources botex excluding rates

Acronyms used in explanatory variables

ttds = tons of dry solids

STW = sewage treatment works

Comments on models (Anglian Water)

We have developed three possible model forms for Bioresources:

Model 1 is based on demographic and geographic factors, with causation factors that are exogenous so far as the Bioresources function is concerned.

Models 2 and 3 are based on the nature of the Network Plus asset base, which produces the raw sludge that is in turn the treatment input for the Bioresources function. This reflects two points:

1. The existing Network Plus fixed asset base cannot realistically be changed in the short to medium term;

2. Bioresources as a stand-alone function cannot control the Network Plus technology used to produce the sludge it is treating.

Models 4-6 take the operational parameters of the Bioresources function as being the causation factors. Given the asset lives of Bioresources assets, except in the short term, these causation factors are not exogenous so far as the Bioresources function is concerned.

Arable land is the proportion of arable land in each WaSC’s appointed area as reported by DEFRA. It is intended as a proxy for Land-bank.

All models are described in detail in our Cost Modelling report – Phase 2, published March 2018: http://www.anglianwater.co.uk/about-us/thinking-about-our-future/

Consultation model ID ANHBR1 ANHBR2 ANHBR3 ANHBR4 ANHBR5 ANHBR6

Company’s model ID 1 2 3 4 5 6

Dependent variable Ln(Bioresources botex)

Ln(Sludge produced x sparsity<600) (ttds) 0.383** (0.035)

Ln(Sludge produced x(1- sparsity<600)) (ttds) 0.462*** (0.000)

Ln(Sludge produced x sparsity<1,150) (ttds) 1.043*** (0.000)

Ln(Sludge produced x(1- sparsity<1,150)) (ttds) 0.217*** (0.000)

ln(Ttds generated by Band5 STWs) (ttds) 0.156 (0.249) 0.280*** (0.027)

ln(Ttds generated by Band6 STWs) (ttds) 0.812*** (0.000) 0.692** (0.000)

ln(Ttds generated by Band1-4 STWs) (ttds) -0.172 (0.382) 0.139 (0.286)

Ln(Sludge produced) (ttds) 1.086*** (0.000) 1.150*** (0.000)

% tds treated by conventional or advanced anaerobic digestion -0.992*** (0.000) -0.713*** (0.000) -0.803*** (0.000) -1.010*** (0.000) -0.804*** (0.000) -0.858*** (0.000)

Ln(Appointed area) 0.488** (0.013)

Sewered area / Appointed area 2.182* (0.098)

Arable land in appointed area as % of total arable land 3.664** (0.042)

% sludge produced at co-located STW -0.796*** (0.002)

Sparsity<600/km2 0.964*** (0.001)

Time Trend 0.046** (0.03) 0.036* (0.086) 0.042** (0.047) 0.043** (0.045) 0.041** (0.05) 0.043** (0.035)

Constant -4.719*** (0.002) -0.375 (0.418) -0.583 (0.213) -0.991** (0.037) -2.489*** (0.000) -1.615*** (0.002)

R2 adjusted 0.829 0.833 0.823 0.822 0.828 0.838

Reset test 0.480 0.984 0.770 0.011 0.320 0.818

VIF (max) 4.21 7.65 3.05 1.77 2.30 1.81

Method OLS OLS OLS OLS OLS OLS

N (sample size) 60 60 60 60 60 60

Template 36. Bioresources models proposed by Southern Water

Description of dependent variable

Modelled OPEX plus modelled base CAPEX.

Modelled OPEX is total OPEX less third party services, abstraction charges and local authority rates.

Modelled base CAPEX is maintenance expenditure in infrastructure and non-infrastructure less grants and contributions.

All costs are unsmoothed and deflated to 2016/17 prices using CPIH.

Comments on models (Southern Water)

The two network+ models are similar to models 1 and 3 of the BOTEX models, providing alternative approaches to control for pumping capacity per length of sewer. These models also appear to estimate coefficients that are operationally intuitive with reasonable statistical properties.

Consultation model ID SRNBR1 SRNBR2 SRNBR3 SRNBR4

Company’s model ID 1 2 3 4

Dependent variable ln (Bioresources BOTEX)

Amount of Sludge produced (log) 1.063*** (0.000) 1.011*** (0.000) 1.046*** (0.000) 1.101*** (0.000)

% of sludge treated using AD or AAD -0.947*** (0.009) -0.942** (0.018) -0.718*** (0.000)

% of sludge produced and treated at a site of STW and STC co-location -0.008** (0.014)

Total measure of intersiting 'work' done (all forms of transportation) per unit sludge produced (log) (km/year) 0.163 (0.207)

% of load treated in small WTWs (bands 1 to 3) 0.052** (0.048)

% of area with more than 2000 people per km2 -0.809*** (0.003) -0.624* (0.098)

Constant -0.129 (0.627) -0.804 (0.226) -0.480** (0.037) -1.536** (0.011)

R2 adjusted 0.806 0.794 0.813 0.763

VIF (max) 1.753 1.750 1.529 2.553

Reset test 0.142 0.192 0.533 0.0851

Estimation method OLS OLS OLS OLS

N (sample size) 60 60 60 60

Template 37. Bioresources models proposed by Severn Trent Water

Description of dependent variable

Sludge base cost

Description of selected explanatory variables

Weighted density: Ofwat's new weighted density index.

Prop. Load with tight N3 consent: This is the proportion of load that has an ammonia consent of 3mg/l or less. Engineering logic informs us that it would be better to include the load with consents of between 3mg/l and 5mg/l also, but this data was not readily available.

Av. Distance intersiting: Total intersiting "work" done divided by sludge vol. (km/yr)

Av. Distance intersiting via pipe: Intersiting "work" done by pipeline divided by sludge vol. (km/yr)

Av. Distance to disposal: Total disposal "work" divided by total sludge vol. (km/yr)
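The distance-based measures defined above are simple ratios of "work" to sludge volume, which can be sketched as below. The units are an assumption (work in ttds-km per year, sludge volume in ttds per year), and the function name and all figures are invented for illustration.

```python
def average_distance(work, sludge_volume):
    """Average distance = 'work' (volume x distance) / sludge volume."""
    if sludge_volume <= 0:
        raise ValueError("sludge volume must be positive")
    return work / sludge_volume


# Invented figures; assumed units are ttds-km/yr for work and ttds/yr for volume.
sludge_volume = 60.0
total_intersiting_work = 1200.0
pipeline_intersiting_work = 150.0
disposal_work = 2400.0

av_distance_intersiting = average_distance(total_intersiting_work, sludge_volume)
av_distance_pipe = average_distance(pipeline_intersiting_work, sludge_volume)
av_distance_disposal = average_distance(disposal_work, sludge_volume)
print(av_distance_intersiting, av_distance_pipe, av_distance_disposal)
```

Dividing by sludge volume removes the scale component from the "work" totals, which is why a distance measure of this kind is less correlated with a scale variable such as sludge produced.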

Comments on models (Severn Trent Water)

In model 13, we prefer to use a distance based measure of disposal work to reduce correlation with the scale variable and distance based measures of inter-siting activity to reduce correlation relative to the “work” based variables. All coefficients are broadly in line with expectations.

Consultation model ID SVTBR1 SVTBR2

Company’s model ID 13 14

Dependent variable Ln(Sludge base capex smoothed 5 years) Ln(Sludge base capex smoothed 5 years)

Ln(sludge produced) 1.15*** (.00) 1.23*** (.00)

Ln(Weighted average density) -.11 (.27) -.16*** (.007)

Ln(sludge produced)^2 .18 (.12)

Ln(Weighted Density)^2 .05 (.26)

% sludge treated with anaerobic digestion (conventional and advanced) -.19 (.39) -.02 (.91)

Av. distance intersited (km) .19*** (.00) .27*** (.00)

Av. distance intersited by pipeline (km) -.04*** (.00) -.045*** (.00)

% sludge treated at STC-STW co-located sites -.42* (.06) -.36** (.04)

Av. distance to disposal (km) .33** (.012) .38*** (.003)

Dummy 2012 .18 (.28) .23 (.19)

Dummy 2013 .21 (.18) .26 (.13)

Dummy 2014 .14 (.3) .16 (.25)

Dummy 2015 .13 (.24) .14 (.22)

Dummy 2016 .03 (.73) .03 (.77)

Constant 4.03*** (.00) 3.77*** (.00)

R2 adjusted .88 .89

Reset test 0.38 0.38

VIF max 4.35 19

Method OLS OLS

N (sample size) 60 60

Template 38. Bioresources models proposed by South West Water

Description of dependent variable in bioresources models

Bioresources = sludge transport + sludge treatment + sludge disposal

Modelled OPEX = bioresources OPEX – bioresources third party – bioresources pensions – bioresources local authority rates

Modelled base CAPEX = bioresources maintenance infra + bioresources maintenance non-infra – bioresources grants and contributions

Modelled BOTEX = modelled OPEX + modelled base CAPEX

Modelled BOTEX+ (growth) enhancement = modelled BOTEX + bioresources first time sewerage + bioresources sludge enhancement (growth) + bioresources new developments and growth + bioresources growth at sewage treatment works + bioresources resilience + bioresources reduce flooding risk for properties

Modelled TOTEX = modelled BOTEX + bioresources other capital expenditure infra + bioresources other capital expenditure non-infra + bioresources infrastructure network reinforcement

Unsmoothed net costs from 2011/12 to 2016/17
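The cost definitions above can be written out as a short sketch. The dictionary keys are hypothetical names for the cost lines listed in the definitions, the figures are invented, and the formulas follow the definitions literally (so modelled TOTEX here adds only the three "other" capital lines to BOTEX).

```python
# Hypothetical key names for the cost lines in the definitions above.
GROWTH_LINES = (
    "first_time_sewerage", "sludge_enhancement_growth",
    "new_developments_and_growth", "growth_at_stw",
    "resilience", "reduce_flooding_risk",
)
OTHER_CAPEX_LINES = (
    "other_capex_infra", "other_capex_non_infra", "network_reinforcement",
)


def modelled_costs(c):
    """Return modelled BOTEX, BOTEX+ (growth) and TOTEX for one company-year."""
    opex = (c["opex"] - c["third_party"] - c["pensions"]
            - c["local_authority_rates"])
    base_capex = (c["maintenance_infra"] + c["maintenance_non_infra"]
                  - c["grants_and_contributions"])
    botex = opex + base_capex
    return {
        "botex": botex,
        "botex_plus_growth": botex + sum(c[k] for k in GROWTH_LINES),
        "totex": botex + sum(c[k] for k in OTHER_CAPEX_LINES),
    }


# Invented bioresources cost lines (GBP m) for a single company-year.
costs = {
    "opex": 50, "third_party": 2, "pensions": 3, "local_authority_rates": 5,
    "maintenance_infra": 4, "maintenance_non_infra": 18,
    "grants_and_contributions": 2,
    "first_time_sewerage": 1, "sludge_enhancement_growth": 2,
    "new_developments_and_growth": 3, "growth_at_stw": 1,
    "resilience": 2, "reduce_flooding_risk": 1,
    "other_capex_infra": 5, "other_capex_non_infra": 4,
    "network_reinforcement": 3,
}
result = modelled_costs(costs)
print(result)
```

Each of the three aggregates is then logged to form the dependent variable of the corresponding group of models.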

Consultation model ID SWBBR1 SWBBR2 SWBBR3 SWBBR4 SWBBR5 SWBBR6 SWBBR7 SWBBR8 SWBBR9

Company’s model ID 1 2 3 4 5 6 7 8 9

Dependent variable Bioresources BOTEX (ln) Bioresources BOTEX+ (growth) (ln) Bioresources TOTEX (ln)

Sludge produced (ln) 0.990*** (0.000) 1.064*** (0.000) 1.020*** (0.000) 1.044*** (0.000) 1.047*** (0.000) 1.114*** (0.000) 1.145*** (0.000) 1.166*** (0.000) 1.223*** (0.000)

Proportion of area with less than 250 people per km2 0.731** (0.030) 0.591* (0.078) 0.586 (0.116)

Number of treatment works per property (ln) 0.220* (0.059) 0.122 (0.282) 0.139 (0.284)

Proportion of load treated at works in size band 1-3 0.0556* (0.052) 0.059** (0.037) 0.0617** (0.034)

Constant -1.441** (0.041) 0.0256 (0.941) -1.357* (0.063) -1.577** (0.028) -0.475 (0.165) -1.778** (0.014) -1.966** (0.012) -0.847*** (0.009) -2.223*** (0.003)

R2 adjusted 0.739 0.734 0.743 0.765 0.757 0.778 0.787 0.781 0.799

Reset test 0.032 0.045 0.230 0.004 0.002 0.176 0.007 0.008 0.184

VIF max 2.020 3.934 2.269 2.020 3.934 2.269 2.020 3.934 2.269

Method OLS OLS OLS OLS OLS OLS OLS OLS OLS

N (sample size) 60 60 60 60 60 60 60 60 60

Template 39. Bioresources models proposed by United Utilities

Description of dependent variable

Models 8 and 9 are models of bioresources botex.

Botex has been derived by subtracting total enhancement expenditure (table 9, line 36), business rates (table 8 line 8) and third party services (table 8 lines 10 and 18) from net totex (table 8 line 21) for each of the respective value chains.

Each dependent variable includes smoothed base capex which minimises the impact of spikes.

For all models, the dependent variable is included in its logged form and is in 2012/13 CPIH FYA prices.

Comments on models (United Utilities)

Bioresources models perform well against statistical criteria but have lower explanatory power than econometric models for other subservices.

The model R2 scores of around 0.8 are acceptable but lower than those witnessed in other services. This may reflect the fact that companies can substitute activities between different parts of the wastewater value chain and therefore costs between bioresources and wastewater treatment more readily than they do between other service areas, that data quality is worse, in part as a result of inconsistency between companies in cost allocation and income accounting, that a suitable exogenous land bank variable has not been identified, or perhaps that there is greater variation in efficiency for the service.

Consultation model ID UUBR1 UUBR2

Company’s model ID 8 9

Dependent variable ln(Bioresources botex [transport, treatment & disposal])

Log(Total sewage sludge produced) 0.985*** (0.000)

1.008*** (0.000)

% 'work' done in sludge disposal operations (all forms of transportation)

0.006* (0.067)

0.006** (0.019)

% of load received by WwTW bands 1-3 5.536* (0.066)

5.627* (0.067)

% WwTW in sparse areas (Arup/Vivid) 0.253 (0.583)

Constant -1.447** (0.039)

-1.610** (0.015)

R2 adjusted 0.797 0.795

VIF (max) 2.51 2.84

Reset test 0.185 0.043

Estimation method OLS OLS

N (sample size) 60 60

Template 40. Bioresources models proposed by Welsh Water

Description of dependent variable

Bioresources includes costs for sludge transport, sludge treatment and sludge disposal

Bioresources Botex = “Total Operating Expenditure” – “Third Party Services” – “Local authority and Cumulo rates” + “Maintaining the long term capability of the assets – infra” + “Maintaining the long term capability of the assets - non-infra”

Values rebased to 2016/17 using CPIH in line with the PR19 Methodology Statement.
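Rebasing costs to a common price base is a ratio adjustment. The sketch below assumes a simple index ratio; the index values are hypothetical placeholders, not actual CPIH figures, and the function name is illustrative.

```python
def rebase(nominal: float, index_in_year: float, index_base_year: float) -> float:
    """Express a nominal cost in base-year prices using a price index."""
    return nominal * index_base_year / index_in_year


# Hypothetical index values (NOT actual CPIH figures).
cpih = {"2012/13": 95.0, "2016/17": 100.0}

# A cost of 50.0 incurred in 2012/13, expressed in 2016/17 prices.
real_cost = rebase(50.0, cpih["2012/13"], cpih["2016/17"])
print(round(real_cost, 2))
```

Because the adjustment is multiplicative, it shifts logged costs by a year-specific constant and leaves cross-company differences within a year unchanged.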

Comments on models (Welsh Water)

The submitted Bioresources model controls for the amount of transport required using the proportion of load treated in band 1-3 works and the proportion of sludge produced and treated at a site of STW and STC co-location.

The model’s coefficients have the expected sign and magnitude and perform well on the statistical tests.

This cost segment appears to be slightly more problematic to model compared to Network+ and aggregate BOTEX, with estimated range of efficiency scores across the industry being slightly larger.

Although the models produce statistically insignificant coefficients for some variables, the estimated signs and magnitudes are supported from an operational point of view. The models appear to have appropriate statistical properties and to be reasonably robust to other modelling approaches such as Random Effects and unit cost modelling.


Consultation model ID WSHBR1

Company’s model ID 7

Dependent variable Ln (Bioresources Botex)

Ln (Sewage Sludge Produced) 1.050*** (0.000)

% load treated in band 1-3 works 0.055** (0.024)

% sludge produced and treated at a site of STW and STC co-location

-0.001 (0.784)

Constant -1.399*** (0.006)

R2 adjusted 0.760

VIF (max) 2.586

Reset test 0.293

Estimation method OLS

N (sample size) 60

Template 41. Bioresources models proposed by Wessex Water

Description of dependent variable

Model 1 and 3: Bioresources botex = Opex + Capital Maintenance – Third party costs – Local authority rates – EA charges

Model 2 and 4: Bioresources botex = Opex + IRE + Average MNI over period – Third party costs – Local authority rates – EA charges

Description of selected explanatory variables

Ofwat measure of highly dense areas = the proportion of the company's area of service with over 6000 pop.

Comments on models (Wessex Water)

We include simple and exogenous models. No endogenous variables were included, to aid in setting a level playing field for market opening. The limited number of independent observations limits the number of variables we could include.

Consultation model ID WSXBR1 WSXBR2 WSXBR3 WSXBR4

Company’s model ID 1 2 3 4

Dependent variable Ln(Bioresources botex) Ln(Smooth bioresources botex) Ln(Unit Bioresources botex per load) Ln(Smooth unit Bioresources botex per load)

Sludge Produced 0.968*** (0.000)

0.952*** (0.000)

Ofwat measure of highly dense areas

-0.529 (0.228)

-0.474 (0.256)

-0.411 (0.195)

-0.384 (0.237)

Constant -0.660 (0.386)

-0.736 (0.269)

-8.760*** (0.000)

-8.910*** (0.000)

R2 adjusted 0.82 0.89 0.077 0.140

VIF (max) 1.67 1.67 1.67 1.67

Reset test 0.000 0.000 0.638 0.662

Estimation method OLS OLS OLS OLS

N (sample size) 60 60 60 60
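The gap between the adjusted R2 of the aggregate models and that of the unit cost models can be read through the following identity. It is a sketch under the assumption that "botex per load" means botex C divided by load L; the symbols α, γ, d (the density measure) and ε are introduced here for illustration.

```latex
\ln\!\left(\frac{C_{it}}{L_{it}}\right) = \alpha + \gamma\, d_{it} + \varepsilon_{it}
\quad\Longleftrightarrow\quad
\ln C_{it} = \alpha + 1 \cdot \ln L_{it} + \gamma\, d_{it} + \varepsilon_{it}
```

Under that reading, a unit cost model is an aggregate model with the coefficient on ln(load) restricted to one. Its R2 measures the share of variation in ln(C/L) explained, not the share of variation in ln(C), so the two sets of R2 values are not directly comparable.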


Template 42. Bioresources models proposed by Yorkshire Water

Description of dependent variable

The dependent cost variable modelled is BOTEX. The dependent cost variables are deflated using CPIH to 2016/17 prices. No smoothing was undertaken.

The costs are in net terms i.e. excluding grants and contributions (G&C) consistent with the PR14 approach. However, given lack of split of G&C for capital maintenance and enhancement expenditure, we have also modelled CAPEX on a gross basis.

Comments on models (Yorkshire Water)

The bioresources models below aim to explain variations in bioresources BOTEX through variations in scale, sludge treatment and density (as a possible proxy for sludge transportation requirement).

In models 2 and 3, the estimated coefficient on the density/sparsity variable appears to be of the right sign. While these models also have reasonable statistical properties, they result in a relatively wider efficiency range than other parts of the value chain. This indicates possible limitations in using them directly for price setting purposes, and a need for recourse to other modelling approaches and cross-checking.

Consultation model ID YKYBR1 YKYBR2 YKYBR3 YKYBR4 YKYBR5 YKYBR6

Company’s model ID 1 2 3 4 5 6

Dependent variable Bioresources BOTEX

Amount of Sludge produced (log) (ttds/ year)

0.920*** (0.000)

1.046*** (0.000)

1.080*** (0.000)

1.127*** (0.000)

1.107*** (0.000)

1.083*** (0.000)

% of sludge treated using AD or AAD

-0.646** (0.0277)

-0.718*** (0.000)

-0.741*** (0.009)

-0.740*** (0.001)

-0.703*** (0.008)

-0.686** (0.0103)

% of area with more than 2000 people per km2

-0.809*** (0.003)

% of area with more than 4000 people per km2

-0.843** (0.0381)

% of area with less than 250 people per km2

0.972*** (0.002)

Resident population per service area (log)

-0.343* (0.060)

Connected properties per service area (log)

-0.340* (0.100)

Constant -0.182 (0.550)

-0.480** (0.037)

-0.769* (0.090)

-1.672*** (0.005)

-0.723** (0.039)

-0.895** (0.046)

R2 adjusted 0.776 0.813 0.798 0.811 0.796 0.794

VIF (max) 1.105 1.529 2.160 2.282 2.716 2.431

Reset test 0.116 0.533 0.834 0.204 0.363 0.279

Estimation method OLS OLS OLS OLS OLS OLS

N (sample size) 60 60 60 60 60 60

2.2 Sewage treatment models

Template 43. Sewage treatment models proposed by Ofwat

Description of dependent variables

Sewage treatment base costs excluding cost items described in section 4 of the main consultation document.


Comments on models

We use two alternative scale drivers, properties and total load. We think these are appropriate scale variables for sewage treatment. We control for economies of scale by including the proportion of load treated in small works of bands 1-3. Models 3 to 6 add treatment complexity variables.

All estimated coefficients have the expected sign and are statistically significant, except for the percent of load coming from trade effluent customers, whose coefficient is small and not significant.

The reset test fails in all models. We tested specifications with quadratic and cross-product terms to allow for a more flexible relationship with the scale variable. This has not improved the reset test. The reset test should not be applied mechanically to exclude these models. Rather, it should prompt a specification search – which it did.
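For readers unfamiliar with the test, the sketch below shows one common form of the Ramsey RESET test: refit the model with powers of the fitted values added, and test those extra terms jointly with an F statistic. It is a pure-Python illustration on synthetic data, assuming squared and cubed fitted values; the exact variant used for the tables is not stated in this appendix, and the helper names are hypothetical.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def ols(X, y):
    """Return (coefficients, fitted values, residual sum of squares)."""
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in X]
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    return beta, fitted, rss


def reset_f_stat(X, y, powers=(2, 3)):
    """RESET F statistic: joint test of added powers of the fitted values."""
    _, fitted, rss_r = ols(X, y)
    X_aug = [r + [f ** p for p in powers] for r, f in zip(X, fitted)]
    _, _, rss_u = ols(X_aug, y)
    q = len(powers)                 # number of added terms
    df = len(y) - len(X_aug[0])     # residual degrees of freedom
    return ((rss_r - rss_u) / q) / (rss_u / df)


# Synthetic data: the true relationship is quadratic, the fitted model is linear.
xs = [i / 10 for i in range(30)]
noise = [0.05 * ((7 * i) % 5 - 2) for i in range(30)]
y = [1.0 + x + 0.5 * x * x + e for x, e in zip(xs, noise)]
X = [[1.0, x] for x in xs]
f_stat = reset_f_stat(X, y)
print(f"RESET F statistic: {f_stat:.1f}")
```

A large F statistic (equivalently a small p-value, which is what the "Reset test" rows appear to report) signals that the functional form may be missing curvature or interactions. Converting the statistic to a p-value requires the F distribution with (q, df) degrees of freedom, which is omitted here to keep the sketch dependency-free.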

All monetary values have been inflated to 2016-17 prices using the CPIH.

Consultation model ID OSWT1 OSWT2 OSWT3 OSWT4 OSWT5 OSWT6

Dependent variable -------- ln (sewage treatment base costs) --------

ln (properties) 1.000*** (0.000)

0.930*** (0.000)

0.899*** (0.000)

ln (load entering treatment works) 0.950*** (0.000)

0.884*** (0.000)

0.859*** (0.000)

% of load treated in STWs bands 1 to 3

0.053** (0.045)

0.054** (0.045)

0.056** (0.024)

0.058** (0.018)

0.056** (0.037)

0.058** (0.029)

% of biological load treated by STWs with an ammonia consent below 1mg

0.028** (0.011)

0.030*** (0.006)

0.028*** (0.004)

0.030*** (0.001)

% of load trade effluent customers received at treatment works

0.032 (0.621)

0.040 (0.516)

Constant 6.370*** (0.002)

3.869* (0.058)

7.154*** (0.001)

4.834** (0.017)

7.395*** (0.001)

5.183** (0.014)

R2 adjusted 0.868 0.864 0.896 0.897 0.898 0.907

VIF (max) 2.273 2.299 2.484 2.488 2.76 2.724

Reset test 0 0 0 0 0 0

Estimation method OLS OLS OLS OLS OLS OLS

N (sample size) 60 60 60 60 60 60

Template 44. Sewage treatment models proposed by Thames Water

Description of dependent variable

Sewage treatment botex = opex + capital maintenance (infra and non-infra)

Description of selected explanatory variables

Number of Works= Total number of works in each year for each company

Load Capacity Treatment Works=(Total Load Received)/(Total Number Of Works)

\[ \%\ \text{tight consent NH3} = \frac{\text{NH3}_{\le 1\,\text{mg/l}}}{\text{Total Load Received}} \times 100\% \]

Comments on models (Thames Water)

Based on an F-test, the Cobb-Douglas (CD) functional form is preferred over the Translog.

The estimated scale coefficients are strongly significant across all models, ranging from 0.89 to 0.95. Values below one are consistent with the expected presence of economies of scale.
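To see what a scale coefficient below one implies in a log-log model, the sketch below converts a coefficient into the proportional cost change for a given growth in the scale variable. It assumes all other drivers are held constant; the function name is illustrative, and the coefficients used are the end points of the range quoted above plus one for comparison.

```python
def cost_change(scale_elasticity: float, scale_growth: float) -> float:
    """Proportional cost change implied by a log-log model.

    With ln(cost) = a + b * ln(scale), scaling up by (1 + g) multiplies
    cost by (1 + g) ** b, other drivers held constant.
    """
    return (1.0 + scale_growth) ** scale_elasticity - 1.0


for beta in (0.89, 0.95, 1.0):
    change = cost_change(beta, 0.10)
    print(f"coefficient {beta}: 10% more load -> {change:.2%} more cost")
```

A coefficient of one would mean costs rise in proportion to load; anything lower means costs rise less than proportionally.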

Neither time dummies nor a time trend has a significant effect.

Load Capacity of Treatment Works, used as a proxy for the stock of capital, has a statistically significant effect. However, all the models provide statistical evidence that there are still issues with omitted variables. This might indicate that the stock of capital needs to be measured accurately, as it is a fundamental part of the cost structure of botex.


As an important driver for sewage treatment, we have explored the effect of quality in our models by using the following measure:

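\[ \text{Quality Tight Consent Max}_{it} = \frac{\max_{\text{Load Received}_{it}}\left(\text{NH3}_{\le 1\,\text{mg/l}},\ \text{BOD}_{\le 7\,\text{mg/l}},\ \text{P}_{\le 0.5\,\text{mg/l}}\right)}{\text{Total Load Received}_{it}} \times 100\% \]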

This variable takes the largest of the loads received under the three tight consents, for ammonia (NH3), BOD and phosphorus (P), as a proportion of the total load received. The measure captures the tight consents that companies face in NH3, BOD or P. These consents are determined exogenously by the Environment Agency and are outside management control. The estimated coefficient on this variable ranges between 0.0394 and 0.0403 (see models M2 and M3).

Specifically, the variables used in the quality measure are:

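\[ \text{NH3}_{\le 1\,\text{mg/l}} = \text{Load under } \text{NH3}_{\le 1\,\text{mg/l}} \text{ in kg BOD5/day} \]

\[ \text{BOD}_{\le 7\,\text{mg/l}} = \text{Load under } \text{BOD}_{\le 7\,\text{mg/l}} \text{ in kg BOD5/day} \]

\[ \text{P}_{\le 0.5\,\text{mg/l}} = \text{Load under } \text{P}_{\le 0.5\,\text{mg/l}} \text{ in kg BOD5/day} \]

\[ \text{Total Load Received}_{it} = \text{Band 1} + \text{Band 2} + \text{Band 3} + \text{Band 4} + \text{Band 5} + \text{Above Band 5}, \text{ all in kg BOD5/day} \]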
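The quality measure defined above can be computed as in the following sketch for a single company-year. The function and argument names are hypothetical and the loads are made-up figures in kg BOD5/day.

```python
from typing import Sequence


def quality_tight_consent_max(nh3_load: float, bod_load: float, p_load: float,
                              band_loads: Sequence[float]) -> float:
    """Largest tight-consent load as a percentage of total load received.

    nh3_load, bod_load and p_load are the loads under the NH3 <= 1 mg/l,
    BOD <= 7 mg/l and P <= 0.5 mg/l consents; band_loads holds the loads for
    Bands 1 to 5 and above Band 5.
    """
    total_load = sum(band_loads)
    return 100.0 * max(nh3_load, bod_load, p_load) / total_load


# Illustrative figures only.
pct = quality_tight_consent_max(
    nh3_load=120.0, bod_load=80.0, p_load=150.0,
    band_loads=[50.0, 100.0, 150.0, 200.0, 250.0, 250.0],
)
print(pct)  # 15.0
```

Taking the maximum, instead of the sum, avoids double counting load that is subject to more than one tight consent.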

Results show a strong and stable relationship which is statistically significant over a large set of models

Regional wages show a positive effect, as expected, when using pooled OLS. Initial results showed that the RE model tends to underestimate the effect of regional wages and sometimes produces an unexpected negative coefficient, which rules out the use of this econometric model.

Consultation model ID TMSSWT1 TMSSWT2

Company’s model ID 2 3

Dependent variable Ln(Botex Treatment)

Ln(Total Load Received) 0.956*** (0.000)

0.951*** (0.000)

% tight consent(NH3,BOD,P) 0.039*** (0.000)

0.040*** (0.000)

Ln(regional wages waste 2soc) 0.827 (0.504)

0.887 (0.510)

Ln(load capacity treatment works) -0.343** (0.029)

-0.346** (0.030)

Constant -7.496** (0.029)

-7.58** (0.039)

R2 adjusted 0.896 0.892

Reset test 0.000 0.000

VIF (max) 4.99 5.14

Method OLS OLS

N (sample size) 60 50

Template 45. Sewage treatment models proposed by United Utilities

Description of dependent variables

Wastewater treatment botex with selected enhancement expenditure, net of grants and contributions.

Botex excludes business rates and third party services.

Enhancement areas that are substitutable with base costs can be integrated with base cost models. In some areas, companies can achieve a service outcome either through spending on enhancement or through more intensive operation or maintenance of their existing assets. Where this is the case, merging relevant enhancement lines into base cost may be expected to improve the explanatory power of the models, especially where the base models include explanatory factors that are causally related to the enhancement lines.

The dependent variable is included in its logged form and is in 2012/13 CPIH FYA prices.

Comments on models (United Utilities)

See United Utilities’ comments on wastewater collection models.


Consultation model ID UUSWT1

Company’s model ID 6

Dependent variable ln(Wastewater treatment [incl selected enhancement])

% of population living in urban areas (Arup/Vivid) 2.263** (0.017)

Log(Total load received) 0.984*** (0)

% load received by WwTW bands 1-3 12.116*** (0.003)

% load received by WwTW with tertiary treatment (TA1/TA2/TB1/TB2)

0.275 (0.402)

2012-13 dummy 0.059 (0.113)

2013-14 dummy 0.024 (0.66)

2014-15 dummy 0.016 (0.818)

2015-16 dummy 0.059 (0.366)

2016-17 dummy 0.072 (0.236)

Constant -10.21*** (0)

R2 adjusted 0.897

VIF (max) 5.89

Reset test 0.0000

Estimation method OLS

N (sample size) 60

Template 46. Sewage treatment models proposed by Wessex Water

Description of dependent variable

Sewage treatment botex smoothed = Opex + IRE + average MNI over period – third party costs – local authority rates – abstraction charges

Comments on models (Wessex Water)

Variation 1 models: These are our Endogenous STW models. Variation 2 models: These are our Exogenous STW models. All models below produce very similar results with unsmoothed expenditure.

Consultation model ID WSXSWT1 WSXSWT2 WSXSWT3 WSXSWT4

Company’s model ID 2v1 2v2 4v1 4v2

Dependent variable Ln(Smooth ST botex) Ln(Smooth unit ST botex per load)

Total load (BOD) 0.710*** (0.000)

0.758*** (0.000)

Average size of works (total load / total works)

0.0450 (0.137)

-0.007 (0.841)

Ofwat measure of highly dense areas 0.142 (0.691)

-0.268 (0.404)

Proportion of load undergoing tertiary treatment

0.066 (0.878)

0.042 (0.923)

0.356 (0.535)

0.405 (0.463)

Constant -4.452** (0.014)

-4.852** (0.023)

-8.032*** (0.000)

-8.022*** (0.000)

R2 adjusted 0.88 0.863 0.068 0.12

VIF (max) 1.71 1.70 1.69 1.68

Reset test 0.000 0.000 0.111 0.004


Estimation method OLS OLS OLS OLS

N (sample size) 60 60 60 60

2.3 Bioresources plus models

Template 47. Bioresources plus models proposed by Ofwat

Description of dependent variables

Bioresources and sewage treatment base costs, excluding cost items described in section 4 of the main consultation document.

Comments on models

These models combine variables used in our bioresources and sewage treatment models. All cost drivers have the expected sign and are statistically significant. The goodness of fit of all models is high: each explains at least 90 percent of the variance in costs. All monetary values have been inflated to 2016-17 prices using the CPIH.

Consultation model ID OBP1 OBP2 OBP3 OBP4 OBP5 OBP6 OBP7

Dependent variable --------- ln (bioresources plus base costs) ---------

ln (properties) 0.963*** (0.000)

0.976*** (0.000)

0.779*** (0.000)

ln (load) 0.963*** (0.000)

0.911*** (0.000)

0.925*** (0.000)

0.746*** (0.000)

% load treated in STWs bands 1 to 3

0.047** (0.026)

0.047** (0.010)

0.050*** (0.002)

0.052** (0.013)

0.054*** (0.004)

% biological load treated by STWs with an ammonia consent below 1mg

0.012* (0.081)

0.012** (0.029)

% of intersiting work done by truck and tanker

-0.007*** (0.003)

-0.007*** (0.003)

-0.003** (0.037)

-0.004** (0.019)

% of sludge disposed to farmland

-0.011*** (0.000)

-0.013*** (0.000)

Constant 6.561*** (0.000)

8.298*** (0.000)

5.944*** (0.000)

7.654*** (0.000)

5.247*** (0.001)

9.788*** (0.000)

7.966*** (0.000)

R2 adjusted 0.919 0.946 0.953 0.933 0.937 0.903 0.904

VIF (max) 2.273 2.407 2.407 2.405 2.411 1.819 1.814

Reset test 0 0.073 0.003 0 0 0 0

Estimation method OLS OLS OLS OLS OLS OLS OLS

N (sample size) 60 60 60 60 60 60 60

2.4 Sewage collection models

Template 48. Sewage collection models proposed by Ofwat

Description of dependent variable

Sewage collection base costs excluding cost items described in section 4 of the main consultation document.

Comments on models

We use volume and connected properties as alternative scale variables.


We consider that the volume of wastewater is a strong cost driver. The cost of running the wastewater network will be driven more by the volume of wastewater being conveyed to the treatment works rather than by the pollutant load of the wastewater. The volume, rather than the pollutant load, will affect pumping costs and the size of pipes, which in turn have an influence on maintenance costs.

The number of connected properties is also a good output driver. However, while it will capture the volume of domestic wastewater, it may not capture the amount of surface water entering the system.

An alternative scale driver not present in our models is sewer length, which performs similarly well.

We included the number of network pumping stations per sewer length to account for network complexity. An alternative to the number of pumping stations might be the capacity of pumping stations, which seems to produce good results as well.

The variables percent of new mains and percent of gravity sewers rehabilitated were included as additional drivers of maintenance costs.

All monetary values have been inflated to 2016-17 prices using the CPIH.

Consultation model ID OSWC1 OSWC2 OSWC3 OSWC4 OSWC5

Dependent variable -------- ln (sewage collection base costs) --------

Log(connected properties) 0.796*** (0.000)

0.870*** (0.000)

0.858*** (0.000)

Log(volume) 0.772*** (0.000)

0.844*** (0.000)

Log(density) 0.703 (0.167)

0.856** (0.029)

Log(pumping stations per sewer length)

0.271** (0.046)

Pumping station per length (not log)

4.502** (0.023)

3.431** (0.046)

3.485* (0.074)

% of gravity sewer rehabilitated

0.294 (0.337)

0.368 (0.158)

0.337 (0.181)

Log(lengths replaced or renewed post 2001)

-0.063* (0.056)

% of lengths replaced or renewed post 2001

-0.007 (0.380)

-0.01 (0.281)

Constant 5.77** 3.58** 6.80*** 5.37*** 5.62***

R2 adjusted 0.889 0.907 0.882 0.886 0.896

VIF (max) 1.168 1.27 2.468 2.337 2.355

Reset test 0.361 0.021 0.032 0.014 0.005

Estimation method OLS OLS OLS OLS OLS

N (sample size) 60 60 60 60 60

Template 49. Sewage collection models proposed by Thames Water

Description of dependent variable

Sewage collection botex = opex + capital maintenance expenditure (infra and non-infra)

Description of selected explanatory variables

Length of Mains=Total length of "legacy" public sewers as at 31 March

Property Density=(Total Number of connected Properties)/(Length of Public Sewers)

Pumping station Capacity= Total Pumping station capacity (Source: Cost Assessment November 2017, Waste Network sheet)


VolWasteColl=Volume of wastewater receiving treatment at sewage treatment works. This is a good proxy for total volume of wastewater collected as one of the dimensions of the output in sewage collection. (Source: Cost Assessment November 2017, Waste Network sheet)

Population EQV=Current Population Equivalent served by STWs

Comments on models (Thames Water)

We have tested the type of functional form, Cobb-Douglas (CD) vs. Translog, and the results suggest that a CD functional form is more appropriate in this part of the wastewater value chain
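One common way to compare the two functional forms is a nested F-test, since the Translog adds squared and cross-product terms to the Cobb-Douglas specification. Whether this is the exact procedure used here is an assumption; the symbols below are introduced for illustration.

```latex
F = \frac{\left(RSS_{CD} - RSS_{TL}\right) / q}{RSS_{TL} / \left(n - k_{TL}\right)}
```

Here RSS_CD and RSS_TL are the residual sums of squares of the Cobb-Douglas and Translog models, q is the number of additional Translog terms, n is the sample size and k_TL is the number of Translog parameters. A small F statistic means the extra terms add little explanatory power, which favours the simpler Cobb-Douglas form.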

We have used the total length of sewer mains as the scale variable (where mains = total length of "legacy" public sewers as at 31 March). The estimated coefficients are robust and significant across all the models, ranging between 0.64 and 0.73.

Pumping station capacity, as a proxy for the stock of capital, shows a significant effect in all the models explored. Its coefficient ranges between 0.12 and 0.20.

All our models suggest that there are no issues with omitted variables.

The estimated results show a positive regional wage effect. However, model M4 underestimates this effect when a random effects econometric model is used (a similar result is found in sewage treatment).

From M4 to M6, we have included another important dimension of output: the volume of wastewater collected, proxied by the volume of wastewater receiving treatment at sewage treatment works as a proportion of population equivalent. The results provide statistically significant evidence of the importance of this driver for sewage collection botex. Its coefficient ranges from 0.33 to 0.35 when the RE model 4 is excluded.

Consultation model ID TMSSWC1 TMSSWC2 TMSSWC3

Company’s model ID 4 6 7

Dependent variable Ln(Botex Collection)

Ln(Mains) 0.660*** 0.679*** 0.657***

(0.000) (0.000) (0.000)

Ln(Property Density) 1.270** 1.281** 1.244***

(0.011) (0.011) (0.000)

Ln(Regional Wage_waste_2soc) 0.470 0.682 0.597

(0.465) (0.318) (0.286)

Ln(Pumping Station Capacity) 0.127** 0.101* 0.124***

(0.033) (0.062) (0.003)

Ln(VolWasteColl/Population EQV) 0.347** 0.339* 0.354***

(0.029) (0.054) (0.008)

Time -0.005

(0.653)

Constant -3.270 -3.680 -3.556*

(0.140) (0.133) (0.069)

R2 adjusted 0.925 0.928 0.924

Reset test 0.411 0.728 0.470

VIF (max) 2.21 2.37 2.73

Method OLS OLS OLS

N (sample size) 60 50 60

Template 50. Sewage collection models proposed by United Utilities

Description of dependent variable

Model 1 uses sewage collection botex as its dependent variable. Model 5 uses wastewater collection botex, with selected enhancement expenditure as its dependent variable.

Botex has been derived by subtracting total enhancement expenditure (table 9, line 36), business rates (table 8 line 8) and third party services (table 8 lines 10 and 18) from net totex (table 8 line 21) for each of the respective value chains.

Each dependent variable includes smoothed base capex which minimises the impact of spikes. In order to prevent further reduction in the number of data points available, smoothing is undertaken by adjusting actual base capex using a ‘smoothing factor’, which is the ratio between the smoothed (using an extended dataset) and unsmoothed base capex for each company for each year in the sample period.
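The smoothing factor approach can be sketched as follows for one company. The smoothing method itself is not specified in the description, so a three-year trailing average is assumed here purely for illustration; the figures and names are hypothetical.

```python
def trailing_average(series, window):
    """Trailing moving average; entries without a full window are None."""
    out = []
    for i in range(len(series)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(series[i + 1 - window:i + 1]) / window)
    return out


# Hypothetical base capex for one company. The first two years precede the
# sample period and come from the extended dataset.
extended_capex = [10.0, 14.0, 30.0, 12.0, 16.0, 11.0]
sample_start = 2  # index of the first sample-period year

smoothed = trailing_average(extended_capex, window=3)  # assumed method
sample_years = range(sample_start, len(extended_capex))

# Smoothing factor: smoothed / unsmoothed, per year in the sample period.
factors = [smoothed[i] / extended_capex[i] for i in sample_years]

# Adjusted base capex: actual base capex times its smoothing factor.
adjusted = [extended_capex[i] * f for i, f in zip(sample_years, factors)]
print(factors)
print(adjusted)
```

Because the average draws on years before the sample period, every sample-period year keeps an observation, which is the stated reason for using an extended dataset.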

The dependent variable is included in its logged form and is in 2012/13 CPIH FYA prices.

Comments on models (United Utilities)

These models incorporate the findings of the Arup and Vivid Economics reports published alongside this consultation. Engineering criteria consider whether model explanatory variables represent factors that will cause costs in AMP7 and whether the sign and magnitude of model coefficients are consistent with these causal narratives. These criteria thus consider models’ predictive plausibility directly. Statistical criteria are more limited because they appraise models’ predictive power only through models’ performance in historical datasets. With a large number of causal narratives to account for and limited data available, all models will predict costs with error and biases that affect companies in different ways. By choosing suites of models with different underlying assumptions or drawbacks, errors and biases can be reduced though not eliminated, which will improve the accuracy of predictions and reduce risks. The use of a diverse set of models is more likely to achieve this than a set of very similar models, whose errors and biases will be highly correlated with each other.
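The argument for a diverse suite of models can be illustrated with a stylised two-model case. Assume, for illustration only, that two models have prediction errors e_1 and e_2 with zero mean, equal variance σ² and correlation ρ, and that their predictions are averaged with equal weights.

```latex
\operatorname{Var}\!\left(\frac{e_1 + e_2}{2}\right)
  = \frac{\sigma^2 + \sigma^2 + 2\rho\sigma^2}{4}
  = \frac{\sigma^2 \left(1 + \rho\right)}{2}
```

With very similar models (ρ close to 1) the averaged error variance stays close to σ², so little is gained. With uncorrelated errors (ρ = 0) it halves. In both cases the error is reduced at most, never eliminated.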


Consultation model ID UUSWC1 UUSWC2

Company’s model ID 1 5

Dependent variable Sewage collection botex Sewage collection (incl selected enhancement)

Log(total sewer length) 0.371* (0.065)

0.382** (0.033)

Log(Annual urban runoff) (Arup/Vivid) 0.328 (0.214)

0.283 (0.197)

% of population living in urban areas (Arup/Vivid)

0.731 (0.48)

1.190** (0.032)

2012-13 -0.113 (0.457)

-0.081 (0.477)

2013-14 -0.0351 (0.719)

0.0389 (0.643)

2014-15 -0.029 (0.691)

0.084 (0.373)

2015-16 -0.0659 (0.568)

-0.069 (0.409)

2016-17 0.0092 (0.906)

0.005 (0.924)

Constant -2.319* (0.08)

-2.307** (0.033)

R2 adjusted 0.834 0.856

VIF (max) 14.77 14.77

Reset test 0.000 0.0437

Estimation method OLS OLS

N (sample size) 60 60

Template 51. Sewage collection models proposed by Wessex Water

Description of dependent variables

Sewerage Botex = Opex + IRE + Average MNI over period – Third party costs – Local authority rates – Abstraction charges

Comments on models (Wessex Water)


The main cost driver is the number of connected properties. The main issue we faced was how to model density. We found that the inclusion of measures of density based on sewerage area per connected property produced robust results. The models include aggregate and unit cost models based on the number of connected properties, and the linear and quadratic term accounting for population density.
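With a linear and a quadratic density term, the effect of density on cost is not constant. As a sketch, writing x for the catchment area measure as it enters the model and β_1, β_2 for its linear and quadratic coefficients (symbols introduced here for illustration):

```latex
\ln C = \dots + \beta_1 x + \beta_2 x^2 + \dots
\qquad\Longrightarrow\qquad
\frac{\partial \ln C}{\partial x} = \beta_1 + 2\beta_2 x
```

The two coefficients therefore need to be read together: the sign and size of the density effect depend on the level of x at which it is evaluated.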

All models below provide very similar results with unsmoothed expenditure.

Consultation model ID WSXSWC1 WSXSWC2

Company’s model ID 2 4

Dependent variable Ln(sewage collection botex smoothed) Ln(sewage collection botex per property smoothed)

Connected Properties 0.685*** (0.000)

Sewage Catchment area per 1k properties -0.210 (0.144)

0.036 (0.771)

Sewage Catchment area per 1k properties ^2 -0.143 (0.695)

-0.655** (0.075)

Constant -0.639 (0.421)

-2.929*** (0.000)

R2 adjusted 0.895 0.328

VIF max 3.48 1.67

Reset test 0.024 0.000

Estimation method OLS OLS

N (sample size) 60 60

2.5 Network plus wastewater models

Template 52. Network plus wastewater models proposed by Ofwat

Description of dependent variable

Network plus wastewater base costs excluding cost items described in section 4 of the main consultation document. These costs include sewage collection and treatment base costs.

All monetary values have been inflated to 2016-17 prices using the CPIH.

Comments on models

The models contain cost drivers that are relevant for sewage collection or treatment in terms of scale, density, or complexity.

In addition to the variables described in sewage collection and treatment models, we also test sewer length as an alternative scale variable. The coefficient on the density variable increases dramatically when length of sewers is included in the model. This is because of the relationship between the variables: density is defined as properties per length of sewer. While the coefficient on density may appear high, it should be considered together with the coefficient on length. Short sewer length can contribute to high density. As a result, the two coefficients will offset each other to provide, arguably, a plausible outcome.
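The offsetting effect follows directly from the definition of density. Writing P for properties, L for sewer length and D = P / L for density, with β_L and β_D as the coefficients on ln(sewer length) and ln(density) (symbols introduced here for illustration):

```latex
\begin{aligned}
\ln D &= \ln P - \ln L \\
\beta_L \ln L + \beta_D \ln D &= \beta_D \ln P + \left(\beta_L - \beta_D\right) \ln L
\end{aligned}
```

A model with length and density is therefore equivalent to one with properties and length, in which the net effect of length is β_L − β_D. A large density coefficient is partly a statement about properties, and it reduces the net length effect by the same amount.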


Consultation model ID ONPWW1 ONPWW2 ONPWW3 ONPWW4 ONPWW5 ONPWW6 ONPWW7 ONPWW8 ONPWW9 ONPWW10

Dependent variable ----------------- ln (network plus wastewater base costs) -----------------

ln (properties)

.769*** (0.000)

.774*** (0.000)

.721*** (0.000)

ln (load) .732*** (0.000)

.738*** (0.000)

.690*** (0.000)

ln (volume) .738*** (0.000)

.746*** (0.000)

ln (sewer length)

.769*** (0.000)

.738*** (0.000)

ln (density) .703*** (0.002)

0.688** (0.011)

0.435 (0.436)

1.47*** (0.000)

% lengths of sewer laid post 2001

-.020*** (0.000)

-.018*** (0.000)

-.016** (0.011)

-.020*** (0.000)

-.019** (0.012)

-.017** (0.012)

-.015** (0.018)

-0.018 (0.166)

-.017*** (0.003)

-.016*** (0.004)

% of load, ammonia consent < 1mg

0.019** (0.013)

0.018** (0.021)

Constant 5.16*** 7.11*** 7.67*** 5.17*** 8.10*** 9.99*** 9.43*** 11.8*** 8.84*** 10.6***

R2 adjusted 0.925 0.923 0.888 0.925 0.905 0.904 0.882 0.836 0.92 0.917

VIF (max) 1.016 1.02 1.027 1.025 1.007 1.011 1.015 1.01 1.279 1.291

Reset test 0.003 0 0.025 0.003 0 0 0.032 0 0.008 0.002

Estimation method OLS OLS OLS OLS OLS OLS OLS OLS OLS OLS

N (sample size) 60 60 60 60 60 60 60 60 60 60

Template 53. Network plus wastewater models proposed by Anglian Water

Description of dependent variables

Average system models (1-6): Natural log of wastewater network plus botex excluding rates per system

Passing Distance models: (7-11): Natural log of wastewater network plus botex excluding rates

Acronyms used in explanatory variables

p.e = population equivalent

Comments on models (Anglian Water)

All models are described in detail in our Cost Modelling report – Phase 2, published March 2018: http://www.anglianwater.co.uk/about-us/thinking-about-our-future/

Translog model forms will inevitably see increased multicollinearity (as measured by VIF). This is the downside of the trade-off between the explanatory power of specific coefficients and the additional explanatory power of the model consequent on the inclusion of interacting terms. It is worth noting that in most cases, the coefficients quoted are significant. Furthermore, multicollinearity does not invalidate the model; it just makes it more difficult to interpret specific coefficients.
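The link between Translog terms and VIF can be illustrated with a small pure-Python sketch. The VIF of a regressor is 1 / (1 − R²) from regressing it on the other regressors. The data are synthetic, the helper names are hypothetical, and the example is not a reconstruction of any model in the tables.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def rss_of_fit(X, y):
    """Residual sum of squares from an OLS fit of y on the rows of X."""
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    beta = solve(xtx, xty)
    return sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2
               for r, yi in zip(X, y))


def vif(columns, j):
    """VIF of column j: TSS / RSS from regressing it on a constant and the rest."""
    y = columns[j]
    others = [c for i, c in enumerate(columns) if i != j]
    X = [[1.0] + [c[r] for c in others] for r in range(len(y))]
    mean = sum(y) / len(y)
    tss = sum((v - mean) ** 2 for v in y)
    return tss / rss_of_fit(X, y)  # equals 1 / (1 - R^2)


# Synthetic logged drivers, only weakly correlated with each other.
n = 12
ln_x = [1.0 + 0.2 * i for i in range(n)]
ln_z = [1.0 + 0.2 * ((5 * i) % n) for i in range(n)]

cobb_douglas = [ln_x, ln_z]
translog = cobb_douglas + [
    [a * a for a in ln_x],                  # squared term
    [b * b for b in ln_z],                  # squared term
    [a * b for a, b in zip(ln_x, ln_z)],    # cross-product term
]

vif_cd = max(vif(cobb_douglas, j) for j in range(len(cobb_douglas)))
vif_tl = max(vif(translog, j) for j in range(len(translog)))
print(f"max VIF, Cobb-Douglas terms only: {vif_cd:.2f}")
print(f"max VIF, with Translog terms:     {vif_tl:.2f}")
```

The squared and cross-product terms are built from the same underlying drivers, so each driver is well explained by the others and the maximum VIF rises even though the drivers themselves are only weakly correlated.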

Consultation model ID ANHNPWW1 ANHNPWW2 ANHNPWW3 ANHNPWW4 ANHNPWW5 ANHNPWW6 ANHNPWW7 ANHNPWW8 ANHNPWW9 ANHNPWW10 ANHNPWW11

Company’s model ID 1 2 3 4 5 6 7 8 9 10 11

Dependent variable Botex exc rates per system Botex exc rates

Ln(p.e. x(1-Sparsity <600km2)) Unit: Population

0.383*** (0.004)

0.327*** (0.000)

0.382*** (0.000)

0.361*** (0.0)

0.342*** (0.0)

0.338*** (0.0)

0.388*** (0.004)

Ln(p.e x Sparsity <600km2) Unit: Population

1.133*** (0.003)

1.129*** (0.0)

1.037*** (0.0)

0.946*** (0.0)

0.758** (0.013)

0.464*** (0.001)

0.486*** (0.000)

Ln((p.e x(1- Sparsity <600km2)) x ln(Total length of sewer) Populationxkm

-0.862* (0.094)

-0.528** (0.022)

-0.761*** (0.01)

-0.467*** (0.005)

-0.580* (0.056)

-0.271** (0.043)

0.417*** (0.000)

0.490*** (0.000)

0.365*** (0.000)

0.331*** (0.001)

Ln(p.e. Sparsity <600km2) x ln(Total length of sewer) Unit: Populationxkm

1.636** (0.028)

1.448*** (0.000)

1.504*** (0.002)

1.190*** (0.000)

0.947 (0.115)

0.366 (0.12)

0.535*** (0.000)

0.594

Ln(p.e.(1- Sparsity <600km2))^2 Unit: Population2

0.619 (0.221)

0.264 (0.199)

0.525* (0.084)

0.272* (0.065)

0.462 (0.105)

0.276** (0.019)

Combined sewer length as % total sewer length

0.011** (0.028)

0.010*** (0.0)

0.010*** (0.004)

0.008*** (0.0)

0.011*** (0.001)

0.008*** (0.0)

Pump capacity / # Water Recycling Centres (kW/system)

0.002 (0.17)

0.002*** (0.0)

Ln(p.e. x % indigenous sludge) Unit: Population

0.417*** (0.000)

0.4901*** (0.000)

0.365*** (0.000)

0.331*** (0.000)

Ln(p.e. x (1- % indigenous sludge)) Unit: Population

0.535***

(0.000)

0.594***

(0.000)

0.445***

(0.000)

0.286***

(0.000)

Ln(Total length of sewer) 0.275* (0.065)

0.288** (0.040)

-0.035 (0.821)

0.392*** (0.000)

0.511*** (0.003)

Ln(# Water Recycling Centres) x ln(Total length of sewer) Unit: kmxsystem

-0.172** (0.019)

-0.437*** (0.000)

-0.463*** (0.001)

-0.299*** (0.000)

-0.180 (0.344)

Ln(Total length of sewer)^2 0.461*** (0.000)

0.563*** (0.000)

0.612*** (0.001)

0.540*** (0.000)

0.380* (0.077)

Combined sewer length as % total sewer length

0.009*** (0.000)

0.010*** (0.000)

0.004*** (0.009)

0.009*** (0.000)

0.009*** (0.000)

Consultation model ID ANHNPWW1 ANHNPWW2 ANHNPWW3 ANHNPWW4 ANHNPWW5 ANHNPWW6 ANHNPWW7 ANHNPWW8 ANHNPWW9 ANHNPWW10 ANHNPWW11

Company’s model ID 1 2 3 4 5 6 7 8 9 10 11

Pump capacity/ Total length of sewer Unit: kW/km

0.1221*** (0.000)

0.146*** (0.000)

0.126*** (0.000)

0.130*** (0.000)

2013 dummy 0.112** (0.014)

0.113 (0.121)

0.123** (0.026)

2014 dummy 0.117*** (0.009)

0.117 (0.109)

0.026 (0.119)

0.025 (0.14)

0.022 (0.544)

0.114** (0.039)

0.020 (0.276)

2015 dummy 0.024 (0.606)

0.020 (0.786)

0.031* (0.072)

0.031* (0.068)

0.029 (0.431)

0.014 (0.8)

0.025 (0.172)

2016 dummy 0.128*** (0.005)

0.125* (0.088)

0.073*** (0.0)

0.071*** (0.0)

0.068* (0.073)

0.099* (0.073)

0.057*** (0.003)

2017 dummy 0.197*** (0.0)

0.193*** (0.009)

0.110*** (0.0)

0.076** (0.045)

0.102*** (0.0)

0.097** (0.012)

0.155*** (0.006)

0.049** (0.04)

0.081*** (0.000)

Constant -0.163* (0.099)

-0.141** (0.015)

-0.0883 (0.21)

-0.035 (0.136)

-0.068 (0.294)

-0.037 (0.218)

0.048** (0.015)

-0.036 (0.402)

0.0267 (0.398)

0.031** (0.046)

0.014 (0.778)

R2 adjusted 0.942 0.933 0.972 0.969 0.981 0.981 0.973 0.949 0.926 0.984 0.983

Reset test N/A 0.0001 N/A N/A N/A 0.000 0.000 0.040 0.000 0.370 N/A

VIF (max) N/A 71.57 N/A N/A N/A 70.62 59.13 25.49 22.99 23.74 N/A

Method RE OLS RE RE RE OLS RE OLS RE RE RE

N (sample size) 60 60 50 50 50 50 60 60 50 50 50

Template 54. Network plus wastewater models proposed by Southern Water

Description of dependent variables

Modelled OPEX plus modelled base CAPEX.

Modelled OPEX is total OPEX less third party services, abstraction charges and local authority rates.

Modelled base CAPEX is maintenance expenditure in infrastructure and non-infrastructure less grants and contributions.

All costs are unsmoothed and deflated to 2016/17 prices using CPIH.
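As a minimal sketch of the deflation step, the function below re-expresses a nominal cost in base-year prices using a price index. The index values and the cost figure are placeholders invented for illustration; they are not actual CPIH figures or company costs.

```python
def to_base_year_prices(nominal_cost, index_in_year, index_in_base_year):
    """Re-express a nominal cost in base-year prices using a price index."""
    return nominal_cost * index_in_base_year / index_in_year

# Placeholder index values for illustration only (NOT actual CPIH figures).
price_index = {"2011/12": 94.0, "2016/17": 101.0}

nominal_botex_2011_12 = 250.0  # hypothetical nominal cost, in £m
real_botex_2011_12 = to_base_year_prices(
    nominal_botex_2011_12,
    price_index["2011/12"],
    price_index["2016/17"],
)
```

A cost already stated in the base year is unchanged by the conversion, because the two index values are then equal.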

Comments on models (Southern Water)

The two network+ models are similar to models 1 and 3 of the BOTEX models, providing alternative approaches to control for pumping capacity per length of sewer. These models also appear to estimate coefficients that are operationally intuitive with reasonable statistical properties.

Consultation model ID SRNNPWW1 SRNNPWW2

Company’s model ID 1 2

Dependent variable ln (Network+ BOTEX)

Total number of properties (log) (000s) 0.704*** (0.000) 0.679*** (0.000)

Proportion of load with BOD<10mg/L and amm<1mg/L (%) 4.227*** (0.001) 4.410*** (0.001)

Pumping station capacity per km sewer (kW/km) 0.074*** (0.002)

Pumping station capacity per km sewer (log) (kW/km) 0.198*** (0.000)

Proportion of area with more than 4000 people per km2 (%) -0.956*** (0.000) -0.992*** (0.001)

Constant -0.402 (0.376) -0.193 (0.683)

R2 adjusted 0.910 0.914

VIF (max) 4.745 4.679

Reset test 0.034 0.005

Estimation method OLS OLS

N (sample size) 60 60

Template 55. Network plus wastewater models proposed by Severn Trent Water

Description of dependent variable

Models 8-11: Network plus botex gross of grants and contributions

Model 12: Network plus unit cost botex

Description of selected explanatory variables

Load Total load received, kg BOD5/day

No. of STW's Total number of sewage treatment works

Density Properties/mains length

Weighted density Ofwat's new weighted density index

Tight BOD and N3 consents

Constructed from old June returns data and recent APR's. This is the sum of the number of tight BOD (<10mg/l) and tight ammonia (<5mg/l) consents at large STWs (band 6).

No. of tertiary works

This is the number of large (band 6) works that have a tertiary treatment stage.

Prop. Load with tight N3 consent

This is the proportion of load that has an ammonia consent of 3mg/l or less. Engineering logic informs us that it would be better to have included the load with consents of between 3mg/l and 5mg/l also but this data was not readily available.

Length/Load Length of sewerage mains divided by load

No. of STW's/load No. of STW's divided by load

Sludge vol Total volume of sludge produced (ttds)

Av. Distance intersiting

Total intersiting "work" done divided by sludge vol. (km/yr)

Av. Distance intersiting via pipe

Intersiting "work" done by pipeline divided by sludge vol. (km/yr)

% anaerobic digestion

% of sludge treated with anaerobic digestion (conventional and advanced)

Av. Distance to disposal

Total disposal "work" divided by total sludge vol. (km/yr)

% collocated sites % of sludge treated at a site of STC and STW collocation

Comments on models (Severn Trent Water)

Model 8 uses network plus base costs as the dependent variable. The coefficients are in line with expectations.

Model 9 extends model 8 by adding non-linear terms in the load and no. of STW’s variables. Our prior expectations on the sum of the first order load, STW’s and treatment variables remain unchanged, and have broadly been met in this model.

Model 10 changes only the treatment variable and once again our prior expectations are broadly met. The random effects version of the model poses the same problems as in model 9, with the treatment variable of a negligible size and highly insignificant.

Model 11 extends model 8 with non-linear terms in density and load and also changes the treatment variable. The expression of the treatment variable as a proportion of load also changes our prior expectations with the load and number of works variables which are now expected to sum to around 1 in the presence of constant returns to scale. These coefficients come in broadly in line with expectations, as does the coefficient for the treatment variable.

Model 12 is a unit cost model with all drivers scaled by load. This again imposes an assumption of constant returns to scale in load.
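To see why scaling by load imposes constant returns to scale, consider a generic unit cost specification with cost C, load L and a single driver X. The notation is illustrative and is not taken from the submission.

```latex
\ln\!\left(\frac{C}{L}\right) = \alpha + \beta \ln\!\left(\frac{X}{L}\right)
\quad\Longrightarrow\quad
\ln C = \alpha + \beta \ln X + (1-\beta)\,\ln L
```

The implied coefficients on ln X and ln L sum to 1 whatever the value of β, so increasing load and the driver by the same proportion increases cost by that proportion.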

Consultation model ID SVTNPWW1 SVTNPWW2 SVTNPWW3 SVTNPWW4 SVTNPWW5

Company’s model ID 8 9 10 11 12

Dependent variable Ln(Network plus waste base costs)

Ln(Load) .54*** (.00)

.32** (.02)

.4*** (.00)

.67*** (.00)

Ln(Length/Load) .61 (.16)

Ln(Density) 1.37** (.00)

1.2*** (.00)

1.4*** (.00)

1.4*** (.00)

1.6*** (.00)

Ln (No. of STW’s/Load) .31*** (.00)

Ln (No. of STW’s) .4** (.00)

.41*** (.00)

.4*** (.00)

.4*** (.00)

Ln (Sum of tight BOD and N3 permits)

.1** (.02)

.17** (.00)

Prop. of load subject to tight ammonia consent

.23 (.2)

.21 (.3)

Ln(Load)^2 -.03 (.55)

-.14* (.058)

.06 (.2)

Ln(Density)^2 4.78*** (.00)

Ln(No. of STW’s)^2 .12 (.67)

.04 (.75)

Ln(Load) X Ln(No. of STW’s)

-.33** (.047)

-.37*** (.00)

Ln(large tertiary works) .27*** (.00)

Dummy 2012 -.05 (.2)

-.06 (.23)

-.03 (.3)

-.04 (.2)

-.06 (.17)

Dummy 2013 -.03 (.3)

-.03 (.33)

-.01 (.78)

-.02 (.48)

-.03 (.36)

Dummy 2014 -.03 (.4)

-.03 (.45)

-.001 (.98)

-.02 (.47)

-.03 (.36)

Dummy 2015 -.02 (.65)

.02 (.63)

-.001 (.97)

-.01 (.78)

-.01 (.67)

Dummy 2016 -.01 (.72)

-.01 (.65)

-.002 (.95)

-.01 (.66)

-.01 (.65)

Constant 5.34*** (.00)

5.3*** (.00)

5.37*** (.00)

5.2*** (.00)

-7.5*** (.00)

R2 adjusted .96 .97 .98 .97 .76

Reset test 0.01 0.001 0.01 0.01 0.01

VIF max 7.7 (Load) 22 (Load) 14 (Load) 7.9 (Load) 5.1

Method OLS OLS OLS OLS OLS

N (sample size) 60 60 60 60 60

Template 56. Network plus wastewater models proposed by South West Water

Description of dependent variable

Network+ = Sewage collection + Sewage treatment

Modelled OPEX = Network+ OPEX – Network+ Third party – Network+ pensions – Network+ Local authority rates

Modelled base CAPEX = Network+ Maintenance infra + Network+ Maintenance non-infra – Network+ grants and contributions

Modelled BOTEX = Modelled OPEX + Modelled base CAPEX

Modelled BOTEX+ (growth) enhancement = modelled BOTEX + network+ first time sewerage + network+ sludge enhancement (growth) + network+ new developments and growth + network+ growth at sewage treatment works + network+ resilience + network+ reduce flooding risk for properties

Modelled TOTEX = modelled BOTEX + network+ other capital expenditure infra + network+ other capital expenditure non-infra + network+ infrastructure network reinforcement

Unsmoothed net costs from 2011/12 to 2016/17
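The modelled OPEX, base CAPEX and BOTEX definitions above can be sketched as simple arithmetic. The function and argument names are assumptions made for illustration, and the figures are hypothetical values in £m rather than company data.

```python
def modelled_opex(opex, third_party, pensions, la_rates):
    """Network+ OPEX less third party, pensions and local authority rates."""
    return opex - third_party - pensions - la_rates

def modelled_base_capex(maint_infra, maint_non_infra, grants_and_contributions):
    """Infra plus non-infra maintenance, less grants and contributions."""
    return maint_infra + maint_non_infra - grants_and_contributions

def modelled_botex(opex_items, capex_items):
    """Modelled BOTEX = modelled OPEX + modelled base CAPEX."""
    return modelled_opex(**opex_items) + modelled_base_capex(**capex_items)

# Hypothetical figures (in £m) for one company-year.
opex_items = {"opex": 100.0, "third_party": 5.0, "pensions": 3.0, "la_rates": 7.0}
capex_items = {
    "maint_infra": 40.0,
    "maint_non_infra": 30.0,
    "grants_and_contributions": 4.0,
}

botex = modelled_botex(opex_items, capex_items)
```

The BOTEX+ (growth) and TOTEX variants add further enhancement lines to this total in the same way.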

Comments on models (South West Water)

We have adopted the same approach to modelling network plus wholesale wastewater costs as for aggregate wholesale wastewater costs, as there were no bioresources-specific drivers in our aggregate models. We have not, at this stage, examined the appropriateness of different estimation approaches. We do note, however, that some models seem more robust than others and clearly this will have implications for identifying relative efficiency.

See our aggregate wholesale wastewater BOTEX submission for a more detailed review of the drivers considered, which were:

Scale

Sparsity/economies of scale

Local environmental sensitivities (tightness of consents)

Costs of operating and maintaining network assets

Holiday population

We have explored specifications which capture several of these factors, although due to the nature of the data it is not possible to combine all factors into one model. All models capture scale, pumping costs and tightness of consents.

As with aggregate BOTEX, models 1, 2, 5, 6, 9 and 10 also use a metric of population density or sparsity. See the aggregate BOTEX submission for more detail on the rationale for this choice of driver.

As with aggregate BOTEX, models 3, 7 and 11 use the number of sewage treatment works. See the aggregate BOTEX submission for more detail on the rationale for this choice of driver.

As with aggregate BOTEX, models 4, 8 and 12 use the ratio of non-resident to resident population. See the aggregate BOTEX submission for more detail on the rationale for this choice of driver.

We have extended our aggregate BOTEX modelling to models controlling for BOTEX + growth enhancement and TOTEX (see discussion in wholesale wastewater models). As can be seen from the efficiency ranges in the Excel document, while modelling BOTEX+ (growth) does not widen the efficiency ranges, including quality enhancement to model TOTEX does lead to somewhat broader efficiency ranges. As for wholesale water models, we would recommend that BOTEX+ (growth) and TOTEX modelling approaches are explored to the fullest possible extent at PR19.

All of the BOTEX models estimate statistically significant coefficients which are supported from an operational and economic perspective. The relationship between cost and cost drivers in BOTEX+ (growth) and TOTEX models is broadly similar to that estimated in BOTEX models, although not all coefficients pass statistical significance tests. We would note that almost all models considered have significant coefficients on tightness of consents and one or both of pumping capacity and/or a measure of sparsity/economies of scale. This would suggest that these drivers have the strongest statistical relationship with cost.

Given our focus on modelling what we consider to be key industry drivers of cost, we have not explored estimation approaches beyond OLS with robust standard errors. We will be considering the most appropriate estimation approaches as part of our consultation response.

All models are broadly robust from a statistical perspective.

Adjusted R2 is sufficiently high.

VIF (a measure of collinearity) is well below the ‘rule of thumb’ threshold of 10.

We find mixed evidence from the RESET test on whether the model would be improved by the addition of polynomial terms, i.e. given the control variables, whether the model is mis-specified.
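A minimal sketch of the standard Ramsey RESET procedure follows: powers of the fitted values are added to the regression and tested jointly with an F test. It is run on synthetic data, and it is an assumption that the submitted models used this exact variant of the test. The function returns the F statistic and its degrees of freedom rather than a p-value.

```python
import numpy as np

def reset_f_statistic(y, X, max_power=3):
    """Ramsey RESET F statistic for an OLS model of y on X (intercept added).

    Returns (F, q, df_resid), where q is the number of added powers of the
    fitted values and df_resid is the residual degrees of freedom of the
    augmented regression.
    """
    y = np.asarray(y, dtype=float)
    n = len(y)
    Z = np.column_stack([np.ones(n), np.asarray(X, dtype=float)])

    def fit(design):
        beta, *_ = np.linalg.lstsq(design, y, rcond=None)
        fitted = design @ beta
        resid = y - fitted
        return fitted, resid @ resid

    fitted, rss_restricted = fit(Z)
    powers = np.column_stack([fitted ** p for p in range(2, max_power + 1)])
    _, rss_unrestricted = fit(np.column_stack([Z, powers]))

    q = powers.shape[1]
    df_resid = n - Z.shape[1] - q
    f_stat = ((rss_restricted - rss_unrestricted) / q) / (rss_unrestricted / df_resid)
    return f_stat, q, df_resid

# Synthetic data: one truly linear relationship and one with curvature.
rng = np.random.default_rng(1)
x = rng.uniform(0.0, 3.0, size=60)
noise = rng.normal(0.0, 0.1, size=60)
y_linear = 1.0 + 2.0 * x + noise
y_curved = 1.0 + 2.0 * x + 1.5 * x ** 2 + noise

f_linear, q, df_resid = reset_f_statistic(y_linear, x)
f_curved, _, _ = reset_f_statistic(y_curved, x)
```

A large F statistic (equivalently, a small p-value such as those reported in the tables) indicates that the added polynomial terms explain variation that the original specification misses.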

Consultation model ID SWBNPWW1 SWBNPWW2 SWBNPWW3 SWBNPWW4 SWBNPWW5 SWBNPWW6 SWBNPWW7 SWBNPWW8 SWBNPWW9 SWBNPWW10 SWBNPWW11 SWBNPWW12

Company’s model ID Model 1 Model 2 Model 3 Model 4 Model 5 Model 6 Model 7 Model 8 Model 9 Model 10 Model 11 Model 12

Dependent variable Network+ BOTEX (ln) Network+ BOTEX+ (growth) (ln) Network+ TOTEX (ln)

Properties (ln) 0.699*** (0.000)

0.798*** (0.000)

0.741*** (0.000)

0.982*** (0.000)

0.656*** (0.000)

0.740*** (0.000)

0.720*** (0.000)

0.945*** (0.000)

0.732*** (0.000)

0.829*** (0.000)

0.674*** (0.000)

0.984*** (0.000)

Pumping capacity over mains (ln)

0.179*** (0.000)

0.186*** (0.000)

0.166*** (0.000)

0.187*** (0.000)

0.090** (0.040)

0.086* (0.073)

0.088** (0.026)

0.085* (0.066)

0.165*** (0.003)

0.203*** (0.001)

0.161*** (0.000)

0.171*** (0.002)

Proportion of load with BOD<10mg/L and amm<1mg/L

2.173*** (0.000)

1.907*** (0.001)

1.580*** (0.002)

0.325 (0.509)

2.030*** (0.000)

1.934*** (0.000)

1.514*** (0.001)

0.191 (0.662)

2.473*** (0.000)

1.819*** (0.005)

1.604*** (0.003)

0.837 (0.137)

Number of combined sewer overflow per km sewer (ln)

0.159** (0.024)

0.214** (0.011)

0.163** (0.022)

0.107* (0.066)

0.158** (0.021)

0.101* (0.059)

0.170** (0.027)

0.210*** (0.008)

0.171** (0.016)

Proportion of area with more than 2,000 people per km2

-0.517*** (0.007)

-0.458** (0.011)

-0.450** (0.032)

Proportion of area with less than 250 people per km2

0.463** (0.017)

0.501*** (0.008)

0.127

(0.607)

Number of treatment works per property (ln)

0.161** (0.011)

0.143** (0.015)

0.009

(0.894)

Ratio of non-resident to resident population

0.041*** (0.004)

0.046*** (0.001)

0.037** (0.0252)

Constant 0.179 (0.713)

-0.820* (0.094)

-0.474 (0.260)

-2.036*** (0.007)

0.566 (0.199)

-0.346 (0.446)

-0.130 (0.755)

-1.710** (0.011)

0.253 (0.748)

-0.533 (0.409)

0.186 (0.729)

-1.718* (0.050)

R2 adjusted 0.893 0.889 0.881 0.895 0.902 0.901 0.896 0.910 0.889 0.881 0.870 0.891

Reset test 0.000 0.000 0.001 0.000 0.002 0.001 0.000 0.000 0.000 0.001 0.003 0.015

VIF max 7.743 6.336 4.556 8.892 7.743 6.336 4.556 8.892 7.743 6.336 4.556 8.892

Method OLS OLS OLS OLS OLS OLS OLS OLS OLS OLS OLS OLS

N (sample size) 60 60 60 60 60 60 60 60 60 60 60 60

Template 57. Network plus wastewater models proposed by Thames Water

Description of dependent variable

Sewage Network Plus botex = opex + capital maintenance expenditure (infra and non-infra)

Description of selected explanatory variables

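$$\text{Total Load Received}_{it} = \text{Band 1} + \text{Band 2} + \text{Band 3} + \text{Band 4} + \text{Band 5} + \text{Above Band 5}, \quad \text{all in kg BOD}_5/\text{day}$$

$$\text{Load Capacity Treatment Works} = \frac{\text{Total Load Received}}{\text{Total Number Of Works}}$$

$$\text{Quality Tight Consent Max}_{it} = \frac{\max\,\text{Load Received}_{it}\{NH_3 \le 1\,\text{mg/l},\ BOD \le 7\,\text{mg/l},\ P \le 0.5\,\text{mg/l}\}}{\text{Total Load Received}_{it}} \times 100\%$$

$$\text{Property Density} = \frac{\text{Total Number of Connected Properties}}{\text{Length of Public Sewers}}$$

Pumping Station Capacity = Total pumping station capacity (Source: Cost Assessment November 2017, Waste Network sheet)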

For regional wages (2 soc) we use the latest version from January 2018.
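The variable definitions above can be sketched in code as follows. The sketch assumes that the quality variable takes the largest of the three loads subject to a tight ammonia, BOD or phosphorus consent and expresses it as a percentage of total load; that reading, the function names and all the figures are assumptions made for illustration.

```python
def load_capacity_treatment_works(total_load, number_of_works):
    """Average load received per treatment works."""
    return total_load / number_of_works

def quality_tight_consent_max(load_tight_nh3, load_tight_bod, load_tight_p, total_load):
    """Largest load under a tight NH3, BOD or P consent, as % of total load."""
    return max(load_tight_nh3, load_tight_bod, load_tight_p) / total_load * 100.0

def property_density(connected_properties, sewer_length_km):
    """Connected properties per km of public sewer."""
    return connected_properties / sewer_length_km

# Hypothetical company-year figures (loads in kg BOD5/day).
band_loads = [1000.0, 2000.0, 3000.0, 4000.0, 5000.0, 85000.0]  # bands 1-5, above band 5
total_load = sum(band_loads)

load_per_works = load_capacity_treatment_works(total_load, number_of_works=400)
tight_share = quality_tight_consent_max(30000.0, 20000.0, 10000.0, total_load)
density = property_density(connected_properties=2_000_000, sewer_length_km=40_000)
```

Because the quality variable is expressed in percentage points, a coefficient on it is read as the approximate proportional change in cost per percentage point of load under a tight consent.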

Comments on models (Thames Water)

The scale variable estimates are strongly significant across all models, ranging from 0.90 to 1.05, suggesting the presence of economies of scale.

We ran network plus models that control for density using the weighted average population density (wad); these yield a high estimated coefficient for regional wages (1.205). When density is instead controlled for by property density and the models are estimated by OLS, the regional wage estimates are more sensible, ranging between 0.576 and 1.01, but they are not statistically significant in any model.

Models tend to have a higher adjusted R2 when density is controlled for by property density than when population density (wad) is used.

All the wastewater network plus models consistently fail the Ramsey RESET test for omitted variables, although the failure is less severe than in the sewage treatment case. This might indicate that the problem remains in the sewage collection models, as none of the treatment models passed the test. It might be explained by the way the stock of capital is measured.

We have tested time dummies and time trend variables, with no relevant significant effects.

Finally, the quality variable proposed in sewage treatment, Quality Tight Consent Max_it, has produced consistent and strongly significant effects across all the specifications and models, ranging between 0.021 and 0.024.
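The scale coefficients discussed above are elasticities, because both cost and load enter the models in logs. A minimal sketch of how to read them, using the two scale estimates reported in the Thames Water table (1.049 and 0.903), is given below; the 10% increase in load is an arbitrary illustration.

```python
def cost_ratio(elasticity, scale_ratio):
    """Proportional change in cost implied by a log-log scale elasticity."""
    return scale_ratio ** elasticity

# Implied cost ratio for a 10% increase in load under each estimate.
implied = {elasticity: cost_ratio(elasticity, 1.10) for elasticity in (0.903, 1.049)}
```

An elasticity below 1 means that cost rises less than proportionally with load, while an elasticity above 1 means that cost rises more than proportionally.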

Consultation model ID TMSNPWW1 TMSNPWW2

Company’s model ID 2 3

Dependent variable Ln(Totex Water NetworkPlus)

Ln (Total Load Received) 1.049*** 0.903***

(0.000) (0.000)

Prp Tight Consents Max(NH3, BOD, P) (%) 0.022*** 0.024***

(0.001) (0.000)

Ln(Regional Wages waste 2soc) 0.988 0.576

(0.237) (0.591)

Ln(Property Density) 1.309 1.149

(0.004) (0.003)

Ln(Load Capacity Treatment Works) -0.422*** -0.349***

(0.000) (0.006)

Ln(Pumping Station Capacity) 0.118**

(0.031)

Time -0.014

(0.303)

Constant -4.609*** -3.584***

(0.087) (0.197)

R2 adjusted 0.950 0.959

Reset test 0.018 0.003

VIF (max) 7.13 12.13

Method OLS OLS

N (sample size) 50 60

Template 58. Network plus wastewater models proposed by United Utilities

Description of dependent variable

Wastewater network plus botex, net of grants and contributions.

It excludes business rates and third party services.

Each dependent variable includes smoothed base capex which minimises the impact of spikes.

Comments on models

See United Utilities’ comments on wastewater collection models.

Consultation model ID UUNPWW1

Company’s model ID 4

Dependent variable ln(Wastewater network plus botex)

Log(total load received) 0.864*** (0)

% load received by WwTW bands 1-3 10.687*** (0.003)

% of population living in urban areas (Arup/Vivid) 2.432** (0.022)

% load received by WwTW with tertiary treatment (TA1/TA2/TB1/TB2) 0.259 (0.332)

2012-13 dummy 0.072** (0.028)

2013-14 dummy 0.051 (0.132)

2014-15 dummy 0.030 (0.444)

2015-16 dummy 0.059 (0.159)

2016-17 dummy 0.068 (0.144)

Constant -8.207*** (0.001)

R2 adjusted 0.928

VIF (max) 6.89

Reset test 0.0001

Estimation method OLS

N (sample size) 60

Template 59. Network plus wastewater models proposed by Welsh Water

Description of dependent variables

Wastewater Network Plus includes costs for Sewage Collection and Sewage Treatment

Wastewater Network Plus Botex = “Total Operating Expenditure” – “Third Party Services” – “Local authority and Cumulo rates” + “Maintaining the long term capability of the assets – infra” + “Maintaining the long term capability of the assets - non-infra”

Values rebased to 2016/17 using CPIH in line with the PR19 Methodology Statement.

Comments on models (Welsh Water)

The Wastewater Network Plus model submitted is similar to the aggregate model. This is to be expected as Wastewater Network Plus accounts for more than 80% of the wholesale botex on average across the industry.

The model includes a scale variable, the number of connected properties, alongside variables to capture density, treatment complexity and maintenance drivers.

The model does not pass the model specification test (reset) at the specified level. Due to the relatively small sample for the wastewater industry, coupled with relatively stable cost drivers over time we have placed importance on the model consistency and interpretability from an economic perspective.

Consultation model ID WSHNPWW1

Company’s model ID 6

Dependent variable Ln(Wastewater Botex)

Ln(Connected Properties) (000s) 0.795*** (0.001)

Ln(Pumping Station Capacity per km of sewer) (kW/km) 0.182** (0.034)

Ln(Number of combined sewer overflows per km of combined sewer) (nr/km) 0.203 (0.154)

% of load with BOD<10mg/L and Ammonia <1 mg/L (%) 3.918** (0.016)

% of area with less than 250 people per km2 0.448 (0.163)

Constant -0.774 (0.293)

R2 adjusted 0.904

VIF (max) 6.336

Reset test 0.001

Estimation method OLS

N (sample size) 60

Template 60. Network plus wastewater models proposed by Yorkshire Water

Description of dependent variable

Wastewater network plus base costs = operating expenditure less third party services and local authority rates + capital maintenance expenditure net of grants and contributions (G&C).

The dependent variable is deflated using CPIH to 2016/17 prices. No smoothing was undertaken.

Comments on models (Yorkshire Water)

The Network+ models proposed are similar to the aggregate wholesale models. The general limitations and modelling observations highlighted above also apply here.

The models appear to estimate coefficients of the right sign and appropriate magnitude, and are robust to the various statistical tests.

The statistical performance of the models is broadly consistent with and without G&C.

Consultation model ID YKYNPWW1 YKYNPWW2 YKYNPWW3 YKYNPWW4 YKYNPWW5

Company’s model ID 1 2 3 4 5

Dependent variable Network+ BOTEX

Total number of properties (log) (000s)

0.817*** (0.000)

0.699*** (0.000)

0.724*** (0.000)

0.846*** (0.000)

0.795*** (0.000)

% load with BOD<10mg/L and amm<1mg/L

2.399* (0.079)

4.346** (0.011)

4.288*** (0.002)

1.746* (0.081)

2.995** (0.048)

Pumping station capacity per km sewer (log) (kW/km)

0.238*** (0.001)

0.179** (0.034)

0.213*** (0.001)

0.332*** (0.002)

0.290*** (0.002)

Number of combined sewer overflows per km of sewer (log) (nr/km)

0.202 (0.200)

0.159 (0.194)

0.0580 (0.576)

% of area with more than 2000 people per km2

-0.517 (0.158)

% of area with more than 4000 people per km2

-0.923*** (0.001)

-0.502* (0.060)

% of sewers that are combined sewer

0.858*** (0.008)

0.574* (0.079)

Constant -0.681 (0.385)

0.179 (0.791)

-0.377 (0.597)

-1.832* (0.058)

-1.326 (0.185)

R2 adjusted 0.882 0.893 0.914 0.921 0.926

VIF (max) 6.280 7.743 6.645 4.810 6.435

Reset test 0.000 0.000 0.003 0.646 0.012

Estimation method OLS OLS OLS OLS OLS

N (sample size) 60 60 60 60 60

2.6 Wholesale wastewater models

Template 61. Wholesale wastewater models proposed by Ofwat

Description of dependent variable

Wholesale wastewater base costs = bioresources, treatment and collection base costs, excluding cost items described in section 4 of the main consultation document.

Comments on models

We used connected properties or load treated as a volume driver.

The coefficient of the number of pumping stations is not significant. If we use capacity of pumping stations instead of number this becomes significant. We will consider the appropriate measure based on responses to this consultation.

All coefficients have the expected sign and plausible magnitude. We considered alternative, more flexible specifications in light of the failure of the Reset test. This search did not yield a better model, and we consider that the models are appropriate despite the low Reset test values.

All monetary values have been inflated to 2016-17 prices using the CPIH.

Consultation model ID OWWW1 OWWW2 OWWW3 OWWW4 OWWW5 OWWW6 OWWW7 OWWW8

Dependent variable ln (wholesale wastewater base costs)

Ln(properties) 0.976*** (0.000)

0.961*** (0.000)

0.975*** (0.000)

Ln(load) 0.877*** (0.000)

0.852*** (0.000)

0.924*** (0.000)

0.910*** (0.000)

0.921*** (0.000)

% lengths replaced post 2001

-0.013** (0.013)

-0.015** (0.021)

-.013*** (0.003)

-.015*** (0.002)

ln (pumping stations per sewer length)

0.141

(0.207)

% load treated in STWs bands 1-3

0.034* (0.079)

0.019 (0.386)

0.061*** (0.000)

0.052*** (0.000)

0.048*** (0.000)

0.066*** (0.000)

0.055*** (0.000)

0.050*** (0.000)

% load from trade effluent customers

0.069*** (0.010)

0.087*** (0.001)

% sludge disposed to farmland

-.008*** (0.001)

-.009*** (0.000)

Ln(density) 1.170*** (0.009)

0.667* (0.087)

0.742*** (0.003)

1.317*** (0.001)

0.688** (0.032)

0.775*** (0.000)

Constant 8.25*** (0.000)

9.01*** (0.000)

2.23 (0.139)

5.51*** (0.009)

4.45*** (0.001)

-0.94 (0.402)

3.08* (0.076)

1.81* (0.050)

R2 adjusted 0.946 0.951 0.958 0.963 0.966 0.963 0.967 0.971

VIF (max) 2.35 3.545 2.838 2.631 2.671 2.913 2.669 2.699

Reset test 0.001 0 0.01 0.002 0.001 0.002 0 0.01

Method OLS OLS OLS OLS OLS OLS OLS OLS

N (sample size) 60 60 60 60 60 60 60 60

Template 62. Wholesale wastewater models proposed by Southern Water

Description of dependent variable

y = modelled OPEX plus modelled base CAPEX.

Modelled OPEX is total OPEX less third party services, abstraction charges and local authority rates.

Modelled base CAPEX is maintenance expenditure in infrastructure and non-infrastructure less grants and contributions.

All costs are unsmoothed and deflated to 2016/17 prices using CPIH.

Comments on models (Southern Water)

The four models provide alternatives in the following areas:

Pumping station capacity – moving sewage around is a key driver of wastewater costs. Models 1 and 2 control for pumping station capacity per length of sewer in levels, while models 3 and 4 control for pumping station capacity per length of sewer in logs. While regulatory precedent indicates modelling this variable in logarithms, we have presented both alternatives.

Bioresources drivers – models 2 and 4 control for sludge treatment and transport to account for variation in bioresources costs, whilst models 1 and 3 control for density/sparsity as an alternative (possibly capturing some aspect of the need for sludge transport).
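The choice between levels and logs for pumping station capacity changes how the coefficient is read. The sketch below contrasts the two interpretations, using one levels coefficient (0.056) and one log coefficient (0.153) from the Southern Water table as examples. The function names and the size of the illustrative changes are assumptions.

```python
import math

def pct_cost_change_levels(coef, delta_kw_per_km):
    """Cost in logs, driver in levels: semi-elasticity.

    Returns the % change in cost for an absolute change in the driver.
    """
    return (math.exp(coef * delta_kw_per_km) - 1.0) * 100.0

def pct_cost_change_logs(coef, pct_change_in_driver):
    """Cost and driver both in logs: elasticity.

    Returns the % change in cost for a % change in the driver.
    """
    return ((1.0 + pct_change_in_driver / 100.0) ** coef - 1.0) * 100.0

# One extra kW/km of pumping capacity under the levels specification.
effect_levels = pct_cost_change_levels(0.056, delta_kw_per_km=1.0)

# A 10% increase in pumping capacity per km under the log specification.
effect_logs = pct_cost_change_logs(0.153, pct_change_in_driver=10.0)
```

In the levels form the cost effect of an extra kW/km is the same percentage whatever the starting capacity, whereas in the log form the effect depends on the proportional change in capacity.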

Consultation model ID | SRNWWW1 | SRNWWW2 | SRNWWW3 | SRNWWW4
Company’s model ID | 1 | 2 | 3 | 4
Dependent variable | ln (Wholesale wastewater BOTEX) in all models
Total number of properties (log) (000s) | 0.714*** (0.000) | 0.771*** (0.000) | 0.697*** (0.000) | 0.732*** (0.000)
% of load with BOD<10mg/L and amm<1mg/L | 3.798*** (0.000) | 2.140* (0.074) | 3.926*** (0.000) | 2.397** (0.045)
Pumping station capacity per km sewer (kW/km) | 0.056*** (0.007) | 0.0538*** (0.003) | - | -
Pumping station capacity per km sewer (log) (kW/km) | - | - | 0.153*** (0.000) | 0.138*** (0.000)
% of area with more than 4000 people per km2 | -0.863*** (0.000) | - | -0.892*** (0.000) | -
% of sludge treated using AD or AAD | - | -0.305* (0.086) | - | -0.265 (0.121)
Total measure of intersiting 'work' done (all forms of transportation) per unit sludge produced (log) (km/year) | - | 0.142** (0.016) | - | 0.143** (0.020)
Constant | -0.211 (0.600) | -0.774 (0.238) | -0.0710 (0.840) | -0.496 (0.386)
R2 adjusted | 0.939 | 0.927 | 0.943 | 0.928
VIF (max) | 4.745 | 6.429 | 4.679 | 5.930
Reset test | 0.164 | 0.0353 | 0.0354 | 0.0241
Estimation method | OLS | OLS | OLS | OLS
N (sample size) | 60 | 60 | 60 | 60

Template 63. Wholesale wastewater models proposed by Severn Trent Water

Description of dependent variables

Models 1-5: Wholesale botex gross of grants and contributions

Models 6-7: Wholesale unit cost botex

Description of selected explanatory variables

Load: Total load received, kg BOD5/day

No. of STW's: Total number of sewage treatment works

Density: Properties/mains length

Tight BOD and N3 consents: Constructed from old June returns data and recent APRs. This is the sum of the number of tight BOD (<10mg/l) and tight ammonia (<5mg/l) consents at large STWs (band 6).

No. of tertiary works: This is the number of large (band 6) works that have a tertiary treatment stage.

Prop. Load with tight N3 consent: This is the proportion of load that has an ammonia consent of 3mg/l or less. Engineering logic informs us that it would be better to also include the load with consents of between 3mg/l and 5mg/l, but this data was not readily available.

Length/Load: Length of sewerage mains divided by load

No. of STW's/load: No. of STW's divided by load

Comments on models (Severn Trent Water)

Model 1 OLS: Model 1 presents a log-linear model with the sum of tight BOD and ammonia consents acting as the treatment cost driver. The coefficients are all broadly of a magnitude that we would expect and are all significant, and our prior expectations on the three scale-related coefficients are met, with the three summing almost exactly to 1.

Model 2 OLS: Model 2 extends model 1 with non-linear terms in the load and no. of STW variables as well as an interaction term between the two. The high correlation between the interaction term and other variables in this model led to some coefficient inaccuracy (although our prior hypotheses on the three core variables were still not rejected and all were correctly signed and of a sensible magnitude), which was rectified by changing the treatment variable to the number of tertiary works. Following discussion with Reckon (“Review of Severn Trent’s sewerage cost models”, Reckon for Severn Trent (2018)), who argued against inclusion of an interaction term, we constructed model 3.

Model 3 OLS: Model 3 adds non-linear terms in load and the number of STW’s to model 1. The variables are all statistically significant and of a logical magnitude and our prior expectations are broadly met. The absence of the interaction term improves model stability substantially.

Model 4 RE: Model 4 is a random effects version of model 3 with the addition of an interaction term. The interaction term appears to slightly reduce the stability of the model; however, the coefficients remain in line with our expectations (with or without the interaction term). It should be noted that the use of the number of works with tight ammonia consents only as a measure of treatment complexity also works quite well in models 1-4, in that the coefficients are in line with expectations. However, these models tend to have greater problems with multicollinearity, with many variables insignificant.

Model 5 OLS: This model is more like Ofwat’s PR14 specifications with non-linear terms in the density and load terms. The expression of the treatment variable as a proportion of load also changes our prior expectations on the load and number of works variables which are now expected to sum to around 1 in the presence of constant returns to scale. These coefficients come broadly in line with expectations.

Model 6 OLS: Model 6 is a unit cost model with all drivers scaled by load. This imposes an assumption (which we consider rather arbitrary) of constant returns to scale in load. While we would have preferred to scale by the number of properties, we found it difficult to obtain sensible coefficients when we adopted that approach.

Model 7 OLS: Model 7 changes only the treatment variable with most other coefficients remaining a similar magnitude to model 6.
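The constant-returns restriction noted for Model 6 can be made explicit by rearranging a unit cost specification. In the sketch below, C (botex), L (load), z_k (the remaining regressors, however scaled or logged) and the Greek coefficients are notation assumed here for illustration, not taken from the submission.

```latex
\[
\ln\!\left(\frac{C}{L}\right) = \alpha + \sum_k \beta_k z_k + \varepsilon
\quad\Longleftrightarrow\quad
\ln C = \alpha + 1 \cdot \ln L + \sum_k \beta_k z_k + \varepsilon
\]
```

Holding the regressors z_k fixed, the coefficient on ln L is set to 1 rather than estimated, which is the assumption the comment on Model 6 describes.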

Consultation model ID SVTWWW1 SVTWWW2 SVTWWW3 SVTWWW4 SVTWWW5 SVTWWW6 SVTWWW7

Company’s model ID 1 2 3 4 5 6 7

Dependent variable Ln(botex waste) Ln(botex waste per unit load)

Ln(Load) .56*** (.00)

.49*** (.00)

.58*** (.00)

.47** (.02)

.66*** (.00)

Ln(Length/Load) .68 (.14)

.42 (.25)

Ln(Density) 1.07** (.04)

1.05** (.03)

1.06** (.02)

1.15*** (.00)

1.1*** (.00)

1.5*** (.01)

1.31*** (.00)

Ln (No. of STW’s) .34** (.01)

.34** (.02)

.36** (.02)

.39*** (.00)

.35*** (.00)

Ln (No. of STW’s/Load) .28*** (.00)

.32*** (.00)

Ln (Sum of tight BOD and N3 permits)

.09** (.03)

.09** (.03)

.1 (.105)

Ln(Load)^2 -.09 (.28)

.04* (.096)

.01 (.9)

Ln(No. of STW’s)^2 .08 (.67)

.07 (.77)

.2 (.4)

Ln(Load) X Ln(No. of STW’s)

-.32 (.16)

-.27 (.25)

Ln(large tertiary works) .17** (.04)

Prop. of load subject to tight ammonia consent

.25* (.08)

.14 (.56)

Prop. tight BOD load .71 (.23)

Ln(Load)^2 .06 (.11)

Ln(Density)^2 4.95*** (.00)

Dummy 2012 -.02 (.58)

-.01 (.8)

-.02 (.58)

-.02 (.6)

-.01 (.76)

-.02 (.6)

-.03 (.54)

Dummy 2013 -.004 (.00)

.01 (.8)

-.004 (.87)

-.004 (.86)

-.01 (.8)

-.002 (.96)

-.005 (.88)

Dummy 2014 -.002 (.93)

.014 (.65)

-.002 (.9)

-.002 (.9)

-.004 (.88)

-.003 (.91)

-.004 (.89)


Dummy 2015 .003 (.92)

.01 (.7)

.003 (.9)

.003 (.9)

-.01 (.72)

.007 (.79)

.004 (.86)

Dummy 2016 -.006 (.7)

-.00 (.9)

-.006 (.7)

-.001 (.68)

-.006 (.67)

-.004 (.78)

-.004 (.76)

Constant 5.5*** (.00)

5.5*** (.00)

5.49*** (.00)

5.5*** (.00)

5.37*** (.00)

-7.3*** (.00)

-7.3*** (.00)

R2 adjusted .96 .97 .96 .98 .97 .76 .79

Reset test 0.004 0.00 0.00 0.00 0.04 0.001 0.11

VIF (max) 7.7 14 8.4 21.9 7.9 5.1 5.5

Method OLS OLS OLS RE OLS OLS OLS

N (sample size) 60 60 60 60 60 60 60

Template 64. Wholesale wastewater models proposed by South West Water

Description of dependent variable

Modelled OPEX = OPEX – third party – pensions – local authority rates

Modelled base CAPEX = maintenance infra + maintenance non-infra – grants and contributions

Modelled BOTEX = modelled OPEX + modelled base CAPEX

Modelled BOTEX+ (growth) enhancement = modelled BOTEX + first time sewerage + sludge enhancement (growth) + new developments and growth + growth at sewage treatment works + resilience + reduce flooding risk for properties

Modelled TOTEX = modelled BOTEX + other capital expenditure infra + other capital expenditure non-infra + infrastructure network reinforcement

Unsmoothed net costs from 2011/12 to 2016/17
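The definitions above can be sketched as a small function that builds each modelled aggregate from its components. The function and parameter names are hypothetical. `growth_enhancement` stands for the sum of the growth-related enhancement lines listed above, `other_capex` stands for the sum of the three additional capital expenditure lines in the TOTEX definition, and the example figures are invented for illustration.

```python
def modelled_costs(opex, third_party, pensions, la_rates,
                   maint_infra, maint_non_infra, grants_and_contributions,
                   growth_enhancement, other_capex):
    """Build the modelled cost aggregates from their components (same units throughout)."""
    modelled_opex = opex - third_party - pensions - la_rates
    base_capex = maint_infra + maint_non_infra - grants_and_contributions
    botex = modelled_opex + base_capex
    return {
        "opex": modelled_opex,
        "base_capex": base_capex,
        "botex": botex,
        "botex_plus_growth": botex + growth_enhancement,
        "totex": botex + other_capex,
    }


# Invented figures, for illustration only.
example = modelled_costs(
    opex=100.0, third_party=5.0, pensions=3.0, la_rates=2.0,
    maint_infra=40.0, maint_non_infra=20.0, grants_and_contributions=10.0,
    growth_enhancement=15.0, other_capex=25.0,
)
```

Following the definitions as written, both BOTEX+ (growth) and TOTEX are built on modelled BOTEX, each adding its own set of expenditure lines.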

Explanatory factors

Data on explanatory factors is taken from the Ofwat industry data-share. Measures of density and sparsity are Ofwat constructed data, using ONS statistics.

Comments on models (South West Water)

We have focused on capturing the key drivers of costs in wholesale wastewater that are operationally robust and statistically valid. We have not, at this stage, examined the appropriateness of different estimation approaches. We do note, however, that some models seem more robust than others and clearly this will have implications for identifying relative efficiency.

The key drivers we have focused on for aggregate wholesale wastewater modelling are:

Scale (properties): there are significant benefits from economies of scale in wastewater services. Properties represents the most appropriate scale driver for aggregate wastewater costs as it captures simultaneously the volume of waste that requires treatment and the size of the network as captured by the number of connections.

Sparsity/economies of scale: the cost of providing wastewater services to a dispersed customer base spread out across a company’s operating area is substantially greater than for serving major urban conurbations. This is most apparent in wastewater treatment, where there are large economies of scale in the size of treatment works (for example, a number of companies have extremely large wastewater treatment works approaching 1,000,000 p.e. and up to 3,000,000 p.e.).

Local environmental sensitivities (tightness of consents): depending on local environmental sensitivities, companies face different costs in treating and disposing of waste. To capture these differences we control for the impact of tight consents as these are outside of management control. (We note, however, that UV consents are only available in the large wastewater dataset, so we have not been able to control for such consents).

Costs of operating and maintaining network assets: the two key asset types we have identified as driving maintenance costs are pumping stations/capacity (driven by the topography and sparsity of the region) and the number of combined sewer overflows (driven by topography and climate).

Holiday population: a large increase in the population in the summer months increases costs over and above treating the same total wastewater flows in a steady state due to the need to build peak capacity and to ramp up and down the treatment process (manifesting as higher chemical and energy use and more maintenance needs).

We have explored specifications which capture several of these factors, although due to the nature of the data it is not possible to combine all factors into one model. All models (i) capture scale, (ii) pumping costs and (iii) tightness of consents.

Models 1, 2, 5, 6, 9 and 10 also use a metric of population density or sparsity to capture the impact of serving populations in remote rural locations. When we include a density measure we find it has a negative effect (in contrast to a positive effect in water) due to the beneficial economies of scale in treatment works discussed above. These metrics fall entirely outside management control, and so can be regarded as an entirely exogenous driver of costs. However, as a fairly generic index of population sparsity it may not capture differences in sewage collection, treatment and bioresources that are explained by factors other than population density, such as topography. These factors constrain economic transport distances and rationalisation potential as well as increasing the unit cost of operational and maintenance activities, even where population density is similar.

As such, Models 3, 7 and 11 use the number of sewage treatment works to more directly capture the incremental additional costs of transporting sewage to a greater number of small sewage treatment works, the economies of scale that companies with fewer larger treatment works serving large urban centres are able to achieve, and the additional bioresources costs that result from having many dispersed sewage treatment works and sludge treatment centres.

As an alternative, Models 4, 8 and 12 use the ratio of non-resident to resident population to control for the impact of the large variation in flows that areas with more holiday population face. This metric falls entirely outside of management control and so can be regarded as an exogenous driver of costs.

We have extended our aggregate BOTEX modelling to models controlling for BOTEX + growth enhancement and TOTEX. We have used the same BOTEX drivers as in our aggregate BOTEX models, as the regional operating characteristics increasing or decreasing BOTEX are also likely to affect the cost of delivering many enhancement solutions. We were not able to include direct measures of differences in the amount of growth or quality enhancement within our econometric modelling.

While these models do not include an enhancement specific driver, they do meet many of the statistical criteria set out by Ofwat (see below). As can be seen from the efficiency range charts, while modelling BOTEX+ (growth) does not widen the efficiency ranges, including quality enhancement to model TOTEX does lead to somewhat broader efficiency ranges.

As for wholesale water models, we would recommend that BOTEX+ (growth) and TOTEX modelling approaches are explored to the fullest possible extent at PR19.

All of the BOTEX models estimate statistically significant coefficients which are supported from an operational and economic perspective. The relationship between cost and cost drivers in BOTEX+ (growth) and TOTEX models is broadly similar to that estimated in BOTEX models, although not all coefficients pass statistical significance tests. We would note that most models considered have significant coefficients on tightness of consents and one or both of pumping capacity and/or a measure of sparsity/economies of scale. This would suggest that these drivers have the strongest statistical relationship with cost.

Given our focus on modelling what we consider to be key industry drivers of cost, we have not explored estimation approaches beyond OLS with robust standard errors. We will be considering the most appropriate estimation approaches as part of our consultation response.

All models are broadly robust from a statistical perspective.

Adjusted R2 is sufficiently high.

VIF (a measure of collinearity) is well below the ‘rule of thumb’ threshold of 10.

We find mixed evidence from the RESET test on whether the model would be improved by the addition of polynomial terms, i.e. given the control variables, whether the model is mis-specified.
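For readers unfamiliar with the VIF statistic referred to above, the following is a minimal sketch of how it can be computed. It relies on the standard result that the VIF of each regressor equals the corresponding diagonal element of the inverse of the regressors' correlation matrix. The function names and the data are hypothetical, and this is not the procedure used to produce the reported figures.

```python
import math


def correlation_matrix(columns):
    """Pearson correlation matrix for a list of equal-length numeric columns."""
    n = len(columns[0])
    means = [sum(c) / n for c in columns]
    centred = [[v - m for v in c] for c, m in zip(columns, means)]
    norms = [math.sqrt(sum(v * v for v in c)) for c in centred]
    k = len(columns)
    return [[sum(a * b for a, b in zip(centred[i], centred[j])) / (norms[i] * norms[j])
             for j in range(k)] for i in range(k)]


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    k = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(k)]
           for i, row in enumerate(matrix)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular: regressors are perfectly collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(k):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[k:] for row in aug]


def variance_inflation_factors(columns):
    """VIF for each regressor: the diagonal of the inverse correlation matrix."""
    inverse = invert(correlation_matrix(columns))
    return [inverse[j][j] for j in range(len(columns))]


# Hypothetical data: x2 closely tracks x1, while x3 follows a different pattern.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [1.1, 1.9, 3.2, 3.9, 5.1, 5.8]
x3 = [2.0, -1.0, 0.5, 1.5, -2.0, 0.0]
vifs = variance_inflation_factors([x1, x2, x3])
max_vif = max(vifs)
```

In this invented example the near-duplicate pair pushes the maximum VIF above the rule-of-thumb threshold of 10 mentioned above, which is the pattern the statistic is designed to flag.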


Consultation model ID SWBWWW1 SWBWWW2 SWBWWW3 SWBWWW4 SWBWWW5 SWBWWW6 SWBWWW7 SWBWWW8 SWBWWW9 SWBWWW10 SWBWWW11 SWBWWW12

Company’s model ID 1 2 3 4 5 6 7 8 9 10 11 12

Dependent variable Aggregate BOTEX (ln) Aggregate BOTEX+ (growth) (ln) Aggregate TOTEX (ln)

Properties (ln) 0.619*** (0.000)

0.751*** (0.000)

0.752*** (0.000)

0.920*** (0.000)

0.606*** (0.000)

0.708*** (0.000)

0.722*** (0.000)

0.897*** (0.000)

0.688*** (0.000)

0.809*** (0.000)

0.702*** (0.000)

0.974*** (0.000)

Pumping capacity over mains (ln)

0.095** (0.019)

0.120*** (0.006)

0.124*** (0.001)

0.125*** (0.004)

0.036 (0.396)

0.045 (0.332)

0.064 (0.107)

0.046 (0.327)

0.103** (0.050)

0.146*** (0.010)

0.127*** (0.005)

0.119** (0.027)

Proportion of load with BOD<10mg/L and amm<1mg/L

2.290*** (0.000)

1.744*** (0.000)

1.400*** (0.000)

0.248 (0.523)

2.170*** (0.000)

1.886*** (0.000)

1.438*** (0.000)

0.268 (0.403)

2.619*** (0.000)

1.850*** (0.000)

1.551*** (0.000)

0.721 (0.119)

Number of combined sewer overflow per km sewer (ln)

0.0625 (0.277)

0.129* (0.074)

0.081 (0.205)

0.037 (0.419)

0.094* (0.089)

0.041 (0.351)

0.104 (0.139)

0.155** (0.028)

0.113* (0.086)

Proportion of area with more than 2,000 people per km2

-0.661*** (0.000)

-0.536*** (0.000)

-0.565*** (0.003)

Proportion of area with less than 250 people per km2

0.457*** (0.002)

0.475*** (0.002)

0.192 (0.376)

Number of treatment works per property (ln)

0.144*** (0.006)

0.112** (0.024)

0.0102 (0.864)

Ratio of non-resident to resident population

0.037*** (0.001)

0.042*** (0.000)

0.039*** (0.006)

Constant 0.810** (0.049)

-0.428 (0.349)

-0.322 (0.416)

-1.527** (0.011)

0.983** (0.011)

-0.0519 (0.902)

0.0321 (0.938)

-1.293** (0.017)

0.631 (0.385)

-0.366 (0.549)

0.174 (0.742)

-1.603** (0.029)

R2 adjusted 0.930 0.916 0.915 0.920 0.938 0.932 0.929 0.939 0.917 0.905 0.898 0.915

Reset test 0.016 0.000 0.000 0.000 0.108 0.000 0.000 0.000 0.003 0.000 0.002 0.001

VIF (max) 7.743 6.336 4.556 8.892 7.743 6.336 4.556 8.892 7.743 6.336 4.556 8.892

Method OLS OLS OLS OLS OLS OLS OLS OLS OLS OLS OLS OLS

N (sample size) 60 60 60 60 60 60 60 60 60 60 60 60


Template 65. Wholesale wastewater models proposed by United Utilities

Description of dependent variable

Model 7’s dependent variable is wastewater botex, which includes selected enhancement expenditure.

Botex has been derived by subtracting total enhancement expenditure (table 9, line 36), business rates (table 8 line 8) and third party services (table 8 lines 10 and 18) from net totex (table 8 line 21) for each of the respective value chains.

The dependent variable for these models has been adjusted to include selected enhancement expenditure. Enhancement areas that are substitutable with base costs can be integrated with base cost models. In some areas, companies can achieve a service outcome either through spending on enhancement or through more intensive operation or maintenance of their existing assets. Where this is the case, merging relevant enhancement lines into base cost may be expected to improve the explanatory power of base cost models, especially where the base models include explanatory factors that are causally related to the enhancement lines.

The dependent variable includes expenditure associated with NEP - Event Duration Monitoring at intermittent discharges, NEP - Monitoring of pass forward flows at CSOs, Odour, New development and growth, Growth at sewage treatment works (excluding sludge treatment), Resilience, SEMD, Reduce flooding risk for properties and Transferred private sewers and pumping stations.

The dependent variable includes smoothed base capex which minimises the impact of spikes.

For all models, the dependent variable is included in its logged form and is in 2012/13 CPIH FYA prices.

Comments on models (United Utilities)

See United Utilities’ comments on wastewater collection models.

Consultation model ID UUWWW1

Company’s model ID 7

Dependent variable ln(Wastewater botex + selected enhancement expenditure)

Log(total load received) 0.879*** (0)

% of load received by WwTW bands 1 to 3 6.249* (0.097)

% population living in urban area (Arup/Vivid) 1.111 (0.258)

% of load received by WwTW with tertiary treatment 0.296 (0.162)

2012-13 dummy 0.027 (0.354)

2013-14 dummy 0.043** (0.026)

2014-15 dummy 0.057* (0.065)

2015-16 dummy -0.003 (0.917)

2016-17 dummy 0.037 (0.355)

Constant -6.857*** (0.005)

R2 adjusted 0.939

VIF (max) 6.89

Reset test 0.002

Estimation method OLS

N (sample size) 60


Template 66. Wholesale wastewater models proposed by Welsh Water

Description of dependent variable

Wastewater Botex = “Total Operating Expenditure” – “Third Party Services” – “Local authority and Cumulo rates” + “Maintaining the long term capability of the assets – infra” + “Maintaining the long term capability of the assets - non-infra”

2016-17 Cost Assessment Table 8 References:

Wastewater Botex = Line 11 – Line 10 – Line 8 + Line 12 + Line 13

Values rebased to 2016/17 using CPIH in line with the PR19 Methodology Statement.

Comments on models (Welsh Water)

The submitted botex models aim to capture key cost drivers for the industry. The models include a scale variable, the number of connected properties, alongside variables to capture density, treatment complexity and drivers of maintenance.

The model’s estimated coefficients have the expected sign and magnitude and are statistically significant. The model has a sufficiently high R2 and is robust to outliers. The model does not pass the model specification test (RESET) at the specified level. Due to the relatively small sample for the wastewater industry, coupled with relatively stable cost drivers over time, we have placed importance on the model consistency and interpretability from an economic perspective.

Consultation model ID WSHWWW1

Company’s model ID 5

Dependent variable Ln(Wastewater Botex)

Ln(Connected Properties) (000s) 0.755*** (0)

Ln(Pumping Station Capacity per km of sewer) (kW/km) 0.120** (0.0439)

Ln(Number of combined sewer overflows per km of combined sewer) (nr/km) 0.124 (0.321)

% load with BOD<10mg/L and Ammonia <1 mg/L 3.541*** (0.00998)

% of area with less than 250 people per km2 0.437* (0.0685)

Constant -0.423 (0.490)

R2 adjusted 0.929

VIF (max) 6.336

Reset test 0.000

Estimation method OLS

N (sample size) 60

Template 67. Wholesale wastewater models proposed by Yorkshire Water

Description of dependent variable

Wholesale wastewater base costs = operating expenditure less third party services and local authority rates + capital maintenance expenditure net of grants and contributions (G&C).

Modelled TOTEX(Growth) = modelled BOTEX + modelled growth enhancement expenditure.

Modelled BOTEX = OPEX less third party services, pension deficit recovery payments and local authority rates + capital maintenance net of grants and contributions.


Modelled Growth enhancement expenditure = expenditure of “first time sewerage” + “New development and growth” + “Sludge enhancement (growth)” + “Growth at sewage treatment works (excluding sludge treatment)” + “Resilience” + “Reduce flooding risk for properties”.

The dependent variables are deflated using CPIH to 2016/17 prices. No smoothing was undertaken.

Comments on models (Yorkshire Water)

The Aggregated BOTEX models are similar to the TOTEX (Growth) models, with the exclusion of the growth enhancement driver. General limitations and modelling observations highlighted under TOTEX (growth) are applicable here.

The models appear to estimate coefficients of the right sign and appropriate magnitude, and are robust to the various statistical tests.

Given lack of split of G&C for capital maintenance and enhancement expenditure, we have also modelled CAPEX on a gross basis. The statistical performance is broadly consistent with and without G&C.

The TOTEX (growth) models aim to explain variations in BOTEX through variation in scale, pumping requirements, treatment complexity, density/sparsity, maintenance drivers and a growth enhancement driver (properties growth).

CSOs per combined sewer was controlled for in model 1, and the proportion of combined sewers in models 2, 3 and 4, as alternatives. From an operational point of view, CSOs rather than combined sewers might be a more appropriate driver of maintenance costs. That said, the estimated coefficient on this driver (model 1) appears to be statistically insignificant. While this may be due to a data paucity issue, the impact of this driver in the model appears less clear.

The density thresholds have an impact on the models (from a statistical as well as from an operational point of view). The Density 1 measure (2000 people and above) might be more appropriate should variation across the industry be of importance. For the Density 2 measure (4000 people and above), the data show variation only for Anglian, Severn Trent, Southern, Thames, United Utilities and Wessex. While we have explored both thresholds (and other density measures), the operational rationale for specific thresholds remains unclear.

The models appear to estimate coefficients of the right sign and appropriate magnitude, and are broadly robust to the various statistical tests.

Consultation model ID YKYWWW1 YKYWWW2 YKYWWW3 YKYWWW4

Company’s model ID 1 2 3 4

Dependent variable ln (Wastewater Aggregate BOTEX)

Total number of properties (log) (000s)

0.619*** (0.000)

0.834*** (0.000)

0.765*** (0.000)

0.779*** (0.000)

% of load with BOD<10mg/L and amm<1mg/L 4.581*** (0.002)

1.569** (0.0167)

2.579*** (0.006)

2.923*** (0.005)

Pumping station capacity per km sewer (log) (kW/km)

0.0954* (0.088)

0.263*** (0.001)

0.217*** (0.006)

0.218*** (0.000)

Number of combined sewer overflows per km of sewer (log) (nr/km)

0.0625 (0.523)

% of area with more than 2000 people per km2 -0.661** (0.013)

-0.242 (0.254)

% of area with more than 4000 people per km2 -0.544*** (0.001)

% of combined sewers 0.714*** (0.001)

0.576*** (0.004)

0.407** (0.017)

Constant 0.810 (0.175)

-1.424** (0.024)

-0.794 (0.138)

-0.875* (0.090)

R2 adjusted 0.930 0.942 0.942 0.948

VIF (max) 7.743 4.810 10.34 6.435

Reset test 0.016 0.044 0.047 0.046

Estimation method OLS OLS OLS OLS

N (sample size) 60 60 60 60


Template 68. Wholesale wastewater plus models proposed by Yorkshire Water

Consultation model ID YKYWWWP5 YKYWWWP6 YKYWWWP7 YKYWWWP8

Company’s model ID 1 2 3 4

Dependent variable Agg. BOTEX (Growth)

Total number of properties (log) (000s)

0.616*** (0.000)

0.786*** (0.000)

0.716*** (0.000)

0.735*** (0.000)

Pumping station capacity per km sewer (log) (kW/km)

0.032 (0.550)

0.162** (0.020)

0.116 (0.127)

0.123** (0.022)

Number of combined sewer overflows per km of sewer (log) (nr/km)

0.043 (0.613)

% load with BOD<10mg/L and amm<1mg/L 4.328*** (0.006)

1.947** (0.017)

2.949** (0.022)

3.155*** (0.008)

% of area with more than 2000 people per km2 -0.529** (0.050)

-0.240 (0.375)

% of area with more than 4000 people per km2 -0.488* (0.098)

% of sewers that are combined sewers 0.534** (0.015)

0.397* (0.070)

0.259 (0.157)

% growth in number of properties 3.075 (0.297)

3.061 (0.259)

2.964 (0.256)

2.082 (0.503)

Constant 0.909 (0.190)

-0.821 (0.225)

-0.193 (0.801)

-0.316 (0.569)

R2 adjusted 0.938 0.943 0.944 0.948

VIF (max) 7.856 4.827 10.34 6.481

Reset test 0.0626 0.0156 0.0324 0.0197

Estimation method OLS OLS OLS OLS

N (sample size) 60 60 60 60


3 Retail models

3.1 Bad debt models

Template 69. Retail bad debt models proposed by Ofwat

Description of dependent variables

Bad debt plus debt management costs per household

The denominator, household, is the total number of connected households receiving either water only, wastewater only or dual services.

Comments on models

The two main variables in our debt per household models are average bill size and a proxy for the propensity to default.

We used three proxies for the propensity to default:

1. Percentage of households with default (eq_lpcf62)

2. Credit risk score derived from all Insight data (eq_rgc102). A higher credit score means a lower risk of default, so we expected a negative coefficient, as estimated.

3. The proportion of people experiencing income deprivation.

The first two variables were provided by United Utilities (see retail cost assessment at https://www.unitedutilities.com/corporate/about-us/our-future-plans/looking-to-the-future/) and are sourced from Equifax. The last variable (income deprivation domain) is sourced from the ONS (DCLG) and the Welsh Government.

The proportion of people in England and Wales experiencing income deprivation is calculated for each country. The criteria for the English and Welsh income deprivation measures are broadly similar, covering income related benefits, tax credit recipients and supported asylum seekers so we have combined the measures to obtain data for England and Wales.

The results were corroborated using other deprivation measures, such as unemployment rate and number of mortgage repossessions. The estimated coefficient provided a similar effect although the level of significance was slightly lower.

Models 3 and 4 include the total number of households as an additional explanatory variable to capture economies of scale. There is some evidence of economies of scale in models 3 and 4.

It is possible that bad debt costs could be impacted by different accounting policies adopted by companies, with a large effect on annual costs reported but without relation to the underlying drivers. In this context, we present model 6 where we averaged the data over the four-year period to smooth year-on-year volatility in reported costs.
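The smoothing step described above can be sketched as collapsing the panel to one observation per company. The company codes, year labels and cost figures below are hypothetical, and whether the averaging is applied before or after taking logs is not stated in the text, so this sketch simply averages the raw values.

```python
from collections import defaultdict


def average_by_company(observations):
    """Collapse (company, year, value) rows to one mean value per company."""
    grouped = defaultdict(list)
    for company, _year, value in observations:
        grouped[company].append(value)
    return {company: sum(vals) / len(vals) for company, vals in grouped.items()}


# Hypothetical company codes and bad debt cost per household, for illustration only.
panel = [
    ("AAA", "year 1", 10.0), ("AAA", "year 2", 14.0),
    ("AAA", "year 3", 9.0), ("AAA", "year 4", 11.0),
    ("BBB", "year 1", 20.0), ("BBB", "year 2", 22.0),
    ("BBB", "year 3", 18.0),
]
averaged = average_by_company(panel)
```

Averaging in this way trades sample size for stability: the regression is then run on one row per company rather than one row per company-year.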

All monetary values have been inflated to 2016-17 prices using the CPIH.


Consultation model ID ORDC1 ORDC2 ORDC3 ORDC4 ORDC5 ORDC6

Dependent variable --- ln(bad debt per household) --- sample avg

Ln(number of households)

-0.128* (0.083)

-0.032 (0.629)

-0.053 (0.601)

Ln(bill size)

1.160*** (0.000)

1.138*** (0.000)

1.341*** (0.000)

1.183*** (0.000)

1.095*** (0.000)

1.168*** (0.000)

HHs with default (%) (Eq_lpcf62)

0.050*** (0.006)

0.068*** (0.004)

Income deprivation domain (%)

0.058** (0.032)

Credit risk score (Eq_rgc102)

-0.032** (0.034)

-0.034** (0.034)

-0.036* (0.067)

Constant -5.479*** 0.393 -5.204*** 0.888 -4.580*** 1.467

R2 adjusted 0.79 0.773 0.803 0.771 0.774 0.789

VIF (max) 1.03 1.078 2.843 2.152 1.178 2.221

Reset test 0.146 0.257 0.153 0.352 0.018 0.477

Estimation method OLS OLS OLS OLS OLS OLS

N (sample size) 71 71 71 71 71 17

Template 70. Retail bad debt models proposed by Anglian Water

Description of dependent variable

All models are described in detail in our Cost Modelling report – Phase 2, published March 2018: http://www.anglianwater.co.uk/about-us/thinking-about-our-future/

Description of selected explanatory variables

Deprivation measure – 80th percentile for IMD with billing used as weight

Comments on models (Anglian Water)

The Doubtful Debt and Debt Management model is expected to be a function of:

Average bill size

Customer numbers

Deprivation

Regional unemployment

Regional wages

Consultation model ID ANHRDC1

Company’s model ID 2

Dependent variable Doubtful debt & debt management

Ln(Average bill size) 0.26** (0.050)

Ln(Revenue2) 0.096*** (0.000)

Deprivation measure 0.762* (0.055)

Time trend -0.030 (0.227)

Constant -1.870** (0.018)

R2 adjusted 0.9564

Reset test 0.0005

VIF (max) 3.21

Method OLS

N (sample size) 89


Template 71. Retail bad debt models proposed by United Utilities

Description of dependent variable

Natural logarithm of bad debt costs per household, where bad debt costs is debt management plus doubtful debt.

Comments on models (United Utilities)

Households are counted as one regardless of whether they receive one or two services; this is not a unique customer measure. These models capture economies of scope through the bill size independent variable.

We have found bill size and deprivation to be significant drivers of bad debt cost.

These models all perform well in diagnostic tests not included in the pro-forma, including the Linktest and Shapiro-Wilk test.

Predicted IMD was constructed using a range of factors also provided by Equifax. More details can be found in Reckon LLP (2017) “Capturing deprivation and arrears risk in household retail cost assessment”.

All models are discussed in more detail in Reckon LLP (2018) published alongside this consultation.

Price base is in 2017 CPI terms.

Consultation model ID UURDC1 UURDC2

Company’s model ID BD1_d3 BD1_d5

Dependent variable Ln(bad debt per household)

Revenue per household (£/household) 1.142*** (0.00) 1.115*** (0.00)

Deprivation measure (Units vary across measures) 1.204* (0.085) 3.001** (0.024)

2014 dummy 0.157* (0.076) 0.159* (0.072)

2015 dummy 0.204** (0.038) 0.195** (0.042)

2016 dummy 0.136 (0.11) 0.121 (0.138)

Constant -4.287*** -4.553***

R2 adjusted 0.771 0.786

Reset test 0.428 0.393

VIF (max) 1.54 1.37

Estimation method OLS OLS

N (sample size) 71 71

Template 72. Retail bad debt models proposed by Severn Trent Water

Description of dependent variable

Doubtful debt plus Debt management costs

Description of selected explanatory variables

Bill to income ratio – average bill (total revenue/number of connected households) divided by weekly earnings. In the models, weekly earnings are the average weekly earnings of the lowest-decile earners in the region.

Proportion of private rental properties - proportion of connected households that are rented.

Equifax credit risk (XPCF2) - Equifax “Partial Insight Postcode Event” – Average number of partial insight accounts or county court judgements per household.
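The bill to income ratio defined above can be sketched as follows. The function name and the figures are hypothetical, and the sketch does not attempt to reconcile the time units of bills and earnings, which are not specified here.

```python
import math


def bill_to_income_ratio(total_revenue: float, connected_households: float,
                         weekly_earnings: float) -> float:
    """Average bill (revenue per connected household) divided by weekly earnings."""
    average_bill = total_revenue / connected_households
    return average_bill / weekly_earnings


# Hypothetical company: £400m revenue, 1m connected households,
# and lowest-decile weekly earnings of £250 in its region.
ratio = bill_to_income_ratio(400_000_000.0, 1_000_000.0, 250.0)
log_ratio = math.log(ratio)  # the model enters the ratio in natural logs
print(ratio, round(log_ratio, 3))
```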

Comments on models (Severn Trent Water)

All of the coefficients in the model seem reasonable and are in line with expectations.

Consultation model ID SVTRDC1

Company’s model ID 2

Dependent variable Ln(Debt costs)

Ln (connected customers) 0.82*** (0.000)

Proportion metered -0.56** (0.000)

Ln(Bill to income ratio(10th percentile)) 1.34*** (0.000)

Proportion private rental property 0.05*** (0.01)

Equifax credit risk (XPCF2) 0.51** (0.00)

Constant 0.91 (0.13)

R2 adjusted 0.97

Reset test 0.2

VIF (max) 3.4

Estimation method OLS

N (sample size) 71

Template 73. Retail bad debt models proposed by South West Water

Description of dependent variable

Bad debt includes doubtful debt and debt management

All costs are outturn and are not smoothed.

Description of selected explanatory variable

Deprivation: DCLG and Welsh government statistics

Council tax default rate: DCLG data on council tax collection rates, by local authority (2013/14–2016/17)

Prepayment: Ofwat data release for years 2013/14–2014/15, assumed same levels over AMP6

Comments on retail bad debt models (South West Water)

We have focused on capturing the effect of three key cost drivers:

scope, the number of dual customers a company serves;

bill size, which increases a company’s exposure to customers defaulting; and

deprivation, which increases the propensity of customers to default.

We have used income deprivation, collected by the ONS for England and Wales, to capture deprivation levels. We have also explored an alternative deprivation driver, the proportion of council tax defaults. Our bad debt models also include bill size and the Ofwat measure of unique customers from PR14 (the sum of all single customers + 1.3 × the sum of dual customers).
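The unique customers measure described above (single customers plus 1.3 times dual customers) can be written as a one-line calculation. The function name and the customer numbers below are hypothetical; only the 1.3 weight comes from the text.

```python
DUAL_WEIGHT = 1.3  # PR14 weight on dual service customers, as quoted in the text


def unique_customers(single: float, dual: float, dual_weight: float = DUAL_WEIGHT) -> float:
    """Sum of single service customers plus weighted dual service customers."""
    return single + dual_weight * dual


# Hypothetical company with 200,000 single and 500,000 dual service customers.
example = unique_customers(200_000, 500_000)
print(example)
```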

Given our focus on modelling what we consider to be key industry drivers of cost, we have not explored estimation approaches beyond OLS with robust standard errors. We will be considering the most appropriate estimation approaches as part of our consultation response.

All models are broadly robust from a statistical perspective.

The RESET test does not suggest that the model is misspecified.

Modelling retail BOTEX or retail OPEX + depreciation has little impact on the model specification or efficiency ranges

Modelling for 4 years (2013/14-16/17) or for AMP6 only (2015/16-16/17) has little impact on the model specification or efficiency ranges

Ofwat comment: model 1 has been proposed also by Yorkshire Water and South Staffs Water.

Consultation model ID YKYSSCSWBRDC1 SWBRDC2

Company’s model ID Model 1 Model 2

Dependent variable Bad debt (ln)

Unique customers, Ofwat constructed measure (ln) 0.911*** (0.000) 0.896*** (0.000)

Average combined bill (ln) 0.930*** (0.000) 0.974*** (0.000)

Income deprivation (ln) 0.841*** (0.003)

Council tax default rate (%) 0.198** (0.018)

Constant 6.212*** (0.000) 3.823*** (0.000)

R2 adjusted 0.950 0.949

Reset test 0.601 0.287

VIF (max) 2.96 3.381

Estimation method OLS OLS

N (sample size) 68 68

Template 74. Retail bad debt models proposed by Thames Water

Description of dependent variable

Bad debt costs include doubtful debt and debt management

Description of selected explanatory variables

Income deprivation AHC - measure of after-housing-costs (AHC) income deprivation established by combining ONS data on (before-housing-costs) income deprivation and HBAI data on AHC income deprivation (ONS, Department of Work and Pensions).

Total internal migration - propensity of people to migrate from/to UK local authorities, sum of inflows and outflows (ONS).

Total international migration - propensity of people to migrate from/to UK local authorities and abroad, sum of inflows and outflows (ONS).

Comments on models (Thames Water)

Transience, in our experience, is a key driver of our bad debt costs (and, to a lesser extent, of customer service costs). The level of transience varies greatly across England and Wales. According to our analysis, transience is 20% higher for Thames Water than for any other company.

The model output below shows that transience is a robust driver in some models of bad debt costs.

We consider that transience should be part of the mix of explanatory variables that Ofwat has regard to in developing its models of bad debt costs.

We refer to Economic Insight’s transience report (available on Water UK’s marketplace of ideas) for a more in-depth discussion of this matter.

Key findings

Both single and dual service customers are significant in models 1 to 4.

Total customers is significant in models 5 and 6; single service customers was dropped from these models on the grounds of insignificance.

IMD income and wholesale bill variables are significant in all models where they are included.

The significance of these variables across models strongly suggests that customer numbers, IMD income and wholesale bill should be included as drivers in models of bad debt costs.

The choice of transience measure (i.e. internal or international) has only a small impact on R2 and on the transience variable’s level of significance. This likely reflects the high degree of correlation between the two measures.

Internal transience is significant at the 1% level in model 1; international transience is significant at the 1% level in model 2 and at the 5% level in model 6.

Transience is not significant although it has the expected sign in models 3, 4 and 5.

Model 7 – Should a measure of deprivation account for housing costs?

Deprivation is a key driver of bad debt costs. This reflects that the risk of arrears is considerably greater for deprived customers. Models 1-6 use the IMD income measure published by the ONS. It measures the proportion of a local authority’s population with an income of <60% of the UK median income.

The IMD does not account for housing costs. We consider that a measure of the likelihood of households getting into arrears should account for housing costs. Housing costs account for a large proportion of household expenditure and are particularly high in London.

In model 7 we replaced the IMD with AHC – a deprivation measure that we developed by combining data on IMD income with HBAI data on AHC income deprivation. The AHC is statistically significant. Its lower coefficient relative to that of the IMD is a consequence of the proportion of deprived HHs being higher when accounting for housing costs.

Data on IMD income is published by the ONS at the level of local authorities; data on AHC income deprivation is published by the DWP, but only at the regional level.

Consultation model ID TMSRDC1 TMSRDC2 TMSRDC3 TMSRDC4 TMSRDC5 TMSRDC6 TMSRDC7

Company’s model ID 1 2 3 4 5 6 7

Dependent variable Ln(bad debt related retail operating costs)

Ln(single service customers)

0.535*** (0.000)

0.513*** (0.000)

0.442*** (0.002)

0.482*** (0.000)

0.427*** (0.006)

Ln(dual service customers)

0.121*** (0.000)

0.119*** (0.000)

0.196*** (0.002)

0.183*** (0.002)

0.146*** (0.013)

Ln(total customers)

0.944*** (0.000)

0.919*** (0.000)

IMD income (ONS) (%)

0.189*** (0.000)

0.150*** (0.000)

0.168*** (0.002)

0.133*** (0.008)

0.082*** (0.000)

0.072*** (0.000)

Income AHC (%)

0.106*** (0.003)

Ln(wholesale bill) 1.744*** (0.000) 1.752*** (0.000) 1.188*** (0.003) 1.258*** (0.002) 1.153*** (0.000) 1.189*** (0.000) 1.532*** (0.000)

Total internal migration (%)

0.091*** (0.001)

0.101 (0.109)

0.030 (0.145)

Total international migration (%)

0.291*** (0.001)

0.160 (0.320)

0.131** (0.014)

Constant -14.37*** (0.000) -13.21*** (0.000) -10.95*** (0.000) -10.22*** (0.000) -11.91*** (0.000) -11.65*** (0.000) -11.14*** (0.000)

R2 adjusted 0.9333 0.9347 0.9315 0.9328 0.9619 0.9628 0.9223

Reset test 0.0004 0.0002 0.0031 0.0000 0.0076 0.0277 0.0000

VIF (max) 6.78 6.78 6.78 6.78 2.90 3.14 6.28

Method OLS OLS RE RE OLS OLS OLS

N (sample size) 89 89 89 89 89 89 89

Template 75. Retail bad debt models proposed by Wessex Water and Bristol Water

Description of dependent variable

Bad debt and debt management costs

Comments on models (Wessex Water and Bristol Water)

The models were developed using an objective general to specific methodology, which was subject to academic peer review. This generated a suite of 16 econometric models:

Generalised models used a wide set of variables derived from a ‘first principles’ consideration of the drivers of retail costs.

Specific models were estimated taking a ‘liberal’ approach to statistical significance (i.e. including variables that were significant at levels approaching 10%).

‘Alternative’ models were estimated for total retail operating costs, which retained variables that were not significant, but were correctly signed.

Two approaches were used in the inclusion of scale (customer numbers) and scope (dual versus single service): Models ‘A’ include separate variables for the number of dual and single service customers. Models ‘B’ include a variable for total customer numbers, alongside the number of single service customers (where this remains after general to specific modelling).

We think that both approaches to the incorporation of scale and scope are valid, and each has advantages and disadvantages. Using separate dual and single service variables provides a very flexible specification, and the resulting models incorporate a wider range of potentially relevant variables. On the other hand, the coefficients are difficult to interpret, as some companies have no dual service customers. The alternative approach is less flexible, but provides more intuitive coefficient estimates.

Overall, we consider the models across the suite to be valid.

A full description of the work undertaken to arrive at these models is set out in a report by Economic Insight: ‘Household retail cost assessment for PR19: final report for Bristol and Wessex Water’.

Consultation model ID WSXRDC1 WSXRDC2 WSXRDC3 WSXRDC4

Company’s model ID A2 A6 B2 B6

Dependent variable ln(bad debt related operating costs)

ln(total customers) 0.979*** (0.000)

0.933*** (0.000)

ln(single service customers) 0.535*** (0.000)

0.532*** (0.000)

ln(dual service customers) 0.121*** (0.000)

0.184*** (0.003)

IMD income (%) 0.189*** (0.000) 0.136*** (0.008) 0.067*** (0.000) 0.055* (0.071)

Property repossessions (%) 0.147** (0.015)

ln(average wholesale bill) 1.744*** (0.000) 1.235*** (0.002) 1.091*** (0.000) 1.165*** (0.000)

Internal population total flow (%) 0.091*** (0.001)

Constant -14.37*** (0.000) -10.25*** (0.000) -11.31*** (0.000) -11.57*** (0.000)

R2 adjusted 0.933 0.926 0.962 0.964

Reset test 0.000 0.003 0.031 0.017

VIF (max) 6.78 6.78 2.07 2.62

Estimation method OLS RE OLS RE

N (sample size) 89 89 89 89

Template 76. Retail bad debt models proposed by Yorkshire Water

Description of dependent variable

Doubtful debt + debt management costs

Costs are neither deflated nor smoothed.

Comments on models (Yorkshire Water)

In developing bad debt models we have considered the same factors as for models of total BOTEX. We identified the same model specifications for bad debt costs as in total BOTEX, although the coefficients themselves are different.

The impact of bill size and deprivation is greater in size, while the relationship between cost and metering is reversed, with greater metering penetration implying lower bad debt costs.

General limitations and modelling observations noted under retail BOTEX apply here as well.

Deprivation: DCLG and Welsh government statistics

Private and social renters: 2011 census data extrapolated forwards using 2016 regional data on tenure by region from the ONS.

Ofwat comment: Model 4 was also proposed by South Staffs Water and South West Water.

Consultation model ID YKYSSCSWBRDC1 YKYRDC2 YKYRDC3

Company’s model ID 4 5 6

Dependent variable Bad debt (log)

Unique customers, Ofwat constructed measure (log) 0.911*** (0.000) 0.851*** (0.000) 0.889*** (0.000)

Average combined bill (log) 0.930*** (0.000) 0.981*** (0.000) 0.989*** (0.000)

Income deprivation (log) 0.841* (0.051) 1.056*** (0.031) 0.774** (0.048)

Proportion of private renters (%) 4.078 (0.115)

Proportion of metered customers (%) -0.345 (0.350)

Constant 6.212*** (0.000) 6.144*** (0.000) 6.079*** (0.000)

R2 adjusted 0.950 0.952 0.950

Reset test 0.601 0.275 0.514

VIF (max) 2.96 3.84 3.26

Estimation method OLS OLS OLS

N (sample size) 68 68 68

Template 77. Retail bad debt models proposed by South East Water

Description of dependent variable

Modelled bad debt includes doubtful debt plus debt management

Costs are unsmoothed and in nominal prices

Description of selected explanatory variables

Bill size: Water UK

Deprivation: DCLG and Welsh government statistics

Customer numbers and metered customers: company APR’s (2013/14 – 15/16) and Ofwat data release

Comments on models (South East Water)

We have included average bill size as an explanatory factor. However, we would comment that the models may insufficiently control for the larger economies of scale and legal tools available to WASCs collecting larger bills.

We believe that using the average combined bill as an explanatory factor for debt related costs could underestimate the costs faced by WOCs in collecting smaller bills, where the same costs are incurred but less debt is collected. In addition, larger combined bills often come with more debt recovery tools and court enforcement options to chase debt and receive a greater return on debt management expenditure.

Based on the diagnostic tests, while bill size does test as being statistically significant, the magnitude of the coefficient requires further examination.

Consultation model ID SEWRDC1 SEWRDC2

Company’s model ID Model 3 Model 4

Dependent variable Bad debt (log)

Total customers (log) 0.904*** (0.000) 0.888*** (0.000)

Average combined bill (log) 0.932*** (0.000) 0.980*** (0.015)

Unemployment (%) 0.126* (0.098) 0.113 (0.108)

Metering -0.300 (0.494)

Constant 3.801*** (0.000) 3.871*** (0.000)

R2 adjusted 0.949 0.949

Reset test 0.835 0.515

VIF (max) 3.20 3.40

Estimation method OLS OLS

N (sample size) 68 68

3.2 Totex less bad debt models

Template 78. Retail other expenditure models proposed by Ofwat

Description of dependent variable

Other residential retail costs per household.

Other retail costs includes customer service, meter reading, plus depreciation on capital investment. It excludes expenditure related to third party services.

The denominator, household, is the total number of connected households receiving either water only, wastewater only or dual services.

Comments on models

The main variables in our other retail cost models are the number and type of households served.

All models include the proportion of dual service households (households which receive both water and wastewater services from the same retailer). Providing both services may drive higher retail costs than providing a single service due to additional metering costs and more frequent customer contact. The coefficient is consistent, with a plausible magnitude and reasonable significance across all specifications.

Models 2 and 4 include the proportion of metered households to account for metering costs and, possibly, for higher customer service costs due to more frequent contact. Although the coefficient is not statistically significant in any of the models, its value is plausible and consistent across the different specifications.

Models 3 and 4 also include the total number of connected households to allow for economies of scale. The negative coefficient provides some evidence that the costs per household reduce with the number of households served.

The time dummies suggest that costs have dropped in PR14.
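To give a sense of magnitude, the sketch below converts two coefficients from the results table for these models into approximate percentage effects on cost per household. It uses the model 3 (OROC3) coefficients on Ln(number of households) and on the 2017 dummy, and assumes the usual log-linear interpretation, with the dummy measured relative to the omitted base year. The function names are hypothetical.

```python
import math


def pct_effect_of_log_regressor(beta: float, pct_increase: float) -> float:
    """Percentage change in the dependent variable when a logged regressor rises by pct_increase %."""
    return ((1.0 + pct_increase / 100.0) ** beta - 1.0) * 100.0


def pct_effect_of_dummy(coef: float) -> float:
    """Percentage change in the dependent variable when a dummy switches from 0 to 1."""
    return (math.exp(coef) - 1.0) * 100.0


# OROC3: coefficient on Ln(number of households) is -0.080.
scale_effect = pct_effect_of_log_regressor(-0.080, 10.0)
# OROC3: coefficient on the 2017 dummy is -0.069.
dummy_effect = pct_effect_of_dummy(-0.069)

print(f"10% more households: {scale_effect:.2f}% change in cost per household")
print(f"2017 relative to base year: {dummy_effect:.2f}% change in cost per household")
```

On these assumptions, a 10% larger customer base is associated with cost per household about 0.8% lower, and 2017 costs per household are roughly 7% below the base year, other things equal.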

The models have a very low R2. This suggests that the explanatory variables do not explain much of the variation in the dependent variable. To some extent, our modelling suggests that an average cost to serve approach is sensible for other retail costs, with any variation not explained by customer numbers regarded as noise.

All monetary values have been inflated to 2016-17 prices using the CPIH.

Consultation model ID OROC1 OROC2 OROC3 OROC4

Dependent variable ------------ ln(other retail costs per household) ------------

% of dual service households 0.002 0.002 0.003** 0.003**

(0.132) (0.115) (0.010) (0.016)

% metered households 0.004 0.004

(0.227) (0.322)

Ln(number of households) -0.080* -0.068

(0.094) (0.208)

2015 dummy 0.036* 0.026 0.036* 0.028

(0.081) (0.279) (0.080) (0.275)

2016 dummy -0.048 -0.067 -0.047 -0.064

(0.204) (0.127) (0.220) (0.159)

2017 dummy -0.078** -0.101** -0.069* -0.090*

(0.043) (0.021) (0.053) (0.052)

Constant 2.752*** 2.552*** 3.784*** 3.457***

R2 adjusted 0.06 0.124 0.117 0.162

VIF (max) 1.493 1.513 2.153 2.212

Reset test 0.497 0.819 0.315 0.907

Estimation method OLS OLS OLS OLS

N (sample size) 71 71 71 71

Template 79. Retail other expenditure models proposed by Anglian Water

Description of dependent variable

All models are described in detail in our Cost Modelling report – Phase 2, published March 2018: http://www.anglianwater.co.uk/about-us/thinking-about-our-future/

Description of selected explanatory variables

Deprivation measure – 80th percentile for IMD with billing used as weight

Comments on models (Anglian Water)

On the basis of section 2.2 of Annex 5 to our report, Other Retail costs are expected to be a function of:

The number of metered customers

The number of unmetered customers

The proportion of customers which take a wastewater service

Regional Wages

Quality of Service.

Consultation model ID ANHROC1 ANHROC2

Company’s model ID 3 4

Dependent variable Other Retail costs Other Retail costs

Ln(# Metered customers) 0.549*** (0.000) 0.521*** (0.000)

Ln(# Unmetered customers) 0.339*** (0.000) 0.382*** (0.000)

Ln(Regional Wages) (unit: £) 1.045** (0.014) 0.137 (0.743)

Sparsity 0.249*** (0.007)

WoC billed wastewater customers as % of total customers

-0.301** 0.035

WaSC billed wastewater customers as % of total customers

0.505*** (0.000)

SIM -0.007 (0.152) -0.012** (0.021)

Time trend 0.018 (0.292) 0.033* (0.080)

Constant -5.395*** (0.000) -2.875** (0.021)

R2 adjusted 0.9699 0.9619

Reset test 0.197 0.849

VIF (max) 5.44 5.35

Method OLS OLS

N (sample size) 89 89

Template 80. Retail other expenditure models proposed by United Utilities

Description of dependent variable

Total retail cost per household less costs related to bad debt.

Comments on models (United Utilities)

Households are counted as one regardless of whether they receive one or two services; this is not a unique customer measure. These models capture economies of scope through the dual service independent variable.

The dependent variable is not logged as we consider remaining retail costs are best modelled using an additive, rather than multiplicative, specification.

We do not consider the low R2 to be a significant issue. These specifications model cost per household. If these models were specified as total cost models the R2 would exceed 0.9. We consider the loss of R2 to be worth the additional gain in precision that comes from using a cost per household specification. Additionally, these models perform well on RESET, Linktest and Shapiro-Wilk tests.
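The point about R2 can be illustrated with synthetic data. The sketch below is not based on the consultation dataset: it assumes a hypothetical industry in which cost per household is roughly constant with random noise, then compares the R2 of a total cost specification with that of a cost per household specification. It reports plain R2 from a one-regressor fit, not adjusted R2.

```python
import math
import random


def r_squared(x, y):
    """R2 from a one-regressor OLS fit of y on x (with intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return (sxy ** 2) / (sxx * syy)


random.seed(0)
n = 71  # same order as the sample sizes reported in this appendix

# Synthetic assumption: company sizes between 100,000 and about 3 million households.
households = [10 ** random.uniform(5.0, 6.5) for _ in range(n)]
# Synthetic assumption: unit cost of about 20 with 10% noise and no systematic drivers.
unit_cost = [20.0 * math.exp(random.gauss(0.0, 0.1)) for _ in range(n)]
total_cost = [c * h for c, h in zip(unit_cost, households)]

log_households = [math.log(h) for h in households]
r2_total = r_squared(log_households, [math.log(t) for t in total_cost])
r2_unit = r_squared(log_households, unit_cost)

print(f"total cost specification R2: {r2_total:.3f}")
print(f"cost per household specification R2: {r2_unit:.3f}")
```

In the total cost specification almost all of the variation is explained by scale alone, so R2 is high even though the model says nothing about why unit costs differ between companies. The cost per household specification removes that scale effect, which is why its R2 is much lower.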

The metered services variable accounts for the fact that a customer may receive water and wastewater services from different companies.

All models are discussed in more detail in Reckon LLP (2018) ‘Econometric models for residential retail cost assessment’.

Price base is in 2017 CPI terms.

Consultation model ID UUROC1 UUROC2

Company’s model ID RR2 RR3

Dependent variable Remaining retail cost per household

Metered services per household 3.87 (0.311)

% dual service 0.753 (0.702) 2.714 (0.101)

2014 dummy 0.854 (0.198) 0.56 (0.332)

2015 dummy 1.54*** (3.11) 1.379** (0.02)

2016 dummy 0.234 (0.598) 0.183 (0.725)

Constant 12.69*** 15.12***

R2 adjusted 0.155 0.098

VIF (max) 1.69 1.54

Reset test 0.857 0.942

Estimation method OLS OLS

N (sample size) 71 71

Template 81. Retail other expenditure models proposed by Severn Trent Water

Description of dependent variable

Other operating expenditure model: (Operating costs + Depreciation + Amortisation) – (Doubtful debt + Debt management costs)

Description of selected explanatory variables

Bill to income ratio – average bill (total revenue/number of connected households) divided by weekly earnings. In the models, weekly earnings are the average weekly earnings of the lowest-decile earners in the region.

Prop. of private rental properties - proportion of connected households that are rented.

Comments on models (Severn Trent Water)

The positive coefficient on the “metered customers” variable indicates, as we expected, that metered customers cost more to serve. We also include the deprivation measures in this model, as deprivation is likely to indirectly influence the scale of retail operations. As expected, these costs are less responsive to differences in deprivation levels but nonetheless, deprivation has some impact on the wider retail function.

Consultation model ID SVTROC1

Company’s model ID 3

Dependent variable Ln(Other opex)

Ln (connected customers) 0.88*** (0.00)

Proportion metered 0.46** (0.03)

Ln(Bill to income ratio(10th percentile)) 0.35*** (0.00)

Unemployment % 0.06 (0.25)

High Density (% customers residing in an area with more than 2000 people per square km) 0.42** (0.01)

Constant 2.42*** (0.00)

R2 adjusted 0.97

Reset test 0.04

VIF (max) 3.1

Estimation method OLS

N (sample size) 71

Template 82. Retail other expenditure models proposed by Wessex Water and Bristol Water

Description of dependent variable

Non-bad debt related retail operating costs: The subset of total retail operating costs not included in bad debt related retail operating costs – that is, all household retail operating costs other than debt management and doubtful debt.

Comments on models

See comments on bad debt models by Wessex Water and Bristol Water.

Consultation model ID WSXROC1 WSXROC2 WSXROC3 WSXROC4

Company’s model ID A3 A7 B3 B7

Dependent variable ln(non-bad debt related operating costs)

ln(total customers) 1.061*** (0.000)

1.069*** (0.000)

ln(single service customers) 0.498*** (0.000) 0.268** (0.025) -0.120*** (0.000) -0.138** (0.021)

ln(dual service customers) 0.263*** (0.000)

0.250*** (0.000)

Metered customers (%) 0.014*** (0.000) 0.002 (0.610) 0.005*** (0.004) 0.005 (0.114)

Metered household density (per km mains) -0.0155*** (0.001)

ln(peak traffic speed) -1.830*** (0.000) -1.217** (0.047) -0.257* (0.062) -0.327 (0.286)

Time trend -0.0372** (0.014)

-0.035*** (0.002)

Constant 4.539*** (0.000) 4.104* (0.067) -3.200*** (0.000) -2.820** (0.011)

R2 adjusted 0.8743 0.8539 0.9676 0.9709

Reset test 0.0025 0.0010 0.0273 0.0076

VIF (max) 2.83 1.40 1.44 1.44

Estimation method OLS Random effects OLS Random effects

N (sample size) 89 89 89 89

Template 83. Retail other expenditure models proposed by South Staffs Water, Yorkshire Water and South West Water

Description of dependent variable

Total retail costs less bad debt and debt management costs (customer service OPEX + other OPEX + metering OPEX + capital maintenance)

All costs are unsmoothed and nominal.

Information on selected explanatory variables

Bill size: Water UK

Private and social renters: 2011 census data extrapolated forwards using 2016 regional data on tenure by region from the ONS

Unique customers: based on Ofwat’s PR14 assumption that the cost to serve a dual customer is 1.3 times greater than for a single customer.

Comments on models (South Staffs Water)

Model 9 uses Ofwat’s PR14 unique customer measure. Model 10 uses a flexible specification allowing economies of scope to vary. We find limited difference in outcomes across the two specifications, suggesting that the assumption underpinning unique customers’ definition may be broadly appropriate.
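One way to see why the two specifications give similar outcomes is to compare the scope uplift each implies. Under the PR14 measure, unique customers equal total customers times (1 + 0.3 × dual share), so with a unit elasticity the uplift in log costs is ln(1 + 0.3 × dual share). Under the flexible specification it is the coefficient on % dual customers (0.247 in model 10) times the dual share. The sketch below assumes the dual share is measured as a fraction between 0 and 1 and that the elasticity on customer numbers is approximately 1; both are assumptions made for illustration, and the function names are hypothetical.

```python
import math

PR14_DUAL_WEIGHT = 1.3      # weight on dual customers in the unique customer measure
MODEL_10_DUAL_COEF = 0.247  # coefficient on % dual customers reported for model 10


def scope_uplift_unique(dual_share: float) -> float:
    """Log-cost uplift implied by the PR14 unique customer measure (unit elasticity assumed)."""
    return math.log(1.0 + (PR14_DUAL_WEIGHT - 1.0) * dual_share)


def scope_uplift_flexible(dual_share: float) -> float:
    """Log-cost uplift implied by the flexible specification (dual share as a fraction)."""
    return MODEL_10_DUAL_COEF * dual_share


shares = [i / 10 for i in range(11)]
largest_gap = max(abs(scope_uplift_unique(s) - scope_uplift_flexible(s)) for s in shares)

for s in (0.0, 0.5, 1.0):
    print(s, round(scope_uplift_unique(s), 3), round(scope_uplift_flexible(s), 3))
print("largest gap in log cost:", round(largest_gap, 3))
```

On these assumptions the two uplifts differ by less than 0.02 in log cost (under 2% of cost) at any dual share, which is consistent with the limited difference in outcomes noted above.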

Given that retail costs have been changing significantly (primarily becoming more efficient) in recent years, partly driven by the PR14 price control, we would advocate comparative cost assessment using only the most recent data alongside business plan projections.

Comments on models (Yorkshire Water)

Model 9 captures scale, scope and metering costs. Model 11 captures the above as well as a measure of transient population – the proportion of population in social housing.

We included measures of transient population to capture any impact on customer service or other non-bad debt costs. Neither deprivation nor bill size features as an explanatory variable for BOTEX less bad debt costs.

Comments on models (South West Water)

For BOTEX less bad debt models we have focused on capturing the effect of two key cost drivers:

Economies of scope – the number of dual customers a company serves; and

Metering penetration – the proportion of customers with meters.

We also looked at the proportion of revenue from customer prepayments as a potential driver.

Models 9 and 10 capture scale, scope and metering. Model 12 captures the above as well as the proportion of revenue from customer prepayments.

Modelling for 4 years (2013/14-16/17) or for AMP6 only (2015/16-16/17) has little impact on the model specification or efficiency ranges

Consultation model ID YKYSSCSWBROC1 SSCSWBROC2 YKYROC3 SWBROC4

Company’s model ID 9 10 11 12

Dependent variable Modelled retail less bad debt (log)

Log(total “unique” customers) 0.984*** (0.000)

0.970*** (0.000)

0.988*** (0.000)

Log(total customers) 0.990*** (0.000)

% dual customers 0.247* (0.052)

% metered customers 0.383 (0.130) 0.388 (0.130) 0.543*** (0.067) 0.430*** (0.001)

% customers in social housing 1.683 (0.266)

% of revenue from pre-payments -1.310*** (0.007)

Constant 9.514*** (0.000) 9.480*** (0.000) 9.237*** (0.000) 9.569*** (0.000)

R2 adjusted 0.964 0.964 0.965 0.967

Reset test 0.179 0.182 0.410 0.231

VIF (max) 1.018 2.33 1.45 1.031

Estimation method OLS OLS OLS OLS

N (sample size) 68 68 68 68

3.3 Total expenditure models

Template 84. Retail totex models proposed by Ofwat

Description of dependent variables

Total residential retail costs per household.

Total residential retail costs = total retail operating costs plus depreciation on capital investment. It excludes third party costs.

The denominator, household, is the total number of connected households receiving either water only, wastewater only or dual services.

Comments on models

The variables in our total retail cost models are those that performed well in our more disaggregated models – the bad debt models and the other cost models.

Models 1, 3 and 4 include the proportion of metered households to account for metering costs and, possibly, for higher customer service costs due to more frequent contact. Although the coefficient is not statistically significant in any of the models, its value is plausible and consistent across the different specifications.

Model 1 includes the proportion of dual service households (households which receive both water and wastewater services from the same retailer). This variable aims to capture higher costs associated with dual customers. It appears to capture the higher impact of dual customers on bad debt due to their higher bill relative to single service customers.

Models 2 to 4 include average bill size. Average bill size is very significant in all specifications. Its inclusion makes the proportion of dual customers insignificant, as it provides the same information on the effect of bill size on bad debt. We therefore excluded the proportion of dual customers from models 2-4.

Models 3 and 4 include a proxy for the probability of default (the percentage of households with default). Its coefficient has the correct sign, a plausible magnitude and a reasonable level of significance.

Model 4 includes the total number of households to allow for economies of scale. The significant negative coefficient implies that costs per household fall as the number of households served increases.

The time dummies suggest that costs have dropped during the PR14 period.

All monetary values have been inflated to 2016-17 prices using the CPIH.
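The economies of scale effect described for model 4 can be illustrated with a short calculation. This is a minimal sketch, assuming the reported coefficient of -0.119 on Ln(number of households) belongs to model ORTC4 and that all other drivers are held fixed; the function name is hypothetical and is not part of the consultation.

```python
import math

# Coefficient on Ln(number of households), as reported for model ORTC4.
SCALE_COEFFICIENT = -0.119


def cost_per_household_ratio(household_ratio: float,
                             coefficient: float = SCALE_COEFFICIENT) -> float:
    """Implied ratio of cost per household when the number of households is
    multiplied by `household_ratio`, holding the other drivers fixed.

    In a log-log specification the coefficient is an elasticity, so the
    ratio is household_ratio ** coefficient.
    """
    if household_ratio <= 0:
        raise ValueError("household_ratio must be positive")
    return math.exp(coefficient * math.log(household_ratio))


if __name__ == "__main__":
    ratio = cost_per_household_ratio(2.0)
    print(f"Doubling households scales cost per household by {ratio:.3f}")
    print(f"Implied reduction: {(1 - ratio) * 100:.1f}%")
```

Under these assumptions, a company serving twice as many households would have a modelled cost per household roughly 8% lower.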

Consultation model ID ORTC1 ORTC2 ORTC3 ORTC4

Dependent variable ------------ ln(total retail cost per household) ------------

% of dual service households 0.006*** (0.000)

Ln(number of households) -0.119** (0.012)

% metered households 0.005 (0.167) 0.004 (0.420) 0.004 (0.376)

Ln(bill size) 0.535*** (0.000) 0.468*** (0.000) 0.641*** (0.000)

% households with default (Eq_lpcf62) 0.026 (0.173) 0.042** (0.014)

2015 dummy 0.025 (0.344) 0.034 (0.156) 0.024 (0.344) 0.024 (0.372)

2016 dummy -0.070** (0.046) -0.029 (0.301) -0.043 (0.265) -0.029 (0.446)

2017 dummy -0.133*** (0.001) -0.090*** (0.003) -0.096** (0.012) -0.064* (0.094)

Constant 2.857*** 0.361 -0.14 0.117

R2 adjusted 0.583 0.612 0.638 0.694

VIF (max) 1.513 1.494 2.019 2.936

Reset test 0.732 0.005 0.033 0.396

Estimation method OLS OLS OLS OLS

N (sample size) 71 71 71 71

Template 85. Retail totex models proposed by Anglian Water

Description of dependent variable

Retail botex as defined in the Anglian Water Cost Modelling Phase 2 report, published March 2018: http://www.anglianwater.co.uk/about-us/thinking-about-our-future/

Description of selected explanatory variables

Deprivation measure – 80th percentile for IMD with billing used as weight.

Comments on models (Anglian Water)

The drivers of the integrated model are the drivers of the DDDM (Doubtful Debt and Debt Management) and Other Retail models, on the assumption that these models have been properly specified.

Consultation model ID ANHRTC1

Company’s model ID Retail

Dependent variable Total Retail botex

Ln(number of metered customers) 0.484 *** (0.000)

Ln(number of unmetered customers) 0.347 *** (0.000)

Ln(Average bill size) 0.419 *** (0.002)

Ln(Regional Wages) 1.263 *** (0.001)

Deprivation measure 0.582 ** (0.020)

Regional unemployment 4.432 * (0.092)

% wastewater customers of total customers 0.443 *** (0.006)

WoC billed wastewater customers as % of total customers -0.453 ** (0.028)

Billing complaints per 10,000 customers 0.003 ** (0.040)

Time trend 0.036 * (0.097)

Constant -8.804 *** (0.000)

R2 adjusted 0.9797

Reset test 0.055

VIF (max) 15.19

Method OLS

N (sample size) 89

Template 86. Retail totex models proposed by United Utilities

Description of dependent variable

Total retail costs per household.

Households are counted as one regardless of whether they receive one or two services; this is not a unique customer measure. These models capture economies of scope through the bill size/dual service variables.

Price base is in 2017 CPI terms.

Comments on models (United Utilities)

All models are discussed in more detail in Reckon LLP (2018) published alongside this consultation.

Bill ratio was constructed by Reckon LLP to control for the correlation between bill size and proportion of dual service customers. Bill ratio captures the differences between companies’ average bills, while accounting for the differing service mix.

These models perform well on diagnostic tests of model specification and analysis indicates that the coefficients are robust to observations being omitted. We consider this to be a more important test of predictive power than statistical significance.

Model RT4_d2 seeks to capture extreme deprivation. For example, the top-20 percent referred to is the 20 percent most deprived households, as measured by predicted IMD.
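One common way to check whether coefficients are robust to observations being omitted is a leave-one-out re-estimation. The sketch below is illustrative only: it assumes a single explanatory variable and uses made-up data, and it is not the procedure used by United Utilities or Reckon LLP, which the text does not describe. All function names are hypothetical.

```python
from typing import List, Sequence


def ols_slope(x: Sequence[float], y: Sequence[float]) -> float:
    """Slope of a simple OLS regression of y on x (with an intercept)."""
    n = len(x)
    if n < 2 or n != len(y):
        raise ValueError("x and y must have the same length, at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0:
        raise ValueError("x has no variation")
    return sxy / sxx


def leave_one_out_slopes(x: Sequence[float], y: Sequence[float]) -> List[float]:
    """Re-estimate the slope once per observation, dropping that observation."""
    xs, ys = list(x), list(y)
    return [
        ols_slope(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        for i in range(len(xs))
    ]


if __name__ == "__main__":
    # Hypothetical data, not taken from the consultation.
    x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
    y = [1.1, 1.9, 3.2, 3.9, 5.1, 6.0]
    slopes = leave_one_out_slopes(x, y)
    print(f"Full-sample slope: {ols_slope(x, y):.3f}")
    print(f"Leave-one-out range: {min(slopes):.3f} to {max(slopes):.3f}")
```

A narrow range of re-estimated slopes suggests that no single observation drives the result.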

Consultation model ID UURTC1 UURTC2

Company’s model ID RT4_d2 RT4_d4

Dependent variable Ln(total retail costs)

% dual service 0.323* (0.06) 0.278 (0.107)

Bill ratio 0.81*** (0.008) 0.859*** (0.004)

Deprivation measure (units vary by measure) 0.606 (0.29) 1.62 (0.177)

2014 dummy 0.065** (0.04) 0.067** (0.035)

2015 dummy 0.119*** (0.002) 0.115*** (0.002)

2016 dummy 0.042* (0.079) 0.035 (0.117)

Constant 2.279 2.016

R2 adjusted 0.676 0.686

VIF (max) 2.72 2.94

Reset test 0.208 0.615

Estimation method OLS OLS

N (sample size) 71 71

Template 87. Retail totex models proposed by Severn Trent Water

Description of dependent variable

Total revenue = operating costs + Depreciation + Amortisation

Description of selected explanatory variables

Customers - total number of households connected

Unemployment - % of the population in the region that are unemployed

Bill to income ratio – average bill (total revenue/number of connected households) divided by weekly earnings. In the models, this is gross weekly earnings of the lowest decile earners in the region.

Density - Proportion of customers residing in an area with more than 2000 people per square km.
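The bill to income ratio defined above can be written out as a short calculation. This is a minimal sketch based only on the definition given here (average bill divided by weekly earnings); the function name and the input figures are hypothetical.

```python
def bill_to_income_ratio(total_revenue: float,
                         connected_households: float,
                         weekly_earnings: float) -> float:
    """Average bill (total revenue / connected households) divided by
    gross weekly earnings of the chosen earnings percentile."""
    if connected_households <= 0 or weekly_earnings <= 0:
        raise ValueError("households and earnings must be positive")
    average_bill = total_revenue / connected_households
    return average_bill / weekly_earnings


if __name__ == "__main__":
    # Hypothetical inputs, not taken from the consultation:
    # 400m of revenue, 1m connected households, and weekly earnings
    # of 250 for the lowest decile earners in the region.
    ratio = bill_to_income_ratio(400_000_000, 1_000_000, 250)
    print(f"Bill to income ratio: {ratio:.2f}")
```

Because the denominator is weekly earnings of low earners, the same average bill gives a higher ratio in regions where low-decile earnings are lower.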

Comments on models (Severn Trent Water)

Model 1 OLS: In results not reported we find that the fit of the model (AICC, BIC) improves as we change the denominator of the bill to income ratio variable from median, to 20th and then 10th percentile. We have used the 10th percentile in all of the models included here.

Consultation model ID SVTRTC1

Company’s model ID 1

Dependent variable Ln(Total revenue)

Ln (Customers) 0.87*** (0.00)

Ln(Bill to income ratio(10th percentile)) 0.74*** (0.00)

Unemployment % 0.05 (0.29)

Density 0.42** (0.009)

Constant 3.22*** (0.00)

R2 adjusted 0.98

Reset test 0.012

VIF max 2.1

Estimation method OLS

N (sample size) 71

Template 88. Retail totex models proposed by Welsh Water

Description of dependent variable

Retail Operating Costs = “Total Operating Costs”

Comments on models (Welsh Water)

The submitted retail model controls for deprivation within a total operating cost model. Deprivation is measured using either the Income IMD or the IMD score. A comparable measure of IMD has been produced by Economic Insight detailed in the accompanying report “Evaluating a predicted IMD approach to debt cost assessment-Final-STC-12-03-18.pdf”. The model also controls for the number of customers, the proportion of metered customers and economies of scope. The proportion of metered properties is insignificant but is included from an operational point of view.

Economies of scope has been incorporated using two different approaches:

Models 7 and 8 use Ofwat’s PR14 1.3 assumption for dual service customers to calculate the “Ofwat Adjusted Customers”.

Models 8 and 9 use the number of unique accounts (dual service customers are counted as two accounts). The models then include the proportion of dual customers to account for economies of scope. The negative coefficient indicates the presence of economies of scope.

South West and Bournemouth have been modelled separately in these models.

Costs at outturn prices.

Consultation model ID WSHRTC1 WSHRTC2 WSHRTC3 WSHRTC4

Company’s model ID 8 9 10 11

Dependent variable Ln(Retail Operating Costs)

Ln(Unique Accounts) (,000) 0.973*** (0.000) 0.944*** (0.000)

Ln (Ofwat PR14 Adjusted Customers) (,000) 0.946*** (0.000) 0.931*** (0.000)

Ln(Average Wholesale Bill) 0.330*** (0.003) 0.436*** (0.001) 0.571** (0.010) 0.534** (0.010)

Income IMD (%) 3.842** (0.021) 4.360** (0.011)

IMD Score 0.022** (0.032) 0.030*** (0.005)

% metered customers 0.254 (0.423) 0.251 (0.402) 0.234 (0.450) 0.227 (0.462)

% Dual customers -0.437** (0.014) -0.309** (0.039)

Constant -6.011*** -6.195*** -7.162*** -6.800***

R2 adjusted 0.980 0.980 0.981 0.980

VIF (max) 3.04 2.87 7.40 5.46

Reset test 0.158 0.143 0.027 0.045

Estimation method OLS OLS OLS OLS

N (sample size) 89 89 89 89

Template 89. Retail totex models proposed by Yorkshire Water

Description of dependent variable

Total retail costs = OPEX + capital maintenance. Costs are not deflated and are unsmoothed.

Comments on models (Yorkshire Water)

We focused on four key cost drivers:

economies of scope, the number of dual customers a company serves;

metering penetration, which drives metering reading costs;

bill size, which increases a company’s exposure to customers defaulting; and

level of deprivation, which increases the propensity of customers to default.

We have additionally considered several measures of transient population to explore the impact that a high turnover of population has on the propensity of customers to default thereby increasing costs of debt management and customer service. Source: Private and social renters: 2011 census extrapolated forwards using 2016 regional data on tenure by region from the ONS.

We have used income deprivation to measure deprivation. Our aggregate BOTEX models also include the Ofwat measure of unique customers from PR14 (the sum of single customers + 1.3 * the sum of dual customers) and bill size. We consider one model with an additional control to capture the impact of metering on costs and another which uses the proportion of the population privately renting to capture population transiency. Income deprivation source DCLG and Welsh government statistics.
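The Ofwat PR14 measure of unique customers described above (single customers plus 1.3 times dual customers) can be sketched as follows. The function name and the example customer numbers are hypothetical; only the 1.3 weight comes from the text.

```python
# Weight applied to dual service customers in the Ofwat PR14 measure.
DUAL_SERVICE_WEIGHT = 1.3


def pr14_unique_customers(single_customers: float,
                          dual_customers: float,
                          dual_weight: float = DUAL_SERVICE_WEIGHT) -> float:
    """Sum of single customers plus `dual_weight` times dual customers."""
    if single_customers < 0 or dual_customers < 0:
        raise ValueError("customer counts must be non-negative")
    return single_customers + dual_weight * dual_customers


if __name__ == "__main__":
    # Hypothetical company: 200,000 single and 1,000,000 dual customers.
    adjusted = pr14_unique_customers(200_000, 1_000_000)
    print(f"PR14 adjusted customers: {adjusted:,.0f}")
```

A weight between 1 and 2 treats a dual service customer as more costly to serve than a single service customer, but less costly than two separate customers.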

Robustness checks:

1. Modelling retail BOTEX as OPEX + depreciation rather than OPEX + capital maintenance does not significantly change the model coefficients or outcomes.

2. Model coefficients remain similar and generally statistically significant if we limit to AMP6 data alone.

3. Models do not appear to be mis-specified based on the RESET test.

4. Models are generally robust to using the random effects estimator. Some coefficients become statistically insignificant, but the sign and magnitude hold.

5. The models are not materially impacted by the exclusion of outliers.

Ofwat comment: Model 1 was also proposed by South Staffs Water

Consultation model ID YKYSSCRTC1 YKYRTC2 YKYRTC3

Company’s model ID 1 2 3

Dependent variable Modelled retail BOTEX (log)

Unique customers, Ofwat PR14 measure (log) 0.935*** (0.000) 0.946*** (0.000) 0.915*** (0.000)

Average combined bill (log) 0.374*** (0.002) 0.344*** (0.008) 0.391*** (0.000)

Income deprivation (log) 0.302 (0.348) 0.335 (0.287) 0.375 (0.317)

Proportion of metered customers (%) 0.173 (0.465)

Proportion of private renters (%) 1.375 (0.534)

Constant 9.078*** (0.000) 9.145*** (0.000) 9.056*** (0.000)

R2 adjusted 0.973 0.973 0.973

Reset test 0.410 0.525 0.491

VIF (max) 2.96 3.26 3.84

Estimation method OLS OLS OLS

N (sample size) 68 68 68

Template 90. Retail totex models proposed by Wessex Water and Bristol Water

Description of dependent variable

Total retail operating costs: The totality of household operating retail costs, including opex and capital costs: customer services; debt management; doubtful debts; meter reading; services to developers; other operating expenditure; local authority rates; exceptional items; third party services; depreciation and amortisation.

Comments on models (Wessex Water and Bristol Water)

See comments on bad debt models by Wessex Water and Bristol Water.

Consultation model ID WSXRTC1 WSXRTC2 WSXRTC3 WSXRTC4 WSXRTC5 WSXRTC6 WSXRTC7 WSXRTC8

Company’s model ID A1 A4 A5 A8 B1 B4 B5 B8

Dependent variable ln(total retail operating costs)

ln(total customers) 0.877*** (0.000) 0.966*** (0.000) 1.043*** (0.000) 1.065*** (0.000)

ln(single service customers)

0.536*** (0.000)

0.563*** (0.0000)

0.349*** (0.001)

0.318*** (0.003)

-0.069* (0.087)

-0.134** (0.041)

-0.150** (0.030)

ln(dual service customers) 0.122*** (0.000) 0.159*** (0.0000) 0.226*** (0.000) 0.246*** (0.000)

Metered customers (%) 0.007* (0.062) 0.00198 (0.500) 0.005*** (0.005) 0.002 (0.400)

Metered household density (per km mains) -0.007** (0.041)

Flats (%) 0.057*** (0.000) 0.060*** (0.001) 0.053 (0.144)

ln(peak traffic speed) -0.364 (0.290)

IMD income (%) 0.164*** (0.000) 0.155*** (0.000) 0.066 (0.167) 0.105* (0.056) 0.027*** (0.001) 0.027*** (0.003)

Property repossessions (%) 0.107*** (0.000) 0.119*** (0.002) 0.121*** (0.000) 0.147*** (0.000) 0.113*** (0.000) 0.130*** (0.000)

ln(average wholesale bill) 1.206*** (0.000) 0.999*** (0.000) 0.341*** (0.000) 0.301 (0.213) 0.659*** (0.000) 0.480*** (0.000) 0.400*** (0.004) 0.351** (0.019)

Constant -10.02*** (0.000) -8.06*** (0.000) -2.74 (0.103) -3.84** (0.039) -6.97*** (0.000) -6.50*** (0.0000) -5.52*** (0.000) -5.45*** (0.000)

R2 adjusted 0.9284 0.9283 0.8957 0.9060 0.9821 0.9835 0.9815 0.9824

Reset test 0.016 0.006 0.000 0.000 0.204 0.408 0.007 0.017

VIF (max) 6.98 13.49 6.78 8.12 2.62 9.81 5.79 7.84

Estimation method OLS OLS RE RE OLS OLS RE RE

N (sample size) 89 89 89 89 89 89 89 89

Template 91. Retail totex models proposed by South East Water

Description of dependent variable

Total retail costs = total retail OPEX - third party services + capital expenditure

Costs are unsmoothed and in nominal prices

Description of selected explanatory variables

Prepayment: Ofwat data release for years 2013/14-14/15, assumed same levels over AMP6

Comments on models (South East Water)

For our aggregate retail cost models we have considered three key drivers of retail cost: economies of scope, metering and deprivation.

We have excluded average combined bill. We believe there are sufficient explanatory factors to infer debt-related costs, and we are mindful not to inflate the number of explanatory factors at the risk of underplaying more significant elements of retail functions (e.g. billing, customer queries/investigations and metering). We therefore consider one explanatory factor, the unemployment rate (as considered at PR14), to be a suitable proxy for deprivation and debt-related expenditure.

We consider the proportion of revenue from customer prepayments is a potential driver of bad debt and customer service costs. We have derived this variable from Ofwat’s publication of data used in the retail services efficiency report 28 September 2017. We defined it as the amount of deferred income from customer prepayments over appointed revenues.

During the present regulatory period, as SEW has become one of the industry-leading companies on the proportion of metered customers, our cost-to-serve reporting has indicated the real cost impact of servicing metered customers, and we are keen to ensure that this important factor is not underplayed by the introduction of other explanatory factors. Including average combined bill appears to reduce the impact of the proportion of metered customers on costs. But, as noted, average combined bill may not take into account possible diseconomies in chasing smaller combined bills. Smaller bills generally do not have the advantage of tougher legal action options, and are therefore more costly to chase for a smaller reward.

Consultation model ID SEWRTC1 SEWRTC2

Company’s model ID Model 1 Model 2

Dependent variable Modelled retail BOTEX (log)

Unique customers (log) 1.064*** (0.000) 1.078*** (0.000)

Unemployment (%) 0.0644* (0.080) 0.0448 (0.269)

Proportion of metered customers (%) 0.564* (0.082) 0.574** (0.015)

Proportion of revenue from prepayments (%) -1.581*** (0.016)

Constant 8.965*** 9.106***

R2 adjusted 0.969 0.973

Reset test 0.234 0.658

VIF (max) 1.575 1.661

Estimation method OLS OLS

N (sample size) 68 68

Template 92. Retail totex models proposed by South Staffs Water

Description of dependent variable

Total retail = total retail OPEX – third party services + capital expenditure

Comments on models (South Staffs Water)

Oxera developed models that broadly pass the diagnostic tests, however we have identified an issue with these models not fully capturing company specific levels of deprivation. We have found that modelling bill size and deprivation together may work at an industry level as most companies with higher deprivation are WaSCs, with higher combined bill levels. Such models are however unable to appropriately capture bad debt costs for a WoC with high deprivation levels but a lower bill. It is not appropriate that such companies should be penalised in cost assessment as a result of the statistical distribution of cost across WoCs and WaSCs.

Given this difficulty, we would support validating any econometric modelling with an efficient cost to serve approach for the customer service costs and a separate deprivation model for bad debt and debt collection costs for WoCs.

The models we have included for retail use income deprivation as a cost driver, which we believe to be most robust and reflective of our customer base and service area. We have not been able to develop models which use LSOA data on income deprivation; however this could be a plausible option if the data was robust and an appropriate threshold could be identified.

Given that retail costs have been changing significantly (primarily becoming more efficient) in recent years, partly driven by the PR14 price control, we would advocate comparative cost assessment using only the most recent data alongside business plan projections.

All costs are unsmoothed and modelled in nominal prices.

Ofwat comment: Model 6 is identical to Yorkshire Water’s model 1.

Consultation model ID YKYSSCRTC1 SSCRTC2

Company’s model ID 6 7

Dependent variable Total retail (log)

Unique customers, Ofwat measure (log) 0.935*** (0.000)

Average combined bill (log) 0.374*** (0.002) 0.415** (0.039)

Income deprivation (log) 0.302 (0.348) 0.351 (0.025)

Total customers (log) 0.938*** (0.000)

Proportion of dual customers (%) 0.175 (0.563)

Constant 9.078*** 8.954***

R2 adjusted 0.973 0.973

Reset test 0.410 0.430

VIF (max) 2.96 7.689

Estimation method OLS OLS

N (sample size) 68 68

4 Enhancement expenditure models

4.1 Meeting lead standards costs

Template 93. Meeting lead standards models proposed by Ofwat

Description of dependent variable

Capital expenditure for meeting lead standards, gross of grants and contributions.

All monetary values have been inflated to 2016-17 prices using the CPIH.

Comments on models

We present three models with alternative scale variables: total water delivered; total population served, and total number of communication pipes. We also include the number of lead pipes replaced as a proxy for the amount of work done to improve lead standards.

We use smoothed data averaged over three-year periods. The models perform slightly better in the original scale than in the logarithmic scale. However we note that the constants in models OE1 and OE2 are negative. If this has a large, distortive, impact on implied efficient costs for companies (eg if the implied cost allowance is negative) we may use the logarithmic model instead. By definition, the logarithmic model does not have the problem of a negative constant.

The estimated coefficients are robust and in line with expectations.
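The concern about negative constants can be illustrated numerically. The sketch below assumes that the first coefficient in each row of the results table belongs to model OE1 (water delivered 0.0015, pipes replaced 0.0004, constant -0.156); that mapping is read from the flattened table and is an assumption, as are the company sizes and the function names. The unit of cost is not stated in the text, so the outputs are shown without units.

```python
import math

# Assumed OE1 coefficients (levels specification), read from the results table.
OE1_CONSTANT = -0.156
OE1_WATER_DELIVERED = 0.0015  # per Ml/d of water delivered
OE1_PIPES_REPLACED = 0.0004   # per lead communication pipe replaced


def levels_allowance(water_delivered_mld: float, pipes_replaced: float) -> float:
    """Implied cost from the levels model; can be negative for small inputs."""
    return (OE1_CONSTANT
            + OE1_WATER_DELIVERED * water_delivered_mld
            + OE1_PIPES_REPLACED * pipes_replaced)


def log_model_allowance(predicted_log_cost: float) -> float:
    """Implied cost from a logarithmic model: exp() is always positive."""
    return math.exp(predicted_log_cost)


if __name__ == "__main__":
    # Hypothetical small and large companies, not taken from the consultation.
    print(f"Small company (levels): {levels_allowance(40, 100):.3f}")
    print(f"Large company (levels): {levels_allowance(1000, 2000):.3f}")
    print(f"Log model, any prediction: {log_model_allowance(-3.0):.3f}")
```

With these hypothetical inputs the levels model gives the small company a negative implied cost, whereas exponentiating a log-scale prediction always gives a positive value.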

Consultation model ID OE1 OE2 OE3

Dependent variable Meeting lead standards costs (smooth)

Water delivered (smooth) (Ml/d) 0.0015*** (0.000)

Total population served (smooth) (000’s) 0.0003*** (0.000)

Lead communication pipes (number) 0.000002* (0.051)

Lead communication pipes replaced (number) 0.0004*** (0.000) 0.0004*** (0.000) 0.0004** (0.01)

Constant -0.156 (0.461) -0.109 (0.624) 0.129 (0.593)

R2 adjusted 0.879 0.862 0.843

VIF (max) 1.752 1.534 2.905

Reset test (p-value) 0.087 0.050 0.000

Estimation method RE RE RE

N (sample size) 48 48 48

4.2 Water new developments and new connections

Template 94. Water new developments and new connections models proposed by Ofwat

Description of dependent variable

Capital expenditure associated with new developments and new connections, gross of grants and contributions.

All monetary values have been inflated to 2016-17 prices using the CPIH.

Comments on new models

We modelled the costs of new developments combined with the costs of new connections. We did so for two reasons. First, because these activities shared common cost drivers. Second, we wanted to mitigate potential cost allocation issues between the two activities. New connections expenditure was not reported separately until we requested companies to do so in December 2017.

We present two alternative models with a single (scale) variable. The models perform reasonably well and the coefficients are in line with expectations. We use data from 2005-06 with a three-year moving average to smooth the lumpiness of the data and mitigate misalignment of the costs and the drivers in any one year.
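The smoothing step can be sketched as a simple moving average. The text does not say whether the three-year window is trailing or centred, so the sketch below assumes a trailing window (each value is the mean of the current and two preceding years); the function name and the example series are hypothetical.

```python
from typing import List, Sequence


def moving_average(values: Sequence[float], window: int = 3) -> List[float]:
    """Trailing moving average; returns len(values) - window + 1 points."""
    if window <= 0:
        raise ValueError("window must be positive")
    return [
        sum(values[i - window + 1:i + 1]) / window
        for i in range(window - 1, len(values))
    ]


if __name__ == "__main__":
    # Hypothetical lumpy annual expenditure, not taken from the consultation.
    annual_costs = [3.0, 9.0, 0.0, 6.0, 12.0]
    print(moving_average(annual_costs))
```

Smoothing reduces the influence of a single year in which expenditure and its driver are recorded in different periods, at the cost of losing the first two years of each company's series.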

We will also consider including new developments and new connections costs as part of the wholesale water econometric models.

Consultation model ID OE4 OE5

Dependent variable ln smooth (new developments and new connections costs)

ln total population served (smooth) (000’s) 1.061*** (0.000)

ln total number of household and non-household new connections (smooth) (000’s) 1.040*** (0.000)

Constant -6.498*** (0.000) -0.242 (0.309)

R2 adjusted 0.823 0.815

VIF (max) 1.000 1.000

Reset test (p-value) 0.228 0.736

Estimation method RE RE

N (sample size) 70 70

4.3 First time sewerage costs

Template 95. First time sewerage models proposed by Ofwat

Description of dependent variable

Capital expenditure for new and additional sewage treatment and sewerage assets for first time sewerage schemes to meet the duty under s101A of the Water Industry Act 1991. The expenditure is gross of grants and contributions.

All monetary values have been inflated to 2016-17 prices using the CPIH.

Comments on models

We consider that first time sewerage costs are likely to be related to:

The size of the s101A schemes as measured by the number of connectable properties;

The number of s101A schemes completed; and

The size of the scheme, for which the average number of properties per scheme is a proxy.

Our models use data from 2009-10 with a three-year moving average to smooth the lumpiness of the data and mitigate misalignment of the costs and the drivers in any one year. The models performed better in the original scale than in logarithmic scale. The estimated coefficients are robust and in line with expectations.

Consultation model ID OE6 OE7 OE8

Dependent variable ------------ smooth (first time sewerage costs) ------------

Connectable properties served by s101a schemes (smooth) 0.017*** (0.000) 0.012*** (0.000)

S101a schemes (smooth) 1.245*** (0.000) 0.432*** (0.003)

Average number of connectable properties per s101a schemes (smooth) 0.009* (0.096)

Constant 0.063 (0.842) 0.877** (0.014) 0.584* (0.096)

R2 adjusted 0.824 0.918 0.923

VIF (max) 1.144 1.000 4.588

Reset test (p-value) 0.000 0.945 0.987

Estimation method RE RE RE

N (sample size) 59 59 59

4.4 Sewage growth

Template 96. Sewage growth models proposed by Ofwat

Description of dependent variable

Capital expenditure associated with three areas: new developments and growth; growth at sewage treatment works and reducing sewer flooding risk for properties. The costs are gross of grants and contributions.

All monetary values have been inflated to 2016-17 prices using the CPIH.

Comments on models

We combined costs of three enhancement activities: new development and network growth; growth at sewage treatment works; and reducing sewer flooding risk. These activities are likely to be affected by similar factors (eg the size of the customer base) and combining them will mitigate issues regarding potential inconsistencies in the way companies allocated costs between reducing sewer flooding risk and new development and network growth.

We present models with two alternative scale variables, resident population and number of household and non-household properties billed for sewerage. We include load per sewage treatment works in two models, to capture economies of scale.

We use a three-year moving average to smooth the lumpiness of the data and mitigate misalignment of the expenditure and the drivers in any one year. The models perform better in the original scale than in logarithmic scale. The estimated coefficients are robust and in line with expectations.

We will also consider including sewage growth costs, including new developments, as part of the wholesale wastewater econometric models.

Consultation model ID OE9 OE10 OE11 OE12

Dependent variable ------------ smooth (sewage growth) ------------

Resident population (smooth) (000s) 0.005*** (0.000) 0.003* (0.068)

Household and non-household properties billed for sewage (smooth) 0.012*** (0.000) 0.006* (0.069)

Load per sewage treatment work (smooth) (kg BOD5/day) 0.012 (0.113) 0.015** (0.022)

Constant 4.869 (0.279) 2.766 (0.603) 7.004 (0.139) 6.109 (0.223)

R2 adjusted 0.780 0.751 0.817 0.818

VIF (max) 1.000 1.000 4.753 3.590

Reset test (p-value) 0.308 0.154 0.683 0.682

Estimation method RE RE RE RE

N (sample size) 40 40 40 40

