
GL Garrad Hassan Short term power forecasts for large offshore wind turbine arrays

Accurate wind (and hence power) forecasts are required 4, 24 and 48 hours ahead for trading purposes. We receive 4 forecasts from different NWP (Numerical Weather Prediction) models. One problem is resolution: UK NWP models have a grid size of 4 square km, whereas a typical wind farm may cover a few hundred square km. We have no knowledge of the NWP models themselves, but we do have (some) on-site measurements that can be combined with the model outputs to improve the forecast.

NWP versus On-site measurements

Approaches split into two camps:

Statistical analysis

• Linear regression analysis

• Probabilistic forecasting

Modelling

• Auto regressive model

• 4D-Var data assimilation

• Artificial Neural Networks

Linear regression analysis

• Bias correction

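• “Optimal NWP” as a linear combination of NWPs:

$$\mathrm{NWP}_{\mathrm{optimal}} = w_1\,\mathrm{NWP}_1 + w_2\,\mathrm{NWP}_2 + w_3\,\mathrm{NWP}_3 + w_4\,\mathrm{NWP}_4$$

Obtain the weights using a least squares fit to the measurements from the first half of 2012. Forecast for the second half.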
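The least squares fit of the four weights can be sketched as below. This is a minimal illustration only: the function names (`solve_linear`, `fit_weights`, `combine`) and all numbers are made up for the example, and the normal equations are solved with a small hand-written Gaussian elimination so that no external library is needed. The slides do not say how the fit was actually implemented.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_weights(forecasts, measured):
    """Weights w minimising sum_t (measured[t] - sum_k w[k] * forecasts[t][k])^2."""
    k = len(forecasts[0])
    # Normal equations: (F^T F) w = F^T y
    ftf = [[sum(row[i] * row[j] for row in forecasts) for j in range(k)]
           for i in range(k)]
    fty = [sum(row[i] * y for row, y in zip(forecasts, measured))
           for i in range(k)]
    return solve_linear(ftf, fty)


def combine(weights, forecast_row):
    """NWP_optimal = w1*NWP1 + w2*NWP2 + w3*NWP3 + w4*NWP4."""
    return sum(w * f for w, f in zip(weights, forecast_row))


# Synthetic "first half" data (invented): rows are times, columns are NWP 1..4.
train_forecasts = [
    [8.0, 7.5, 8.2, 9.0],
    [5.0, 5.5, 4.8, 6.0],
    [12.0, 11.0, 12.5, 13.5],
    [3.0, 3.2, 2.9, 4.1],
    [9.5, 10.0, 9.0, 10.2],
    [6.5, 6.0, 7.1, 6.8],
]
true_weights = [0.3, 0.2, 0.4, 0.1]  # used only to generate the fake measurements
train_measured = [combine(true_weights, row) for row in train_forecasts]

weights = fit_weights(train_forecasts, train_measured)
# "Second half": apply the fitted weights to a new set of four forecasts.
optimal = combine(weights, [7.0, 7.4, 6.9, 8.0])
```

Because the fake measurements are an exact combination of the forecasts, the fit recovers the generating weights; with real measurements the residual would not be zero.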

Linear regression analysis: RMS errors for the second half of 2012

Forecast horizon   NWP 1   NWP 2   NWP 3   NWP 4   NWP optimal
48 hour            1.84    1.92    1.82    2.23    1.68
30 hour            1.63    1.73    1.59    1.77    1.46
4 hour             1.49    1.52    1.28    1.58    1.28

Auto regressive model

ANN for SCADA data

• SCADA is useful on a very short time scale (<8 hours)

• Uses smoothed measured wind speed data: hourly averages of the 10 min averages

• The simulation uses the average wind speed from the last K hours (K=1..96)

• The network is trained on N vectors of length K (N=1000..10000)

• It estimates the wind speed with an H hour horizon (H=1..8)
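The data preparation described in the bullets above might look like the following sketch. The helper names (`hourly_means`, `make_training_pairs`) are hypothetical, and it assumes six 10 min averages per hour with no gaps in the record.

```python
def hourly_means(ten_min_speeds):
    """Average consecutive groups of six 10 min values; an incomplete last hour is dropped."""
    return [sum(ten_min_speeds[i:i + 6]) / 6
            for i in range(0, len(ten_min_speeds) - 5, 6)]


def make_training_pairs(hourly_speeds, k, h):
    """Build (input, target) pairs.

    Each input is a vector of the last k hourly means; its target is the
    hourly mean h hours after the last hour in the input vector.
    """
    inputs, targets = [], []
    for end in range(k, len(hourly_speeds) - h + 1):
        inputs.append(hourly_speeds[end - k:end])
        targets.append(hourly_speeds[end + h - 1])
    return inputs, targets


# Made-up 10 min record covering 10 hours.
ten_min = [float(i) for i in range(60)]
hourly = hourly_means(ten_min)
inputs, targets = make_training_pairs(hourly, k=3, h=2)
```

With K = 3 and H = 2 on a 10 hour record this yields six training vectors; the study's ranges (K up to 96, N up to 10000) need a correspondingly longer record.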

ANN simulation results

• RMSE of the ANN is smaller than that of the average of the estimations of the four NWP models:

– RMSE of NWP average: 1.5767

– RMSE of estimation: 1.3938 (about 12% improvement)

– RMSE of persistence: 2.8818

– Standard deviation: 2.271
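The quoted improvement can be checked from the two RMSE values above, taking the NWP average as the reference:

```latex
\[
\frac{\mathrm{RMSE}_{\mathrm{NWP\ average}} - \mathrm{RMSE}_{\mathrm{estimation}}}
     {\mathrm{RMSE}_{\mathrm{NWP\ average}}}
= \frac{1.5767 - 1.3938}{1.5767}
\approx 0.116
\]
```

That is a reduction of roughly 12%.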

Data Assimilation

[Diagram: observations are combined with the background estimate $u_b$ to give the analysis vector $u_a$ (the optimal solution); the assimilation is re-run periodically.]

Resulting Equation

• 4D-Var Data Assimilation cost function:

$$J(u_0) = (u_b - u_0)^T B^{-1} (u_b - u_0) + \sum_{l=0} [y_l - H_l(u_l)]^T R_l^{-1} [y_l - H_l(u_l)]$$

Optimal Solution: $u_a = \arg\min_{u_0} J(u_0)$

Data for this Problem

• Aim: Estimate the optimal initial condition for the ARMA model, $x_0$.

• $x_n = \sum_{i=1}^{m} a_i x_{n-i} + \xi_{n+i}$

To find the optimal solution from the model, we need the optimal parameters from which $x_0$ is estimated ($a_i$ and $\xi_{n+i}$ are fixed by the training data).

• These are $x_{-1}, \dots, x_{-m}$.

• So let $u_l = (x_{l-1}, x_{l-2}, \dots, x_{l-m})^T$, which implies $u_0 = (x_{-1}, x_{-2}, \dots, x_{-m})^T$.

• Use NWP data as $u_b$, the a priori information that constrains the solution.
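As a rough illustration of how the cost function applies to this set-up, the sketch below evaluates J(u0) for a small autoregressive model and minimises it by plain gradient descent. Several things here are assumptions made for the example and are not taken from the slides: the noise terms are set to zero, B and R_l are taken as scalar multiples of the identity (`b_var`, `r_var`), each observation y_l is compared with the model value of x_l, and the coefficients, background and observations are invented.

```python
def ar_predict(state, coeffs):
    """One-step prediction x_l = sum_i a_i * x_{l-i}, with state = [x_{l-1}, ..., x_{l-m}]."""
    return sum(a * x for a, x in zip(coeffs, state))


def cost(u0, u_b, b_var, obs, r_var, coeffs):
    """J(u0) with B = b_var * I and R_l = r_var (assumed forms)."""
    j = sum((b - u) ** 2 for b, u in zip(u_b, u0)) / b_var  # background term
    state = list(u0)
    for y in obs:
        x_pred = ar_predict(state, coeffs)   # plays the role of H_l(u_l)
        j += (y - x_pred) ** 2 / r_var       # observation term
        state = [x_pred] + state[:-1]        # advance u_l to u_{l+1}
    return j


def minimise(cost_fn, start, step=0.01, iters=2000, eps=1e-4):
    """Gradient descent with central-difference gradients (a crude stand-in for a real optimiser)."""
    u = list(start)
    for _ in range(iters):
        grad = []
        for i in range(len(u)):
            up, dn = u[:], u[:]
            up[i] += eps
            dn[i] -= eps
            grad.append((cost_fn(up) - cost_fn(dn)) / (2.0 * eps))
        u = [ui - step * g for ui, g in zip(u, grad)]
    return u


coeffs = [0.6, 0.3]          # invented AR(2) coefficients a_1, a_2
u_true = [8.0, 7.0]          # invented "true" initial state (x_{-1}, x_{-2})
obs, state = [], list(u_true)
for _ in range(6):           # noise-free synthetic observations y_0..y_5
    x = ar_predict(state, coeffs)
    obs.append(x)
    state = [x] + state[:-1]

u_b = [7.0, 7.5]             # invented background estimate (NWP-derived in the study)


def j(u):
    return cost(u, u_b, 1.0, obs, 0.25, coeffs)


u_a = minimise(j, u_b)       # analysis vector
```

The analysis `u_a` is a compromise between the background and the state implied by the observations, so its cost is lower than the cost at either `u_b` or `u_true`. The fixed step size is only suitable for this small example.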

Making Probabilistic Forecasts for wind activity.

Given point forecasts from 4 models and observations, how can we make probabilistic forecasts of wind speed?

First step: if we don't have any specific knowledge of the future, we can naively look at past observations. We call this the 'climatology'.

We expect any useful model to do better than climatology.

Our simple approach (due to time constraints) is to create a climatological distribution modelled as a Gaussian distribution from a whole year's data.

We could also use climatology that is month or even date specific (data permitting).

Moving Variance of wind speed over a year

Ignorance Skill Score

To compare our models we use the 'ignorance score' (Good, 1952), given by ign = -log(p), where p is the amount of probability assigned to the true outcome. All ignorance scores are given relative to climatology (a negative score means we are doing better than climatology).
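A small sketch of the score and of its use relative to climatology follows. The base of the logarithm is not recoverable from the slide; base 2 is assumed here, and the probability values are invented.

```python
import math


def ignorance(p):
    """Ignorance of a forecast that assigned probability (or density) p to the outcome.

    Base 2 is an assumption; another base only rescales the score.
    """
    return -math.log2(p)


def mean_relative_ignorance(model_probs, clim_probs):
    """Average of ign(model) - ign(climatology); negative means the model beats climatology."""
    diffs = [ignorance(pm) - ignorance(pc)
             for pm, pc in zip(model_probs, clim_probs)]
    return sum(diffs) / len(diffs)


# Invented example: probabilities each forecast gave to what actually happened.
model_probs = [0.4, 0.2]
clim_probs = [0.1, 0.1]
rel_ign = mean_relative_ignorance(model_probs, clim_probs)
```

In this example the model put four times and twice as much probability on the outcomes as climatology did, so the relative ignorance is negative.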

Blending with climatology

It is common to create new models that are a linear combination of a particular model and the climatology.

We then have a distribution of the form

$$P(y) = \alpha\,P_{\mathrm{mod}}(y) + (1-\alpha)\,P_{\mathrm{clim}}(y)$$

where α is optimised in some way (commonly to minimise the ignorance score).
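One simple way to choose α is a grid search over [0, 1] that minimises the mean ignorance on past data. The sketch below assumes a base-2 ignorance score and uses invented density values. The grid search itself is an illustrative choice, since the slides do not say how α was optimised.

```python
import math


def blend(alpha, p_mod, p_clim):
    """P(y) = alpha * P_mod(y) + (1 - alpha) * P_clim(y)."""
    return alpha * p_mod + (1.0 - alpha) * p_clim


def mean_ignorance(alpha, mod_densities, clim_densities):
    """Mean of -log2 of the blended density at the observed outcomes."""
    scores = [-math.log2(blend(alpha, pm, pc))
              for pm, pc in zip(mod_densities, clim_densities)]
    return sum(scores) / len(scores)


def best_alpha(mod_densities, clim_densities, steps=100):
    """Grid search for the alpha in [0, 1] with the lowest mean ignorance."""
    candidates = [i / steps for i in range(steps + 1)]
    return min(candidates,
               key=lambda a: mean_ignorance(a, mod_densities, clim_densities))


# Invented densities at three observed outcomes: the model is sharp and
# usually right, but badly wrong once, which is where climatology helps.
mod_densities = [0.5, 0.4, 0.001]
clim_densities = [0.1, 0.1, 0.1]
alpha_star = best_alpha(mod_densities, clim_densities)
```

All densities must be strictly positive, otherwise the logarithm is undefined at α = 0 or α = 1.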

A simple model

To create a simple model from the data we take a Gaussian distribution with moving mean, averaged from the last 5 observations and a moving variance taken from the last 30 observations.

The relative ignorance for this model is 0.38.

This means that our model does worse than climatology.

However, when we blend with climatology (α=0.3), we get a relative ignorance score of -0.02.
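The moving mean and moving variance of this simple model could be computed as in the sketch below. It assumes the variance is the sample variance of the last 30 observations about their own mean, which the slides do not specify, and the observation series is invented.

```python
import statistics


def moving_gaussian(observations, mean_window=5, var_window=30):
    """Return (mean, variance) of the Gaussian forecast for the next value.

    The mean is the average of the last `mean_window` observations; the
    variance is the sample variance of the last `var_window` observations
    (assumed definition).
    """
    mean = statistics.fmean(observations[-mean_window:])
    variance = statistics.variance(observations[-var_window:])
    return mean, variance


# Invented wind speed record: 40 values cycling through 5, 6, 7, 8.
observations = [5.0 + (i % 4) for i in range(40)]
mean, variance = moving_gaussian(observations)
```

The resulting mean and variance define the Gaussian whose density at the next observation feeds the ignorance score.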

Kernel Dressing Models

We can turn point forecasts into probabilistic forecasts using kernel dressing.

We replace each point forecast with a Gaussian distribution (also known as a kernel).

The mean of the Gaussian distribution is just the point forecast with a bias correction from past experience.

The standard deviation for each model is the mean error found from past experience.

This is done for all 4 NWP models
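Kernel dressing of a single model can be sketched as follows. Two points are assumptions: the bias is taken as the mean of (forecast minus observation), and "mean error" is read as the mean absolute error after bias correction. The past forecasts and observations are invented.

```python
import math


def gaussian_pdf(y, mean, std):
    """Density of a Gaussian with the given mean and standard deviation."""
    z = (y - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def fit_kernel(past_forecasts, past_observations):
    """Return (bias, width) estimated from past forecast/observation pairs."""
    errors = [f - o for f, o in zip(past_forecasts, past_observations)]
    bias = sum(errors) / len(errors)
    # Assumed reading of "mean error": mean absolute bias-corrected error.
    width = sum(abs(e - bias) for e in errors) / len(errors)
    return bias, width


def dressed_density(y, forecast, bias, width):
    """Density at y of the Gaussian kernel centred on the bias-corrected forecast."""
    return gaussian_pdf(y, forecast - bias, width)


# Invented history for one NWP model.
past_forecasts = [8.0, 6.0, 10.0, 7.0]
past_observations = [7.0, 5.5, 8.5, 7.0]
bias, width = fit_kernel(past_forecasts, past_observations)

# New point forecast of 9.0: the kernel is centred on 9.0 - bias.
density_at_centre = dressed_density(9.0 - bias, 9.0, bias, width)
density_at_raw_forecast = dressed_density(9.0, 9.0, bias, width)
```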

We can compare the models using the relative ignorance score. Each one has been blended with climatology.

Model      NWP1    NWP2    NWP3    NWP4
Rel. ign   -0.93   -0.88   -0.67   -0.79

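We can create a new model that is a weighted average of the other 4 models, blended with climatology, i.e.

$$P(y) = \alpha\,\big(w_1 P_1(y) + w_2 P_2(y) + w_3 P_3(y) + w_4 P_4(y)\big) + (1-\alpha)\,P_{\mathrm{clim}}(y)$$

where $P_k(y)$ is the kernel-dressed distribution for NWP model k.

W1        W2        W3        W4        α      Rel. ign
0.35336   0.34057   0.15533   0.16120   0.99   -0.99
0.25      0.25      0.25      0.25      0.99   -0.94
0.7       0.3       0         0         0.99   -1.00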
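The weighted, climatology-blended density can be written as one short function, sketched below. The forecasts, biases, widths and climatology parameters are invented. The `width_scale` argument is a hypothetical extra that rescales all kernel widths at once.

```python
import math


def gaussian_pdf(y, mean, std):
    """Density of a Gaussian with the given mean and standard deviation."""
    z = (y - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def mixture_density(y, forecasts, biases, widths, weights, alpha,
                    clim_mean, clim_std, width_scale=1.0):
    """alpha * sum_k w_k P_k(y) + (1 - alpha) * P_clim(y) with Gaussian kernels."""
    p_models = sum(
        w * gaussian_pdf(y, f - b, s * width_scale)
        for w, f, b, s in zip(weights, forecasts, biases, widths)
    )
    return alpha * p_models + (1.0 - alpha) * gaussian_pdf(y, clim_mean, clim_std)


# Invented inputs for one forecast time.
forecasts = [8.0, 8.6, 7.5, 9.1]     # point forecasts from NWP 1..4
biases = [0.2, -0.1, 0.3, 0.5]       # per-model bias corrections
widths = [1.5, 1.6, 1.4, 1.8]        # per-model kernel widths
weights = [0.25, 0.25, 0.25, 0.25]   # equal weights, as in the second table row
alpha = 0.99

density = mixture_density(8.0, forecasts, biases, widths, weights, alpha,
                          clim_mean=8.0, clim_std=4.0)
```

When the weights sum to 1 the result is a proper density; weights that do not sum to 1 would need renormalising before the ignorance score is meaningful.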

Kernel Dressing methods

We chose our kernel widths with a forecast from an individual model in mind. Given that the 4 models together are likely to cover more of the possible outcomes, we might want to reduce the kernel widths. The results of halving them are shown below.

W1        W2        W3        W4        α      Rel. ign
0.35336   0.34057   0.15533   0.16120   0.94   -1.16
0.25      0.25      0.25      0.25      0.99   -1.14
0.7       0.3       0         0         0.99   -0.96

Summary of Results

First Gaussian model: 0.38

First Gaussian model blended with climatology: -0.02

Individual models blended with climatology: -0.93 (NWP1)

Weighted model using kernel widths from individual models and blended with climatology

Weighted model using smaller kernel widths and blended with climatology

Possible future work

Extend the work to more realistic distributions rather than Gaussian.

Find probabilistic forecasts from ensembles.

Find a way of optimising the weightings and kernel widths by minimising the ignorance score.

Extend the work to probabilistic forecasts of power.

Artificial Neural Networks (ANN) with Gaussian Radial Basis Functions*

* MATLAB's "newgrnn" function is used
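A generalised regression neural network with Gaussian radial basis functions amounts to a kernel-weighted average of the training targets. The sketch below shows that idea in plain Python. It is not a port of the MATLAB function: the `spread` scaling used here is an assumption and need not match MATLAB's convention, and the training data are invented.

```python
import math


def grnn_predict(x, train_inputs, train_targets, spread):
    """Predict the target for input vector x as a Gaussian-weighted mean of training targets."""
    weights = []
    for xi in train_inputs:
        d2 = sum((a - b) ** 2 for a, b in zip(x, xi))   # squared distance to a training input
        weights.append(math.exp(-d2 / (2.0 * spread ** 2)))
    total = sum(weights)
    # If x is far from every training point, total can underflow to 0;
    # a real implementation would need to guard against that.
    return sum(w * t for w, t in zip(weights, train_targets)) / total


# Invented training set: four NWP wind speeds as input, measured speed as target.
train_inputs = [
    [5.0, 5.5, 4.8, 6.0],
    [8.0, 7.5, 8.2, 9.0],
    [12.0, 11.0, 12.5, 13.5],
]
train_targets = [5.2, 8.1, 12.3]

prediction = grnn_predict([8.0, 7.5, 8.2, 9.0], train_inputs, train_targets, spread=0.5)
blended = grnn_predict([6.5, 6.5, 6.5, 7.5], train_inputs, train_targets, spread=2.0)
```

A small spread reproduces the nearest training target almost exactly, while a larger spread averages over several training points.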

ANN for SCADA evaluation

• Evaluation is based on the percentage reduction of the RMSE (Root Mean Square Error) compared to persistence (assuming that the wind speed H hours later is the same as it is now)
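The persistence benchmark and the percentage reduction can be sketched as below. The function names and the short series are invented for the example; the two RMSE values in the last line are the ones quoted in the ANN simulation results.

```python
import math


def rmse(predicted, actual):
    """Root mean square error between two equal-length sequences."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))


def persistence_forecast(series, h):
    """Persistence: the forecast for time t + h is the value at time t (h >= 1).

    Returns (forecasts, actuals) aligned element by element.
    """
    return series[:-h], series[h:]


def rmse_reduction_percent(model_rmse, reference_rmse):
    """Percentage reduction of RMSE relative to a reference forecast."""
    return 100.0 * (reference_rmse - model_rmse) / reference_rmse


# Invented hourly wind speeds and a 2 hour horizon.
series = [5.0, 6.0, 8.0, 7.0, 9.0, 10.0]
forecasts, actuals = persistence_forecast(series, h=2)
persistence_rmse = rmse(forecasts, actuals)

# Using the RMSE values quoted in the ANN simulation results.
reduction = rmse_reduction_percent(1.3938, 2.8818)
```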

ANNs for NWP values

Training of the ANN

• Input: 30-hour-ahead wind speeds calculated by four Numerical Weather Prediction models (NWP1-4)

• Target: Measured wind speeds at the site 30 hours later

• Evaluation: results are compared to the estimate given by the average of the four NWPs and to persistence

Training & Results

• Limited amount of data for training the network: 365 days of 9am predictions for the 3pm wind speed on the next day, together with the measured wind speeds at that time from SCADA data

• The RMS error decreases with the number of training data:

Conclusions

• Neural Networks (with Gaussian Radial Basis Functions) used for short term prediction of wind speed based exclusively on SCADA data do not provide a significant improvement over persistence (the naive estimator): 5% improvement at most

• Longer term (30 hour) predictions using ANNs based on the four NWP inputs provide good results and a significant improvement over averaging the NWPs: at least 12% improvement (probably more if the network is trained on a larger amount of data; 300 was the maximum in this study)

• Combining the SCADA data and the NWP data to form the input for the ANN would probably provide better results for both long and short term; further investigation required
