My Lumby 11nov

Republic of Sudan

Ministry of High Education & Scientific Research

Nile Valley University

College of Post Graduate Studies

Lumpy Demand

Reported by/

Abdulhmeed Mohamed Elhassan Mahjoub Ali

Introduction

Demand forecasting is one of the most crucial issues of inventory management.

Forecasts, which form the basis for the planning of inventory levels, are probably

the biggest challenge in the repair and overhaul industry.

The problem of controlling items with lumpy demand patterns has received

relatively little attention, even though these items constitute an appreciable portion

of the inventories in parts and supply types of stockholdings. Lumpy demand arises

in service parts and electronics components when there are variations in volumes

associated with the product mix, and with intervals between demands being fairly

erratic and unpredictable.

Lumpy demand (also called intermittent or slow-moving demand), in which periods without demand are followed suddenly by a period with demand, is even more difficult to forecast. If demand is underestimated, the result is lost sales and therefore lost revenue. If demand is overestimated, stock levels rise in the best case; in the worst case, the items lie unsold until they become obsolete.

Definitions

Forecasting lumpy demand differs from forecasting demand that occurs in every forecasting period. Silver (1981) defines lumpy demand as demand in which both the demand size and the time between demands are random. Croston (1972) defines demand as lumpy when it is zero in a number of forecasting periods. When zero demands are present, some of the most widely used forecasting methods start to overestimate the demand. Croston (1972) suggested splitting the forecast in two: one forecast for the demand size and one for the number of periods between demand occasions (the inter-demand interval). Syntetos and Boylan (2001) proved that Croston’s forecasting method is itself biased: it overestimates the demand.

A number of alternatives to Croston’s forecasting method have been proposed, including Segerstedt (2000), Snyder (2002), Willemain et al (2004), and Syntetos and Boylan (2005). [1]

Forecasting Methods

Two of the most common methods for forecasting lumpy demand are single exponential smoothing (SES) and the moving average. The moving average is the mean of the demand in a fixed number of previous periods; as each new demand occurs, it replaces the oldest observation (Makridakis et al, 1998). However, the ability of SES to forecast slow-moving items or lumpy demand has been questioned. Croston (1972) presents a method that is updated only when there is a demand, which should increase the forecast precision. The method consists of two forecasts, one for the demand size and the other for the inter-demand interval. Segerstedt (2000) suggests a variant of Croston that uses a single forecast. Syntetos and Boylan (2005) present a version of Croston with an added bias correction. Bias is a systematic error: the forecast is, on average, significantly above or below the demand during the forecasted periods.

1 Single Exponential Smoothing (SES)

Single exponential smoothing is a technique applied in different fields, such as forecasting (Brown, 1959) and process regulation (Montgomery, 2005). According to Gardner (2006), the method was originally developed for antisubmarine purposes: Brown used a variant of exponential smoothing to create a tracking model for fire-control information on the location of the submarine. Makridakis and Hibon (1991) consider SES to be a robust method that is easy to use.

In every time period the forecast is re-estimated from the most recent available demand data and the previous forecast. The smoothing constant, α, regulates how much influence the forecast error has. The forecast error is the difference between the real demand and the forecasted demand (Montgomery et al, 1990):

$$\hat{X}_{t+1} = \hat{X}_t + \alpha \, (X_t - \hat{X}_t)$$

where $X_t$ is the demand in period $t$ and $\hat{X}_t$ is the forecast for period $t$.

Another way of describing the function of the smoothing constant is that the observations have weights that decrease geometrically with age. The smoothing constant regulates the influence of historical values. A low smoothing constant emphasizes the past, which is favorable when demand is stable, but the technique is then slow to react if systematic changes occur. A high smoothing constant emphasizes the most recent observations, which is better suited when a faster reaction is wanted, but the drawback is sensitivity to random changes (Montgomery et al, 1990).
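The SES update can be sketched in a few lines of Python. This is a minimal illustration rather than code from the cited sources: the function name `ses_forecasts` and the way the first forecast is initialised (a caller-supplied value, or the first observation if none is given) are assumptions made here.

```python
def ses_forecasts(demand, alpha, initial_forecast=None):
    """Return the sequence of SES forecasts for a demand series.

    forecasts[0] is the starting forecast; forecasts[i + 1] is the forecast
    made after observing demand[i]. The initialisation is an assumption of
    this sketch, not something prescribed by the text.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    forecast = float(demand[0] if initial_forecast is None else initial_forecast)
    forecasts = [forecast]
    for x in demand:
        # New forecast = old forecast + alpha * forecast error
        forecast = forecast + alpha * (x - forecast)
        forecasts.append(forecast)
    return forecasts


# A lumpy series: the forecast decays during the zero-demand periods.
print(ses_forecasts([10, 0, 0, 20], alpha=0.5, initial_forecast=10))
```

With a high constant such as 0.5 the forecast halves in each zero-demand period and jumps after the next demand, which illustrates the sensitivity to random changes mentioned above.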

In a practical application, different smoothing constants should be used for different classes of items, and this applies to SES as well. A smoothing constant between 0.1 and 0.3 is suitable for SES when forecasts are made on a monthly basis (Silver et al, 1998).

The weight given to data from $k$ periods ago can be expressed as $\alpha(1-\alpha)^k$, which makes the average age of the data:

$$\sum_{k=0}^{\infty} k \, \alpha (1-\alpha)^k = \frac{1-\alpha}{\alpha}$$

With a higher resolution of the forecast intervals (shorter time periods), the probability of periods with zero demand increases. If several consecutive periods have zero demand, the forecast decreases and eventually approaches zero. This scenario is most likely to occur when the items are slow moving, that is, when they have lumpy demand. An alternative is to update the forecast only after a demand has occurred. This makes SES biased, which is not the case when the update occurs in every period (Boylan and Syntetos, 2007).

The conclusions about SES as a forecasting method for lumpy demand vary. Croston (1972) discusses the problem that SES overestimates the demand when the forecast update takes place right after a demand. Boylan and Johnston (1996) consider SES to be suitable when the inter-demand interval is 1.25 periods or lower. Eaves and Kingsman (2004), on the other hand, conclude that SES can be used as a method for demand that is lumpy.

2 The Croston Method (Croston)

Croston (1972) presented a solution for slow-moving items. He suggested dividing the forecast into two parts: one for the demand size and one for the inter-demand interval. The forecast is recalculated only when there is a demand.

The two exponential smoothing forecasts are then combined to estimate the mean

demand per period length:
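As a hedged illustration of the procedure just described, the sketch below smooths the demand size and the inter-demand interval separately, updates both only in periods with demand, and returns their quotient as the estimated demand per period. The names `croston`, `alpha` and `beta`, and the initialisation from the first observed demand, are assumptions of this sketch and are not taken from Croston (1972).

```python
def croston(demand, alpha, beta):
    """Estimate mean demand per period with a Croston-style update.

    alpha smooths the demand size, beta smooths the inter-demand interval.
    Both estimates are initialised from the first non-zero demand, which is
    an assumption of this sketch.
    """
    size = None               # smoothed demand size
    interval = None           # smoothed inter-demand interval (in periods)
    periods_since_demand = 1  # periods elapsed since the last demand

    for x in demand:
        if x > 0:
            if size is None:
                size = float(x)
                interval = float(periods_since_demand)
            else:
                size += alpha * (x - size)
                interval += beta * (periods_since_demand - interval)
            periods_since_demand = 1
        else:
            # No demand: the forecast is left unchanged, only the counter grows.
            periods_since_demand += 1

    if size is None:
        return 0.0  # no demand observed at all
    return size / interval


print(croston([0, 4, 0, 0, 8], alpha=0.5, beta=0.5))
```

When demand occurs in every period the interval estimate stays at one, so the quotient equals the smoothed demand size.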

Syntetos and Boylan (2001) showed that the original Croston method overestimates the demand and is therefore biased, meaning that the forecast is, on average, significantly above or below the demand during the forecasted periods. They suggested a modification to the Croston method that can be described as a bias-correcting function. In eq. (5) a bias corrector is added to the original Croston forecast. The forecast updates are the same as for the original Croston:
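For reference, the bias-corrected forecast of Syntetos and Boylan (2005) is commonly written as shown here. The symbols are assumptions of this note rather than the notation of the cited sources: $\hat{z}_{t+1}$ is the smoothed demand size, $\hat{p}_{t+1}$ the smoothed inter-demand interval, and $\beta$ the smoothing constant used for the interval.

$$\hat{X}_{t+1} = \left(1 - \frac{\beta}{2}\right) \frac{\hat{z}_{t+1}}{\hat{p}_{t+1}}$$

Because the factor $1 - \beta/2$ is below one, the corrected forecast is always lower than the plain Croston quotient, which counteracts the overestimation.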

3 Modified Croston (ModCr)

SES uses one smoothing constant, whereas Croston and the Syntetos and Boylan method (SyBo) use two, which increases the complexity compared to SES. The Croston technique forecasts the mean demand size and the mean inter-demand interval. Another interpretation of their quotient is that it represents the demand rate. Levén and Segerstedt (2004) presented a version of Croston, Modified Croston (ModCr), that forecasts the demand rate directly instead of separating the forecast into demand size and inter-demand interval; it therefore requires only one smoothing constant. The update occurs when there is a demand, but at most once per working day. If there are several demands in a day, the demands are summed. The demand rate is the quotient of the demand and the inter-demand interval. In a simulation study the method was shown to perform better than SES. The method was proposed to avoid the bias problem of Croston.

The idea behind ModCr is to avoid having to decide which method to use in a practical application, SES or Croston. When a withdrawal takes place in every time period (working day), the ModCr update reduces to the SES update, so ModCr is then equal to SES. The smoothing constant for ModCr must have lower start values, probably 0.05-0.3, when the resolution of the time period is higher (days or weeks instead of months); this also holds for other forecast methods. Items with a high frequency of withdrawals or demand (every day) should have a lower constant with ModCr than with SES if the forecast interval for SES is much longer (weeks, months) than for ModCr (days).

The results of the simulation study of Levén and Segerstedt (2004) have not been confirmed in other studies. Teunter and Sani (2009) find that ModCr has a bias problem that is more severe than that of Croston, which also tends to overestimate the demand. This confirms the results that Syntetos and Boylan (2007) presented. They also found that the bias of ModCr does not depend on the value of the smoothing constant. The claim that ModCr is unbiased is valid when the demand rate is considered; however, this is not what the method is supposed to forecast (Syntetos and Boylan, 2007). Gardner (2006) stated that no evidence had been presented to support the claim of unbiasedness. [2]

4 Teunter, Syntetos, Babai (TeunterSB)

Teunter et al (2011) present a new approach to forecasting lumpy demand. In every time period the probability of a demand is updated, whereas the forecast of the expected demand quantity is updated only when there is a demand:
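A minimal Python sketch of this two-part update is given here under stated assumptions. The function name `tsb`, the assignment of `alpha` to the demand size and `beta` to the demand probability, the caller-supplied starting values, and the use of the product of the two estimates as the per-period forecast are choices made for this illustration.

```python
def tsb(demand, alpha, beta, initial_probability, initial_size):
    """Return a per-period demand forecast from a TSB-style update.

    The demand probability is smoothed in every period (with beta); the
    demand size is smoothed only in periods with demand (with alpha).
    Starting values are supplied by the caller (an assumption of this sketch).
    """
    probability = float(initial_probability)
    size = float(initial_size)
    for x in demand:
        occurred = 1.0 if x > 0 else 0.0
        # Probability update happens in every period, demand or not.
        probability += beta * (occurred - probability)
        if x > 0:
            # Size update happens only when a demand is observed.
            size += alpha * (x - size)
    return probability * size


print(tsb([0, 4], alpha=0.5, beta=0.5, initial_probability=0.5, initial_size=2.0))
```

Because the probability estimate keeps shrinking during a long run of zero-demand periods, the forecast moves towards zero for items that have stopped selling, while the size estimate is left untouched.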

In a practical application, different smoothing constants should be used for different classes of items, and this applies to SES as well. Silver et al (1998) suggest that a smoothing constant between 0.1 and 0.3 is mostly suitable for SES when forecasts are made on a monthly basis. With ModCr, items with a high frequency of withdrawals or demand (every day) should have a lower smoothing constant than with SES, as the forecast interval for SES is longer (weeks, months) than for ModCr (days); this also holds for other forecast methods. [3]

5 Markov Chain Model

In a first-order Markov chain model, the estimate of the next state depends only on the one-step transition probability matrix, whereas the k-step transition matrix is needed for a kth-order Markov chain model. Higher-order (kth-order) Markov chain models were first proposed by Raftery [22]. Raftery’s model is extended by [23,24] to a more general higher-order Markov chain model, given in (1):

$$\hat{X}(n) = \sum_{i=1}^{k} \lambda_i \, \hat{Q}_i \, X(n-i) \qquad (1)$$

Here $\hat{X}(n)$ is the state vector that predicts the next state at time $n$, $X(n-i)$ is the state vector $i$ steps earlier, $\hat{Q}_i$ is the i-step transition probability matrix, and $\lambda_i$ are the weights, given by [24] as nonnegative real numbers such that

$$\sum_{i=1}^{k} \lambda_i = 1$$

The $\lambda_i$ ($i = 1, 2, \ldots, k$) can be estimated by maximum likelihood estimation or obtained by solving a linear programming (LP) model proposed by [23].

6 Modified Markov Chain Model

Accurate estimates of lumpy demand cannot be obtained by applying the Markov chain model directly, because of the high percentage of zero demands. A modification is needed when obtaining forecast values. The steps of the proposed algorithm are described in this section. All computations were performed by a computer code written in Matlab.

Step 1. The frequency of each demand value is obtained for data set j (j = 1, 2, ..., 695). While some demand values (such as 0, 1, 2 and 3) are frequently observed, the frequencies of other demand values (such as 8, 9, ..., 12) are low. Each demand value with a high frequency corresponds to a state of the Markov chain. The demands with low frequencies are collected into a single group, to be reconsidered in the forecasting procedure.

Step 2. The states of the Markov chain are determined.
Since the Markov chain model is mostly used to model categorical data, the states of a Markov chain are usually categorical variables. In the modified Markov model, however, the states include the demand information.

Step 3. The one-step and two-step transition probability matrices are computed.

Step 4. Steady-state probabilities are calculated.

Step 5. The linear programming (LP) model is constructed.
The steady-state probabilities and the transition probabilities are the parameters of the LP model, and the λi values are the decision variables.

Step 6. The λi values are calculated as the solution of the LP model.
The number of λi values determines the order of the Markov chain model. A first-order Markov chain model is required for some of the data sets, whereas a second-order Markov chain model is constructed for others.

Step 7. Using the λi values and the transition probability matrices, the Markov chain model is constructed for data set j.

Step 8. According to the resulting Markov chain model, the demand forecasts are obtained.
Most of the modification of the Markov chain model lies in this step. [23] states that, according to the kth-order state probability distribution, the prediction of the next state ($\hat{X}(n)$) must be taken as the state with the maximum probability. The highest probability indicates that the related state is the most likely to occur, but other states with non-zero probabilities are also likely to occur in proportion to their probabilities. We claim that considering all the states relative to their probabilities when estimating the next state will yield better forecast values. Therefore, the proposed procedure considers not only the state with the highest probability but all the states, in proportion to their probabilities, when estimating the next state of the Markov chain. If more than one low-frequency demand value corresponds to a state, the demands within this state are assumed to occur with equal probability. Consequently, the modified Markov chain model determines the next state by taking into account all states relative to their probabilities, whereas in the Markov chain model the state with the maximum probability determines the next state.

Step 9. The accuracy measure r is calculated for the forecast values.
The accuracy measure used for the Markov chain model, denoted by r, is computed for the forecast values.

Step 10. Steps 8 and 9 are repeated until the highest r value is obtained.
Since the forecast values depend on the probabilities of the states, a different sequence is obtained whenever Step 8 is repeated. Step 8 is repeated several times, and the sequence that gives the highest r value is recorded as the final forecast values.

Step 11. The lead time demands (LTD) are computed from the lead time information for data set j. Each data set may have a different lead time. Summing the monthly demand forecasts over the lead time gives the forecast of LTD.

Step 12. Steps 8-11 are repeated many times for the same data set, and the LTD distribution is obtained.
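Parts of the steps above can be sketched in Python. The sketch covers only a first-order chain: it groups rare demand values into one pooled state (Step 1), estimates a one-step transition matrix (Step 3), and contrasts the maximum-probability prediction with probability-proportional sampling (Step 8). It omits the LP model, the λi weights, the accuracy measure r and the lead-time summation. All function names and the `min_count` threshold are hypothetical, and the original Matlab code is not reproduced.

```python
import random
from collections import Counter


def build_states(demand, min_count):
    """Frequent demand values get their own state; rare values share one."""
    counts = Counter(demand)
    states = [[v] for v in sorted(counts) if counts[v] >= min_count]
    rare = [v for v in sorted(counts) if counts[v] < min_count]
    if rare:
        states.append(rare)  # pooled state for low-frequency demands
    return states


def state_of(value, states):
    for i, values in enumerate(states):
        if value in values:
            return i
    raise ValueError("demand value not present in the history")


def transition_matrix(demand, states):
    """Estimate one-step transition probabilities from observed transitions."""
    n = len(states)
    counts = [[0] * n for _ in range(n)]
    for a, b in zip(demand, demand[1:]):
        counts[state_of(a, states)][state_of(b, states)] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # A state with no observed successor gets a uniform row (assumption).
        matrix.append([c / total if total else 1.0 / n for c in row])
    return matrix


def most_likely_state(current, matrix, states):
    """Unmodified rule: pick the state with the maximum probability."""
    row = matrix[state_of(current, states)]
    best = max(range(len(row)), key=row.__getitem__)
    return states[best]


def sampled_demand(current, matrix, states, rng):
    """Modified rule: draw the next state in proportion to its probability."""
    row = matrix[state_of(current, states)]
    state = rng.choices(range(len(row)), weights=row)[0]
    # Demands inside the pooled state are treated as equally likely.
    return rng.choice(states[state])


history = [0, 0, 1, 0, 0, 0, 2, 0, 1, 0, 0, 9]
states = build_states(history, min_count=2)
matrix = transition_matrix(history, states)
rng = random.Random(0)
print(states, matrix[0], sampled_demand(0, matrix, states, rng))
```

Under the maximum-probability rule a series dominated by zeros would be forecast as zero in every period; sampling in proportion to the probabilities lets non-zero demands appear in the forecast sequence.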

7 Holt-Winters Methods

Additive Winter and Multiplicative Winter are two methods proposed by Winters and Holt in order to account for possible seasonal effects. A first way to account for such effects is to introduce a drift d, which modifies the levelled values according to variables that depend on time. The drift d is a function that represents the trend. For example, a model that accounts for the trend effect is the following:

The first relation can be seen as a weighted average of the observed value (yt) and the forecast calculated at the previous period; the second as a weighted average of the difference between the forecasts calculated at periods t and t-1 and the drift calculated at period t-1 (giving the latter a weight equal to 1 is equivalent to assuming a linear trend, that is, a constant drift).

The Additive Winter (AW) and Multiplicative Winter (MW) methods extend this first example in order to also account for seasonality in the strict sense. The Additive Winter starts from the following relations:

where st is a seasonality factor and p its periodicity (4 for quarterly data, 12 for monthly data, and so on). The demand forecast for the period t is:

In parallel, Multiplicative Winter has the following relations:

and the forecast demand for the period t is:

These models are very flexible, because they can also handle non-polynomial trends and non-constant seasonality. As for the choice of the weights, the values that minimize the sum of squared errors can be taken or, alternatively, the weights can be chosen in line with the scope of the analysis.
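A minimal Python sketch of the additive variant follows, assuming the standard textbook form of the Holt-Winters recursions for level, drift (trend) and seasonal factor. The function name, the weight names `alpha`, `beta` and `gamma`, and the initialisation from the first season are assumptions of this sketch, not details taken from the source.

```python
def holt_winters_additive(y, period, alpha, beta, gamma):
    """Return a one-step-ahead forecast from additive Holt-Winters smoothing.

    alpha, beta and gamma weight the level, the drift (trend) and the
    seasonal factor respectively. The level starts at the mean of the first
    season, the drift at zero, and the seasonal factors at the deviations of
    the first season from that mean (assumptions of this sketch).
    """
    if len(y) < period:
        raise ValueError("need at least one full season of data")
    level = sum(y[:period]) / period
    trend = 0.0
    seasonal = [value - level for value in y[:period]]

    for t in range(period, len(y)):
        s_old = seasonal[t % period]      # seasonal factor from one season ago
        previous_level = level
        # Level: weighted average of the deseasonalised observation and the
        # previous level plus drift.
        level = alpha * (y[t] - s_old) + (1 - alpha) * (level + trend)
        # Drift: weighted average of the latest level change and the old drift.
        trend = beta * (level - previous_level) + (1 - beta) * trend
        # Seasonal factor: weighted average of the new deviation and the old factor.
        seasonal[t % period] = gamma * (y[t] - level) + (1 - gamma) * s_old

    return level + trend + seasonal[len(y) % period]


print(holt_winters_additive([1, 3, 1, 3, 1, 3], period=2,
                            alpha=0.3, beta=0.2, gamma=0.1))
```

In the multiplicative variant the seasonal factor divides the observation in the level update and multiplies the forecast instead of being subtracted and added.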

8 Bootstrap Method

Hua et al. (2006, p.1037) say that when historical data are limited, the

bootstrap method is a useful tool to estimate the demand of spare parts.

Bookbinder and Lordahl (1989, p 303) found the bootstrap superior to the normal

approximation for estimating high percentiles of spare parts demand for

independent data. Wang and Rao (1992, p 333-336) also found the bootstrap

effective for dealing with smooth demand. None of these papers considers the special

problems of managing intermittent demand. Willemain et al. (2004, p.377-381)

provided an approach of forecasting intermittent demand for service parts

inventories. They developed a bootstrap-based approach to forecast the distribution

of the sum of intermittent demands over a fixed lead time. Bootstrapping is a

modern, computer-intensive, general purpose approach to statistical inference,

falling within a broader class of re-sampling methods. Bootstrapping is the practice

of estimating properties of an estimator (such as its variance) by measuring those

properties when sampling from an approximating distribution. One standard choice

for an approximating distribution is the empirical distribution of the observed data.

In the case where a set of observations can be assumed to be from an independent

and identically distributed population, this can be implemented by constructing a

number of re-samples of the observed dataset (and of equal size to the observed

dataset), each of which is obtained by random sampling with replacement from the

original dataset.

The bootstrap procedure can be illustrated with the following steps:

1- take an observed sample (in our case a sample of historical spare parts demand) of size n, called X = (x1, x2, …, xn);

2- from X, draw m resamples of size n, obtaining X1, X2, …, Xm (in every bootstrap extraction, a data point of the observed sample can be extracted more than once, and every data point has probability 1/n of being extracted);

3- given T, the estimator of θ, the parameter under study (in our case it may be the average demand), calculate T for every bootstrap sample. In this way we have m estimates of θ;

4- from these estimates calculate the desired value: in our case the mean of T1, …, Tm can be the demand forecast.

This method can be applied to find not only the average demand (which can serve as the demand forecast) but also the intervals between non-zero demands or other desired values.
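The four steps can be sketched in Python as follows. This is an illustration under simple assumptions: the estimator T is taken to be the sample mean, and the names `bootstrap_mean_forecast`, `history`, `resamples` and `rng` are hypothetical.

```python
import random
from statistics import mean


def bootstrap_mean_forecast(history, resamples, rng):
    """Bootstrap estimate of mean demand.

    history   : observed demand sample X = (x1, ..., xn)        (step 1)
    resamples : number m of bootstrap samples to draw            (step 2)
    rng       : a random.Random instance, so runs are repeatable
    """
    n = len(history)
    estimates = []
    for _ in range(resamples):
        # Sample n values with replacement; each has probability 1/n per draw.
        sample = [rng.choice(history) for _ in range(n)]
        estimates.append(mean(sample))  # step 3: T for this resample
    return mean(estimates)              # step 4: mean of T1, ..., Tm


demand_history = [0, 0, 5, 0, 2, 0, 0, 3]
print(bootstrap_mean_forecast(demand_history, resamples=1000, rng=random.Random(1)))
```

Replacing `mean(sample)` with another statistic, such as the mean gap between non-zero demands, gives the other quantities mentioned above.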

9 Poisson Method

The Poisson method is typically used to forecast the probability that a rare event occurs (Manzini et al., 2007, p.205). It derives directly from the binomial distribution. This method does not allow the direct calculation of the variable to be forecast, but it gives an estimate of the probability that the variable takes a given value. The starting point of this model is the evaluation of the average value of the variable to be forecast. In the case of spare parts, given an average consumption equal to d in a time interval T, the probability of a demand equal to x (i.e. x requests for components) in the time interval T is:

Consequently, the cumulative probability (the probability that no more than x components are required) can be expressed as:
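A short Python sketch of both probabilities follows, using the standard Poisson probability mass function $P(x) = d^x e^{-d} / x!$. It assumes that d is the mean demand over the whole interval T (if d were a rate per unit time, d would be replaced by the rate multiplied by T). The function names are hypothetical.

```python
import math


def poisson_pmf(x, d):
    """Probability of exactly x requests when the mean demand in T is d."""
    return d ** x * math.exp(-d) / math.factorial(x)


def poisson_cdf(x, d):
    """Probability that no more than x requests occur in T."""
    return sum(poisson_pmf(k, d) for k in range(x + 1))


mean_demand = 2.0  # assumed mean consumption over the interval T
print(poisson_pmf(0, mean_demand), poisson_cdf(3, mean_demand))
```

The cumulative probability can be read as a service level: `poisson_cdf(x, d)` is the chance that a stock of x components covers the demand in the interval.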

Accuracy metrics for lumpy demand

The most commonly used scale-dependent metrics are based on absolute

errors or on squared errors:

Mean Absolute Error (MAE) = mean(|et |)

Geometric Mean Absolute Error (GMAE) = gmean(|et |)

Mean Square Error (MSE) = mean(et²)

where et = Yt – Ft is the forecast error and “gmean” is a geometric mean. The MAE is often abbreviated as the

MAD (“D” for “deviation”). The use of absolute values or squared values prevents

negative and positive errors from offsetting each other.

Since all of these metrics are on the same scale as the data, none of them are

meaningful for assessing a method’s accuracy across multiple series.
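The three scale-dependent metrics can be computed from a list of forecast errors as sketched here. The function names are hypothetical, and the errors are assumed to be actual demand minus forecast demand.

```python
import math


def mae(errors):
    """Mean absolute error."""
    return sum(abs(e) for e in errors) / len(errors)


def mse(errors):
    """Mean square error."""
    return sum(e * e for e in errors) / len(errors)


def gmae(errors):
    """Geometric mean absolute error.

    Computed through logarithms; a single zero error makes the whole
    geometric mean zero, so that case is returned directly.
    """
    absolute = [abs(e) for e in errors]
    if any(a == 0 for a in absolute):
        return 0.0
    return math.exp(sum(math.log(a) for a in absolute) / len(absolute))


errors = [1, -4]
print(mae(errors), mse(errors), gmae(errors))
```

Positive and negative errors do not cancel in any of the three functions, because each works on absolute or squared values.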

For lumpy-demand data, Syntetos and Boylan (2005) recommend the use of

GMAE, although they call it the GRMSE. (The GMAE and GRMSE are identical

because the square root and the square cancel each other in a geometric mean.)

Boylan and Syntetos (this issue) point out that the GMAE has the flaw of being

equal to zero when any error is zero, a problem which will occur when

both the actual and forecasted demands are zero. This is the result seen in Table 1

for the naïve method.

Boylan and Syntetos claim that such a situation would occur only if an

inappropriate forecasting method is used. However, it is not clear that the naïve

method is always inappropriate. Further, Hoover indicates that division-by-zero

errors in lumpy series are expected occurrences for repair parts. I suggest that the

GMAE is problematic for assessing accuracy on lumpy-demand data.

Percentage errors

The percentage error is given by pt = 100et /Yt. Percentage errors have the

advantage of being scale independent, so they are frequently used to compare

forecast performance between different data series. The most commonly used

metric is

Mean Absolute Percentage Error (MAPE) = mean(|pt |)

Measurements based on percentage errors have the disadvantage of being

infinite or undefined if there are zero values in a series, as is frequent for lumpy

data. Moreover, percentage errors can have an extremely skewed

distribution when actual values are close to zero. With lumpy-demand data, it is

impossible to use the MAPE because of the occurrences of zero periods of

demand. The MAPE has another disadvantage: it puts a heavier penalty on positive

errors than on negative errors. This observation has led to the use of the

“symmetric” MAPE (sMAPE) in the M3-competition (Makridakis & Hibon,

2000). It is defined by

sMAPE = mean(200 |Yt – Ft | / (Yt + Ft ))

However, if the actual value Yt is zero, the forecast Ft is likely to be close to

zero. Thus the measurement will still involve division by a number close to zero.

Also, the value of sMAPE can be negative, giving it an ambiguous interpretation.
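The behaviour of both percentage metrics on lumpy data can be checked with a small Python sketch. The function names are hypothetical, and returning `None` for an undefined value is a choice made for this illustration.

```python
def mape(actual, forecast):
    """MAPE in percent, or None when any actual value is zero."""
    if any(y == 0 for y in actual):
        return None  # 100 * e_t / Y_t is undefined for Y_t = 0
    terms = [abs(100.0 * (y - f) / y) for y, f in zip(actual, forecast)]
    return sum(terms) / len(terms)


def smape(actual, forecast):
    """sMAPE as defined in the text, or None when some Y_t + F_t is zero."""
    pairs = list(zip(actual, forecast))
    if any(y + f == 0 for y, f in pairs):
        return None
    terms = [200.0 * abs(y - f) / (y + f) for y, f in pairs]
    return sum(terms) / len(terms)


print(mape([10, 20], [5, 30]))   # defined: no zero actuals
print(mape([0, 5], [1, 5]))      # None: a zero-demand period
print(smape([1.0], [-3.0]))      # negative value from a negative forecast
```

A single zero-demand period is enough to make the MAPE unusable, and a negative forecast is enough to make the sMAPE negative.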

Relative errors

An alternative to percentages for the calculation of scale independent

measurements involves dividing each error by the error obtained using some

benchmark method of forecasting. Let rt = et /et* denote the relative error, where

et* is the forecast error obtained from the benchmark method. Usually the benchmark method is the naïve method, where Ft is equal to the last

observation. Then we can define

Median Relative Absolute Error (MdRAE) = median(|rt |)

Geometric Mean Relative Absolute Error (GMRAE) = gmean(|rt |)

Because they are not scale dependent, these relative-error metrics were

recommended in studies by Armstrong and Collopy (1992) and by Fildes

(1992) for assessing forecast accuracy across multiple series. However,

when the errors are small, as they can be with lumpy series, use of

the naïve method as a benchmark is no longer possible because it would

involve division by zero.

Scale-free errors

The MASE was proposed by Hyndman and Koehler (2006) as a generally

applicable measurement of forecast accuracy without the problems seen in

the other measurements. They proposed scaling the errors based on the in-

sample MAE from the naïve forecast method. Using the naïve method,

we generate one-period-ahead forecasts from each data point in the sample.

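Accordingly, a scaled error is defined as

$$q_t = \frac{e_t}{\dfrac{1}{n-1}\sum_{i=2}^{n} |Y_i - Y_{i-1}|}$$

where n is the number of in-sample observations, and the mean absolute scaled error is MASE = mean(|qt |).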

The first row of the table below shows the lumpy series plotted in Figure 1.

The second row gives the naïve forecasts, which are equal to the previous

actual values. The final row shows the naïve-forecast errors. The

denominator of qt is the mean of the shaded values in this row; that is the

MAE of the naïve method.

The only circumstance under which the MASE would be infinite or

undefined is when all historical observations are equal. The in-sample MAE

is used in the denominator because it is always available and it effectively

scales the errors. In contrast, the out-of-sample MAE for the naïve method

may be zero because it is usually based on fewer observations. For example,

if we were forecasting only two steps ahead, then the out-of-sample MAE

would be zero. If we wanted to compare forecast accuracy at one step ahead

for ten different series, then we would have one error for each series. The

out-of-sample MAE in this case is also zero. These types of problems are

avoided by using in-sample, one-step MAE.
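A minimal Python sketch of the MASE follows: the out-of-sample absolute errors are divided by the in-sample, one-step MAE of the naïve method. The function and argument names are hypothetical, and returning `None` when all historical observations are equal is a choice made for this illustration.

```python
def mase(history, actual, forecast):
    """Mean absolute scaled error.

    history  : in-sample observations used to compute the naive MAE
    actual   : out-of-sample actual values
    forecast : forecasts for the same periods as `actual`
    """
    if len(history) < 2:
        raise ValueError("need at least two historical observations")
    # In-sample one-step naive errors: each value is forecast by its predecessor.
    naive_errors = [abs(b - a) for a, b in zip(history, history[1:])]
    scale = sum(naive_errors) / len(naive_errors)
    if scale == 0:
        return None  # all historical observations are equal
    scaled = [abs(y - f) / scale for y, f in zip(actual, forecast)]
    return sum(scaled) / len(scaled)


print(mase(history=[0, 2, 0, 1, 0], actual=[0, 3], forecast=[1, 1]))
```

A value below one means the forecasts are, on average, more accurate than in-sample one-step naïve forecasts; zero-demand periods in `actual` cause no division problem because the denominator depends only on the history.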

A closely related idea is the MAD/Mean ratio proposed by Hoover (this

issue) which scales the errors by the in-sample mean of the series instead of

the in-sample mean absolute error. This ratio also renders the errors scale

free and is always finite unless all historical data happen to be zero.

Hoover explains the use of the MAD/Mean ratio only in the case of in-

sample, one-step forecasts (situation 2 of the three situations described in the

introduction). However, it would also be straightforward to use the

MAD/Mean ratio in the other two forecasting situations. The main

advantage of the MASE over the MAD/Mean ratio is that the MASE is more

widely applicable. The MAD/Mean ratio assumes that the mean is stable

over time (technically, that the series is “stationary”). This is not true for

data which show trend, seasonality, or other patterns. While lumpy

data is often quite stable, sometimes seasonality does occur, and this might

make the MAD/Mean ratio unreliable. In contrast, the MASE is suitable

even when the data exhibit a trend or a seasonal pattern.

The MASE can be used to compare forecast methods on a single series, and,

because it is scale-free, to compare forecast accuracy across series. For

example, you can average the MASE values of several series to obtain a

measurement of forecast accuracy for the group of series. This measurement

can then be compared with the MASE values of other groups of series to

identify which series are the most difficult to forecast. Typical values for

one-step MASE values are less than one, as it is usually possible to obtain

forecasts more accurate than the naïve method. Multistep MASE values are

often larger than one, as it becomes more difficult to forecast as the horizon

increases. The MASE is the only available accuracy measurement that can

be used in all three forecasting situations described above, and for all

forecast methods and all types of series. [5]

References:

1/ Wallstrom P., Evaluation of forecasting techniques and forecast errors with focus on intermittent demand, Lulea University of Technology, Lulea, Sweden, 2009.

2/ Ralph D. Snyder, Keith J. Ord, Beaumont J., Forecasting the Intermittent Demand for Slow-Moving Items, RPF Working Paper No. 2010-003, The George Washington University, 2010.

3/ Segerstedt A., Levén E., A study of different Croston-like forecasting methods, Working Paper Industrial Logistics, Lulea University of Technology, Lulea, Sweden, 2012.

4/ Kocer U., Forecasting intermittent demand by Markov chain model, International Journal of Innovative Computing, Information and Control, Volume 9, Number 8, August 2013.

5/ Hyndman R., Another look at forecast-accuracy metrics for intermittent demand, International Journal of Forecasting, Monash Australia, June 2006.

6/ Maria Caridi and Roberto Cigolini, Buffering against lumpy demand in MRP environments: a theoretical approach and a case study, Proceedings of The Fourth SMESME International Conference, Milano, Italy, 2012.

7/ Maria Elena Nenni, Luca Giustiniano, and Luca Pirolo, Demand Forecasting in the Fashion Industry: A Review, International Journal of Engineering Business Management, Vol 5, July 2013.

8/ Umay Uzunoglu Kocer, Forecasting Intermittent Demand by Markov Chain Model, International Journal of Innovative Computing, Information and Control, Volume 9, Number 8, August 2013.

9/ Ralph D. Snyder, J. Keith Ord and Adrian Beaumont, Forecasting the Intermittent Demand for Slow-Moving Items, Center of Economic Research, Department of Economics, The George Washington University, Washington, DC 20052, Revised: March 11, 2011.

10/ S. D. Prestwich, R. Rossi, S. A. Tarim, and B. Hnich, Mean-Based Error Measures for Intermittent Demand Forecasting, arXiv:1310.5663v1 [stat.ME], 18 Oct 2013.
