Lufthansa Outlier Detection Methods on Booking Data AGIFORS Reservation and Yield Management Study Group Bangkok May 2001 Ulrich Oppitz

Lufthansa

Outlier Detection Methods on Booking Data

AGIFORS Reservation and Yield Management Study Group

Bangkok, May 2001

Ulrich Oppitz


Outlier Detection Methods on Booking Data - Agenda -

Definitions and Theory

Outlier Detection Methods

Analysis Method

Some Words on Quality Measurement

Results

Summary

Literature


Booking data in RM systems can be influenced by many disturbances

Definition: Outliers are data points which differ in their appearance from the majority of the data. (Rousseeuw, 1990)

Caused by:
• system errors
• schedule changes
• special events

Two approaches to cope with outliers:
• robust approach:
  – use robust methods/predictors
• diagnostic approach:
  – identify outliers
  – trim or ignore them
  – apply classical methods/predictors

Best practice for chain processes


If ignored, outliers can affect the quality of the forecasting process significantly

To measure the robustness of a forecast method, Hodges introduced the term breakdown point. (Hodges 1967)

The breakdown point can be loosely defined as the smallest fraction of outliers that seriously offsets the estimator from the true one.

(Rousseeuw 1991)

The breakdown point of any regression method based on the least squares technique is 1/n, which means a single outlier in a set of n data points can degenerate the LS estimate.
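
The 1/n breakdown point can be illustrated with a small numerical experiment. The following is a minimal sketch, not part of the original study: the data are made up, and `ls_slope` is a helper name chosen here. It fits a least-squares line with NumPy's `polyfit` to ten points lying exactly on a line of slope 2, then replaces one value with a gross error.

```python
import numpy as np

def ls_slope(x, y):
    """Slope of the least-squares line through (x, y)."""
    slope, _intercept = np.polyfit(x, y, 1)
    return slope

# Hypothetical data: ten points exactly on y = 2x + 1.
x = np.arange(10, dtype=float)
y_clean = 2.0 * x + 1.0

# Corrupt a single observation (1 out of n = 10).
y_bad = y_clean.copy()
y_bad[-1] = 1000.0

slope_clean = ls_slope(x, y_clean)   # close to 2
slope_bad = ls_slope(x, y_bad)       # far away from 2
```

Making the single corrupted value larger moves the fitted slope further without bound, which is what a breakdown point of 1/n expresses.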



Z-Score Testing

• calculate empirical average $\mu$ and variance $\sigma^2$ based on historical bookings for each DCP

• check whether number of historical bookings > minimum observations

• tag as outlying if outside the following interval
  upper threshold: $\mu + \text{maxSigmaPos} \cdot \sigma$
  lower threshold: $\mu - \text{maxSigmaNeg} \cdot \sigma$

• trim outlying data to threshold value before updating $\mu$ and $\sigma$
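
The steps above can be sketched in a few lines of Python. This is an illustrative sketch under assumptions, not the production logic: the function name `zscore_trim`, the sample standard deviation (`ddof=1`), and the default `min_obs=10` are choices made here; the default thresholds of 2.0 follow the "standard (2.0;2.0)" setting quoted in the results.

```python
import numpy as np

def zscore_trim(history, new_value, max_sigma_pos=2.0, max_sigma_neg=2.0, min_obs=10):
    """Z-score test of one new booking value against the history of one DCP.

    Returns (value, is_outlier). An outlying value is trimmed to the
    threshold it violated. min_obs is an assumed default.
    """
    history = np.asarray(history, dtype=float)
    # Only test if there are more than min_obs historical bookings.
    if history.size <= min_obs:
        return float(new_value), False
    mu = history.mean()
    sigma = history.std(ddof=1)  # empirical (sample) standard deviation
    upper = mu + max_sigma_pos * sigma
    lower = mu - max_sigma_neg * sigma
    if new_value > upper:
        return float(upper), True
    if new_value < lower:
        return float(lower), True
    return float(new_value), False

# Hypothetical booking history for one DCP.
history = [10, 12, 11, 9, 10, 11, 12, 10, 9, 11, 10, 11]
trimmed, is_outlier = zscore_trim(history, 50)
```

Separate `max_sigma_pos` and `max_sigma_neg` parameters allow an asymmetric acceptance range; the trimmed value would then be used when the running mean and variance are updated.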

[Figure: Z-score testing. Density function of the normal distribution over bookings (bkgs), with the lower bound and upper bound of the acceptance interval marked.]


Determination Coefficient Testing on Residual Regression

• update exponentially smoothed bookings for each dcp -> reference curve

• check whether number of historical bookings > minimum observations

• calculate residuals bkd(dcp) from actual bookings and reference curve

• calculate linear regression curve reg(dcp) on residuals bkd(dcp)

[Figure: Determination coefficient testing on residual regression. Residuals bkd(dcp) and regression line reg(dcp), plotted over dcp.]


• calculate the determination coefficient

$$R^2 = \frac{\sum_{dcp} \left(\mathrm{reg}(dcp) - \overline{\mathrm{reg}}\right)^2}{\sum_{dcp} \left(\mathrm{bkd}(dcp) - \overline{\mathrm{bkd}}\right)^2}$$

• if $R^2$ < minR2, tag the dcp with the largest vertical distance to the regression curve as outlying and take it out of the set

• iterate with cleaned data set

• stop if $R^2$ > minR2 or number of outliers > maxOutlier

• reset outlier tags if more than maxOutlier outliers were found
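
The iteration can be sketched as follows. This is a hedged illustration of the steps listed for this method, not the original implementation: the function name `dct_outliers`, the use of the DCP index as regressor, the handling of a flat residual curve, and the defaults (taken from the "standard (0.6;2)" setting quoted in the results) are assumptions. The exponentially smoothed reference curve is taken as given.

```python
import numpy as np

def dct_outliers(actual, reference, min_r2=0.6, max_outlier=2):
    """Return the DCP indices tagged as outlying (empty list if none or reset)."""
    residuals = np.asarray(actual, float) - np.asarray(reference, float)
    dcp = np.arange(residuals.size, dtype=float)
    keep = np.ones(residuals.size, dtype=bool)
    tagged = []
    while keep.sum() > 2:
        res = residuals[keep]
        ss_tot = np.sum((res - res.mean()) ** 2)
        if np.isclose(ss_tot, 0.0):
            break  # flat residuals: nothing to explain (assumed convention)
        # Linear regression of the residuals on the DCP index.
        slope, intercept = np.polyfit(dcp[keep], res, 1)
        reg = slope * dcp[keep] + intercept
        r2 = np.sum((reg - reg.mean()) ** 2) / ss_tot
        if r2 >= min_r2:
            break
        # Tag the point with the largest vertical distance to the line.
        worst = np.flatnonzero(keep)[np.argmax(np.abs(res - reg))]
        keep[worst] = False
        tagged.append(int(worst))
        if len(tagged) > max_outlier:
            return []  # too many outliers: reset all tags
    return tagged

# Hypothetical example: residuals follow a linear trend except at DCP index 7.
dcp_idx = np.arange(15)
reference = np.full(15, 100.0)
actual = reference + 0.5 * dcp_idx
actual[7] += 40.0
result = dct_outliers(actual, reference)
```

In this example the spike at index 7 pulls the determination coefficient far below `min_r2`; once it is removed, the remaining residuals lie on a line and the iteration stops.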



The simulation is performed on real booking data

42 flight numbers (2 multi-leg flights)

data type: actual bookings
data source: PROS IV database
departure time range: 01Jun94 - 31May97
booking classes: FA CDZ HBLGYKTWE
evaluated DCPs: 1-15
total flight departures: 422 054
total DCPs: 6 330 810


Analysis method: artificial outlier implantation

1) Preprocessing: outlier cleaning with very conservative parameters (high outlier tagging rates)

2) Different manipulations are performed with predefined probabilities
• XLA  enlarge all DCPs x 3.00            P_XLA = 0.01
• XSA  shrink all DCPs x 0.33             P_XSA = 0.01
• XL1  enlarge single DCP x 3.00          P_XL1 = 0.01
• XS1  shrink single DCP x 0.33           P_XS1 = 0.01
• X-Y  swap booking classes X and Y       P_X-Y = 0.02

3) Artificially created outliers are tagged.

4) Apply outlier detection method

5) Evaluation: count number of recognized outliers and non-outliers
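
Steps 2 and 3 can be sketched as below. This is a simplified illustration under stated assumptions: the function name `implant_outliers` is hypothetical, the four scaling manipulations are treated as mutually exclusive per flight departure, the single DCP is drawn uniformly, and the booking-class swap X-Y is left out because it needs a booking-class dimension that this sketch does not model.

```python
import numpy as np

def implant_outliers(bookings, rng, p_xla=0.01, p_xsa=0.01, p_xl1=0.01, p_xs1=0.01):
    """Implant artificial outliers into a (departures x DCPs) booking array.

    Returns (manipulated_data, tags), where tags marks every altered DCP.
    """
    data = np.asarray(bookings, dtype=float).copy()
    tags = np.zeros(data.shape, dtype=bool)
    n_dep, n_dcp = data.shape
    # Cumulative probability edges of the four manipulations.
    edges = np.cumsum([p_xla, p_xsa, p_xl1, p_xs1])
    for i in range(n_dep):
        u = rng.random()
        if u < edges[0]:        # XLA: enlarge all DCPs
            data[i, :] *= 3.0
            tags[i, :] = True
        elif u < edges[1]:      # XSA: shrink all DCPs
            data[i, :] *= 0.33
            tags[i, :] = True
        elif u < edges[2]:      # XL1: enlarge a single DCP
            j = rng.integers(n_dcp)
            data[i, j] *= 3.0
            tags[i, j] = True
        elif u < edges[3]:      # XS1: shrink a single DCP
            j = rng.integers(n_dcp)
            data[i, j] *= 0.33
            tags[i, j] = True
    return data, tags

# Hypothetical cleaned data: 2000 departures with 15 DCPs each.
rng = np.random.default_rng(0)
cleaned = np.full((2000, 15), 10.0)
manipulated, tags = implant_outliers(cleaned, rng)
```

Comparing the tags produced here with the tags produced by a detection method gives the counts of recognized outliers and non-outliers used in the evaluation step.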



observables:
True Positives TP
True Negatives TN
False Positives FP
False Negatives FN

sensitivity¹: $\text{sens} := \frac{TP}{TP + FN}$ (masking)

specificity¹: $\text{spec} := \frac{TN}{TN + FP}$ (swamping)

efficiency¹: $\text{eff} := \frac{TP + TN}{TN + FN + TP + FP}$

temperament: $\text{temp} := \frac{TP + FP}{TN + FN + TP + FP}$

The quality measures known in the literature are not sufficient in the RM environment.

¹ (Walczak, 1998)
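
As a small worked example, the four measures can be computed directly from the confusion counts. The function name and the counts below are made up for illustration; the sketch does not guard against empty classes (division by zero).

```python
def quality_measures(tp, tn, fp, fn):
    """Sensitivity, specificity, efficiency and temperament from confusion counts."""
    total = tp + tn + fp + fn
    return {
        "sens": tp / (tp + fn),      # share of true outliers that were detected
        "spec": tn / (tn + fp),      # share of valid points that were left untagged
        "eff": (tp + tn) / total,    # share of correct classifications
        "temp": (tp + fp) / total,   # share of points tagged as outlying
    }

# Hypothetical counts: 10 true outliers among 100 data points.
m = quality_measures(tp=8, tn=85, fp=5, fn=2)
# sens = 0.8, spec = 85/90, eff = 0.93, temp = 0.13
```

With only 10% contamination, efficiency stays high even if several outliers are missed, which hints at why unweighted measures are of limited use here.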


Quality Measures for Outlier Detection Methods

For an outlier detection method on booking data it is most important to detect almost all outliers. A few data points which are erroneously taken out of the valid set have less impact.

• weighting of error types TP and TN
• dynamic adaptation of weights to the degree of contamination

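• axioms for a quality measure $q$
  let $A, B \subseteq \hat{A}$, where $\hat{A}$ denotes the complete set of correct classifications,
  $0 \le q(A) \le 1$
  $q(A) = 0 \Leftrightarrow A = \emptyset$
  $q(A) = 1 \Leftrightarrow A = \hat{A}$
  $A \subset B \Rightarrow q(A) < q(B)$
  $q(A \cup B) = q(A) + q(B) - q(A \cap B)$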


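Contamination and Temperament Weighted Efficiency meet the conditions

$$\mathrm{CWE} = \frac{\alpha\,TN + (1-\alpha)\,TP}{\alpha\,(TN+FP) + (1-\alpha)\,(TP+FN)} \quad \text{with} \quad \alpha = \frac{TP + FN}{TN + FN + TP + FP} \quad \text{(outlier rate)}$$

$$\mathrm{TWE} = \frac{\alpha\,TN + (1-\alpha)\,TP}{\alpha\,(TN+FP) + (1-\alpha)\,(TP+FN)} \quad \text{with} \quad \alpha = \frac{TP + FP}{TN + FN + TP + FP} \quad \text{(temperament)}$$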
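
The two weighted measures differ only in the weight: CWE uses the outlier rate (TP + FN) / total, TWE uses the temperament (TP + FP) / total. The weight multiplies the TN terms and its complement multiplies the TP terms. A minimal sketch, with the weight called `alpha` and all function names chosen here for illustration:

```python
def weighted_efficiency(tp, tn, fp, fn, alpha):
    """Weighted efficiency: alpha weights the TN part, (1 - alpha) the TP part."""
    numerator = alpha * tn + (1 - alpha) * tp
    denominator = alpha * (tn + fp) + (1 - alpha) * (tp + fn)
    return numerator / denominator

def cwe(tp, tn, fp, fn):
    """Contamination weighted efficiency: weight = outlier rate."""
    total = tp + tn + fp + fn
    return weighted_efficiency(tp, tn, fp, fn, alpha=(tp + fn) / total)

def twe(tp, tn, fp, fn):
    """Temperament weighted efficiency: weight = share of tagged points."""
    total = tp + tn + fp + fn
    return weighted_efficiency(tp, tn, fp, fn, alpha=(tp + fp) / total)

# Hypothetical counts: 10 true outliers among 100 data points.
cwe_value = cwe(tp=8, tn=85, fp=5, fn=2)   # 15.7 / 18
twe_value = twe(tp=8, tn=85, fp=5, fn=2)   # 18.01 / 20.4
```

When outliers are rare, `alpha` is small, so detecting outliers (TP) dominates the score. A perfect classification (FP = FN = 0) yields 1 for both measures.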


Sensitivity Analysis on Cleaned Booking Data - z-score testing -

[Figure: temperament for z-score testing]

[Figure: sensitivity for z-score testing]

[Figure: specificity for z-score testing]

[Figure: efficiency for z-score testing]

[Figure: contamination weighted efficiency (CWE) for z-score testing. Max: (0.9, 0.6, 0.924243)]

[Figure: temperament weighted efficiency (TWE) for z-score testing. Max: (0.9, 0.6, 0.929789)]

Sensitivity Analysis on Cleaned Booking Data - DCT -

[Figure: temperament for DCT, over min R2 and max outlier]

[Figure: sensitivity for DCT, over min R2 and max outlier]

[Figure: specificity for DCT, over min R2 and max outlier]

[Figure: efficiency for DCT, over min R2 and max outlier]

[Figure: contamination weighted efficiency (CWE) for DCT, over min R2 and max outlier. Max: (0.45, 14, 0.736911)]

[Figure: temperament weighted efficiency (TWE) for DCT, over min R2 and max outlier. Max: (0.5, 14, 0.788245)]


Raw data analysis delivers more realistic results

Optimal Parameters on Cleaned and Raw Booking Data

z-score testing (ZST)
        cleaned data            raw data
CWE     0.9 / 0.6 -> 0.924      1.5 / 0.8 -> 0.680
TWE     0.9 / 0.6 -> 0.930      2.2 / 0.9 -> 0.752

determination coefficient testing (DCT)
        cleaned data            raw data
CWE     0.45 / 14 -> 0.737      0.70 / 14 -> 0.662
TWE     0.50 / 14 -> 0.788      0.50 / 13 -> 0.747


Proper parameter calibration is more important than method choice.

Comparison on raw data, quality [%]

                                CWE     TWE
DCT, standard (0.6; 2)          54.0    27.3
z-score, standard (2.0; 2.0)    61.6    67.6
DCT, optimal                    66.2    74.7
z-score, optimal                68.0    75.2


Z-score testing on booking changes is more efficient than on booking values.

Optimal Parameters on Raw Booking Data

z-score testing (ZST)
        on bookings             on booking changes
CWE     1.5 / 0.8 -> 0.680      1.8 / 1.1 -> 0.728
TWE     2.2 / 0.9 -> 0.752      2.9 / 1.5 -> 0.820



Outlier Detection Methods on Booking Data - Summary -

• We defined new quality measures for outlier detection models which enable parameter optimization and the comparison of different methods.

• Symmetric acceptance ranges for z-score testing are disadvantageous

  – potential for improvement by only adjusting parameters

  – revenue impact unknown, but positive

  – low risk

• Clear superiority of z-score testing on cleaned booking data

• Slight superiority of z-score testing on raw booking data

• Parameter optimization offers higher potential for improvement than choice of method.

• Z-score testing can be improved if applied to booking changes



Outlier Detection Methods on Booking Data - Literature -

Hodges 1967: J.L. Hodges, Proc. Fifth Berkeley Symp. Math. Stat. Probab., 1967, 1, 163-168

Rousseeuw 1987: P.J. Rousseeuw, A.M. Leroy, Robust Regression and Outlier Detection, Wiley, New York, 1987

Rousseeuw 1990: P.J. Rousseeuw, Unmasking Multivariate Outliers and Leverage Points (with discussion), Journal of the American Statistical Association, 1990, 85, 633-651

Rousseeuw 1991: P.J. Rousseeuw, Journal of Chemometrics, 1991, 5, 1-20

Walczak 1998: B. Walczak, D.L. Massart, Multiple Outlier Detection Revisited, Chemometrics and Intelligent Laboratory Systems, 1998, 41, 1-15
