Simon Cumming
([email protected])
Title Name Page *
History and some examples of data mining at BA
Data mining and business complexity
Successful data mining
114 from Heathrow
Domestic
European
Longhaul
First Class
The challenges BA has faced over the last 3 years
Middle East (war in Iraq etc.)
World Trade Centre aftermath / terror threats, security etc.
Low Cost carriers
The audience can probably think of others –
Future Size and Shape (FSAS)
Cash crisis
DVT, Cabin Air Quality,
Share Price, Profit Performance,
Debt status (but we’re better than all the American
airlines!!)
Are we selling Waterside and all other assets to survive?
- Customer service and innovation.
Ensuring continued financial performance
and ability to invest for future.
Making the most of new technologies, e.g. web, self-service.
Getting ready for Terminal 5 at Heathrow.
Reducing unnecessary complexity.
OR at BA has been going for over 50 years.
The Airline industry has some complex and interesting OR problems,
e.g.
Revenue management (yield management) – optimising number of seats
available in different selling classes (prices).
“End-to-end” scheduling, i.e. scheduling, planning, rostering,
etc.
Engineering inventory, vehicle fleets, etc.
“Commercial” – customer data, frequent flyer programme, transaction
data, market research, consultancy
“Operational” – Check-in, queuing, seat allocation, punctuality,
baggage etc.
The academic body for airline OR is AGIFORS, the Airline Group
of the International Federation of Operational Research Societies
(www.agifors.org)
Problem Structuring
Business Modelling
Quantitative and qualitative modelling of complex business areas or
issues
Complex Data Analysis
Delivering insight into complicated issues and questions within the
business, through uncovering trends, causes and relationships, to
ensure decisions are made on basis of evidence that reflects the
real world
There are also data mining people in the Sales and
Marketing departments.
Decision trees (Classification & Regression Trees – Breiman et
al., 1984) – recursive partitioning based on a significance
measure.
Cluster analysis: Ward, k-means, etc.
Self-organising map (Kohonen, 1982) – can be thought of as a
structured set of clusters.
Neural network – works out an approximation to the function
relating the inputs to the outputs.
Association rules – based on conditional probabilities p(y|x), e.g.
If I buy bread, what is the probability I buy butter?
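The bread-and-butter question can be sketched in a few lines. This is only an illustration: the baskets are invented, and the function names `confidence` and `support` are hypothetical labels for the usual association-rule quantities, not anything taken from Enterprise Miner.

```python
# Toy market-basket data, invented purely for illustration.
baskets = [
    {"bread", "butter"},
    {"bread", "butter", "jam"},
    {"bread"},
    {"butter"},
    {"bread", "jam"},
]


def confidence(baskets, x, y):
    """Estimate p(y | x): share of baskets containing x that also contain y."""
    with_x = [b for b in baskets if x in b]
    if not with_x:
        return 0.0
    return sum(1 for b in with_x if y in b) / len(with_x)


def support(baskets, x, y):
    """Share of all baskets that contain both x and y."""
    return sum(1 for b in baskets if x in b and y in b) / len(baskets)


# Four baskets contain bread and two of those also contain butter.
p_butter_given_bread = confidence(baskets, "bread", "butter")
joint_share = support(baskets, "bread", "butter")
```

A rule is normally only reported when both numbers are high enough: confidence says how reliable the rule is, support says how much of the data it covers.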
How a SOM works
Each dot represents a cluster centre, i.e. a vector of data with
the same columns (dimensions) as your data set.
For each row of the data set, the algorithm finds the nearest
cluster centre and moves it, and its neighbours, ‘towards’ the
current data row by a small amount
This process iterates through the data set a number of times.
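The update described above can be sketched roughly as follows. This is a minimal illustration, not the Enterprise Miner implementation: the names `train_som` and `nearest`, the Gaussian neighbourhood, and the fixed learning rate and radius are all assumptions (practical SOMs usually shrink both over time), and the six data rows are made up.

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(centres, row):
    """Grid position of the cluster centre closest to the data row."""
    return min(centres, key=lambda node: sq_dist(centres[node], row))


def train_som(rows, grid_w, grid_h, epochs=20, rate=0.3, radius=0.5, seed=0):
    """Train a small SOM and return {(col, row): centre vector}."""
    rng = random.Random(seed)
    # Each cluster centre starts as a copy of a randomly chosen data row.
    centres = {(c, r): list(rng.choice(rows))
               for c in range(grid_w) for r in range(grid_h)}
    for _ in range(epochs):              # iterate through the data set several times
        for row in rows:
            best = nearest(centres, row)
            for node, centre in centres.items():
                # The winner moves by `rate`; grid neighbours move by less
                # the further they sit from the winner on the grid.
                pull = rate * math.exp(-sq_dist(node, best) / (2 * radius ** 2))
                for d, value in enumerate(row):
                    centre[d] += pull * (value - centre[d])
    return centres


# Two obvious groups of records, mapped onto a 2 x 1 grid.
rows = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
        (10.0, 10.0), (10.0, 9.0), (9.0, 10.0)]
centres = train_som(rows, grid_w=2, grid_h=1)
clusters = [nearest(centres, row) for row in rows]
```

After training, each record is assigned to its nearest centre; here the first three rows share one grid cell and the last three share the other.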
Example of cluster output from SOM
From the ‘statistics’ tab in the SOM node, do ‘save as’ and save
the table as a
CSV file. Then the best thing to do is to open it in Excel,
transpose it and ‘tidy it up’ to give something like the table
shown above.
Note that in this example, clusters 7 and 8 are not used, in other
words there are no records for which those are the nearest
clusters.
[Sheet1: embedded spreadsheet of SOM cluster means for the six populated clusters. The same values appear with their row labels in the table at the end of this document. One further row appears here without a label: 0.1, 0.2, 0.7, 0, 0, 2.3. Variable names listed with the sheet: EuroTrav non-pts earning psjs in LY, First psjs in LY, jg_af_me_sa, jg_f_east, jg_n_c_am, WTP psjs in LY, ONLINE=N, ONLINE=Y.]
SAS Enterprise Miner
http://www.sas.com/technologies/analytics/datamining/miner
Note that with some of the techniques, SAS have “reverse
engineered” the idea and applied their existing products to solve
the underlying problem, e.g. treated neural networks as a
non-linear programming problem, SOMs as a clustering problem,
etc.
Sample - by creating one or more data sets
Explore - by searching for anticipated relationships, unanticipated
trends, and anomalies in order to gain understanding and
ideas
Modify - by creating, selecting, and transforming the variables to
focus the model selection process
Model - by using the analytical tools
Assess - by evaluating the usefulness and reliability of the
findings
You may not want to include all of these steps
It may be necessary to repeat one or more of the steps several
times
Another example of a data mining methodology is CRISP-DM
(Cross-Industry Standard Process for Data Mining)
& research at BA
1989/90 - looking at neural nets for forecasting bookings and
identifying special events.
1992 - Predicting “no-shows” (use of neural networks to predict,
from the booking attributes, the number of people who have made a
booking but do not check in for the flight)
1996/7 - Engine condition monitoring : feedforward neural network
and self-organising maps used for ‘novelty detection’ to spot
abnormal engine condition states and monitor trends (in addition to
use of sophisticated conventional physical and data analysis
techniques)
1996/7 - Neural network for estimation of work requirement for
major engineering overhauls of aircraft.
1999 - Forecasting pilot training requirements
Patterns in takeup of electronic ticketing and check-in.
Effect of disruption and compensation on customer loyalty.
1999 – Decision trees used in customer value prediction
(PCV).
1999 – Self-organising maps used in “Travel Service” CRM.
2000/1 – attrition models & segmentation for Executive Club
(frequent flyer) data.
2001 – September 11th ☹
2002/3 – Analysis of on-board customer survey data (global
performance monitor)
In-flight retail. Analysis of who buys what, on-board.
2004 – Executive Club travel pattern segmentation
British Airways Executive Club
“Frequent flyer” scheme (but also includes “partner” organisations
e.g. car hire, hotels, credit cards, foreign exchange etc. )
BA Miles – can redeem these for free flights (and other
things)
Tier points – count towards promotion from Blue to Silver and Gold
Tiers.
Silver and Gold members are eligible for “benefits” such as lounge
access, preferential check-in etc.
Data kept on flights booked and travelled and miles earnt with
partner companies.
some Executive Club models
“Commercial partners” usage segmentation (car hire, hotels,
financial cards, etc. )
“Segment management” (specific business propositions for top
segment “frequent premium stars”)
“New joiners” model (predict value from customer attributes and
patterns)
Techniques used … .
Cluster analysis
Self-organising maps
Logistic regression
BA Examples (2): “Travel Service”
Leisure travel scheme whereby customer gave details of favourite
destinations, activities, plus time of year and budget, and BA sent
details of tailored offers.
(now discontinued)
Self-organising maps (SOMs) used to cluster database and select
groups for matching. (1998/9)
The diagram shows 16 customer segments (the green squares within
each box) viewed on 20 different variables, to show booking, travel
and destination patterns. The area of the small squares shows
magnitude.
Note: this chart was not generated using Enterprise Miner,
though SAS was used in some of the analysis
Sun seekers who want all components included (13.5,2.8)
Blue tier exec club members with city breaks (1.2,4.3)
Busy people who get away when they can & are not price sensitive
(2.3,8.2)
Adventure Trail Finders (2.6,3.2)
Type of person who just ticks “all offers” box (2.3,4.8)
Retired Southerners looking for Australia? (9.7,2.3)
Diners & shoppers (or who like to think they do)
(3.2,1.3)
The bookers who have not provided us with all info (8.5,20.5)
Cluster as % of total
This example
packages.
BA Example (3) : In-flight retail
This example shows the use of a SOM in Enterprise Miner to identify
a small cluster of customers with
very high value purchase patterns
A (small) cluster of shopaholics!
Purple squares show normalised mean
For this specific cluster
Blue squares show average
Difference of this cluster from overall mean
An airline is a very complex business
In this presentation, we are just considering commercial
complexity, that is in the selling process.
Operational complexity is very important to us too, but is another
subject!
Some of this complexity is there for good reasons,
e.g. good commercial sense, supply and demand economics,
or for the convenience of the customer
However, some is ‘historic’ or dictated by third parties,
or is not serving its purpose.
One area in which British Airways is interested at the moment
is:
How should we measure commercial complexity?
How effective are the many different ‘ways’ of selling tickets?
Does the complexity matter?
Using data mining methods to measure complexity
How can we use data mining methods to try to measure complexity?
Data mining techniques are good at adjusting their parameters to
represent the level of complexity in the data (number of
dimensions, or interactions, or ‘different things going on’).
Machine learning theory makes use of measures such as entropy
(information), minimum description length, VC-dimension, etc.
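As a rough illustration of one of these measures, the Shannon entropy of a categorical field gives the average number of bits needed to say which value a record takes. The function name `entropy_bits` and the channel labels are hypothetical, chosen only for this sketch.

```python
import math
from collections import Counter


def entropy_bits(values):
    """Shannon entropy (in bits) of the empirical distribution of `values`."""
    counts = Counter(values)
    total = len(values)
    # Sum of p * log2(1 / p) over the observed categories.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


# Every booking made the same way: nothing to describe.
single_channel = ["web"] * 8

# Bookings spread evenly over four ways of selling: two bits per booking.
four_channels = ["web", "agent", "call centre", "corporate deal"] * 2

simple = entropy_bits(single_channel)
varied = entropy_bits(four_channels)
```

On this reading, a part of the business where sales are spread over many equally used routes scores higher than one dominated by a single route.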
Take a decision tree, for example.
It will continue to partition the data set recursively until it can
no longer find significant splits.
So, in the right circumstances, a decision tree can show which
parts of the business are ‘simple’ and which are complex. If we set
the target variable to be a measure of revenue or profitability, we
can also see how the complexity relates to yield, in a crude sort
of way. (Note I have taken no account of ‘cost’ here for the
moment)
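One way to picture this is to grow a small regression tree on each part of the data and count the leaves it needs. The sketch below makes several assumptions: it stops on a minimum reduction in squared error (`min_gain`) rather than a significance test, the field names (`channel`, `deal`, `fare`) are hypothetical, and the records are invented.

```python
def sse(values):
    """Sum of squared deviations from the mean."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(rows, target):
    """Find the (attribute, value) binary split that most reduces SSE."""
    base = sse([r[target] for r in rows])
    best = (0.0, None)
    for attr in rows[0]:
        if attr == target:
            continue
        for value in sorted({r[attr] for r in rows}):
            left = [r[target] for r in rows if r[attr] == value]
            right = [r[target] for r in rows if r[attr] != value]
            if left and right:
                gain = base - sse(left) - sse(right)
                if gain > best[0]:
                    best = (gain, (attr, value))
    return best


def count_leaves(rows, target, min_gain=1.0):
    """Partition recursively and return the number of leaves (segments)."""
    gain, split = best_split(rows, target)
    if split is None or gain < min_gain:
        return 1                      # no worthwhile split left: one segment
    attr, value = split
    left = [r for r in rows if r[attr] == value]
    right = [r for r in rows if r[attr] != value]
    return count_leaves(left, target, min_gain) + count_leaves(right, target, min_gain)


# Invented bookings: one web fare, several negotiated agent fares.
web = [{"channel": "web", "deal": "none", "fare": 50.0}] * 6
agent = [{"channel": "agent", "deal": d, "fare": f}
         for d, f in [("corpA", 120.0), ("corpA", 120.0), ("corpB", 90.0),
                      ("corpB", 90.0), ("group", 40.0), ("group", 40.0)]]

web_leaves = count_leaves(web, "fare")
agent_leaves = count_leaves(agent, "fare")
```

The web bookings need a single leaf while the agent bookings need one leaf per deal, which is the sense in which the tree size separates ‘simple’ from ‘complex’ areas.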
in Enterprise Miner
“tree ring” diagram
The centre of the diagram represents the ‘root’ of the tree, i.e.
the whole data set
The outside of the diagram
represents the lowest levels
= higher value)
tree
Using a decision tree to measure commercial complexity
In this example, a decision tree is used to show aspects of
commercial complexity.
The input data was for a London-Edinburgh flight on a single
day.
The input variables represent
corporate deals,
special fares,
different currencies, etc.
Large simple areas such as this one for undiscounted club tickets
represent low complexity in this sense. There may be other kinds of
complexity e.g. due to ticket or booking changes.
Highly fragmented areas such as here represent many different rates
and specific circumstances.
“tree ring” diagram
Complexity
Profitability
High complexity, high revenue e.g. corporate deals
Low complexity, low revenue - e.g. web bookings
High complexity, low revenue e.g. groups
Data mining and complexity: Caveats
Data representation. Need to allow enough detail not to average out
the effect we are trying to measure, but need to limit it so we get
a workable model.
Choosing a target variable. There may be elements of complexity
which we are interested in, but which do not cause a change in the
‘target’ variable, and vice versa.
Problem with decision tree if the output is a straightforward
linear function of the input (it will try to model it as
step-functions).
This analysis does not tell us necessarily whether the complexity
we are looking at is good or bad, but gives us places to start
looking.
Much of the time, of course, we are not bothered about the number
of combinations, because the different variables are
decoupled.
There may of course be good reasons for retaining the complexity!
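The step-function caveat above can be made concrete with a one-variable sketch. It assumes a simple splitting rule (cut where the summed squared error of the two sides is smallest, stop once a segment's squared error is at most `max_sse`); the function name `step_leaves` and both data sets are invented for illustration.

```python
def sse(ys):
    """Sum of squared deviations from the mean of a non-empty list."""
    mean = sum(ys) / len(ys)
    return sum((y - mean) ** 2 for y in ys)


def step_leaves(points, max_sse):
    """Number of constant steps needed so that each step has SSE <= max_sse.

    `points` is a list of (x, y) pairs sorted by x.
    """
    ys = [y for _, y in points]
    if len(points) < 2 or sse(ys) <= max_sse:
        return 1
    # Choose the cut position that minimises the summed SSE of both sides.
    cut = min(range(1, len(points)),
              key=lambda i: sse(ys[:i]) + sse(ys[i:]))
    return step_leaves(points[:cut], max_sse) + step_leaves(points[cut:], max_sse)


# y = 2x: a straight line that one linear term would describe.
linear = [(x, 2.0 * x) for x in range(16)]

# y jumps once at x = 8: a genuine two-segment structure.
one_step = [(x, 0.0 if x < 8 else 10.0) for x in range(16)]

linear_leaves = step_leaves(linear, max_sse=0.5)
step_count = step_leaves(one_step, max_sse=0.5)
```

The straight line ends up with one leaf per data point, while the single jump needs only two, so a large tree on its own does not prove that the underlying relationship is complicated.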
Operational Research
Using a self-organising map to look at patterns in ticket sale
data
revenue
E-tickets
Each of the 8
(10x10) clusters.
Frequency (number of passengers in each cluster) is not shown but
should be examined alongside these charts.
The input data were for a London-Edinburgh flight on a single
day.
The input variables represent different ticket classes, ‘channels’
(agents, call centres, website and so on), corporate deals,
special fares, different currencies, etc. A subset of 8 variables
is shown here.
Here, there is no target variable
We are using the SOM to find structure in the data
We could find the size of SOM needed to model the ‘envelope’ which
covers the data, and use that size as a direct measure of
complexity, in the same way as we could use the size of a decision
tree to measure this ‘dimension’.
We need to be careful how we represent the data, that we are not
just measuring artefacts of the representation.
In the SOM, we can also visually ‘overlay’ the patterns of
different variables as a way of visualising correlations and fine
structure.
In the example shown, some findings are immediately evident,
e.g...
Most non-e-tickets on these flights were multi-leg flights (i.e.
transfers) ticketed by other airlines, in foreign currencies.
Web bookings, though accounting for a relatively large number of
transactions, show up as low complexity.
We gave the SOM the space to form
100 clusters. It actually populated 90 of them.
Part of the objective is to find out how much of
the business falls into ‘simple’ and ‘complex’
categories.
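Both figures can be read off the cluster assignments once every record has been mapped to its nearest cluster. The sketch below is only indicative: the function names `populated_share` and `concentration` are hypothetical, and the ten assignments on a nine-cell grid are invented.

```python
from collections import Counter


def populated_share(assignments, grid_size):
    """Return (number of populated clusters, share of the grid they occupy)."""
    populated = len(set(assignments))
    return populated, populated / grid_size


def concentration(assignments, top_n):
    """Share of records that fall in the `top_n` most heavily used clusters."""
    counts = Counter(assignments)
    largest = sorted(counts.values(), reverse=True)[:top_n]
    return sum(largest) / len(assignments)


# Invented example: ten records assigned to cells of a 3 x 3 grid.
assignments = [0] * 6 + [1] * 2 + [5, 7]

populated, share_of_grid = populated_share(assignments, grid_size=9)
share_in_biggest = concentration(assignments, top_n=1)
```

With the figures quoted above (90 of 100 clusters populated) the first measure would be 0.9; the second shows how much of the traffic sits in a handful of large, simple clusters.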
That is,
blue executive club tier,
5 passengers in.
Expectations either too high or too low.
Myths of data mining.
Asking the wrong questions.
Data driven and iterative, so cannot necessarily plan in
advance.
Can get swamped by results / options / model versions.
Danger of stating the obvious or not being believed.
Data quality, data definition and business understanding
issues.
Spreading understanding
It is often difficult initially to communicate the place, nature
and benefits of data mining, even to experienced statisticians,
operational researchers, or artificial intelligence people, but
once people “get it” they are enthusiastic.
Engineers, Revenue Management and Marketing analysts are often the
closest to the ideas.
Often difficult to convey complex results in meaningful business
terms.
There is sometimes a need to convince ‘upstream’ processes of the
value of collecting, cleaning and maintaining data for data
mining.
asking the right questions
Much of the skill in data mining is in helping the client to
articulate the question that they really want to answer and decide
if it is really a data mining question.
E.g.
How many executive club members travelled to New York in business
class last year ?
n
What should our marketing strategy be for the Far East
region?
n
What factors influence a customer’s propensity to recommend
BA?
y
y
?
the right mix of knowledge
With today’s computing tools, it is easy to get ‘results’ from a
data mining exercise.
The difficult part is interpreting these, sense-checking them, and
articulating a simple message from what is often a complex
picture.
Mix of technical and business knowledge essential.
Close involvement of clients and business domain experts.
Algorithms:
‘Build vs buy’ decisions
What BA is looking for in a data mining tool …
Set of algorithms with good coverage of problem types.
Scalability
Integration with data sources: ‘openness’
Compatibility with other software and company policy
Justifiable value
Email :
[email protected]
number of clusters
Column (in SOM grid)                         1      2      3      1      2      3      1      2      3
Club World PSJs in last year               0.9    1.1    1.5    0.3    0.2    9.1      .      .    0.8
WTP psjs in last year                      0.2    0.3    0.3    0.1    0.2    1.0      .      .    0.2
WorldTrav non-pts earning psjs in LY       0.2    0.5    0.4    0.6    3.9    0.4      .      .    1.0
Club Europe psjs in last year              4.4    1.3    5.5    0.3    0.2    6.6      .      .    1.6
EuroTrav non-pts earning psjs in LY        4.0    0.8    1.1    0.8    1.1    0.8      .      .    2.7
Domestic PSJs in last year                 4.3    3.0    2.5    1.2    0.7    2.1      .      .    3.3
Europe PSJs in last year                  11.1    2.8    8.8    1.6    1.7    9.6      .      .    5.3
Africa / M. East PSJs in last year         0.4    0.6    1.1    0.2    1.1    2.5      .      .    0.3
Far East PSJs in last year                 0.2    0.2    0.4    0.1    0.3    1.0      .      .    0.1
North / Central America PSJs in LY         0.8    1.5    1.5    0.8    2.8    9.6      .      .    1.8
Net Revenue in last year                  3340   2442   4778    845   1163  19057      .      .   2668
Has miles to redeem into this zone           1      4      8      0      1      7      .      .      2
ONLINE=Y                                   0.0    0.0    0.0    0.0    0.0    0.0      .      .    1.0