Mike Ferguson
Managing Director
Intelligent Business Strategies
Information Builders Data Strategy Workshop
London, April 2015
Transitioning to a Data Driven Enterprise
- What is A Data Strategy and Why Do You Need One?
About Mike Ferguson
Mike Ferguson is Managing Director of Intelligent Business
Strategies Limited. As an independent analyst and consultant
he specialises in business intelligence, analytics, data
management and big data. With over 33 years of IT
experience, Mike has consulted for dozens of companies,
spoken at events all over the world and written numerous
articles. Formerly he was a principal and co-founder of Codd
and Date Europe Limited – the inventors of the Relational
Model, a Chief Architect at Teradata on the Teradata DBMS
and European Managing Director of DataBase Associates.
www.intelligentbusiness.biz
Twitter: @mikeferguson1
Tel/Fax (+44)1625 520700
Topics
The increasingly complex data landscape
Why have a data strategy?
• The impact of data issues on your core business processes
• The impact of fractured master data on business operations
• The impact of inconsistent data on analysis, reporting and decision making
• Competitive advantage – the impact of new data
Creating a data strategy – what do you need to consider?
What is needed for enterprise data governance and data management and where
are you on the roadmap?
• People
• Process
• Technology
Getting started
The Data Landscape Is Becoming Increasingly Complex And Lack of
Integration Is Working Against Business
Line of business IT initiatives when there is a need for enterprise wide common
infrastructure
Multiple copies of data
Processes not integrated
Different user interfaces
Server platforms complexity
Duplicate application functionality
Point-to-Point “Spaghetti” application integration
[Diagram: point-to-point “spaghetti” integration between the Marketing, Customer Service, HR, General Ledger, Procurement, Billing, Fulfilment and Sales systems]
Trends – More And More Appliances Appearing On The Market Causing
‘Islands’ of Data
Oracle Exadata
IBM PureData System for Analytics
Pivotal Greenplum DCA
Teradata
Big Data Is Also Now In The Enterprise Introducing More Data Stores, e.g.
Hadoop, NoSQL, Analytic RDBMS
[Diagram: multiple analytical data stores and tools in the enterprise]
Users: business analysts, developers
Tools: BI tools platform & data visualisation tools, search based BI tools, custom MR apps, Map Reduce BI tools, graph analytics tools
Data stores and engines: DW on MPP analytical RDBMS, graph DBMS, SQL, indexes, real-time stream processing (actions)
Data: OLTP data, unstructured / semi-structured content, event streams, RDBMS, files, clickstream, web logs, social data, social graph data
Enterprise Information Management Tool Suite
Complexity Is Increasing Further As Companies Adopt and Deploy A Mix of
On-Premise, SaaS and Cloud Based Systems
[Diagram: employees, partners and customers using on-premise systems within the enterprise, private cloud, private or public cloud, SaaS BI and off-premise hosted apps across the corporate firewall and the WWW. Other labels: Enterprise Service Bus, Enterprise Portal, Mashups, Office Applications, Operational & BI Systems]
Data is now potentially fractured even more than before
Hundreds of New Data Sources Are Emerging
- The Internet of Things (IoT)
The Task Of Governing and Managing Data Is Becoming Increasingly
Complex As Data Becomes Distributed
“Where is all the Customer Data?”
[Diagram: customer data scattered across XML and text, digital media, RDBMSs, web content, flat files, packaged applications, office documents, legacy applications, BI systems, big data applications, cloud based applications and ECMS]
Why Do We Need A Data Strategy and Enterprise Data Governance?
Uncontrolled and unmanaged data impacts:
• Business operations
– Employees, customers, partners and suppliers struggle to find information
– Incomplete and inaccurate data can cause process defects and delays
– Businesses are slow to respond when they do not have the required data in time or when it is not fully trusted
– Can cause errors that result in customer dissatisfaction
• Business decision making and performance management
– Incorrect or poor quality decision making
– Inability to make decisions
– Performance management reconciliation problems
– Excel mania!
• Compliance
– Violation of regulations e.g. inaccurate regulatory and legislative reporting
As Processes Execute, Subsets And Aggregates of Master and Transaction
Data Are Stored In Many Different Systems
Process Example - Manufacturing Order to cash
Process steps: order, credit check, schedule, fulfil, ship, invoice, payment, package
Systems: Order entry system, Finance credit control system, Production planning & scheduling system, CAM system, Inventory system, Distribution system, Billing, Gen Ledger
Data held across these systems: Orders data, Customer data, Product data
This makes data difficult to track, maintain, synchronise and manage
Business Operational Transaction Processing
– The Ideal Situation
Order-to-Cash Process (Orders): order, credit check, fulfil, ship, invoice, payment, package
An ideal situation would be smooth operation, increased automation, no delays, no defects and no unplanned operational cost
Data Issues In Transaction Processing Impact Business
- What Are We Looking For In Business Processes?
Order-to-Cash Process (Orders): order, credit check, fulfil, ship, invoice, payment, package
Data errors at several steps each add cost (£, £££, ££):
• Data quality problems e.g. missing or wrong data on order entry
• Manual intervention and process delays
• Domino impact of errors on later steps
Unplanned operational cost = (£ + £££ + ££) * Number of Orders
All these defects add up to the unplanned operational cost of processing an Order
Whatever you do has to reduce unplanned operational cost
What about other types of transactions that have data related problems?
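The cost formula on this slide can be worked through with a few lines of Python. This is only an illustrative sketch: the function name, the defect labels and the per-order pound amounts are all hypothetical stand-ins for the £, £££ and ££ symbols in the diagram.

```python
def unplanned_operational_cost(cost_per_order_by_defect, number_of_orders):
    """Sum the per-order cost of each defect type, then scale by order volume."""
    return sum(cost_per_order_by_defect.values()) * number_of_orders


# Hypothetical per-order costs in pounds (assumed values, not from the slides).
defect_costs = {
    "data_quality_problems": 1.50,
    "manual_intervention_and_delays": 4.00,
    "downstream_errors": 2.50,
}

annual_cost = unplanned_operational_cost(defect_costs, number_of_orders=100_000)
print(f"Unplanned operational cost: £{annual_cost:,.2f}")
```

Because the per-order cost is multiplied by the number of orders, any growth in order volume grows the unplanned cost in proportion unless the underlying defects are removed.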
The Impact of Data Anomalies In Transaction Processing As The Business
Scales Can Be Considerable
Order-to-Cash Process (Orders): order, credit check, fulfil, ship, invoice, payment, package
Data errors: data quality problems e.g. missing or wrong data on order entry; manual intervention and process delays; domino impact
Costs shown: £££, ££££, £££££££
Unplanned operational cost increases as the business scales if anomalies are not fixed and data is not governed
Master Data Anomalies – Audience Question?
ERP
What happens if you have to invoice a customer?
What happens when you receive a payment from a customer?
How many of you have duplicate customers in your ERP system(s)?
Duplicate
customers? Change customer details
If you change the details of a customer address do you change all duplicates?
Does your ERP system send customer data to other systems?
If so does it send all duplicates? What happens if duplicates are not in sync?
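One rough way to start answering the duplicate-customer question is to group records on a normalised key. The sketch below assumes a simple record layout (`customer_id`, `name`, `postcode`) and a naive matching rule; both are hypothetical, and real matching tools use far richer rules than this.

```python
from collections import defaultdict


def normalise(value):
    """Lower-case the value and keep only letters and digits."""
    return "".join(ch for ch in value.lower() if ch.isalnum())


def find_duplicate_customers(customers):
    """Group customer IDs that share a normalised name and postcode."""
    groups = defaultdict(list)
    for record in customers:
        key = (normalise(record["name"]), normalise(record["postcode"]))
        groups[key].append(record["customer_id"])
    # Only keys seen on more than one record are candidate duplicates.
    return [ids for ids in groups.values() if len(ids) > 1]


# Hypothetical ERP customer records, invented for illustration.
customers = [
    {"customer_id": "C001", "name": "Acme Ltd.", "postcode": "SK9 1AA"},
    {"customer_id": "C002", "name": "ACME LTD", "postcode": "sk91aa"},
    {"customer_id": "C003", "name": "Bolt & Co", "postcode": "M1 2AB"},
]

duplicates = find_duplicate_customers(customers)
print(duplicates)
```

Each group returned is a set of records that would all need the same address change applied, which is exactly the situation the questions above are probing.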
Master Data Is Often Fractured Across Multiple Data Entry Systems – E.G.
Customer Data
[Diagram: Mortgage System, Branch Banking System, Loans System, ERP System, Credit Card System and Call Centre System, each holding its own customer data subset]
Different identifiers for the same entity in each data entry system
Different data definitions for the same data in each data entry system
Different subsets of master data in each system
Inconsistent master data in each data entry system
Varying degrees of duplication of master data in each data entry system
Synchronisation issues
Data conflicts
Changes To Master Data In A Stand Alone Multi-ERP Environment Make
Globalisation Very Difficult
[Diagram: XYZ Banking Group and its divisions (XYZ Mortgages, XYZ Loans, XYZ Cards, XYZ Insurance, XYZ Investments), each running stand-alone ERP systems]
Master data entities: Suppliers, Products/Services, Accounts, Assets, Employees, Customers, Partners, Materials
Changes arriving at individual ERP systems: new product, new supplier, new partner, update materials, update account, update customer, update chart of accounts
Master Data Maintenance - The Problem of Multiple Data Entry Systems and
Master Data Synchronisation
[Diagram: Mortgage System, Branch Banking System, Loans System, ERP System, Credit Card System and Call Centre System, each holding a customer data subset that has to be synchronised with the others]
The “synchronisation nightmare”
This has to be done for changes to EVERY master data entity
The problem gets worse as you add more applications
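A back-of-envelope calculation shows why the problem gets worse as applications are added. The sketch assumes the worst case in which every system synchronises directly with every other one, and compares it with a hypothetical central master data hub that each system connects to once. Both models and both function names are illustrative assumptions, not figures from the slides.

```python
def point_to_point_links(n_systems):
    """Links needed if every system synchronises directly with every other."""
    return n_systems * (n_systems - 1) // 2


def hub_links(n_systems):
    """Links needed if every system synchronises only with one central hub."""
    return n_systems


for n in (6, 7, 12):
    print(f"{n} systems: {point_to_point_links(n)} point-to-point links, "
          f"{hub_links(n)} hub links")
```

Under these assumptions the six systems in the diagram need 15 direct links, and a seventh application adds six more, whereas a hub would need only one new link per application.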
Master Data Synchronisation – The Spaghetti Architecture
Complexity & Lack of Integration Is Working Against Business
Where is the complete set of master information?
How do I get the master data I need when I need it?
With so many definitions for master data what does it mean?
Can I trust it?
Is it complete and correct?
How do I get it in the form I need?
How do I know where it goes and if it is correct?
How do I control it?
Spaghetti Interfaces between systems
How much does it cost to
operate this way??!
Inconsistent Master Data Can Disrupt Business Operations and Drive Up
Costs Due To Manual Intervention Being Needed
Manufacturing - Order to cash: order, credit check, fulfil, ship, invoice, payment, package
Master data: product, customer, asset
How many people do you employ to fix and reconcile data because it is not synchronised?
What master data entities are used in your core processes?
In what systems in your core processes does it reside?
Where in your core processes is master data created?
Where in your core processes is it consumed?
Many Companies Have Business Units, Processes & Systems Organised
Around Products and Services
[Diagram: XYZ Corp. as an enterprise serving Customers/Prospects through Channels/Outlets. Product/service lines 1, 2 and 3 each run their own order, credit check, fulfil, ship, invoice, payment, package process for their own orders: Order (product line 1), Order (product line 2), Order (product line 3)]
Business and Data Complexity Can Spiral Out Of Control if Processes And
Systems Are Duplicated Across Geographies
[Diagram: Product lines 1, 2 and 3 repeated five times, once per geography]
Master data entities affected: Suppliers, Products/Services, Accounts, Assets, Employees, Customers, Partners, Materials
Business Implications Of Product Orientation and Fractured Customer Data In
A World Where Customer Is Now King
Different marketing campaigns from different divisions aimed at the same customer
Different sales teams from different divisions selling to the same customer
Customer service is hard e.g. What is my order status for all products ordered?
Cost of operating is much higher due to duplicate processes across product lines
Can’t see customer / product ownership
Can’t see customer risk and customer profitability
Higher chance of poor data quality
Difficult to maintain customer data fractured across multiple applications
Enterprise Data Governance and MDM Business Case
- What is the Business Benefit?
How much complexity would be removed from your business if master data was centralised?
How much could you save in reducing the cost of operating if master data was centralised?
How much more responsive would your business be if everyone could see changes to master data as soon as they happen?
How many duplicate processes associated with master data could be removed from your business if master data was centralised?
How many FTP transfers and emails with spreadsheets would be eliminated if data could be managed by a single suite of tools?
Data Governance & MDM is a corporate ‘weight loss’ program
Data Issues - Many Companies Have Built Multiple DWs and Marts In
Different Parts of Their Value Chain
[Diagram: separate data warehouses, each with its own marts, built in different parts of the value chain]
Operational systems: Forecasting, Planning, ERP, CAD, Manufacturing execution system, Shipping system, CRM system, SCADA systems
Master data: Product, Materials, Supplier
Data warehouses: Finance DW, Manufacturing volumes & inventory DW, Sales & mktng DW
Reporting: Financial / Reg Reporting & Planning
Makes management and regulatory reporting more challenging as data needs to be integrated to see across the value chain
May also be the case that data is inconsistent across data warehouses, e.g. different PKs, data names, hierarchies and DI/DQ jobs for the same data in each DW
The issue here is project related DI
Do You Have Data Consistency Across All Your BI Systems?
[Diagram: three separate stacks, each with its own Data Integration, DW, mart and BI tools]
Common data definitions across all tools for the same data?
Common data definitions across all DWs for the same data?
Common data transformations across all DWs for the same data?
Same data integration tool for all DWs?
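The question of common data definitions across DWs can be checked mechanically once each warehouse's definitions are written down. The sketch below is a minimal illustration that assumes each DW exposes a mapping of data item name to a definition string. The warehouse names, item names and data types are hypothetical.

```python
def inconsistent_definitions(warehouses):
    """Return data items whose definition differs between warehouses."""
    seen = {}
    for dw_name, definitions in warehouses.items():
        for item, definition in definitions.items():
            seen.setdefault(item, {})[dw_name] = definition
    # An item is inconsistent when more than one distinct definition exists.
    return {
        item: by_dw
        for item, by_dw in seen.items()
        if len(set(by_dw.values())) > 1
    }


# Hypothetical definitions for the same data held in two warehouses.
warehouses = {
    "finance_dw": {"customer_id": "INTEGER", "revenue": "DECIMAL(12,2)"},
    "sales_dw": {"customer_id": "VARCHAR(10)", "revenue": "DECIMAL(12,2)"},
}

mismatches = inconsistent_definitions(warehouses)
print(mismatches)
```

The same comparison could be applied to data names, hierarchies or transformation rules, provided they are recorded in a comparable form for each DW.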
Why Standardise on Data Definitions?
Confusion as to what data means
Lack of Trust to use it
What Else Should A Data Strategy Bring?
Competitive Advantage!
New Data Sources Have Emerged Inside And Outside The Enterprise That
Business Now Wants To Analyse
[Diagram: the enterprise between Suppliers / Supply Chain and Customers, divided into Front Office, Operations and Back Office]
Functions: Sales, Marketing, Service, Credit Verification, HR, Finance, Planning, Procurement
Product lines: Product/service line 1, Product line 2, Product line 3, Product line 4, Product line n
New sources, e.g. RFID tag, sensor networks, weather data
Growth in: data volume, data variety, data velocity, number of sources
Popular Types of Data That Businesses Now Want to Analyse
Web data
• Clickstream data, e-commerce logs
• Social networks data e.g., Twitter
Semi-structured data e.g., e-mail
Unstructured content
IT infrastructure logs
Sensor data
• Temperature, light, vibration, location, liquid flow, pressure, RFIDs
Vertical industries structured transaction data
• E.g. Telecom call data records, retail
Why New Data?
– The Demand for Enhanced Customer Data
Source: IBM Redbook - Information Governance Principles and Practices for a Big Data Landscape
We Need To Combine Data To Get Deeper Insights
MDM System (CRUD on Prod, Asset and Cust master data): Who are our customers? What products do we sell?
DW: Who are our most loyal, low risk customers that generate low fees?
What are the most popular navigational paths through our web site that lead to high fee products?
Combined: What is the online behaviour of loyal, low risk, low fee customers so we can offer them higher fee products?
Basing customer analysis on transaction activity AND behaviour patterns helps to determine whether to strengthen or weaken a relationship
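The combined question can be sketched as a two-step join: pick the customer segment from DW data, then count the web navigation paths taken by that segment. Everything in this sketch is a hypothetical illustration, including the field names, the fee threshold and the sample records. In practice the two data sets would come from the DW and from clickstream storage rather than in-memory lists.

```python
from collections import Counter


def popular_paths_for_segment(segment_customer_ids, sessions, top_n=3):
    """Count navigation paths taken by customers in the given segment."""
    counts = Counter(
        tuple(session["path"])
        for session in sessions
        if session["customer_id"] in segment_customer_ids
    )
    return counts.most_common(top_n)


# Hypothetical DW extract: loyalty, risk and fee attributes per customer.
dw_customers = [
    {"customer_id": "C1", "loyal": True, "risk": "low", "annual_fees": 40},
    {"customer_id": "C2", "loyal": True, "risk": "high", "annual_fees": 35},
    {"customer_id": "C3", "loyal": True, "risk": "low", "annual_fees": 20},
    {"customer_id": "C4", "loyal": False, "risk": "low", "annual_fees": 900},
]

# Loyal, low risk customers generating low fees (threshold is an assumption).
segment = {
    c["customer_id"]
    for c in dw_customers
    if c["loyal"] and c["risk"] == "low" and c["annual_fees"] < 100
}

# Hypothetical clickstream sessions already matched to customer IDs.
sessions = [
    {"customer_id": "C1", "path": ["home", "savings", "premium-account"]},
    {"customer_id": "C3", "path": ["home", "savings", "premium-account"]},
    {"customer_id": "C3", "path": ["home", "loans"]},
    {"customer_id": "C4", "path": ["home", "premium-account"]},
]

top_paths = popular_paths_for_segment(segment, sessions)
print(top_paths)
```

The join only works if web sessions can be tied back to a trusted customer identifier, which is where the master data in the MDM system comes in.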
Data Deluge - Data Is Arriving Faster Than We Can Consume It
– How Good Is Your Filter?
[Diagram: incoming DATA passes through a FILTER before reaching the Enterprise and its enterprise systems]
Organising New Data In A Data Reservoir
– This Needs To Be Built Incrementally
[Diagram: a data reservoir made up of zones and data stores]
Zones: Data ingest zone, Exploratory analysis zone (prepare & analyse data, sandbox), DW archive zone, New insights zone
Data stores: DW, DW appliance, data marts, graph DBMS, analytical DBMS, NoSQL DB
Data flows: Txns, insights
Trusted data (enterprise and local), e.g. master data in MDM (CRUD)
Organising New Data In A Data Reservoir
– You Have To Catalog Data, Its Status And Where It Is
[Diagram: raw data arriving from inside and outside the corporate firewall, including the cloud]
Sources: Social Media, Web Logs, Documents, Industry Standards, Machine, Device, Scientific, Transactions, OLTP
Data Refinery: Raw data (untrusted), In-Process data, Refined data (trusted, fit for use)
Information Catalogue: records the status of each data set
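A catalogue that records each data set, its status and where it is could be sketched as below. This is a minimal in-memory illustration, not a description of any catalogue product: the class names, the three status values (taken from the raw, in-process and refined labels on the slide) and the sample locations are all assumptions.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    RAW = "raw"
    IN_PROCESS = "in-process"
    REFINED = "refined"


@dataclass
class CatalogueEntry:
    name: str
    location: str
    status: Status = Status.RAW

    @property
    def trusted(self):
        """Only refined data is treated as trusted and fit for use."""
        return self.status is Status.REFINED


class InformationCatalogue:
    def __init__(self):
        self._entries = {}

    def register(self, entry):
        """Record a data set, where it lives and its current status."""
        self._entries[entry.name] = entry

    def promote(self, name, status):
        """Update the status of a data set as it moves through the refinery."""
        self._entries[name].status = status

    def fit_for_use(self):
        """Names of data sets that are currently trusted, in sorted order."""
        return sorted(e.name for e in self._entries.values() if e.trusted)


# Hypothetical data sets and locations, invented for illustration.
catalogue = InformationCatalogue()
catalogue.register(CatalogueEntry("web_logs", "reservoir/ingest/web_logs"))
catalogue.register(CatalogueEntry("customer_master", "mdm/customer", Status.REFINED))
catalogue.promote("web_logs", Status.IN_PROCESS)

trusted_names = catalogue.fit_for_use()
print(trusted_names)
```

Keeping status and location together in one place is what lets consumers tell raw data from data that is fit for use without asking the team that loaded it.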
Data Strategy
Key Requirements for Enterprise Data Management And Data Governance
1. Create a vision and strategy for information management
2. Create the right organisational structure (people) to govern data
3. Nominate, standardise and define the data to be managed and governed
4. Create the right processes to manage and govern data
5. Define policies and policy scope to manage and govern specific data items
6. Follow an implementation methodology to get your data under control
7. Use technology in each step of the methodology to help implement the policies
and processes to manage and govern the data
8. Produce and publish trusted data and services for others to easily find, order and
consume
Why Is A Data Strategy Important?
- What Do You Need To Consider?
What are your data issues?
• e.g. incorrect or missing data, late data, duplicate data (customers)
What is the business impact caused by data anomalies?
• Processes
– E.g. Major increases in manual activity to redo tasks
– Manufacturing errors, late deliveries, customer dissatisfaction
– Process delays e.g. month end close delayed, reports delayed
– Transactions rejected
• Decisions
– Incorrect, delayed, inaccurate/ incomplete reporting, lost opportunity
Who is affected by data anomalies?
• e.g. departments, customers, suppliers
What is the estimated unplanned annual cost to the business?
• Break it down by department (business and IT)
What Do You Need To Consider – 2
What is the risk to the business going forward?
• What is the risk? e.g. headcount increase, anomalies out of control as the business scales
• Where is the risk?
What is the estimated opportunity cost savings if you could fix it?
• Break it down by department
What new (big) data should you bring on board that offers the greatest competitive
advantage? What is your big data strategy?
How will you capture, manage, clean and integrate new data and make trusted data
and new insights available for consumption?
How will you manage IT and self-service data integration?
How will you co-ordinate activity to enrich what you already know?
The recommendations you need to maximise the value of data
What Are The Issues With Structured Data Management and Data
Governance?
What data needs to be controlled?
Where is that data?
What data names is it known by?
What should it be known by?
What state is the data in?
Does it need to be cleaned, transformed, integrated and shared?
Where does it originate and where does it flow to?
Should it be kept synchronised?
Who is allowed to access it?
Who is allowed to maintain it?
How much power do those users have and how are they audited?
Key Requirements – We Need to Create A New World of Information
Producers and Information Consumers
Need to make use of
• A business glossary and information catalog
• Re-usable services to manage and process data
• Collaboration and social computing to manage, process and rate data
• Role-based data management tools aimed at IT AND business
[Diagram: information producers use clean & integrate services to turn raw data into trusted data and publish it in an information catalog. Information consumers search, find, shop, order and consume it from a BI tool or application, like a “corporate iTunes” for data]
People shown: data scientist, IT professional, business analysts
What Are You Producing?
Trusted, integrated, commonly understood master data
Trusted, integrated, commonly understood reference data
Trusted new insights from big data
Trusted new master data attributes from big data
Trusted, integrated, commonly understood data in data warehouses and data marts
Trusted, commonly understood data in OLTP systems
Trusted, commonly understood data available on-demand on an enterprise service
bus
Data Management and Enterprise Data Governance Need People, Process,
Policies and Technology
Data Management and Enterprise Data Governance
The people, processes, policies and technology used to formally
manage and protect structured and unstructured data assets to
guarantee commonly understood, trusted and secure data
throughout the enterprise
This is about simplification, reducing complexity, lowering cost and
increasing integration across the enterprise