Architecting a BI Solution Erich Teichmann Technical Director, BT
Page 1: Architecting a BI Solution · 2007-03-06 · Architecting a BI Solution Erich Teichmann Technical Director, BT. ... BI platforms provide a common metadata layer that unifies data

Architecting a BI Solution

Erich Teichmann, Technical Director, BT


Quick Introduction

• Enterprise Architecture

– Why do we need the solution?

• Market drivers; business needs

• Business Drivers

– What are the major components?

• Key pieces that make up the solution & Key interfaces

• Application Architecture

– How are we going to buy/build the components?

• Vendor Analysis

• Frameworks, Methodologies & Tools

– How are we going to deploy & manage the solution?

• Case Study

– BI solution driven by the Basel II accord


BI Market Analysis


Market Definition

• Business intelligence (BI) is now a ubiquitous, all-encompassing and often overused term covering a range of different tools, technologies and disciplines.

• Most analysts have a very fluid definition of BI describing it as ‘a set of technologies and processes that support the decision making process’.


Market Definition - BI Components

• From a software perspective, BI technology covers, but is not limited to, the following capabilities and disciplines:

– extracting and/or processing information from operational IT systems;

– integrating, possibly supplementing and storing that information in the most appropriate form

– providing the ability to opportunistically analyse and/or automate the analysis of information according to business needs

– disseminating and delivering analyses of that information

– determining how best and when to feed the results of analysis back into a company’s business processes so that it may be actioned

– providing robust development and management tools to support the BI information infrastructure.

• A BI platform is the name commonly given to the culmination of these technologies and processes, and is the technical infrastructure by which technology or industry-specific analytic applications can be built and deployed.


BI Components

Diagram (layers, top to bottom): User Layer; Subset - Aggregate; Enterprise Layer; Restructure - Cleanse; Historical Data Store (a set of HDS silos); Staging; data feeds. Metadata, Development and Management run alongside all layers.


BI Market Drivers

• The growth in BI can be attributed to the use of BI technology to address three broad business drivers:

– reducing exposure to risk

– reducing costs

– driving competitive advantage.


BI Market Drivers – Reducing Risk

• Regulation is a fact of corporate life today.

– Over the last few years we have seen the introduction of a raft of regulations, such as Sarbanes-Oxley, Basel II, anti-money laundering rules and HIPAA, all of which place pressure on companies to be more accountable for the monitoring and use of information.

– These regulatory requirements not only impact on the business but also IT, as the process of capturing, storing and reporting business information is brought under tighter control.

• BI software can play a vital role in helping organisations comply with these regulations through:

– information audit trails,

– data lineage,

– providing an integrated view of the business, and

– enhanced financial reporting and performance measurement

• BI also brings more standardisation and control to a company’s business processes.


BI Market Drivers – Reducing Cost

• CxOs are being set far more drastic targets by their CEOs.

– CIOs, for example, are being asked to reduce operational IT costs by 30% over three years, which means that for many, trimming around the edges is no longer enough.

• From a BI perspective organisations are having to take more radical steps to reduce costs and this is manifesting itself in two broad forms:

– Greater emphasis on the consolidation and rationalisation of BI products and vendors. The cost and complexity of maintaining multiple BI tools is too high.

– Organisations are using BI software to identify areas of the business that are not performing, by defining and measuring key performance indicators across the business - commonly referred to as performance management.


BI Market Drivers – Competitive Advantage

• The window of opportunity for competitive differentiation is short but highly valuable, and BI software can play a key role in helping organisations make the most of this window.

• In particular, the ability to access, analyse and exploit the right information at the right time allows organisations to optimise and make more effective and timely day-to-day business decisions.

• Examples?


Inhibitors to BI Mass Market

• Data integration;

• Pricing;

• Ease of use


Data Integration

• The critical success factor for many BI projects centres on the confidence levels attributed to the data underlying a BI solution.

• Users need to be confident about the validity of the data in order to make successful decisions and ensure widespread BI adoption.

• Organisations need a series of data integration tools and techniques to integrate their data in such a way that these initiatives and efforts can be shared and reused, thereby reducing some of the costs associated with duplication and unnecessary complexity.

– It is not just a case of slapping some reporting tools on to existing data.

• Data integration issues such as poor data quality, isolated data silos and multiple versions of the truth contribute to this lack of reuse and data confidence, and prevent many BI deployments from maximising their true enterprise potential and ROI.


Pricing Pressures

• The high cost of buying, implementing and maintaining a BI solution has so far prevented adoption of BI technologies on a mass scale.

• The most immediate obstacle to this mass adoption comes from the pricing and licensing models employed by many of the BI vendors.

– Vendors realise that this has to change…

• Large, enterprise-wide projects.


Ease of Use

• Information consumers are typically concerned with getting access to the right information at the right time and do not want to struggle with the baffling array of features and functions present within certain BI tools.

• Vendors address this by:

– offering dashboards that provide a simpler and more relevant view of business information;

– placing an increasing emphasis on integrating with Office productivity tools, especially Excel, to increase the familiarity and usability of the user interface for accessing BI information;

– employing varying levels of visualisation capabilities within their products, ranging from basic charts and gauges to more advanced mapping technologies to improve the display and comprehensibility of information.


Positioning BI


BI at the right Organisational Layer

• There are three general strata, each with unique characteristics:

– Strategic BI.

• Targets executives and other high-level decision-makers concerned with long-term corporate strategy.

• e.g. long-term customer and product trend analysis; long-term forecasting (sales over the next 12 months, expenses for the next three years); and corporate financials by divisional, geographic, and temporal dimensions.

– Tactical BI.

• Targets middle management, where mid-level decision-makers are concerned with short-term domain-specific tactics.

• e.g. short-term customer and product analysis; short-term forecasting (workers needed for the next three weeks, resource use expectations for two months); and departmental financial forecasts versus actuals.

– Operational BI.

• Targets production workers who make low-level decisions about instantaneous process-specific actions.

• e.g. immediate call-center upsell/cross-sell; just-in-time manufacturing; trading floor buy/sell/hold; and retail credit approval.


Key differentiators for three strata of BI

Source: Forrester (2004)



BI Products & Platforms


BI Products - Shortfalls

• Today's BI products fall short in three key areas:

– Data mining: Complex statistical and mathematical capabilities that map complex patterns from large quantities of data. Sophisticated products are few and far between…

– ETL: Extraction, cleaning, and normalizing of data that is then translated into a common format and loaded into data warehouses for use.

– Vertical reporting tools: Standardized report formats and specific business issue solutions that can be generated and delivered across the enterprise, as well as predefined analytical questions of the data.

• Tier one and tier two companies are spending R&D pounds on developing interface and standard report offerings to address vertical issues.

• Companies like Oracle are building out their ETL and data mining offerings, too, but vendors with niche expertise and product offerings, e.g. Ab Initio, are well positioned for acquisition for this functionality.

• Data mining in particular needs sophisticated business analytical personnel


BI Platforms - Shortfalls

• Most firms' BI strategies involve a multitude of tools connected to many different data sources, resulting in:

• Unnecessary software spending.

– BI software purchases have historically occurred at the department level, so most firms have more than one competing BI tool in-house.

– Piecemeal tool purchasing based on per-user licenses also leads to shelfware, further increasing firms' software costs without delivering any extra value.

• Excessive training, development, and support.

– Vendors use the same language to describe the functionality of their query, reporting, and OLAP tools, but their actual implementations and tools vary significantly. As a result, each BI package that firms own requires unique training, development, and IT support.

• Multiple versions of the truth.

– The status quo at most firms is that departments use their own BI software and data sources to produce their own metrics.

– The problem? Many metrics (profit, customer value and revenue, for example) are bigger than any single department, so firms end up with conflicting metrics that purport to measure the same thing.


Status quo – a tangled web of tools


One-Stop Enterprise Intelligence

• Firms have historically pieced together multivendor BI strategies because they didn't have a choice -- no single product or vendor provided the analytical features necessary to serve all users.

• A new breed of BI software has arrived: BI platforms. They offer…

• A unified data layer.

– Instead of multiple tools making independent connections to data in different combinations of source systems, BI platforms provide a common metadata layer that unifies data access, creating a "virtual warehouse" view of enterprise data.

– Reporting, query, OLAP, and mining tools that make up the BI platform all interact with this central metadata model so that all users, regardless of their department or analytical prowess, have access to the same “version of the truth”

• Centralised tools.

– BI platforms replace firms' chaotic mix of multiple, redundant tools for functions like reporting and OLAP with centralised, Web-based interfaces or services (SOA).

– Sharing a single tool set across the enterprise means that IT can finally gain control over the aspects of BI they handle best: infrastructure, security, and account maintenance. (go BT!)

• Shared metrics.

– BI platforms put an end to contradictory computations by storing key metrics centrally within the same metadata layer that unifies access to data. When users want to design a report that computes "churn," they simply drag it to their report from the corporate metric repository.
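The idea of a central metric repository can be illustrated with a small Python sketch. Everything here is hypothetical: the `METRICS` registry, the `metric` decorator, the `compute` helper and the churn formula (customers lost divided by customers at the start of the period) are assumptions for illustration, not the design of any particular BI platform.

```python
from typing import Callable, Dict

# Hypothetical central repository: one agreed definition per metric name.
METRICS: Dict[str, Callable[..., float]] = {}


def metric(name: str):
    """Register a metric definition under a single, shared name."""
    def register(fn: Callable[..., float]) -> Callable[..., float]:
        if name in METRICS:
            raise ValueError(f"metric {name!r} is already defined")
        METRICS[name] = fn
        return fn
    return register


@metric("churn")
def churn(customers_at_start: int, customers_lost: int) -> float:
    # Assumed definition: share of the starting customer base lost in the period.
    if customers_at_start <= 0:
        raise ValueError("customers_at_start must be positive")
    return customers_lost / customers_at_start


def compute(name: str, **inputs) -> float:
    """Every report looks the metric up by name instead of re-deriving it."""
    return METRICS[name](**inputs)


marketing_report = compute("churn", customers_at_start=200, customers_lost=10)
finance_report = compute("churn", customers_at_start=200, customers_lost=10)
```

Because both reports resolve "churn" through the same registry entry, they cannot drift apart the way department-level definitions do.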


Streamlined Enterprise Intelligence


Few vendors offer true BI platforms

• Vendors have converged on the BI platform market from the reporting market (Brio Software and Crystal Decisions), query tools (Business Objects), multidimensional analysis (Hyperion Solutions and Cognos), infrastructure (Microsoft and Oracle), business applications (SAP), and out of the blue (MicroStrategy). They've all got work to do.

• Database vendors must improve tools.

– Key players host much of the data that feeds firms' BI platforms and have built OLAP engines and substantial analytical capabilities into their database engines, but neither has brought a coherent set of BI tools to market.

• BI veterans must embrace data mining.

– They deliver solid reporting and OLAP features, but they don't provide the sophisticated data mining capabilities that professional data analysts need.

• Leaders must focus on functional depth.

– They provide software that runs the gamut of BI platform functionality, but their tools are merely adequate, not outstanding, in advanced disciplines like data mining and inline analytics.

Forrester Research (2004)


Business Intelligence Platforms – ‘04

Forrester Research (2004)


Magic Quadrant for BI Platforms

Gartner (Jan 2007)


Questions?



Case Study

DWH / Data Mining Solution in response to Basel II Accord


Background

• The Basel II Accord demands a new approach to Risk Management and Capital Adequacy

– Current rules require banks to set aside a fixed proportion (8%) of deposited funds to pay customers when they withdraw money. The rest (up to 92%) they lend out (mortgages, loans, etc) or otherwise invest

– The magic 8% is supposed to be sufficient to cover occasional losses incurred when debtors “default” (fail to pay back their loans when expected)

– It doesn’t take into account the actual level of risk of loss presented by different debtors – i.e. the likelihood that they will default, and the amount of the lending exposure that will be lost when they default

– For example – the likelihood of default may be lower for a 40 year old employed person with no bad credit history than for a 22 year old self-employed person with several “black marks” on their credit file

– Also – if a mortgage for £50,000 on a house worth £100,000 goes into default, the expected loss may be significantly less than for an unsecured loan of £25,000

– Current rules do not require banks to perform any risk analysis on their loan book, nor disclose how they have calculated their capital set-aside


Background

• The Basel II Accord demands a more rigorous approach in three key respects (“Pillars”):

– 1) Capital adequacy requirements must be calculated using a well-defined quantitative approach, based on estimates of three key measures:

• Probability of Default (PD) – the likelihood an account will default

• Exposure At Default (EAD) – the amount of the loan exposure at the time of the default

• Loss Given Default (LGD) – the proportion of the loan exposure that the bank will lose (i.e. not recoup from collections) if the account goes into default

– 2) Risk calculations and capital requirements must be managed and validated through a controlled, supervised process approved by the Regulator

– 3) Summary risk evaluations and capital adequacy calculations must be disclosed through regular reporting to the FSA
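The three measures under Pillar 1 are commonly combined into an expected loss figure, EL = PD × EAD × LGD. The slides do not spell this formula out; it is the standard identity, shown here as a minimal Python sketch. The input values are hypothetical and only loosely echo the earlier contrast between a well-secured mortgage and an unsecured loan.

```python
def expected_loss(pd: float, ead: float, lgd: float) -> float:
    """Expected loss for one account: PD x EAD x LGD."""
    if not (0.0 <= pd <= 1.0 and 0.0 <= lgd <= 1.0):
        raise ValueError("pd and lgd must be proportions between 0 and 1")
    if ead < 0:
        raise ValueError("ead must be non-negative")
    return pd * ead * lgd


# Hypothetical inputs, not taken from the slides.
# A mortgage backed by property: most of the exposure is recoverable.
secured = expected_loss(pd=0.02, ead=50_000, lgd=0.10)
# An unsecured loan: smaller exposure, but most of it is lost on default.
unsecured = expected_loss(pd=0.02, ead=25_000, lgd=0.90)
```

With these assumed inputs the smaller unsecured loan carries the larger expected loss, which is the kind of difference a flat 8% rule cannot express.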


Background

• To support the fulfilment of the 3 Pillars, the Accord explicitly or implicitly specifies a number of high-level requirements relating to capture and exploitation of data:

– Access to comprehensive, accurate, up-to-date account and customer data across the entire portfolio of products sufficient to support quantitative estimation of the 3 risk measures

– Ability to create and validate calculation processes (“models”) for risk measures using historical data going back up to 7 years or more

– Facility to execute models on a regular basis for actual capital calculations

– Capability to provide a complete audit trail of data back to source systems to show how every calculation was derived

– Facility to create reports on risk measures and capital requirements broken down by product and/or customer segments


Customer Requirements

• BASEL HISTORICAL DATA CAPTURE

– Acquire and retain historical data for risk analysis, enabling data exploration and modelling

• CAPACITY

– Store 5 - 7 years of detailed historical data, minimise duplication of data

• FLEXIBILITY

– Support wide range of user queries against current and historical data - data store may need to support other areas (marketing, general CIS queries) as well as Basel

• PERFORMANCE

– Deliver high-performance for data load process and end-user queries

• USABILITY

– Enable users to access, query and understand data with ease

• SINGLE VIEW OF TRUTH

– Provide consistent “single view” of data to all user groups

• AUDITABILITY

– Provide full audit trail from source data to model output and reporting


BDS Architecture Overview

• BASEL HISTORICAL DATA REQUIREMENTS - retains complete set of historical data for 5 to 7 years

• CAPACITY - efficient data capture structure, will support nearline storage technology

• PERFORMANCE - High speed “Change Data Capture” provides rapid population of HDS

• AUDITABILITY - Data is held in raw, unmodified form, enabling derived data to be traced back to source

• SINGLE VIEW OF TRUTH - data is cleaned, transformed and mapped into a single “conformed” structure, with clear business definitions available from the Metadata repository

• USABILITY - based on Kimball’s dimensional modelling techniques, ensuring usable, high-performance querying. Clear business definitions for all data available from the Metadata repository

• PERFORMANCE - “Changes only” approach supports rapid EL population. Star schema models support fast querying.

• FLEXIBILITY - Can easily be modified or partially rebuilt using historical data to support new or evolving requirements

• ANALYSIS - Uses SAS to support exploration, analysis and modelling to generate risk data, which is fed back into the BDS for internal and external reporting

• REPORTING TOOLS - Used directly against Enterprise Layer, or against OLAP cubes for rapid multidimensional analysis

Diagram (layers, top to bottom): User Layer; Subset - Aggregate; Enterprise Layer; Restructure - Cleanse; Historical Data Store (a set of HDS silos); Staging; data feeds.


The Historical Data Store



What is the HDS?

• The Historical Data Store is a key component of the Data Warehouse Architecture

• Addresses the challenges of

– Storing complete history of a data source

– Storing it efficiently

– Updating it quickly

– Supporting flexible development of high-performance, usable data marts in the EL

Diagram (layers, top to bottom, with the Historical Data Store highlighted): User Layer; Subset - Aggregate; Enterprise Layer; Restructure - Cleanse; Historical Data Store (a set of HDS silos); Staging; data feeds.


What is historical data?

• Data used for analysis and reporting can be divided into two broad categories

– Transactional data – i.e. records which represent events occurring in time, which tend to be written once and not subsequently updated, e.g. “financial transaction” or “contact event”

– Static data – i.e. records which represent persistent business entities, e.g. “customer”, “account”, “product”, which may change over time

• Previous states of static records, and the underlying events which are embodied in changes to static records, can be vital elements of business information

• New records, changes to existing records, and even the deletion of existing records are all events that can be useful for analysing trends, producing models, triggering customer communication, etc

• History of static data is generally not retained in operational systems. A change in a variable, such as a customer changing name or marital status, typically overwrites the variable, losing its previous value

• The HDS is designed to capture data regularly from source systems, and retain the current and all previous states of each static record over time. It also holds a full history of associated transactional data, enabling transactional records to be linked to the correct static record versions.


What is historical data?

• Representing historical static data requires the storage of multiple record “versions” for each source record

– Initial version

– Possibly one or more updated versions

– Possibly a “deleted record” version, indicating the record has been deleted

– Change data records referred to as “deltas”

• Each delta is time-stamped with a start and (for previous versions) an end date, representing the time range over which the delta is/was valid.

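Figure: original data versus historical data for one source record.

Original data (primary key; data): 123; J SMITH, 1, £3000

Historical data (deltas):

First version: 123; J SMITH, 0, £2500

Updated: 123; J SMITH, 0, £3000

Updated: 123; J SMITH, 1, £3000

Deleted: 123; J SMITH, 1, £3000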


How is historical data stored?

• The HDS stores historical data as follows

– For each source table, an equivalent table is constructed in the HDS containing “deltas”

– The fields in the HDS table include all the fields in the source table (except where specified as excluded by the developer), plus a “surrogate key”, and a number of control fields

– The surrogate key is the primary key in the HDS table. This is because the source primary key cannot be used, since multiple deltas for a single source record may be held, resulting in multiple records with the same source primary key.

– Control fields are used to manage multiple record versions

– In general, minimal transformations are performed on the data prior to load into the HDS – it is typically held in the same form as the received extract(s), or split in two to represent slow-changing (static) and fast-changing (transactional) data
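The table layout described above can be sketched as a Python record type. This is an illustrative assumption, not the actual HDS schema: the field names follow the control fields named on the slide (the checksum field is left out for brevity), the sample values mirror the slide's account 123 example, and the calendar dates standing in for "Mon", "Tue" and "Fri" are invented.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class Delta:
    surrogate_key: int            # primary key within the HDS table
    account_number: int           # source primary key; repeats across versions
    name: str
    monthly_payment: int
    start_date: date              # when this version became valid
    end_date: Optional[date]      # None while this version is still open
    previous_key: Optional[int]   # surrogate key of the preceding version
    next_key: Optional[int]       # surrogate key of the following version
    open_flag: bool
    action: str                   # "I" (insert), "U" (update) or "D" (delete)


def versions_of(deltas: List[Delta], account_number: int) -> List[Delta]:
    """Return every stored version of one source record, oldest first."""
    matching = [d for d in deltas if d.account_number == account_number]
    return sorted(matching, key=lambda d: d.start_date)


# Three versions of the same source record share one source key (123),
# which is why the source key cannot serve as the HDS primary key.
hds_account = [
    Delta(678, 123, "J SMITH", 300, date(2007, 3, 9), None, 567, None, True, "U"),
    Delta(456, 123, "J SMYTH", 250, date(2007, 3, 5), date(2007, 3, 6), None, 567, False, "I"),
    Delta(567, 123, "J SMITH", 250, date(2007, 3, 6), date(2007, 3, 9), 456, 678, False, "U"),
]
history = versions_of(hds_account, 123)
```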

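Figure: a staging record and its delta chain in the HDS.

Staging record: account number 123; name J SMITH; monthly payment £300.

HDS deltas (surrogate key; account number; name; monthly payment; checksum; start date; end date; previous key; next key; open flag; action):

456; 123; J SMYTH; £250; …; Mon; Tue; -; 567; 0; I

567; 123; J SMITH; £250; …; Tue; Fri; 456; 678; 0; U

678; 123; J SMITH; £300; …; Fri; -; 567; -; 1; U

Actions: I = Insert, U = Update, D = Delete.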


How is historical data stored?

• Each delta in the HDS has a time range over which it is/was valid, and is described as being one of three “action” types - an insert (i.e. a new record), an update, or a delete

• Each delta is linked to its previous and next incarnations, so that each source record is represented as an independent “chain” of deltas spanning the time of the source record’s existence (and possible subsequent deletion)
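As a rough illustration of the chain idea, the Python sketch below follows the previous and next links from any delta to recover the full history of its source record. The dictionary layout and the `chain_from` helper are hypothetical; the surrogate keys reuse the 456, 567, 678 example.

```python
from typing import Dict, List, Optional

# Hypothetical in-memory view of one HDS table, keyed by surrogate key.
deltas: Dict[int, Dict[str, Optional[object]]] = {
    456: {"action": "I", "previous_key": None, "next_key": 567},
    567: {"action": "U", "previous_key": 456, "next_key": 678},
    678: {"action": "U", "previous_key": 567, "next_key": None},
}


def chain_from(table: Dict[int, Dict[str, Optional[object]]], key: int) -> List[int]:
    """Return the surrogate keys of the whole chain containing `key`, oldest first."""
    # Rewind to the first delta in the chain (the insert).
    while table[key]["previous_key"] is not None:
        key = table[key]["previous_key"]
    # Walk forward until there is no next version.
    chain: List[int] = []
    current: Optional[int] = key
    while current is not None:
        chain.append(current)
        current = table[current]["next_key"]
    return chain


history = chain_from(deltas, 567)
actions = [deltas[k]["action"] for k in history]
```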

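Figure: delta chains for legacy keys A to E plotted against load numbers 1 to 9 (time). Each chain starts with an insert (I) and continues with updates (U); some chains end with a delete (D), and a deleted key can later be re-inserted.

Legend: I = Insert, U = Update, D = Delete.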


How is historical data stored?

• The “Open view” is used to represent the set of deltas that are/were “open” (i.e. representing current and currently deleted source records) at a particular point in time

• Normally it represents the open records at the current time

Figure: the same delta chains for legacy keys A to E across load numbers 1 to 9, with the Open View marked as the latest delta of each chain at the current load.


How is historical data stored?

• The Open view can be “rolled back” to a previous load cycle, allowing HDS users to “see” the data as it existed at any previous time-point
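A minimal Python sketch of the Open view and its roll-back follows. It assumes each delta records the load number at which it opened and, once superseded, the load number at which it closed, with the interval treated as half-open (valid from `start_load` up to but not including `end_load`). These field names, the half-open convention and the sample chains are assumptions, not the HDS implementation.

```python
from typing import List, NamedTuple, Optional


class Delta(NamedTuple):
    legacy_key: str
    action: str                 # "I", "U" or "D"
    start_load: int             # load cycle in which this delta was created
    end_load: Optional[int]     # load cycle in which it was superseded, if any


def open_view(deltas: List[Delta], as_of_load: Optional[int] = None) -> List[Delta]:
    """Deltas that are open now, or that were open at a previous load cycle."""
    if as_of_load is None:
        return [d for d in deltas if d.end_load is None]
    return [
        d for d in deltas
        if d.start_load <= as_of_load
        and (d.end_load is None or as_of_load < d.end_load)
    ]


# Hypothetical chains for three legacy keys.
hds = [
    Delta("A", "I", 1, 4), Delta("A", "U", 4, None),
    Delta("B", "I", 2, 7), Delta("B", "D", 7, None),
    Delta("C", "I", 8, None),
]

current = [(d.legacy_key, d.action) for d in open_view(hds)]
at_load_6 = [(d.legacy_key, d.action) for d in open_view(hds, as_of_load=6)]
```

Note that the current view still contains the delete delta for B, matching the slide's description of the Open view as covering current and currently deleted source records.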

Figure: the same delta chains for legacy keys A to E across load numbers 1 to 9, with the Open View rolled back to load number 6.


How is historical data loaded?

Figure: Change Data Capture compares new data from the data source with the HDS Open View.

New data: A £300; D £400; E £350

Open View: A £250; C £100; D £400

Detected changes: A £300 (U, update); E £350 (I, insert); C £100 (D, delete)

• Static data is loaded into the HDS using the “Change Data Capture” process. This compares the new set of data with the current Open view, and determines which changes have occurred (inserts, updates, deletes)
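The comparison can be sketched in a few lines of Python. This is a simplified illustration, not the actual Change Data Capture process: `detect_changes` is a hypothetical helper, records are reduced to a key and a single value, and the Open view is assumed to hold live records only. A real implementation might compare a checksum of each record rather than every field, which the checksum control field in the HDS suggests but the slides do not confirm.

```python
from typing import Dict, Tuple


def detect_changes(
    new_data: Dict[str, int], open_view: Dict[str, int]
) -> Dict[str, Tuple[str, int]]:
    """Compare a fresh extract with the current Open view.

    Returns {key: (action, value)} where action is "I", "U" or "D".
    Unchanged records produce no entry, so nothing new is stored for them.
    """
    changes: Dict[str, Tuple[str, int]] = {}
    for key, value in new_data.items():
        if key not in open_view:
            changes[key] = ("I", value)          # new record
        elif open_view[key] != value:
            changes[key] = ("U", value)          # existing record changed
    for key, value in open_view.items():
        if key not in new_data:
            changes[key] = ("D", value)          # record vanished from source
    return changes


# Values from the slide's example (amounts in pounds).
new_data = {"A": 300, "D": 400, "E": 350}
open_view = {"A": 250, "C": 100, "D": 400}
changes = detect_changes(new_data, open_view)
```

Record D is identical in both sets, so it generates no delta; this is what keeps storage and load time proportional to the volume of change rather than the size of the source table.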



Benefits and caveats

• Benefits

– Efficient storage - does not store multiple versions of the same record unless it has changed

– Avoids loss of data through cleansing process - full original versions of all data fields are retained in the HDS irrespective of whether they are fully or partially used, or even understood - this is a key requirement of the Data Feeds Architecture

– Flexible - No restrictions over “temporal granularity” - i.e. can capture changes every day if required, not just once per week/month/quarter

– Rapid data load - change data capture processes data very quickly

– High performance data mart load process (built in load cycle iterations, using indexes on appropriate control columns and keys)

• However…

• Architects must bear in mind that the HDS is not a suitable structure for general user querying. In addition to being “unclean”, data in the HDS does not conform to generally understood principles of relational integrity - i.e. relationships between tables rely on applying time constraints as well as foreign key joins. This results in sub-optimal performance and usability for some query types.

• The Enterprise Layer is provided in order to expose HDS data to general users in a usable, high-performance, clean structure.

• The HDS captures data from an early point in the development process and ensures that no data is lost through cleaning or restructuring, allowing the Enterprise Layer to be created and modified to support all foreseeable user query types
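To illustrate why HDS relationships need time constraints as well as foreign key joins, here is a small sketch. It assumes, purely for illustration, that each stored version carries the load number at which it opened and the load number at which it closed; the `Version` layout, `valid_at`, `join_as_of` and the sample data are hypothetical and not taken from the HDS design.

```python
from typing import NamedTuple, Optional


class Version(NamedTuple):
    key: str
    from_load: int            # load number at which this version opened
    to_load: Optional[int]    # load number at which it closed; None = still open
    attrs: dict


def valid_at(v: Version, load: int) -> bool:
    """Time constraint: was this version open at the given load number?"""
    return v.from_load <= load and (v.to_load is None or load < v.to_load)


def join_as_of(accounts, customers, load):
    """Join accounts to customers on customer_key, restricted to one load number."""
    cust = {c.key: c for c in customers if valid_at(c, load)}
    return [
        (a.attrs, cust[a.attrs["customer_key"]].attrs)
        for a in accounts
        if valid_at(a, load) and a.attrs["customer_key"] in cust
    ]


customers = [
    Version("C1", 1, 5, {"city": "Leeds"}),
    Version("C1", 5, None, {"city": "York"}),
]
accounts = [
    Version("A1", 1, None, {"customer_key": "C1", "balance": 300}),
]

at_load_3 = join_as_of(accounts, customers, 3)
at_load_6 = join_as_of(accounts, customers, 6)
```

A join on `customer_key` alone would pair the account with both customer versions; every query has to add the validity test, which is the usability and performance cost that the Enterprise Layer is meant to remove.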

Page 46: Architecting a BI Solution · 2007-03-06 · Architecting a BI Solution Erich Teichmann Technical Director, BT. ... BI platforms provide a common metadata layer that unifies data

The Enterprise Layer

Page 47: Architecting a BI Solution · 2007-03-06 · Architecting a BI Solution Erich Teichmann Technical Director, BT. ... BI platforms provide a common metadata layer that unifies data

What is the Enterprise Layer?

• Based on Kimball’s Dimensional Data Warehouse

• Presents data to users in a form that is

– Easy to use - no complex joins, meaning mistakes are less likely

– Consistent to use - all fact tables and dimensions are used in broadly the same way

– High-performance - each table only one join away from fact table - fact/dimension joins are consistently optimised by RDBMSs (including SQL Server)

– Supports straightforward load into cube formats

– Data mart build process accelerated using CDC

– Capability to partially or completely reconstruct EL from HDS if necessary

[Figure: layered architecture. Data feeds enter Staging and load the Historical Data Store (HDS silos); a Restructure - Cleanse step builds the Enterprise Layer; a Subset - Aggregate step builds the User Layer]

Page 48: Architecting a BI Solution · 2007-03-06 · Architecting a BI Solution Erich Teichmann Technical Director, BT. ... BI platforms provide a common metadata layer that unifies data

Kimball’s Dimensional Warehouse

[Figure: Data feeds enter Staging; a Restructure - Cleanse step loads the Data Warehouse]

Dimensional model - a collection of “star schemas” in which data is divided into measures (stored in fact tables) and context (stored in dimension tables), using surrogate keys to link records over time. Fact tables are independent of one another, whereas dimensions are “conformed” (shared between fact tables).

Users directly access the Data Warehouse, or use summary/subsetted data for enhanced performance or specialised needs

Page 49: Architecting a BI Solution · 2007-03-06 · Architecting a BI Solution Erich Teichmann Technical Director, BT. ... BI platforms provide a common metadata layer that unifies data

What is the Enterprise Layer?

• Methods include

– Dimensional modelling, a widely used, well-documented, structured method for star schema design

– Pervasive use of “surrogate keys” - integer primary keys unrelated to original source key, to enable correct treatment of historical data, reduce data volumes and improve performance

– Normalised fact tables (all categorical or textual data such as codes, account numbers, indicators, etc, are moved out to dimensions) reducing transaction data volumes

– Denormalised dimension tables - making dimensional data easier to use - e.g. combining customer with current residential address so that users don’t need to use an intersection table
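The role of surrogate keys in handling history can be sketched as follows. This is a minimal illustration, not the Enterprise Layer's actual load logic: the `Dimension` class, its `upsert` method and the customer data are hypothetical. Each change to a dimension record creates a new row with a fresh integer key, similar in spirit to what Kimball calls a Type 2 slowly changing dimension, so existing fact rows keep pointing at the version that was current when they were loaded.

```python
class Dimension:
    """A dimension table whose rows are identified by integer surrogate keys."""

    def __init__(self):
        self.rows = []      # row at index i has surrogate key i + 1
        self.current = {}   # source key -> surrogate key of the current version

    def upsert(self, source_key, attrs):
        """Return the surrogate key for these attributes, adding a version if needed."""
        sk = self.current.get(source_key)
        if sk is not None and self.rows[sk - 1]["attrs"] == attrs:
            return sk       # nothing changed: reuse the current version
        self.rows.append({"source_key": source_key, "attrs": dict(attrs)})
        sk = len(self.rows)
        self.current[source_key] = sk
        return sk

    def lookup(self, surrogate_key):
        return self.rows[surrogate_key - 1]["attrs"]


customer_dim = Dimension()
facts = []  # fact rows hold measures plus surrogate keys only

sk_first = customer_dim.upsert("CUST-17", {"city": "Leeds"})
facts.append({"customer_sk": sk_first, "balance": 300})

# The customer moves: a new dimension version is created with a new key.
sk_second = customer_dim.upsert("CUST-17", {"city": "York"})
facts.append({"customer_sk": sk_second, "balance": 250})
```

Because the fact row stores only a small integer, textual attributes such as the city live once in the dimension, which is the volume reduction the bullets above describe.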

Page 50: Architecting a BI Solution · 2007-03-06 · Architecting a BI Solution Erich Teichmann Technical Director, BT. ... BI platforms provide a common metadata layer that unifies data

SAS Integration

[Figure: SAS integration. Labels: Enterprise Layer (Party, Account, Treasury Deal, Account Group, Time, Product, Credit Risk Measures); Historical Data Store; SAS marts (Data Exploration Mart, Model Build Mart, Model Production Mart); ad-hoc load; production load; deploy model; model writeback; SAS/CONNECT to SQL Server]

Page 51: Architecting a BI Solution · 2007-03-06 · Architecting a BI Solution Erich Teichmann Technical Director, BT. ... BI platforms provide a common metadata layer that unifies data

Storage Summary

[Figure: storage summary across Online and Nearline tiers for the HDS and EL. Labels: static data (open; closed in last cycle; closed more than one cycle previously); transaction data (<= 14 days old; > 14 days old); dimensions; ETL; Archive Transactions]

Page 52: Architecting a BI Solution · 2007-03-06 · Architecting a BI Solution Erich Teichmann Technical Director, BT. ... BI platforms provide a common metadata layer that unifies data

Layer Sizing (5 years)

[Figure: Basel II Data Store architecture. Labels: sources (Borex ODS, Borex Master, CIS, Unisys TCSOUT, other ODS, other transactions, other static data); File Unpack & Decode; File Unpack & Format; Staging; HDS Staging; Change Data Capture; HDS silos; Enterprise Layer; User Layer (Reporting/OLAP Mart, Model Development Mart, Model Production Mart, Data Exploration Mart); Metadata Repository; Metadata Browser; Deploy Model; model output "write-back"]

ODS: 42GB (online)

HDS: 176GB (online), 1.4TB (nearline)

EL: 2.2TB (online)

UL: 230GB (online)

SAS: 2.5TB (online)

• Includes database overhead, not BCV backup space

Page 53: Architecting a BI Solution · 2007-03-06 · Architecting a BI Solution Erich Teichmann Technical Director, BT. ... BI platforms provide a common metadata layer that unifies data

Metadata architecture

[Figure: metadata architecture. Layers: Data feeds (FDE, Unisys, ICL, CIS, TMSRBS, Experian); Staging (complete refresh / deltas only); Historical Data Store (HDS silos); Restructure - Cleanse; Enterprise Layer; Subset - Aggregate; User Layer (reporting tools, OLAP cubes, SAS). Metadata covers: data definitions, data quality, transformations / data lineage, data models, contacts / ownership, data structure, business rules]

Page 54: Architecting a BI Solution · 2007-03-06 · Architecting a BI Solution Erich Teichmann Technical Director, BT. ... BI platforms provide a common metadata layer that unifies data

Metadata management process

• Processes for generating and maintaining Metadata are integrated into the development cycle

– Data analysis

– Design and build

– User Testing

– Live operation

• Users have full visibility of metadata, ensuring that they understand the data being accessed

• Metadata is held and managed using the CASE tool (Popkin), with extensions to support data lineage information

Page 55: Architecting a BI Solution · 2007-03-06 · Architecting a BI Solution Erich Teichmann Technical Director, BT. ... BI platforms provide a common metadata layer that unifies data

Summary

• A BI solution is an enterprise-wide initiative

• Realised by a long term project (with a life of its own…);

• Affecting all major systems;

• And users => STAKEHOLDER MANAGEMENT;

• There are various layers of storage;

• Front-end tools only provide the “tip of the iceberg”;

• It is imperative for survival of the enterprise.

Page 56: Architecting a BI Solution · 2007-03-06 · Architecting a BI Solution Erich Teichmann Technical Director, BT. ... BI platforms provide a common metadata layer that unifies data

Questions?

