An Introduction to Data Warehousing Concept and Technology

Mort Anvari

Post on 31-Mar-2015


Topics

Data Warehousing Concept
Data Access Technology
Enterprise Real-Time Knowledge Architecture for Data Warehousing
Data Collection and Delivery

M. Anvari Page 3

Benson & Parker’s “Square Wheel”

Business Environment
Technology Environment
Business Planning
Business Operations
Technology Planning
Technology Operations
Alignment
Impact
Organization
Opportunity

Information Technology has to do more than just align itself with the business; it has to help the business have the maximum impact in the marketplace.

Data Access and Delivery System

Technology Evolution

New classes of computers
New classes of communications
New classes of technology (image, sound, video, multimedia)
New classes of software
Much more complex technical environment
    Cooperative Processing/Client-Server
    Distributed Data Bases
    LANs, WANs, etc.
Obsolescence Problem Multiple Legacy Systems

IT Impact on Business

[Diagram labels: HP, IBM, DEC, Compaq]

Enterprise Network Computing and Client/Server Technology are changing the way organizations look at all of their information systems.

Data Jail
Obsolescence
IT Wastes

The Existing Enterprise

Support Existing Products
Support Existing Customers
Support Existing Organization
Support Existing Workforce
Support Existing Technology

Controlling the (Global) Real-time Organization

RTO = 24 x 7 x E
(Where E means every major market)

Information and the Enterprise

Organizational needs for data
Organizational needs for information
Organizational needs for knowledge

Needs for Data

Data = Values (Measurements)

Data to operate
Data to control
Data to plan

Needs for Information

Information = Content + Structure (Relationships)

Structure of the Real-world
Relating data to the business
    Cross functional processes
Relating data to the real world
    External DB
    External Data Feeds (D&B, Reuters, etc.)
    Text, Image, Voice, Video, etc.
    Statistical Studies

Needs for Knowledge

Knowledge = Goals + Actions + Learning

Learning more about our business
Learning more about our market
Learning more about the business environment

Knowledge is the area in which Data Warehousing and Data Mining are potentially critical technologies.

Data, Information and Knowledge

Data Centers
Information Centers
Knowledge Centers

Data Bases
Information Bases
Knowledge Bases

Old Data Never Dies

Note that none of the early computing styles have ever gone away!

[Timeline chart, 60s to 90s: Batch, On-line, Minis, PCs, Networking, Enterprise Computing (Peer to Peer, Network to Network)]

Operational vs. Informational Systems

Information Access Today

Operational Systems: Mfg., Ord. Entry
Informational Systems: Estimating & Analysis, Marketing Systems, Product Planning
Information Delivery System

Data Warehousing is fundamentally an issue of Enterprise Data Architecture.

[Diagram built up over successive slides. Labels: Operational Systems, Information Delivery System, Informational Systems, Data Warehouse, Data Marts, Data Garages, External Data, External Users]

End User Evolution

Data Base Management Systems users
Ad Hoc Reports users
End User Systems
Decision Support Systems
Executive Information Systems
Information Centers

Today’s Customer Demands Automated Real-Time Response.

Ways to Organize Data

Tables              Flexible, Simple
Hierarchies         Speed, Natural Reporting
Networks            Multiple Directions, Complex Structure
Lists               Updating Complex Structure
Matrices / Array    Manipulate Multiple Dimensions
Inverted Files      Unplanned queries, text retrieval
Objects             Complex structures, hide structure

Multidimensional Data Bases (Data Warehousing)
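As a rough illustration of why inverted files suit unplanned queries and text retrieval, the sketch below builds a word-to-record index in Python and answers an ad hoc multi-word lookup without scanning every record. The sample records and the function names `build_inverted_index` and `search` are hypothetical, introduced only for this example. Real inverted-file systems add tokenization, ranking and persistent storage that are not shown here.

```python
from collections import defaultdict


def build_inverted_index(documents):
    """Map each lowercase word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def search(index, *words):
    """Return ids of documents containing every query word (an AND query)."""
    postings = [index.get(word.lower(), set()) for word in words]
    return set.intersection(*postings) if postings else set()


# Hypothetical sample records, invented for illustration.
documents = {
    1: "late shipment complaint from retail customer",
    2: "retail customer requests product catalog",
    3: "supplier shipment arrived on schedule",
}
index = build_inverted_index(documents)

print(sorted(search(index, "retail", "customer")))
print(sorted(search(index, "shipment")))
```

The index is built once, so a query nobody planned for costs only a few set lookups and an intersection.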

End User Computing Evolution

Tool or Technique               Strengths
File Access Systems             Physical access to data
Network DB                      Support for complex data interrelations
Hierarchical DB                 Support for hierarchical views of data
Inverted File DB                Support for unplanned inquiry, esp. text
Relational DB                   Flexibility and ease of updating
Report Generator                Support for simple ad hoc reporting
Query Language                  Support for simple ad hoc inquiries
4GL                             Ability to develop simple systems easily
Decision Support System         Ability to support financial and statistical data analysis
Executive Information System    Ability to present information to executives
Information Center              Support for end users trying to access enterprise information

Data Warehousing

A Data Warehouse can be thought of as an automated version of the Information Center that was widely popular in the mid-1980s, or even, ultimately, as the automation of Information Resource Management. While technologies such as client-server have begun to put enormous computing and graphics power in the hands of individuals, these technologies have not, in general, provided the link to the operational data that end users need to make critical business decisions.

Data Warehouse Requirements

Support for Universal Access to Multi-platform Data Bases
Support for Multiple User Types
Separation of Operational and Informational Concerns
Support for Networked Data
Support for Directories, Repositories and Information Models
Support for Advanced End User Interfaces

Access to Heterogeneous Data

[Diagram labels: HP, IBM, DEC, Compaq]

Multiple User Types (Knowledge workers)

Top Executives
Managers
Analysts
Planners
Product Developers
Consultants
Lawyers
etc.

Separation of Operational and Informational Concerns

Operational Systems
    Response Time
    Reliability
    Security
    Recoverability

Informational Systems
    Flexibility, Performance, Ease of Navigation
    Large numbers of different views
    Manage Huge Amounts of Data (VLDBs)
    Need to drill down/drill thru into data
    Need to draw on data from many sources
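To make the "drill down" requirement concrete, here is a minimal Python sketch that first summarizes sales by region and then opens up one region by product. The fact rows, the dimension names and the `summarize` helper are all assumptions made for illustration. A real informational system would run this kind of query against a multidimensional or relational store, not an in-memory list.

```python
from collections import defaultdict

# Hypothetical fact rows: (region, product, quarter, sales amount).
FACTS = [
    ("East", "Widgets", "Q1", 100),
    ("East", "Widgets", "Q2", 150),
    ("East", "Gadgets", "Q1", 80),
    ("West", "Widgets", "Q1", 120),
    ("West", "Gadgets", "Q2", 60),
]
DIMENSIONS = ("region", "product", "quarter")


def summarize(facts, group_by, filters=None):
    """Total sales grouped by the named dimensions, after applying filters."""
    filters = filters or {}
    totals = defaultdict(int)
    for row in facts:
        record = dict(zip(DIMENSIONS, row[:3]))
        if all(record[dim] == value for dim, value in filters.items()):
            key = tuple(record[dim] for dim in group_by)
            totals[key] += row[3]
    return dict(totals)


# Top-level view: one total per region.
by_region = summarize(FACTS, ["region"])
# Drill down: break the East region out by product.
east_by_product = summarize(FACTS, ["region", "product"], {"region": "East"})

print(by_region)
print(east_by_product)
```

Each drill-down step only adds a dimension to `group_by` and a filter on the level above. Supporting many such views cheaply is what separates informational from operational workloads.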

Support for Networked Data

The data required to support an informational need often does not reside in a single operational data base. The data needed for Labor Negotiations, for example, may come from a variety of operational data bases, such as Manufacturing, Personnel, and Accounting.

Distributed Systems
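The labor-negotiations example can be sketched in Python with SQLite, using three attached in-memory databases as stand-ins for separate Manufacturing, Personnel and Accounting systems. Every table, column and figure below is hypothetical. Real operational data bases would usually sit on different platforms and be reached through gateways or a staging step instead of a single connection.

```python
import sqlite3

# One connection with three attached in-memory databases standing in for
# separate operational systems (all names and figures are hypothetical).
conn = sqlite3.connect(":memory:")
for name in ("manufacturing", "personnel", "accounting"):
    conn.execute(f"ATTACH DATABASE ':memory:' AS {name}")

conn.executescript("""
    CREATE TABLE personnel.employees (emp_id INTEGER PRIMARY KEY, plant TEXT);
    CREATE TABLE accounting.payroll (emp_id INTEGER, annual_wage REAL);
    CREATE TABLE manufacturing.production (plant TEXT, units INTEGER);
    INSERT INTO personnel.employees VALUES (1, 'Plant A'), (2, 'Plant A'), (3, 'Plant B');
    INSERT INTO accounting.payroll VALUES (1, 40000), (2, 44000), (3, 50000);
    INSERT INTO manufacturing.production VALUES ('Plant A', 8400), ('Plant B', 2500);
""")


def labor_cost_per_unit(connection):
    """Join the three sources to get wage cost per unit produced, by plant."""
    return connection.execute("""
        SELECT e.plant, SUM(p.annual_wage) / o.units AS cost_per_unit
        FROM personnel.employees AS e
        JOIN accounting.payroll AS p ON p.emp_id = e.emp_id
        JOIN manufacturing.production AS o ON o.plant = e.plant
        GROUP BY e.plant, o.units
        ORDER BY e.plant
    """).fetchall()


print(labor_cost_per_unit(conn))
```

No single source can answer the question alone: headcount lives in Personnel, wages in Accounting and output in Manufacturing.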

Support for Advanced End User Interfaces

Dimensions of Data Warehousing

Performance
Flexibility
Scalability
Ease of Use
Quality
Connection to the Operational Data
Distributed Data
Security

Enterprise Knowledge Architecture for Data Warehousing

Operational vs. Informational Systems

Operational Systems
Informational Systems
Information Delivery System

Enterprise Network Computer Architecture

[Architecture diagram. Components: Information Access, Data Access, Data Directory (Repository), Data Directory Functions, Process Management, Application Messaging, Data Warehouse, Data Mart, Data Staging, Operational DBs, External DBs. The figure also contains a map graphic of U.S. states from Maine to Virginia.]

Freeing the “Data in Jail”

[Architecture diagram repeated]

The Information Access Layer

[Architecture diagram repeated]

The Legacy Data Layer

[Architecture diagram repeated]

The External Data Layer

[Architecture diagram repeated]

The Data Access Layer

[Architecture diagram repeated over four slides, with added labels: Data Access Filter, SQL Queries, SQL Answers]

Application Messaging

[Architecture diagram repeated]

The Meta-Data Repository Layer

[Architecture diagram repeated]

The Process Management Layer

[Architecture diagram repeated]

The Core Data Warehouse

[Architecture diagram repeated]

Data Staging and Quality

[Architecture diagram repeated]

Data Mart (Post-process/Indexing)

[Architecture diagram repeated, with added label: Post-Proc. & Indexing]

Goals of Warehouse

1. Performance (Canned queries, MD Analysis, Ad hoc, Impact on Operational System)
2. Flexibility (MD Flex, Ad hoc, Change data structure)
3. Scalability (No. of Users, Volume of Data)
4. Ease of Use (Location, Formulation, Navigation, Manipulation)
5. Data Quality (Consistent, Correct, Timely, Integrated)
6. Connection to the Detail Business Transactions

M. Anvari Page 60

Virtual WarehouseVirtual WarehouseVirtual WarehouseVirtual Warehouse

Data Directory(Repository)

InformationAccess

Data Warehouse

DataStaging

Application Messaging

OperationalDBs

Lake Er ie

L a ke O n ta ri o

PennsylvaniaCT

MA

NJ

DE

RI

MD

Maine

North

NHVT

New York

Virginia W

est

Virg

inia

Process Management

Data Directory Functions

External DBs

DataAccess

A Virtual Data Warehouse approach is often chosen when there are infrequent demands for data and management wants to determine if/how users will use operational data.

One of the weaknesses of a Virtual Data Warehouse approach is that user queries are made against operational DBs.

One way to minimize this problem is to build a “Query Monitor” to check the performance characteristics of a query before executing it.
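A minimal sketch of what such a Query Monitor might look like, assuming SQLite as a stand-in for the operational DB and "no full scan" as the only performance rule. The `orders` table, the index, and the function names are hypothetical and not from the slides; a real monitor would likely use the cost estimates of its own DBMS.

```python
import sqlite3


class QueryRejected(Exception):
    """Raised when the monitor refuses to run a query."""


def full_scans(conn, sql, params=()):
    """Return the plan steps that scan a whole table or index."""
    # EXPLAIN QUERY PLAN asks SQLite for its plan without running the query.
    plan = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each plan row is a human-readable description.
    return [row[-1] for row in plan if row[-1].startswith("SCAN")]


def monitored_query(conn, sql, params=()):
    """Run the query only if its plan contains no full scan."""
    scans = full_scans(conn, sql, params)
    if scans:
        raise QueryRejected("query would scan: " + "; ".join(scans))
    return conn.execute(sql, params).fetchall()


# Hypothetical operational table, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
)
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
conn.executemany(
    "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
    [(7, 10.0), (7, 25.0), (9, 5.0)],
)

# Uses the index on customer_id, so the monitor lets it through.
indexed = monitored_query(
    conn, "SELECT amount FROM orders WHERE customer_id = ?", (7,)
)
```

A query such as `SELECT * FROM orders WHERE amount > 1` has no usable index here, so `monitored_query` raises `QueryRejected` instead of loading the operational DB.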

Distributed Data Warehouse

A Distributed Data Warehouse is similar in most respects to a Central Data Warehouse, except that the data is distributed to separate mini-Data Warehouses (Data Marts) on local or specialized servers.
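As a rough illustration of the distribution step, the sketch below splits central warehouse rows into one Data Mart per region. The row layout, the `region` key, and the function name are assumptions made for the example; the slides do not say how data is partitioned across marts.

```python
from collections import defaultdict


def build_data_marts(warehouse_rows, key):
    """Group central warehouse rows into one mart per value of `key`."""
    marts = defaultdict(list)
    for row in warehouse_rows:
        marts[row[key]].append(row)
    return dict(marts)


# Hypothetical central warehouse rows.
central = [
    {"region": "East", "product": "A", "sales": 120},
    {"region": "West", "product": "A", "sales": 80},
    {"region": "East", "product": "B", "sales": 45},
]

marts = build_data_marts(central, "region")

# A local query now touches only the rows held by its own mart.
east_total = sum(row["sales"] for row in marts["East"])
```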

Information Access Tools

Desktop DBs

Spreadsheets

4GL/Desktop Query Tools

Decision Support Systems (DSS)

Multi-dimensional DBs (MDDs)

OLAP (On-line Analytical Processing)

Executive Information Systems (EIS)

Data Visualization Tools

Data Mining Tools

Business Modeling and Simulation Tools

Data Warehousing Tools and Technology

Desktop Databases:

• Structured for Database Manipulation
• Provides a facility for selecting and loading Desktop DBs from Informational DBs
• Provides the ability to create highly “Personalized” Informational Systems

Examples
• Access
• Paradox
• dBase/FoxPro/Clipper

Enterprise Network Computer Architecture

Spreadsheets:

• Structured to get any subset of Information
• Ability to interface with standard Spreadsheet tools

Examples
• Excel
• 1-2-3
• Quattro Pro

Ad Hoc Query Systems:

• Tailored for Flexible Reporting
• Ability to do Sophisticated Analysis Functions
• Aimed at a variety of users, from the casual to the power user

Examples
• Focus for Windows (IBI)
• SAS
• Business Objects
• GQL (Andyne)
• Esperant (Software AG)
• Forest & Trees (Platinum)
• Visualizer (IBM)
• Impromptu (Cognos)
• Beacon (Prodea)

Multi-dimensional Databases (MDDB)
OLAP (On-line Analytical Processing):

• Highly Structured Data
• Tailored for Financial Modeling
• Tailored for “Power Users”
• Ability to do Sophisticated Financial “What-if” Analysis
• Ability to “drill-down” from high-level to Detail Data

Examples
• Acumate (Kenan Tech.)
• Beacon (Prodea)
• CrossTarget (Dimensional Insight)
• eSSbase (Arbor)
• Oracle Express (Oracle)
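The “drill-down” idea listed above can be sketched in a few lines, assuming detail records with two dimensions (region, product) and one measure. The data and function names are hypothetical; the products named on the slide precompute and store such totals rather than scanning detail rows on each request.

```python
from collections import defaultdict

# Hypothetical detail records: (region, product, amount).
detail = [
    ("East", "Widgets", 100),
    ("East", "Gadgets", 50),
    ("West", "Widgets", 70),
    ("West", "Widgets", 30),
]


def roll_up(rows, level):
    """Total the amount over the first `level` dimensions."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[:level]] += row[-1]
    return dict(totals)


def drill_down(rows, prefix):
    """Break one summary cell into totals one level below it."""
    matching = [r for r in rows if r[:len(prefix)] == prefix]
    return roll_up(matching, len(prefix) + 1)


by_region = roll_up(detail, 1)               # high-level view
east_by_product = drill_down(detail, ("East",))  # one level of detail
```

Here `by_region` holds one total per region, and `east_by_product` splits the East total into its Widgets and Gadgets components.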

Executive Information Systems (EIS):

• Highly Structured Data
• Tailored for Non-technical Users
• Ability to “slice and dice” data
• Ability to “drill-down”

Examples
• Commander OLAP Server
• Pilot (Lightship)
• VB
• PowerBuilder

Data Visualization:

• Automatic Categorization
• Visualization of Multi-dimensional data
• Automatic Analysis and/or Indexing

Examples
• WinViz (IBI)
• dbExpress (Computer Concepts)
• Data Explorer (IBM)
• ARC Info/ARC View
• Strategic Mapping

Data Mining:

• High Speed Analysis of Detail Data
• Constructs Business Patterns
• Provides Statistical Support

Examples
• IBM beta-test
• Information Harvester
• IDIS
• d.b.Express
• DataMind

Business Modeling and Simulation:

• Business Feedback Model
• Direct Manipulation
• Business Gaming
• Management/Operations Training

Examples
• SimRefinery
• SimTelephone
• iThink
• Microworlds

3. Meta-data Repository Layer

Data Dictionary/Repository

• Meta-data Modeling
• Meta-data Updating
• Meta-data

Examples
o Platinum
o Rochade
o MSP
o Data Atlas (IBM)
o MS/TI
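To make the modeling and updating functions concrete, here is a small sketch of a repository that records, for each warehouse column, its type and source. The class names, fields, and the `lineage` lookup are illustrative assumptions and do not describe how any of the listed products store meta-data.

```python
from dataclasses import dataclass, field


@dataclass
class ColumnMeta:
    name: str
    data_type: str
    source: str            # operational field the column is loaded from
    description: str = ""


@dataclass
class Repository:
    tables: dict = field(default_factory=dict)

    def define(self, table, column):
        """Meta-data modeling: register a column under a table."""
        self.tables.setdefault(table, {})[column.name] = column

    def update(self, table, name, **changes):
        """Meta-data updating: change attributes of a known column."""
        column = self.tables[table][name]
        for attr, value in changes.items():
            if not hasattr(column, attr):
                raise AttributeError(attr)
            setattr(column, attr, value)

    def lineage(self, table):
        """Map each column of a table to its source (added for illustration)."""
        return {c.name: c.source for c in self.tables[table].values()}


# Hypothetical warehouse table and source field.
repo = Repository()
repo.define("sales", ColumnMeta("amount", "DECIMAL", "orders.total"))
repo.update("sales", "amount", description="Order total in USD")
```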

3. Process (Systems) Management

Process Management

• Scheduling
• Execution
• Subscription

Examples
o Data Harvester
o Data Hub
o Detect and Alert (Comshare)
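The three functions above could fit together roughly as follows. This is a sketch under simple assumptions: time is an integer tick, jobs are plain callables, and subscribers are callbacks notified when a named job completes. The class and job names are hypothetical.

```python
import heapq


class ProcessManager:
    def __init__(self):
        self._queue = []          # entries: (run_at, sequence, name, job)
        self._subscribers = {}    # job name -> list of callbacks
        self._sequence = 0        # tie-breaker so jobs are never compared

    def schedule(self, run_at, name, job):
        """Scheduling: queue a job to run at a given time tick."""
        heapq.heappush(self._queue, (run_at, self._sequence, name, job))
        self._sequence += 1

    def subscribe(self, name, callback):
        """Subscription: ask to be told when a named job finishes."""
        self._subscribers.setdefault(name, []).append(callback)

    def run_until(self, now):
        """Execution: run every job due at or before `now`, in time order."""
        while self._queue and self._queue[0][0] <= now:
            _, _, name, job = heapq.heappop(self._queue)
            result = job()
            for callback in self._subscribers.get(name, []):
                callback(name, result)


log = []
pm = ProcessManager()
pm.subscribe("load_sales", lambda name, result: log.append((name, result)))
pm.schedule(2, "load_sales", lambda: "42 rows")
pm.schedule(1, "extract_orders", lambda: "ok")
pm.schedule(5, "nightly_index", lambda: "done")
pm.run_until(3)   # runs extract_orders, then load_sales; nightly_index waits
```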

3. Post-processing/Indexing Layer

Post-processing/Indexing

Examples
• Sybase IQ Accelerator
• OMNIdex
• Oracle 7.3
• eSSbase
• IRI Express