04/19/23 Girish Tere, Lecturer (CS), TCSC
M. Sc. (CS/IT) Part I, Paper IV
Data Warehousing and Mining
Text Books:
  1. Paulraj Ponniah, “Data Warehousing Fundamentals”, John Wiley.
  2. W. H. Inmon, “Building the Data Warehouse”, Wiley Dreamtech.
  3. R. Kimball, “The Data Warehouse Toolkit”, John Wiley.
  4. Ralph Kimball, “The Data Warehouse Lifecycle Toolkit”, John Wiley.
The need for DW
  Understand the desperate need for strategic information
  Recognize the information crisis at every enterprise
  Distinguish between operational and informational systems
  Past attempts to provide strategic information
  The solution – Data Warehousing
Introduction
  What is your role in IT?
  Your IT experience
  Applications to run the business
    What do they do?
    What do they provide?
  What do executives require?
  Where is the strategic information required?
Organizations’ use of DW
  Retail: Customer Loyalty, Market Planning
  Financial: Risk Management, Fraud Detection
  Airlines: Route Profitability, Yield Management
  Manufacturing: Cost Reduction, Logistics Management
  Utilities: Asset Management, Resource Management
  Government: Manpower Planning, Cost Control
Understand the desperate need for strategic information
  Who needs strategic information in an enterprise?
  What is strategic information?
  Examples of Business Objectives
    Retain the present customer base
    Increase the customer base by 15% over the next 5 years
    Gain market share by 10% in the next 3 years
Examples of Business Objectives (cont…)
  Improve product quality levels in the top five product groups
  Enhance customer service level in shipments
  Bring three new products to market in 2 years
  Increase sales by 15% in the North East Division
Strategic Information (SI)
  Is it for running the day-to-day operations of the business?
  What is SI?
  Characteristics of SI
Characteristics of SI
  Integrated: Must have a single, enterprise-wide view
  Data Integrity: Information must be accurate and must conform to business rules
  Accessible: Easily accessible with intuitive access paths, and responsive for analysis
  Credible: Every business factor must have one and only one value
  Timely: Information must be available within the stipulated time period
The Information Crisis
  How much data is stored and available?
  Where is all this data? On which platforms? On one PC or across the network?
  The facts are:
    Organizations have lots of data
    IT resources and systems are not effective at turning this data into SI
Real Problem
  Most companies face an information crisis not because they lack sufficient data, but because the available data is not readily usable for strategic decision making.
  Why is this so?
    We need information integrated from all systems.
    Operational data is event driven.
    Operational data is not directly suitable for review from different viewpoints.
Technology Trends
  Name of the computer department in a company: “DP”, “MIS”, “IS”, “IT”
  Phenomenal growth of IT in areas like
    Computing Technology
    Human/Machine Interface
    Processing Options
  What technology does SI need?
Technology Trends (cont…)
  The user will ask a question and get the results…
  This interactive process continues
  Why is the provision of SI feasible now?
Opportunities and Risks
  What opportunities are available to companies from the possible use of SI?
  What threats and risks result from the lack of SI in companies?
Some Opportunities …
  SI required for Reliance Telecommunication industry
  SI required for ICICI Bank
  SI required for Mediclaim companies
  SI required for Apna Bazar
  A community-based pharmacy company
Some Risks …
  A car rental company (fleet management)
  A multinational company, a supplier of systems and components to the automobile industry (inconsistent data)
Failures of past DSS
  Example – A Chennai Branch is not … …
  You have to gather the data from multiple applications and start from scratch.
  To understand why IT failed to provide SI in the past, we need to consider how IT attempted to do this over the years.
Past DSSs
  Ad hoc reports
  Special extract programs
  Small applications
  Information centers
  DSS
  EIS (only programmed screens and reports were available)
Inability to provide information
  Figure 1.4
  IT receives too many ad hoc requests, resulting in a large overload.
  Requests keep changing
  Users ask for more and more reports
  Users have to depend on IT to provide the information
  You need a very flexible and conducive environment for providing information for strategic decisions. IT has been unable to provide such an environment.
Operational vs DSS
  What is the basic reason for the failure of all previous attempts by IT to provide SI?
  Do we need different types of systems?
Making the Wheels of Business Turn
  OLTP Systems
  Used to run the day-to-day core business of the company
Get the data in: Making the wheels of business turn
  Take an order
  Process a claim
  Make a shipment
  Generate an invoice
  Receive cash
  Reserve an airline ticket
Get the information out: Watching the wheels of business turn
  Show me the top-selling products
  Show me the problem regions
  Tell me why (drill down)
  Let me see other data (drill across)
  Show me the highest margins
  Alert me when a district sells below target
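The requests on this slide can be read as aggregate queries over sales data. A minimal sketch, assuming a hypothetical `sales` table (the regions, districts, products, and unit counts are invented for illustration), using Python's built-in `sqlite3` module:

```python
import sqlite3

# Hypothetical sample rows: (region, district, product, units)
SALES = [
    ("North East", "Boston", "Widget", 120),
    ("North East", "Albany", "Widget", 30),
    ("North East", "Boston", "Gadget", 50),
    ("South", "Atlanta", "Widget", 80),
    ("South", "Atlanta", "Gadget", 90),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (region TEXT, district TEXT, product TEXT, units INTEGER)"
)
conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", SALES)


def top_selling_products(conn, limit=3):
    """'Show me the top-selling products': total units per product."""
    return conn.execute(
        "SELECT product, SUM(units) AS total FROM sales "
        "GROUP BY product ORDER BY total DESC, product LIMIT ?",
        (limit,),
    ).fetchall()


def units_by_region(conn):
    """Summary level: total units per region."""
    return conn.execute(
        "SELECT region, SUM(units) AS total FROM sales "
        "GROUP BY region ORDER BY total DESC, region"
    ).fetchall()


def drill_down(conn, region):
    """'Tell me why': break one region's total down by district."""
    return conn.execute(
        "SELECT district, SUM(units) AS total FROM sales "
        "WHERE region = ? GROUP BY district ORDER BY total DESC, district",
        (region,),
    ).fetchall()


def districts_below_target(conn, target):
    """'Alert me when a district sells below target'."""
    return conn.execute(
        "SELECT district, SUM(units) FROM sales "
        "GROUP BY district HAVING SUM(units) < ? ORDER BY district",
        (target,),
    ).fetchall()
```

A user would typically start from `units_by_region(conn)`, pick a region that looks weak, and call `drill_down(conn, region)` to see which districts explain the figure.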
We need to design and build informational systems
  That serve different purposes
  Whose scopes are different
  Whose data content is different
  Where the data usage patterns are different
  Where the data access types are different
Operational and Informational Systems
                    Operational                  Informational
  Data Content      Current values               Archived, derived, summarized
  Data Structure    Optimized for transactions   Optimized for complex queries
  Access Frequency  High                         Medium to low
  Access Type       Read, update, delete         Read
  Usage             Predictable, repetitive      Ad hoc, random, heuristic
  Response Time     Milliseconds                 Many seconds
  Users             Large numbers                Relatively small numbers
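The "Data Content" and "Access Type" rows can be illustrated with a small sketch. It assumes a hypothetical `account` table (operational: one current value, updated in place) and a hypothetical `balance_history` table (informational: periodic snapshots that are only read); table names, dates, and amounts are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Operational side: holds only the current value of each account.
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance REAL)")
# Informational side: archived snapshots, appended periodically.
conn.execute(
    "CREATE TABLE balance_history (id INTEGER, snapshot_date TEXT, balance REAL)"
)
conn.execute("INSERT INTO account VALUES (1, 100.0)")


def post_transaction(conn, account_id, amount):
    """Operational access: a short update that overwrites the current value."""
    conn.execute(
        "UPDATE account SET balance = balance + ? WHERE id = ?",
        (amount, account_id),
    )


def take_snapshot(conn, snapshot_date):
    """Periodic load: copy current values into the history table."""
    conn.execute(
        "INSERT INTO balance_history SELECT id, ?, balance FROM account",
        (snapshot_date,),
    )


def balance_trend(conn, account_id):
    """Informational access: read-only query over historical data."""
    return conn.execute(
        "SELECT snapshot_date, balance FROM balance_history "
        "WHERE id = ? ORDER BY snapshot_date",
        (account_id,),
    ).fetchall()


take_snapshot(conn, "2023-01-31")
post_transaction(conn, 1, 50.0)
take_snapshot(conn, "2023-02-28")
post_transaction(conn, 1, -30.0)
take_snapshot(conn, "2023-03-31")

current_balance = conn.execute(
    "SELECT balance FROM account WHERE id = 1"
).fetchone()[0]
```

The operational table can only answer "what is the balance now?", while the history table can answer "how did the balance move over the quarter?".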
DW – The correct solution
  We need different types of DSS to provide SI
  Information required for strategic decision making is not available in operational systems
  A new environment is required for analysis, discerning trends, and monitoring performance
Features of the new environment:
  Database designed for analytical tasks
  Data from multiple applications
  Easy to use and conducive to long interactive sessions by users
  Read-intensive data usage
  Direct interaction with the system by the users without help from IT staff
  Content updated periodically and stable
  Content to include current and historical data
  Ability for users to run queries and get results online
  Ability for users to create reports
Processing requirements in the new environment (analytical processing requirements)
  Running simple queries and reports against current and historical data
  Ability to perform “what if” analysis
  Ability to query, analyze, and query again, repeating this process as many times as required
  Spot historical trends and mistakes, and apply or correct them for future results
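A “what if” analysis can be as simple as re-running a calculation with one input changed. A minimal sketch, assuming hypothetical regional sales figures and a hypothetical helper `what_if`; the 15% figure echoes the earlier objective of increasing sales in the North East Division:

```python
# Hypothetical current sales by region (units are arbitrary).
BASELINE = {"North East": 200, "South": 170}


def total_sales(sales_by_region):
    """Sum sales across all regions."""
    return sum(sales_by_region.values())


def what_if(sales_by_region, region, pct_change):
    """Return a copy of the figures with one region changed by pct_change percent.

    The baseline is left untouched so several scenarios can be compared.
    """
    scenario = dict(sales_by_region)
    scenario[region] = scenario[region] * (100 + pct_change) / 100
    return scenario


# Ask, look at the answer, then ask again with a different assumption.
scenario_up = what_if(BASELINE, "North East", 15)
scenario_down = what_if(BASELINE, "South", -10)
```

Comparing `total_sales(BASELINE)` with `total_sales(scenario_up)` gives the effect of the assumed change on the overall figure.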
BI (business intelligence) at DW
  The needed environment is the DW
  It is kept separate from the system environment supporting the day-to-day operations
  The DW contains BI.
[Figure: Operational systems (basic business processes) -> extraction, cleansing, aggregation -> data transformation -> data warehouse (key measurements, business dimensions)]
E.g. of BI at DW
  DW containing units of sales stored along business dimensions
  Important: Data staging area
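"Units of sales stored along business dimensions" can be pictured as one measurement (units) recorded for each combination of dimension values. A minimal sketch, assuming three hypothetical dimensions (product, region, month) and invented figures:

```python
from collections import defaultdict

# Each fact: one measurement (units) at one point in the dimension space.
# Dimension names and values are hypothetical.
FACTS = [
    {"product": "Widget", "region": "North East", "month": "2023-01", "units": 120},
    {"product": "Widget", "region": "North East", "month": "2023-02", "units": 100},
    {"product": "Gadget", "region": "North East", "month": "2023-01", "units": 50},
    {"product": "Widget", "region": "South", "month": "2023-01", "units": 80},
    {"product": "Gadget", "region": "South", "month": "2023-02", "units": 90},
]


def roll_up(facts, *dims):
    """Total the units measurement along the chosen dimensions.

    Keys of the result are tuples of dimension values, in the order given.
    Calling it with no dimensions yields the grand total under the key ().
    """
    totals = defaultdict(int)
    for row in facts:
        key = tuple(row[d] for d in dims)
        totals[key] += row["units"]
    return dict(totals)


by_product = roll_up(FACTS, "product")
by_region_month = roll_up(FACTS, "region", "month")
grand_total = roll_up(FACTS)
```

The same stored facts answer questions by product, by region and month, or in total, which is the point of organizing the measurement along dimensions.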
Definition of DW
  DW is an informational environment that
    Provides an integrated and total view of the enterprise
    Makes the enterprise’s current and historical information easily available for decision making
    Makes decision-support transactions possible without burdening operational systems
    Renders the organization’s information consistent
    Presents a flexible and interactive source of strategic information
DW concept
  Is not to generate fresh data
  Is to make use of the large volumes of existing data and to transform them into forms suitable for providing SI
  Take all the data you already have in the organization, clean and transform it, and then use it to provide SI
DW – An Environment, Not a Product
  It is a user-centric and user-driven environment
  An ideal environment for data analysis and decision support
  Constantly changing, flexible and interactive
  Useful for the ask-answer-ask-again pattern
  Provides the ability to discover answers to complex, unpredictable questions
The basic concept of DW is:
  Take all the data from the operational systems
  Where necessary, include relevant data from outside, such as industry benchmark indicators
  Integrate all the data from the various sources
  Remove inconsistencies and transform the data
  Store the data in formats suitable for easy access for decision making
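The steps above can be sketched as a tiny extract-transform-load routine. This is only an illustration under stated assumptions: the two source systems, their field names, and their formats (customer codes, date layouts, rupees versus paise) are hypothetical, chosen to show the kind of inconsistency that has to be removed.

```python
from datetime import datetime

# Hypothetical extracts from two operational systems with inconsistent formats.
ORDERS_SYSTEM = [
    {"cust": "C001", "amount": "250.00", "date": "19/04/2023"},
]
BILLING_SYSTEM = [
    {"customer_id": "c-001", "amt_paise": 12550, "billed_on": "2023-04-20"},
]


def transform_order(row):
    """Convert an order-system row to the common warehouse format."""
    return {
        "customer_id": row["cust"].upper(),
        "amount": float(row["amount"]),
        # Source uses DD/MM/YYYY; the warehouse stores ISO dates.
        "date": datetime.strptime(row["date"], "%d/%m/%Y").date().isoformat(),
        "source": "orders",
    }


def transform_billing(row):
    """Convert a billing-system row to the common warehouse format."""
    return {
        "customer_id": row["customer_id"].replace("-", "").upper(),
        "amount": row["amt_paise"] / 100,  # paise to rupees
        "date": row["billed_on"],  # already an ISO date
        "source": "billing",
    }


def build_warehouse(orders, billing):
    """Integrate both sources into one consistently formatted list of rows."""
    rows = [transform_order(r) for r in orders]
    rows += [transform_billing(r) for r in billing]
    return sorted(rows, key=lambda r: (r["customer_id"], r["date"]))


warehouse = build_warehouse(ORDERS_SYSTEM, BILLING_SYSTEM)
```

After loading, both rows refer to the same customer under one identifier, so a query per customer no longer needs to know which system a record came from.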
DW involves the following functions
  Data extraction
  Loading the data
  Transforming the data
  Storing the data
  Providing UI
Technologies used in DW
  Data Quality
  Data Modeling
  Data Acquisition
  Data Management
  Metadata Management
  Administration
  Analysis
  Applications
  Development Tools
  Storage Management
Match the columns
  1. information crisis      A. OLTP application
  2. SI                      B. produce ad hoc reports
  3. operational systems     C. explosive growth
  4. information center      D. despite lots of data
  5. DW                      E. data cleaned and transformed
  6. order processing        F. users go to get information
  7. EIS                     G. used for decision making
  8. data staging area       H. environment, not product
  9. extract programs        I. for day-to-day operations
  10. IT                     J. simple, easy to use
Class Test
  1. What do you mean by SI? For a commercial bank, name five types of strategic objectives.
  2. Do you agree that a typical retail store collects huge volumes of data through its operational systems? Name three types of transaction data likely to be collected by a retail store in large volumes during its daily operations.
  3. Why were all the past attempts by IT to provide SI failures? List three concrete reasons and explain.
  4. Differentiate between operational systems and informational systems.
  5. List characteristics of the computing environment needed to provide SI.
  6. What types of processing take place in a DW?
  7. A DW is an environment, not a product. Discuss.
Class Test (cont…)
  8. You are the IT Director of a nationwide insurance company. Write a memo to the VP explaining the types of opportunities that can be realized with readily available strategic information.
  9. For an airline company, how can SI increase the number of frequent flyers? Discuss, giving specific details.