Date post: | 22-Dec-2015 |
1
Dalhousie University
CogNova Technologies
Business Intelligence through Data Mining
with Daniel L. Silver
Copyright (c) 1999, All Rights Reserved
2
About myself ...
  Ph.D. in Comp. Sci./Machine Learning, UWO
  Chair-Associate, Business Informatics, Faculty of Management, Dalhousie University
  Founder of CogNova Technologies (London, 1993)
  London Health Science Center, 3M, London Life, MT&T, NSPI, QEII Health Science Center
My Objective ...
  To discuss data warehousing and data mining within the context of knowledge management and business intelligence.
3
CogNova Technologies Offers
  Consultation - situation analysis and requirements definition, selection of third party systems, project management, and trouble shooting
  Services - installation and application of third party software, data analysis and model generation using CogNova proprietary systems, summary and analysis of results
  Education - courses and seminars on the theory and application of data mining technologies, and the knowledge discovery process
  Research - investigation and development of advanced machine learning systems and the application of KDD practices
4
Outline
  Introduction
  Knowledge Management and Business Intelligence
  Knowledge Discovery Process
  Data Warehousing and Data Mining
  Opportunities, Benefits, Costs
5
Introduction - The Buzz Words
  Hype vs. Reality
  Knowledge Management
  Business Intelligence
  Data Warehouse, Corp. Repository, Data Mart
  Knowledge Creation or Discovery
  Data Mining
6
Introduction - Motivation
[Figure: the Organization and the forces acting on it: Global Opportunities, Customer Demands, Regulatory Change, Technological Change, Employee Turn-over, Competition]
7
Introduction - Rationale
[Figure: Management of Organizational Knowledge, surrounded by the labels Gov’t Reg., Competitors, Customers, Channels, Partners, Suppliers, Employees, Products, Services]
8
The Knowledge Management Cycle
[Figure: cycle diagram centred on INFORMATION (Storage, Processing, Communication). Stage labels: Observation and Analysis, Theory Generation, Testing and Application, Knowledge Consolidation. Other labels: Environmental data; Problems, Opportunities; Approach, Methods, Results; Information; “Business Intelligence”]
9
KM and Business Intelligence
Why should it matter to you?
  Knowledge becoming substantial asset
  Maximum sharing of information
  Employees leave, business value remains
  Betterment of internal and external structures, personal competencies
  Competitive advantage - leading organizations now adopting
10
KM and Business Intelligence
Key Solution Components:
  Internet / Intranet & Groupware
  Document management systems
  EDI - Electronic Data Interchange
  E-Commerce methods
  Data Warehousing
  Data Mining
11
Knowledge Management
information => <= people
Technology Centred
  Info. Technologists
  info. and comp. sciences, database, telecomm., analysis
  KM = objects
  explicit knowledge - easily encoded
People Centred
  Org. Theorists
  org. behavior, group dynamics, HCI, psychology
  KM = process
  tacit knowledge - difficult to encode
12
Knowledge Management
Intellectual Capital
  Human Capital = Knowledge + Capabilities + Skill
  Structural Capital = Everything that remains after the employees go home
  Intellectual Capital = Human Capital + Structural Capital
  Intellectual Capital = Market Value - Book Value (e.g. Microsoft’s MV = 15 * BV)
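As a worked instance of the last identity, taking the slide's example ratio MV = 15 * BV at face value (the ratio is the slide's figure, not an independently checked one):

```latex
\begin{align*}
\text{Intellectual Capital} &= \text{MV} - \text{BV} \\
                            &= 15\,\text{BV} - \text{BV} \\
                            &= 14\,\text{BV}
\end{align*}
```

Under this definition, 14/15 (about 93%) of the market value in the example would be attributed to intellectual capital rather than to book assets.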
13
Knowledge Management
The Invisible Balance Sheet
[Figure: balance sheet with two columns, Assets and Liability & S.H. Equity.
  Tangible - Assets: Cash, Accounts Receivable, Equipment, Property; Liability & S.H. Equity: Short-term Loans, Long-term Debt, S.H. Equity
  Intangible - Assets: External Structure, Internal Structure, Competence; Liability & S.H. Equity: Invisible Share Holder Equity, Obligation
  Side labels: Book Value, Market Value]
14
KM and Business Intelligence
Gardner says ....
  Leaders - will move on intangible benefits
  Followers - will move only on tangible savings/profits
  Others - will wait and try to catch up
15
KM and Business Intelligence
HYPE
  KM is primarily technology centred:
    – Data Warehousing
    – Data Mining
    – Intranets
    – Groupware
REALITY
  KM is primarily a people centred philosophy which necessarily involves and will promote the use of such technologies
16
Knowledge Management
Access to Recent Information
  Books: “Working Knowledge: How Organizations Manage What They Know”, T. Davenport & L. Prusak (http://www.amazon.com/exec/obidos/ASI)
  The Web:
    – http://www.brint.com/km/
    – www.sveiby.com.au
    – knowledge management mail-list:
17
“We are drowning in information, but starving for knowledge.”
John Naisbitt, author of Megatrends
Knowledge Discovery through Data Warehousing and Data Mining
18
Knowledge Discovery and Data Mining
What is KDD?
A Process
  The selection and processing of data for:
    – the identification of novel, accurate, and useful patterns, and
    – the modeling of real-world phenomena.
  Data Warehousing and Data Mining are major components of the KDD process
19
The Knowledge Discovery Process
[Figure: process flow. Internal and External Data Sources -> Data Warehousing -> Consolidated Data (Warehouse) -> Selection and Preprocessing -> Prepared Data -> Data Mining -> Patterns & Models (e.g. p(x)=0.02) -> Interpretation and Evaluation -> Knowledge]
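The stages of the process can be sketched as a chain of small functions, one per stage. This is only an illustrative sketch: the function names, the sample records, the age bands, and the 0.6 threshold are all assumptions made for the example and do not come from the presentation.

```python
from collections import Counter

def consolidate(sources):
    """Data warehousing: merge records from several sources into one list."""
    return [record for source in sources for record in source]

def select_and_preprocess(records):
    """Selection and preprocessing: drop incomplete records, normalise types."""
    return [
        {"age": int(r["age"]), "bought": bool(r["bought"])}
        for r in records
        if r.get("age") is not None and r.get("bought") is not None
    ]

def mine(prepared):
    """Data mining: estimate p(bought) per age band (a deliberately simple model)."""
    totals, buyers = Counter(), Counter()
    for r in prepared:
        band = "45+" if r["age"] >= 45 else "under 45"
        totals[band] += 1
        buyers[band] += r["bought"]
    return {band: buyers[band] / totals[band] for band in totals}

def interpret(model, threshold=0.6):
    """Interpretation and evaluation: keep only patterns strong enough to act on."""
    return {band: p for band, p in model.items() if p >= threshold}

# Hypothetical internal and external data sources
crm = [{"age": 52, "bought": True}, {"age": 30, "bought": False}]
web = [
    {"age": 47, "bought": True},
    {"age": None, "bought": True},   # incomplete record, removed in preprocessing
    {"age": 25, "bought": True},
    {"age": 61, "bought": False},
]

warehouse = consolidate([crm, web])          # consolidated data
prepared = select_and_preprocess(warehouse)  # prepared data
model = mine(prepared)                       # patterns & models
knowledge = interpret(model)                 # knowledge
print(knowledge)
```

Each function's output corresponds to one of the intermediate artifacts named in the figure, which is why the stages are kept separate rather than merged into a single routine.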
20
Knowledge Discovery in Context
[Figure: the KDD process (Data Sources -> Data Consolidation -> Consolidated Data (Warehouse) -> Selection and Preprocessing -> Prepared Data -> Data Mining -> Patterns & Models (e.g. p(x)=0.02) -> Interpretation and Evaluation -> Knowledge) embedded in “The Virtuous Cycle”: Identify Problem or Opportunity, Act on Knowledge, Measure Effect of Action. Flow labels: Problem, Knowledge, Results, New Insight]
21
Why? … Relationship Marketing
a.k.a. Customer Relationship Management
Marketing Embraces KM, DW, DM
[Figure labels: Marketing; Traditional Marketing; MIS; Data Warehousing; Data Mining]
22
What is Relationship Marketing all about?
  Knowing your customers on an individual basis
  Maximizing life-time value not individual sales
  Developing and maintaining a mutually beneficial relationship
  Acquire, retain, win-back desirable customers
[Figure: Arbuckle’s Market, “The Corner Store”]
23
Knowledge Discovery
What can KDD do for an organization?
Impact on Marketing
  Target marketing at a credit card company
  Consumer usage analysis at a telecomm provider
  Loyalty assessment at a service bureau
  Quality of service analysis at an appliance chain
24
The Knowledge Discovery Process
[Figure: process flow. Internal and External Data Sources -> Data Warehousing -> Consolidated Data (Warehouse) -> Selection and Preprocessing -> Prepared Data -> Data Mining -> Patterns & Models (e.g. p(x)=0.02) -> Interpretation and Evaluation -> Knowledge]
25
Data Warehousing
From data sources to consolidated data repository
[Figure: sources (RDBMS, Legacy DBMS, Flat Files, External) -> Data Consolidation and Cleansing -> Warehouse or Datamart (Object/Relation DBMS, Multidimensional DBMS) -> Analysis and Info Sharing]
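A minimal sketch of the consolidation and cleansing step, using only the Python standard library. The two sources, their column names, and the cleansing rules (trim whitespace, title-case names, upper-case regions, first source wins on a duplicate key) are hypothetical choices made for illustration, not rules taken from the presentation.

```python
import csv
import io

# Hypothetical extracts: a flat file and rows pulled from a relational database
FLAT_FILE = "cust_id,name,region\n 101 ,alice smith,ns\n102,Bob Jones,ON\n"
RDBMS_ROWS = [
    {"CustomerID": 102, "FullName": "Bob Jones", "Province": "ON"},
    {"CustomerID": 103, "FullName": "carol wu", "Province": None},
]

def clean(cust_id, name, region):
    """Cleansing: one key type, consistent casing, explicit marker for missing values."""
    return {
        "cust_id": int(str(cust_id).strip()),
        "name": str(name).strip().title(),
        "region": str(region).strip().upper() if region else "UNKNOWN",
    }

def consolidate(flat_file_text, rdbms_rows):
    """Consolidation: map each source's schema onto one and de-duplicate on cust_id."""
    warehouse = {}
    for row in csv.DictReader(io.StringIO(flat_file_text)):
        rec = clean(row["cust_id"], row["name"], row["region"])
        warehouse[rec["cust_id"]] = rec
    for row in rdbms_rows:
        rec = clean(row["CustomerID"], row["FullName"], row["Province"])
        warehouse.setdefault(rec["cust_id"], rec)  # first source wins on conflict
    return [warehouse[key] for key in sorted(warehouse)]

consolidated = consolidate(FLAT_FILE, RDBMS_ROWS)
for record in consolidated:
    print(record)
```

Customer 102 appears in both sources but only once in the result, and the missing province for customer 103 is made explicit instead of being silently dropped.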
26
Data Warehousing
Operational DB
  Application oriented
  Current
  Details
  Changes continually
Data Warehouse
  Subject Oriented
  Current + historical
  Details + Summaries
  Stable
Major DW Framework suppliers / consultants: DMR, IBM, SHL, NCR; SAS, Oracle, Sybase
27
Relationship between DW and DM?
[Figure: Data Warehousing and Analysis linked in both directions, with the labels “Source of consolidated data” and “Rationale for data consolidation”. Analysis comprises Query/Reporting, OLAP, and Data Mining. Scale labels: Strategic, Tactical]
28
Data Warehousing
  Must be business benefits driven
  It’s not a project .. It’s a way of life
  Keys to success are top-down strategy with bottom-up tactical deployment:
    – communicate vision of Data Warehouse
    – construct departmental Data Marts
    – evolve to enterprise Data Warehouse
  Rapid change in technology and business requirements -> demands short deployment cycles
29
Data Warehousing
HYPE
  Corporate data stored within a DW will solve all your business problems
REALITY
  The identification of business problems is the first step - DW, DM are solutions
  Analysis and DW will necessarily mature in parallel
30
Data Warehousing
Access to Recent Information
  Text Books:
    – W.H. Inmon, Claudia Imhoff
  Web Pages:
    – DWI - The Data Warehouse Institute: www.dw-institute.com
    – DW Information Centre: pwp.starnetic.com/larryg
31
The Knowledge Discovery Process
[Figure: process flow. Internal and External Data Sources -> Data Warehousing -> Consolidated Data (Warehouse) -> Selection and Preprocessing -> Prepared Data -> Data Mining -> Patterns & Models (e.g. p(x)=0.02) -> Interpretation and Evaluation -> Knowledge]
32
Knowledge Discovery Process
Core Problems & Approaches
  Problems:
    – identification of relevant data
    – representation of data
    – search for valid pattern or model
  Approaches:
    – top-down verification by expert
    – interactive visualization of data/models
    – * bottom-up induction from data *
[Figure labels: Probability of sale, Income, Age; Data Mining; On-Line Analytical Processing]
33
OLAP: On-Line Analytical Processing
OLAP Functionality
  Dimension selection
    – slice & dice
  Rotation
    – allows change in perspective
  Filtration
    – value range selection
  Hierarchies
    – drill-downs to lower levels
    – roll-ups to higher levels
[Figure: OLAP cube of Profit Values with dimensions Year by Month, Product Class by Product Name, Sales Region]
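The OLAP operations listed above can be illustrated on a tiny in-memory fact table. This is a sketch only: the records, the dimension names, and the helper functions are invented for the example, and a real OLAP product would precompute and index such aggregates rather than scan a list.

```python
from collections import defaultdict

# Hypothetical fact table: (year, month, product_class, product_name, region, profit)
FACTS = [
    (1998, "Jan", "Audio", "Radio",  "East", 100),
    (1998, "Jan", "Audio", "Stereo", "West", 250),
    (1998, "Feb", "Video", "TV",     "East", 400),
    (1999, "Jan", "Audio", "Radio",  "East", 120),
    (1999, "Feb", "Video", "VCR",    "West", 300),
]
DIMS = {"year": 0, "month": 1, "product_class": 2, "product_name": 3, "region": 4}
PROFIT = 5

def aggregate(facts, group_by):
    """Sum profit over the chosen dimensions (fewer = roll-up, more = drill-down)."""
    totals = defaultdict(int)
    for row in facts:
        key = tuple(row[DIMS[d]] for d in group_by)
        totals[key] += row[PROFIT]
    return dict(totals)

def slice_(facts, dim, value):
    """Dimension selection: fix one dimension at a single value."""
    return [row for row in facts if row[DIMS[dim]] == value]

def filter_range(facts, low, high):
    """Filtration: keep facts whose profit lies in [low, high]."""
    return [row for row in facts if low <= row[PROFIT] <= high]

by_year = aggregate(FACTS, ["year"])                  # roll-up to the year level
by_year_month = aggregate(FACTS, ["year", "month"])   # drill-down to months
east_by_class = aggregate(slice_(FACTS, "region", "East"), ["product_class"])
class_by_region = aggregate(FACTS, ["product_class", "region"])
region_by_class = aggregate(FACTS, ["region", "product_class"])  # rotation
mid_profit = filter_range(FACTS, 200, 350)

print(by_year)
print(east_by_class)
```

Rotation changes only the order of the grouping dimensions: the totals in class_by_region and region_by_class are the same, but the table is viewed from a different perspective.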
34
Top-down Verification Technology
DEMO
Cognos - PowerPlay
An On-line Analytical Processing (OLAP) System
35
Overview of Data Mining Methods
  Discovery of patterns
    – clustering systems
    e.g. customer segmentation
  Predictive modeling
    – regression, neural networks
    e.g. target marketing, risk assessment
  Descriptive modeling
    – inductive decision trees
    e.g. client characterization
[Figure labels: Prob. of Sale vs. Age; if age > 45 and income < $32k then ...; Age, Marital Status]
36
Data Mining Technology
DEMO
Angoss - KnowledgeSEEKER
An inductive decision tree/rule system
37
Data Mining Example
Health Care
  Situation: Life style data on 360 persons
  Problem: Characterize those most likely to have high/low blood pressure.
  Solution: Inductive Decision Tree
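To make the idea of inducing a decision tree from data concrete, here is a small sketch that grows a two-level tree by choosing, at each node, the threshold split with the lowest weighted Gini impurity. Everything in it is an assumption for illustration: the eleven records and the two attributes (age, exercise hours per week) are invented and are not the 360-person data set, and Gini-based splitting is a common textbook criterion, not necessarily the method used by the tool demonstrated in the talk.

```python
from collections import Counter

# Hypothetical records: (age, exercise_hours_per_week, blood_pressure)
DATA = [
    (25, 5, "low"), (32, 4, "low"), (41, 6, "low"), (38, 1, "low"),
    (52, 0, "high"), (60, 1, "high"), (47, 5, "low"), (58, 6, "low"),
    (66, 2, "high"), (49, 1, "high"), (29, 0, "low"),
]
FEATURES = {"age": 0, "exercise": 1}

def gini(rows):
    """Gini impurity of the class labels (last field) in rows."""
    counts = Counter(r[-1] for r in rows)
    n = len(rows)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def best_split(rows):
    """Return (weighted_gini, feature, threshold) for the best split, or None."""
    best = None
    for name, idx in FEATURES.items():
        for threshold in sorted({r[idx] for r in rows})[:-1]:
            left = [r for r in rows if r[idx] <= threshold]
            right = [r for r in rows if r[idx] > threshold]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if best is None or score < best[0]:
                best = (score, name, threshold)
    return best

def induce(rows, depth=0, max_depth=2):
    """Grow the tree top-down; a leaf is the majority class of its rows."""
    majority = Counter(r[-1] for r in rows).most_common(1)[0][0]
    split = best_split(rows) if depth < max_depth and gini(rows) > 0 else None
    if split is None:
        return majority
    _, name, threshold = split
    idx = FEATURES[name]
    return {
        "feature": name,
        "threshold": threshold,
        "le": induce([r for r in rows if r[idx] <= threshold], depth + 1, max_depth),
        "gt": induce([r for r in rows if r[idx] > threshold], depth + 1, max_depth),
    }

def classify(tree, age, exercise):
    """Follow the branches until a leaf label is reached."""
    values = {"age": age, "exercise": exercise}
    while isinstance(tree, dict):
        branch = "le" if values[tree["feature"]] <= tree["threshold"] else "gt"
        tree = tree[branch]
    return tree

tree = induce(DATA)
print(tree)
print(classify(tree, age=55, exercise=1))
```

On this invented data the induced tree reads as a rule of the form "if age > 47 and exercise <= 2 then high, otherwise low", which is the kind of descriptive characterization the slide refers to.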
38
Application Areas and Opportunities
  Finance: investment support, portfolio management
  Banking & Insurance: credit approval, risk assessment
  Marketing: segmentation, customer targeting, ...
  Science and medicine: hypothesis discovery, prediction, classification, diagnosis
  Security: bomb, iceberg, and fraud detection
  Manufacturing: process modeling, quality control, resource allocation
  Engineering: simulation and analysis, pattern recognition, signal processing
  Internet: smart search engines, web marketing
39
The Current Status and Trends
  Standards and methodology lag technology
  Many products:
    – micro DM packages (Cognos, Angoss)
    – macro - integrated suites (SAS, IBM)
  Software costs have risen 1000% over 2 years
  Beware - major players yet to be determined
  KDD experts fear the hype being generated
  Legal and ethical issues on the horizon
  Internet - “the” sink and source of data
40
Integrated Knowledge Discovery Suites
[Figure: a Graphical User Interface spanning the whole process: Data Sources -> Data Consolidation -> Warehouse -> Selection and Preprocessing -> Data Mining -> Interpretation and Evaluation -> Knowledge]
41
Benefits of KDD
  Maximum utility from corporate data
    – discovery of new knowledge
    – generation of models
  Important feedback to data warehousing effort
    – identification and justification of essential data
  Reduction of application dev’t backlog
    – model development vs. software development
  Effect on bottom line of organization
    – cost reduction, increased productivity, risk avoidance … competitive advantage
42
Requirements and Costs of KDD
  Hardware - computationally intensive
  Software - micro < $20k, integrated suites < $300k
  Data - internal collection, surveys, external sources
  Human resources
    – DB/DP/DC expertise to consolidate and preprocess data
    – Machine learning and stats competence
    – Application knowledge & project mgmt
  70% of the effort is expended on the data consolidation and preprocessing activities
43
KDD and Data Mining
HYPE
  Expensive hardware and software is always required
  DM is now turn-key “just give it the data”
REALITY
  Micro $2k-$10k DM packages can produce results
  DM is data analysis - requires business sense plus statistics and AI skills
44
Access to Recent Information
  Book: Data Mining Techniques for Marketing, Sales and Customer Support, by M. Berry & G. Linoff, Wiley & Sons
  Journal: Data Mining and Knowledge Discovery, Kluwer Publishing
  Conference: KDD’99
  Web-pages:
    – Bus. Informatics KDD page: http://www.mgmt.dal.ca/ChrBusInf/knowdis
    – Knowledge Discovery Mine: http://www.kdnuggets.com
45
THE END
[email protected] (@dal.ca)
www3.ns.sympatico.ca/~dsilver