Business Intelligence Data Mining Techniques As Tools for Business Intelligence.

Post on 16-Jan-2016


Business Intelligence

Data Mining Techniques As Tools for Business Intelligence

Introduction

Motivation: Why data mining?

What is data mining?

Data Mining: On what kind of data?

Data mining functionality

Are all the patterns interesting?

Classification of data mining systems

What Is Data Mining?

Data mining (knowledge discovery in databases)
- Extraction of interesting (non-trivial, implicit, previously unknown, and potentially useful) information or patterns from data in large databases

Alternative names and their “inside stories”
- Data mining: a misnomer?
- Knowledge discovery (mining) in databases (KDD), knowledge extraction, data/pattern analysis, data archeology, data dredging, information harvesting, business intelligence, etc.

Why Data Mining? Potential Applications

Database analysis and decision support
- Market analysis and management: target marketing, customer relationship management, market basket analysis, cross-selling, market segmentation
- Risk analysis and management: forecasting, customer retention, improved underwriting, quality control, competitive analysis
- Fraud detection and management

Other applications
- Text mining (newsgroups, email, documents) and Web analysis
- Intelligent query answering

Market Analysis and Management (1)

Where are the data sources for analysis?
- Credit card transactions, loyalty cards, discount coupons, customer complaint calls, plus (public) lifestyle studies

Target marketing
- Find clusters of “model” customers who share the same characteristics: interest, income level, spending habits, etc.

Determine customer purchasing patterns over time
- Conversion of a single to a joint bank account: marriage, etc.

Cross-market analysis
- Associations/correlations between product sales
- Prediction based on the association information

Market Analysis and Management (2)

Customer profiling
- Data mining can tell you what types of customers buy what products (clustering or classification)

Identifying customer requirements
- Identify the best products for different customers
- Use prediction to find what factors will attract new customers

Provides summary information
- Various multidimensional summary reports
- Statistical summary information (data central tendency and variation)

Corporate Analysis and Risk Management

Finance planning and asset evaluation
- Cash flow analysis and prediction
- Contingent claim analysis to evaluate assets
- Cross-sectional and time series analysis (financial-ratio, trend analysis, etc.)

Resource planning
- Summarize and compare the resources and spending

Competition
- Monitor competitors and market directions
- Group customers into classes and a class-based pricing procedure
- Set pricing strategy in a highly competitive market

Fraud Detection and Management (1)

Applications
- Widely used in health care, retail, credit card services, telecommunications (phone card fraud), etc.

Approach
- Use historical data to build models of fraudulent behavior, and use data mining to help identify similar instances

Examples
- Auto insurance: detect a group of people who stage accidents to collect on insurance
- Money laundering: detect suspicious money transactions (US Treasury's Financial Crimes Enforcement Network)
- Medical insurance: detect professional patients, rings of doctors, and rings of references

Fraud Detection and Management (2)

Detecting inappropriate medical treatment
- The Australian Health Insurance Commission identified that in many cases blanket screening tests were requested (saving Australian $1m/yr).

Detecting telephone fraud
- Telephone call model: destination of the call, duration, time of day or week. Analyze patterns that deviate from an expected norm.
- British Telecom identified discrete groups of callers with frequent intra-group calls, especially mobile phones, and broke a multimillion dollar fraud.

Retail
- Analysts estimate that 38% of retail shrink is due to dishonest employees.

Data Mining: A KDD Process

Data mining: the core of the knowledge discovery process.

[Figure: KDD process diagram. Labels: data sources (DB-01, DB-03), data integration, data warehouse, data pre-processing, data selection, task-relevant data, data mining, DM models, model evaluation, knowledge; feedback: knowledge integration.]

Steps of a KDD Process

- Learning the application domain: relevant prior knowledge and goals of application
- Creating a target data set: data selection
- Data cleaning and preprocessing (may take 60% of effort!)
- Data reduction and transformation: find useful features, dimensionality/variable reduction, invariant representation
- Choosing functions of data mining: summarization, classification, regression, association, clustering
- Choosing the mining algorithm(s)
- Data mining: search for patterns of interest
- Pattern evaluation and knowledge presentation: visualization, transformation, removing redundant patterns, etc.
- Deployment: use of discovered knowledge

Standardized Data Mining Processes

Cross-Industry Standard Process for Data Mining (CRISP-DM)

Step 1: Business Understanding
- Determine the business objectives
- Assess the situation
- Determine the data mining goals
- Produce a project plan

Step 2: Data Understanding
- Collect the initial data
- Describe the data
- Explore the data
- Verify the data

Step 3: Data Preparation
- Select data
- Clean data
- Construct data
- Integrate data
- Format data

Step 4: Modeling
- Select the modeling technique
- Generate test design
- Build the model
- Assess the model

Step 5: Evaluation
- Evaluate results
- Review process
- Determine next step

Step 6: Deployment
- Plan deployment
- Plan monitoring and maintenance
- Produce final report
- Review the project

Architecture of a Typical Data Mining System

[Figure: system architecture diagram. Labels: user, input, output, user interface, “Best Data Mining Tool”, data warehouse, data sources, domain knowledge base, data cleaning / integration / transformation.]

Statistical components:
- Data cleaning
- Data transformation
- Exploratory analysis
- Factor analysis
- ...

Data mining components:
- Decision trees
- Association rules
- Clustering
- Visualization
- ...

Data Mining Functionalities (1)

Concept description: characterization and discrimination
- Generalize, summarize, and contrast data characteristics, e.g., dry vs. wet regions

Association (correlation and causality)
- Multi-dimensional vs. single-dimensional association
- age(X, “20..29”) ^ income(X, “20..29K”) => buys(X, “PC”) [support = 2%, confidence = 60%]
- contains(T, “computer”) => contains(T, “software”) [1%, 75%]
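The bracketed pair attached to each rule is its support and confidence. As a minimal sketch, assuming the usual definitions (support is the fraction of transactions containing every item in the rule; confidence is that fraction divided by the support of the rule body alone), the two measures can be computed as follows. The transactions are hypothetical toy data invented for illustration, not data from the slides.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, body, head):
    """Estimate of P(head | body): support(body and head) / support(body)."""
    both = support(transactions, set(body) | set(head))
    return both / support(transactions, body)


# Hypothetical toy transactions, invented for illustration only.
transactions = [
    {"computer", "software"},
    {"computer", "software", "printer"},
    {"computer"},
    {"printer"},
]

# Rule: contains(T, "computer") => contains(T, "software")
rule_support = support(transactions, {"computer", "software"})
rule_confidence = confidence(transactions, {"computer"}, {"software"})
print(rule_support, rule_confidence)
```

Here two of the four transactions contain both items, and three contain "computer", so the rule holds in two of the three transactions where its body applies.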

Data Mining Functionalities (2)

Classification and prediction
- Finding models (functions) that describe and distinguish classes or concepts for future prediction
- E.g., classify countries based on climate, or classify cars based on gas mileage
- Presentation: decision tree, classification rule, ANN
- Prediction: predict some unknown or missing numerical values

Cluster analysis
- Class label is unknown: group data to form new classes, e.g., cluster houses to find distribution patterns
- Clustering is based on the principle of maximizing the intra-class similarity and minimizing the inter-class similarity

Data Mining Functionalities (3)

Outlier analysis
- Outlier: a data object that does not comply with the general behavior of the data
- It can be considered as noise or exception but is quite useful in fraud detection and rare events analysis

Trend and evolution analysis
- Trend and deviation: regression analysis
- Sequential pattern mining, periodicity analysis
- Similarity-based analysis

Other pattern-directed or statistical analyses
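One simple way to flag objects that "do not comply with the general behavior of the data" is a z-score test. This is only an illustrative sketch: the slides do not name a specific outlier method, and the threshold of 2 standard deviations and the call-duration values below are assumptions made up for the example.

```python
from statistics import mean, stdev


def zscore_outliers(values, threshold=2.0):
    """Return the values lying more than `threshold` sample standard deviations from the mean."""
    mu = mean(values)
    sigma = stdev(values)
    if sigma == 0:
        return []  # all values identical, so nothing deviates
    return [v for v in values if abs(v - mu) / sigma > threshold]


# Hypothetical call durations in minutes, invented for illustration only.
call_minutes = [3, 4, 5, 4, 3, 5, 4, 6, 4, 120]
outliers = zscore_outliers(call_minutes)
print(outliers)
```

A single extreme value inflates both the mean and the standard deviation, so on real data a robust statistic (such as the median) may be a better baseline; the z-score is used here only because it is short.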

Data Mining: Combination of Multiple Disciplines

[Figure: data mining shown as a combination of statistics, machine learning, database technology, visual imaging, information science, and other disciplines.]

A Multi-Dimensional View of Data Mining Classification

Databases to be mined
- Relational, transactional, object-oriented, object-relational, active, spatial, time-series, text, multi-media, heterogeneous, legacy, WWW, etc.

Knowledge to be extracted
- Characterization, discrimination, association, classification, clustering, trend, deviation and outlier analysis, etc.
- Multiple/integrated functions and mining at multiple levels

Techniques utilized
- Database-oriented, data warehouse (OLAP), machine learning, statistics, visualization, neural network, etc.

Applications adapted
- Retail, telecommunication, banking, fraud analysis, DNA mining, stock market analysis, Web mining, Weblog analysis, etc.

Poll: Which data mining technique..?

1. Association

Market Basket Analysis

Association Rules

- A.K.A. association rule mining
- Mining association rules from transactional databases using the Apriori algorithm
- Other methods…
  - Mining multilevel association rules from transactional databases
  - Mining multidimensional association rules from transactional databases and data warehouses
  - From association mining to correlation analysis
  - Constraint-based association mining

What Is Association Mining?

Association rule mining
- Finding frequent patterns, associations, correlations, or causal structures among sets of items or objects in transaction databases, relational databases, and other information repositories

Applications
- Market basket analysis, cross-marketing, catalog design, loss-leader analysis, clustering, classification, etc.

Examples
- Rule form: “Body => Head [support, confidence]”
- buys(x, “diapers”) => buys(x, “beer”) [0.5%, 60%]
- major(x, “CS”) ^ takes(x, “DB”) => grade(x, “A”) [1%, 75%]
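The Apriori idea of finding frequent itemsets level by level can be sketched as below. This is a compact illustration under stated assumptions, not an optimized or authoritative implementation: the function name `apriori`, the minimum support of 0.5, and the baskets are all hypothetical. It relies on the property that every subset of a frequent itemset must itself be frequent, which lets candidates be pruned before they are counted.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset with support >= min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support_of(candidate):
        return sum(1 for t in transactions if candidate <= t) / n

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    current = {}
    for item in items:
        s = support_of(frozenset([item]))
        if s >= min_support:
            current[frozenset([item])] = s

    frequent = {}
    k = 1
    while current:
        frequent.update(current)
        k += 1
        # Join step: unions of frequent (k-1)-itemsets that have exactly k items.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in current for sub in combinations(c, k - 1))
        }
        current = {}
        for c in candidates:
            s = support_of(c)
            if s >= min_support:
                current[c] = s
    return frequent


# Hypothetical market baskets, invented for illustration only.
baskets = [
    {"diapers", "beer", "milk"},
    {"diapers", "beer"},
    {"diapers", "milk"},
    {"beer", "bread"},
    {"diapers", "beer", "bread"},
]
frequent = apriori(baskets, min_support=0.5)

# Confidence of the rule diapers => beer, derived from the itemset supports.
rule_confidence = frequent[frozenset({"diapers", "beer"})] / frequent[frozenset({"diapers"})]
```

With these baskets, only "diapers", "beer", and the pair of the two reach the support threshold; rules are then read off the frequent itemsets by dividing supports.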

2. What is Cluster Analysis?

Cluster: a collection of data objects…
- Similar to one another within the same cluster
- Dissimilar to the objects in other clusters

Cluster analysis
- Grouping a set of data objects into clusters

Clustering is unsupervised classification: no predefined classes

Typical applications
- As a stand-alone tool to get insight into data distribution
- As a preprocessing step for other algorithms

General Applications of Clustering

- Pattern recognition
- Spatial data analysis
- Image processing
- Economic science
- WWW
  - Document classification
  - Cluster Weblog data to discover groups of similar access patterns

Examples of Clustering Applications

- Marketing: help marketers discover distinct groups in their customer bases, and then use this knowledge to develop targeted marketing programs
- Land use: identification of areas of similar land use in an earth observation database
- Insurance: identifying groups of motor insurance policy holders with a high average claim cost
- City planning: identifying groups of houses according to their house type, value, and geographical location
- Earthquake studies: observed earthquake epicenters should be clustered along continental faults

The K-Means Clustering Method

Example

[Figure: four scatter plots illustrating k-means; both axes run from 0 to 10 in each panel.]
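The k-means procedure that the example plots illustrate can be sketched in a few lines. This is a minimal sketch assuming the standard iteration (assign each point to its nearest centroid, recompute each centroid as the mean of its points, repeat until nothing changes). The points and the initial centroids are hypothetical values chosen for illustration; they are not read off the original figure.

```python
import math


def kmeans(points, centroids, max_iter=100):
    """Alternate assignment and centroid update until the centroids stop moving."""
    centroids = [tuple(c) for c in centroids]
    for _ in range(max_iter):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its assigned points.
        new_centroids = []
        for cluster, old in zip(clusters, centroids):
            if cluster:
                new_centroids.append(
                    tuple(sum(coord) / len(cluster) for coord in zip(*cluster)))
            else:
                new_centroids.append(old)  # keep the centroid of an empty cluster
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, clusters


# Hypothetical 2-D points on a 0..10 grid, invented for illustration only.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
final_centroids, clusters = kmeans(points, centroids=[(1, 1), (8, 8)])
print(final_centroids)
```

The result depends on the initial centroids; in practice they are often chosen at random and the algorithm is run several times.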

3. Decision Tree Induction

Decision tree…
- A flow-chart-like tree structure
- Internal node denotes a test on an attribute
- Branch represents an outcome of the test
- Leaf nodes represent class labels or class distribution

Decision tree generation consists of two phases
- Tree construction
  - At start, all the training examples are at the root
  - Examples are recursively partitioned based on selected attributes
- Tree pruning
  - Identify and remove branches that reflect noise or outliers

Use of decision tree: classifying an unknown sample

Training Dataset

This follows an example from Quinlan’s ID3.

age      income   student  credit_rating  buys_computer
<=30     high     no       fair           no
<=30     high     no       excellent      no
31…40    high     no       fair           yes
>40      medium   no       fair           yes
>40      low      yes      fair           yes
>40      low      yes      excellent      no
31…40    low      yes      excellent      yes
<=30     medium   no       fair           no
<=30     low      yes      fair           yes
>40      medium   yes      fair           yes
<=30     medium   yes      excellent      yes
31…40    medium   no       excellent      yes
31…40    high     yes      fair           yes
>40      medium   no       excellent      no
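ID3 chooses the attribute to split on by information gain, the reduction in class entropy after partitioning on that attribute. The sketch below applies that measure to the 14 training rows of this slide; the helper names `entropy` and `information_gain` are illustrative choices, and the age range is written as "31..40" in plain ASCII.

```python
from collections import Counter
from math import log2

ATTRIBUTES = ["age", "income", "student", "credit_rating"]

# The 14 training rows from the slide; the last column is buys_computer.
ROWS = [
    ("<=30", "high", "no", "fair", "no"),
    ("<=30", "high", "no", "excellent", "no"),
    ("31..40", "high", "no", "fair", "yes"),
    (">40", "medium", "no", "fair", "yes"),
    (">40", "low", "yes", "fair", "yes"),
    (">40", "low", "yes", "excellent", "no"),
    ("31..40", "low", "yes", "excellent", "yes"),
    ("<=30", "medium", "no", "fair", "no"),
    ("<=30", "low", "yes", "fair", "yes"),
    (">40", "medium", "yes", "fair", "yes"),
    ("<=30", "medium", "yes", "excellent", "yes"),
    ("31..40", "medium", "no", "excellent", "yes"),
    ("31..40", "high", "yes", "fair", "yes"),
    (">40", "medium", "no", "excellent", "no"),
]


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attr_index):
    """Entropy of the class column minus the weighted entropy after splitting."""
    labels = [r[-1] for r in rows]
    remainder = 0.0
    for value in {r[attr_index] for r in rows}:
        subset = [r[-1] for r in rows if r[attr_index] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return entropy(labels) - remainder


gains = {name: information_gain(ROWS, i) for i, name in enumerate(ATTRIBUTES)}
best = max(gains, key=gains.get)
print(best, gains)
```

On this dataset the attribute with the highest gain is `age`, which is why a tree built this way tests age at its root.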

Output: A Decision Tree for Credit Approval

age?
  <=30: student?
    no: no
    yes: yes
  31..40: yes
  >40: credit rating?
    excellent: no
    fair: yes

4. Neural Networks

Advantages
- Prediction accuracy is generally high
- Robust: works when training examples contain errors
- Output may be discrete, real-valued, or a vector of several discrete or real-valued attributes
- Fast evaluation of the learned target function

Criticism
- Long training time
- Difficult to understand the learned function
- Not easy to incorporate domain knowledge

A Neuron

The n-dimensional input vector x is mapped into variable y by means of the scalar product and a nonlinear function mapping.

[Figure: a neuron. Input vector x (x0, x1, …, xn) and weight vector w (w0, w1, …, wn) feed a weighted sum, followed by an activation function f that produces output y.]
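The mapping from input vector x to output y can be written as a weighted sum followed by an activation function. The sketch below assumes a sigmoid activation and an optional bias term; neither choice is stated on the slide, and the input and weight values are hypothetical.

```python
import math


def sigmoid(z):
    """Logistic activation: squashes any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron(x, w, bias=0.0, activation=sigmoid):
    """Scalar product of inputs and weights, plus a bias, passed through the activation."""
    weighted_sum = sum(wi * xi for wi, xi in zip(w, x)) + bias
    return activation(weighted_sum)


# Hypothetical inputs and weights, invented for illustration only.
y = neuron(x=[1.0, 0.5, -1.0], w=[0.2, 0.4, 0.1])
print(y)
```

Passing a different `activation` (for example `math.tanh`) changes only the nonlinear mapping; the weighted sum stays the same.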

Multi-Layer Perceptron

[Figure: multi-layer perceptron with an input layer (4 neurons), hidden layer I (4 PEs), hidden layer II (3 PEs), and an output layer (2 neurons).]
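A forward pass through a network with the layer sizes in the figure (4 inputs, hidden layers of 4 and 3 processing elements, 2 outputs) can be sketched as follows. The random weights, the sigmoid activation, and the input values are assumptions made for illustration; the slides do not specify them, and no training is shown.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def make_layer(n_inputs, n_units, rng):
    """One weight row per unit; the last entry of each row is the unit's bias."""
    return [[rng.uniform(-1, 1) for _ in range(n_inputs + 1)] for _ in range(n_units)]


def forward(layers, x):
    """Propagate the input through each layer in turn and return the final outputs."""
    activations = list(x)
    for layer in layers:
        activations = [
            sigmoid(sum(w * a for w, a in zip(unit[:-1], activations)) + unit[-1])
            for unit in layer
        ]
    return activations


rng = random.Random(0)  # fixed seed so the sketch is repeatable
sizes = [4, 4, 3, 2]    # input, hidden I, hidden II, output
layers = [make_layer(n_in, n_out, rng) for n_in, n_out in zip(sizes, sizes[1:])]

# Hypothetical 4-dimensional input, invented for illustration only.
output = forward(layers, [0.5, -0.2, 0.1, 0.9])
print(output)
```

Each weight layer connects one layer of the figure to the next, so four layers of units give three weight layers.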

Applications of Neural Networks

- Financial decision making
- Fraud detection
- Bankruptcy problem
- Weather forecasting
- Feature detection
- Voice recognition