
Data Mining : Concepts

Date post: 17-Jul-2015
Upload: pragya-pandey

WHAT IS DATA MINING

ELEMENTS

TECHNIQUES

APPLICATIONS

Data mining (knowledge discovery from data)

Extraction of interesting (non-trivial, implicit, previously unknown

and potentially useful) patterns or knowledge from huge amount of

data

Alternative names

Knowledge discovery (mining) in databases (KDD), knowledge

extraction, data/pattern analysis, data archeology, data dredging,

information harvesting, business intelligence, etc.

DATA MINING | ELEMENTS | TECHNIQUES | APPLICATIONS

KDD


Data Mining: Core of the Knowledge Discovery Process (KDD)


Data Relationships

Sequential Patterns

Clusters

Data Mining Techniques

Decision Trees

Neural Networks

Regression

Association Rules

Nearest Neighbor Method

Genetic Algorithm

Artificial Intelligence


Sequential Patterns

Finding statistically relevant patterns between data examples where discrete values are delivered in a sequence.

Problems addressed

Building efficient databases and indexes for sequence information, extracting frequently occurring patterns, comparing sequences, recovering missing sequence members.

Application (retail environment):

Anticipating customer behavior in order to predict future customer purchasing habits.

Increasing profit and decreasing cost through proper management of shelf space allocation and product display.
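
As a rough illustration of "extracting frequently occurring patterns" from sequences, the sketch below counts ordered item pairs (item a bought some time before item b) across customer purchase sequences and keeps those that reach a minimum support. The function name `frequent_ordered_pairs`, the `min_support` parameter and the sample purchases are assumptions made for this example; real sequential pattern mining algorithms handle longer patterns and much larger data.

```python
from collections import Counter
from itertools import combinations


def frequent_ordered_pairs(sequences, min_support):
    """Return ordered pairs (a, b), with a occurring before b,
    that appear in at least `min_support` sequences."""
    counts = Counter()
    for seq in sequences:
        # combinations() keeps positional order, so (a, b) means a came first.
        # A set is used so each pair is counted once per sequence.
        pairs_in_seq = {(a, b) for a, b in combinations(seq, 2) if a != b}
        counts.update(pairs_in_seq)
    return {pair: n for pair, n in counts.items() if n >= min_support}


# Hypothetical purchase histories, one list per customer, in time order.
purchases = [
    ["phone", "case", "charger"],
    ["phone", "charger"],
    ["case", "phone", "charger"],
    ["laptop", "mouse"],
]

patterns = frequent_ordered_pairs(purchases, min_support=3)
print(patterns)
```

In this toy data only the pattern "phone, then charger" occurs in three sequences, which is the kind of result a retailer could use when planning product display.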


Sequential Patterns: example [figure]


Clusters

Placing data elements into related groups without advance knowledge of the group definitions.

Popular clustering techniques: K-means, Expectation Maximization (EM)

Problems addressed

Find natural groupings

Preprocess data to identify homogeneous groups on which to build supervised models

Anomaly detection
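
To make the idea of grouping "without advance knowledge of the group definitions" concrete, here is a minimal K-means sketch in plain Python. The function name `k_means`, the naive choice of the first k points as starting centroids, and the sample points are all assumptions for illustration; library implementations use better initialization and convergence checks.

```python
import math


def k_means(points, k, iterations=10):
    """Very small K-means: returns (centroids, clusters)."""
    # Naive initialization (an assumption of this sketch): first k points.
    centroids = [points[i] for i in range(k)]
    clusters = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        for i, members in enumerate(clusters):
            if members:
                centroids[i] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, clusters


# Hypothetical 2-D data with two visually obvious groups.
points = [(1.0, 1.0), (9.0, 9.0), (1.0, 2.0), (2.0, 1.0), (8.0, 9.0), (9.0, 8.0)]
centroids, clusters = k_means(points, k=2)
print(centroids)
print(clusters)
```

No group labels are supplied; the two groups emerge only from the distances between points.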


Clusters

Application:

Plant and animal ecology

Make spatial and temporal comparisons of communities of organisms in heterogeneous environments.

Medical imaging

Differentiate between different types of tissue and blood in a three-dimensional image.

Business and marketing

Partition the general population of consumers for use in market segmentation, product positioning, new product development and selecting test markets.

Decision Trees

In the decision tree technique, the root of the tree is a simple question or condition that has multiple answers.

Each answer then leads to a further set of questions or conditions that narrow down the data, so that a final decision can be made based on it.

For example, the following decision tree determines whether or not to play tennis:


Starting at the root node, if the outlook is overcast then we should definitely play tennis.

If it is rainy, we should play tennis only if the wind is weak.

If it is sunny, we should play tennis only if the humidity is normal.
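
The three rules above can be written directly as nested conditions. This is a hand-coded sketch of the tree as described, not a tree learned from data; the function name `play_tennis` and the lowercase string values are assumptions for the example.

```python
def play_tennis(outlook, humidity, wind):
    """Walk the play-tennis decision tree from the root (outlook) down."""
    if outlook == "overcast":
        return True                  # overcast: always play
    if outlook == "rainy":
        return wind == "weak"        # rainy: play only if the wind is weak
    if outlook == "sunny":
        return humidity == "normal"  # sunny: play only if humidity is normal
    raise ValueError(f"unknown outlook: {outlook!r}")


print(play_tennis("overcast", humidity="high", wind="strong"))
print(play_tennis("rainy", humidity="normal", wind="strong"))
print(play_tennis("sunny", humidity="normal", wind="weak"))
```

Each `if` corresponds to one branch from the root, and each returned value is a leaf of the tree.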

Neural Networks

A neural network is a set of connected input/output units in which each connection has a weight associated with it. During the learning phase, the network learns by adjusting the weights so as to be able to predict the correct class labels of the input tuples.

Well suited for continuous-valued inputs and outputs.

Used to extract patterns and detect trends that are too complex to be noticed by humans.

Neural networks are best at identifying patterns or trends in data and are well suited for prediction or forecasting needs.
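
"Learning by adjusting weights" can be shown with the smallest possible network: a single unit trained with the classic perceptron rule. The sketch below learns the logical AND of two inputs; the function names, the learning rate and the number of epochs are assumptions for this example, and practical networks have many units and use different training procedures.

```python
def predict(weights, bias, inputs):
    """Step activation: output 1 if the weighted sum is positive, else 0."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total > 0 else 0


def train_perceptron(samples, epochs=10, learning_rate=1.0):
    """Adjust weights whenever the predicted label differs from the target."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            error = target - predict(weights, bias, inputs)
            # Move each weight in the direction that reduces the error.
            weights = [w + learning_rate * error * x for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias


# Training tuples for logical AND: ((input1, input2), class label).
and_samples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(and_samples)

for inputs, target in and_samples:
    print(inputs, "->", predict(weights, bias, inputs), "expected", target)
```

After training, the unit reproduces the correct class label for all four input tuples.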


Example:

Neural networks have been applied to handwritten character recognition, to training a computer to pronounce English text, and to many real-world business problems, and have already been used successfully in many industries.

Regression

The regression technique can be adapted for prediction.

Regression analysis can be used to model the relationship between one or more independent variables and a dependent (response) variable. In data mining, the independent variables are attributes that are already known and the response variable is what we want to predict.

However, simple regression is often inadequate for quantities such as sales volumes, stock prices and product failure rates, which can depend on complex interactions among many variables.

Types of regression methods

Linear Regression

Multivariate Linear Regression

Nonlinear Regression

Multivariate Nonlinear Regression
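
For the simplest of these methods, linear regression with one independent variable, the least-squares line can be computed in a few lines. The function name `fit_line` and the sample data are assumptions for this sketch; the data were chosen to lie exactly on y = 2x + 1 so the result is easy to check.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalized).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Known attribute (independent variable) and observed response.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]

slope, intercept = fit_line(xs, ys)
predicted = slope * 5 + intercept  # predict the response for an unseen x = 5
print(slope, intercept, predicted)
```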


Association Rules

In association, a pattern is discovered based on a relationship between items in the same transaction.

E.g. Rule Form: "Body ⇒ Head [support, confidence]"

Application

Retailers use the association technique to research customers' buying habits. Based on historical sales data, retailers might find that customers always buy crisps when they buy beer, and can therefore put beer and crisps next to each other to save customers time and increase sales.
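
The two numbers attached to a rule can be computed directly from transaction data: support is the fraction of all transactions that contain both body and head, and confidence is the fraction of transactions containing the body that also contain the head. The sketch below assumes a hypothetical helper `rule_metrics` and four made-up baskets.

```python
def rule_metrics(transactions, body, head):
    """Return (support, confidence) for the rule body => head."""
    body, head = set(body), set(head)
    n_body = sum(1 for t in transactions if body <= set(t))
    n_both = sum(1 for t in transactions if (body | head) <= set(t))
    support = n_both / len(transactions)
    confidence = n_both / n_body if n_body else 0.0
    return support, confidence


# Hypothetical shopping baskets, one set of items per transaction.
baskets = [
    {"beer", "crisps"},
    {"beer", "crisps", "bread"},
    {"beer"},
    {"bread", "milk"},
]

support, confidence = rule_metrics(baskets, body={"beer"}, head={"crisps"})
print(f"beer => crisps [support={support:.2f}, confidence={confidence:.2f}]")
```

Here two of the four baskets contain both items, and two of the three baskets with beer also contain crisps.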


Types

Multilevel association rule

Multidimensional association rule

Quantitative association rule


Study of frequent flyer data from an Indian airline

Data selected and prepared: the 3 most common sectors flown and what points were redeemed for.

(Note: incomplete/inaccurate data supplied by the airline)

Data mining results:

Patterns about customers flying between metropolitan cities

Customers who flew Mumbai-Delhi also flew to other cities, e.g. Mumbai-Chennai, Mumbai-Kolkata and Mumbai-Bangalore.

Customers flying Bangalore-Hyderabad also flew Delhi-Bangalore.

Those who flew Bagdogra-Guwahati did not fly back; instead they flew to Delhi.


Banking information systems contain huge volumes of data, both operational and historical.

Data mining can assist critical decision-making processes in a bank.

Areas of application:

Marketing

Risk management and default detection

Fraud detection

Customer relationship management

Money laundering detection

Wikipedia

http://en.wikipedia.org/wiki/Sequential_Pattern_Mining

http://searchbusinessintelligence.techtarget.in/

Indian Journal of Computer Science & Engg

Introduction to Data Mining with Case Studies by G.K. Gupta
