Data mining concepts

Post on 21-Dec-2014


Data-Mining Concepts

Group 7

What is Data Mining?

The mining and discovery of new information, in the form of patterns or rules, from vast amounts of data.

The process of discovering meaningful new correlations, patterns, and trends by sifting through large amounts of data stored in repositories, using pattern recognition technologies as well as statistical and mathematical techniques.
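As one illustration of what a discovered "rule" can look like, the sketch below scores a rule of the form "baskets that contain bread also contain milk" by its support and confidence. The slides do not name a specific technique; association rules are used here only as an assumed example, and the transactions, item names, and function names are all hypothetical.

```python
from typing import FrozenSet, List

# Hypothetical toy data: each transaction is the set of items in one basket.
TRANSACTIONS: List[FrozenSet[str]] = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
]


def support(itemset: FrozenSet[str], transactions: List[FrozenSet[str]]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(
    antecedent: FrozenSet[str],
    consequent: FrozenSet[str],
    transactions: List[FrozenSet[str]],
) -> float:
    """Among transactions containing `antecedent`, the fraction that also contain `consequent`."""
    base = support(antecedent, transactions)
    if base == 0:
        return 0.0
    return support(antecedent | consequent, transactions) / base


# Score the assumed rule "bread -> milk" on the toy data.
rule_support = support(frozenset({"bread", "milk"}), TRANSACTIONS)
rule_confidence = confidence(frozenset({"bread"}), frozenset({"milk"}), TRANSACTIONS)
```

A rule is usually considered interesting only if both numbers clear thresholds chosen by the analyst. Those thresholds are a judgment call and are not specified in the slides.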

Why do we mine data?

Commercial viewpoint:

• Lots of data is being collected and warehoused.
• Computers have become cheaper and more powerful.
• Competitive pressure is strong.

Scientific viewpoint:

• Data is collected and stored at enormous speeds (GB/hour).
• Traditional techniques are infeasible for raw data.
• Data mining may help scientists.

On what kinds of data?

• Relational databases
• Data warehouses
• Transactional databases
• Advanced database systems:
    • Object-relational
    • Spatial and temporal
    • Time-series
    • Multimedia, text
    • WWW

What are the goals of data mining?

• Prediction, e.g. sales volume, earthquakes
• Identification, e.g. existence of genes, system intrusions
• Classification of different categories, e.g. discount-seeking shoppers or loyal regular shoppers in a supermarket
• Optimization of limited resources such as time, space, money, or materials, and maximization of outputs such as sales or profits

What are the applications of data mining?

● Marketing
    • Analysis of consumer behavior
    • Advertising campaigns
    • Targeted mailings
    • Segmentation of customers, stores, or products

● Finance
    • Creditworthiness of clients
    • Performance analysis of finance investments
    • Fraud detection

● Manufacturing
    • Optimization of resources
    • Optimization of manufacturing processes
    • Product design based on customer requirements

● Health Care
    • Discovering patterns in X-ray images
    • Analyzing side effects of drugs
    • Effectiveness of treatments

What are the present commercial tools for data mining?

• SAS
• Data to Knowledge
• Oracle Data Miner
• Clementine
• Intelligent Miner

How to build a data mining model?

An important concept is that building a mining model is part of a larger process.

1. Defining the problem: Clearly define the business problem.

2. Preparing data: Consolidate and clean the data that was identified in the Defining the Problem step.

3. Exploring data: Explore the prepared data.

4. Building models: Before you build a model, you must randomly separate the prepared data into training and testing datasets. You use the training dataset to build the model, and the testing dataset to test the accuracy of the model by creating prediction queries.

5. Exploring and validating models: Explore the models that you have built and test their effectiveness.

6. Deploying and updating models: Deploy to a production environment the models that performed the best.
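The train/test separation described in step 4 can be sketched as follows. This is a minimal illustration using only the Python standard library, not the procedure of any particular tool. The row layout, the "loyal"/"discount" labels, the 25% test fraction, and the majority-class "model" are all assumptions chosen to keep the example short.

```python
import random
from collections import Counter
from typing import List, Tuple

Row = Tuple[float, str]  # (feature value, class label); a hypothetical layout


def train_test_split(
    rows: List[Row], test_fraction: float = 0.25, seed: int = 0
) -> Tuple[List[Row], List[Row]]:
    """Randomly separate rows into (training, testing) datasets."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def fit_majority_class(train: List[Row]) -> str:
    """A deliberately trivial 'model': always predict the most common training label."""
    return Counter(label for _, label in train).most_common(1)[0][0]


def accuracy(predicted_label: str, test: List[Row]) -> float:
    """Fraction of held-out rows whose true label matches the prediction."""
    correct = sum(1 for _, label in test if label == predicted_label)
    return correct / len(test)


# Hypothetical prepared data: 20 shoppers, every third one labeled "discount".
rows = [(float(i), "loyal" if i % 3 else "discount") for i in range(20)]
train, test = train_test_split(rows)
model = fit_majority_class(train)
test_accuracy = accuracy(model, test)
```

The model is built only from `train`, and `test_accuracy` is measured only on rows the model never saw, which is the point of the separation. A real model would replace `fit_majority_class`, but the split and the held-out evaluation would keep the same shape.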

What are the major issues in data mining?

• Mining different kinds of knowledge in databases
• Interactive mining of knowledge at multiple levels of abstraction
• Incorporation of background knowledge
• Data mining query languages and ad hoc data mining
• Expression and visualization of data mining results
• Handling noise and incomplete data
• Pattern evaluation: the interestingness problem
• Integration of the discovered knowledge with existing knowledge: a knowledge fusion problem
• Protection of data security, integrity, and privacy
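To make the "handling noise and incomplete data" issue concrete, here is a small sketch of one common approach: filling missing values with the median of the observed ones. The slides do not prescribe a method; median imputation, the function name, and the sample readings are assumptions for illustration only.

```python
from statistics import median
from typing import List, Optional


def impute_missing(values: List[Optional[float]]) -> List[float]:
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("no observed values to impute from")
    fill = median(observed)
    return [fill if v is None else v for v in values]


# Hypothetical readings: two are missing and one (100.0) looks like noise.
readings = [4.0, None, 5.0, 100.0, None, 6.0]
cleaned = impute_missing(readings)
```

The median is used rather than the mean because a single noisy value such as 100.0 would pull a mean-based fill far away from the typical readings.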

What is the future of data mining?

● Active research is ongoing:
    • Neural networks
    • Regression analysis
    • Genetic algorithms

● Data mining is used in many areas today. We cannot even begin to imagine what the future holds!

Thank You!