
introduction to data warehousing and mining

Date post: 29-Nov-2014
Upload: rajesh-chandra
Description:
Introductory class on data warehousing and mining from KL University
Transcript
Page 1: introduction to data warehousing and mining

DATA WAREHOUSING AND MINING

BY G. RAJESH CHANDRA

Page 2: introduction to data warehousing and mining

EVOLUTION OF DATABASE TECHNOLOGY

• 1960s (Primitive File Processing)
  – Data collection, database creation, IMS and network DBMS
• 1970s to early 1980s (DBMS)
  – Relational data model, relational DBMS implementation, SQL, OLTP, user interfaces, etc.
• 1980s to present (Advanced Databases)
  – RDBMS, advanced data models (extended-relational, OO, deductive, etc.)
  – Application-oriented DBMS (spatial, scientific, engineering, etc.)
• 1990s (Advanced Data Analysis)
  – Data mining, data warehousing, multimedia databases, and Web databases
• 2000s
  – Stream data management and mining
  – Data mining and its applications
  – Web technology (XML, data integration) and global information systems

Page 3: introduction to data warehousing and mining

WHY MINE DATA? COMMERCIAL VIEWPOINT

• Lots of data is being collected and warehoused
  – Web data, e-commerce
  – Purchases at department/grocery stores
  – Bank/credit card transactions
• Competitive pressure is strong
  – Provide better, customized services for an edge (e.g. in Customer Relationship Management)

Page 4: introduction to data warehousing and mining

WHAT IS DATA MINING?

• Data mining (sometimes called data discovery or knowledge discovery in data) is the process of analyzing data from different perspectives and summarizing it into useful information.

• Extraction of interesting (non-trivial, implicit, previously unknown and potentially useful) patterns or knowledge from huge amounts of data
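To make "extraction of patterns" concrete, here is a minimal sketch that counts which item pairs occur together often enough to be worth a closer look. The basket data, the function name `frequent_pairs`, and the `min_support` threshold are all assumptions invented for illustration; they are not from the slides, and real pattern mining works on far larger data with more refined measures.

```python
from collections import Counter
from itertools import combinations

# Hypothetical market-basket data, invented for illustration.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]


def frequent_pairs(baskets, min_support):
    """Return item pairs that appear together in at least min_support baskets."""
    counts = Counter()
    for basket in baskets:
        # Sort so each pair is counted under a single canonical ordering.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}


pairs = frequent_pairs(transactions, min_support=3)
for pair, n in sorted(pairs.items()):
    print(pair, n)
```

None of these pairs is stored anywhere as a fact; each one emerges only from summarizing many records, which is the sense of "implicit" and "previously unknown" in the definition above.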

Page 5: introduction to data warehousing and mining

WHY MINE DATA? SCIENTIFIC VIEWPOINT

• Data collected and stored at enormous speeds (GB/hour)
  – remote sensors on a satellite
  – telescopes scanning the skies
  – microarrays generating gene expression data
  – scientific simulations generating terabytes of data
• Traditional techniques infeasible for raw data
• Data mining may help scientists
  – in classifying and segmenting data
  – in hypothesis formation

Page 6: introduction to data warehousing and mining

EXAMPLES: WHAT IS (NOT) DATA MINING?

What is not Data Mining?

– Look up phone number in phone directory

– Query a Web search engine for information about “Amazon”

What is Data Mining?

– Certain names are more prevalent in certain US locations (O’Brien, O’Rourke, O’Reilly… in Boston area)

– Group together similar documents returned by search engine according to their context (e.g. Amazon rainforest, Amazon.com)
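The contrast in the examples above can be sketched in a few lines: a lookup retrieves one stored fact for a known key, while a mining-style summary reveals a regularity across many records. The directory records, phone numbers, and function names below are hypothetical, made up only to mirror the phone-directory and surname examples.

```python
from collections import Counter

# Hypothetical directory records (name, city, phone), invented for illustration.
directory = [
    ("O'Brien", "Boston", "555-0101"),
    ("O'Rourke", "Boston", "555-0102"),
    ("O'Reilly", "Boston", "555-0103"),
    ("Smith", "Boston", "555-0104"),
    ("Smith", "Dallas", "555-0105"),
    ("Garcia", "Dallas", "555-0106"),
    ("Garcia", "Dallas", "555-0107"),
    ("O'Brien", "Dallas", "555-0108"),
]


def lookup_phone(name, city):
    """Not data mining: retrieve stored facts for a known key."""
    return [phone for n, c, phone in directory if n == name and c == city]


def prefix_share_by_city(prefix):
    """Closer to data mining: summarize how common a name prefix is per city."""
    total = Counter()
    matching = Counter()
    for name, city, _phone in directory:
        total[city] += 1
        if name.startswith(prefix):
            matching[city] += 1
    return {city: matching[city] / total[city] for city in total}


print(lookup_phone("Smith", "Dallas"))
print(prefix_share_by_city("O'"))
```

The second function answers a question nobody typed into the directory: in this toy data, names starting with "O'" make up a much larger share of the Boston records than of the Dallas records.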

Page 7: introduction to data warehousing and mining

DATA MINING IS ALSO CALLED...

• Knowledge discovery (mining) in databases (KDD), knowledge extraction, data/pattern analysis, data archeology, data dredging, information harvesting, business intelligence, etc.

• Real-world analogy: gold mining

Page 8: introduction to data warehousing and mining

DATA WAREHOUSE = COLLECTION OF DATABASES

Page 9: introduction to data warehousing and mining

WE HAVE TO USE DIFFERENT METHODS

Page 10: introduction to data warehousing and mining

RAW DATA = DATABASES + NOISY DATA

Page 11: introduction to data warehousing and mining

DATA SELECTION AND TRANSFORMATION

Page 12: introduction to data warehousing and mining

DATA CLEANING AND INTEGRATION

Page 13: introduction to data warehousing and mining

DATA MINING

Page 14: introduction to data warehousing and mining

PATTERN EVALUATION

Page 15: introduction to data warehousing and mining

KNOWLEDGE REPRESENTATION

Page 16: introduction to data warehousing and mining

KNOWLEDGE REPRESENTATION

Page 17: introduction to data warehousing and mining


KNOWLEDGE DISCOVERY (KDD) PROCESS

Data mining—core of knowledge discovery process

[Figure: KDD process flow. Databases → Data Cleaning and Data Integration → Data Warehouse → Selection → Task-relevant Data → Data Mining → Pattern Evaluation]
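The stages named in the KDD process can be sketched as a chain of small functions, one per stage. This is only an illustrative toy under stated assumptions: the two source lists, the record fields, and every function name are hypothetical, and the "mining" step is a plain frequency count standing in for a real mining algorithm.

```python
from collections import Counter

# Two hypothetical source "databases", invented for illustration.
store_sales = [
    {"customer": "c1", "item": "milk"},
    {"customer": "c2", "item": "bread"},
    {"customer": None, "item": "milk"},   # noisy record: missing customer
    {"customer": "c3", "item": "milk"},
]
web_sales = [
    {"customer": "c1", "item": "bread"},
    {"customer": "c4", "item": "milk"},
    {"customer": "c4", "item": ""},       # noisy record: missing item
]


def clean(records):
    """Data cleaning: drop records with a missing customer or item."""
    return [r for r in records if r["customer"] and r["item"]]


def integrate(*sources):
    """Data integration: combine cleaned sources into one warehouse list."""
    warehouse = []
    for source in sources:
        warehouse.extend(source)
    return warehouse


def select(warehouse):
    """Selection: keep only the task-relevant field (the purchased item)."""
    return [r["item"] for r in warehouse]


def mine(items):
    """Data mining (stand-in): count how often each item was purchased."""
    return Counter(items)


def evaluate(patterns, min_count):
    """Pattern evaluation: keep only patterns that meet a count threshold."""
    return {item: n for item, n in patterns.items() if n >= min_count}


warehouse = integrate(clean(store_sales), clean(web_sales))
task_relevant = select(warehouse)
patterns = mine(task_relevant)
interesting = evaluate(patterns, min_count=3)
print(interesting)
```

Keeping each stage as its own function mirrors the diagram: any one stage can be swapped for something more realistic without touching the others.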

