www.monash.edu.au

Advanced Topics in Data Mining and Research Directions

CSE5610 Intelligent Software Systems

Semester 1, 2006

Outline

• Mining Different Data Types

– Spatial, Temporal, Time Series, Data Streams, Multimedia, XML, Web, Text etc.

• Distributed Data Mining (DDM)

• Mobile & Ubiquitous Data Mining (UDM)

• Data Mining E-Services

• Anytime, Anywhere Data Mining E-Services

Generations of Data Mining

• Four Generations of Data Mining Systems – Robert Grossman

• First Generation

– Stand Alone, Centralised, Single Algorithm

• Second Generation

– Integration with databases, support for high-dimensionality, complex data types

• Third Generation

– Distribution and Heterogeneity

• Fourth Generation

– Support for mining embedded, mobile and ubiquitous data sources

Distributed Data Mining

• Inherently distributed data

• Multinational corporations (MNCs) + Global Markets

• => Physical/geographical separation of users from the data sources

• Traditional data mining model involving the co-location of users, data and computational resources is inadequate

Distributed Data Mining (DDM)

• The inherent distribution of data and other resources as a result of organisations being distributed.

• The large volumes of data, the transfer of which results in exorbitant communication costs.

• The need to mine heterogeneous data, the integration of which is both non-trivial and expensive.

• The performance and scalability bottlenecks of data mining.

Distributed Data Mining (DDM)

• DDM = Data Mining (DM) + Knowledge Integration (KI)

• DM - Performing traditional knowledge discovery at each distributed data site.

• KI - Merging the results generated from the individual sites into a body of cohesive and unified knowledge.
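
A minimal sketch of the DM + KI split, assuming frequent itemset counting as the mining task (the slides do not prescribe one). The function names `mine_site` and `integrate` and the sample transactions are hypothetical. Each site counts itemsets locally and ships only the counts; the integration step sums them and applies a global support threshold.

```python
from collections import Counter
from itertools import combinations


def mine_site(transactions, max_size=2):
    """DM step (hypothetical helper): count itemsets locally at one data site."""
    counts = Counter()
    for t in transactions:
        items = sorted(set(t))
        for k in range(1, max_size + 1):
            counts.update(combinations(items, k))
    # Return the local counts together with the number of local transactions.
    return counts, len(transactions)


def integrate(site_results, min_support):
    """KI step (hypothetical helper): merge per-site counts into global supports."""
    total = Counter()
    n = 0
    for counts, size in site_results:
        total.update(counts)
        n += size
    return {iset: c / n for iset, c in total.items() if c / n >= min_support}


# Illustrative data held at two separate sites; raw transactions never move.
site_a = [["milk", "bread"], ["milk"], ["bread", "eggs"]]
site_b = [["milk", "bread"], ["milk", "eggs"]]

results = [mine_site(site_a), mine_site(site_b)]
frequent = integrate(results, min_support=0.5)
print(frequent)
```

Because the sites send unpruned counts, the merged supports match what a centralised run over all five transactions would give. Pruning locally would cut traffic, but a further round would then be needed to recover exact global counts.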

Parallel Data Mining (PDM)

• Principal distinction between DDM & Parallel DM

– parallel mining involves parallel processors with or without shared memory

• Parallel data mining also includes development of parallel versions of traditional data mining techniques.

• The two can be integrated, e.g. DecisionCentre
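
One way to picture a parallel version of a traditional technique is a data-parallel k-means update, sketched here under assumptions: the choice of k-means, the 1-D data and the names `partial_stats` and `parallel_kmeans_step` are illustrative and do not come from the slides. Threads stand in for processors with shared memory; in CPython they show the structure rather than deliver a CPU speed-up.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_stats(chunk, centroids):
    """Per-chunk sums and counts of the points nearest to each centroid."""
    sums = [0.0] * len(centroids)
    counts = [0] * len(centroids)
    for x in chunk:
        j = min(range(len(centroids)), key=lambda i: abs(x - centroids[i]))
        sums[j] += x
        counts[j] += 1
    return sums, counts


def parallel_kmeans_step(chunks, centroids, workers=2):
    """One k-means update with the assignment step split across workers."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        stats = list(pool.map(lambda c: partial_stats(c, centroids), chunks))
    new_centroids = []
    for j, old in enumerate(centroids):
        total = sum(s[j] for s, _ in stats)
        n = sum(c[j] for _, c in stats)
        # Keep the old centroid if no point was assigned to it.
        new_centroids.append(total / n if n else old)
    return new_centroids


# Illustrative data, already partitioned into two chunks.
chunks = [[1.0, 2.0, 9.0], [3.0, 10.0, 11.0]]
centroids = parallel_kmeans_step(chunks, [0.0, 8.0])
print(centroids)
```

Only the small per-chunk sums and counts are combined, so the same split would also work across processes without shared memory.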

DDM – Algorithms & Architectures

• Research in distributed data mining can be divided into two broad categories [Fu01]:

• Data Mining Algorithms

– focus on efficient techniques for knowledge integration

• Distributed Data Mining Architectures

– focus on development of distributed data mining architectures

– emphasizes the processes and technologies that support construction of software systems to perform distributed data mining

Taxonomy of DDM Architectures

• Distributed Data Mining Systems Architectures

– Client-Server

– Agents

> Stationary

> Mobile (self-directed migration)

Classification – DDM Systems

DDM Architectural Models and DDM Systems

• Client-server: DecisionCentre [CDG99], IntelliMiner [PaS99, PaS01], InterAct [PaD02]

• Agents (mobile agent, stationary agent): JAM [SPT97], Infosleuth [UMG98, MUU99], BODHI [KPH99], Papyrus [Ram98], PADMA [KHS97a, KHS97b]

Client-Server DDM

[Figure: client-server DDM. Labels shown: users (PC, Workstation, Laptop), Data Mining Server, Data Server 1, Data Server 2. Flows shown: User Data Mining Request, Data Transfer, Data Mining Results.]

Mobile Agent Model for DDM

[Figure: mobile agent model for DDM. Labels shown: Users (PC, Workstation, Laptop), Agent System, Task Controlling Agent, Directory Service, Data Mining Agents, Data Mining Result Agents, Knowledge Integration Agent, Data Resource Agents, and two data servers (both labelled Data Server 1).]

Hybrid Model for DDM

[Figure: hybrid model for DDM. Labels shown: DDM Server, Optimiser, Agent Centre, Client Server, Agents, and Data Source 1, Data Source 2 through Data Source n.]

Ubiquitous Data Mining (UDM)

• Mining data in a resource-constrained environment to support the time critical information needs of mobile users

• Typical Characteristics

– Mobile User – frequent disconnections

– Handheld Device

> Resource constraints – memory, battery, processor, screen real-estate

– Time critical

– Real-time & On-line – Data Streams

• Example Scenarios

• Many Challenges

Current Research

• Kargupta’s Group

– MobiMine

• @CSSE, Monash Univ.

– AgentUDM

– Adaptive, Cost-efficient & Light-weight data mining techniques for data streams

> Mohamed Medhat

> LWC, LWF & LWClass

> Watch this space!!!
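
The slides name LWC, LWF and LWClass without describing them, so the following is not one of those algorithms. It is a generic sketch of the stated idea (adaptive, light-weight, one pass over a stream), assuming 1-D readings and a fixed cap on the number of clusters as the resource constraint. The class `OnePassClusterer` and its parameters are hypothetical.

```python
class OnePassClusterer:
    """Illustrative one-pass clustering of a 1-D stream under a memory budget."""

    def __init__(self, threshold, max_clusters):
        if threshold <= 0 or max_clusters < 1:
            raise ValueError("threshold must be > 0 and max_clusters >= 1")
        self.threshold = threshold
        self.max_clusters = max_clusters
        self.clusters = []  # each entry is [mean, weight]

    def add(self, x):
        # Absorb x into the nearest cluster if it lies within the threshold.
        if self.clusters:
            nearest = min(self.clusters, key=lambda c: abs(c[0] - x))
            if abs(nearest[0] - x) <= self.threshold:
                nearest[1] += 1
                nearest[0] += (x - nearest[0]) / nearest[1]
                return
        self.clusters.append([float(x), 1])
        # Over budget: adapt by widening the threshold and merging neighbours.
        while len(self.clusters) > self.max_clusters:
            self.threshold *= 2
            self._merge()

    def _merge(self):
        merged = []
        for mean, weight in sorted(self.clusters):
            if merged and mean - merged[-1][0] <= self.threshold:
                m, w = merged[-1]
                merged[-1] = [(m * w + mean * weight) / (w + weight), w + weight]
            else:
                merged.append([mean, weight])
        self.clusters = merged


# Illustrative stream; each reading is seen once and then discarded.
clusterer = OnePassClusterer(threshold=0.5, max_clusters=3)
for reading in [1.0, 1.2, 5.0, 5.1, 9.0, 9.3, 20.0]:
    clusterer.add(reading)
print(clusterer.clusters, clusterer.threshold)
```

The trade-off is explicit: when memory runs short, the summary becomes coarser instead of the device running out of resources.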

Data Mining E-Services

• “…data analysis and mining functions themselves will be offered as business intelligence e-services that accept operational data from clients and return models or rules”

Umesh Dayal, 2001

• Why?

– Knowledge is a key resource

– Cost of data mining infrastructure
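
The quoted contract (operational data in, models or rules out) can be sketched as a single function. Everything here is an assumption made for illustration: the name `mining_service`, the JSON request and response fields, and the choice of simple pairwise association rules as the returned model.

```python
import json
from collections import Counter
from itertools import permutations


def mining_service(request_json):
    """Hypothetical e-service entry point: operational data in, rules out."""
    request = json.loads(request_json)
    transactions = [set(t) for t in request["transactions"]]
    min_conf = request.get("min_confidence", 0.6)

    item_counts = Counter(i for t in transactions for i in t)
    pair_counts = Counter(
        p for t in transactions for p in permutations(sorted(t), 2)
    )
    # Confidence of "a implies b" = count(a and b) / count(a).
    rules = [
        {"if": a, "then": b, "confidence": n / item_counts[a]}
        for (a, b), n in pair_counts.items()
        if n / item_counts[a] >= min_conf
    ]
    return json.dumps({"rules": rules})


# Illustrative client request.
request = json.dumps({
    "transactions": [["tea", "milk"], ["tea", "milk"], ["tea"], ["milk", "sugar"]],
    "min_confidence": 0.9,
})
response = mining_service(request)
print(response)
```

The client never needs the mining infrastructure itself, which is the cost argument made on the slide.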

Data Mining E-Services

• Current Commercial Landscape

– Several application service providers (ASPs) -> DigiMine, Information Discovery, WhiteCross Systems, ListAnalyst.com etc.

– Mode of Operation

• Hybrid Model & Data Mining ASPs

– Optimise Response Time

> Leads to improved throughput

– QoS Estimation

– Location Preferences of Clients
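
A rough sketch of how an optimiser might pick between the client-server and mobile agent models by estimated response time. The cost formulas, parameter names and numbers are assumptions for illustration; the slides name response-time optimisation and QoS estimation but do not give a model.

```python
def client_server_time(data_mb, bandwidth_mbps, server_mine_s):
    """Move the raw data to the mining server, then mine there."""
    return data_mb * 8 / bandwidth_mbps + server_mine_s


def mobile_agent_time(agent_mb, result_mb, bandwidth_mbps, site_mine_s):
    """Ship the agent to the data, mine in place, send back only the results."""
    return (agent_mb + result_mb) * 8 / bandwidth_mbps + site_mine_s


def choose_model(data_mb, agent_mb, result_mb, bandwidth_mbps,
                 server_mine_s, site_mine_s):
    """Return the model with the lower estimated response time, in seconds."""
    cs = client_server_time(data_mb, bandwidth_mbps, server_mine_s)
    ma = mobile_agent_time(agent_mb, result_mb, bandwidth_mbps, site_mine_s)
    return ("client-server", cs) if cs <= ma else ("mobile-agent", ma)


# Hypothetical figures: a fast server (60 s), a slower data site (300 s), 8 Mbit/s link.
large = choose_model(data_mb=800, agent_mb=1, result_mb=1,
                     bandwidth_mbps=8, server_mine_s=60, site_mine_s=300)
small = choose_model(data_mb=1, agent_mb=1, result_mb=1,
                     bandwidth_mbps=8, server_mine_s=60, site_mine_s=300)
print(large, small)
```

Under these made-up figures, the large data set favours sending the agent to the data, while the small one favours moving the data to the faster server.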

Anytime, Anywhere Data Mining E-Services

My Thoughts

• Data is a commodity, Analysis is a service

• Access anytime, anywhere

• By anyone…

– From large corporations to small business to individuals

• From home buyers to mobile salespersons to grocery shoppers…

• A preliminary model for delivery

– Datacentric Grids

[Figure: datacentric grid model. Labels shown: High Performance Servers, Mining Algorithms, Model Repository, Mobile Agent Management System, Data Repository (Data 1, Data 2, Data n), Private Datacentric Grid, Datacentric Grid Management Module. Request types shown: Model Query; Compute New Model Request; Compute New Model Request + Remote User Data; Compute New Model Request + User Data; Compute New Model Request + User Computation; Compute New Model Request + User Data + User Computation.]

References

• http://www.csse.monash.edu.au/projects/MobileComponents/projects/dame/

• http://www.csse.monash.edu.au/~shonali/research.html

• http://www.csee.umbc.edu/~hillol/DDMBIB/

• http://www.csee.umbc.edu/~hillol/diadic.html

• http://www.csse.monash.edu.au/~mgaber/main.html

