Date post: 04-Jan-2016
Page 1: Intelligent Internet Agents for Distributed Data Mining {yzhang, sowen, sprasad, raj}@cs.gsu.edu gjv@ece.gatech.edu Yanqing Zhang, Scott Owen, Sushil Prasad.

Intelligent Internet Agents for Distributed Data Mining

{yzhang, sowen, sprasad, raj}@cs.gsu.edu, gjv@ece.gatech.edu

Yanqing Zhang, Scott Owen, Sushil Prasad and Raj Sunderraman

Department of Computer Science

Georgia State University

George Vachtsevanos

School of Electrical and Computer Engineering

Georgia Institute of Technology


Outline

• Motivation

• Architecture of Intelligent Internet Agents

• Program Libraries of Intelligent Middleware

• Smart Web Search Agents

• Intelligent Soft Computing Agents

• Benefits

• Deliverables

• Conclusion


Motivation

• Distributed Web KDD: Useful information and knowledge are mined from distributed Web databases

• QoS (Efficiency, Web Speed, User Time): Huge amounts of useless data flow on the Internet

• From Data Web to Information Web: Upgrade the current data-flow-oriented Internet to a future information-flow-oriented Internet

• Intelligent Web Middleware: with reusable, portable and scalable intelligent functionality

• Smart E-Business: Use intelligent Web agents to do better E-Business on the Internet


Architecture of Intelligent Internet Agents

Application Layer: E-Commerce, E-Education, other E-B

Intelligent Layer: Data Mining, Soft Computing, ES, etc.

Network Layer: Backbone, gigaPoPs, other hardware


Program Libraries of Intelligent Middleware

1. Binary Association Rule Generator
2. Fuzzy Association Rule Generator
3. Neural-Net-based Data Classifier and Pattern Generator
4. Fuzzy c-means Program for Data Clustering
5. Genetic Algorithms for Data Refinement and Optimization
6. Granular Neural Nets for Linguistic Data Mining
7. XML-based Smart Web Search Sub-Programs
8. Connection Programs between Database and Middle Layer
9. Local Cache Database Manager
10. Local Cache Informationbase Manager
11. Basic GUI Programs
12. Client-Server Creation and Communication Programs
13. Distributed Operation Manager
14. Distributed Data Mining Synchronization
15. Web Customer Log Miner, and so on.
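The slides do not specify how the first library, the binary association rule generator, works. As a minimal sketch, assuming the standard support/confidence formulation over market-basket transactions and restricting rules to item pairs, it might look like the following. The function name `pair_rules`, the thresholds, and the sample transactions are illustrative assumptions, not part of the original design.

```python
from itertools import combinations


def pair_rules(transactions, min_support=0.5, min_confidence=0.6):
    """Return (antecedent, consequent, support, confidence) tuples for item pairs."""
    n = len(transactions)
    baskets = [set(t) for t in transactions]
    items = sorted(set().union(*baskets))
    # Support of a single item: fraction of baskets that contain it.
    item_support = {i: sum(i in b for b in baskets) / n for i in items}
    rules = []
    for a, b in combinations(items, 2):
        pair_support = sum(a in bk and b in bk for bk in baskets) / n
        if pair_support < min_support:
            continue
        # Each frequent pair yields two candidate rules: a -> b and b -> a.
        for x, y in ((a, b), (b, a)):
            confidence = pair_support / item_support[x]
            if confidence >= min_confidence:
                rules.append((x, y, pair_support, confidence))
    return rules


# Hypothetical transactions, for illustration only.
transactions = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["bread", "butter"],
    ["milk"],
]
rules = pair_rules(transactions)
for antecedent, consequent, support, confidence in rules:
    print(f"{antecedent} -> {consequent}: support={support:.2f}, confidence={confidence:.2f}")
```

A full generator would also handle itemsets larger than two, for example with an Apriori-style candidate expansion.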


Smart Web Search Agents

• Data Search Engines >> Information Search Agents

- Traditional searching on the Web is done using one of the following three:

  - Directories (Yahoo, Lycos, etc.)

  - Search Engines (AltaVista, NorthernLight, etc.)

  - Metasearch Engines (MetaCrawler, SavvySearch, AskJeeves, etc.)

All of these involve keyword searches.

Drawback: not easily personalized, too many results (although many give relevancy factors)


- Smart Search Agents will provide

- more personalized searches

- domain-based searches

- more efficient searches


Smart Search Agents will employ

- local cache databases (containing frequently asked queries/results; possibly updated periodically, e.g. nightly)

- local cache information base (containing mined information and discovered knowledge for efficient personal use)

- domain-based agents (e.g. Job Search; Sports-NBA Stats, Bibliography-Digital Libraries)
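The local cache database idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the design used in the projects named on these slides: the class name `CachedSearchAgent`, the in-memory dictionary, and the 24-hour expiry standing in for a nightly update are all hypothetical, and `fake_backend` is a stand-in for a remote search engine.

```python
import time


class CachedSearchAgent:
    """Answers repeated queries from a local cache and refetches stale entries."""

    def __init__(self, backend, max_age_seconds=24 * 60 * 60, clock=time.time):
        self.backend = backend          # callable: query string -> list of results
        self.max_age = max_age_seconds  # assumed: "nightly" modeled as a 24-hour expiry
        self.clock = clock              # injectable clock, so expiry can be tested
        self.cache = {}                 # normalized query -> (timestamp, results)
        self.backend_calls = 0

    def search(self, query):
        # Normalize case and whitespace so equivalent queries share one entry.
        key = " ".join(query.lower().split())
        now = self.clock()
        entry = self.cache.get(key)
        if entry is not None and now - entry[0] < self.max_age:
            return entry[1]             # fresh cache hit: no network traffic
        results = self.backend(key)
        self.backend_calls += 1
        self.cache[key] = (now, results)
        return results


def fake_backend(query):
    # Hypothetical stand-in for a remote search engine.
    return [f"result for {query}"]


fake_now = [0.0]
agent = CachedSearchAgent(fake_backend, max_age_seconds=10, clock=lambda: fake_now[0])
first = agent.search("NBA  stats")   # cache miss, goes to the backend
second = agent.search("nba stats")   # same normalized query, served from cache
fake_now[0] = 11.0
third = agent.search("nba stats")    # entry is stale, fetched again
```

A persistent store and a scheduled batch refresh would be closer to the "updated nightly" wording; the expiry check here only approximates it.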


Some initial results:

• M. Nagarajan, Metagenie - A metasearch engine for multi-databases, M.S. thesis, GSU (July 1999). Domains: Jobs, Books

• S. Ahmed, EXACT-FINDER: A cache-based meta-search engine, M.S. thesis, GSU (May 2000). Local cache database storing personalized frequently asked queries and results, updated periodically

• R. Sunderraman, ReQueSS: Relational Querying of semi-structured data, ICDE 2000 (demo session), San Diego, CA, March 2000.

• X. Li, Querying unified sources of Web data, M.S. thesis, GSU (July 1999). Data wrappers for Web sources (NBA stats/box scores, DBLP Bibliography database)


Intelligent Tools for E-Business

• Computational Intelligence, Neural Networks, Fuzzy Logic, Genetic Algorithms, Hybrid Systems

• Learning Algorithms, Heuristic Searching

• Data Analysis and Modeling, Data Fusion and Mining, Knowledge Discovery

• Prediction & Time Series Analysis

• Information Retrieval, Intelligent User Interface

• Intelligent Agents, Distributed IA and Multi-Agents, Cooperative Knowledge-based Systems


Enhancing E-Business Process Through Data Mining

• Quality of discovered knowledge

– Having the right data

– Having appropriate data mining tools

[Figure: data warehouses feed Data Mining (Knowledge Discovery), which yields failure patterns and success patterns]

• Traditional Data Mining Tools

– Simple query and reporting

– Visualization-driven data exploration tools, OLAP

– Discovery process is user-driven


Intelligent Data Mining Tools

• Automate the process of discovering patterns/knowledge in data

• Require hypothesis, exploration

• Derive business knowledge (patterns) from data

• Combine business knowledge of users with results of discovery algorithms

[Figure: data warehouses, failure patterns, success patterns]


Intelligent Information Agents

• The Data Mining Problem:

– Clustering/Classification

– Association

– Sequencing

• Viewed as an Optimization Problem

• Tools: Genetic Algorithms


Fuzzy Rule Discovery

• Rule discovery: the discovery of associations between business events, e.g. which items are purchased together

• To support flexible querying and intelligent searching, fuzzy queries are developed to uncover potentially valuable knowledge

• A fuzzy query uses fuzzy terms like tall, small, and near to define linguistic concepts and formulate a query

• Automated search for fuzzy rules is carried out by the discovery of fuzzy clusters or segmentation in data
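A fuzzy query over terms such as "tall" and "near" can be illustrated with simple membership functions. The sketch below is only one possible reading: the linear ramp shapes, the breakpoints (160/190 cm, 1/10 km), the use of the minimum as fuzzy AND, and the record fields are all assumptions made for the example.

```python
def ramp_up(x, low, high):
    """Membership rising linearly from 0 at `low` to 1 at `high`."""
    if x <= low:
        return 0.0
    if x >= high:
        return 1.0
    return (x - low) / (high - low)


def tall(height_cm):
    # Assumed breakpoints: not tall at or below 160 cm, fully tall at or above 190 cm.
    return ramp_up(height_cm, 160.0, 190.0)


def near(distance_km):
    # Assumed breakpoints: fully near at or below 1 km, not near at or above 10 km.
    return 1.0 - ramp_up(distance_km, 1.0, 10.0)


def fuzzy_query(records, conditions, threshold=0.5):
    """Rank records by the minimum membership over all conditions (fuzzy AND)."""
    scored = []
    for record in records:
        degree = min(fn(record[field]) for field, fn in conditions.items())
        if degree >= threshold:
            scored.append((degree, record["name"]))
    return sorted(scored, reverse=True)


# Hypothetical records, for illustration only.
people = [
    {"name": "A", "height_cm": 190, "distance_km": 1},
    {"name": "B", "height_cm": 175, "distance_km": 1},
    {"name": "C", "height_cm": 190, "distance_km": 10},
]
matches = fuzzy_query(people, {"height_cm": tall, "distance_km": near})
print(matches)
```

Unlike a crisp query such as `height_cm >= 180`, each match carries a degree, so results can be ranked rather than only filtered.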


Fuzzy Decision Making: Match Users with Dynamic Products, Services, and Pricing

[Figure: Risk-Response-Retention (R^3) Model, an example of 3 service provider's features. Axes: Loss Ratio (Risk), Response, Persistency (Retention), each graded Low / Medium / High]

Low Risk, High Response, High Retention

-> Customer: Preferred

Pricing: according to Life-time Value

Cross-Selling: Bundle Extra Liability Insurance
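The single rule shown on this slide (low risk, high response, high retention implies a preferred customer) can be sketched as a fuzzy rule evaluation. The membership shapes, the [0, 1] normalization of the three scores, the 0.5 threshold, and the "Unclassified" fallback are assumptions; the slide gives only the rule and its outcomes.

```python
def low(x):
    """Membership in 'Low' for a score normalized to [0, 1] (assumed shape)."""
    return max(0.0, min(1.0, 1.0 - 2.0 * x))


def high(x):
    """Membership in 'High' for a score normalized to [0, 1] (assumed shape)."""
    return max(0.0, min(1.0, 2.0 * x - 1.0))


def preferred_degree(risk, response, retention):
    """Firing strength of: Low Risk AND High Response AND High Retention."""
    return min(low(risk), high(response), high(retention))


def recommend(risk, response, retention, threshold=0.5):
    degree = preferred_degree(risk, response, retention)
    if degree >= threshold:
        return {
            "customer": "Preferred",
            "pricing": "according to life-time value",
            "cross_selling": "bundle extra liability insurance",
            "degree": degree,
        }
    # The slide defines no other rules; this fallback label is hypothetical.
    return {"customer": "Unclassified", "degree": degree}


good = recommend(risk=0.125, response=0.875, retention=1.0)
risky = recommend(risk=0.5, response=1.0, retention=1.0)
print(good["customer"], good["degree"])
print(risky["customer"], risky["degree"])
```

A complete R^3 model would carry one rule per cell of the Low/Medium/High grid; only the cell named on the slide is encoded here.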


Measuring Performance of Intelligent Agents

• Accuracy: distance or variance measure of IAs’ performance from their goal, i.e. Fuzzy Entropy

• Speed: latency of response

• Cost: resources consumed, consequences of failures

• Benefit: payoff for goals achieved

$$ \mathrm{IAP} = w_1 \cdot \mathrm{Accuracy} + w_2 \cdot \mathrm{Speed} + w_3 \cdot \mathrm{Cost} + w_4 \cdot \mathrm{Benefit} + \dots $$
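Read as a weighted sum of the four measures, the intelligent agent performance (IAP) score can be computed as below. The slide does not give weight values, signs, or metric scales, so the numbers here are assumptions: metrics are taken as normalized to [0, 1], and cost is given a negative weight so that higher cost lowers the score.

```python
def iap(metrics, weights):
    """Weighted sum of performance metrics: IAP = sum over i of w_i * metric_i."""
    missing = set(weights) - set(metrics)
    if missing:
        raise KeyError(f"missing metrics: {sorted(missing)}")
    return sum(weights[name] * metrics[name] for name in weights)


# Hypothetical weights and measurements, for illustration only.
weights = {"accuracy": 0.5, "speed": 0.25, "cost": -0.25, "benefit": 0.5}
metrics = {"accuracy": 1.0, "speed": 0.5, "cost": 0.5, "benefit": 1.0}
score = iap(metrics, weights)
print(score)
```

Additional measures, matching the trailing "..." in the expression, can be added by extending both dictionaries.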


Performance Assessment, Learning and Optimization

[Figure: block diagram with data warehouses, failure patterns, success patterns, Goals/Objectives, a Performance Evaluation Module, and Learning/Adaptation]


Examples

• Product Information Clustering

– Use a GA as the Heuristic Search Engine

– Apply the GA selection and inversion operators

– Evaluate information content

– Estimate system entropy

– Apply reinforcement learning strategy

• Dynamic Pricing

– In addition to above steps, explore association and sequencing relations
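The first two clustering steps (a GA as the search engine, with selection and inversion operators) can be sketched as follows. This is a simplified illustration, not the authors' method: the fitness is the within-cluster sum of squared errors, swapped in for the information-content and entropy measures named above; the reinforcement learning step is omitted; and the 1-D data, tournament selection, and elitism are assumptions.

```python
import random


def within_cluster_sse(points, labels, k):
    """Sum of squared distances from each 1-D point to its cluster mean."""
    total = 0.0
    for c in range(k):
        members = [p for p, label in zip(points, labels) if label == c]
        if members:
            mean = sum(members) / len(members)
            total += sum((p - mean) ** 2 for p in members)
    return total


def invert(chromosome, rng):
    """Inversion operator: reverse a randomly chosen segment of the chromosome."""
    i, j = sorted(rng.sample(range(len(chromosome) + 1), 2))
    return chromosome[:i] + chromosome[i:j][::-1] + chromosome[j:]


def tournament(population, fitness, rng, size=3):
    """Selection operator: keep the fittest of a small random sample."""
    return max(rng.sample(population, size), key=fitness)


def ga_cluster(points, k=2, pop_size=30, generations=60, seed=0):
    rng = random.Random(seed)

    def fitness(labels):
        return -within_cluster_sse(points, labels, k)  # lower SSE is fitter

    # A chromosome assigns one cluster label to each point.
    population = [[rng.randrange(k) for _ in points] for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):
        children = [
            invert(tournament(population, fitness, rng), rng)
            for _ in range(pop_size - 1)
        ]
        population = children + [best]  # elitism: never lose the best-so-far
        best = max(population, key=fitness)
    return best, within_cluster_sse(points, best, k)


# Hypothetical 1-D product feature values, for illustration only.
points = [1.0, 1.2, 0.8, 9.0, 9.5, 8.7]
labels, sse = ga_cluster(points)
print(labels, sse)
```

Because inversion only reorders labels, a fuller GA would normally add mutation or crossover so that cluster sizes can change between generations.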


The “New Technology” Paradigm

[Figure: Internet-related technologies over time, with phases labeled Euphoria/Optimism, Reality, and Back to Basics]


INFORMATION IS SELLING NOW!

Intelligent Agents will give your information product bargaining power


Benefits

• Better QoS:

- Web users get information (not raw data)

- Smart agents can make decisions for users

- Smart agents can save users’ surfing time

• Faster Internet:

- Information flows on the Internet quickly (e.g., 1k information << 100k raw data)

- Reduce data redundancy on the Internet

- Reduce Web communication congestion


Deliverables

• Intelligent Middle Layer

- Data Mining Program Libraries

- Soft Computing Program Libraries (e.g., Neural Networks, Fuzzy Logic, Genetic Algorithms, Neuro-fuzzy Systems)

• Application Layer

- Smart Web Search Agents

- Intelligent Soft Computing Agents


Conclusion

• To make the future Internet more intelligent and more efficient, it is necessary to design relevant "Intelligent Middleware" between network hardware and high-level Web application systems.

• We will first design a basic intelligent middle layer with basic intelligent functionality, and then implement two Web application systems for distributed data mining and E-Business.

