A Semantic-Aided Designer for Knowledge Discovery
Claudia Diamantini, Domenico Potena, Emanuele Storti
CTS2011, Philadelphia, May 23-27
Università Politecnica delle Marche, Department of Computer Science, Management and Automation, Ancona, Italy
Introduction
Organizations need methods and technologies to analyze huge amounts of data, to support decisional processes
Knowledge Discovery in Databases (KDD) is the process of identifying valid, novel, potentially useful patterns in data
many steps, iterations, interaction, user knowledge
Data explosion & KDD
Introduction
DB / DWH administrator
domain expert
DM specialists
KDD specialists
1st generation of IDAs (Intelligent Data Analysis systems):
● local frameworks
● single-user
● predefined set of tools (little extensibility)
How to support the design of a KDD project in an open, distributed and collaborative scenario?
2nd generation: distribution of tools & computational aspects
Evolution of organizations: distribution of users, collaboration
● tools should be easily and dynamically added to the platform
● they should be accessible, searchable, executable via standard API
● suggestions about the best tool sequences
● support for tool setup and process execution
Issues
Heterogeneity & tool distribution
Many KDD and Data Mining tools available for any domain/task, many possible combinations
Heterogeneous interfaces: programming languages, OSs, transfer protocols, ...
Complex to use: process design, data preparation, precondition satisfaction, I/O interpretation
● collaborative design of KDD processes
● tool/process sharing and annotation
● easy join of new partners in Virtual Teams
Issues
User distribution
Distributed organizations:
● multiple branch enterprises
● E-Science project
Collaboration:
● source of complexity
● distributed computation: several users can succeed where a single user is likely to fail
Basic Services
Services for any KDD task: every KDD tool is wrapped as a Web Service, deployed on the publisher's server, and published in a common repository
C4.5 tool → C4.5 service
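As an illustration only, the wrap-and-publish step could be sketched as below. The names `KDDService`, `ServiceRepository`, `publish` and `lookup`, the endpoint URL, and the stand-in tool are all hypothetical; the slides do not specify the platform's actual Web Service interface or repository API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class KDDService:
    """A KDD tool exposed behind a uniform call interface (hypothetical)."""
    name: str                           # e.g. "C4.5 service"
    endpoint: str                       # where the publisher hosts it (made-up URL)
    run: Callable[[List[dict]], dict]   # the wrapped tool


class ServiceRepository:
    """In-memory stand-in for the common repository of published services."""

    def __init__(self) -> None:
        self._services: Dict[str, KDDService] = {}

    def publish(self, service: KDDService) -> None:
        self._services[service.name] = service

    def lookup(self, name: str) -> KDDService:
        return self._services[name]


def majority_class_tool(rows: List[dict]) -> dict:
    """Placeholder for a real tool such as C4.5: returns the most common label."""
    labels = [row["label"] for row in rows]
    return {"model": max(set(labels), key=labels.count)}


repository = ServiceRepository()
repository.publish(
    KDDService("C4.5 service", "http://publisher.example/c45", majority_class_tool)
)

# Any client can now find and invoke the tool through the same interface.
rows = [{"label": "yes"}, {"label": "yes"}, {"label": "no"}]
result = repository.lookup("C4.5 service").run(rows)
print(result)
```

The point of the sketch is the uniform `run` interface: once a tool is wrapped, clients depend on the service entry rather than on the tool's own language or platform.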
Methodology
Service Oriented Architecture
Support Services
Back-end services:
● access control
● data transfer
● service publishing
● UDDI registry
High-level functionalities:
● service discovery
● interface matchmaking
● process composition
Separation of information in 3 abstraction layers
● Tools/services are annotated through XML descriptors: details about interfaces and QoS
● Algorithms are formally described in a KDD ontology, which contains an algorithm taxonomy and high-level information about their tasks, methods and functionalities
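A minimal sketch of how the layers might relate, assuming a service descriptor that points to an algorithm in a toy taxonomy. The field names, the QoS entry, and the taxonomy classes above C4.5 and ID3 are illustrative assumptions, not the actual descriptor schema or ontology.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Algorithm layer: toy taxonomy as child -> parent (illustrative entries).
ALGORITHM_TAXONOMY: Dict[str, Optional[str]] = {
    "Algorithm": None,
    "ClassificationAlgorithm": "Algorithm",
    "DecisionTreeAlgorithm": "ClassificationAlgorithm",
    "C4.5": "DecisionTreeAlgorithm",
    "ID3": "DecisionTreeAlgorithm",
}


@dataclass
class ServiceDescriptor:
    """Service layer: interface and QoS details (hypothetical fields)."""
    service_name: str
    tool_name: str                      # tool layer: the concrete implementation
    implements: str                     # algorithm layer: concept in the taxonomy
    inputs: Dict[str, str] = field(default_factory=dict)
    outputs: Dict[str, str] = field(default_factory=dict)
    qos: Dict[str, float] = field(default_factory=dict)


def ancestors(algorithm: str) -> List[str]:
    """Walk the taxonomy from an algorithm up to the root."""
    chain: List[str] = []
    parent = ALGORITHM_TAXONOMY[algorithm]
    while parent is not None:
        chain.append(parent)
        parent = ALGORITHM_TAXONOMY[parent]
    return chain


descriptor = ServiceDescriptor(
    service_name="C4.5 service",
    tool_name="C4.5_v.2.0",
    implements="C4.5",
    inputs={"training_set": "LabeledDataset"},
    outputs={"model": "ClassificationModel"},
    qos={"availability": 0.99},         # made-up value
)

print(ancestors(descriptor.implements))
```

Keeping the algorithm reference separate from the interface details is what allows a tool to be replaced without touching the ontology, and vice versa.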
Methodology
Semantic descriptors for Basic Services
KDD tools
KDD algorithms (e.g. ID3)
Benefits: loose-coupling, reusability
Support services rely on such layers:
● service discovery
● interface matchmaking
● process composition
Methodology
KDD services
Methodology
[Figure: the abstraction layers, with labels Labeled Dataset; KDD services (C4.5_v.2.0); KDD ontology (C4.5 algorithm, Remove missing values algorithm)]
A web-based tool aimed at supporting users in collaborative KDD process design
KDDesigner
Service discovery
Retrieval of KDD services satisfying user requirements
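One way such a retrieval could work, sketched under assumptions: services are annotated with an algorithm, and a request names a concept in the algorithm taxonomy. The taxonomy entries, the service names other than `C4.5_v.2.0`, and the `discover` function are hypothetical.

```python
from typing import Dict, List, Optional

# Toy taxonomy (child -> parent); entries are illustrative assumptions.
TAXONOMY: Dict[str, Optional[str]] = {
    "Algorithm": None,
    "ClassificationAlgorithm": "Algorithm",
    "PreprocessingAlgorithm": "Algorithm",
    "C4.5": "ClassificationAlgorithm",
    "ID3": "ClassificationAlgorithm",
    "RemoveMissingValues": "PreprocessingAlgorithm",
}

# Published services and the algorithm each one is annotated with.
SERVICES: Dict[str, str] = {
    "C4.5_v.2.0": "C4.5",
    "ID3_service": "ID3",
    "RemoveMissingValues_service": "RemoveMissingValues",
}


def is_a(algorithm: str, concept: str) -> bool:
    """True if `algorithm` equals `concept` or is one of its descendants."""
    current: Optional[str] = algorithm
    while current is not None:
        if current == concept:
            return True
        current = TAXONOMY.get(current)
    return False


def discover(required_concept: str) -> List[str]:
    """Return the services whose algorithm satisfies the requested concept."""
    return sorted(
        name for name, algorithm in SERVICES.items()
        if is_a(algorithm, required_concept)
    )


print(discover("ClassificationAlgorithm"))
```

Because the query is resolved against the taxonomy rather than against service names, a newly published classification service would be found without changing the query.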
KDDONTO
Process design
Interface matchmaking
Verification of data compatibility in an I/O connection
Matchmaker service checks the validity of the match
● syntactic compatibility: comparison between service descriptors (I/O primitive datatype and syntax)
same format? same primitive datatype?
● semantic compatibility: comparison between ontological annotations of the services (kind of match between I/O, preconditions/postconditions, and more)
same concept? subconcept? part-of concept?
● Output: cost of match
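The two checks can be sketched as a single function that returns a cost, or `None` when the connection is invalid. The cost values (0, 1, 2), the concept names, and the `Port` structure are assumptions made for illustration; the slides do not state how the matchmaker actually weights each kind of match.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Port:
    """One side of an I/O connection (hypothetical structure)."""
    concept: str       # ontological annotation
    datatype: str      # primitive datatype
    data_format: str   # syntax of the exchanged data

# Toy ontology relations (illustrative assumptions).
SUBCONCEPT_OF: Dict[str, str] = {"LabeledDataset": "Dataset"}
PART_OF: Dict[str, str] = {"ClassLabels": "LabeledDataset"}


def match_cost(output: Port, needed: Port) -> Optional[int]:
    """Cost of feeding `output` into `needed`; None if they cannot be connected."""
    # Syntactic compatibility: same primitive datatype and same format.
    if output.datatype != needed.datatype or output.data_format != needed.data_format:
        return None
    # Semantic compatibility: same concept, subconcept, or part-of concept.
    if output.concept == needed.concept:
        return 0
    if SUBCONCEPT_OF.get(output.concept) == needed.concept:
        return 1
    if PART_OF.get(needed.concept) == output.concept:
        return 2
    return None


labeled = Port("LabeledDataset", "string", "csv")
print(match_cost(labeled, Port("LabeledDataset", "string", "csv")))
print(match_cost(labeled, Port("Dataset", "string", "csv")))
print(match_cost(labeled, Port("ClassLabels", "string", "csv")))
print(match_cost(labeled, Port("LabeledDataset", "string", "arff")))
```

Returning a graded cost instead of a yes/no answer lets later steps prefer exact matches while still allowing looser ones.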
Semi-automatic composition
KDDComposer: advanced service for composition
Input
● user dataset
● a set of requirements (max num algorithms, computational complexity, max cost of match)
● user goal (classification, regression, ...)
Output
A ranked list of abstract processes (suggestions about processes useful to solve the user problem)
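A rough sketch of this kind of composition, not KDDComposer's actual procedure: a depth-limited search chains algorithms from the dataset concept to the goal concept, prunes by maximum number of algorithms and maximum cost of match, and ranks the results. The algorithm signatures, the `LINK_COST` table, and the ranking key are assumptions; the computational-complexity requirement is omitted.

```python
from typing import Dict, List, NamedTuple, Optional, Tuple


class Algorithm(NamedTuple):
    name: str
    needs: str      # concept required as input
    produces: str   # concept produced as output

# Toy algorithm descriptions (illustrative assumptions).
ALGORITHMS = [
    Algorithm("RemoveMissingValues", "LabeledDataset", "CleanLabeledDataset"),
    Algorithm("C4.5", "LabeledDataset", "ClassificationModel"),
    Algorithm("ID3", "CleanLabeledDataset", "ClassificationModel"),
]

# Cost of non-exact matches between an available and a needed concept.
LINK_COST: Dict[Tuple[str, str], int] = {("CleanLabeledDataset", "LabeledDataset"): 1}


def link_cost(available: str, needed: str) -> Optional[int]:
    if available == needed:
        return 0
    return LINK_COST.get((available, needed))  # None means incompatible


def compose(dataset: str, goal: str, max_algorithms: int, max_cost: int):
    """Return (cost, steps) pairs, cheapest and shortest first."""
    results: List[Tuple[int, List[str]]] = []

    def extend(available: str, steps: List[str], cost: int) -> None:
        if available == goal and steps:
            results.append((cost, steps))
            return
        if len(steps) == max_algorithms:
            return
        for algorithm in ALGORITHMS:
            if algorithm.name in steps:
                continue
            step_cost = link_cost(available, algorithm.needs)
            if step_cost is None or cost + step_cost > max_cost:
                continue
            extend(algorithm.produces, steps + [algorithm.name], cost + step_cost)

    extend(dataset, [], 0)
    return sorted(results, key=lambda item: (item[0], len(item[1])))


ranked = compose("LabeledDataset", "ClassificationModel", max_algorithms=2, max_cost=1)
for cost, steps in ranked:
    print(cost, " -> ".join(steps))
```

The results are abstract processes in the sense that they name algorithms only; binding each step to a concrete service is a separate step.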
Collaboration
● collaborative process edit/annotation (wiki-style)
● versioning system
● team management and addition of new users
● manual parameter setting
Open environment and heterogeneous tools
● different interfaces: need for a common representation (service)
● abstraction for a high-level description of tools (algorithm)
● semantics for interoperability and high-level functionalities
Conclusion
Future work
● extension with new support services
● process export in more workflow languages
● more collaborative features (real-time editor)
SOA for KDD
● Basic Services and Support Services
● KDD Designer: a semantic-aided designer for KDD