Post on 13-Jan-2016
An Improved Discovery Engine for Efficient and Intelligent Discovery of Web Service with Publication Facility
Vandan Tewari (1), N. Dagdee (2), Inderjeet Singh (1), Nipur Garg (1), Preeti Soni (1)
(1) Shri G.S. Institute of Technology & Science, Indore
(2) Shri S.D. Bansal Institute of Tech. & Science, Indore
Contents
• Background
• Related work on Web Service Discovery
• Addressed Issues
• Our proposal
• Proposed architecture
• Proposed Algorithm
• Modules implemented
• Test Case and Results
• Conclusion & Future Enhancements
Background
SOA (Service Oriented Architecture)
Service Oriented Architecture is the latest evolution of distributed computing, which enables software components to be exposed as services.
Web Service
A web service is a stand-alone software component designed to support interoperable machine-to-machine interaction over a network.
Web Service & SOA
An example scenario of a web service
[Figure: example web service scenario showing the Publish, Find, and Bind interactions.]
Related work on Web Service Discovery
Available sources for service discovery and their respective drawbacks:
A. Centralized Service Broker (UBRs) *
• Single point of failure.
• Performance bottlenecks.
B. Federated Registries *
• Inconsistent policies are employed, so real-time search is inefficient.
• No advanced search facility is available.
* Ref. [E. Al-Masri, WWW 2008, pp. 795-804]
Continued…
C. Search Engine **
• Inability to distinguish between a web page and a web service document (WSDL) leads to irrelevant data.
D. Web Crawler Engine ***
• The problem of service overload still exists.
** Ref. [K. Sivashanmugam et al., ISWC, pp. 270-278, 2004]
*** Ref. [E. Al-Masri, Q. H. Mahmoud, IEEE ICWS 2007, pp. 1104-1111]
Technical limitations of UDDI
• Passivity of UDDI: since service revocation is voluntary, passive (stale) service data accumulates in UDDI.
• Absence of QoS parameters for web services.
• Absence of web service life cycle management.
Ref. [K. Sivashanmugam et al., IEEE, ISWC, 2004, pp. 270-278]
Addressed Issues
• How to deal with the passivity of UBRs in order to increase service availability.
• How to discover appropriate services when UBRs are overflowing with services (service overload).
• How to suggest appropriate services to the service requester based on service feedback and frequency of usage.
Our Proposal
A “Discovery cum Publishing Engine” has been designed which:
• increases service availability by removing passive web services from the UBR,
• improves service search time by applying data mining techniques to the contents of the UBR, and
• uses past user feedback and usage frequency to suggest appropriate services to the service consumer.
Assumptions
• A domain of trust among the UBRs is already established.
• The test case is developed on a small set of experimental data.
• A predefined classification scheme is used, based on the “Location” parameter of a Travel service.
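Proposed Architecture of Our System
[Figure: the Discovery cum Publishing Engine, comprising the Validation Module, Publish Manager, Search Manager, and Add Review module, connects the Service Consumer (Discover) and the Service Provider (Publish), and crawls the registries UBR1, UBR2, UBR3, ..., UBRn; the consumer then binds to the provider (Bind).]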
Modules Implemented
A. Publish Manager
B. Search Manager
• UBRs Crawl Module
• Search Module
• Dynamic IP Module
• Cluster Module
C. Validate Module
• WSDL Parser Module
• Delete Module
D. Add Review Module
Working of Proposed System
Mechanism of Dynamic IP Module
The module updates the IP Table of the engine dynamically:
Step 1: Start crawling on the initial seeds.
Step 2: From each initial seed, find the IP addresses of the service providers.
Step 3: From each provider, fetch the IP addresses of the UBRs in which it has published its other web services.
Step 4: Compare the fetched IP addresses with the initial seeds. Any new IP address is stored in the local IP Table; the rest are ignored to avoid redundancy.
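The four steps above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `update_ip_table`, the two lookup callables, and all IP addresses are hypothetical stand-ins, since the slides do not show how the engine actually queries a UBR or a provider.

```python
def update_ip_table(seed_ips, providers_at, ubrs_of):
    """Return the UBR IPs discovered beyond the initial seeds, in discovery order.

    providers_at(seed) -> provider IPs found at a seed UBR (assumed lookup).
    ubrs_of(provider)  -> IPs of UBRs where that provider has published (assumed lookup).
    """
    known = set(seed_ips)
    ip_table = []
    for seed in seed_ips:                     # Step 1: crawl each initial seed
        for provider in providers_at(seed):   # Step 2: providers listed at the seed
            for ubr_ip in ubrs_of(provider):  # Step 3: other UBRs used by the provider
                if ubr_ip not in known:       # Step 4: keep only new IPs
                    known.add(ubr_ip)
                    ip_table.append(ubr_ip)
    return ip_table


# Hypothetical in-memory data standing in for real registry lookups.
PROVIDERS_AT = {
    "10.0.0.1": ["10.0.1.5", "10.0.1.6"],
    "10.0.0.2": ["10.0.1.6"],
}
UBRS_OF = {
    "10.0.1.5": ["10.0.0.1", "10.0.0.3"],
    "10.0.1.6": ["10.0.0.2", "10.0.0.4", "10.0.0.3"],
}

new_ips = update_ip_table(
    ["10.0.0.1", "10.0.0.2"],
    lambda seed: PROVIDERS_AT.get(seed, []),
    lambda provider: UBRS_OF.get(provider, []),
)
print(new_ips)
```

Keeping the lookups as injected callables lets the same loop run against test data or a real registry client.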
Proposed Algorithm for Publishing
Publish Manager
Step 1: Start
Step 2: Select the IP address of the UBR where publishing is required.
Step 3: Accept the details of the web service from the service provider, along with its location, which acts as a predefined class.
Step 4: Classify the web service based on its location, which acts as the class.
Step 5: Store the web service details in the selected UBR under the class to which the service belongs.
Step 6: Stop
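A minimal Python sketch of the publishing steps is shown here. The `UBR` class, the `publish` function, and the service records are hypothetical; a real UBR would be reached through its registry API rather than held in memory.

```python
from collections import defaultdict


class UBR:
    """Hypothetical in-memory stand-in for one registry."""

    def __init__(self, ip):
        self.ip = ip
        self.classes = defaultdict(list)  # location class -> published services


def publish(ubrs, ubr_ip, service):
    """Publish a service into the selected UBR under its location class."""
    ubr = ubrs[ubr_ip]                             # Step 2: selected UBR
    location_class = service["location"].strip()   # Step 4: location acts as the class
    ubr.classes[location_class].append(service)    # Step 5: store under that class
    return location_class


ubrs = {"10.0.0.1": UBR("10.0.0.1"), "10.0.0.2": UBR("10.0.0.2")}

# Step 3: details accepted from the provider (illustrative values only).
publish(ubrs, "10.0.0.1", {"name": "CityCab", "location": "Khajrana"})
publish(ubrs, "10.0.0.1", {"name": "MetroTaxi", "location": "Khajrana"})
publish(ubrs, "10.0.0.1", {"name": "RoyalTravels", "location": "Rajwada"})

counts = {cls: len(svcs) for cls, svcs in ubrs["10.0.0.1"].classes.items()}
print(counts)
```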
Classification Scheme followed
Location / Class    No. of Published WS    Location / Class    No. of Published WS
Annapurna Road      2                      Pardesi Pura        2
Hukumchand Marg     2                      Rajendra Nagar      2
Indore Ho           2                      Rajwada             2
Khajrana            3                      Shivji Nagar        1
Khatipura           2                      Southtokoganj       3
Laxmibai Nagar      1                      Vallabh Nagar       3
Malwa Mill          1                      Vijay Nagar         3
M G Road            4                      Y N Road            2
Navlakha            2                      Yashwant N Road     2
Old Palasia         3                      Total               42
Table 1.1: List of classes along with the number of published web services (just an example scenario).
Proposed Algorithm for Searching
Step 1: Start
Step 2: Enter the keyword for which services are to be searched (e.g. Travel, i.e. choosing the super class).
Step 3: Select the location of the service (it denotes a class and serves as the centroid for cluster selection).
Step 4: Initialize the IP Table with the initial seeds of UBRs.
Do
  Step 4a: Call the Dynamic IP Module.
  Step 4b: Call the Cluster Module to create clusters based on the location attribute.
  Step 4c: If (location is not chosen)
             Treat all classes as a single cluster.
             Go to Step 4d.
           Else
             If (maximal distance <= min threshold)
               Put the location classes in the same cluster.
             Select the cluster to which the centroid belongs.
  Step 4d: For each location class in the selected cluster, fetch all services belonging to that class along with their frequency-of-usage data.
  Step 4e: Call the Cluster Module to create clusters based on service usage frequency.
  Step 4f: Parse the WSDL document at the access point URL of each discovered web service, i.e. validate the web service.
  Step 4g: If (web service is active)
             Store it locally.
           Else
             Fetch the service key for that access point URL from the UBRs and pass it to the Delete Module, which stores it locally for future use and deletes the web service from the respective UBR.
Until all IP seeds in the UBR crawl queue have been visited.
Step 5: Add service reviews to each service in the active service list, which has been stored locally from the virtual UBR on which the engine resides.
Step 6: Display the list of web services to the end user.
Step 7: If the user binds a service
          Ask the user to write feedback on the service used. Accept the user's details along with a comment and a rating for the service, and store them in the extended service registry structure.
Step 8: End
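Steps 4f and 4g (validation) could look roughly like the sketch below. It assumes that a service counts as active when its access point returns a parseable WSDL 1.1 document, which is one plausible reading of the slides rather than the authors' stated test. The names `is_active` and `validate_services`, the injected `fetch_wsdl` callable, and the sample URLs and keys are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Root element of a WSDL 1.1 document, in ElementTree's {namespace}tag form.
WSDL_ROOT = "{http://schemas.xmlsoap.org/wsdl/}definitions"


def is_active(wsdl_text):
    """Assumed activity test: the response parses as XML with a WSDL 1.1 root."""
    if wsdl_text is None:
        return False
    try:
        root = ET.fromstring(wsdl_text)
    except ET.ParseError:
        return False
    return root.tag == WSDL_ROOT


def validate_services(services, fetch_wsdl):
    """Split services into active ones and service keys to hand to the Delete Module."""
    active, passive_keys = [], []
    for svc in services:
        if is_active(fetch_wsdl(svc["access_point"])):  # Step 4f
            active.append(svc)                           # Step 4g: store locally
        else:
            passive_keys.append(svc["service_key"])      # Step 4g: mark for deletion
    return active, passive_keys


# Canned responses standing in for HTTP fetches (illustrative only).
RESPONSES = {
    "http://example.org/cab?wsdl":
        '<definitions xmlns="http://schemas.xmlsoap.org/wsdl/" name="CabBooking"/>',
    "http://example.org/bus?wsdl": "<html><body>Not Found</body></html>",
}
SERVICES = [
    {"service_key": "key-cab", "access_point": "http://example.org/cab?wsdl"},
    {"service_key": "key-bus", "access_point": "http://example.org/bus?wsdl"},
    {"service_key": "key-taxi", "access_point": "http://example.org/taxi?wsdl"},
]

active, passive_keys = validate_services(SERVICES, RESPONSES.get)
print([s["service_key"] for s in active], passive_keys)
```

Here an HTML error page and an unreachable access point are both treated as passive services.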
Agglomerative Algorithm for Complete-Link Clustering
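Complete-link clustering looks for cliques.
The distance between two clusters is taken as the maximal distance between any pair of their members; two clusters are merged if this maximum distance is less than or equal to the distance threshold.
The Euclidean distance between points p and q can be calculated as
$$d(p, q) = \sqrt{\sum_{i=1}^{n} (p_i - q_i)^2}$$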
Adjacency Matrix for Maximal distance (Based on the location attribute)
AR HM IH KH KP LN MM MG NK OP PP RN RW SN ST VN VJ YN YW
AR 0 2.5 4 10 9 7 6.5 6.8 4.5 7.2 7.8 0.5 6.7 7.2 5.2 6 11 7.5 7
HM 2.5 0 3.2 9.2 6.5 2 5 3.5 4.8 6.2 5.3 3 3 4 6 4.5 9 6.8 7.2
IH 4 3.2 0 6.2 5.8 2.3 3 1.8 2.5 2.8 4 5.5 2.9 2 1.5 1.8 6.5 3.2 2.9
KH 10 9.2 6.2 0 7 4 2.9 3.5 8.8 4.5 2 9.5 3.9 3.3 5.5 4.2 2 4.5 6.5
KP 9 6.5 5.8 7 0 4.5 3 3.2 7.5 5.1 2.1 11 3.5 3.8 5.5 4.9 2.8 3.1 6.9
LN 7 2 2.3 4 4.5 0 3.9 2.8 4.8 4.5 2.5 7.8 1.5 3.2 5 3.8 5.5 4 5.8
MM 6.5 5 3 2.9 3 3.9 0 1 5.2 2 0.8 7.8 2 0.8 3.2 1.2 3 0.5 4.5
MG 6.8 3.5 1.8 3.5 3.2 2.8 1 0 4 2 2.3 7.5 1.5 0.8 2.9 1 5.5 1.8 4.5
NK 4.5 4.8 2.5 8.8 7.5 4.8 5.2 4 0 4.5 6 6 5 4.5 2 3 7 5.5 1.5
OP 7.2 6.2 2.8 4.5 5.1 4.5 2 2 4.5 0 3.8 8.5 3.5 2.2 2 1 4.8 2.5 3.2
PP 7.8 5.3 4 2 2.1 2.5 0.8 2.3 6 3.8 0 9 2.5 1.5 4.8 2.8 3 2.5 6
RN 0.5 3 5.5 9.5 11 7.8 7.8 7.5 6 8.5 9 0 7.2 8.3 6.2 7 12 8.5 8
RW 6.7 3 2.9 3.9 3.5 1.5 2 1.5 5 3.5 2.5 7.2 0 2.1 3.8 3 4 2.5 5.2
SN 7.2 4 2 3.3 3.8 3.2 0.8 0.8 4.5 2.2 1.5 8.3 2.1 0 4 1 4.2 1.5 5
ST 5.2 6 1.5 5.5 5.5 5 3.2 2.9 2 2 4.8 6.2 3.8 4 0 1.8 6 3.5 1.2
VN 6 4.5 1.8 4.2 4.9 3.8 1.2 1 3 1 2.8 7 3 1 1.8 0 4.7 0.8 2.5
VJ 11 9 6.5 2 2.8 5.5 3 5.5 7 4.8 3 12 4 4.2 6 4.7 0 5 5.8
YN 7.5 6.8 3.2 4.5 3.1 4 0.5 1.8 5.5 2.5 2.5 8.5 2.5 1.5 3.5 0.8 5 0 3
YW 7 7.2 2.9 6.5 6.9 5.8 4.5 4.5 1.5 3.2 6 8 5.2 5 1.2 2.5 5.8 3 0
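To illustrate how the matrix feeds the complete-link step, the sketch below clusters four of the location classes (AR, HM, IH, RN) using the distances from the matrix above. The threshold of 3.0 and the function name `complete_link_clusters` are assumptions made for this example; the slides do not state the threshold used for locations.

```python
def complete_link_clusters(labels, dist, threshold):
    """Agglomerative complete-link clustering with a distance threshold."""
    clusters = [[label] for label in labels]

    def link(a, b):
        # Complete link: the largest pairwise distance between two clusters.
        return max(dist[frozenset((x, y))] for x in a for y in b)

    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = link(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d > threshold:      # closest pair is already too far apart: stop
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Distances copied from the matrix above for four location classes.
DIST = {
    frozenset(("AR", "HM")): 2.5,
    frozenset(("AR", "IH")): 4.0,
    frozenset(("AR", "RN")): 0.5,
    frozenset(("HM", "IH")): 3.2,
    frozenset(("HM", "RN")): 3.0,
    frozenset(("IH", "RN")): 5.5,
}

clusters = complete_link_clusters(["AR", "HM", "IH", "RN"], DIST, threshold=3.0)
# Select the cluster containing the chosen location (the centroid), here AR.
selected = next(c for c in clusters if "AR" in c)
print(clusters, selected)
```

With these values, AR and RN merge first (0.5), HM then joins them because its largest distance to that cluster is 3.0, and IH stays separate.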
Mechanism for service rating
An extended service registry design is proposed. The schema of its template table has the following columns:
Sname | Person Name | e-id | Review | Rating
Data on the frequency of service invocation is also kept in the virtual root registry of the proposed architecture and is to be published periodically by the service provider.
The average rating of a service is calculated by giving equal weight to user reviews and to the frequency of usage of the service.
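One possible reading of the equal-share rule is sketched below. The slides do not say how usage frequency is brought onto the rating scale, so scaling it by the highest frequency among the candidate services and the 5-point rating scale are both assumptions, as is the function name `combined_score`.

```python
def combined_score(ratings, frequency, max_frequency, max_rating=5.0):
    """Equal-weight blend of mean user rating and scaled usage frequency.

    Assumption: frequency is scaled to the rating range by dividing by the
    highest frequency among the candidate services.
    """
    review_part = sum(ratings) / len(ratings) if ratings else 0.0
    usage_part = (frequency / max_frequency) * max_rating if max_frequency else 0.0
    return 0.5 * review_part + 0.5 * usage_part


# Illustrative values: ratings are invented, frequencies are hypothetical.
score_a = combined_score([4, 5, 3], frequency=25, max_frequency=25)
score_b = combined_score([5, 5], frequency=11, max_frequency=25)
print(round(score_a, 2), round(score_b, 2))
```

Under this reading, a heavily used service with average reviews can outrank a rarely used service with perfect reviews.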
How service usage frequency is used for clustering the services: An Example
Usage frequency of each service per user:
          Ts1    Ts2    Ts3    Ts4    Ts5
User A    5      2      3      0      1
User B    1      3      2      0      2
User C    6      1      5      1      1
User D    8      2      4      0      2
User E    5      3      2      1      0
Total     25     11     16     2      6

Euclidean distance between services:
          Ts1    Ts2    Ts3    Ts4    Ts5
Ts1       0      8.83   5.56   11.4   10.1
Ts2       8.83   0      4.79   4.58   3.3
Ts3       5.56   4.79   0      6.78   5.29
Ts4       11.4   4.58   6.78   0      3.16
Ts5       10.1   3.3    5.29   3.16   0
Here Ts1, Ts2, Ts3, Ts4 and Ts5 represent the travel services used by the various users.
Choosing threshold t = 6, the clusters formed are (Ts1, Ts2, Ts3), Ts4, Ts5.
The user is presented with the first cluster, since the average invocation frequency is highest for this cluster.
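The distance values in this example can be checked with a short Python sketch that treats each service as a vector of per-user invocation counts and applies the Euclidean distance. The variable and function names are illustrative; the numbers are the ones from the usage table.

```python
import math

# Per-user invocation counts (Users A to E) for each travel service.
USAGE = {
    "Ts1": [5, 1, 6, 8, 5],
    "Ts2": [2, 3, 1, 2, 3],
    "Ts3": [3, 2, 5, 4, 2],
    "Ts4": [0, 0, 1, 0, 1],
    "Ts5": [1, 2, 1, 2, 0],
}


def euclidean(p, q):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


totals = {name: sum(counts) for name, counts in USAGE.items()}
distances = {
    (a, b): euclidean(USAGE[a], USAGE[b]) for a in USAGE for b in USAGE
}

print(totals)
for a in USAGE:
    print(a, [round(distances[(a, b)], 2) for b in USAGE])
```

The printed matrix agrees with the table to within rounding.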
Test case and Results
The user searches for Travel web services at a location such as Annapurna Road:
Continued…
List of search results for Annapurna Road
If the user does not choose any location, the list of search results is:
To execute a service, the user clicks on the service name.
Continued…
Here the user can rate and review the service they used.
Continued…
The proposed Engine can…
• Reduce the population of passive web services in the UBR using a validation mechanism.
• Crawl over an IP list which can grow dynamically.
• Narrow down the search space of the UBR.
• Suggest services to the user based on user feedback and service usage frequency.
• Provide a web service publication facility to the user.
Conclusion
A Discovery cum Publishing Engine for searching web services has been proposed, which uses service ranking techniques for efficient and effective web service discovery. We have used data mining techniques to narrow down the search space in UBRs. In addition, an extended service registry design has been proposed which stores service feedback and service usage frequency along with the service information; these are used to rank services within a selected cluster.
This work may be generalized further if, instead of a single service attribute, non-functional parameters or the semantics of services are considered when applying the data mining techniques.
References
• E. Al-Masri and Q. H. Mahmoud, “WSCE: A crawler engine for large-scale discovery of web services”, in Proceedings of IEEE ICWS, pp. 1104-1111, 2007.
• K. Sivashanmugam, K. Verma and A. Seth, “Discovery of web services in a federated environment”, in Proceedings of ISWC, pp. 270-278, 2004.
• Yan Li, Yao Liu, Liangjie Zhang, Ge Li, Bing Xie and Jiasu Sun, “An Exploratory study of Web Services on the internet”, in IEEE ICWS, 2007.
• E. Al-Masri and Q. H. Mahmoud, “Crawling Multiple UDDI Business Registries”, Proc. 16th Int’l World Wide Web Conf., ACM.
• E. Al-Masri and Q. H. Mahmoud, “Discovering Web Services in Search Engine”, WWW 2007, May 8-12, 2007, Banff, Alberta, Canada.
• Margaret H. Dunham and S. Sridhar, “Data Mining Introductory and Advanced Topics”.