Project - UG - BTech IT - Cluster based Approach for Service Discovery using Pattern Recognition


CHAPTER 1

INTRODUCTION

1.1 ABOUT THE PROJECT

The need for B2B Integration and the limitations of conventional middleware in B2B

Integration raised the need for a novel technology – a need addressed by web services. Web

services are components with specific functionalities that can be integrated into more complex

real-time distributed applications. Service descriptions are stored in a registry, which allows designers to register new services and allows service users to search for and locate services. The service registry contains web services that include predefined types, which are

generally specified by service providers. The UDDI registry specifications have two main

goals with respect to service discovery: first, to support developers in finding information

about services, so that they know how to write clients that interact with those services; second, to enable dynamic binding by allowing clients to query the registry and obtain references to

services of interest.

Service discovery can be performed at design time, by browsing the directory and identifying the most relevant services, or at run time, using dynamic binding techniques. The service discovery process enables the proper discovery and configuration of devices and services and the communication between them. The services that are discovered do not always fulfill user needs. The number of web services continues to grow, while the business environment keeps demanding new applications that must be rolled out on very tight schedules. Because the number of web services is relatively large and similar services may be listed under different categories in the registry infrastructure, it is difficult to find appropriate services.

Therefore, rather than being classified by provider, services must be categorized by their functional semantics. Categorizing services in this way makes it easier to organize similar services together. The majority of the service descriptions that exist so far are syntactic in nature, and the same syntax might be used in different places for different purposes. When a service is requested, only the small number of services that are an exact syntactic match for the request are selected. The selected services may perform functions other than the requested ones, and the discovery process is limited by its dependence on human involvement to select the right service based on its functionality.


Pattern recognition algorithms generally aim to provide a reasonable answer for all

possible inputs and to perform "most likely" matching of the inputs, taking into account their

statistical variation. This is opposed to pattern matching algorithms, which look for exact

matches in the input with pre-existing patterns. A common example of a pattern-matching

algorithm is regular expression matching, which looks for patterns of a given sort in textual

data and is included in the search capabilities of many text editors and word processors. In

contrast to pattern recognition, pattern matching is generally not considered a type of

machine learning, although pattern-matching algorithms can sometimes succeed in providing

similar-quality output to the sort provided by pattern-recognition algorithms.

Practitioners of the Semantic Web have since realized its predictions in a variety of ways. For instance, Semantic Web technologies make it possible to automate operations ranging from arranging everything needed for a trip to updating personal records. The Semantic Web can thus be defined as a web of information on the Internet and on intranets that carries annotations, which enable access to precisely the information that is needed.

Clustering has been the driving force behind many of the world's most powerful

scientific supercomputers for many years and is now being used increasingly as a cost-

effective way to provide high-performance, high availability computing for a wide variety of

commercial workloads such as business intelligence, engineering design, financial analysis,

digital media and petroleum exploration. Web services should be described using WSDL so that the approach remains generic. The project presents a cluster-based approach to service discovery that groups services by functionality and provides similar services using pattern matching, thus satisfying user needs.

The proposed approach deals with the issue of service discovery given non-explicit service description semantics that match a particular service request. The proposed system involves semantic-based service categorization, performed at the UDDI, with a key for achieving service categorization at the functional level based on an ontology skeleton. Clustering is also used to organize the web services by functionality, which is achieved using an analytic algorithm. Efficient matching of relevant services is achieved by enhancing the service request semantically, which involves expanding it with additional functionality (obtained from the ontology) related to the requested service. The pattern recognition algorithm is then used to select the appropriate service from the cluster of related (grouped) web services.
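The flow described above (group services by functionality, expand the request using an ontology, then select a service from the best-matching cluster) can be sketched roughly as follows. This is a minimal illustration only: the service names, keyword sets, ontology entries, the Jaccard similarity measure and the greedy grouping with a 0.4 threshold are all assumptions made for the example, and they stand in for the analytic and pattern recognition algorithms, which are not specified here.

```python
# Illustrative sketch only: service names, keyword sets, ontology entries and
# the threshold are hypothetical and are not taken from the project.

def jaccard(a, b):
    """Overlap between two keyword sets, from 0.0 (disjoint) to 1.0 (equal)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Each service is reduced to a set of functional keywords (assumed data).
SERVICES = {
    "BookFlight": {"flight", "booking", "travel"},
    "ReserveAirTicket": {"flight", "ticket", "travel"},
    "CurrencyConverter": {"currency", "exchange", "rate"},
    "ExchangeRateLookup": {"exchange", "rate", "lookup"},
    "WeatherForecast": {"weather", "forecast", "temperature"},
}

# A toy ontology mapping a concept to related functional terms (assumed data).
ONTOLOGY = {
    "airline": {"flight", "ticket"},
    "forex": {"currency", "exchange"},
}

def cluster_services(services, threshold=0.4):
    """Greedy grouping: a service joins the first cluster whose seed is similar enough."""
    clusters = []
    for name, terms in services.items():
        for cluster in clusters:
            if jaccard(terms, services[cluster[0]]) >= threshold:
                cluster.append(name)
                break
        else:
            clusters.append([name])
    return clusters

def expand_request(terms, ontology):
    """Add the related terms that the ontology lists for each request term."""
    expanded = set(terms)
    for term in terms:
        expanded |= ontology.get(term, set())
    return expanded

def discover(request_terms, services=SERVICES, ontology=ONTOLOGY):
    """Pick the closest cluster for the expanded request, then the closest service in it."""
    query = expand_request(request_terms, ontology)
    clusters = cluster_services(services)

    def cluster_terms(cluster):
        return set().union(*(services[name] for name in cluster))

    best_cluster = max(clusters, key=lambda c: jaccard(query, cluster_terms(c)))
    return max(best_cluster, key=lambda name: jaccard(query, services[name]))

print(cluster_services(SERVICES))
print(discover({"airline"}))
```

In this toy data the request term "airline" appears in no service description, so a purely syntactic lookup would find nothing; the match is found only after the request is expanded with the ontology terms.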


1.2 DESCRIPTION OF WEB SERVICE

A Web service is a method of communication between two electronic devices over the World Wide Web. The W3C defines a "Web service" as "a software system designed to support interoperable machine-to-machine interaction over a network. It has an interface described in a machine-processable format (specifically Web Services Description Language, known by the acronym WSDL). Other systems interact with the Web service in a manner prescribed by its description using SOAP messages, typically conveyed using HTTP with an XML serialization in conjunction with other Web-related standards." The W3C also states, "We can identify two major classes of Web services, REST-compliant Web

services, in which the primary purpose of the service is to manipulate XML representations

of Web resources using a uniform set of "stateless" operations; and arbitrary Web services, in

which the service may expose an arbitrary set of operations."

Web services describe a standardized way of integrating Web-based applications using the XML, SOAP, WSDL and UDDI open standards over an Internet protocol backbone. Web applications are built around Web browser standards and can be used by any browser on any platform. Web services use XML to encode and decode data, and SOAP to transport it. By using web services, an application can publish its functions or messages to the rest of the world, and data can be exchanged between different applications and different platforms.

Characteristics of Web Services:

XML-based

Web services use XML at the data representation and data transportation layers. Using XML eliminates any networking, operating system, or platform binding, so applications based on web services are highly interoperable at their core.

Loosely coupled

A consumer of a web service is not tied to that web service directly. The web service

interface can change over time without compromising the client's ability to interact with the

service. A tightly coupled system implies that the client and server logic are closely tied to

one another, implying that if one interface changes, the other must also be updated.


Coarse-grained

Object-oriented technologies such as Java expose their services through individual

methods. An individual method is too fine an operation to provide any useful capability at a

corporate level. Building a Java program from scratch requires the creation of several fine-

grained methods that are then composed into a coarse-grained service that is consumed by

either a client or another service. Businesses and the interfaces that they expose should be

coarse-grained. Web services technology provides a natural way of defining coarse-grained

services that access the right amount of business logic.

Ability to be synchronous or asynchronous

Synchronicity refers to the binding of the client to the execution of the service. In

synchronous invocations, the client blocks and waits for the service to complete its operation

before continuing. Asynchronous operations allow a client to invoke a service and then

execute other functions. Asynchronous clients retrieve their result at a later point in time,

while synchronous clients receive their result when the service has completed. Asynchronous

capability is a key factor in enabling loosely coupled systems.
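The difference between the two invocation styles can be illustrated with a small local sketch. The function `slow_service` below is a hypothetical stand-in for a remote web service, and a thread pool stands in for whatever asynchronous transport a real client would use.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def slow_service(x):
    """Hypothetical stand-in for a remote service that takes a while to answer."""
    time.sleep(0.1)
    return x * 2

def call_synchronously(x):
    # The caller blocks here until the service has completed its operation.
    return slow_service(x)

def call_asynchronously(executor, x):
    # The caller gets a Future back immediately and can do other work.
    return executor.submit(slow_service, x)

with ThreadPoolExecutor(max_workers=1) as pool:
    future = call_asynchronously(pool, 21)
    other_work = sum(range(10))      # executed while the service is still running
    async_result = future.result()   # the result is retrieved at a later point

sync_result = call_synchronously(21)
print(sync_result, async_result, other_work)
```

Both calls return the same value; the difference is only in when the client is free to continue.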

Supports Remote Procedure Calls (RPCs)

Web services allow clients to invoke procedures, functions, and methods on remote

objects using an XML-based protocol. Remote procedures expose input and output

parameters that a web service must support. Component development through Enterprise

JavaBeans (EJBs) and .NET Components has increasingly become a part of architectures and

enterprise deployments over the past couple of years. Both technologies are distributed and

accessible through a variety of RPC mechanisms. A web service supports

Supports document exchange

One of the key advantages of XML is its generic way of representing not only data,

but also complex documents. These documents can be simple, such as when representing a

current address, or they can be complex, representing an entire book or RFQ. Web services

support the transparent exchange of documents to facilitate business integration.

The World Wide Web is increasingly being used for communication between

applications. The programmatic interfaces made available over the web for application-to-

application communication are often referred to as web services. There are many types of

applications that can be considered web services but interoperability between applications is

enhanced most by the use of familiar technologies such as XML and HTTP. These


technologies allow applications using differing languages and platforms to interface in a

familiar way. Web services are distributed application components that are externally

available. You can use them to integrate computer applications that are written in different

languages and run on different platforms. Web services are language and platform

independent because vendors have agreed on common web service standards.

Web services are client and server applications that communicate over the World

Wide Web’s (WWW) Hypertext Transfer Protocol (HTTP). As described by the World

Wide Web Consortium (W3C), web services provide a standard means of interoperating

between software applications running on a variety of platforms and frameworks. Web services are thought of as a means to provide easily accessible services over a network.

They should be simply usable regardless of the underlying network structure or

configuration, operating system, communication mechanism or implementing language.

Web services are loosely coupled, reusable software components that semantically

encapsulate discrete functionality and are distributed and programmatically accessible over

standard Internet protocols. A web service is any piece of software that makes itself available

over the internet and uses a standardized XML messaging system. XML is used to encode all

communications to a web service. For example, a client invokes a web service by sending an

XML message, then waits for a corresponding XML response. Because all communication is

in XML, web services are not tied to any one operating system or programming language--

Java can talk with Perl; Windows applications can talk with Unix applications.
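The request and response exchange described above can be sketched without any network by encoding and decoding the messages locally. The example uses XML-RPC from the Python standard library purely as a convenient XML messaging format; it is an assumption made for illustration, not the protocol used in this project, and the `add` operation is hypothetical.

```python
import xmlrpc.client

# Client side: encode a call to a hypothetical "add" operation as an XML message.
request_xml = xmlrpc.client.dumps((2, 3), methodname="add")

# Server side: decode the XML message and dispatch to the named operation.
params, method = xmlrpc.client.loads(request_xml)
operations = {"add": lambda a, b: a + b}
result = operations[method](*params)

# Server side: encode the result as the corresponding XML response.
response_xml = xmlrpc.client.dumps((result,), methodresponse=True)

# Client side: decode the XML response to obtain the returned value.
(value,), _ = xmlrpc.client.loads(response_xml)
print(request_xml)
print(value)
```

Because both messages are plain XML text, the client and the server could equally be written in different languages and run on different operating systems.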

Web Services are self-contained, modular, distributed, dynamic applications that can

be described, published, located, or invoked over the network to create products, processes,

and supply chains. These applications can be local, distributed, or Web-based. Web services

are built on top of open standards such as TCP/IP, HTTP, Java, HTML, and XML. Web

services are XML-based information exchange systems that use the Internet for direct

application-to-application interaction. These systems can include programs, objects,

messages, or documents.

A web service is a collection of open protocols and standards used for exchanging data

between applications or systems. Software applications written in various programming

languages and running on various platforms can use web services to exchange data over

computer networks like the Internet in a manner similar to inter-process communication on a

single computer. This interoperability (e.g., between Java and Python, or Windows and Linux

applications) is due to the use of open standards.


Figure 1.1 Web services architecture

The web services architecture permits the development of web services that

encapsulate all levels of business functionality. In other words, a web service can be very

simple, such as one that returns the current temperature, or it can be a complex application.

The architecture also allows multiple web services to be combined to create new

functionality.

The standards on which web service development is based are evolving technologies.

The primary players are SOAP (Simple Object Access Protocol), WSDL (Web Services

Description Language), UDDI (Universal Description, Discovery and Integration), and XML

(Extensible Markup Language).

1.3 WEB SERVICE STANDARDS



Simple Object Access Protocol (SOAP)

SOAP, originally defined as Simple Object Access Protocol, is a protocol

specification for exchanging structured information in the implementation of Web

Services in computer networks. It relies on Extensible Markup Language (XML) for its

message format, and usually relies on other Application Layer protocols, most

notably Hypertext Transfer Protocol (HTTP) and Simple Mail Transfer Protocol (SMTP), for

message negotiation and transmission.
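As a rough illustration of the XML message format, the sketch below builds and reads a SOAP 1.1 envelope with the Python standard library. The envelope namespace is the standard SOAP 1.1 one; the application namespace, the `GetTemperature` operation and the `City` element are hypothetical names chosen for the example.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
APP_NS = "http://example.com/weather"                  # hypothetical application namespace
NS = {"soap": SOAP_NS, "app": APP_NS}

def build_request(city):
    """Wrap a hypothetical GetTemperature request in a SOAP envelope."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    operation = ET.SubElement(body, f"{{{APP_NS}}}GetTemperature")
    ET.SubElement(operation, f"{{{APP_NS}}}City").text = city
    return ET.tostring(envelope, encoding="unicode")

def read_city(xml_text):
    """Extract the City parameter from the body of the envelope."""
    root = ET.fromstring(xml_text)
    return root.find("soap:Body/app:GetTemperature/app:City", NS).text

message = build_request("Chennai")
print(message)
print(read_city(message))
```

In a real deployment the resulting text would be carried in the body of an HTTP or SMTP message; only the message construction is shown here.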

Extensible Markup Language (XML)

Extensible Markup Language (XML) is a markup language that defines a set of rules

for encoding documents in a format that is both human-readable and machine-readable. It is defined in the XML 1.0 Specification produced by the W3C and in several other related specifications. Many application programming interfaces (APIs) have been developed for software developers to use to process XML data, and several schema systems exist to aid in the definition of XML-based languages.

Web Services Description Language (WSDL)

The Web Services Description Language is an XML-based language that is used for

describing the functionality offered by a Web service. A WSDL description of a web service

(also referred to as a WSDL file) provides a machine-readable description of how the service

can be called, what parameters it expects, and what data structures it returns. It thus serves a

roughly similar purpose as a method signature in a programming language. WSDL is often

used in combination with SOAP and an XML Schema to provide Web services over

the Internet.
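To make the comparison with a method signature concrete, the sketch below reads the operations out of a minimal WSDL 1.1 document. The WSDL namespace is the standard one; the `WeatherService` description itself, including its message and operation names, is a hypothetical example and is deliberately incomplete (it has no binding or service section).

```python
import xml.etree.ElementTree as ET

WSDL_NS = {"wsdl": "http://schemas.xmlsoap.org/wsdl/"}  # WSDL 1.1 namespace

# A deliberately minimal, hypothetical WSDL fragment used only for illustration.
SAMPLE_WSDL = """<definitions name="WeatherService"
    targetNamespace="http://example.com/weather"
    xmlns="http://schemas.xmlsoap.org/wsdl/"
    xmlns:tns="http://example.com/weather"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <message name="GetTemperatureRequest">
    <part name="city" type="xsd:string"/>
  </message>
  <message name="GetTemperatureResponse">
    <part name="temperature" type="xsd:float"/>
  </message>
  <portType name="WeatherPortType">
    <operation name="GetTemperature">
      <input message="tns:GetTemperatureRequest"/>
      <output message="tns:GetTemperatureResponse"/>
    </operation>
  </portType>
</definitions>"""

def list_operations(wsdl_text):
    """Return (operation, input message, output message) for each operation."""
    root = ET.fromstring(wsdl_text)
    operations = []
    for op in root.findall("wsdl:portType/wsdl:operation", WSDL_NS):
        input_message = op.find("wsdl:input", WSDL_NS).get("message")
        output_message = op.find("wsdl:output", WSDL_NS).get("message")
        operations.append((op.get("name"), input_message, output_message))
    return operations

print(list_operations(SAMPLE_WSDL))
```

Each tuple plays the role of a signature: the operation name, what it expects, and what it returns.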

Universal Description, Discovery and Integration (UDDI)

UDDI (Universal Description, Discovery and Integration) is a platform-independent framework for describing services, discovering businesses, and integrating business services by using the Internet. It is a directory for storing information about web services, whose interfaces are described by WSDL. It communicates via SOAP.


1.4 COMPONENTS OF WEB SERVICES

The six key components of a web service are:

Self-contained

Self-describing

Modular components

Published

Located

Invoked across web

1.5 FEATURES OF WEB SERVICES

Self-Contained: No additional software is required for a web service.

Client-Side: A programming language with XML/HTML client support.

Server-Side: A web server and a SOAP server are needed.

Loosely-Coupled: Client and server only know about messages - a simple coordination level

that allows for more flexible re-configuration.

Web-Enabled: Web services are published, located and invoked across the web, using

established lightweight Internet standards.

Language-Independent and Interoperable: Client and server may be implemented in different

environments and different languages.

Composable: Web services can be aggregated using workflow techniques to perform higher-level business functions.

Dynamically Bound: With UDDI and WSDL, the discovery and binding of web services can

be automated.

Programmatic Access: The web services approach does not provide a graphical user interface

but operates at the command level.

1.6 WEB SERVICE ARCHITECTURE


PUBLISH

The service provider defines a service description for the web service and publishes it

to a requestor or service discovery agency.

FIND

The service requestor uses a find operation to retrieve the service description locally

or from the discovery agency (service registry).

BIND

The service requestor uses the service description to bind with the service provider

and invoke or interact with the web service implementation.
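The three operations can be sketched with an in-memory registry. This is a simplified illustration, not a UDDI implementation: the `ServiceRegistry` class, the description fields and the `current_temperature` function are hypothetical, and a plain Python callable stands in for a network endpoint.

```python
class ServiceRegistry:
    """Hypothetical in-memory stand-in for a service discovery agency."""

    def __init__(self):
        self._descriptions = {}

    def publish(self, name, description):
        # PUBLISH: the provider registers a description of its service.
        self._descriptions[name] = description

    def find(self, keyword):
        # FIND: the requestor retrieves descriptions whose summary mentions the keyword.
        return [
            description
            for description in self._descriptions.values()
            if keyword.lower() in description["summary"].lower()
        ]

def bind(description):
    # BIND: the requestor uses the description to reach the implementation.
    return description["endpoint"]

def current_temperature(city):
    """Hypothetical service implementation with fixed sample data."""
    return {"Chennai": 31, "Delhi": 27}.get(city)

registry = ServiceRegistry()
registry.publish(
    "TemperatureService",
    {"summary": "Returns the current temperature of a city",
     "endpoint": current_temperature},
)

found = registry.find("temperature")
service = bind(found[0])
print(service("Chennai"))
```

The requestor never refers to `current_temperature` directly; it reaches the implementation only through the published description, which is what allows the provider to be replaced without changing the requestor.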

Figure 1.2 Web service roles and operation


CHAPTER 2

LITERATURE SURVEY

2.1 WEB SERVICES

Fig 2.1 Service Based Architecture

The above diagram is the representation of a service based model. In this model, the

web service resides on the web server. A client computer can request and consume this web

service and terminate the service as and when desired. The name "service" originates from

this model of functioning of the program. The client program needs to be provided with only

the URL of the web service.

The client applications can be running on different platforms, or developed using

different tools. A major application of web services is to integrate such applications, which may be running on heterogeneous platforms. The existence of heterogeneous platforms within an organization or among organizations is a common phenomenon nowadays.

Any organization today works with multiple vendors, suppliers, contractors and other entities. Each of these entities would have developed its own software systems based on Microsoft, Sun Microsystems or IBM technologies. Each of these software systems would have been developed over a period of time with investments of hundreds of thousands of dollars. It would be almost impossible for any of them to change their systems for compatibility. This is where web services come into the picture: they make communication among all these entities possible without affecting their existing systems.


Service-oriented architecture (SOA) is a software design methodology based on

structured collections of discrete software modules, known as services, that collectively

provide the complete functionality of a large or complex software application. Each service

that makes up an SOA application is designed to provide a tightly defined set of functions. As

a result, each service is built as a discrete piece of code. This makes it possible to reuse the

code in different ways throughout the application by changing only the way an individual

service interoperates with other services that make up the application, versus making code

changes to the service itself. SOA design principles are used during

software development and integration.

SOA generally provides a way for consumers of services, such as web-based

applications, to be aware of available SOA-based services. For example, several disparate

departments within a company may develop and deploy SOA services in different

implementation languages; their respective clients will benefit from a well-defined interface

to access them. XML is often used for interfacing with SOA services. JSON is also becoming

increasingly common.

SOA defines how to integrate widely disparate applications for a Web-based

environment and uses multiple implementation platforms. Rather than defining an API, SOA

defines the interface in terms of protocols and functionality. An endpoint is the entry point for

such a SOA implementation.

Service-orientation requires loose coupling of services with operating systems and

other technologies that underlie applications. SOA separates functions into distinct units, or

services, which developers make accessible over a network in order to allow users to

combine and reuse them in the production of applications. These services and their

corresponding consumers communicate with each other by passing data in a well-defined,

shared format, or by coordinating an activity between two or more services.

SOA can be seen in a continuum, from older concepts of distributed computing and

modular programming, through SOA, and on to current practices of mashups, SaaS,

and cloud computing (which some see as the offspring of SOA).


2.2 BENEFITS OF USING WEB SERVICES

Exposing existing functions on the network:

A web service is a unit of managed code that can be remotely invoked using HTTP; that is, it can be activated using HTTP requests. Web services therefore allow you to expose the functionality of your existing code over the network. Once it is exposed on the network, other applications can use the functionality of your program.

Connecting different applications, i.e., interoperability:

Web services allow different applications to talk to each other and share data and services among themselves. Other applications can also use the services that a web service offers. For example, a VB or .NET application can talk to Java web services and vice versa. Web services are thus used to make applications platform and technology independent.

Standardized Protocol:

Web services use industry-standard protocols for communication. All four layers of the web services protocol stack (Service Transport, XML Messaging, Service Description and Service Discovery) use well-defined protocols. This standardization of the protocol stack gives businesses many advantages, such as a wide range of choices, reduced cost due to competition, and increased quality.

Low Cost of communication:

Web services use SOAP over HTTP for communication, so you can use your existing low-cost Internet connection to implement them. This solution is much less costly than proprietary solutions such as EDI/B2B. Besides SOAP over HTTP, web services can also be implemented on other reliable transport mechanisms such as FTP.

2.3 DISCOVERY

In a large number of applications, web services will be provided and used by peers who have already established relationships. Indeed, until a trust infrastructure is fairly well developed, it is not reasonable to expect computers to do automatic comparison shopping for very many services. Web services will probably (like the web in 1993, and the Semantic Web in 2003) spread first within the corporate firewall, where security problems are minor and mistakes are less embarrassing than they would be between enterprises or in public. However, the goal is that so many web services should be available that it will be important to be able to find them in all kinds of ways.


The UDDI project and the related work on description and query systems are aimed at

this. A positive aspect of UDDI is the definition of an ontology for web services. Problems

with it are that it is centralized by design, both in the single-tree ontology, and in the design

based fundamentally on a central registry, with inter-registry operation as a secondary thing.

From the semantic web point of view, web services are simply one aspect of the many things

which will be searched for. Indeed, the fact that a web service is provided may in fact be

rather incidental to the essential nature of the business item which is discovered -- a trader in

stocks, a seller of lawnmowers, and so on. The semantic web aims to describe any aspect of

anything, including the catalogs, parts, materials, services organization, relationships and

contracts. A query system which addresses web services only makes sense when smoothly

integrated with the rest of the web of enterprise knowledge. Web services provide access to

software systems over the Internet using standard protocols. In a minimalistic scenario there

exists at least a Web service provider that publishes some service such as a weather service

and a Web service consumer that uses this service. Web service discovery is the process of

finding a suitable Web service for a given task.

2.4 PATTERN RECOGNITION

In machine learning, pattern recognition is the assignment of a label to a given input

value. An example of pattern recognition is classification, which attempts to assign each

input value to one of a given set of classes (for example, determine whether a given email is

"spam" or "non-spam"). However, pattern recognition is a more general problem that

encompasses other types of output as well. Other examples are regression, which assigns a

real-valued output to each input; sequence labelling, which assigns a class to each member of

a sequence of values (for example, part of speech tagging, which assigns a part of speech to

each word in an input sentence); and parsing, which assigns a parse tree to an input sentence,

describing the syntactic structure of the sentence.

Pattern recognition algorithms generally aim to provide a reasonable answer for all

possible inputs and to perform "most likely" matching of the inputs, taking into account their

statistical variation. This is opposed to pattern matching algorithms, which look for exact

matches in the input with pre-existing patterns. A common example of a pattern-matching

algorithm is regular expression matching, which looks for patterns of a given sort in textual

data and is included in the search capabilities of many text editors and word processors. In

contrast to pattern recognition, pattern matching is generally not considered a type of


machine learning, although pattern-matching algorithms (especially with fairly general,

carefully tailored patterns) can sometimes succeed in providing similar-quality output to the

sort provided by pattern-recognition algorithms.
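The contrast between the two can be illustrated on the spam example mentioned earlier. In the sketch below, a regular expression looks for one exact phrase, while a very small naive Bayes classifier (chosen here only as a simple, well-known statistical method) returns the most likely label for any input. The training messages and the phrase in the pattern are made-up sample data.

```python
import math
import re
from collections import Counter

# Made-up labelled examples used only for illustration.
TRAIN = [
    ("win free money now", "spam"),
    ("free prize claim now", "spam"),
    ("meeting agenda for monday", "non-spam"),
    ("project report attached", "non-spam"),
]

def matches_pattern(text):
    """Pattern matching: succeeds only if the exact phrase occurs in the text."""
    return re.search(r"\bfree money\b", text) is not None

def train(examples):
    """Count words per label; these counts are the learned statistics."""
    word_counts = {}
    doc_counts = Counter()
    for text, label in examples:
        doc_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(text.split())
    vocabulary = {word for counts in word_counts.values() for word in counts}
    return word_counts, doc_counts, vocabulary

def classify(text, model):
    """Pattern recognition: return the most likely label for any input."""
    word_counts, doc_counts, vocabulary = model
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(doc_counts):
        counts = word_counts[label]
        total_words = sum(counts.values())
        score = math.log(doc_counts[label] / total_docs)
        for word in text.split():
            # Add-one smoothing keeps unseen words from zeroing the probability.
            score += math.log((counts[word] + 1) / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

model = train(TRAIN)
message = "claim free cash now"
print(matches_pattern(message), classify(message, model))
```

The sample message does not contain the exact phrase, so the pattern fails to match it, whereas the classifier still assigns it the label its words make most likely.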

Pattern recognition is studied in many fields, including psychology, psychiatry,

ethnology, cognitive science, and traffic flow and computer science. Algorithms for pattern

recognition depend on the type of label output, on whether learning is supervised or

unsupervised, and on whether the algorithm is statistical or non-statistical in nature.

Statistical algorithms can further be categorized as generative or discriminative.

2.5 RELATED RESEARCHES

A CLUSTERING-BASED APPROACH FOR INTEGRATING DOCUMENT-CATEGORY HIERARCHIES

ABSTRACT

E-commerce applications generate and consume a tremendous amount of online

information, which is typically available as textual documents. Conceivably, organizations

and individuals generally use category sets or hierarchies to organize, archive, and access

their documents. Meanwhile, organizations and individuals constantly acquire relevant

documents from various Internet sources, each of which may organize its documents in a

category set or hierarchy different from that used by the acquiring organization or individual.

Consequently, the integration of source documents organized in a category hierarchy into an

existing category hierarchy deployed by the acquiring organization or individual becomes an

important issue in the e-commerce era. Existing category-integration techniques are mainly

designed to integrate document catalogs, each of which is organized non-hierarchically (i.e.,

in a flat set). The proposed system is a clustering-based category-hierarchy integration (CHI)

technique, which is an extension of the clustering-based category-integration (CCI)

technique. Our empirical evaluation results show that the proposed CHI technique appears to

improve the effectiveness of category-hierarchy integration compared with that attained by

non-hierarchical category-integration techniques, particularly in homogeneous and

comparable scenarios.

INFERENCE

This paper is intended to serve as a foundation for continued research on category-hierarchy integration. Additional research should be aligned strategically to enhance the

generalizability and the effectiveness of our proposed technique. First, as analysed in our


evaluation results, the non-hierarchical CCI and ENB techniques significantly outperform the

proposed CHI technique in the heterogeneous integration scenario. Therefore, the capability

of CHI to handle the integration of heterogeneous category hierarchies should be enhanced.

Second, our current evaluation study employs only one document corpus in a specific domain

(i.e., research articles). Evaluations of the proposed CHI technique using other document

corpora that pertain to more diversified domains (e.g., web pages and news stories) would

improve the generalizability of the evaluation results reported in this paper.

MULTI-TYPE FEATURES CO-SELECTION FOR WEB DOCUMENT

CLUSTERING

ABSTRACT

Feature selection has been widely applied in text categorization and clustering.

Compared to unsupervised selection, supervised feature selection is more successful in

filtering out noise in most cases. However, due to a lack of label information, clustering can

hardly exploit supervised selection. Some studies have proposed to solve this problem by

“pseudo class.” As empirical results show, this method is sensitive to selection criteria and

data sets. In this paper, they propose a novel feature co-selection for Web document

clustering, which is called Multi-type Features Co-selection for Clustering (MFCC). MFCC

uses intermediate clustering results in one type of feature space to help the selection in other

types of feature spaces. Our experiments show that for most selection criteria, MFCC effectively reduces the noise introduced by “pseudo class,” and further improves clustering performance.

INFERENCE

The authors propose Multi-type Features Co-selection for Clustering (MFCC), a novel algorithm that exploits different types of features to perform Web document clustering. The intermediate clustering result in one feature space is used as additional information to enhance the feature selection in other spaces. Consequently, the better feature set co-selected by heterogeneous features produces better clusters in each space, and the better intermediate result in turn improves co-selection in the next iteration. Finally, feature co-selection is implemented iteratively and can be well integrated into an iterative clustering algorithm.


TOWARD INTEGRATING FEATURE SELECTION ALGORITHMS FOR

CLASSIFICATION AND CLUSTERING

ABSTRACT

This paper introduces concepts and algorithms of feature selection, surveys existing

feature selection algorithms for classification and clustering, groups and compares different

algorithms with a categorizing framework based on search strategies, evaluation criteria, and

data mining tasks, reveals un-attempted combinations, and provides guidelines in selecting

feature selection algorithms. With the categorizing framework, the authors continue their efforts toward building an integrated system for intelligent feature selection. A unifying platform is proposed as an intermediate step. An illustrative example is presented to show how existing feature selection algorithms can be integrated into a meta algorithm that can take advantage of individual algorithms. An added advantage of doing so is to help a user employ a suitable algorithm without knowing the details of each algorithm. Some real-world applications are included to demonstrate the use of feature selection in data mining. The paper concludes by identifying trends and challenges in feature selection research and development.

INFERENCE

This survey provides a comprehensive overview of various aspects of feature selection. The authors introduce two architectures: a categorizing framework and a unifying platform. They categorize the large body of feature selection algorithms, reveal future

directions for developing new algorithms, and guide the selection of algorithms for intelligent

feature selection. The categorizing framework is developed from an algorithm designer’s

viewpoint that focuses on the technical details about the general procedures of feature

selection process. A new feature selection algorithm can be incorporated into the framework

according to the three dimensions. The unifying platform is developed from a user’s

viewpoint that covers the user’s knowledge about the domain and data for feature selection.

The unifying platform is one necessary step toward building an integrated system for

intelligent feature selection. The ultimate goal for intelligent feature selection is to create an

integrated system that will automatically recommend the most suitable algorithm(s) to the

user while hiding all technical details irrelevant to an application.


A PATTERN-RECOGNITION-BASED ALGORITHM AND CASE STUDY FOR

CLUSTERING AND SELECTING BUSINESS SERVICES

ABSTRACT

Positioned as the backbone of service asset management console, a service registry

has to enable real-time and offline service selection in an effective manner. This paper

presents an analytic algorithm that is used to guide the architectural design of service

exploration in a service registry. Service assets are proposed to be framed into a well-

established categorical structure based on pattern recognition algorithm. This design aims to

provide systematic methodology and enablement architecture for analyzing, clustering, and

adapting heterogeneous services for dynamic application integration. The exploitation of

pattern recognition algorithm maps a large amount of services into a manageable feature

space, which consists of attributes that are related to static description and dynamic features,

such as historical QoS and service-level agreement. The proposed architecture and associated service exploration methodology have been integrated into an industry-strength service-oriented architecture solution design platform. The authors also present a case study using the developed platform to illustrate the proposed algorithm for business service clustering and selection.

INFERENCE

In this paper, the authors address the scalability issue arising from the design of a service asset management system. They propose a service organization structure, referred to as a multilevel service hierarchy, which can be used as the backbone storage architecture underlying the service registry. They then systematically study the construction and management schemes for this service hierarchy. Pattern recognition algorithms are applied to exploit the embedded service attributes. The authors describe this research as essential to their ongoing effort to build a comprehensive service asset management platform for SOA service solution design, and note that several issues remain to be resolved.


SEMANTICS-BASED AUTOMATED SERVICE DISCOVERY

ABSTRACT

A vast majority of web services exist without explicit associated semantic

descriptions. As a result many services that are relevant to a specific user service request may

not be considered during service discovery. This paper addresses the issue of web service discovery given non-explicit service description semantics that match a specific service request. The approach to semantic-based web service discovery involves semantic-based service categorization and semantic enhancement of the service request. The proposed system provides a solution for achieving functional-level service categorization based on an ontology framework. Additionally, clustering is used for accurately classifying the web services based on service functionality. The semantic-based categorization is performed offline at the Universal Description, Discovery and Integration (UDDI) registry. The semantic enhancement of the service request achieves a better matching with relevant services. The service request enhancement involves expanding the request with additional terms (retrieved from the ontology) that are deemed relevant for the requested functionality. An efficient matching of the enhanced service request with the retrieved service descriptions is achieved using Latent Semantic Indexing (LSI). The experimental results validate the effectiveness and feasibility of the proposed approach.

INFERENCE

In this paper, the authors present an integrated approach for automated service discovery. Specifically, the approach addresses two major aspects of semantic-based service discovery: semantic-based service categorization and semantic-based service selection. For semantic-based service categorization, it provides an ontology-guided categorization of web services into functional categories for service discovery. This leads to better service discovery by matching the service request with an appropriate service description. For semantic-based service selection, ontology linking (semantic web) and LSI are employed, thus extending the indexing procedure from solely syntactical information to a semantic level.


CHAPTER 3

PROBLEM DEFINITION

End users have specific interests in web services and their functionality, and researchers have proposed several techniques for discovering the web services that satisfy user needs, including Dual Clustering and Service Matching. Web services that are relevant to a specific user request are often not considered during discovery because they lack explicit semantic descriptions. Our approach deals with the issue of service discovery given non-explicit service description semantics that match a particular service request.

3.1 EXISTING SYSTEMS

3.1.1 Service Matching Technique

Identifying similar web services is becoming an increasingly important issue for the success of dynamically integrated web-service-based applications. Ontology-based models are applied to improve the searching capabilities of agent systems. Matchmaking problems arise when a service is requested, because the distance between the required service description and the descriptions held in the service registry must be calculated. Service and resource matching is currently discussed as one of the important challenges for the next generation of semantic discovery approaches for web agents and web services. Correct ontology-based matching tools are needed to integrate agent, grid service, and web service technologies effectively. A categorization-based scheme is used to identify equivalent web services that may operate on diverse domain ontologies. Using ontology instance categorization, the matching system determines whether a given web service is a possible replacement, and it adapts itself by enriching the recognized ontologies with the newly determined ontology instances.


3.1.2 Dual Clustering Technique

Service requestors increasingly look for the ability to search for existing web services in large Internet-based repositories, with the main objective of retrieving services that match the user’s requirements. As repositories grow in the number of services and it becomes harder to find the right ones quickly, the need for clustering related services becomes evident, so that search engine results can be improved with a list of similar services for each hit. Spatial clustering has various applications. In most conventional clustering problems, the geometric attributes are used for the similarity measurement, whereas users in many real applications are generally concerned with the non-geometric attributes. In conventional spatial clustering the input data set is partitioned into several compact regions, and data points that are similar to one another in their non-geometric attributes may be scattered over various regions, which makes the corresponding objective difficult to reach. Dual clustering is a remedy for this. The constraint domain holds the application-dependent constraints, and the attributes in the optimization domain are those involved in the optimization of the objective function. The ICC algorithm combines the information in both domains and iteratively performs a clustering algorithm on the optimization domain.

3.2 DRAWBACKS OF EXISTING SYSTEM

Although the existing systems have benefits such as combining the information from both domains and matching a service for replacement, they fail to provide related services in terms of functionality, i.e., in terms of what the service actually performs. Most of the time the systems work on a syntactic basis, and the majority of existing service descriptions are syntactic in nature. The same syntax might be used in different places for different purposes. When a service is requested, only the small number of services that are an exact syntactic match of the request are selected. The selected services may perform functions other than those requested, and the discovery process is limited by its dependence on human involvement for selecting the right service based on its functionality.


3.3 PROPOSED SYSTEM

The existing systems have advantages such as the optimization of domains and the replacement of web services, but they lack important properties such as functionality-based categorization and the delivery of the appropriate service to the user. The proposed system deals with the issue of service discovery given non-explicit service description semantics that match a particular service request.

The proposed system involves semantic-based service categorization, performed at the UDDI, as the key to achieving service categorization at the functional level based on an ontology skeleton. The proposed work categorizes the services based on their functionality and clusters them by category along with related ontology concepts. The pattern recognition algorithm is then used to retrieve the service that matches the user request.

3.3.1 Semantic Categorization

The process begins with the semantic categorization of services in the UDDI, where the ontology concepts are used. The semantic categorization is achieved by adding a user-defined tag to the WSDL file, so that a particular search keyword makes the services fall under a given category. A single service can be made to appear in different categories by applying the ontology concepts that identify the relationships. The user-defined tag is supplied by the service provider and is based on the functionality of the service. Web service description vectors are built, and the markup and index entries are removed. Building a web service vector generally involves parsing each WSDL file in the initial WSDL set, together with its description and related parameters. The web service vector is then modified: enhancing the service vector with concepts from the core ontology resolves issues related to synonyms and introduces domain-related concepts that provide context. First, the relevant ontology concepts are added to the initial service vector. The architectural design of the proposed system is shown in Figure 6.1.
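The vector-building and enrichment steps described above can be illustrated with a minimal Java sketch (Java 9 or later). It is not the project's implementation: the class name, the `MARKUP` stop list, and the toy ontology map are assumptions made for illustration, and the WSDL file is represented by plain text that has already been extracted from it.

```java
import java.util.*;

public class ServiceVectorSketch {
    // Hypothetical stop list standing in for WSDL markup and index entries.
    static final Set<String> MARKUP = Set.of("wsdl", "xsd", "soap", "message", "part", "porttype");

    // Build a term-frequency vector from the text extracted from one WSDL file.
    static Map<String, Integer> buildVector(String wsdlText) {
        Map<String, Integer> vector = new TreeMap<>();
        for (String token : wsdlText.toLowerCase().split("[^a-z]+")) {
            if (token.isEmpty() || MARKUP.contains(token)) continue; // drop markup terms
            vector.merge(token, 1, Integer::sum);
        }
        return vector;
    }

    // Add the ontology concepts related to each term already in the vector.
    static Map<String, Integer> enrich(Map<String, Integer> vector,
                                       Map<String, List<String>> ontology) {
        Map<String, Integer> enriched = new TreeMap<>(vector);
        for (String term : vector.keySet()) {
            for (String concept : ontology.getOrDefault(term, List.of())) {
                enriched.merge(concept, 1, Integer::sum);
            }
        }
        return enriched;
    }

    public static void main(String[] args) {
        // Assumed ontology fragment: "movie" is related to "film" and "entertainment".
        Map<String, List<String>> ontology = Map.of("movie", List.of("film", "entertainment"));
        Map<String, Integer> base = buildVector("wsdl:message GetMovie movie genre");
        System.out.println(enrich(base, ontology));
    }
}
```

Adding related concepts as extra vector entries is one simple way to let synonyms of a search keyword match the same service; a real system would take the concepts from the core ontology rather than from a hard-coded map.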

3.3.2 Clustering

The next step of this phase involves deleting irrelevant terms based on the ranking of semantic relationships among the terms. Functionally similar services are grouped together by clustering the service vectors. Clustering is done hierarchically, which facilitates the classification of all the services: each secondary cluster and the combinations of secondary clusters create a hierarchy, a structure that is more informative than a set of unstructured clusters. The service categorization is performed manually and only the service request enhancement is performed at runtime, so the increase in timing delay will not be significant. The relevant ontology concepts are associated with the clusters when they are created. Associating concepts with each cluster helps web service discovery by mapping clusters to functional categories. A cluster is defined as Ѳi = cj, where cj is the corresponding ontology concept. The ontology concepts provide the semantics for web service categorization.
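As an illustration of hierarchical grouping of service vectors, the following Java sketch performs bottom-up (agglomerative) clustering. The report does not state which similarity measure or linkage rule is used, so cosine similarity and single linkage are assumptions here, as are the class and method names.

```java
import java.util.*;

public class HierarchicalClusteringSketch {
    // Cosine similarity between two service vectors of equal length.
    static double cosine(double[] a, double[] b) {
        double dot = 0, na = 0, nb = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            na += a[i] * a[i];
            nb += b[i] * b[i];
        }
        return (na == 0 || nb == 0) ? 0 : dot / Math.sqrt(na * nb);
    }

    // Single-link similarity: the best similarity between any two members.
    static double linkage(List<Integer> x, List<Integer> y, double[][] v) {
        double best = -1;
        for (int i : x) for (int j : y) best = Math.max(best, cosine(v[i], v[j]));
        return best;
    }

    // Repeatedly merge the two most similar clusters until k clusters remain.
    // Each cluster is a list of indices into v; k is treated as at least 1.
    static List<List<Integer>> cluster(double[][] v, int k) {
        List<List<Integer>> clusters = new ArrayList<>();
        for (int i = 0; i < v.length; i++) clusters.add(new ArrayList<>(List.of(i)));
        while (clusters.size() > Math.max(k, 1)) {
            int bi = 0, bj = 1;
            double best = -2;
            for (int i = 0; i < clusters.size(); i++) {
                for (int j = i + 1; j < clusters.size(); j++) {
                    double s = linkage(clusters.get(i), clusters.get(j), v);
                    if (s > best) { best = s; bi = i; bj = j; }
                }
            }
            clusters.get(bi).addAll(clusters.remove(bj)); // merge cluster bj into bi
        }
        return clusters;
    }

    public static void main(String[] args) {
        double[][] vectors = {{1, 0}, {0.9, 0.1}, {0, 1}, {0.1, 0.9}};
        System.out.println(cluster(vectors, 2)); // prints [[0, 1], [2, 3]]
    }
}
```

The sequence of merges is what forms the hierarchy: stopping at a larger k gives the secondary clusters, and continuing to merge gives their combinations.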


CHAPTER 4

SOFTWARE REQUIREMENTS SPECIFICATION

The software requirements specification document lists all the requirements necessary for project development. Deriving the requirements calls for a clear and thorough understanding of the product to be developed.

4.1 PURPOSE

The main purpose of the software requirements specification is to describe the technical, functional, and non-functional features of the system. The semantic-based web system is built mainly on the existing web service architecture, extended with user-defined tags.

4.2 SCOPE

The main aim is to build a semantic-based web system that reduces the effort of the user by providing a more appropriate service, thereby producing a system that tends to learn from the observed set of services provided. When the system is then presented with a new set of services, it will be easier for it to provide them.

4.3 SYSTEM REQUIREMENT

4.3.1 Hardware Requirements

Hardware requirements are the physical computer resources, and they are the most common set of requirements defined by any operating system or software application. A hardware requirements list is often accompanied by a hardware compatibility list (HCL), especially in the case of operating systems. An HCL lists tested, compatible, and sometimes incompatible hardware devices for a particular operating system or application. The proposed system has the following hardware requirements.

Processor : Intel Pentium D or higher

Random Access Memory : 256 MB

Secondary Memory : 100 MB

Display : Color Monitor

Keyboard : Windows OS Compatible


Mouse : Windows OS Compatible

Network : 512 Kbps Internet / 100 Mbps LAN

4.3.2 Software Requirements

Software requirements define the software resources and prerequisites that need to be installed on a computer for an application to function optimally. In computing, a platform is a framework, in either hardware or software, that allows software to run. The proposed system has the following software requirements:

Operating System : Windows XP Service Pack 2 or later

Coding Environment : Java

Tool : Eclipse

Database : Microsoft Access

4.4 FEATURES OF PROPOSED SYSTEM

The proposed system differs from existing systems in that it provides the exact service requested by the user for any input regarding the auction. The approach starts with the semantic categorization of services in the UDDI, where the ontology concepts are used. The semantic categorization is achieved by adding a user-defined tag to the WSDL file.

The next step of this phase involves deleting irrelevant terms based on the ranking of semantic relationships among the terms. Functionally similar services are grouped together by clustering the service vectors. The service then has to be selected from a cluster containing a number of services. The pattern recognition algorithm is used to search for the appropriate service.

4.5 LIMITATION OF PROPOSED SYSTEM

The proposed system can also be extended to support web service composition.

Typically, multiple services have to be discovered so that they together match a service

request. It should be possible to utilize ontologies, and explicitly return the sequence of

individual service invocations to be performed in order to achieve the desired composite


service. When no full match is possible, a flexible matching approach could be created to

return partial matches and/or suggest additional inputs that would produce a full match by

capturing the dependencies among the matched services. This has several interesting research

issues. Another avenue for future work is to create an interactive, intelligent service

composer that is semantically guided to locate the target service components step by step.


CHAPTER 5

OBJECT ORIENTED ANALYSIS

5.1 USE CASE DIAGRAM

The following use case diagram depicts the way in which the input parameters are fed

into the database which is required to train the system.

Figure 5.1 Use Case Diagram

The use case diagram of the cluster-based approach using pattern recognition consists of four main actors (admin, channel, portal, and database), and the events occurring between them are shown in the diagram. The admin controls the entire process and provides credentials to the channel and portal subscribers. These actors establish various relationships among themselves to carry out activities or to retrieve the services requested by the user. The channel provider is used to register the channel, and the portal is used to validate the service.

[Figure 5.1 labels: Admin, Portal, channel, Login, Create Account, Validate, Rule Setting, IPTV portal, channel credential, package list, Package Explorer]

5.2 SEQUENCE DIAGRAM

The sequence diagram is a kind of interaction diagram that shows how processes operate with one another and in what order. Sequence diagrams are sometimes called event-trace diagrams, event scenarios, and timing diagrams. It shows how the system is trained with the input data so as to make a more intelligent system that will be able to predict the different stages in the renal disorder more accurately.

Figure 5.2 Sequence Diagram

The sequence diagram of the cluster-based approach using pattern recognition consists of four main actors (admin, channel, portal, and database), and the events occurring between them are listed below. These actors establish various relationships among themselves as events occur or as the services requested by the user are retrieved.

Lifelines: Admin, Portal, Channel, Database

1: Login

2: Set Credential

3: Set Credential

4: Register Movies

5: Validate

6: Update XML

7: Rule Setting

8: Search Similar Service

9: Provide Similar Service


5.3 ACTIVITY DIAGRAM

An activity diagram illustrates the dynamic nature of a system by modeling the flow of control from activity to activity. It depicts the overall flow of the training process and the testing process to obtain the best probabilistic approach. Because an activity diagram is a special kind of state chart diagram, it uses some of the same modeling conventions. The activity diagram in Figure 5.3 describes the different modules. It starts with a login page for the channel, admin, and portal users. The purpose of channel registration is to add new or different users for the listed channels. After registering, the user can retrieve the needed information through the Internet. The purpose of the portal is to validate whether services are registered properly in the UDDI registry in XML format.

Figure 5.3 Activity Diagram


5.4 CLASS DIAGRAM

The class diagram represents the classes with their attributes and operations. This project has the classes admin, channel, portal, and Package Exporter, and each module has different functions and attributes. The admin has functions such as channel credential, portal credential, UDDI registry, and rule setting, whereas the portal has the function of validating, or checking the constraint, that the services are registered properly in the UDDI in XML format or in WSDL file tag form. The Package Exporter has the function of retrieving the services that the user needs.

Figure 5.4 Class Diagram


5.5 DEPLOYMENT DIAGRAM

The deployment diagram represents the integration of the modules developed in the project. The bidders are linked with the bidding environment through the user interface. The user interface directs the user data to the database through the modem. A modem is present at both the bidder end and the seller end, and it acts as the communication medium, in wired or wireless mode, for services from the Internet.

The requestor modem directs the request message to the user interface at the requestor end. This interface is used by the requestor to provide the services to the Internet environment. The requestor data also reaches the database through the interfaces and the modem. The database is thus the center of the auction environment, providing data services to the bidders and the requestor through the networking gateway.

Figure 5.5 Deployment Diagram


CHAPTER 6

OBJECT ORIENTED (PROJECT) DESIGN

6.1 ARCHITECTURAL DESIGN

The architecture diagram shows the relationships between the different components of the system and is important for understanding the overall concept of the system. In a software project, the architecture diagram is the diagrammatic representation of the internal features of the project.

Figure 6.1 Architectural Design of the Proposed System

[Figure 6.1 components: Web Service Registry (UDDI), WSDL Dataset, Service Categorization, Clustering, WSDL Parameter, Association Rule, Association Pattern Discovery, Semantic Relation, Semantic Matching, Ontology Concept, WS Request, Expanded Request]

The approach starts with indexing the WSDL dataset associated with the semantically categorized services in the UDDI, where the ontology concepts are employed. The semantic categorization is achieved by adding a user-defined tag to the WSDL file, so that a particular search keyword makes the services fall under a given category. A single service can be made to appear in different categories by applying the ontology concepts that identify the relationships. The user-defined tag is supplied by the service provider and is based on the functionality of the service. Web service description vectors are built, and the markup and index entries are removed. Building a web service vector generally involves parsing each WSDL file in the initial WSDL set, together with its description and related parameters. The web service vector is then modified: enhancing the service vector with concepts from the core ontology resolves issues related to synonyms and introduces domain-related concepts that provide context.

Clustering

Functionally similar services are grouped together by clustering the service vectors. Clustering is done hierarchically, which facilitates the classification of all the services: each secondary cluster and the combinations of secondary clusters create a hierarchy, a structure that is more informative than a set of unstructured clusters. The service categorization is performed manually and only the service request enhancement is performed at runtime, so the increase in timing delay will not be significant. The relevant ontology concepts are associated with the clusters when they are created.

Parameter-Based Service Refinement

This phase selects a service from the related group of services. The input, output, and description of a web service help the refinement process by narrowing the set of appropriate services that match the service request. Statistical associations are used to represent the relationships between the web service input and output parameters. The parameter relationship pattern item set is built for all the web services within the cluster. The corresponding WSDL document is then processed to retrieve the relevant service parameters. The user assigns a weight to each of the parameters to refine the request, which makes the ranking process more flexible. Assigning binary values to the ranking parameters is an important task. The association pattern mining phase generates a large number of association patterns, and the patterns carrying unrelated information that would negatively influence the service discovery process have to be discarded.
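The weighting and binary-value steps above can be read as a weighted sum: each requested parameter contributes its user-assigned weight if the service offers it (binary value 1) and nothing otherwise (binary value 0). The Java sketch below follows that reading. The scoring formula, the class name, and the sample parameter names are assumptions for illustration, not the report's exact ranking rule.

```java
import java.util.*;

public class ParameterRankingSketch {
    // Score = sum of the user weights of the requested parameters the service offers.
    static double score(Set<String> serviceParams, Map<String, Double> weights) {
        double total = 0.0;
        for (Map.Entry<String, Double> e : weights.entrySet()) {
            int match = serviceParams.contains(e.getKey()) ? 1 : 0; // binary value
            total += match * e.getValue();
        }
        return total;
    }

    // Order service names from highest to lowest score; ties are broken by name.
    static List<String> rank(Map<String, Set<String>> services, Map<String, Double> weights) {
        List<String> names = new ArrayList<>(services.keySet());
        names.sort(Comparator
                .comparingDouble((String n) -> score(services.get(n), weights))
                .reversed()
                .thenComparing(Comparator.<String>naturalOrder()));
        return names;
    }

    public static void main(String[] args) {
        // Hypothetical services within one cluster and their WSDL parameters.
        Map<String, Set<String>> services = Map.of(
                "A", Set.of("genre", "channel"),
                "B", Set.of("genre"));
        // Hypothetical user weights for the requested parameters.
        Map<String, Double> weights = Map.of("genre", 2.0, "channel", 1.0);
        System.out.println(rank(services, weights)); // prints [A, B]
    }
}
```

Because the weights come from the user, the same cluster can be ranked differently for different requests without rebuilding the cluster.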


Service Search

The next phase of the proposed system is to search for the appropriate service based on the user request. The service has to be selected from a cluster containing a number of services. The pattern recognition algorithm is used to search for the appropriate service.


6.2 USER INTERFACE DESIGN

The user interface design describes the interface between the user and the system. Through this interface, the user interacts with the system and obtains the service.

Fig 6.2 Index Page

Fig 6.3 Admin Login

Fig 6.4 Channel Registration

Fig 6.5 Portal Registration

Fig 6.6 Pattern Setting

Fig 6.7 Channel Login

Fig 6.8 Retrieve Service

Fig 6.9 Alert Message

6.3 DATA DESIGN

The data design defines how the project database accepts and processes the given data. During the registration phase, the user name and the password are registered and saved in the database. Along with the user name and the password, the mode is selected, either read or write.

Login table

S.no  Column name  Description  Size  Datatype  Remark
1.    U_name       Username     15    Varchar   Primary
2.    Pwd          Password     15    Varchar   Not null

Registration table

S.no  Column name  Description   Size  Datatype  Remark
1.    chan         Channel Name  15    Varchar   Primary
2.    gen          Genre         15    Varchar   Not null
3.    imdb         IMDB          10    Varchar   Not null

Table 6.10 Data Design

CHAPTER 7

DESIGN OF WORKING MODEL

7.1 CLUSTERING

Functionally similar services are grouped together by clustering the service vectors. Clustering is done hierarchically, which facilitates the classification of all the services: each secondary cluster and the combinations of secondary clusters create a hierarchy, a structure that is more informative than a set of unstructured clusters. The service categorization is performed manually and only the service request enhancement is performed at runtime, so the increase in timing delay will not be significant. The relevant ontology concepts are associated with the clusters when they are created. Associating concepts with each cluster helps web service discovery by mapping clusters to functional categories. A cluster is defined as Ѳi = cj, where cj is the corresponding ontology concept. The ontology concepts provide the semantics for web service categorization. A set is then built that contains all concepts occurring in at least one service description, and duplicate concepts are removed. This is followed by locating the places of the remaining concepts in the concept hierarchy Hc. Each concept is checked for a subsumes or subsumed-by relationship with the elements of the set. The resultant super concept is then mapped to the cluster. Associating the ontology concept with the cluster extends the semantic information in the UDDI; this is done by creating tModels for the associated web services of the cluster within the registry. The relateOntologyCluster algorithm is given below.

ALGORITHM

relateOntologyCluster

Input: Web Service Description cluster set Ѳ = {Ѳ1, Ѳ2, …, Ѳn},
       Min. Term Frequency Threshold µ
Output: Modified UDDI tModels

begin
  for each Web Service cluster Ѳi do
    Retrieve the modified Web Service vectors wsm ∈ Ѳi
    for each term tj ∈ Ѳi do
      Calculate term frequency (tj, xj)
      if xj < µ then
        delete tj
      end if
    end for
    Map each remaining term tj ≈ cj
    Traverse Hc for upper ontology concept C
    if cj ⊆ C then        (the term concept is subsumed by the upper concept)
      CѲ = C
    else
      CѲ = cj
    end if
    Map CѲ to Ѳi
    for each Web Service wk ∈ Ѳi do
      Update the tModel of wk to include CѲ
    end for
  end for
end
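The pseudocode above can be sketched in Java for a single cluster. This is one possible reading rather than the project's code: the concept hierarchy Hc is reduced to a child-to-parent map, the cluster concept is taken from the most frequent surviving term, and only the direct parent is considered as the upper concept. The class name and all map contents are hypothetical. The tModel update is omitted because it depends on the UDDI client in use.

```java
import java.util.*;

public class RelateOntologyClusterSketch {
    // Returns the concept to attach to one cluster, or null if no term survives.
    static String clusterConcept(List<Map<String, Integer>> cluster, int mu,
                                 Map<String, String> termToConcept,
                                 Map<String, String> parentOf) {
        // Sum the term frequencies over all service vectors in the cluster.
        Map<String, Integer> freq = new TreeMap<>();
        for (Map<String, Integer> vector : cluster) {
            vector.forEach((term, count) -> freq.merge(term, count, Integer::sum));
        }
        // Delete the terms whose frequency is below the threshold mu.
        freq.values().removeIf(x -> x < mu);
        // Map terms to concepts and keep the concept of the most frequent term.
        String cj = null;
        int best = -1;
        for (Map.Entry<String, Integer> e : freq.entrySet()) {
            String concept = termToConcept.get(e.getKey());
            if (concept != null && e.getValue() > best) {
                best = e.getValue();
                cj = concept;
            }
        }
        if (cj == null) return null;
        // If cj is subsumed by an upper concept in Hc, use the upper concept.
        return parentOf.getOrDefault(cj, cj);
    }

    public static void main(String[] args) {
        List<Map<String, Integer>> cluster = List.of(
                Map.of("movie", 3, "genre", 1),
                Map.of("movie", 2, "channel", 1));
        Map<String, String> termToConcept = Map.of("movie", "Film");
        Map<String, String> parentOf = Map.of("Film", "Entertainment");
        System.out.println(clusterConcept(cluster, 2, termToConcept, parentOf)); // prints Entertainment
    }
}
```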

7.2 SERVICE SEARCH

An important phase of the proposed system is searching for the appropriate service based on the user request. The service has to be selected from a cluster containing a number of services. The pattern recognition algorithm is used to search for the appropriate service [9]. The searchClusters algorithm is given below.

ALGORITHM

searchClusters(hN, S, FSi)

hN: head node of the linked list containing a collection of sibling clusters
S: the service to be searched for in the clusters
FSi: feature section to be used for comparing similarity
Dist(hPtr→data, S | FSi): distance between S and the cluster, measured on FSi
hPtr→data: the cluster that hPtr points to

hPtr = hN→Next;
σ = α;
while hPtr ≠ Null
  if σ > Dist(hPtr→data, S | FSi)
    σ = Dist(hPtr→data, S | FSi);
    Cl = hPtr→data;
  end if
  hPtr = hPtr→Next;
end while
return Cl

The searchClusters algorithm iterates over the linked list headed by hN, keeps the smallest distance found so far in σ, and returns the cluster Cl that has the minimum distance to S.
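A minimal Java sketch of searchClusters follows. The report does not define Dist or the initial value α, so this sketch assumes a Euclidean distance over the feature indices in the selected feature section and starts σ at positive infinity. The `Node` class, its `centroid` field, and the sample data are likewise assumptions for illustration.

```java
public class SearchClustersSketch {
    // One cluster in the linked list of sibling clusters.
    static final class Node {
        final String name;
        final double[] centroid; // assumed numeric representation of the cluster
        Node next;
        Node(String name, double[] centroid) {
            this.name = name;
            this.centroid = centroid;
        }
    }

    // Assumed Dist: Euclidean distance restricted to the feature indices in fs.
    static double dist(double[] a, double[] b, int[] fs) {
        double sum = 0.0;
        for (int i : fs) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    // Walk the list after the head node and return the closest cluster.
    static Node searchClusters(Node head, double[] service, int[] fs) {
        Node ptr = head.next;                    // hPtr = hN→Next
        double sigma = Double.POSITIVE_INFINITY; // stands in for α
        Node closest = null;                     // Cl
        while (ptr != null) {
            double d = dist(ptr.centroid, service, fs);
            if (sigma > d) {
                sigma = d;
                closest = ptr;
            }
            ptr = ptr.next;
        }
        return closest;
    }

    public static void main(String[] args) {
        Node head = new Node("head", null);      // sentinel head node, holds no cluster
        Node news = new Node("news", new double[]{0, 0, 5});
        Node movies = new Node("movies", new double[]{1, 1, 0});
        Node sports = new Node("sports", new double[]{4, 4, 0});
        head.next = news;
        news.next = movies;
        movies.next = sports;
        // Compare only on features 0 and 1 (the feature section).
        Node result = searchClusters(head, new double[]{1, 2, 9}, new int[]{0, 1});
        System.out.println(result.name);         // prints movies
    }
}
```

Restricting the distance to a feature section means the same cluster list can be searched on static description features or on dynamic features by passing a different index set.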


CHAPTER 8

EXPERIMENTAL DESIGN

Input design is the link between the information system and the user. It comprises developing the specifications and procedures for data preparation, and the steps necessary to put transaction data into a usable form for processing. This can be achieved by instructing the computer to read data from a written or printed document, or by having people key the data directly into the system. The design of input focuses on controlling the amount of input required, controlling errors, avoiding delay, avoiding extra steps, and keeping the process simple. The input is designed so that it provides security and ease of use while retaining privacy. Input design considers the following:

What data should be given as input?

How the data should be arranged or coded?

The dialog to guide the operating personnel in providing input.

Methods for preparing input validations and steps to follow when error occur.

OBJECTIVES

1. Input design is the process of converting a user-oriented description of the input into a computer-based system. This design is important to avoid errors in the data input process and to show management the correct way to obtain correct information from the computerized system.

2. It is achieved by creating user-friendly data entry screens that can handle large volumes of data. The goal of designing input is to make data entry easier and free from errors. The data entry screen is designed so that all data manipulations can be performed. It also provides record viewing facilities.

3. When data is entered, it is checked for validity. Data can be entered with the help of screens. Appropriate messages are provided as and when needed so that the user is not left confused. Thus the objective of input design is to create an input layout that is easy to follow.

OUTPUT DESIGN

A quality output is one that meets the requirements of the end user and presents the information clearly. In any system, the results of processing are communicated to the users and to other systems through outputs. Output design determines how the information is to be displayed for immediate need and also as hard copy output. Output is the most important and direct source of information for the user. Efficient and intelligent output design improves the system's relationship with the user and helps user decision-making.

Designing computer output should proceed in an organized, well thought out manner; the right output must be developed while ensuring that each output element is designed so that people find the system easy and effective to use. When analysts design computer output, they should:

1. Identify the specific output that is needed to meet the requirements.

2. Select methods for presenting information.

3. Create documents, reports, or other formats that contain information produced by the system.

The output form of an information system should accomplish one or more of the following

objectives.

Convey information about past activities, current status, or projections of the future.

Signal important events, opportunities, problems, or warnings.

Trigger an action.

Confirm an action.

The user registration module covers the design and implementation of user registration via web-based services. This module also establishes communication between the client and the web-based service. The semantic categorization of the UDDI combines ontology with an established hierarchical clustering methodology, following the service description vector building process. For each term in the service description vector, a corresponding concept is located in the relevant ontology. If there is a match, the concept is added to the description vector. The next step is service selection from the relevant category of services using parameter-based service refinement. Web service parameters, i.e., input, output, and description, aid service refinement by narrowing the set of appropriate services matching the service request. The relationship between web service input and output parameters may be represented as statistical associations. These associations relay information about the operation parameters that are frequently associated with each other. The parameter-based refined set of web services is then matched against an enhanced service request as part of Semantic Similarity-based Matching. A key part of this process involves enhancing the service request. Our approach for semantic similarity-based service selection employs ontology-based request enhancement and LSI-based service matching.
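
The concept lookup described above, in which a matching ontology concept is added to the description vector, might look like the following Java sketch. The class name, the method name, and the use of a plain term-to-concept map in place of a real ontology are assumptions made only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch; a plain map stands in for the ontology lookup. */
final class VectorEnrichmentSketch {

    /**
     * Returns a copy of the description vector with every matched
     * ontology concept appended once; unmatched terms are kept as they are.
     */
    static List<String> enrich(List<String> terms, Map<String, String> termToConcept) {
        List<String> enriched = new ArrayList<>(terms);
        for (String term : terms) {
            String concept = termToConcept.get(term);  // null when the ontology has no match
            if (concept != null && !enriched.contains(concept)) {
                enriched.add(concept);
            }
        }
        return enriched;
    }
}
```

For example, a vector holding "flight" and "xyz" together with a lookup that maps "flight" to "Transport" would yield "flight", "xyz", "Transport".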

The basic idea of the proposed approach is to enhance the service request with relevant ontology terms and then compute the similarity between the semantically enhanced service request and the web service description vectors generated in the service refinement phase. A large number of web services structure a service-oriented architecture and facilitate the creation of distributed applications over the web. These web services offer various functionalities in areas such as communications, data enhancement, e-commerce, marketing, and utilities. Some of the web services are published and invoked in-house by various organizations. These web services may be used for business applications, or in government and military settings. However, this requires careful selection and composition of appropriate web services. The web services within the service registry (UDDI) have predefined categories that are specified by the service providers. Services may be listed under different categories.

[Figure: work flow diagram with the nodes Login, Channel, End, Package Exporter (retrieve movie name), Portal Admin, Validate, Rule Setting, Channel Credential, IPTV Portal, and Package List]

Figure 8.1 Work Flow Diagram

CHAPTER 9

EXPERIMENTAL RESULTS AND DISCUSSIONS

The results improve with an increase in the number of clusters, which validates the scalability of our approach. This may be explained by the higher purity of clusters that contain fewer service descriptions, compared with a cluster that contains the maximum number of service descriptions for individual categories. Another aspect of our evaluation deals with the frequency of service categorization for the entire UDDI. We perform service categorization on an incremental basis, and we assume that the ontology is not perfect and is updated to represent additional domain objects and their interrelationships. The categorization must then be performed every time a new service is added to the UDDI. However, periodic categorization may be required if service additions are frequent, as can be expected in real-life situations with large user and provider communities.

However, the service categories can be updated by isolating the upper ontology concept that remains unchanged and then recategorizing all the services that fall under its child concepts. When evaluating the efficiency of our approach, a number of factors affect the timings obtained, viz., the size of the underlying ontology and the number of services to be categorized. To evaluate the analytical complexity of the proposed service categorization approach, let n represent the total number of concepts that form the ontology and m the total number of web services. Searching for a specific concept in the ontology requires O(log n) search operations. The add operation for including the relevant concepts for each web service occurs in constant time. For this reason, the complexity of our approach for service categorization is O(m(log n + n)).

The aim is to maintain a balance between the generality and the specificity of terms in web service descriptions. This is achieved by expanding the term vectors with relevant ontology concepts and subsequently pruning terms from the web service descriptions. The results follow those observed in the add set of experiments. The technique in which ontology concepts are added to all terms of the web service descriptions, followed by pruning, results in increased generality. The best results across all techniques were observed for the technique in which ontology concepts are added only to relevant terms of the web service descriptions, followed by pruning. These results might reflect an increase in specificity and a reduction in generality of the terms in the web service descriptions.

Semantic Similarity-Based Matching

To evaluate our approach for semantic similarity-based service discovery, we set out to discover relevant services for an average of ten service requests. The initial discovery is based on a smaller number of WSDL files with a focus on precision. The next discovery experiment examines a larger set of WSDL files with a focus on maximizing recall. To assess the impact on service discovery of expanding the service request with relevant terms from ontology concepts, we compare the cosine measure-based similarity scores of two service selection methods: one with the enhanced service request and one with the original service request. The expanded service requests facilitate improved differentiation between the appropriate services and the rest of the services, because the higher score differences indicate a better match to the service request. The ranking of the services changes as more dimensions are added to the service collection under consideration.
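
The cosine measure used in this comparison can be illustrated with the short Java sketch below. It works on raw term counts only and does not reproduce the LSI step; the class name, method names, and whitespace tokenization are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: cosine similarity over raw term counts (no LSI step). */
final class CosineSimilaritySketch {

    /** Builds a term-count vector from whitespace-separated, lower-cased terms. */
    static Map<String, Integer> termCounts(String text) {
        Map<String, Integer> counts = new HashMap<>();
        for (String token : text.trim().toLowerCase().split("\\s+")) {
            if (!token.isEmpty()) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        return counts;
    }

    /** Cosine of the angle between two term-count vectors; 0.0 if either is empty. */
    static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            int va = e.getValue();
            normA += va * va;
            Integer vb = b.get(e.getKey());
            if (vb != null) {
                dot += va * vb;      // only shared terms contribute to the dot product
            }
        }
        for (int vb : b.values()) {
            normB += vb * vb;
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }
}
```

As a small worked case, a service description with the terms "flight transport travel" scores 1/√3 against the original request "flight" and 2/√6 against the request enhanced to "flight transport", so the enhanced request scores higher for this description.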

Performance

We compared the time taken to match a service request with a web service description within service sets that include 1) predefined categories, 2) semantic categorization, and 3) the entire service set, i.e., the set of uncategorized services. The purpose of this experiment is to validate our approach to ontology-guided web service categorization. The time taken for service matching within predefined categories, semantically categorized services (our approach), and uncategorized services was 2.58, 3.65, and 406.8 seconds, respectively. Our approach thus provides a balance between the quality of the service selected and the time taken to match an appropriate service. The observed time for service discovery seems acceptable, especially given that most of the time users will submit more incremental, and hence less time-consuming, requests. The time it takes to load the system, though, could be improved. In the future, we plan to further evaluate the scalability of our approach, along with detailed experimentation with actual users, to fine-tune the way in which our integrated functionality is presented and eventually to evaluate the full benefits of our approach from a performance and solution quality standpoint.

Deployment

In the existing architecture, the service provider/requestor accesses the UDDI through an application server. To deploy our approach, this architecture needs to be enhanced by incorporating a semantic application server as well as an ontology repository. The Application Server then executes our approach to select the most suitable services based on semantics processed by the Semantic Application Server in conjunction with the ontology repository. The Semantic Application Server should include an ontology reasoner (e.g., Racer) that utilizes description logics to load and query ontologies in order to extract the relevant concepts for semantic categorization of web service descriptions and enhancement of service requests. Since our proposed work considers the semantic functionality of web services for service discovery and ranking, it does not explicitly address other QoS measures such as trust and reputation. However, these can be easily incorporated: for example, a trust and reputation registry could be integrated with the UDDI server. Depending upon the number of web services and service requests, it may be necessary to use XML gateway devices to offload the parsing and transformation of XML and so reduce the computational burden.

CHAPTER 10

CONCLUSION

The proposed system thus provides a novel approach to service discovery and addresses two major aspects: categorization of services based on their functional semantics, and clustering of the services using related ontology concepts. The pattern recognition algorithm is used to identify the appropriate service from the cluster. Hence the proposed system satisfies user needs by providing the appropriate service as requested by the user.

In future, our approach can be extended to allow service requests that are formed using specialized query languages, with these requests matched to semantically annotated services that are described using formats such as SAWSDL and OWL-S, among others. Typically, multiple services have to be discovered so that together they match a service request. It should be possible to utilize ontologies and explicitly return the sequence of individual service invocations to be performed in order to achieve the desired composite service.

A flexible matching approach could be created to return partial matches and/or suggest additional inputs that would produce a full match by capturing the dependencies among the matched services. Another avenue for future work is to create an interactive, intelligent service composer that is semantically guided to locate the target service components step by step. We also intend to extend our ontology framework and investigate additional mapping tools to better express a service request when searching for relevant concepts. As part of the service discovery process, future work could explore associating semantic weights with the retrieved set of web services for effective semantic ranking of the results.

REFERENCES

[1] Anton Naumenko, Sergiy Nikitin, Vagan Terziyan, 2006, “Service matching in agent

systems”, Springer Science+Business Media, LLC.

[2] Hui Xiong, Junjie Wu, and Jian Chen, 2009, “K-Means Clustering Versus Validation

Measures: A Data-Distribution Perspective”, IEEE Transactions on Systems, Man, and

Cybernetics—Part B: Cybernetics, Vol. 39, No. 2.

[3] M.A. Corella and P. Castells, 2006, “Semi-Automatic Semantic-Based Web Service

Classification,” Proc. Int’l Conf. Business Process Management Workshops (BPM ’06).

[4] www.ibm.com/developerworks/tutorials/wesws511pt1/wesws511 pt1-pdf.pdf

[5] Liang and Herman Lam, July 2008, “Web Service Matching by Ontology Instance

Categorization”, IEEE International Conference on Services Computing.

[6] Jian Wen, Zhoujun Li and Xiaohua Hu, 2007, “Ontology Based Clustering for Improving

Genomic IR”, International Symposium on Computer-Based Medical Systems, 2007. CBMS.

[7] Cheng-Ru Lin, Ken-Hao Liu, and Ming-Syan Chen, 2005, “Dual Clustering: Integrating

Data Clustering over Optimization and Constraint Domains”, IEEE Transactions on

Knowledge and Data Engineering, Vol. 17, No. 5.

[8] Alireza Zohali, Dr. Kamran Zamanfiar, 2005 – 2009, “Matching Model for Semantic

Web Services Discovery”, Journal of Theoretical and Applied Information Technology,

JATIT.

[9] Liang-Jie Zhang, Shuxing Cheng, Carl K. Chang, and Qun Zhou, 2012, “A Pattern-

Recognition-Based Algorithm and Case Study for Clustering and Selecting Business

Services”, IEEE Transactions on Systems, Man, and Cybernetics—Part A: Systems and

Humans, Vol. 42, No. 1.

[10] Tsang-Hsiang Cheng and Chih-Ping Wei, March, “A Clustering-Based Approach for

Integrating Document-Category Hierarchies”, IEEE Transactions on Systems, Man, and

Cybernetics—Part A: Systems and Humans, Vol. 38, No. 2.

[11] Aabhas V. Paliwal, Basit Shafiq, Jaideep Vaidya, Hui Xiong, and Nabil Adam, 2012,

“Semantics-Based Automated Service Discovery”, IEEE Transactions on Services

Computing, Vol. 5, No. 2.

[12] Michael Forbes, Jim Lawrence, Yu Lei, Raghu N. Kacker and D. Richard Kuhn, 2008,

“Refining the In-Parameter-Order Strategy for Constructing Covering Arrays”, Journal of

Research of the National Institute of Standards and Technology, Volume 113, Number 5.

[13] Fei You, Qingxi Hu, Yuan Yao, Gaochun Xu, Minglun Fang, 2009, “Study on Web

Service Matching and Composition Based on Ontology”, WRI World Congress on Computer

Science and Information Engineering, Volume 4.

[14] Jie Liu, Jigui Sun and Shengsheng Wang, 2006, “Pattern Recognition: An overview”,

IJCSNS International Journal of Computer Science and Network Security, VOL.6 No.6.

PUBLICATIONS

TITLE : Cluster based Approach for Service Discovery using Pattern Recognition

Publication : International Journal of Advanced and Innovative Research.

ISSUE : Volume 2 Issue 2

Month : FEBRUARY 2013, PAGE 345

ISSUE ISBN : 2278-7844

APPENDIX-1

SAMPLE CODING

Linearprog.java

import java.awt.*;

import java.awt.event.*;

import java.sql.*;

import javax.swing.*;

class linearprog extends JFrame implements ActionListener

{

JLabel l1,l2,l3,l4,l5,l6,l7;

JTextField t1,t2,t3,t4,t5,t6,t7;

String str,str1,str2;

JButton b1,b2,b3,b4;

JComboBox c1;

JLabel lite,lerr;

JTextField ite,err;

JLabel lbid,lbidtime;

JTextField bid,bidtime;

double bamt,btime;

double dstr2;

double xavg=0; double yavg=0;

double alpha=0; double beta=0;

Connection con=null;

ResultSet rs=null;

public linearprog()

{

super("Neural Prediction- Online Auction");

Container c = getContentPane();

c.setLayout(new GridLayout(8,2,10,10));

JPanel inp = new JPanel(new GridLayout(1,9,10,10));

lite = new JLabel("Iterations");

ite = new JTextField("500");

inp.add(lite);

inp.add(ite);

lerr = new JLabel("Error Rate");

err = new JTextField("0.5");

inp.add(lerr);

inp.add(err);

String alg[]= {"Online Auction"};

c1 = new JComboBox(alg);

// inp.add(c1);

c.add(c1);

c.add(inp);

lbid = new JLabel("Bid Amount");

c.add(lbid);

bid = new JTextField();

c.add(bid);

lbidtime = new JLabel("Remaining Time");

c.add(lbidtime);

bidtime = new JTextField();

c.add(bidtime);

l2 = new JLabel("Alpha");

c.add(l2);

t2 = new JTextField();

c.add(t2);

l3 = new JLabel("Beta");

c.add(l3);

t3 = new JTextField();

c.add(t3);

l4 = new JLabel("Predicted Price");

c.add(l4);

t4 = new JTextField();

c.add(t4);

b1 = new JButton("Load");

c.add(b1);

b2= new JButton("Train");

c.add(b2);

b3 = new JButton("Calculate");

c.add(b3);

b4 = new JButton("Exit");

c.add(b4);

setSize(600,600);

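b1.addActionListener(this);

b2.addActionListener(this);

b3.addActionListener(this);

b4.addActionListener(this);

setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);

// Train and Calculate stay disabled until the preceding step has run

b2.setEnabled(false);

b3.setEnabled(false);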

}

public void actionPerformed(ActionEvent e)

{

String s = e.getActionCommand();

if(s.equals("Load"))

{

btime = Double.parseDouble(bidtime.getText());

bamt = Double.parseDouble(bid.getText());

Dataset s1=new Dataset(bamt,btime);

s1.show();

b2.setEnabled(true);

}

if(s.equals("Train"))

{

String s1=bid.getText();

int i = Integer.parseInt(s1.trim());

calavg(i);

b3.setEnabled(true);

}

if(s.equals("Calculate"))

{

double pprice = bamt+ dstr2;

double i=Double.parseDouble(ite.getText());

t4.setText(String.valueOf(pprice));

}

if(s.equals("Exit"))

{

System.exit(0);

}

}

public void calavg(int x)

{

String error=null;

// tempx,tempy;

double tempx,tempy;

//int sx=0;int sy=0;

//int sxx=0;int sxy=0;

double sx=0;

double sy=0;

double sxx=0;

double sxy=0;

String s1=null; String s2=null;

try

{

System.out.println("\nAttempting to load JDBC Driver....");

Class.forName("sun.jdbc.odbc.JdbcOdbcDriver");

System.out.println("JDBC Driver loaded...");

System.out.println("Connecting to database...");

con=DriverManager.getConnection("jdbc:odbc:ebay","","");

System.out.println("Database connection established");

}//end of try

catch (Exception sqle)

{System.out.println("Unable to load driver..."+sqle);}

try

{

//String queryString=("SELECT * FROM auction;");

String queryString=("SELECT * FROM auction where bid> "+bamt+";");

Statement stmt=con.createStatement();

rs=stmt.executeQuery(queryString);

int n = 0; // number of (bid, bidderrate) rows read

while (rs.next())

{

s1=rs.getString("bid");

s2=rs.getString("bidderrate");

tempx = Double.parseDouble(s1.trim());

tempy = Double.parseDouble(s2.trim());

// accumulate the sums needed for the least-squares fit

sx=sx+tempx;

sy=sy+tempy;

sxx=sxx+(tempx*tempx);

sxy=sxy+(tempx*tempy);

n++;

} //end of while

System.out.println("\n");

System.out.println("Training the Network...\n");

// sample count used in the least-squares formulas

double t=n;

// slope of the fitted line: (t*Sxy - Sx*Sy) / (t*Sxx - Sx*Sx)

beta=((t*sxy)-(sx*sy))/((t*sxx)-(sx*sx));

double yavg=sy/t;

double xavg=sx/t;

alpha=yavg-(beta*xavg);

double y;

y=alpha+(beta*x);

double e = Double.parseDouble(err.getText());

double dstr= Math.abs(alpha) * e;

str= Double.toString(dstr);

double dstr1= Math.abs(beta) * e;

str1= Double.toString(dstr1);

dstr2= Math.abs(y) * e;

str2= Double.toString(dstr2);

// str1 = Double.toString(beta);

// str2 = Double.toString(y);

//System.out.println("Beta is " + beta);

t2.setText(str);

t3.setText(str1);

}//end of try

catch (SQLException sqle)

{

System.out.println("Some SQL error occurred: " + sqle);

}

}//end of function

public static void main(String args[])

{

linearprog t=new linearprog();

t.setVisible(true);

}//end of main

}//end of class

Channel.jsp

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"

"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

<%@taglib uri="/struts-tags" prefix="s"%>

<html>

<head>

<meta http-equiv="content-type" content="text/html; charset=utf-8" />

<meta name="description" content="" />

<meta name="keywords" content="" />

<title>Bootstrapping IPTV</title>

<link rel="stylesheet" type="text/css" href="style.css" />

<script type="text/javascript">

function call() {

var channelType = document.registration.channelType.value;

var userName = document.registration.userName.value;

var password = document.registration.passWord.value;

var mode = document.registration.mode.value;

if (channelType != '' && userName != '' && password != '' && mode != '')

document.registration.submit();

else

alert('Please Enter all the fields !!');

}

</script>

</head>

<body>

<div id="wrapper">

<div id="splash">

<img src="images/pic1.jpg" alt="" />

</div>

<div id="menu">

<ul>

<li><a id="href" href="registration.jsp">Channel's Credential</a></li>

<li><a id="href" href="iptvPortal.jsp">IPTV Portal</a></li>

<li><a id="href" href="iptvPortalRule.jsp">Rule Settings</a></li>

<li><a id="href" href="login.jsp">Logout</a></li>

</ul>

</div>

<div id="page">

<div align="center">

<s:form name="registration" action="ChannelAction">

<table border="0" cellspacing="15">

<tr>

<td colspan=2 align="center" style="color: #447289;">

<h2>Channel Registration</h2>

</td>

</tr>

<tr>

<td><s:combobox label="Select a Channel"

list="#{'HBO':'HBO','StarMovies':'StarMovies','WorldMovies':'WorldMovies','ESPN':'ESPN'

}"

name="channelType" /></td>

</tr>

<tr>

<td><s:textfield label="User Name" name="userName"

id="userName"></s:textfield></td>

</tr>

<tr>

<td><s:password label="Password" name="password"

id="passWord"></s:password></td>

</tr>

<tr>

<td><s:radio label="Mode" name="mode" id="mode"

list="#{'1':'read','2':'write'}" value="1" /></td>

</tr>

<tr>

<td colspan="2" style="text-align: center"><input

id="param1" type="button" onclick="call()" value="Enter"></input>

</td>

</tr>

</table>

</s:form>

</div>

<br class="clearfix" />

</div>

</div>

</body>

</html>

Date post:	10-May-2015
Category:	Education
Upload:	santhan-r