CHAPTER 1
INTRODUCTION
1.1 ABOUT THE PROJECT
The need for B2B integration, together with the limitations of conventional middleware in B2B
integration, created demand for a new technology, a demand addressed by web services. Web
services are components with specific functionality that can be integrated into more complex
real-time distributed applications. Service descriptions are stored in a registry, which
allows designers to register new services and allows service users to search for and locate
services. The service registry contains web services that include predefined types, which are
generally specified by service providers. The UDDI registry specifications have two main
goals with respect to service discovery: first, to support developers in finding information
about services, so that they know how to write clients that interact with those services; second,
to enable dynamic binding, by allowing clients to query the registry and obtain references to
services of interest.
Service discovery can be done at design time, by browsing the directory and identifying
the most relevant services, and at run time, using dynamic binding techniques. The service
discovery process enables the proper discovery and configuration of devices and services and
the communication between them. The services that are discovered do not always fulfill user
needs. The number of web services continues to grow, while the business environment keeps
demanding newer applications that have to be rolled out on very tight schedules. Because the
number of web services is relatively large and similar services may be listed under different
categories in the registry infrastructure, it is difficult to find appropriate services.
Therefore, rather than being classified by provider, services should be categorized by
their functional semantics, which makes it possible to organize similar services together. The
majority of the service descriptions that exist so far are syntactic in nature, and the same
syntax might be used in different places for different purposes. When a service is requested,
only the small number of services that are an exact syntactic match for the request are
selected. The selected services may perform functions other than the requested ones, and the
discovery process is limited by its dependence on human involvement to select the right
service based on its functionality.
Pattern recognition algorithms generally aim to provide a reasonable answer for all
possible inputs and to perform "most likely" matching of the inputs, taking into account their
statistical variation. This is opposed to pattern matching algorithms, which look for exact
matches in the input with pre-existing patterns. A common example of a pattern-matching
algorithm is regular expression matching, which looks for patterns of a given sort in textual
data and is included in the search capabilities of many text editors and word processors. In
contrast to pattern recognition, pattern matching is generally not considered a type of
machine learning, although pattern-matching algorithms can sometimes succeed in providing
similar-quality output to the sort provided by pattern-recognition algorithms.
Practitioners of the Semantic Web have been realizing its predicted capabilities in a
variety of ways. For instance, with Semantic Web technologies it is possible to automate
operations ranging from making all the arrangements needed for a trip to updating personal
records. The Semantic Web can then be defined as a web of information on the Internet and
intranets that is annotated so that the precise information needed can be accessed.
Clustering has been the driving force behind many of the world's most powerful
scientific supercomputers for many years and is now being used increasingly as a cost-
effective way to provide high-performance, high availability computing for a wide variety of
commercial workloads such as business intelligence, engineering design, financial analysis,
digital media and petroleum exploration. To keep the approach generic, web services should be
described using WSDL. The project presents a clustered approach to service discovery that
groups services by functionality and uses pattern matching to provide similar services, thus
satisfying user needs.
The proposed approach deals with the issue of discovering services that match a
particular service request when the service description semantics are not explicit. We propose
a system that involves semantic-based service categorization, performed at the UDDI registry,
in which an ontology skeleton serves as the key to categorizing services at the functional level.
In addition, clustering is used to organize the web services by functionality, which is achieved
using an analytic algorithm. Efficient matching of relevant services is achieved by enhancing
the service request semantically, that is, by expanding it with additional functionality
(obtained from the ontology) related to the requested service. The pattern recognition
algorithm is used to select the appropriate service from the cluster of related (grouped) web
services.
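The flow described above (categorize services by function, expand the request through an ontology, then select from the matching cluster) can be sketched roughly as follows. This is only an illustrative sketch: the ontology, the service names and their keywords are invented, and Jaccard similarity is used as a simple stand-in for the pattern recognition step, not as the algorithm the project actually uses.

```python
from collections import defaultdict

# Hypothetical ontology skeleton: functional concept -> related terms.
ONTOLOGY = {
    "travel": {"flight", "hotel", "booking", "reservation"},
    "weather": {"temperature", "forecast", "climate"},
    "payment": {"invoice", "billing", "card"},
}

# Hypothetical registry entries: service name -> keywords from its description.
SERVICES = {
    "FlightFinder": {"flight", "booking"},
    "HotelReserve": {"hotel", "reservation"},
    "TempNow": {"temperature"},
    "CityForecast": {"forecast", "climate"},
    "CardPay": {"card", "billing"},
}


def categorize(keywords):
    """Return the ontology concept sharing the most terms with the keywords."""
    return max(ONTOLOGY, key=lambda concept: len(ONTOLOGY[concept] & keywords))


def build_clusters(services):
    """Group services by functional concept rather than by provider."""
    clusters = defaultdict(dict)
    for name, keywords in services.items():
        clusters[categorize(keywords)][name] = keywords
    return clusters


def expand_request(terms):
    """Add the ontology terms related to the requested functionality."""
    expanded = set(terms)
    for concept, related in ONTOLOGY.items():
        if concept in terms or related & set(terms):
            expanded |= related | {concept}
    return expanded


def jaccard(a, b):
    """Set overlap in [0, 1]; a stand-in for the real matching algorithm."""
    return len(a & b) / len(a | b) if a | b else 0.0


def discover(terms, clusters):
    """Pick the cluster for the expanded request, then rank its services."""
    expanded = expand_request(terms)
    cluster = clusters[categorize(expanded)]
    return sorted(cluster, key=lambda n: jaccard(cluster[n], expanded), reverse=True)


clusters = build_clusters(SERVICES)
ranked = discover({"forecast"}, clusters)
print(ranked)
```

A request for "forecast" is expanded with the other weather terms, so a temperature service in the same cluster is still returned as a related candidate even though it is not a syntactic match.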
1.2 DESCRIPTION OF WEB SERVICE
A Web service is a method of communication between two electronic devices over
the World Wide Web. The W3C defines a "Web service" as "a software system designed to
support interoperable machine-to-machine interaction over a network". It has an interface
described in a machine-processable format (specifically Web Services Description
Language, known by the acronym WSDL). Other systems interact with the Web service in a
manner prescribed by its description using SOAP messages, typically conveyed
using HTTP with an XML serialization in conjunction with other Web-related standards. The
W3C also states, "We can identify two major classes of Web services, REST-compliant Web
services, in which the primary purpose of the service is to manipulate XML representations
of Web resources using a uniform set of "stateless" operations; and arbitrary Web services, in
which the service may expose an arbitrary set of operations."
Web services describe a standardized way of integrating Web-based applications
using the XML, SOAP, WSDL and UDDI open standards over an Internet protocol
backbone. Web applications are built around Web browser standards and can be used by
any browser on any platform. Web services use XML to encode and decode data, and SOAP
to transport it. By using Web services, an application can publish its functions or messages to
the rest of the world, and data can be exchanged between different applications
and different platforms.
Characteristics of Web Services:
XML-based
Web services use XML at the data representation and data transportation layers. Using
XML eliminates any networking, operating system, or platform binding, so applications based
on Web services are highly interoperable at their core level.
Loosely coupled
A consumer of a web service is not tied to that web service directly. The web service
interface can change over time without compromising the client's ability to interact with the
service. A tightly coupled system implies that the client and server logic are closely tied to
one another, implying that if one interface changes, the other must also be updated.
Coarse-grained
Object-oriented technologies such as Java expose their services through individual
methods. An individual method is too fine an operation to provide any useful capability at a
corporate level. Building a Java program from scratch requires the creation of several fine-
grained methods that are then composed into a coarse-grained service that is consumed by
either a client or another service. Businesses and the interfaces that they expose should be
coarse-grained. Web services technology provides a natural way of defining coarse-grained
services that access the right amount of business logic.
Ability to be synchronous or asynchronous
Synchronicity refers to the binding of the client to the execution of the service. In
synchronous invocations, the client blocks and waits for the service to complete its operation
before continuing. Asynchronous operations allow a client to invoke a service and then
execute other functions. Asynchronous clients retrieve their result at a later point in time,
while synchronous clients receive their result when the service has completed. Asynchronous
capability is a key factor in enabling loosely coupled systems.
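The difference between the two invocation styles can be illustrated with a small sketch. The `call_service` function is a hypothetical stand-in that only simulates a remote call with a short delay; a thread pool from the standard library plays the role of the asynchronous client.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def call_service(city):
    """Stand-in for a remote web service call (hypothetical)."""
    time.sleep(0.1)  # simulate network and processing delay
    return f"temperature for {city}"


# Synchronous invocation: the client blocks until the result is available.
sync_result = call_service("Chennai")

# Asynchronous invocation: the client submits the call, does other work,
# and retrieves the result at a later point in time.
with ThreadPoolExecutor(max_workers=1) as pool:
    future = pool.submit(call_service, "Delhi")
    other_work = sum(range(1000))   # runs while the call is in flight
    async_result = future.result()  # collect the result when it is needed

print(sync_result, async_result, other_work)
```

In the asynchronous case the client is not tied to the execution of the service, which is the property that makes asynchronous capability useful for loosely coupled systems.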
Supports Remote Procedure Calls (RPCs)
Web services allow clients to invoke procedures, functions, and methods on remote
objects using an XML-based protocol. Remote procedures expose input and output
parameters that a web service must support. Component development through Enterprise
JavaBeans (EJBs) and .NET Components has increasingly become a part of architectures and
enterprise deployments over the past couple of years. Both technologies are distributed and
accessible through a variety of RPC mechanisms. A web service supports RPC as well.
Supports document exchange
One of the key advantages of XML is its generic way of representing not only data,
but also complex documents. These documents can be simple, such as when representing a
current address, or they can be complex, representing an entire book or RFQ. Web services
support the transparent exchange of documents to facilitate business integration.
The World Wide Web is increasingly being used for communication between
applications. The programmatic interfaces made available over the web for application-to-
application communication are often referred to as web services. There are many types of
applications that can be considered web services but interoperability between applications is
enhanced most by the use of familiar technologies such as XML and HTTP. These
technologies allow applications using differing languages and platforms to interface in a
familiar way. Web services are distributed application components that are externally
available. You can use them to integrate computer applications that are written in different
languages and run on different platforms. Web services are language and platform
independent because vendors have agreed on common web service standards.
Web services are client and server applications that communicate over the World
Wide Web's (WWW) HyperText Transfer Protocol (HTTP). As described by the World
Wide Web Consortium (W3C), web services provide a standard means of interoperating
between software applications running on a variety of platforms and frameworks. Web
services are intended as a means of providing easily accessible services over a network.
They should be easy to use regardless of the underlying network structure or
configuration, operating system, communication mechanism or implementation language.
Web services are loosely coupled, reusable software components that semantically
encapsulate discrete functionality and are distributed and programmatically accessible over
standard Internet protocols. A web service is any piece of software that makes itself available
over the internet and uses a standardized XML messaging system. XML is used to encode all
communications to a web service. For example, a client invokes a web service by sending an
XML message, then waits for a corresponding XML response. Because all communication is
in XML, web services are not tied to any one operating system or programming language:
Java can talk with Perl; Windows applications can talk with Unix applications.
Web Services are self-contained, modular, distributed, dynamic applications that can
be described, published, located, or invoked over the network to create products, processes,
and supply chains. These applications can be local, distributed, or Web-based. Web services
are built on top of open standards such as TCP/IP, HTTP, Java, HTML, and XML. Web
services are XML-based information exchange systems that use the Internet for direct
application-to-application interaction. These systems can include programs, objects,
messages, or documents.
A web service is a collection of open protocols and standards used for exchanging data
between applications or systems. Software applications written in various programming
languages and running on various platforms can use web services to exchange data over
computer networks like the Internet in a manner similar to inter-process communication on a
single computer. This interoperability (e.g., between Java and Python, or Windows and Linux
applications) is due to the use of open standards.
Figure 1.1 Web services architecture
The web services architecture permits the development of web services that
encapsulate all levels of business functionality. In other words, a web service can be very
simple, such as one that returns the current temperature, or it can be a complex application.
The architecture also allows multiple web services to be combined to create new
functionality.
1.3 WEB SERVICE STANDARDS
The standards on which web service development is based are evolving technologies.
The primary players are SOAP (Simple Object Access Protocol), WSDL (Web Services
Description Language), UDDI (Universal Description, Discovery and Integration), and XML
(Extensible Markup Language).
Simple Object Access Protocol (SOAP)
SOAP, originally defined as Simple Object Access Protocol, is a protocol
specification for exchanging structured information in the implementation of Web
Services in computer networks. It relies on Extensible Markup Language (XML) for its
message format, and usually relies on other Application Layer protocols, most
notably Hypertext Transfer Protocol (HTTP) and Simple Mail Transfer Protocol (SMTP), for
message negotiation and transmission.
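As a rough illustration of the XML message format, the sketch below builds and reads a minimal SOAP 1.1 envelope with Python's standard library. The operation name `GetTemperature`, the `City` element and the `http://example.com/weather` namespace are hypothetical; only the envelope namespace comes from SOAP 1.1 itself. No message is actually sent over HTTP or SMTP here.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
APP_NS = "http://example.com/weather"                  # hypothetical service namespace


def build_request(city):
    """Wrap a hypothetical GetTemperature call in a SOAP envelope."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    operation = ET.SubElement(body, f"{{{APP_NS}}}GetTemperature")
    ET.SubElement(operation, f"{{{APP_NS}}}City").text = city
    return ET.tostring(envelope, encoding="unicode")


def read_city(message):
    """Extract the City parameter from a request built by build_request."""
    root = ET.fromstring(message)
    path = f"{{{SOAP_NS}}}Body/{{{APP_NS}}}GetTemperature/{{{APP_NS}}}City"
    return root.find(path).text


request_xml = build_request("Chennai")
print(request_xml)
print(read_city(request_xml))
```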
Extensible Markup Language (XML)
Extensible Markup Language (XML) is a markup language that defines a set of rules
for encoding documents in a format that is both human-readable and machine-readable. It is
defined in the XML 1.0 Specification produced by the W3C, and several other related
specifications. Many application programming interfaces (APIs) have been developed for
software developers to use to process XML data, and several schema systems exist to aid in
the definition of XML-based languages.
Web Services Description Language (WSDL)
The Web Services Description Language is an XML-based language that is used for
describing the functionality offered by a Web service. A WSDL description of a web service
(also referred to as a WSDL file) provides a machine-readable description of how the service
can be called, what parameters it expects, and what data structures it returns. It thus serves a
roughly similar purpose as a method signature in a programming language. WSDL is often
used in combination with SOAP and an XML Schema to provide Web services over
the Internet.
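To make the analogy with a method signature concrete, the following sketch reads the operations out of a heavily simplified WSDL 1.1 fragment. The service, message and operation names are invented for illustration, and a real WSDL file would also contain types, bindings and a service element.

```python
import xml.etree.ElementTree as ET

WSDL_NS = {"wsdl": "http://schemas.xmlsoap.org/wsdl/"}  # WSDL 1.1 namespace

# Simplified, hypothetical WSDL fragment (no types, binding or service parts).
WSDL_DOC = """
<definitions xmlns="http://schemas.xmlsoap.org/wsdl/"
             xmlns:tns="http://example.com/weather"
             name="WeatherService"
             targetNamespace="http://example.com/weather">
  <message name="GetTemperatureRequest"/>
  <message name="GetTemperatureResponse"/>
  <portType name="WeatherPort">
    <operation name="GetTemperature">
      <input message="tns:GetTemperatureRequest"/>
      <output message="tns:GetTemperatureResponse"/>
    </operation>
  </portType>
</definitions>
"""


def list_operations(wsdl_text):
    """Map each operation name to its (input message, output message) pair."""
    root = ET.fromstring(wsdl_text)
    operations = {}
    for op in root.findall("wsdl:portType/wsdl:operation", WSDL_NS):
        input_msg = op.find("wsdl:input", WSDL_NS).get("message")
        output_msg = op.find("wsdl:output", WSDL_NS).get("message")
        operations[op.get("name")] = (input_msg, output_msg)
    return operations


print(list_operations(WSDL_DOC))
```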
Universal Description, Discovery and Integration (UDDI)
UDDI (Universal Description, Discovery and Integration) is a platform-independent
framework for describing services, discovering businesses, and integrating business services
by using the Internet. It is a directory for storing information about web services and the
web service interfaces described by WSDL. It communicates via SOAP.
1.4 COMPONENTS OF WEB SERVICES
The six key components of a web service are:
Self-contained
Self-describing
Modular components
Published
Located
Invoked across web
1.5 FEATURES OF WEB SERVICES
Self-Contained: No additional software is required for a web service.
Client-Side: A programming language with XML/HTTP client support.
Server-Side: A web server and a SOAP server are needed.
Loosely-Coupled: Client and server know only about messages, a simple coordination level
that allows for more flexible re-configuration.
Web-Enabled: Web services are published, located and invoked across the web, using
established lightweight Internet standards.
Language-Independent and Interoperable: Client and server may be implemented in different
environments and different languages.
Composable: Web services can be aggregated using workflow techniques to perform higher-
level business functions.
Dynamically Bound: With UDDI and WSDL, the discovery and binding of web services can
be automated.
Programmatic Access: The web services approach does not provide a graphical user interface
but operates at the command level.
1.6 WEB SERVICE ARCHITECTURE
The web services architecture permits the development of web services that
encapsulate all levels of business functionality. In other words, a web service can be very
simple, such as one that returns the current temperature, or it can be a complex application.
The architecture also allows multiple web services to be combined to create new
functionality.
PUBLISH
The service provider defines a service description for the web service and publishes it
to a requestor or service discovery agency.
FIND
The service requestor uses a find operation to retrieve the service description locally
or from the discovery agency (service registry).
BIND
The service requestor uses the service description to bind with the service provider
and invoke or interact with the web service implementation.
Figure 1.2 Web service roles and operation
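The publish, find and bind operations can be sketched with a toy in-memory registry. The `ServiceRegistry` class and the `current_temperature` service are hypothetical and only mimic the three roles; a real discovery agency such as a UDDI registry stores service descriptions and is queried over the network.

```python
class ServiceRegistry:
    """Toy in-memory stand-in for a discovery agency (hypothetical)."""

    def __init__(self):
        self._entries = {}

    def publish(self, name, description, endpoint):
        """Provider side: register a service description and its endpoint."""
        self._entries[name] = {"description": description, "endpoint": endpoint}

    def find(self, keyword):
        """Requestor side: return names whose description mentions the keyword."""
        return [name for name, entry in self._entries.items()
                if keyword.lower() in entry["description"].lower()]

    def bind(self, name):
        """Requestor side: obtain a callable reference to the service."""
        return self._entries[name]["endpoint"]


def current_temperature(city):
    """Hypothetical service implementation returning a made-up value."""
    return f"28 C in {city}"


registry = ServiceRegistry()
registry.publish("TempNow", "Returns the current temperature for a city",
                 current_temperature)

matches = registry.find("temperature")  # FIND
service = registry.bind(matches[0])     # BIND
print(service("Chennai"))               # invoke the bound service
```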
CHAPTER 2
LITERATURE SURVEY
2.1 WEB SERVICES
Fig 2.1 Service Based Architecture
The above diagram represents a service-based model. In this model, the
web service resides on the web server. A client computer can request and consume this web
service and terminate the service as and when desired. The name "service" originates from
the way the program functions in this model. The client program needs to be provided with only
the URL of the web service.
The client applications can be running on different platforms, or developed using
different tools. A major application of web services is to integrate such different applications,
which may be working on heterogeneous platforms. The existence of heterogeneous platforms
within an organization or among organizations is a common phenomenon nowadays.
Any organization today works with multiple vendors, suppliers, contractors and other
entities. Each of these entities will have developed its own software systems based on
Microsoft, Sun Microsystems or IBM technologies. Each of these software systems will have
been developed over a period of time with investments of hundreds of thousands
of dollars. It would be almost impossible for any of them to change their systems
for compatibility. This is where web services come into the picture: they make it possible for
all these entities to communicate without affecting their existing systems.
Service-oriented architecture (SOA) is a software design methodology based on
structured collections of discrete software modules, known as services, that collectively
provide the complete functionality of a large or complex software application. Each service
that makes up an SOA application is designed to provide a tightly defined set of functions. As
a result, each service is built as a discrete piece of code. This makes it possible to reuse the
code in different ways throughout the application by changing only the way an individual
service interoperates with other services that make up the application, versus making code
changes to the service itself. SOA design principles are used during
software development and integration.
SOA generally provides a way for consumers of services, such as web-based
applications, to be aware of available SOA-based services. For example, several disparate
departments within a company may develop and deploy SOA services in different
implementation languages; their respective clients will benefit from a well-defined interface
to access them. XML is often used for interfacing with SOA services. JSON is also becoming
increasingly common.
SOA defines how to integrate widely disparate applications for a Web-based
environment and uses multiple implementation platforms. Rather than defining an API, SOA
defines the interface in terms of protocols and functionality. An endpoint is the entry point for
such a SOA implementation.
Service-orientation requires loose coupling of services with operating systems and
other technologies that underlie applications. SOA separates functions into distinct units, or
services, which developers make accessible over a network in order to allow users to
combine and reuse them in the production of applications. These services and their
corresponding consumers communicate with each other by passing data in a well-defined,
shared format, or by coordinating an activity between two or more services.
SOA can be seen in a continuum, from older concepts of distributed computing and
modular programming, through SOA, and on to current practices of mashups, SaaS,
and cloud computing (which some see as the offspring of SOA).
2.2 BENEFITS OF USING WEB SERVICES
Exposing the existing function on to network:
A Web service is a unit of managed code that can be remotely invoked using HTTP;
that is, it can be activated using HTTP requests. Web services therefore allow you to expose
the functionality of your existing code over the network. Once it is exposed on the network,
other applications can use the functionality of your program.
Connecting Different Applications, i.e. Interoperability:
Web services allow different applications to talk to each other and share data and
services among themselves. Other applications can also use the services of the web services.
For example, a VB or .NET application can talk to Java web services and vice versa. Web
services are thus used to make applications platform and technology independent.
Standardized Protocol:
Web services use industry-standard protocols for communication. All four layers of
the Web services protocol stack (Service Transport, XML Messaging, Service Description and
Service Discovery) use well-defined protocols. This standardization of the protocol stack gives
businesses many advantages, such as a wide range of choices, reduced cost due to competition,
and increased quality.
Low Cost of communication:
Web services use SOAP over HTTP for communication, so you can use your existing
low-cost Internet connection to implement Web services. This solution is much less
costly than proprietary solutions such as EDI/B2B. Besides SOAP over HTTP, Web
services can also be implemented on other reliable transport mechanisms such as FTP.
2.3 DISCOVERY
In a large number of applications, web services will be provided by, on one hand, and
used by, on the other, peers who have established relationships. Indeed, until a trust
infrastructure is fairly developed it is not reasonable to expect computers to do automatic
comparison shopping for very many services. Web services will probably (like the web in
1993, and the Semantic Web in 2003) spread first within the corporate firewall, where
security problems are minor and mistakes less embarrassing than between enterprises or in public.
However, the goal is that so many web services should be available that it will be important
to be able to find them in all kinds of ways.
The UDDI project and the related work on description and query systems is aimed at
this. A positive aspect of UDDI is the definition of an ontology for web services. Problems
with it are that it is centralized by design, both in the single-tree ontology, and in the design
based fundamentally on a central registry, with inter-registry operation as a secondary thing.
From the semantic web point of view, web services are simply one aspect of the many things
which will be searched for. Indeed, the fact that a web service is provided may in fact be
rather incidental to the essential nature of the business item which is discovered -- a trader in
stocks, a seller of lawnmowers, and so on. The semantic web aims to describe any aspect of
anything, including the catalogs, parts, materials, services organization, relationships and
contracts. A query system which addresses web services only makes sense when smoothly
integrated with the rest of the web of enterprise knowledge. Web services provide access to
software systems over the Internet using standard protocols. In a minimalistic scenario there
exists at least a Web service provider that publishes some service such as a weather service
and a Web service consumer that uses this service. Web service discovery is the process of
finding a suitable Web service for a given task.
2.4 PATTERN RECOGNITION
In machine learning, pattern recognition is the assignment of a label to a given input
value. An example of pattern recognition is classification, which attempts to assign each
input value to one of a given set of classes (for example, determine whether a given email is
"spam" or "non-spam"). However, pattern recognition is a more general problem that
encompasses other types of output as well. Other examples are regression, which assigns a
real-valued output to each input; sequence labelling, which assigns a class to each member of
a sequence of values (for example, part of speech tagging, which assigns a part of speech to
each word in an input sentence); and parsing, which assigns a parse tree to an input sentence,
describing the syntactic structure of the sentence.
Pattern recognition algorithms generally aim to provide a reasonable answer for all
possible inputs and to perform "most likely" matching of the inputs, taking into account their
statistical variation. This is opposed to pattern matching algorithms, which look for exact
matches in the input with pre-existing patterns. A common example of a pattern-matching
algorithm is regular expression matching, which looks for patterns of a given sort in textual
data and is included in the search capabilities of many text editors and word processors. In
contrast to pattern recognition, pattern matching is generally not considered a type of
machine learning, although pattern-matching algorithms (especially with fairly general,
carefully tailored patterns) can sometimes succeed in providing similar-quality output to the
sort provided by pattern-recognition algorithms.
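The contrast can be illustrated with a small sketch: a regular expression accepts only exact matches, while a similarity score always returns the "most likely" candidate. The operation names are hypothetical, and `difflib.SequenceMatcher` is used here only as a crude stand-in for a statistical pattern recognition method.

```python
import re
from difflib import SequenceMatcher

# Hypothetical operation names registered in a service directory.
KNOWN = ["GetTemperature", "BookFlight", "ReserveHotel"]


def exact_match(query):
    """Pattern matching: return only names that match the query exactly."""
    pattern = re.compile(re.escape(query))
    return [name for name in KNOWN if pattern.fullmatch(name)]


def most_likely(query):
    """Recognition-style matching: return the closest name for any input."""
    def score(name):
        return SequenceMatcher(None, query.lower(), name.lower()).ratio()
    return max(KNOWN, key=score)


print(exact_match("GetTemprature"))  # misspelled query: no exact match
print(most_likely("GetTemprature"))  # still yields the nearest known name
```

The exact matcher gives no answer for an input it has not seen, whereas the similarity-based matcher produces a reasonable answer for every input, at the price of sometimes being wrong.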
Pattern recognition is studied in many fields, including psychology, psychiatry,
ethology, cognitive science, traffic flow and computer science. Algorithms for pattern
recognition depend on the type of label output, on whether learning is supervised or
unsupervised, and on whether the algorithm is statistical or non-statistical in nature.
Statistical algorithms can further be categorized as generative or discriminative.
2.5 RELATED RESEARCH
A CLUSTERING-BASED APPROACH FOR INTEGRATING DOCUMENT-
CATEGORY HIERARCHIES
ABSTRACT
E-commerce applications generate and consume a tremendous amount of online
information, which is typically available as textual documents. Conceivably, organizations
and individuals generally use category sets or hierarchies to organize, archive, and access
their documents. Meanwhile, organizations and individuals constantly acquire relevant
documents from various Internet sources, each of which may organize its documents in a
category set or hierarchy different from that used by the acquiring organization or individual.
Consequently, the integration of source documents organized in a category hierarchy into an
existing category hierarchy deployed by the acquiring organization or individual becomes an
important issue in the e-commerce era. Existing category-integration techniques are mainly
designed to integrate document catalogs, each of which is organized non-hierarchically (i.e.,
in a flat set). The proposed system is a clustering-based category-hierarchy integration (CHI)
technique, which is an extension of the clustering-based category-integration (CCI)
technique. Our empirical evaluation results show that the proposed CHI technique appears to
improve the effectiveness of category-hierarchy integration compared with that attained by
non-hierarchical category-integration techniques, particularly in homogeneous and
comparable scenarios.
INFERENCE
This paper is intended to serve as a foundation for a continued research on category-
hierarchy integration. Additional research should be aligned strategically to enhance the
generalizability and the effectiveness of our proposed technique. First, as analysed in our
evaluation results, the non-hierarchical CCI and ENB techniques significantly outperform the
proposed CHI technique in the heterogeneous integration scenario. Therefore, the capability
of CHI to handle the integration of heterogeneous category hierarchies should be enhanced.
Second, our current evaluation study employs only one document corpus in a specific domain
(i.e., research articles). Evaluations of the proposed CHI technique using other document
corpora that pertain to more diversified domains (e.g., web pages and news stories) would
improve the generalizability of the evaluation results reported in this paper.
MULTI-TYPE FEATURES CO-SELECTION FOR WEB DOCUMENT
CLUSTERING
ABSTRACT
Feature selection has been widely applied in text categorization and clustering.
Compared to unsupervised selection, supervised feature selection is more successful in
filtering out noise in most cases. However, due to a lack of label information, clustering can
hardly exploit supervised selection. Some studies have proposed to solve this problem by
“pseudo class.” As empirical results show, this method is sensitive to selection criteria and
data sets. In this paper, the authors propose a novel feature co-selection method for Web
document clustering, called Multi-type Features Co-selection for Clustering (MFCC). MFCC
uses intermediate clustering results in one type of feature space to help the selection in other
types of feature spaces. Their experiments show that for most selection criteria, MFCC
effectively reduces the noise introduced by “pseudo class,” and further improves clustering
performance.
INFERENCE
They have proposed Multi-type Features Co-selection for Clustering (MFCC), a novel
algorithm to exploit different types of features to perform Web document clustering. The
intermediate clustering result is used in one feature space as additional information to
enhance the feature selection in other spaces. Consequently, the better feature set co-selected
by heterogeneous features will produce better clusters in each space. After that, the better
intermediate result will further improve co-selection in the next iteration. Finally, feature co-
selection is implemented iteratively and can be well integrated into an iterative clustering
algorithm.
TOWARD INTEGRATING FEATURE SELECTION ALGORITHMS FOR
CLASSIFICATION AND CLUSTERING
ABSTRACT
This paper introduces concepts and algorithms of feature selection, surveys existing
feature selection algorithms for classification and clustering, groups and compares different
algorithms with a categorizing framework based on search strategies, evaluation criteria, and
data mining tasks, reveals un-attempted combinations, and provides guidelines in selecting
feature selection algorithms. With the categorizing framework, the authors continue their
efforts toward building an integrated system for intelligent feature selection. A unifying
platform is proposed as an intermediate step. An illustrative example is presented to show how
existing feature selection algorithms can be integrated into a meta algorithm that can take
advantage of individual algorithms. An added advantage of doing so is to help a user employ a
suitable algorithm without knowing details of each algorithm. Some real-world applications are
included to demonstrate the use of feature selection in data mining. The paper concludes
by identifying trends and challenges of feature selection research and development.
INFERENCE
This survey provides a comprehensive overview of various aspects of feature
selection. The authors introduce two architectures: a categorizing framework and a unifying
platform. They categorize the large body of feature selection algorithms, reveal future
directions for developing new algorithms, and guide the selection of algorithms for intelligent
feature selection. The categorizing framework is developed from an algorithm designer’s
viewpoint and focuses on the technical details of the general procedures of the feature
selection process. A new feature selection algorithm can be incorporated into the framework
according to the three dimensions. The unifying platform is developed from a user’s
viewpoint that covers the user’s knowledge about the domain and data for feature selection.
The unifying platform is one necessary step toward building an integrated system for
intelligent feature selection. The ultimate goal for intelligent feature selection is to create an
integrated system that will automatically recommend the most suitable algorithm(s) to the
user while hiding all technical details irrelevant to an application.
A PATTERN-RECOGNITION-BASED ALGORITHM AND CASE STUDY FOR
CLUSTERING AND SELECTING BUSINESS SERVICES
ABSTRACT
Positioned as the backbone of a service asset management console, a service registry
has to enable real-time and offline service selection in an effective manner. This paper
presents an analytic algorithm that is used to guide the architectural design of service
exploration in a service registry. Service assets are framed into a well-established
categorical structure based on a pattern recognition algorithm. This design aims to
provide a systematic methodology and an enablement architecture for analyzing, clustering, and
adapting heterogeneous services for dynamic application integration. The pattern
recognition algorithm maps a large number of services into a manageable feature
space, which consists of attributes related to static descriptions and dynamic features
such as historical QoS and service-level agreements. The proposed architecture and associated
service exploration methodology have been integrated into an industry-strength
service-oriented architecture solution design platform. The authors also present a case study using
the developed platform to illustrate the proposed algorithm for business service clustering and
selection.
INFERENCE
In this paper, the authors address the scalability issue that arises in designing a
service asset management system. They propose a service organization structure,
referred to as a multilevel service hierarchy, which can be used as the backbone storage
architecture underlying the service registry. The construction and management schemes
for this service hierarchy are then studied systematically. Pattern recognition
algorithms are applied to exploit the embedded service attributes. This research is essential
to the authors’ ongoing effort to build a comprehensive service asset management platform for
SOA service solution design. Several issues are left to be resolved.
SEMANTICS-BASED AUTOMATED SERVICE DISCOVERY
ABSTRACT
A vast majority of web services exist without explicit associated semantic
descriptions. As a result, many services that are relevant to a specific user service request may
not be considered during service discovery. This paper addresses the issue of web service
discovery given non-explicit service description semantics that match a specific service
request. The approach to semantic-based web service discovery involves semantic-based
service categorization and semantic enhancement of the service request. The proposed system
provides a solution for achieving functional-level service categorization based on an ontology
framework. Additionally, clustering is used for accurately classifying the web services based
on service functionality. The semantic-based categorization is performed offline at the
Universal Description, Discovery and Integration (UDDI) registry. The semantic enhancement of the
service request achieves a better matching with relevant services. The service request
enhancement involves expansion with additional terms (retrieved from the ontology) that are
deemed relevant for the requested functionality. An efficient matching of the enhanced
service request with the retrieved service descriptions is achieved using Latent Semantic
Indexing (LSI). The experimental results validate the effectiveness and feasibility of the
proposed approach.
INFERENCE
In this paper, the authors present an integrated approach for automated service discovery.
Specifically, the approach addresses two major aspects of semantic-based service
discovery: semantic-based service categorization and semantic-based service selection. For
semantic-based service categorization, it provides an ontology-guided categorization of
web services into functional categories for service discovery. This leads to better service
discovery by matching the service request with an appropriate service description. For
semantic-based service selection, ontology linking (semantic web) and LSI are employed, thus
extending the indexing procedure from solely syntactical information to a semantic level.
CHAPTER 3
PROBLEM DEFINITION
End users have specific interests in web services and their functionality, and researchers
have proposed several techniques for discovering the web services that satisfy
user needs, among them service discovery based on Dual Clustering and Service
Matching. Web services that are appropriate to a
specific user request are often not considered during discovery because they are
published without explicit semantic descriptions. Our approach deals with the
issue of service discovery given non-explicit service description semantics that match a
particular service request.
3.1 EXISTING SYSTEMS
3.1.1 Service Matching Technique
Identifying similar web services is becoming an increasingly
important issue for the successful operation of dynamically integrated Web-service-based
applications. Ontology-based models are applied to improve the searching
capabilities of agent systems. Matchmaking problems arise when a service is
requested, since matching involves calculating the distance between the required service
description and the descriptions in the service registry. Service and resource matching is
currently discussed as one of the important new challenges for the next generation
of semantic discovery approaches for Web agents and Web services. Correct ontology-based
matching tools are needed to effectively integrate Agent, Grid Service, and Web Service
technologies with each other. A categorization-based scheme is used to identify equivalent
Web services that may operate on diverse domain ontologies. Using ontology instance
categorization, the matching system determines whether a given Web service is a possible
replacement, and it also adapts itself by enriching the known
ontologies with the newly discovered ontology instances.
3.1.2 Dual Clustering Technique
Increasingly, service requestors need to search for existing Web services
in large Internet-based repositories, with the objective of retrieving services that match the
user’s requirements. As the repositories grow in number of services and
finding the right ones quickly becomes harder, the need for clustering related services
becomes evident, so that search results can be improved with a list of similar services for each
hit. Spatial clustering has various applications. In most conventional clustering problems,
similarity is measured on the geometric attributes. In many real applications, however, users are
mainly concerned with the non-geometric attributes. Conventional spatial clustering
partitions the input data set into several compact regions,
so data points that are similar to one another in their non-geometric attributes may be
scattered over different regions, which makes the corresponding objective difficult to
reach. Dual clustering is a remedy for this. The constraint domain specifies the
application-dependent constraints, while the attributes in the optimization domain are those
involved in the optimization of the objective function. The ICC algorithm combines the
information in both domains and iteratively performs a clustering algorithm on the optimization domain.
3.2 DRAWBACKS OF EXISTING SYSTEM
Although the existing systems offer benefits such as combining the information from
both domains and matching a service for replacement, they fail to provide
services that are related in terms of functionality, i.e., in terms of what the service actually
performs. Most of the time these systems work at the syntactic level, and the majority of the
service descriptions that exist so far are syntactic in nature. The same syntax may be used in
different places for different purposes. When a service is requested, only the small number
of services that are an exact syntactical match of the request are selected. The selected
services may perform functions other than those requested, and the
discovery process is limited by its dependence on human involvement for selecting the right
service based on its functionality.
3.3 PROPOSED SYSTEM
The existing systems have advantages such as the optimization of domains and the replacement
of web services, but they lack important properties such as functionality-based
categorization and the delivery of the appropriate service to the user. The proposed system deals
with the issue of service discovery given non-explicit service description semantics that
match a particular service request.
The proposed system involves semantic-based service categorization, which is
performed at the UDDI and achieves service categorization at the functional level
based on an ontology skeleton. The services are categorized based on their
functionality and clustered by category along with the related ontology
concepts. A pattern recognition algorithm is used to retrieve the exact service that
matches the user request.
3.3.1 Semantic Categorization
The process begins with the semantic categorization of services in the
UDDI, where the ontology concepts are used. The semantic categorization is achieved by
adding a user-defined tag to the WSDL file, so that a particular search keyword makes the
services fall under a given category. A single service can be made to appear in different
categories by applying the ontology concepts that identify the relationships. The
user-defined tag is given by the service provider and is based on the functionality of the
service. Web service description vectors are built, and the markup and index entries
are removed. Building the web service vector generally involves parsing
the WSDL file that forms part of the initial WSDL set, together with its description and
the related parameters. The web service vector is then modified by
enhancing it with concepts from the core ontology, which resolves issues
related to synonyms and introduces domain-related concepts that provide the context.
First, the initial service vector is extended with the relevant ontology concepts. The architecture
of the proposed system is shown in Figure 6.1.
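The vector enhancement step described in this section can be illustrated with a minimal Java sketch. It assumes that the ontology is available as a simple term-to-concept map, which stands in for a real ontology lookup; the class name ServiceVectorEnricher, the method enrich, and the sample terms are hypothetical and are not taken from the project code.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

final class ServiceVectorEnricher {

    // Builds a service vector from the parsed WSDL terms and adds the
    // ontology concept of every term that the (assumed) ontology map knows.
    static Set<String> enrich(List<String> terms, Map<String, String> ontology) {
        Set<String> vector = new LinkedHashSet<>();
        for (String term : terms) {
            String normalized = term.toLowerCase(Locale.ROOT);
            vector.add(normalized);
            String concept = ontology.get(normalized);
            if (concept != null) {
                vector.add(concept); // synonyms collapse onto one shared concept
            }
        }
        return vector;
    }

    public static void main(String[] args) {
        // Hypothetical ontology fragment: two synonyms map to one concept.
        Map<String, String> ontology = Map.of("film", "movie", "cinema", "movie");
        System.out.println(enrich(List.of("Film", "Booking"), ontology));
    }
}
```

Because synonyms such as "film" and "cinema" map to the same concept in this sketch, two services described with different words end up sharing a vector entry, which is what allows them to be grouped later.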
3.3.2 Clustering
The next step of this phase involves deleting irrelevant terms based on the ranking of
semantic relationships among the terms. Functionally similar services are grouped together
by clustering the service vectors. Clustering is done in a
hierarchical manner, which facilitates the classification of all the services: each secondary
cluster and the combinations of secondary clusters create a hierarchy, a structure that is
more informative than an unstructured set of clusters. The service categorization
is performed manually and only service request enhancement is performed at runtime,
so the increase in delay is not significant. The relevant ontology concepts are
associated with the clusters during their creation. Associating concepts with each cluster helps
web service discovery by mapping clusters to functional categories. A cluster is defined as Ѳi = cj,
where cj is the corresponding ontology concept. The ontology concepts supply the semantics for
web service categorization.
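As an illustration of grouping service vectors, the following Java sketch performs a simple agglomerative clustering and stops merging at a similarity threshold. The use of Jaccard similarity, the threshold, and all names (ServiceClustering, jaccard, cluster) are assumptions made for this example; the report does not state which similarity measure or linkage is used.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class ServiceClustering {

    // Jaccard similarity of two term sets: |A ∩ B| / |A ∪ B|.
    static double jaccard(Set<String> a, Set<String> b) {
        if (a.isEmpty() && b.isEmpty()) return 1.0;
        Set<String> common = new HashSet<>(a);
        common.retainAll(b);
        int union = a.size() + b.size() - common.size();
        return (double) common.size() / union;
    }

    // Each vector starts as its own cluster; the two most similar clusters
    // are merged repeatedly until no pair reaches the threshold.
    // A cluster is represented here by the union of its members' terms.
    static List<Set<String>> cluster(List<Set<String>> vectors, double threshold) {
        List<Set<String>> clusters = new ArrayList<>();
        for (Set<String> v : vectors) clusters.add(new HashSet<>(v));
        while (clusters.size() > 1) {
            int bestI = -1, bestJ = -1;
            double best = threshold;
            for (int i = 0; i < clusters.size(); i++) {
                for (int j = i + 1; j < clusters.size(); j++) {
                    double sim = jaccard(clusters.get(i), clusters.get(j));
                    if (sim >= best) { best = sim; bestI = i; bestJ = j; }
                }
            }
            if (bestI < 0) break; // no pair is similar enough to merge
            clusters.get(bestI).addAll(clusters.remove(bestJ));
        }
        return clusters;
    }

    public static void main(String[] args) {
        List<Set<String>> vectors = List.of(
                Set.of("movie", "ticket", "booking"),
                Set.of("movie", "ticket", "cinema"),
                Set.of("weather", "forecast"));
        System.out.println(cluster(vectors, 0.4).size());
    }
}
```

The sketch returns only the flat clusters that remain at the chosen threshold; recording each merge as it happens would give the full hierarchy described above.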
CHAPTER 4
SOFTWARE REQUIREMENTS SPECIFICATION
The software requirements specification document lists all the
requirements necessary for project development. Deriving the requirements calls for a clear
and thorough understanding of the product to be developed.
4.1 PURPOSE
The main purpose of the software requirements specification is to set out the
technical, functional, and non-functional features of the system. The semantic-based web system
is built mainly on the existing web service architecture, extended with user-defined tags.
4.2 SCOPE
The main aim is to build a semantic-based web system that reduces the
effort of the user by providing a more appropriate service. The system
tends to learn from the observed set of services that are provided, so that when it
is presented with a new set of services it can provide them more easily.
4.3 SYSTEM REQUIREMENT
4.3.1 Hardware Requirements
The most common set of requirements defined by any operating
system or software application is the physical computer resources, known as hardware
requirements. A hardware requirements list is often accompanied by a hardware compatibility
list (HCL), especially in the case of operating systems. An HCL lists tested, compatible, and
sometimes incompatible hardware devices for a particular operating system or application.
The proposed system has the following hardware requirements.
Processor : Intel Pentium D or higher
Random Access Memory : 256 MB
Secondary Memory : 100 MB
Display : Color Monitor
Keyboard : Windows OS Compatible
Mouse : Windows OS Compatible
Network : 512 Kbps Internet / 100 Mbps LAN
4.3.2 Software Requirements
Software requirements deal with defining software resources and prerequisites
that need to be installed on a computer to provide optimal functioning of an application.
In computing, a platform describes some sort of framework, either in hardware or software,
which allows software to run. The proposed system has the following software
requirements:
Operating System : Windows XP Service Pack 2 or later
Coding Environment : Java
Tool : Eclipse
Database : Microsoft Access
4.4 FEATURES OF PROPOSED SYSTEM
The proposed system differs from existing systems in that it provides the
exact service requested by the user for any input regarding the auction. The
approach starts with the semantic categorization of services in the UDDI, where the
ontology concepts are used. The semantic categorization is achieved by adding a user-defined
tag to the WSDL file.
The next step of this phase involves deleting irrelevant terms based on the ranking of
semantic relationships among the terms. Functionally similar services are grouped together
by clustering the service vectors. The service then has to be selected from
a cluster containing a number of services. A pattern
recognition algorithm is used to search for the appropriate service.
4.5 LIMITATION OF PROPOSED SYSTEM
The proposed system can also be extended to support web service composition.
Typically, multiple services have to be discovered so that they together match a service
request. It should be possible to utilize ontologies, and explicitly return the sequence of
individual service invocations to be performed in order to achieve the desired composite
service. When no full match is possible, a flexible matching approach could be created to
return partial matches and/or suggest additional inputs that would produce a full match by
capturing the dependencies among the matched services. This has several interesting research
issues. Another avenue for future work is to create an interactive, intelligent service
composer that is semantically guided to locate the target service components step by step.
CHAPTER 5
OBJECT ORIENTED ANALYSIS
5.1 USE CASE DIAGRAM
The following use case diagram depicts the way in which the input parameters are fed
into the database which is required to train the system.
Figure 5.1 Use Case Diagram
The use case diagram of the cluster-based approach using pattern recognition
involves four main actors (admin, channel, portal, and database) and the events occurring
between them, as shown in Figure 5.1. The admin takes control of the entire process and provides
credentials to the channel and portal subscribers. These actors establish various
relationships among themselves to carry out activities or to retrieve the services the user
requests. The channel provider is used to register the channel, and the portal is used
to validate the service.
Figure 5.1 elements. Use cases: Validate, Create Account, Rule Setting, IPTV Portal, Channel
Credential, Package List, Login, Package Explorer. Actors: Channel, Portal, Admin.
5.2 SEQUENCE DIAGRAM
The sequence diagram is a kind of interaction diagram that shows how processes
operate with one another and in what order. Sequence diagrams are sometimes called
event-trace diagrams, event scenarios, or timing diagrams. Here it shows the order in which
the actors exchange messages to register, validate, and search for services.
Figure 5.2 Sequence Diagram
The sequence diagram of the cluster-based approach using pattern recognition involves
four main actors (admin, channel, portal, and database); the events occurring between
them are listed below. These actors establish various relationships among themselves as
events occur or as the requested services are retrieved for the user.
Lifelines: Admin, Portal, Channel, Database
1: Login
2: Set Credential
3: Set Credential
4: Register Movies
5: Validate
6: Update XML
7: Rule Setting
8: Search Similar Service
9: Provide Similar Service
5.3 ACTIVITY DIAGRAM
An activity diagram illustrates the dynamic nature of a system by modeling the flow
of control from activity to activity. It depicts the overall flow of the training process and the
testing process to obtain the best probabilistic approach. Because
an activity diagram is a special kind of state chart diagram, it uses some of the same modeling
conventions. The activity diagram in Figure 5.3 describes the different modules. It starts
with a login page for the channel, admin, and portal. The purpose of channel registration is
to add new users for the listed channels. After registering, the user can
retrieve the needed information through the internet. The purpose of the portal is to validate
whether services are registered properly in the UDDI registry in XML format.
Figure 5.3 Activity Diagram
5.4 CLASS DIAGRAM
The class diagram represents the classes with their attributes and operations. This
project defines the classes admin, channel, portal, and Package Exporter, where each
module has different functions and attributes. The admin has functions such as
channel credential, portal credential, UDDI registry, and rule setting, whereas the portal has the
function of validating, or constraint checking, whether the services are registered properly in the
UDDI in XML format or as a WSDL file tag. The Package Exporter has the function of
retrieving the services that the user needs.
Figure 5.4 Class Diagram
5.5 DEPLOYMENT DIAGRAM
The deployment diagram represents the integration of the modules that have been
developed in our project. The bidders are linked with the bidding environment through the user
interface. The user interface directs the user data to the database through the modem.
A modem is present at both the bidder end and the seller end. The modem acts as the
communication medium, in wired or wireless mode, for services
from the internet.
The requestor modem directs the request message to the user interface at the requestor
end. This interface is used by the requestor to provide the services to the internet
environment. The requestor data also reaches the database through the interfaces and the
modem. Thus the database is at the center of the auction environment, providing data
services to the bidders and the requestor through the network gateway.
Figure 5.5 Deployment Diagram
CHAPTER 6
OBJECT ORIENTED (PROJECT) DESIGN
6.1 ARCHITECTURAL DESIGN
The architecture diagram shows the relationships between the different components of the
system and is important for understanding the overall concept of the system. In a
software project, the architecture diagram is the diagrammatic representation of the
internal features of the project.
Figure 6.1 Architectural Design of the Proposed System
Figure 6.1 components: Web Service Registry (UDDI), WSDL Dataset, Service Categorization,
Clustering, WSDL Parameter, Association Rule, Association Pattern Discovery, Semantic
Relation, Semantic Matching, Ontology Concept, WS Request, Expanded Request.
Our approach starts with indexing the WSDL dataset associated with the
semantically categorized services in the UDDI, where the ontology
concepts are employed. The semantic categorization is achieved by adding a user-defined tag
to the WSDL file, so that a particular search keyword makes the services fall
under a given category. A single service can be made to appear in different categories by
applying the ontology concepts that identify the relationships. The user-defined
tag is given by the service provider and is based on the functionality of the service. Web
service description vectors are built, and the markup and index entries are removed.
Building the web service vector generally involves parsing the WSDL
file that forms part of the initial WSDL set, together with its description and the related
parameters. The web service vector is then modified by enhancing it
with concepts from the core ontology, which resolves issues related to synonyms
and introduces domain-related concepts that provide the context.
Clustering
Functionally similar services are grouped together by
clustering the service vectors. Clustering is done in a hierarchical manner, which facilitates
the classification of all the services: each secondary cluster and the combinations of
secondary clusters create a hierarchy, a structure that is more informative than an
unstructured set of clusters. The service categorization is performed manually and only
service request enhancement is performed at runtime, so the increase in delay
is not significant. The relevant ontology concepts are associated with the clusters during their
creation.
Parameter-Based Service Refinement
This phase selects a service from the group of related services. The input,
output, and description of a web service help the refinement process by narrowing
the set of appropriate services matching the service request. Statistical associations are
used to represent the relationship between the web service input and output
parameters. A parameter relationship pattern item set is built for all the web services within the
cluster. The corresponding WSDL document is then processed to retrieve the
relevant service parameters. The user assigns a weight to each of the parameters
to refine the request, which makes the ranking process more flexible. Assigning binary
values to the ranking parameters is an important task. The
association pattern mining phase generates a large number of association patterns, and the
patterns carrying unrelated information that would negatively influence the service discovery
process have to be discarded.
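The statistical association between input and output parameters can be illustrated with the standard support and confidence measures from association rule mining. The Java sketch below assumes that each service in a cluster is reduced to the set of its parameter names. The class name ParameterAssociation, the method names, and the sample parameters are hypothetical, and the report does not state which measures or cut-off values the project actually uses.

```java
import java.util.List;
import java.util.Set;

final class ParameterAssociation {

    // Support: fraction of services whose parameter set contains every item.
    static double support(List<Set<String>> services, Set<String> items) {
        if (services.isEmpty()) return 0.0;
        long hits = services.stream().filter(s -> s.containsAll(items)).count();
        return (double) hits / services.size();
    }

    // Confidence of the rule "input => output": among the services that use
    // the input parameter, the fraction that also use the output parameter.
    static double confidence(List<Set<String>> services, String input, String output) {
        long withInput = services.stream().filter(s -> s.contains(input)).count();
        long withBoth = services.stream()
                .filter(s -> s.contains(input) && s.contains(output)).count();
        return withInput == 0 ? 0.0 : (double) withBoth / withInput;
    }

    public static void main(String[] args) {
        // Hypothetical parameter sets of four services in one cluster.
        List<Set<String>> services = List.of(
                Set.of("city", "forecast"),
                Set.of("city", "temperature"),
                Set.of("zip", "forecast"),
                Set.of("city", "forecast"));
        System.out.println(support(services, Set.of("city", "forecast")));
        System.out.println(confidence(services, "city", "forecast"));
    }
}
```

Patterns whose support or confidence falls below a chosen cut-off would be the natural candidates for the discarding step mentioned above.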
Service Search
The next phase of the proposed system is to search for the appropriate service based on the user
request. The service has to be selected from a cluster containing a number of services. A
pattern recognition algorithm is used to search for the appropriate service.
6.2 USER INTERFACE DESIGN
The user interface design defines the interface between the user and the system. Through the
interface, the user interacts with the system and obtains the service.
Fig 6.2 Index Page
Fig 6.3 Admin Login
Fig 6.4 Channel Registration
Fig 6.5 Portal Registration
Fig 6.6 Pattern Setting
Fig 6.7 Channel Login
Fig 6.8 Retrieve Service
Fig 6.9 Alert Message
6.3 DATA DESIGN
The data design defines the project database that accepts the given data and
processes it. During the registration phase, the user name and the password are
registered and saved in the database. Along with the user name and the password, the
mode is selected, either read or write.
Login table
S.no  Column name  Description   Size  Datatype  Remark
1     U_name       Username      15    Varchar   Primary
2     Pwd          Password      15    Varchar   Not null
Registration table
S.no  Column name  Description   Size  Datatype  Remark
1     chan         Channel Name  15    Varchar   Primary
2     gen          Genre         15    Varchar   Not null
3     imdb         IMDB          10    Varchar   Not null
Table 6.10 Data Design
CHAPTER 7
DESIGN OF WORKING MODEL
7.1 CLUSTERING
Functionally similar services are grouped together by
clustering the service vectors. Clustering is done in a hierarchical manner, which facilitates
the classification of all the services: each secondary cluster and the combinations of
secondary clusters create a hierarchy, a structure that is more informative than an
unstructured set of clusters. The service categorization is performed manually and only
service request enhancement is performed at runtime, so the increase in delay
is not significant. The relevant ontology concepts are associated with the clusters during their
creation. Associating concepts with each cluster helps web service discovery by mapping
clusters to functional categories. A cluster is defined as Ѳi = cj, where cj is the corresponding
ontology concept. The ontology concepts supply the semantics for web service categorization. Then a
set is built that contains all concepts that exist in at least one service description, and the
concepts that repeat are removed. This is followed by locating the remaining
concepts in the concept hierarchy Hc. Each concept is checked for a subsumes or subsumed
relationship with the elements of the set. The resulting super concept is then mapped to the
cluster. Associating the ontology concept with the cluster extends the semantic
information in the UDDI; this is done by creating tModels for the associated web services of
the cluster within the registry. The relateOntologyCluster algorithm is given below.
ALGORITHM
relateOntologyCluster
Input: Web service description cluster set Ѳ = {Ѳ1, Ѳ2, …, Ѳn},
       minimum term frequency threshold µ
Output: Modified UDDI tModels
begin
  for each Web service cluster Ѳi do
    retrieve the modified Web service vectors wsm ∈ Ѳi
    calculate term frequency (tj, xj) where tj ⊆ Ѳi
    if xj < µ then
      delete tj
    map tj ≈ cj
    traverse Hc for the upper ontology concept C
    if the term concept is subsumed by the upper concept (cj ⊆ C) then
      CѲ = C
    else
      CѲ = cj
    map CѲ to Ѳi
    for each Web service wk ∈ Ѳi do
      update tModelm to include CѲ
    end for
  end for
end
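One possible reading of the concept selection in relateOntologyCluster is sketched below in Java. It assumes that the concept hierarchy Hc is a tree given as a child-to-parent map, that terms are mapped to concepts by a lookup table, and that the super concept of a cluster is the most specific concept subsuming all retained concepts. These assumptions, the class name OntologyClusterMapper, and the sample data are illustrative only, and the tModel update step is left out.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class OntologyClusterMapper {

    // Chain concept -> parent -> ... -> root, following the child-to-parent map.
    static List<String> ancestorsInclusive(String concept, Map<String, String> parentOf) {
        List<String> chain = new ArrayList<>();
        for (String c = concept; c != null; c = parentOf.get(c)) {
            chain.add(c);
        }
        return chain;
    }

    // Drops terms rarer than minFrequency, maps the rest to concepts, and
    // returns the most specific concept that subsumes all of them.
    // Returns null when no term survives or the concepts share no ancestor.
    static String relateOntologyCluster(Map<String, Integer> termFrequency,
                                        int minFrequency,
                                        Map<String, String> conceptOf,
                                        Map<String, String> parentOf) {
        Set<String> concepts = new LinkedHashSet<>();
        for (Map.Entry<String, Integer> e : termFrequency.entrySet()) {
            String concept = conceptOf.get(e.getKey());
            if (e.getValue() >= minFrequency && concept != null) {
                concepts.add(concept);
            }
        }
        List<String> common = null;
        for (String concept : concepts) {
            List<String> chain = ancestorsInclusive(concept, parentOf);
            if (common == null) {
                common = chain;          // most specific concept comes first
            } else {
                common.retainAll(chain); // keep only the shared ancestors
            }
        }
        return (common == null || common.isEmpty()) ? null : common.get(0);
    }

    public static void main(String[] args) {
        // Hypothetical hierarchy: movie and music are kinds of entertainment.
        Map<String, String> parentOf = Map.of(
                "movie", "entertainment", "music", "entertainment",
                "entertainment", "service");
        Map<String, String> conceptOf = Map.of("film", "movie", "song", "music");
        Map<String, Integer> tf = Map.of("film", 5, "song", 4, "misc", 1);
        System.out.println(relateOntologyCluster(tf, 2, conceptOf, parentOf));
    }
}
```

In a full implementation, the returned concept would then be written into the tModel of every web service in the cluster.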
7.2 SERVICE SEARCH
An important phase of the proposed system is searching for the appropriate service based on the
user request. The service has to be selected from a cluster containing a number of services.
A pattern recognition algorithm is used to search for the appropriate service [9]. The
searchClusters algorithm is given below.
ALGORITHM
searchClusters(hN, S, FSi)
hN – head node of the linked list containing a collection of sibling clusters
S – the service to be searched for in the clusters
FSi – feature section to be used for comparing similarity
hPtr = hN → Next;
σ = α;
while hPtr ≠ Null
  if σ > Dist(hPtr → data, S | FSi)
    σ = Dist(hPtr → data, S | FSi);
    Cl = hPtr → data;
  end if
  hPtr = hPtr → Next;
end while
return Cl
The searchClusters algorithm iterates over the linked list headed by hN and returns the
cluster that has the minimum distance to S.
Dist(hPtr → data, S | FSi) – distance between S and the cluster, computed on feature section FSi
hPtr → data – the cluster that hPtr currently points to
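The search above can be sketched in Java as follows. The sketch assumes that the sibling clusters are held in a java.util.List rather than a hand-written linked list and that the distance over the chosen feature section is supplied as a function; the class name ClusterSearch, the generic types, and the sample values are hypothetical.

```java
import java.util.List;
import java.util.function.ToDoubleBiFunction;

final class ClusterSearch {

    // Returns the sibling cluster closest to the service, or null when no
    // cluster lies within the initial threshold alpha.
    static <C, S> C searchClusters(List<C> siblings, S service, double alpha,
                                   ToDoubleBiFunction<C, S> dist) {
        double sigma = alpha; // smallest distance seen so far
        C closest = null;
        for (C cluster : siblings) {
            double d = dist.applyAsDouble(cluster, service);
            if (d < sigma) {
                sigma = d;
                closest = cluster;
            }
        }
        return closest;
    }

    public static void main(String[] args) {
        // Clusters and the service are reduced to one numeric feature here.
        List<Double> centroids = List.of(1.0, 5.0, 9.0);
        Double nearest = searchClusters(centroids, 6.0, Double.MAX_VALUE,
                (c, s) -> Math.abs(c - s));
        System.out.println(nearest);
    }
}
```

Passing the distance as a function keeps the choice of feature section FSi outside the search loop, so the same routine can be reused at every level of the cluster hierarchy.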
CHAPTER 8
EXPERIMENTAL DESIGN
The input design is the link between the information system and the user. It comprises
developing specifications and procedures for data preparation, the steps
necessary to put transaction data into a usable form for processing. This can be achieved by
instructing the computer to read data from a written or printed document, or by
having people key the data directly into the system. The design of input focuses on
controlling the amount of input required, controlling errors, avoiding delay, avoiding
extra steps, and keeping the process simple. The input is designed so that it
provides security and ease of use while retaining privacy. Input design considers the
following things:
What data should be given as input?
How the data should be arranged or coded?
The dialog to guide the operating personnel in providing input.
Methods for preparing input validations and the steps to follow when errors occur.
OBJECTIVES
1. Input design is the process of converting a user-oriented description of the input into a
computer-based system. This design is important to avoid errors in the data input process and
to show the management the correct direction for getting correct information from the
computerized system.
2. It is achieved by creating user-friendly screens for data entry that can handle large volumes of
data. The goal of designing input is to make data entry easier and free from errors. The
data entry screen is designed in such a way that all data manipulations can be performed. It
also provides record viewing facilities.
3. When data is entered, it is checked for validity. Data can be entered with the help of
screens. Appropriate messages are provided as and when needed so that the user is not
left in a maze at any instant. Thus the objective of input design is to create an input layout
that is easy to follow.
OUTPUT DESIGN
A quality output is one, which meets the requirements of the end user and presents the
information clearly. In any system results of processing are communicated to the users and to
other system through outputs. In output design it is determined how the information is to be
displaced for immediate need and also the hard copy output. It is the most important and
direct source information to the user. Efficient and intelligent output design improves the
system’s relationship to help user decision-making.
1. Designing computer output should proceed in an organized, well-thought-out manner; the
right output must be developed while ensuring that each output element is designed so that
people find the system easy and effective to use. When analysts design computer
output, they should identify the specific output that is needed to meet the requirements.
2. Select methods for presenting information.
3. Create documents, reports, or other formats that contain information produced by the system.
The output form of an information system should accomplish one or more of the following
objectives.
Convey information about past activities, current status, or projections of the future.
Signal important events, opportunities, problems, or warnings.
Trigger an action.
Confirm an action.
The user registration module covers the design and implementation of user registration via
web-based services. This module also establishes communication between the client and the
web-based service. The semantic categorization of UDDI combines ontology with
an established hierarchical clustering methodology, following the service description vector
building process. For each term in the service description vector, a corresponding concept is
located in the relevant ontology. If there is a match, the concept is added to the description
vector. The next step is service selection from the relevant category of services using
parameter-based service refinement. Web service parameters, i.e., input, output, and
description, aid service refinement through narrowing the set of appropriate services
matching the service request. The relationship between web service input and output
parameters may be represented as statistical associations. These associations relay
information about the operation parameters that are frequently associated with each other.
The parameter-based refined set of web services is then matched against an enhanced service
request as part of Semantic Similarity-based Matching. A key part of this process involves
enhancing the service request. Our approach for semantic similarity-based web service
selection employs ontology-based request enhancement and LSI-based service matching.
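The enrichment step described above (look up each term of the service description vector in the ontology and add the concept on a match) can be sketched as follows. This is a minimal illustration, not the project's implementation: the class name `DescriptionVectorEnricher`, the term-to-concept map, and the sample terms are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

/** Sketch: enrich a service description vector with matching ontology concepts. */
class DescriptionVectorEnricher {

    // Hypothetical ontology index (term -> concept name). A TreeMap keeps its
    // keys sorted, so each lookup costs O(log n) for n indexed terms.
    private final TreeMap<String, String> conceptByTerm = new TreeMap<>();

    DescriptionVectorEnricher(Map<String, String> ontologyIndex) {
        for (Map.Entry<String, String> e : ontologyIndex.entrySet()) {
            conceptByTerm.put(e.getKey().toLowerCase(Locale.ROOT), e.getValue());
        }
    }

    /** Returns the original terms followed by every concept that matched a term. */
    List<String> enrich(List<String> descriptionTerms) {
        List<String> enriched = new ArrayList<>(descriptionTerms);
        for (String term : descriptionTerms) {
            String concept = conceptByTerm.get(term.toLowerCase(Locale.ROOT));
            // Add the concept only when there is a match, and only once.
            if (concept != null && !enriched.contains(concept)) {
                enriched.add(concept);
            }
        }
        return enriched;
    }

    public static void main(String[] args) {
        // Illustrative ontology entries, not taken from the project.
        Map<String, String> ontology = Map.of(
                "movie", "Entertainment",
                "channel", "Broadcast");
        DescriptionVectorEnricher enricher = new DescriptionVectorEnricher(ontology);
        // prints [movie, channel, price, Entertainment, Broadcast]
        System.out.println(enricher.enrich(List.of("movie", "channel", "price")));
    }
}
```

Terms with no corresponding concept (here "price") are left in the vector unchanged, so the enriched vector never loses information from the original description.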
The basic idea of the proposed approach is to enhance the service request with
relevant ontology terms and then find the similarity measure of the semantically enhanced
service request with the web service description vectors generated in the service refinement
phase. A large number of web services form a service-oriented architecture and facilitate
the creation of distributed applications over the web. These web services offer various
functionalities in areas such as communications, data enhancement, e-commerce, marketing,
and utilities, among others. Some of the web services are published and invoked in-house by
various organizations. These web services may be used for business applications, or in
government and military. However, this requires careful selection and composition of
appropriate web services. The web services within the service registry (UDDI) have
predefined categories that are specified by the service providers. Services may be listed under
different categories.
[Work flow diagram. Nodes: Login, Validate, Portal Admin, Channel, Channel Credential, IPTV Portal, Rule Setting, Package List, Package Exporter (retrieve movie name), End]
Figure 8.1 Work Flow Diagram
CHAPTER 9
EXPERIMENTAL RESULTS AND DISCUSSIONS
The results improve with an increase in the number of clusters. These results validate
the scalability of our approach. This may be explained by an increase in the purity of clusters
with fewer service descriptions in comparison to that of a cluster with the maximum
number of service descriptions for individual categories. Another aspect of our evaluation
deals with the frequency of service categorization for the entire UDDI. We perform service
categorization on an incremental basis and assume that the ontology is not perfect and that
it is updated to represent additional domain objects and their interrelationships.
The categorization must then be performed every time a new service is added to the UDDI.
However, periodic categorization may be required if service additions are frequent, as
can be expected in real-life situations with large user and provider communities.
However, the service categories can be updated by isolating the upper ontology
concept that remains unchanged and then recategorizing all the services that fall under its child
concepts. When evaluating the efficiency of our approach, a number of factors
affect the timings obtained, viz., the size of the underlying ontology and the number of services
to be categorized. To evaluate the analytical complexity of the proposed service
categorization approach, let n represent the total number of concepts that form the ontology
and m represent the total number of web services. Searching for a specific concept in the
ontology requires O(log n) search operations. The add operation for including
the relevant concepts for each web service occurs in constant time. For this reason, the
standard representation of our approach for service categorization would be
O[m(log n + n)].
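The idea of limiting recategorization to the services under an unchanged upper concept can be illustrated with a short sketch. It assumes, purely for illustration, that the ontology is available as a parent-to-children map and that each service is currently assigned to one concept; `RecategorizationScope` and its method names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Sketch: find the services that need recategorizing under one upper concept. */
class RecategorizationScope {

    /** Collects every concept below the given upper concept (children, grandchildren, ...). */
    static Set<String> descendants(Map<String, List<String>> childrenOf, String upperConcept) {
        Set<String> found = new HashSet<>();
        Deque<String> pending =
                new ArrayDeque<>(childrenOf.getOrDefault(upperConcept, List.of()));
        while (!pending.isEmpty()) {
            String concept = pending.pop();
            // Visit each concept once, then queue its own children.
            if (found.add(concept)) {
                pending.addAll(childrenOf.getOrDefault(concept, List.of()));
            }
        }
        return found;
    }

    /** Returns, in sorted order, the services whose category lies under the upper concept. */
    static List<String> servicesToRecategorize(Map<String, String> categoryOfService,
                                               Map<String, List<String>> childrenOf,
                                               String upperConcept) {
        Set<String> affected = descendants(childrenOf, upperConcept);
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, String> e : categoryOfService.entrySet()) {
            if (affected.contains(e.getValue())) {
                result.add(e.getKey());
            }
        }
        Collections.sort(result);
        return result;
    }
}
```

Services assigned to concepts outside the affected subtree are left untouched, which is what keeps an ontology update cheaper than recategorizing the entire registry.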
A balance must be maintained between the generality and the specificity of terms in web
service descriptions. This is achieved by expanding the term vectors with relevant ontology
concepts and subsequently reducing the terms from the web service descriptions. The results
follow those observed in the add set of experiments. The technique wherein ontology
concepts are added to all terms of web service descriptions, followed by pruning, results in
increased generality. The best results among all techniques were observed for the
technique wherein ontology concepts are added to relevant terms of web service
descriptions, followed by pruning. These results suggest an increase in specificity and a
reduction of generality of the terms in web service descriptions.
Semantic Similarity-Based Matching
To evaluate our approach for semantic similarity-based service discovery, we set out
to discover relevant services for an average of ten service requests. The initial discovery is
based on a smaller number of WSDL files with a focus on precision. The next discovery
experiment examines a larger section of WSDL files with a focus on maximizing recall. In
order to assess the impact of service request expansion with relevant terms from ontology
concepts on service discovery, we compare the cosine measure-based similarity scores of
two service selection methods: one with the enhanced service request and one with the original
service request. The expanded service requests thus facilitate improved differentiation
between the appropriate services and the rest of the services, on account of the higher score
differences indicating a better match to the service request. The ranking of the services
changes as more dimensions are added to the service collection under consideration.
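The comparison between the original and the enhanced service request can be illustrated with a plain cosine measure over term-frequency vectors. This sketch deliberately omits the LSI dimensionality reduction used in the approach and works on raw term counts; the class `CosineMatcher` and the sample terms are assumptions made for the example.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Sketch: cosine similarity between a service request and a service description. */
class CosineMatcher {

    /** Builds a sparse term-frequency vector from a list of terms. */
    static Map<String, Integer> termFrequencies(List<String> terms) {
        Map<String, Integer> tf = new HashMap<>();
        for (String t : terms) {
            tf.merge(t, 1, Integer::sum);
        }
        return tf;
    }

    /** Cosine of the angle between two sparse term vectors; 0.0 if either is empty. */
    static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0, normA = 0, normB = 0;
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            normA += e.getValue() * e.getValue();
            Integer other = b.get(e.getKey());
            if (other != null) {
                dot += e.getValue() * other;
            }
        }
        for (int v : b.values()) {
            normB += v * v;
        }
        if (normA == 0 || normB == 0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        // Illustrative vectors: the enhanced request carries one extra ontology term.
        Map<String, Integer> service = termFrequencies(List.of("movie", "package", "entertainment"));
        Map<String, Integer> original = termFrequencies(List.of("movie"));
        Map<String, Integer> enhanced = termFrequencies(List.of("movie", "entertainment"));
        System.out.println("original request: " + cosine(original, service));
        System.out.println("enhanced request: " + cosine(enhanced, service));
    }
}
```

With these sample vectors the enhanced request scores about 0.82 against the service description, compared with about 0.58 for the original request, which mirrors the larger score differences described above.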
Performance
We compared the time taken to match a service request with a web service
description within service sets that include 1) predefined categories, 2) semantic
categorization, and 3) the entire service set, i.e., the set of uncategorized services. The purpose of this
experiment is to validate our approach for ontology-guided web service categorization. It
was observed that the time taken for service matching within pre-defined categories,
semantically categorized (our approach) and uncategorized services was 2.58, 3.65, and 406.8
seconds, respectively. We observe that our approach provides a balance in terms of quality of
the service selected and also the time taken for matching of an appropriate service. The
observed time for service discovery seems acceptable, especially given that most of the time
users will submit more incremental, and hence less time-consuming, requests. The time it
takes to load the system, though, could be improved. In the future, we plan to further evaluate
the scalability of our approach, along with detailed experimentation with actual users to fine
tune the way in which our integrated functionality is presented and to eventually evaluate the
full benefits of our approach from a performance and solution quality standpoint.
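As a quick check on the balance described above, the three reported matching times give the following ratios (simple arithmetic on the figures in this section, rounded):

```latex
\[
\frac{406.8\ \text{s}}{3.65\ \text{s}} \approx 111.5,
\qquad
\frac{3.65\ \text{s}}{2.58\ \text{s}} \approx 1.41
\]
```

Matching within semantic categories is therefore roughly two orders of magnitude faster than matching over the uncategorized set, while taking about 41% longer than matching within the predefined categories.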
Deployment
In the existing architecture, the service provider/requestor accesses the UDDI through
an application server. To deploy our approach, this architecture must be enhanced by incorporating a
semantic application server as well as an ontology repository. The Application Server now
executes our approach to select the most suitable services based on semantics processed by
the Semantic Application Server in conjunction with the ontology repository. The Semantic
Application Server should include an ontology reasoner (e.g., Racer) that utilizes description
logics to load and query ontologies to extract the relevant concepts for semantic
categorization of web service descriptions and enhancement of service requests. Since our
proposed work considers the semantic functionality of web services for service discovery and
ranking, it does not explicitly address other QoS measures such as trust and reputation.
However, these can be incorporated as follows: for example, a trust and reputation
registry could be integrated with the UDDI server. Depending upon the number of web
services and service requests, it may be necessary to use XML gateway devices to offload the work
of parsing and transforming XML and so reduce the computational burden.
CHAPTER 10
CONCLUSION
Thus, the proposed system provides a novel approach to service
discovery that addresses two major aspects: the categorization of services based on
their functional semantics and the clustering of services using related ontology concepts.
The pattern recognition algorithm is implemented here to identify the appropriate service
from the cluster. Hence the proposed system satisfies user needs by providing the
appropriate service as requested by the user.
In the future, our approach can be extended to allow service requests that are formed
using specialized query languages, with these requests matched to semi-annotated services
that are described using formats such as SAWSDL and OWL-S, among others. Typically,
multiple services have to be discovered so that they together match a service request. It
should be possible to utilize ontologies, and explicitly return the sequence of individual
service invocations to be performed in order to achieve the desired composite service.
A flexible matching approach could be created to return partial matches and/or
suggest additional inputs that would produce a full match by capturing the dependencies
among the matched services. Another avenue for future work is to create an interactive,
intelligent service composer that is semantically guided to locate the target service
components step by step. We also intend to extend our ontology framework and
investigate additional mapping tools to better express a service request to search for relevant
concepts. As part of the service discovery process, we plan to explore associating semantic weights with
the retrieved set of web services for effective semantic ranking of the results.
PUBLICATIONS
TITLE : Cluster based Approach for Service Discovery using Pattern Recognition
Publication : International Journal of Advanced and Innovative Research.
ISSUE : Volume 2 Issue 2
Month : FEBRUARY 2013, PAGE 345
ISSUE ISBN : 2278-7844
APPENDIX-1
SAMPLE CODING
Linearprog.java
import java.awt.*;
import java.awt.event.*;
import javax.swing.*;
import java.sql.*;
import java.util.*;
import java.io.*;
class linearprog extends JFrame implements ActionListener
{
JLabel l1,l2,l3,l4,l5,l6,l7;
JTextField t1,t2,t3,t4,t5,t6,t7;
String str,str1,str2;
JButton b1,b2,b3,b4;
JComboBox c1;
JLabel lite,lerr;
JTextField ite,err;
JLabel lbid,lbidtime;
JTextField bid,bidtime;
double bamt,btime;
double dstr2;
double xavg=0; double yavg=0;
double alpha=0; double beta=0;
Connection con=null;
ResultSet rs=null;
public linearprog()
{
super("Neural Prediction- Online Auction");
Container c = getContentPane();
c.setLayout(new GridLayout(8,2,10,10));
JPanel inp = new JPanel(new GridLayout(1,9,10,10));
lite = new JLabel("Iterations");
ite = new JTextField("500");
inp.add(lite);
inp.add(ite);
lerr = new JLabel("Error Rate");
err = new JTextField("0.5");
inp.add(lerr);
inp.add(err);
String alg[]= {"Online Auction"};
c1 = new JComboBox(alg);
// inp.add(c1);
c.add(c1);
c.add(inp);
lbid = new JLabel("Bid Amount");
c.add(lbid);
bid = new JTextField();
c.add(bid);
lbidtime = new JLabel("Remaining Time");
c.add(lbidtime);
bidtime = new JTextField();
c.add(bidtime);
l2 = new JLabel("Alpha");
c.add(l2);
t2 = new JTextField();
c.add(t2);
l3 = new JLabel("Beta");
c.add(l3);
t3 = new JTextField();
c.add(t3);
l4 = new JLabel("Predicted Price");
c.add(l4);
t4 = new JTextField();
c.add(t4);
b1 = new JButton("Load");
c.add(b1);
b2= new JButton("Train");
c.add(b2);
b3 = new JButton("Calculate");
c.add(b3);
b4 = new JButton("Exit");
c.add(b4);
setSize(600,600);
b1.addActionListener(this);
b2.addActionListener(this);
b3.addActionListener(this);
b4.addActionListener(this);
setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
// Train and Calculate stay disabled until the preceding step has run.
b2.setEnabled(false);
b3.setEnabled(false);
}
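public void actionPerformed(ActionEvent e)
{
String s = e.getActionCommand();
// "Load": read the bid amount and remaining time, then open the dataset view.
if(s.equals("Load"))
{
btime = Double.parseDouble(bidtime.getText());
bamt = Double.parseDouble(bid.getText());
Dataset s1=new Dataset(bamt,btime);
s1.show();
b2.setEnabled(true);
}
// "Train": fit the regression for the entered bid.
if(s.equals("Train"))
{
String s1=bid.getText();
// Parse as a double first so that decimal bids accepted by "Load" do not fail here.
int i = (int) Double.parseDouble(s1.trim());
calavg(i);
b3.setEnabled(true);
}
// "Calculate": predicted price = bid amount + error-scaled regression output.
if(s.equals("Calculate"))
{
double pprice = bamt+ dstr2;
double i=Double.parseDouble(ite.getText());
t4.setText(String.valueOf(pprice));
}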
if(s.equals("Exit"))
{
System.exit(0);
}
}
public void calavg(int x)
{
String error=null;
// tempx,tempy;
double tempx,tempy;
//int sx=0;int sy=0;
//int sxx=0;int sxy=0;
double sx=0;
double sy=0;
double sxx=0;
double sxy=0;
String s1=null; String s2=null;
try
{
System.out.println("\nAttempting to load JDBC Driver....");
// Note: the JDBC-ODBC bridge driver was removed in Java 8, so this line needs Java 7 or earlier.
Class.forName("sun.jdbc.odbc.JdbcOdbcDriver");
System.out.println("JDBC Driver loaded...");
System.out.println("Connecting to database...");
con=DriverManager.getConnection("jdbc:odbc:ebay","","");
System.out.println("Database connection established");
}//end of try
catch (Exception sqle)
{
System.out.println("Unable to load driver..."+sqle);
// Without a connection the query below would fail with a NullPointerException.
return;
}
try
{
//String queryString=("SELECT * FROM auction;");
String queryString=("SELECT * FROM auction where bid> "+bamt+";");
Statement stmt=con.createStatement();
rs=stmt.executeQuery(queryString);
while (rs.next())
{
s1=rs.getString("bid");
s2=rs.getString("bidderrate");
//tempx = Integer.parseInt(s1.trim());
//tempy = Integer.parseInt(s2.trim());
tempx = Double.parseDouble(s1.trim());
tempy = Double.parseDouble(s2.trim());
sx=sx+tempx;
sy=sy+tempy;
sxx=sxx+(tempx*tempx);
sxy=sxy+(tempx*tempy);
} //end of while
System.out.println("\n");
System.out.println("Training the Network...\n");
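// t plays the role of the sample count n in the least-squares formulas below.
// It is fixed at 10 here, so the fit is exact only when the query returns 10 rows.
double t=10;
beta=((t*sxy)-(sx*sy))/((t*sxx)-(sx*sx));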
double yavg=sy/t;
double xavg=sx/t;
alpha=yavg-(beta*xavg);
double y;
y=alpha+(beta*x);
double e = Double.parseDouble(err.getText());
double dstr= Math.abs(alpha) * e;
str= Double.toString(dstr);
double dstr1= Math.abs(beta) * e;
str1= Double.toString(dstr1);
dstr2= Math.abs(y) * e;
str2= Double.toString(dstr2);
// str1 = Double.toString(beta);
// str2 = Double.toString(y);
//System.out.println("Beta is " + beta);
t2.setText(str);
t3.setText(str1);
}//end of try
catch (SQLException sqle)
{
System.out.println("Some SQL error occurred: " + sqle);
}
}//end of function
public static void main(String args[])
{
linearprog t=new linearprog();
t.setVisible(true);
}//end of main
}//end of class
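The regression that calavg performs can be sketched in isolation, without the Swing and JDBC code around it. The sketch below is an illustration rather than the project's code: it uses the same sums (sx, sy, sxx, sxy) and the same formulas for alpha and beta, but takes the sample count from the data instead of a fixed value. The class name `LeastSquaresFit` and the sample data are assumptions made for the example.

```java
/** Sketch: simple linear regression y = alpha + beta * x by least squares. */
class LeastSquaresFit {
    final double alpha; // intercept
    final double beta;  // slope

    LeastSquaresFit(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("need at least two (x, y) pairs of equal length");
        }
        int n = x.length; // actual number of samples, not a fixed constant
        double sx = 0, sy = 0, sxx = 0, sxy = 0;
        for (int i = 0; i < n; i++) {
            sx += x[i];
            sy += y[i];
            sxx += x[i] * x[i];
            sxy += x[i] * y[i];
        }
        double denominator = n * sxx - sx * sx;
        if (denominator == 0) {
            throw new IllegalArgumentException("all x values are identical");
        }
        beta = (n * sxy - sx * sy) / denominator;
        alpha = sy / n - beta * (sx / n);
    }

    /** Predicted y for the given x. */
    double predict(double x) {
        return alpha + beta * x;
    }

    public static void main(String[] args) {
        // Points on the line y = 1 + 2x, so the fit recovers alpha = 1 and beta = 2.
        LeastSquaresFit fit = new LeastSquaresFit(new double[] {1, 2, 3}, new double[] {3, 5, 7});
        // prints 1.0 2.0 9.0
        System.out.println(fit.alpha + " " + fit.beta + " " + fit.predict(4));
    }
}
```

Counting the samples directly avoids a mismatch between the divisor and the number of rows actually read, and the guard on the denominator avoids a division by zero when every x value is the same.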
Channel.jsp
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<%@taglib uri="/struts-tags" prefix="s"%>
<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8" />
<meta name="description" content="" />
<meta name="keywords" content="" />
<title>Bootstrapping IPTV</title>
<link rel="stylesheet" type="text/css" href="style.css" />
<script type="text/javascript">
function call() {
var channelType = document.registration.channelType.value;
var userName = document.registration.userName.value;
var password = document.registration.passWord.value;
var mode = document.registration.mode.value;
if (channelType != '' && userName != '' && password != '' && mode != '')
document.registration.submit();
else
alert('Please Enter all the fields !!');
}
</script>
</head>
<body>
<div id="wrapper">
<div id="splash">
<img src="images/pic1.jpg" alt="" />
</div>
<div id="menu">
<ul>
<li><a id="href" href="registration.jsp">Channel's Credential</a></li>
<li><a id="href" href="iptvPortal.jsp">IPTV Portal</a></li>
<li><a id="href" href="iptvPortalRule.jsp">Rule Settings</a></li>
<li><a id="href" href="login.jsp">Logout</a></li>
</ul>
</div>
<div id="page">
<div align="center">
<s:form name="registration" action="ChannelAction">
<table border="0" cellspacing="15">
<tr>
<td colspan=2 align="center" style="color: #447289;">
<h2>Channel Registration</h2>
</td>
</tr>
<tr>
<td><s:combobox label="Select a Channel"
list="#{'HBO':'HBO','StarMovies':'StarMovies','WorldMovies':'WorldMovies','ESPN':'ESPN'}"
name="channelType" /></td>
</tr>
<tr>
<td><s:textfield label="User Name" name="userName"
id="userName"></s:textfield></td>
</tr>
<tr>
<td><s:password label="Password" name="password"
id="passWord"></s:password></td>
</tr>
<tr>
<td><s:radio label="Mode" name="mode" id="mode"
list="#{'1':'read','2':'write'}" value="1" /></td>
</tr>
<tr>
<td colspan="2" style="text-align: center"><input
id="param1" type="button" onclick="call()" value="Enter"></input>
</td>
</tr>
</table>
</s:form>
</div>
<br class="clearfix" />
</div>
</div>
</body>
</html>