Stateful Web Services
An Independent Study Report - I
SUBMITTED TO THE FACULTY OF COMPUTER SCIENCES
ON GRADUATE STUDIES OF
SHAHEED ZULFIKAR ALI BHUTTO INSTITUTE OF SCIENCE & TECHNOLOGY
IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF
MASTER OF SCIENCE (COMPUTER SCIENCE)
Muhammad Jawaid Shamshad MS/PhD (CS) 052210
December 2006
Supervised by
Aslam Parvez Memon
To my Parents,
who contributed most in my studies.
Acknowledgements
In this era of technology with ever-increasing scope and compressed schedules,
acknowledging the contributions of everyone involved along the way is more and more
important. All too often, we move on to new projects without remembering to thank the
people who have helped us on the current one. A contribution that might on the surface
seem small can in fact make or break a project or present a fresh way of solving a
problem. It's important not only to thank people personally but also to find every
opportunity to recognize their contributions publicly.
The author's study of "Stateful Web Services" required considerable effort, and the author wishes to
thank his colleagues and especially his advisor, Sir Aslam Parvez Memon, who gave his
valuable advice in conducting this study.
Muhammad Jawaid Shamshad
MS/PhD (CS) 052210
TABLE OF CONTENTS
CHAPTER 1 - INTRODUCTION
1.1 LITERATURE REVIEW
1.2 RESEARCH METHOD
1.3 PROBLEM
1.4 NEED
1.5 MODEL IN PRACTICE
CHAPTER 2 - WHAT ARE WEB SERVICES?
2.1 WEB SERVICE
2.2 SOAP
2.3 WSDL
2.4 DISCOVERING WEB SERVICE
2.4.1 UDDI
2.4.2 EBXML REGISTRIES
CHAPTER 3 - PROBLEM
3.1 INTRODUCTION TO STATE
3.2 NEED FOR STATE MANAGEMENT
3.3 STATE MANAGEMENT IN WEB SERVICES
CHAPTER 4 - BEST PRACTICES OF STATE MANAGEMENT
4.1 STATEFUL MODEL
4.2 STATE MANAGEMENT TECHNIQUES
4.2.1 IN-MEMORY SESSION
4.2.2 DATABASES
4.3 GENERALIZED MODEL
4.3.1 FUNCTIONAL DESIGN
4.3.2 MIDDLE TIER
4.3.3 DATA LAYER
4.4 ISSUES TO ADDRESS
4.4.1 SECURITY ISSUES
4.4.2 SESSION LIFETIME
CONCLUSION
REFERENCES
TABLE OF FIGURES
FIGURE 2.1: STRUCTURE OF A SOAP MESSAGE
FIGURE 2.2: THE SOAP MESSAGE STRUCTURE
FIGURE 2.3: THE SOAP REQUEST
FIGURE 2.4: THE SOAP RESPONSE
FIGURE 2.5: WSDL ABSTRACT DESCRIPTION
FIGURE 2.6: WSDL’S CONCRETE BINDING INFORMATION
FIGURE 2.7: AN EBXML ARCHITECTURE IN USE
FIGURE 3.1: TRANSACTIONAL STEPS IN A PROCESS
FIGURE 4.1: LOGIN PROCESS
FIGURE 4.2: SERVICE REQUEST PROVIDING TOKEN
FIGURE 4.3: SEQUENCE DIAGRAM FOR SESSION MANAGEMENT
Executive Summary
Web services are stateless by nature. In certain situations, however, the state of some
resources, such as user sessions, must be maintained. This is typically required when
business and e-commerce applications that rely on user sign-on need to track the state of
connected clients, yet are built on web services, which provide no implicit state
management facility. This study presents a logical model for maintaining the state of
resources in web services.
Chapter 1
Introduction
1.1 Literature Review
Literature has been collected from published research papers, such as those from ACM and
IEEE, from chapter excerpts of books (listed in the References section), and from the
Internet.
1.2 Research Method
This study presents a logical model for maintaining the state of resources in web services.
The study was conducted by first defining the problem domain, its need, and its scope, and
then discussing a generalized logical model in practice. The study plan can be depicted
as:
Background Review
Problem Domain
Requirements Specification
Generalized logical model in Practice
1.3 Problem
Web services are stateless by nature. In certain situations, however, the state of some
resources, such as user sessions, must be maintained. This is typically required when
business and e-commerce applications that rely on user sign-on need to track the state of
connected clients, yet are built on web services, which provide no implicit state
management facility.
1.4 Need
State management is difficult to avoid in a number of situations. One such situation is
establishing a session between a consumer and a provider. A session is typically established
for efficiency reasons. For example, sending a security certificate with each request is a
serious burden for both the consumer and the provider. It is much quicker to replace the
certificate with a token shared only between the consumer and the provider. Services
therefore need to manage state, look after it, and ensure through business logic
that it is kept consistent and accurate. This state is the only true and current source of
information.
1.5 Model in Practice
The common practice for managing state is to associate a token with the resource.
The token is generated on the first request, after authentication, and is then passed
in all further communication between client and server. A token can have a lifetime and
must be unique: if two resources shared the same token, their state would conflict.
A common example of such a token is the session id in login-based systems. On a login
request, a unique session id is generated and returned to the client, which then uses it
in further communication. The server stores information against this session id to save
the state of the client. The session id can expire on a logout request or after a
specified period of inactivity.
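The token practice described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the model developed later in this report: the class name SessionStore, its method names, the 30-minute idle timeout, and the in-memory dictionary are all hypothetical choices, and the authentication step itself is reduced to a boolean flag.

```python
import secrets
import time


class SessionStore:
    """In-memory token store (hypothetical sketch of the session id practice)."""

    def __init__(self, idle_timeout_seconds=1800.0, clock=time.monotonic):
        self._idle_timeout = idle_timeout_seconds  # assumed default: 30 minutes
        self._clock = clock  # injectable so that expiry can be exercised in tests
        self._sessions = {}  # token -> {"user", "state", "last_seen"}

    def login(self, user, authenticated):
        """Issue a fresh token once the caller has authenticated the user."""
        if not authenticated:
            raise PermissionError("authentication failed")
        token = secrets.token_hex(16)  # 128 random bits; collisions are negligible
        self._sessions[token] = {
            "user": user,
            "state": {},
            "last_seen": self._clock(),
        }
        return token

    def get_state(self, token):
        """Return the state kept for a token; fail if it is unknown or idle too long."""
        session = self._sessions.get(token)
        if session is None:
            raise KeyError("unknown or expired token")
        now = self._clock()
        if now - session["last_seen"] > self._idle_timeout:
            del self._sessions[token]  # expire after the idle period
            raise KeyError("unknown or expired token")
        session["last_seen"] = now  # every use of the token counts as activity
        return session["state"]

    def logout(self, token):
        """Expire the token explicitly on a logout request."""
        self._sessions.pop(token, None)
```

A client would call login once, keep the returned token, and present it with every later request; the server side would call get_state with that token to read or update whatever it stores for the client.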
Chapter 2
What are Web Services?
2.1 Web Service
Several definitions of web services are available, and it is difficult to give a single
concrete one. According to the authors of “The Semantic Web” [1], the definition of a web
service is:
“Web services are software applications that can be discovered, described, and accessed
based on XML and standard Web protocols over intranets, extranets, and the Internet.”
Starting with the concept, the first part of the definition, “Web services are software
applications”, expresses the main point: Web services are software applications like any
other, performing specific tasks depending on their implementation. In addition, these
software applications are available on the web.
Next, according to the definition, web services “can be discovered, described, and
accessed based on XML and standard Web protocols”. This part states that web services are
built on XML [2], a widely accepted standard supported by the majority of vendors. As a
result, web services are interoperable, and interoperability is their main focus. The
standard web protocols include the Hypertext Transfer Protocol (HTTP [3]), which is the
underlying communication protocol. Web services thus use XML as the syntax of their
messages and HTTP to transfer those messages. This is the access method. The message is a
Simple Object Access Protocol (SOAP [4]) envelope, which is in XML format.
Web services can be discovered dynamically through Universal Description, Discovery, and
Integration (UDDI [5]) registries. These registries keep a record of web services and
their descriptions, i.e. the syntax of the web services.
UDDI provides the description in Web Services Description Language (WSDL [5]) form,
which is also an XML format and describes the syntax of web services.
The last part of the definition states that Web services are available “over intranets,
extranets, and the Internet”. This means that web services can be public, or private for
an organization’s internal use, or shared between two partnering organizations in a B2B
solution. It is therefore important to understand that web services are not only publicly
accessible to the world but can also be private, accessible only within an organization’s
intranet.
One more concept should be made clear: web services do not depend on user interfaces.
Web services are only APIs that an application calls to get information or perform an
action; examples are a flight schedule web service and an airline reservation web
service. Since messages are passed in XML format, how the data is presented depends on
the application that displays it.
2.2 SOAP
Simple Object Access Protocol (SOAP) was created by Microsoft, DevelopMentor, IBM,
Lotus, and UserLand.
SOAP is an XML-based protocol for messaging and remote procedure calls (RPCs); that is,
SOAP messages are formatted as XML. XML was adopted as the format of SOAP because it is
universally accepted for platform-independent data encoding. SOAP uses existing transport
protocols such as HTTP, SMTP, and MQSeries to transfer messages or remote procedure calls.
Web services transfer XML messages in SOAP format. Each message is wrapped in a SOAP
envelope, which contains a SOAP header and a SOAP body. The SOAP header contains meta
information, and the body contains the actual message or remote procedure call
(RPC) in XML syntax. The SOAP envelope is sent over HTTP between web service
consumers and web service providers. Other protocols can be used, as noted above, but
HTTP is the usual choice. W3C defines SOAP as “a
lightweight protocol for exchange of information in a decentralized, distributed
environment.” [4]. SOAP provides a standard language for tying together client and server
applications on different platforms: the client application sends a SOAP request and the
web service returns a SOAP response.
Although SOAP is associated with web services, it has no relation to object-oriented
programming. A developer can create a SOAP-based web service in C, Pascal, or any similar
language that does not support object-oriented programming. The only requirements are
that an application written in such a language can understand XML, i.e. can parse and
evaluate XML documents, and can communicate over one of the transport protocols that SOAP
supports, such as HTTP.
SOAP has been adopted as the standard for Web services, and the majority of vendors
have developed SOAP APIs for their products, such as Microsoft for its .NET platform and
Sun for Java, which makes integration of software systems much easier.
Now let’s look at the SOAP message syntax and how it works. A SOAP message consists
of an envelope containing an optional header and a required body. Figure 2.1 shows a
SOAP envelope’s structure.
<SOAP:Envelope xmlns:SOAP="http://schemas.xmlsoap.org/soap/envelope/">
  <SOAP:Header>
    <!-- content of header goes here -->
  </SOAP:Header>
  <SOAP:Body>
    <!-- content of body goes here -->
  </SOAP:Body>
</SOAP:Envelope>
Figure 2.1: Structure of a SOAP message. The envelope features child elements that
contain the message header and body elements. [5]
The header contains information concerning how the message is to be processed. This
includes routing and delivery settings, authentication or authorization assertions, and
transaction contexts. The body contains the actual message to be delivered and processed.
Anything that can be expressed in XML syntax can go in the body of a message. This is
graphically depicted in Figure 2.2 [6].
Figure 2.2: The SOAP message structure [6]
Let’s look at an example of a simple SOAP message, taken from the SOAP 1.1
specification. Figure 2.3 [5] shows a simple SOAP message for getting
the last trade price of the “DIS” ticker symbol. The SOAP envelope wraps everything in
the message. The encodingStyle attribute of the SOAP envelope shows how the message
is encoded, so that the Web service can read it. Next is the SOAP body, which wraps the
application-specific information, here the call to GetLastTradePrice. A Web service
receives this information, processes the request in the SOAP body, and can return a SOAP
response.
<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
    SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
  <SOAP-ENV:Body>
    <m:GetLastTradePrice xmlns:m="Some-URI">
      <symbol>DIS</symbol>
    </m:GetLastTradePrice>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>
Figure 2.3: The SOAP request [5]
The SOAP response for our example stock price request is shown in Figure 2.4 [5].
The response is syntactically the same as the request: it consists of an
envelope that wraps the message, it declares its encoding style, and it wraps the content
of the message in the SOAP body. Only the message inside the body is different. Under the
SOAP-ENV:Body tag, the message is wrapped in the
GetLastTradePriceResponse tag, with the resulting price shown in the Price tag.
<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
    SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
  <SOAP-ENV:Body>
    <m:GetLastTradePriceResponse xmlns:m="Some-URI">
      <Price>34.5</Price>
    </m:GetLastTradePriceResponse>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>
Figure 2.4: The SOAP response [5]
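To make the request and response structure concrete, the following Python sketch builds the request of Figure 2.3 and reads the price out of a response shaped like Figure 2.4, using only the standard library's xml.etree.ElementTree module. The function names build_request and parse_response are hypothetical, "Some-URI" is the placeholder namespace from the figures, and no network transport is involved; a real client would normally rely on a SOAP toolkit instead.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"
SOAP_ENC = "http://schemas.xmlsoap.org/soap/encoding/"
SERVICE_NS = "Some-URI"  # placeholder namespace, as in the figures

# Prefix map used when searching parsed documents.
NS = {"soap": SOAP_ENV, "m": SERVICE_NS}


def build_request(symbol):
    """Return a SOAP 1.1 envelope that calls GetLastTradePrice for a symbol."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    envelope.set(f"{{{SOAP_ENV}}}encodingStyle", SOAP_ENC)
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}GetLastTradePrice")
    ET.SubElement(call, "symbol").text = symbol
    return ET.tostring(envelope, encoding="unicode")


def parse_response(xml_text):
    """Extract the price from a GetLastTradePriceResponse envelope."""
    envelope = ET.fromstring(xml_text)
    price = envelope.find("soap:Body/m:GetLastTradePriceResponse/Price", NS)
    if price is None or price.text is None:
        raise ValueError("response does not contain a Price element")
    return float(price.text)
```

The serializer chooses its own namespace prefixes, so the generated text may not match the figure character for character, but it is equivalent XML: the element names and namespaces are the same.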
2.3 WSDL
SOAP is the communication language of Web services, but speaking a universal
language is not very useful unless you can hold the basic conversations that let you
achieve your goals. How can we tell which messages must be exchanged to interact
successfully with a service? That role is filled by WSDL. Web Services Description
Language (WSDL) is the way we describe the communication details and the
application-specific messages that can be sent in SOAP. WSDL, like SOAP, is an XML
format; it was developed by IBM and Microsoft. The W3C defines WSDL as “an XML format for
describing network services as a set of endpoints operating on messages containing either
document-oriented or procedure-oriented information”. To know how to send messages
to a particular Web service, an application can look at the WSDL and dynamically
construct SOAP messages. WSDL describes the operational information: where the
service is located, what the service does, and how to invoke the service. WSDL is
difficult to read, but it is not really intended to be human-readable.
Developers do not have to understand WSDL and SOAP to create Web services. When a
developer creates a Web service, most toolkits generate the WSDL automatically. The
client application then generates the code for handling the Web service, generally called
a stub, by looking at the WSDL. Finally, the client application and the Web service can
communicate with each other.
A WSDL service description contains two pieces of information. The first is an
abstract interface, which is the application-level service description; the second is the
specific protocol-dependent detail that a client application needs to follow to access
the service. Both are necessary because similar application-level
service functionality is often deployed at different end points with slightly different
access protocol details. Separating the description of these two aspects helps WSDL
represent common functionality between seemingly different end points.
The abstract description is defined in WSDL as the messages that need to be exchanged
between the client and the web service. The abstract interface contains three components:
the vocabulary, the message, and the interaction. The vocabulary describes the type
system that provides the data type definitions used to exchange information. WSDL uses
external type systems for this purpose. XSD is the most widely used, but WSDL supports
any type system. Generally, XSD is used for standard data types such as string, int, and
float, which are supported by most languages, such as C/C++, Java, and C#. An external
schema is used to define custom data types, for example when a developer wants to define
a class or structure and use it in communication. Figure 2.5 [5] shows an example in
which two data types are defined in XSD (string and int) and two data types are defined
in an external schema (FlightInfoType and Ticket).
<message name="GetFlightInfoInput">
  <part name="airlineName" type="xsd:string"/>
  <part name="flightNumber" type="xsd:int"/>
</message>
<message name="GetFlightInfoOutput">
  <part name="flightInfo" type="fixsd:FlightInfoType"/>
</message>
<message name="CheckInInput">
  <part name="body" element="eticketxsd:Ticket"/>
</message>
<portType name="AirportServicePortType">
  <operation name="GetFlightInfo">
    <input message="tns:GetFlightInfoInput"/>
    <output message="tns:GetFlightInfoOutput"/>
  </operation>
  <operation name="CheckIn">
    <input message="tns:CheckInInput"/>
  </operation>
</portType>
Figure 2.5: WSDL abstract description. This fragment shows the string and int data
types, which are defined in XSD, and two other data types defined in an external
schema: FlightInfoType and Ticket, which we assume were imported earlier in the
WSDL file. [5]
External XSD definitions are imported into WSDL using an “import” element, which
specifies the location of the schema. Message elements are defined in WSDL as
aggregations of parts, and each part is described by XSD types or by elements from any
other external schema. Messages provide an abstract, typed definition of the data sent to
and from the services. The example in Figure 2.5 shows three messages that might appear
during a Web services interaction. The message GetFlightInfoInput has two parts:
airlineName, which is an XSD string, and flightNumber, which is an XSD integer. The other
two messages, GetFlightInfoOutput and CheckInInput, have only one part each. Interaction is
defined by the operation and portType elements. Each operation represents a message
exchange pattern that the Web service supports. An operation is simply a combination of
messages labeled as input, output, or fault to indicate what part a particular message plays
in the interaction. A portType is a collection of operations that are collectively supported
by an end point. In our example, AirportServicePortType describes two operations: a single
request-response operation, GetFlightInfo, which expects the GetFlightInfoInput message as
input and returns a GetFlightInfoOutput message as the response; and a one-way operation,
CheckIn, which just takes the CheckInInput message as input.
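As a rough illustration of how a tool might read this abstract description, the Python sketch below parses a portType and classifies each operation by the messages it declares. It is a sketch under assumptions: the function name describe_operations is hypothetical, the enclosing definitions element and the tns namespace URI (http://example.org/airport) are invented here to make the fragment of Figure 2.5 a complete document, and only the two exchange patterns discussed above are distinguished.

```python
import xml.etree.ElementTree as ET

# WSDL 1.1 namespace; the prefix "wsdl" is only used for searching.
WSDL_NS = {"wsdl": "http://schemas.xmlsoap.org/wsdl/"}

# The portType of Figure 2.5, wrapped in an assumed <definitions> element.
SAMPLE_WSDL = """
<definitions xmlns="http://schemas.xmlsoap.org/wsdl/"
             xmlns:tns="http://example.org/airport"
             targetNamespace="http://example.org/airport">
  <portType name="AirportServicePortType">
    <operation name="GetFlightInfo">
      <input message="tns:GetFlightInfoInput"/>
      <output message="tns:GetFlightInfoOutput"/>
    </operation>
    <operation name="CheckIn">
      <input message="tns:CheckInInput"/>
    </operation>
  </portType>
</definitions>
"""


def describe_operations(wsdl_text):
    """Map each operation name to the message exchange pattern it declares."""
    root = ET.fromstring(wsdl_text.strip())
    patterns = {}
    for port_type in root.findall("wsdl:portType", WSDL_NS):
        for operation in port_type.findall("wsdl:operation", WSDL_NS):
            has_input = operation.find("wsdl:input", WSDL_NS) is not None
            has_output = operation.find("wsdl:output", WSDL_NS) is not None
            if has_input and has_output:
                pattern = "request-response"
            elif has_input:
                pattern = "one-way"
            else:
                pattern = "other"  # patterns not covered in this sketch
            patterns[operation.get("name")] = pattern
    return patterns
```

Calling describe_operations(SAMPLE_WSDL) reports GetFlightInfo as request-response and CheckIn as one-way, matching the description in the text.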
Beyond the application-level functionality of the service, we also need three more pieces
of information:
what communication protocol to use (such as SOAP over HTTP),
how to accomplish individual service interactions over this protocol, and
where to terminate communication (the network address).
The “what” and “how” parts of this information are provided by the binding element of the
WSDL, which specifies the communication protocol and data format. In short, the
binding element tells how a given interaction occurs over the specified protocol. Figure
2.6 shows a fragment from our example.
<binding name="AirportServiceSoapBinding" type="tns:AirportServicePortType">
  <soap:binding transport="http://schemas.xmlsoap.org/soap/http"/>
  <operation name="GetFlightInfo">
    <soap:operation style="rpc" soapAction="http://acme-travel/flightinfo"/>
    <input>
      <soap:body use="encoded" namespace="http://acme-travel.com/flightinfo"
                 encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"/>
    </input>
    <output>
      <soap:body use="encoded" namespace="http://acme-travel.com/flightinfo"
                 encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"/>
    </output>
  </operation>
  <operation name="CheckIn">
    <soap:operation style="document" soapAction="http://acme-travel.com/checkin"/>
    <input>
      <soap:body use="literal"/>
    </input>
  </operation>
</binding>
<service name="travelservice">
  <port name="travelservicePort" binding="tns:AirportServiceSoapBinding">
    <soap:address location="http://acmetravel.com/travelservice"/>
  </port>
</service>
Figure 2.6: WSDL’s concrete binding information. GetFlightInfo is a SOAP RPC
interaction and CheckIn is a pure messaging interaction that uses XSD to describe the
transmitted XML. [5]
The binding describes how to use SOAP to access the service: it combines the
abstract interface with the protocol and data marshalling details. A port element
describes a single end point as a combination of a binding and a network address,
and a service element groups a set of related ports. In our travel service
example, a single port describes an end point that processes SOAP requests for the
travelservice service. WSDL provides a formalized description of client-service
interaction for users and developers. During development, developers use WSDL as the
input to a proxy generator, which produces client code according to the service
requirements either at development time or, in the case of a dynamic invocation proxy,
at runtime. Proxy or stub generators relieve the developer of having to remember or
understand all the details of service access.
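The idea of a dynamic invocation proxy can be sketched as follows. This is a simplified, hypothetical illustration rather than the behaviour of any particular toolkit: the class SoapProxy is invented here, the operation table is written by hand instead of being read from the WSDL, the airline name in the usage note is made up, and the proxy only builds the request envelope without sending it.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"


class SoapProxy:
    """Expose service operations as Python methods (hypothetical sketch)."""

    def __init__(self, namespace, operations):
        self._namespace = namespace
        # operation name -> ordered parameter names, normally derived from WSDL
        self._operations = operations

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails, i.e. for operation names.
        operations = self.__dict__.get("_operations", {})
        if name not in operations:
            raise AttributeError(name)
        namespace = self.__dict__["_namespace"]
        expected = operations[name]

        def build_request(**arguments):
            if set(arguments) != set(expected):
                raise TypeError(f"{name} expects parameters {expected}")
            envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
            body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
            call = ET.SubElement(body, f"{{{namespace}}}{name}")
            for parameter in expected:  # keep the declared parameter order
                ET.SubElement(call, parameter).text = str(arguments[parameter])
            return ET.tostring(envelope, encoding="unicode")

        return build_request
```

With a proxy created as SoapProxy("http://acme-travel.com/flightinfo", {"GetFlightInfo": ["airlineName", "flightNumber"]}), the caller simply writes proxy.GetFlightInfo(airlineName="AcmeAir", flightNumber=301) and never constructs the envelope by hand, which is the convenience that generated stubs provide.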
2.4 Discovering Web Service
As discussed, web services are now widely used in e-commerce applications and for doing
business on the web, and most organizations are moving towards them. We have also
discussed the technical details of how a web service works and which protocols are used
in communication. But this knowledge is not enough if you do not know who provides which
web service and what that web service does. Once you know this, you can obtain or build
an application that communicates with that web service to expand your business. For this
purpose, some kind of registry is needed that maintains a list of web services and their
features, so that anyone can search the registry for a desired web service and
communicate with it. There are two major technologies for this purpose: UDDI (Universal
Description, Discovery, and Integration) and ebXML registries. UDDI was
introduced in 2000 by Ariba, Microsoft, and IBM to facilitate the discovery of business
processes. OASIS introduced ebXML in 1999, but its main focus was the consistent
use of XML and standard protocols in the EDI (Electronic Data Interchange) community.
A part of this effort, ebXML registries, is used for the discovery of ebXML business
details. The two are competing technologies and both are worth discussing, so
let’s discuss them one by one.
2.4.1 UDDI
Universal Description, Discovery, and Integration (UDDI) is an evolving technology and
is not yet a standard, but it is being implemented and embraced by major vendors such as
Microsoft and IBM [1]. Put simply, UDDI is like a phone book for Web
services: organizations register information about their web services, and
applications search the registry for that information. UDDI thus allows applications
to discover web services dynamically. When an organization registers with
UDDI, it has to provide the following information [5]:
white pages of company contact information,
yellow pages that categorize businesses by standard categorization, and
green pages that document the technical information about web services, such as WSDL.
White pages may include information about a company or organization, such as its
name, contact information (phone numbers and email addresses), a
description of its business, and links to external sources and documents where the
business is described in more detail.
The yellow pages describe the categories or classification of the information the
web services provide, plus products, industry codes, and a geographic index.
The green pages describe the e-business rules: how to do business with the exposed web
services and how to invoke them, i.e. the WSDL.
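The three kinds of pages can be pictured as fields of a single registry record. The Python sketch below is a conceptual illustration only: it does not use the UDDI API or its data model, and the names RegistryEntry, Registry, publish, and find_by_category are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class RegistryEntry:
    # White pages: who the business is and how to contact it.
    business_name: str
    contacts: list = field(default_factory=list)
    # Yellow pages: categories that classify the business and its services.
    categories: list = field(default_factory=list)
    # Green pages: technical details, here service name -> WSDL location.
    wsdl_locations: dict = field(default_factory=dict)


class Registry:
    """A toy in-memory registry supporting publication and category search."""

    def __init__(self):
        self._entries = []

    def publish(self, entry):
        """Register a business and its services."""
        self._entries.append(entry)

    def find_by_category(self, category):
        """Return every entry whose yellow pages list the given category."""
        return [entry for entry in self._entries if category in entry.categories]
```

An application looking for a service would search by category (yellow pages), pick an entry, and then follow the WSDL location from its green pages to learn how to invoke the service.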
It is worth noting that these registries can be public, or private within an organization
where only intra-organization communication is needed. For internal integration, private
UDDI registries are of much value today. A large organization in which
several large enterprise applications need to interoperate can use UDDI registries
to discover how to do so. Such a private registry
could minimize the need for interoperation documentation and reduce
integration and development time for legacy enterprise applications. Once applications
have been built on web services, and once the interfaces have been described in WSDL
and published in a private UDDI registry, other programs and projects within your
organization can dynamically connect and begin to interoperate [1]. Private registries are
comparatively secure, since they are internal to the organization, outsiders cannot
interfere, and they are operated according to the organization’s policy. You can also use
public registries to expose your business. Whether putting your organization’s
assets into a public registry is too risky is a long-running debate [1], but there are
ways to mitigate the risk; for example, you can maintain a list of authorized users who
may communicate with your web service, so that anyone who wants to use it must first
register in your system as an authorized user. This is where the need for
state management in web services arises. If an authorized entity talks to your
web service, you need a way to record its state: how many times it has called the
service, and whether it has attempted fraud or performed any
illegal action. If you have been managing its state, you can block the offender and add
it to your block list. Doing so requires statistics about each user, and without managing
state you cannot generate the reports that show who is doing fair business with you and
who is trying to cheat. Securing web services is a separate topic beyond the scope of
this study, but we will see why managing state is important and how state can be managed
in a web service environment.
2.4.2 ebXML Registries
The ebXML standard was created by OASIS together with UN/CEFACT. The main aim behind ebXML was to
enable business applications to exchange data in a uniform format and to enable intelligent
business processes using XML. ebXML was developed as an XML-based
business language, since XML by itself does not provide the semantics needed to solve
interoperability problems. In short, ebXML provides a common way for businesses to
quickly and dynamically perform business transactions based on common business
practices [1]. Figure 2.7 shows an example of an ebXML architecture in use. In the
diagram, company business process information and implementation details are found in
the ebXML registry and businesses can do business transactions after they agree on
trading arrangements. Information that can be described and discovered in an ebXML
architecture includes the following [1]:
Business processes and components described in XML
Capabilities of a trading partner
Trading partner agreements between companies
Figure 2.7 An ebXML architecture in use. [1]
The heart of the ebXML architecture is the ebXML registry, which is the system that is
used to store and discover this information. The ebXML registry contains
domain-specific semantics for B2B. These domain-specific semantics are the product of
agreement on many business technologies and protocols. Put simply, ebXML could be
described as the start of a domain-specific semantic Web. The focus of ebXML was not
initially on Web services, but it now uses SOAP as its message format. Therefore, many
believe that ebXML will have a large role in the future of Web services. Unlike UDDI,
ebXML is a standard. The ebXML standard does have support from many businesses, but
the most influential companies in Web services, IBM and Microsoft, would like to see
UDDI succeed as a registry for business information. However, it is possible that the two
technologies can complement each other, and ebXML could succeed in the B2B market,
while private UDDI registries succeed in the EAI market in the short term.
Chapter 3
Problem
Services manage state; this state is the very reason for their existence. Services look after
this state and ensure through their business logic that it is kept consistent and
accurate. This state is the only true and current source of information [7].
3.1 Introduction to State
Almost all services manage durable state; i.e., state that is stored on some durable
medium such as a file system or a database. The services receive a request from another
service, retrieve some state from that durable medium, and build a response or update the
state. The durable state allows services to be brought down without loss of context; when
they are brought up again, the durable state is still there and they can continue as if
nothing had happened. Services do their best to keep that durable state consistent; they
would like to keep their application state in memory consistent as well, but if something
happens, they can just abort the processing, forget their memory state, and set up anew
using the durable state. Services often use ACID (Atomicity, Consistency, Isolation,
Durability) transactions to maintain consistent durable state [7].
Services handle requests and sometimes they must handle multiple requests to complete a
business transaction. This may be organized using a business process, or process, which
controls the step-by-step actions of executing some work and moving the system from
one state to another [7]. At each step, it performs a business operation. For instance, a
business process may take an incoming order request, update the order system, send a
response, and then update the customer relationship management (CRM) system and the
production system. Another example would be a process that manages the complete
order, delivery, and payment process. It may accept a request for a quote, send the
requested quote, accept an order, check whether the order can be fulfilled, send an order
confirmation, arrange delivery, send an invoice, and so on. Each step moves the process
from one consistent state to the next.
Processes deal with three types of durable state [7]:
Permanent state. This is the state that is updated by the processes. This is what
we usually think of when talking about the state of a service. In the previous
examples, this is the order database and the CRM database. This state stays in
existence after the processes finish.
Process state. A business process may persist its state; this allows the service to
stop and restart a process. The business process saves its state and stops. When a
message is received for that process, the process is started again; it retrieves its
state and handles the request. We call the persisted state of a process the process
state. The process state exists only for the duration of the process; when the
process is finished, the process state is removed.
Message state. When messages are sent from one service to another, they are
created in memory. Then they are transported, and finally created in memory on
the other side. However, when using a store and forward mechanism such as a
queue, the message is persisted and then transported, persisted on the other side
and removed at the source. On the receiving side, a similar process takes place. The
message state exists while the message is being sent.
As stated earlier, processes move from one consistent state to the next; the combination
of all three types of state must be consistent after each step.
A process executes business activities in a step-by-step manner. Processes may update
their durable state, they may send messages, and they may persist their own state. Each
step may consist of multiple actions and each step takes the process from one consistent
state to the next consistent state. The state is considered consistent if the combination of
the permanent state, the process state, and the message state is consistent. For example, if
you want to do a debit and a credit posting, the state would be consistent if the debit
posting is done and the process state has the information that the credit posting still needs
to be done. Likewise, the state is consistent if the order is accepted and the process has the
information that the CRM and the production system need to be updated, or
if the order is accepted and the messages to the CRM system and the
production system have been sent.
Figure 3.1 illustrates how a process service steps through a series of transactions.
Figure 3.1: Transactional steps in a process [7]
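The debit and credit example above can be sketched in code. The following is a minimal illustration, not taken from [7]: it assumes a SQLite database, and the table names `postings` and `process_state` as well as the function names are hypothetical. Each step updates the permanent state and the process state inside one ACID transaction, so a failure in the middle of a step leaves the previous consistent state untouched.

```python
import sqlite3


def make_db():
    """Create an in-memory database holding permanent state and process state."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE postings (account TEXT, amount INTEGER)")
    db.execute(
        "CREATE TABLE process_state (process_id TEXT PRIMARY KEY, next_step TEXT)"
    )
    return db


def debit_step(db, process_id, account, amount, fail=False):
    """Step 1: post the debit and record that the credit posting is still pending."""
    with db:  # commits on success, rolls back if an exception is raised
        db.execute("INSERT INTO postings VALUES (?, ?)", (account, -amount))
        if fail:
            raise RuntimeError("simulated crash in the middle of the step")
        db.execute(
            "INSERT OR REPLACE INTO process_state VALUES (?, ?)",
            (process_id, "credit"),
        )


def credit_step(db, process_id, account, amount):
    """Step 2: post the credit and remove the process state (process finished)."""
    with db:
        db.execute("INSERT INTO postings VALUES (?, ?)", (account, amount))
        db.execute("DELETE FROM process_state WHERE process_id = ?", (process_id,))
```

After `debit_step` succeeds, the database holds the debit posting together with the note that the credit is pending, which is exactly the intermediate consistent state described in the text.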
Services contain state, and the current state is always kept inside the service, protected
from outsiders. Services not only work with the data that they manage themselves; in
many cases, they use data that is obtained from another service [7].
Services exist to guard state. You allow other services to access the state in your service
through a number of well-defined interfaces. For example, a message is sent to a service,
which includes a request to access state stored within the service. The service may then
change some of that state or send a copy of it back to the requesting service.
Messages may be sent or received within an ACID transaction; however, it is not
practical to wait a long time for a response within an atomic transaction.
3.2 Need for State Management
State management is the process by which state is maintained over multiple requests for
the same or different web services. As is true for any HTTP-based technology, Web
Services are stateless, which means that they do not automatically indicate whether the
requests in a sequence are all from the same client.
Resource state includes any piece of information or data that affects the behavior of the
resource, for example catalogs, shopping carts, user options, lists of reviews, and hit
counters. State management can be complex because a wide variety of usage patterns,
data types, and access methods are available to application developers building state-
based solutions [8].
There are several situations where state is needed. Stateful solutions exist, but
technologies that are designed to be stateless, such as the HTTP
protocol [3] and Web Services, make it difficult to model state on top of them.
These protocols and technologies do not keep track of a sequence of requests. State
management is nevertheless difficult to avoid in a number of situations. One situation is establishing a
session between a consumer and a provider. A session is typically established for
efficiency reasons. For example, sending a security certificate with each request is a
serious burden for both the consumer and the provider. It is much quicker to replace the
certificate with a token shared just between the consumer and provider. Another situation
is providing a customized service. Let's illustrate this with an example.
Consider an e-commerce application that exposes web services to its clients, which
interact with it through these web services. For example, a B2B site needs to obtain the list
of products from vendors, along with the prices of these products. These details can be retrieved at
runtime, on user request, using web services. If the user wants to buy a product, he
may be asked to log in. Once logged in, he can add products from
different vendors to his cart and finally check out. This whole interaction needs
to be managed so that it can be determined which products this particular user has chosen
to buy. If web services are used in the back end, maintaining this state is a problem, because
web services provide stateless client-server interactions over the stateless
HTTP protocol. Stateless means that client requests are independent and no memory of
previous requests is required. This approach mirrors file server designs like NFS [9] and
simplifies failure recovery: no dialogue is required with former clients after a crash. But
modern software such as P2P applications frequently needs state maintenance, and state must
be maintained on both sides since the communicating entities are peers. Therefore, a flexible
and efficient state maintenance capability is necessary.
There are generally two kinds of state: session and application state [8]. Recent requests
from a particular client are in the same session. Session state is only visible within that
particular session. Website shopping carts are examples of session state. Session state is
per-user while application state is per-application. Note that web servers often support
multiple applications simultaneously. Application state is shared across sessions within
the same application but is not shared between distinct applications or instances.
Application state is typically maintained in an external database. A variety of techniques
are used to implement session state. The standard solution is to store most application and
session state in a database. But external databases are not well-suited for storing
application state for certain types of increasingly important applications. Pervasive
computing and sensor-based environments often involve dynamic, P2P interactions.
Many such systems can be viewed as managing streams of potentially high-volume data.
For example, surveillance applications require coordinating and processing many sensor
streams (video, audio, motion data). Such applications require:
(1) Performance: Database storage may be too slow to satisfy throughput and
latency requirements of the applications we consider.
(2) Fault-tolerance: Web services are frequently implemented using a single
shared web server resource. Web services are sometimes ill-behaved, requiring
occasional maintenance and restart of web servers. Memory-based application state will
be lost by these administrative restarts. In general, web services should provide some
form of fault-tolerance.
(3) Persistence: While much data in pervasive computing environments is
temporary, it is important to persist data periodically for subsequent analysis and
processing.
Different mechanisms are used to manage state in a stateless environment [8]; the choice
depends on the needs and requirements of the application. Sometimes state is
maintained in memory, but this is fragile: an application crash, power failure, or
any other kind of failure loses all state information held in memory. Sometimes
separate applications called state servers manage state. This requires inter-process
communication, which can be too intensive and resource hungry for real-time applications
and degrades performance. Another disadvantage is that a reboot or shutdown of the state
server causes loss of state. Databases are also used to save state. Database-backed session
state is normally feasible for applications that do not require real-time processing, such as
e-commerce applications in which a user logs in, performs his tasks, buys something, and then logs out.
Applications of this kind need the state of that user: what he has viewed, which items
he has removed, what he has purchased, and so on. This kind of information can
be stored in a database. Let's look at these different approaches in detail.
3.3 State Management in Web Services
Solving the state management problem with fault tolerance in web services is complex,
since several factors need to be considered. How state is implemented
in a web service environment depends on the requirements and the scenario.
Another consideration is which resource's state is to be maintained. Normally,
when calls to a service need to be tracked, it is the user's state that is managed. Several
mechanisms have been proposed. For example, the server can generate a token, store user-related
information against it in a repository, and return that token to the client. The client then provides that
token in each subsequent call to identify itself, and the server retrieves the client's state information
from the repository. Note that a repository is needed to store the state.
This repository can be the server's memory, a separate server, a database, or any other
store. Each repository has its own pros and cons. Let's start with in-memory state
management.
In-memory state management is suitable only for short periods and for
resources that are not crucial. This approach does not survive a
server shutdown or restart, so a proper mechanism is needed to handle those events.
Another possibility is that a separate state server manages the state for the main server. This
approach has its own consequences. The main server can be restarted without
losing state, but if the state server is restarted the same problem arises.
A database is also a good option: state-related data is stored in the database. This
approach provides fault tolerance, since a server restart will not lose state-related data.
It is common for an e-commerce application to maintain state information by using a
relational database for the following reasons [10]:
Security
Personalization
Consistency
Data mining
The following are typical features of a database supported solution:
Security: The visitor types an account name and password into an application
logon dialog. The application infrastructure calls a web service, which in turn
queries the database with the logon values to determine whether the user has
rights to use the application. If the database validates the user information, the
application issues a valid token containing a unique ID for that user on
that client computer. The application then grants access to the user.
Personalization: With security information in place, the application can distinguish
each user by reading the token on the client computer. Typically, applications
have information in the database that describes the preferences of a user
(identified by a unique ID). This relationship is known as personalization. The
application can look up the user's preferences using the unique ID contained in
the token, and then place content and information in front of the user that pertains
to the user's specific wishes, reacting to the user's preferences over time.
Consistency: If anyone has created a commerce application, he might want to
keep transactional records of purchases made for goods and services. This
information can be reliably saved in the database and referenced by the user's
unique ID. It can be used to determine whether a purchase transaction has been
completed, and to determine the course of action if a purchase transaction fails.
The information can also be used to inform the user of the status of an order
placed using the application.
Data mining: Information about application usage, application visitors, or product
transactions can be reliably stored in a database. For example, a business
development department might want to use the data collected from the application
to determine next year's product line or distribution policy. The marketing department
might want to examine demographic information about users. Engineering and
support departments might want to look at transactions and note areas where the
purchasing process could be improved. Most enterprise-level relational databases,
such as Microsoft SQL Server and Oracle, contain an extensive toolset for most
data mining projects.
By designing the application to repeatedly query the database by using the unique Id
during each general stage in the above scenario, the application maintains state. In this
way, the user perceives that the application is remembering and reacting to him or her
personally.
The database-driven approach described last is presented in more detail, with an abstract
model, in the next chapter.
Chapter 4
Best Practices of State Management
Online transactions require state to persist across many exchanges with the server.
A session creates a logical connection that maintains state between client and server on top of
web services, which are connectionless and stateless. The information relevant to a
particular session of a specific user is known as the session state.
Session state management is all about associating a web service request
with previous requests generated from the same session, as these requests appear
unrelated to the web service because of its connectionless and stateless
nature.
4.1 Stateful Model
Organizations have been using different models, according to their needs, to manage state in
stateless environments. State management over HTTP, itself a
stateless protocol, has long been in practice. There is no standard model for maintaining state in a
stateless environment, but a common generalized model can be proposed for certain environments
such as Web Services. Research organizations are working to propose a model and to
standardize state management using web services. The Web Services Resource
Framework (WSRF) [11] is a first step in this direction; it was proposed to manage
resources and their state through a web service interface. The framework also relies on
proposed enhancements to other web service technologies such as WSDL [5],
including WSDL 2.0 [12]. The problem with WSRF is that its compatibility with
the WS-I architecture is very limited, because of which it was ignored and major vendors like
Microsoft and Sun chose to adopt alternatives.
To manage state in the current web service environment, software architects and developers
use several methods according to their needs. We will discuss a generalized model for
managing state in web services, especially for maintaining the state of a user conversation, in
other words the user's session state.
4.2 State Management Techniques
SOAP is a connectionless specification for how a web service consumer communicates
with a web service. Due to the connectionless and statelessness nature of the web service,
session management is the challenge that developers come across while building Web
applications using web services. From the viewpoint of the Web server, each web service
request appears as though it is a separate and distinct request, unrelated to any previous
requests. That means the information a user gives on one web service request is not
automatically available on the next web service requested. The inability of the web
services to retain knowledge of previous requests means it is difficult to write
applications, such as an online catalog, where the application might need to track the
catalog items a user has selected while jumping between the various pages of the catalog.
For many reasons, such as data security, size limit, durability, etc., one might want to
implement a better and more robust session management technique. There are many
techniques used to manage session state, all of which require session state information to
be explicitly passed between the web service consumer and the web service. This
information can be a unique identifier of the session, generally called a session Id. There are
different techniques for storing the session Id. For example, the session Id can be stored on the
client side, which consumes the web service, and the rest of the data, such as user
information, can be stored on the server side, which exposes the web services [13].
4.2.1 In-memory session
When a user requests a login, the server internally generates a session Id using a
complex algorithm. At the start of a new session, the server returns the session Id to
the client [13] and keeps a reference to it in its memory. Though the in-memory session
object works well in a single-server environment, it is not very useful in a farm [14]. In a
farm, where there is a cluster of servers, web service requests are routed to each server in
a round-robin fashion to distribute load and handle more requests. The session
object is tied to a single server and is not shared among servers in the farm. Because
requests from the same user can be routed to any available server for load balancing,
session information can be lost between requests. In order to use the session
object in a clustered server environment, we can dedicate a single server to handle all
requests from a user for the lifetime of the session. In doing so, however, we
compromise scalability, as load is no longer distributed evenly among the
servers.
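The effect of round-robin routing on in-memory sessions can be reproduced with a small simulation. This is only a sketch under stated assumptions: the `Server` class, the two routing functions, and the use of a CRC32 hash to pin a session to one server are all hypothetical choices made for illustration, not something prescribed by [13] or [14].

```python
import itertools
import zlib


class Server:
    """A server that keeps session data only in its own memory."""

    def __init__(self, name):
        self.name = name
        self.sessions = {}

    def handle(self, session_id, item=None):
        # Look up (or create) the cart for this session on THIS server only.
        cart = self.sessions.setdefault(session_id, [])
        if item is not None:
            cart.append(item)
        return list(cart)


def round_robin(servers):
    """Route each request to the next server in turn, ignoring the session."""
    cycle = itertools.cycle(servers)
    return lambda session_id: next(cycle)


def sticky(servers):
    """Route every request of a session to the same server."""
    # crc32 is stable across runs, unlike the built-in hash() for strings.
    return lambda session_id: servers[zlib.crc32(session_id.encode()) % len(servers)]
```

With `round_robin`, a second request for the same session Id lands on a different server and finds an empty cart; with `sticky`, the cart is found again, at the cost of uneven load when some sessions are much busier than others.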
4.2.2 Databases
In view of the limitations of in-memory session objects, let's look at the technique in
which session management is moved to a database server accessible to all other servers in
the farm. Any database can be used to maintain session state. The main advantage of
storing session state in the back-end database is that state information is durable and
one can store whatever state information is needed [13]. Each user session is given a unique
identifier that serves as a key to the user's information in the database. This key is
normally called a Session Id.
Session information is stored based on the unique identifier, which is generated every
time a new session starts. So, even if the same user logs on twice in two different
sessions, each session receives its own identifier and, therefore,
information associated with one session is not accessible from the other session.
The drawback of storing session information in a database is that it puts a greater load on
the server, because the application requires more time-consuming database transactions.
On the other hand, the technique limits security risk: the client stores only the unique
identifier, while other sensitive data stays in the database. It is generally better to put
greater load on the server than to risk
security.
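A database-backed session store of this kind might look as follows. The sketch assumes SQLite and a hypothetical `session` table with a JSON-encoded cart; the class and method names are illustrative only. Because the servers keep no session data themselves, any server in the farm can serve any request.

```python
import json
import sqlite3
import uuid


def open_store():
    """Create the shared session database (in memory here, for illustration)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE session (session_id TEXT PRIMARY KEY, user TEXT, cart TEXT)"
    )
    return db


class FarmServer:
    """Any server in the farm; it holds no session data of its own."""

    def __init__(self, db):
        self.db = db

    def login(self, user):
        session_id = str(uuid.uuid4())  # a new identifier for every session
        with self.db:
            self.db.execute(
                "INSERT INTO session VALUES (?, ?, ?)",
                (session_id, user, json.dumps([])),
            )
        return session_id

    def cart(self, session_id):
        row = self.db.execute(
            "SELECT cart FROM session WHERE session_id = ?", (session_id,)
        ).fetchone()
        if row is None:
            raise KeyError("unknown session")
        return json.loads(row[0])

    def add_to_cart(self, session_id, item):
        cart = self.cart(session_id)
        cart.append(item)
        with self.db:
            self.db.execute(
                "UPDATE session SET cart = ? WHERE session_id = ?",
                (json.dumps(cart), session_id),
            )
```

If the same user logs in through two servers, the two sessions get different identifiers and separate carts, while either server can read either session from the shared table.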
4.3 Generalized Model
As discussed, there are several techniques for state management. Let's now discuss a
generalized model that is in common use for managing state. First, a token generator is
needed, which generates a unique token or identifier for each client. The token acts as a
key for the client. It can be a universally unique ID such as a GUID or UUID. Then, to
maintain state, there must be a repository in which session state can be stored. Databases
are a good choice for this; the reasons why a database is a preferred
choice for state management were discussed earlier.
Now, let's look at how the whole process works. First, the user or client calls a login
web service and provides its credentials. The server first authenticates the user. If the user
is authenticated, the server asks the token generator to generate a unique token.
The token generator generates a new unique token. The server then stores the user's
session information against that token in the repository or database, and finally
returns the token to the client. The client subsequently calls further web services,
providing the token each time. If the server exposes a web service, say GetAppointments, the client
calls this web service and also provides the token to identify itself as an authorized user.
The server accepts the token and verifies it against the token stored in the database. If the token is
verified, the server returns the desired information to the client.
This is a simplified mock-up. There are several issues that have to be addressed to
implement this generalized model properly.
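The login and request flow of the generalized model can be sketched in a few lines. Everything named here is an assumption for illustration: the `USERS` and `APPOINTMENTS` dictionaries stand in for a real credential store and real business data, a Python dictionary stands in for the repository, and the plain-text password comparison is a deliberate simplification.

```python
import uuid

USERS = {"alice": "secret"}                      # hypothetical credential store
APPOINTMENTS = {"alice": ["Mon 10:00 dentist"]}  # hypothetical business data


class Server:
    def __init__(self):
        # Repository: token -> session state. A database table in practice.
        self.repository = {}

    def login(self, username, password):
        """Authenticate, generate a token, store session state, return the token."""
        if USERS.get(username) != password:
            raise PermissionError("authentication failed")
        token = str(uuid.uuid4())  # token generator: a random UUID
        self.repository[token] = {"user": username}
        return token

    def get_appointments(self, token):
        """Verify the token against the repository, then serve the request."""
        session = self.repository.get(token)
        if session is None:
            raise PermissionError("invalid token")
        return APPOINTMENTS.get(session["user"], [])
```

A client would call `login` once and then pass the returned token to `get_appointments` and any other service call; a call with an unknown token is rejected.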
[Diagram participants: Client, Server, Repository, Token generator; steps 1 to 4]
Login Process
1 – Client calls a web service
2 – Server gets a unique token from token generator
3 – Store that token in repository against client credentials
4 – Returns the token to client which client will use in subsequent calls
Figure 4.3: Login Process
[Diagram participants: Client, Server, Repository; steps 1 to 3]
Service request providing token
1 – Client calls a web service providing token
2 – Server matches the token in repository and if verified then gets user and requested information from database
3 – Respond the client with information requested
Figure 4.4: Service request providing token
4.3.1 Functional Design
Figure 4.3 shows the sequence of events that take place with respect to session
management in various layers [13].
Figure 4.3: Sequence diagram for session management [13]
4.3.2 Middle Tier
The session management code serves application use cases, such as account
management and order purchase, by providing read and write services. Session state is
always read and written as a whole. For instance, the account management service reads
the whole session state and extracts only the user information to operate on it. It then
modifies the session state by including the updated user information and sends the whole
session state to the database.
The middle-tier components explicitly generate a globally unique identifier (GUID) that
is used as the unique session identifier for every user who logs on to the system and
associate it with the user's session information [13].
4.3.3 Data Layer
In this technique the database layer maintains a Session table to store the session data. The
key to a user's session data is the unique session Id assigned to him by the middle-tier
components when he logs on to the system. The data layer exposes certain interfaces
for accessing the Session table. These interfaces are called by the middle-tier
components to insert or update session information in the Session table [13].
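The division of work between the middle tier and the data layer could be sketched as below. This is an illustrative assumption rather than the design from [13]: the session state is stored as one JSON document per row, and the names `SessionTable`, `log_on`, and `update_user_info` are hypothetical.

```python
import json
import sqlite3
import uuid


class SessionTable:
    """Data layer: the only code that touches the Session table."""

    def __init__(self):
        self.db = sqlite3.connect(":memory:")
        self.db.execute(
            "CREATE TABLE Session (session_id TEXT PRIMARY KEY, state TEXT)"
        )

    def insert(self, session_id, state):
        with self.db:
            self.db.execute(
                "INSERT INTO Session VALUES (?, ?)", (session_id, json.dumps(state))
            )

    def update(self, session_id, state):
        with self.db:
            self.db.execute(
                "UPDATE Session SET state = ? WHERE session_id = ?",
                (json.dumps(state), session_id),
            )

    def read(self, session_id):
        row = self.db.execute(
            "SELECT state FROM Session WHERE session_id = ?", (session_id,)
        ).fetchone()
        return json.loads(row[0]) if row else None


def log_on(table, user_name):
    """Middle tier: generate the GUID and store the initial session state."""
    session_id = str(uuid.uuid4())
    table.insert(session_id, {"user": {"name": user_name}, "order": []})
    return session_id


def update_user_info(table, session_id, **changes):
    """Middle tier: read the whole state, change the user part, write it all back."""
    state = table.read(session_id)
    state["user"].update(changes)
    table.update(session_id, state)
```

Reading and writing the state as a whole keeps the data layer interface small, at the price of transferring the full document even when only one field changes.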
4.4 Issues to Address
There are several issues which must be resolved. When implementing this model one
might ask:
Why generate a token rather than keep a list of allowed IP addresses?
What about false token generation?
What if the token gets stolen?
Is there a token lifetime?
Most of the issues are security related. Security is definitely a concern when
managing user session state, but security experts have advised best practices for these
kinds of problems.
4.4.1 Security Issues
One frequently asked and important question is why a randomly generated
unique token or identifier should be used for client authentication instead of a list of
allowed IP addresses.
The answer is that keeping a list of IP addresses works when you know who your
clients are, what their IP addresses are, and that those addresses will not change. This can
be implemented where the number of clients is small. When there is a huge number of clients,
a token is necessary. Moreover, if a client connects to the network over a dial-up
connection, it gets a different IP address each time it connects, so a list of
IP addresses is not enough to authorize a client. Another factor is that more
than one user can be logged in from the same host or IP address, which is normally the case for
users behind a firewall. If only a list of IP addresses is maintained, two
different users logged in from the same host or IP address cannot be told apart, and their
information will be jumbled together. So a
unique token is required for each user session.
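The collision between two users behind the same address is easy to demonstrate. In this sketch the two functions and the dictionary-based session stores are hypothetical; the IP address comes from the range reserved for documentation.

```python
import uuid


def login_by_ip(sessions, ip, user):
    """Key the session by IP address: a second user on that IP overwrites the first."""
    sessions[ip] = {"user": user}
    return ip


def login_by_token(sessions, ip, user):
    """Key the session by a fresh random token: each login gets its own entry."""
    token = str(uuid.uuid4())
    sessions[token] = {"user": user, "ip": ip}
    return token
```

With `login_by_ip`, two users logging in from `203.0.113.7` share a single entry and the first user's state is lost; with `login_by_token`, both sessions coexist even though they come from the same address.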
Hackers constantly try to break security by any means, whether by sniffing
network traffic to capture the user conversation, by breaking the security key, or otherwise.
Hackers may be able to guess or generate a false token, but this depends entirely on how hard
the token generation algorithm and process make it to produce one. The other two issues
are related to this, and there are different practices for them. Use of SSL is
recommended to prevent hackers from sniffing the conversation, together with a short timeout
for a session. If the user is idle for a few minutes, expire the token; that is, the token or session
lifetime should be short.
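Two of these practices, hard-to-guess tokens and a short idle timeout, can be sketched together. The example assumes Python's standard `secrets` module as the token generator and a five-minute timeout; the class name, the timeout value, and the injectable clock are illustrative assumptions. Transport security (SSL) is outside the scope of the sketch.

```python
import secrets
import time

IDLE_TIMEOUT = 300  # seconds of inactivity before a token expires (assumed value)


class TokenSessions:
    def __init__(self, clock=time.monotonic):
        self.clock = clock   # injectable so the timeout can be tested
        self.last_seen = {}  # token -> time of the last valid request

    def issue(self):
        """Generate a cryptographically strong random token."""
        token = secrets.token_urlsafe(32)
        self.last_seen[token] = self.clock()
        return token

    def is_valid(self, token):
        """Accept known tokens that have not been idle for too long."""
        seen = self.last_seen.get(token)
        if seen is None:
            return False  # unknown or forged token
        if self.clock() - seen > IDLE_TIMEOUT:
            del self.last_seen[token]  # expired: forget the session
            return False
        self.last_seen[token] = self.clock()  # activity resets the idle timer
        return True
```

Each valid request resets the idle timer, so an active user keeps the session while an abandoned or stolen token stops working shortly after the last legitimate use.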
There are also other considerations, such as the criteria on which the server,
once the client is authenticated, should return the results of requests. This depends entirely on the application that
exposes the web services: how and on what criteria
information is returned to or accepted from the client is part of its internal business logic. One practice that
software architects normally follow is to store some information about the user and his session
in a database against the session Id. This information can then be used in the criteria for
accepting or returning information.
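A minimal sketch of this practice, using Python's built-in sqlite3 module, is given below. The table layout, the role column and the may_view_orders rule are assumptions invented for the example; a real application would apply its own business logic.

```python
import sqlite3

# In-memory database so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sessions ("
    "session_id TEXT PRIMARY KEY, user_name TEXT NOT NULL, role TEXT NOT NULL)"
)


def save_session(session_id, user_name, role):
    """Store user information against the session Id at login time."""
    with conn:
        conn.execute(
            "INSERT INTO sessions VALUES (?, ?, ?)", (session_id, user_name, role)
        )


def may_view_orders(session_id, order_owner):
    """Hypothetical rule: owners see their own orders, admins see all."""
    row = conn.execute(
        "SELECT user_name, role FROM sessions WHERE session_id = ?", (session_id,)
    ).fetchone()
    if row is None:
        return False  # unknown session Id: reject the request
    user_name, role = row
    return role == "admin" or user_name == order_owner


save_session("tok-alice", "alice", "customer")
save_session("tok-root", "root", "admin")
```

The web service method only receives the session Id; everything it needs to decide what to return is looked up from the stored session record.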
A client request is identified by its session Id. If a hacker were able to capture the session
Id in use by an active session, he could submit valid requests and, hence, access the user's
private data [13].
You may choose among several possible countermeasures to this problem. The
session data for any session should not be stored for long on the server, that is, the
lifetime should be short.
The longer the session Id is stored, the higher the probability that it will be stolen. The session
Id can be changed periodically to reduce exposure, without shortening the lifetime of the
associated session. The frequency of changing the session Id should be a factor to
consider when using this method. SSL is also an option to keep the conversation secure,
but it degrades performance because of the extra encryption and decryption overhead.
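Changing the session Id without ending the session could look roughly like the sketch below. The rotate_session_id function and the dictionary used as a session store are hypothetical; the only point shown is that the state moves to a new Id and the old Id stops working.

```python
import secrets


def rotate_session_id(sessions, old_id):
    """Re-key the session state under a fresh Id and invalidate the old one."""
    state = sessions.pop(old_id)  # raises KeyError if old_id is not a live session
    new_id = secrets.token_urlsafe(32)
    sessions[new_id] = state
    return new_id


# Hypothetical in-memory store: session Id -> session state.
sessions = {"old-id": {"user": "alice", "cart": ["book"]}}
new_id = rotate_session_id(sessions, "old-id")
```

A captured Id is therefore useful to an attacker only until the next rotation, while the user keeps the same cart and login state. The new Id must of course be returned to the client over the same protected channel.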
4.4.2 Session Lifetime
In general, the lifetime of a session starts when the user first visits, and ends
when he leaves. In login-based schemes, a session starts when a user logs on. The session
ends when the user finally leaves by logging out [13] or when it expires.
In online shopping applications, users typically spend most of their time
browsing the catalog before adding a single item to the shopping cart, and often leave the site after
browsing the catalog without selecting any items for purchase. Maintaining session state
for such users is unnecessary, because there really isn't any data that needs to be saved.
For this reason, the online application starts the session only after the logon process has
been completed. This helps to avoid maintaining unnecessary session state.
The session state is maintained in the database and removed on a daily basis, since a user
might leave without logging out or might be idle for a long time. This is necessary for
the security reasons discussed in the previous section. If there is a need for
longer sessions, then session lifetime management can always be extended
by implementing a way for users to return to their previous sessions at a
later time, thereby increasing the lifetime of a session. In this case, the session ends when
the session state is removed from the system, preventing the user from returning to their
earlier session.
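The daily removal of session state described above can be sketched with sqlite3 as follows. The table layout, the last_access column and the one-day limit are assumptions made for this example; the function would be called once a day by whatever scheduler the application uses.

```python
import sqlite3

# Assumed limit: session rows untouched for more than one day are removed.
MAX_AGE_SECONDS = 24 * 60 * 60


def purge_stale_sessions(conn, now):
    """Delete sessions not accessed within MAX_AGE_SECONDS; return the count."""
    cutoff = now - MAX_AGE_SECONDS
    with conn:
        cursor = conn.execute(
            "DELETE FROM sessions WHERE last_access < ?", (cutoff,)
        )
    return cursor.rowcount


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sessions (session_id TEXT PRIMARY KEY, last_access REAL NOT NULL)"
)
now = 1_000_000.0  # fixed timestamp (seconds) so the example is repeatable
conn.executemany(
    "INSERT INTO sessions VALUES (?, ?)",
    [("fresh", now - 3600), ("stale", now - 2 * MAX_AGE_SECONDS)],
)
removed = purge_stale_sessions(conn, now)
```

Once a row is deleted, the corresponding session Id no longer resolves to any state, which is the point at which the session truly ends.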
Conclusion
This study tried to assess different approaches toward building session-based or stateful
web services. In general, it is recommended that web services be designed according to
the principles of a service-oriented architecture. However, it is sometimes desirable to
build services capable of referencing each other, which may lead to a finer-grained,
session-oriented services design. When building a new service, it is worth considering
carefully the pros and cons of all design styles, which can result in a better integration
solution for a targeted domain.
REFERENCES
[1] Michael C. Daconta, Leo J. Obrst, Kevin T. Smith. Chapter 4: Understanding Web
Services. The Semantic Web: A Guide to the Future of XML, Web Services, and
Knowledge Management. Wiley Publishing, Inc. 2003
[2] Extensible Markup Language (XML) – “http://www.w3.org/TR/xml/” Retrieved on
September 8, 2006
[3] Fielding, Gettys, Mogul, et al. Hypertext Transfer Protocol – HTTP/1.1 RFC 2616,
Internet Society, 1999. "http://www.w3.org/Protocols/rfc2616/rfc2616.html" Retrieved
on September 8, 2006
[4] Simple Object Access Protocol (SOAP) – “http://www.w3.org/TR/SOAP” Retrieved
on September 8, 2006
[5] Francisco Curbera, Matthew Duftler, Rania Khalaf, William Nagy, Nirmal Mukhi,
and Sanjiva Weerawarana. Unraveling the Web Services Web: An Introduction to SOAP,
WSDL, and UDDI, pages 86-93, IEEE Internet Computing
[6] Doug Tidwell, James Snell, Pavel Kulchenko. Chapter 2: Introducing SOAP.
Programming Web Services with SOAP. O'Reilly December 2001
[7] Microsoft Developers Network (MSDN) – Application Architecture: Conceptual
View – Introduction to State “http://windowssdk.msdn.microsoft.com/en-us/library/z1hkazw7.aspx”
Retrieved on September 22, 2006
[8] Xiang Song, Namgeun Jeong, Phillip W. Hutto, Umakishore Ramachandran, James
M. Rehg - State Management in Web Services, Proceedings of the 10th IEEE International
Workshop on Future Trends of Distributed Computing Systems.
[9] B. Callaghan. NFS Illustrated. Addison-Wesley, 2000.
[10] Microsoft Developers Network (MSDN) - ASP.NET State Management
Recommendations “http://windowssdk.msdn.microsoft.com/en-us/library/z1hkazw7.aspx”
Retrieved on October 2, 2006
[11] Czajkowski D, Ferguson F, Foster I, Frey J, Graham S, Sedukhin I, Snelling D,
Tuecke S, Vambenepe W. The WS-Resource Framework, 2005.
“http://www.globus.org/wsrf/specs” Retrieved on October 15, 2006
[12] E. Christensen, F. Curbera, G. Meredith, and S. Weerawarana. (2001) Web Services
Description Language (WSDL). [Online]. “www.w3.org/TR/wsdl” Retrieved on October
25, 2006
[13] Microsoft Developers Network (MSDN) – Designing Session Management,
Duwamish Online
[14] Server Farm – “http://www.webopedia.com/TERM/S/server_farm.html” Retrieved
on November 6, 2006