Stateful Web Services - Full Report

Stateful Web Services

An Independent Study Report - I

SUBMITTED TO THE FACULTY OF COMPUTER SCIENCES

ON GRADUATE STUDIES OF

SHAHEED ZULFIKAR ALI BHUTTO INSTITUTE OF SCIENCE & TECHNOLOGY

IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF

MASTER OF SCIENCE (COMPUTER SCIENCE)

Muhammad Jawaid Shamshad MS/PhD (CS) 052210

December 2006

Supervised by

Aslam Parvez Memon



To my Parents,

who contributed most in my studies.



Acknowledgements

In this era of technology with ever-increasing scope and compressed schedules,

acknowledging the contributions of everyone involved along the way is more and more

important. All too often, we move on to new projects without remembering to thank the

people who have helped us on the current one. A contribution that might on the surface

seem small can in fact make or break a project or present a fresh way of solving a

problem. It's important not only to thank people personally but also to find every

opportunity to recognize their contributions publicly.

The author's study of "Stateful Web Services" required a great deal of effort, and the
author wishes to thank his colleagues and especially his advisor, Sir Aslam Parvez Memon,
who gave his precious advice in conducting this study.

Muhammad Jawaid Shamshad

MS/PhD (CS) 052210



TABLE OF CONTENTS

CHAPTER 1 - INTRODUCTION
    1.1 LITERATURE REVIEW
    1.2 RESEARCH METHOD
    1.3 PROBLEM
    1.4 NEED
    1.5 MODEL IN PRACTICE

CHAPTER 2 - WHAT ARE WEB SERVICES?
    2.1 WEB SERVICE
    2.2 SOAP
    2.3 WSDL
    2.4 DISCOVERING WEB SERVICE
        2.4.1 UDDI
        2.4.2 EBXML REGISTRIES

CHAPTER 3 - PROBLEM
    3.1 INTRODUCTION TO STATE
    3.2 NEED FOR STATE MANAGEMENT
    3.3 STATE MANAGEMENT IN WEB SERVICES

CHAPTER 4 - BEST PRACTICES OF STATE MANAGEMENT
    4.1 STATEFUL MODEL
    4.2 STATE MANAGEMENT TECHNIQUES
        4.2.1 IN-MEMORY SESSION
        4.2.2 DATABASES
    4.3 GENERALIZED MODEL
        4.3.1 FUNCTIONAL DESIGN
        4.3.2 MIDDLE TIER
        4.3.3 DATA LAYER
    4.4 ISSUES TO ADDRESS
        4.4.1 SECURITY ISSUES
        4.4.2 SESSION LIFETIME

CONCLUSION

REFERENCES



TABLE OF FIGURES

FIGURE 2.1: STRUCTURE OF A SOAP MESSAGE
FIGURE 2.2: THE SOAP MESSAGE STRUCTURE
FIGURE 2.3: THE SOAP REQUEST
FIGURE 2.4: THE SOAP RESPONSE
FIGURE 2.5: WSDL ABSTRACT DESCRIPTION
FIGURE 2.6: WSDL'S CONCRETE BINDING INFORMATION
FIGURE 2.7: AN EBXML ARCHITECTURE IN USE
FIGURE 3.1: TRANSACTIONAL STEPS IN A PROCESS
FIGURE 4.1: LOGIN PROCESS
FIGURE 4.2: SERVICE REQUEST PROVIDING TOKEN
FIGURE 4.3: SEQUENCE DIAGRAM FOR SESSION MANAGEMENT



Executive Summary

Web services are by nature stateless. There are, however, situations in which the state
of certain resources, such as user sessions, must be maintained. This is normally
required when business and e-commerce applications based on user sign-on need to track
the state of their connected clients, while the web services they are built on provide
no implicit state management facility. This study presents a logical model for
maintaining the state of resources in web services.



Chapter 1

Introduction

1.1 Literature Review

The literature was collected from published research papers (for example, from ACM and
IEEE), from chapter excerpts of books (listed in the References section), and from the
Internet.

1.2 Research Method

This study presents a logical model for maintaining the state of resources in web
services. The study was conducted by first defining the problem domain, its need, and
its scope, and then discussing a generalized logical model in practice. The study plan
can be depicted as:

Background Review

Problem Domain

Requirements Specification

Generalized logical model in Practice

1.3 Problem

Web services are by nature stateless. There are, however, situations in which the state
of certain resources, such as user sessions, must be maintained. This is normally
required when business and e-commerce applications based on user sign-on need to track
the state of their connected clients, while the web services they are built on provide
no implicit state management facility.

1.4 Need

State management is difficult to avoid in a number of situations. One such situation is
establishing a session between a consumer and a provider. A session is typically
established for efficiency reasons. For example, sending a security certificate with
each request is a serious burden for both the consumer and the provider. It is much
quicker to replace the certificate with a token shared only between the consumer and the
provider. Services therefore need to manage this state, look after it, and ensure
through business logic that it is kept consistent and accurate. This state is the only
true and current source of information.

1.5 Model in Practice

The common practice for managing state is to associate a token with the resource. The
token is generated on the first request, after authentication, and is then passed along
in all further communication between client and server. A token can have a limited
lifetime and must be unique: no two resources may share the same token, because doing so
would result in a conflict. A common example of such a token is the session id in
login-based systems. On a login request a unique session id is generated and returned to
the client, which then uses it in further communication. The server stores information
against this session id to save the state of the client. The session id can be expired
on a logout request or after a specified period of inactivity.
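The token practice described above can be illustrated with a minimal in-memory sketch. The class name SessionStore, the 900-second idle timeout, and the check_credentials callback are assumptions made for this illustration only; the report itself does not prescribe an implementation.

```python
import secrets
import time


class SessionStore:
    """In-memory token store; a sketch, not a production session manager."""

    def __init__(self, idle_timeout=900.0, clock=time.monotonic):
        self._idle_timeout = idle_timeout  # allowed inactivity in seconds (assumed value)
        self._clock = clock                # injectable so that expiry can be tested
        self._sessions = {}                # token -> {"user", "last_seen", "data"}

    def login(self, user, password, check_credentials):
        """Authenticate first, then issue a fresh unique token."""
        if not check_credentials(user, password):
            return None
        token = secrets.token_hex(16)
        while token in self._sessions:     # regenerate on the (unlikely) collision
            token = secrets.token_hex(16)
        self._sessions[token] = {"user": user, "last_seen": self._clock(), "data": {}}
        return token

    def get(self, token):
        """Return the session for a token, or None if unknown or expired."""
        session = self._sessions.get(token)
        if session is None:
            return None
        now = self._clock()
        if now - session["last_seen"] > self._idle_timeout:
            del self._sessions[token]      # idle for too long: expire the token
            return None
        session["last_seen"] = now         # activity refreshes the lifetime
        return session

    def logout(self, token):
        """Expire the token explicitly."""
        self._sessions.pop(token, None)
```

The client would send the returned token with every later request, and the service would call get() before doing any work. Keeping the state in a dictionary ties it to one server process; a shared database is the usual alternative when several servers must see the same sessions.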



Chapter 2

What are Web Services?

2.1 Web Service

There are several definitions of web services, and it is difficult to settle on a single
one. The authors of “The Semantic Web” [1] define them as follows:

“Web services are software applications that can be discovered, described, and accessed

based on XML and standard Web protocols over intranets, extranets, and the Internet.”

Starting with the first part of the definition, “Web services are software applications”
expresses the main point: web services are software applications like any other,
performing specific tasks that depend on their implementation. In addition, these
applications are available on the web.

Next, according to the definition, web services “can be discovered, described, and
accessed based on XML and standard Web protocols”. This part states that web services
are built on XML [2], a worldwide accepted standard supported by the majority of
vendors. This is what makes web services interoperable, and interoperability is their
main focus. The standard web protocols include the Hypertext Transfer Protocol
(HTTP [3]), which is the underlying communication protocol. Web services thus use XML as
the syntax of their messages and HTTP to transfer those messages; this is the access
method. The message itself is a Simple Object Access Protocol (SOAP [4]) envelope, which
is in XML format.

Web services can be discovered dynamically through Universal Description, Discovery, and
Integration (UDDI [5]) registries. These registries keep a record of web services and
their descriptions, i.e. the syntax of the web services. UDDI provides the description
in Web Services Description Language (WSDL [5]) form, which is also an XML format.

The last part of the definition states that web services are available “over intranets,
extranets, and the Internet”. This means that web services can be public, private for an
organization's internal use, or shared between two partnering organizations in a B2B
solution. It is important to understand that web services are not only publicly
accessible to the world but can also be private, accessible only within an
organization's intranet.

Another important concept should be made clear: web services do not depend on user
interfaces. A web service is only an API that an application can call, for example a
flight schedule service to get information or an airline reservation service. Since
messages are passed in XML format, how the data is presented depends on the application
that displays it.

2.2 SOAP

Simple Object Access Protocol (SOAP) was created by Microsoft, DevelopMentor, IBM,
Lotus, and UserLand.

SOAP is an XML-based protocol for messaging and remote procedure calls (RPCs). XML was
adopted as the format of SOAP because it is universally accepted as a
platform-independent data encoding. SOAP uses existing transport protocols such as HTTP,
SMTP, and MQSeries to transfer messages or remote procedure calls.

Web services transfer XML messages in SOAP format. The message is wrapped in a SOAP
envelope, which contains a SOAP header and a SOAP body. The header contains
meta-information, and the body contains the actual message or remote procedure call
(RPC) in XML syntax. The SOAP envelope is sent over HTTP between web service consumers
and web service providers. Other protocols can be used, as mentioned above, but HTTP is
the usual choice. The W3C defines SOAP as “a lightweight protocol for exchange of
information in a decentralized, distributed environment” [4]. SOAP provides a standard
language for tying together client and server applications on different platforms: the
client application sends a SOAP request and the web service returns a SOAP response.

Despite its name, SOAP is not tied to object-oriented programming. A developer can
create a SOAP-based web service in C, Pascal, or any similar language that does not
support object-oriented programming. All that matters is that an application written in
such a language can parse and evaluate XML documents and can communicate over a
transport protocol that SOAP supports, such as HTTP.

SOAP has been adopted as the standard for web services, and the majority of vendors have
developed SOAP APIs for their products, such as Microsoft for its .NET platform and Sun
for Java, making the integration of software systems much easier.

Now let’s look at the SOAP message syntax and how it works. A SOAP message consists

of an envelope containing an optional header and a required body. Figure 2.1 shows a

SOAP envelope’s structure.



<SOAP:Envelope xmlns:SOAP="http://schemas.xmlsoap.org/soap/envelope/">
  <SOAP:Header>
    <!-- content of header goes here -->
  </SOAP:Header>
  <SOAP:Body>
    <!-- content of body goes here -->
  </SOAP:Body>
</SOAP:Envelope>

Figure 2.1: Structure of a SOAP message. The envelope features child elements that

contain the message header and body elements. [5]

The header contains information concerning how the message is to be processed. This

includes routing and delivery settings, authentication or authorization assertions, and

transaction contexts. The body contains the actual message to be delivered and processed.

Anything that can be expressed in XML syntax can go in the body of a message. This is

graphically depicted in Figure 2.2 [6].

Figure 2.2: The SOAP message structure [6]
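The rule that the header is optional and the body is required can be expressed as a small builder. This is a sketch using Python's standard xml.etree.ElementTree module; the function name make_envelope and the urn:example namespaces of the sample entries are hypothetical.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"


def make_envelope(body_children, header_children=None):
    """Build a SOAP 1.1 envelope with an optional Header and a required Body."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    if header_children:  # the Header is emitted only when there is content for it
        header = ET.SubElement(envelope, f"{{{SOAP_NS}}}Header")
        header.extend(header_children)
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")  # the Body is always present
    body.extend(body_children)
    return envelope


# Hypothetical header and body entries, for illustration only.
token = ET.Element("{urn:example:session}Token")
token.text = "abc123"
payload = ET.Element("{urn:example:app}Ping")

ET.register_namespace("SOAP", SOAP_NS)  # serialize with the prefix used in Figure 2.1
xml_text = ET.tostring(make_envelope([payload], [token]), encoding="unicode")
```

A header entry such as the session token above is one natural place to carry state information, because it travels with the message without changing the application payload in the body.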

Let's look at an example of a simple SOAP message, taken from the SOAP 1.1
specification. Figure 2.3 [5] shows a SOAP message that requests the last trade price of
the “DIS” ticker symbol. The SOAP envelope wraps everything in the message. The
encodingStyle attribute of the envelope indicates how the message is encoded, so that
the web service can read it. The SOAP body wraps the application-specific information,
here the call to GetLastTradePrice. A web service receives this information, processes
the request in the SOAP body, and can return a SOAP response.

<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
    SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
  <SOAP-ENV:Body>
    <m:GetLastTradePrice xmlns:m="Some-URI">
      <symbol>DIS</symbol>
    </m:GetLastTradePrice>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>

Figure 2.3: The SOAP request [5]
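As a rough sketch of how a client might produce the request in Figure 2.3 and prepare it for transport over HTTP, the following uses only the Python standard library. The endpoint URL is a placeholder and not a real service; the Content-Type and SOAPAction headers follow the usual SOAP 1.1 over HTTP convention. Nothing is actually sent.

```python
import urllib.request
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"
SOAP_ENC = "http://schemas.xmlsoap.org/soap/encoding/"


def build_trade_price_request(symbol, service_ns="Some-URI"):
    """Return the GetLastTradePrice envelope of Figure 2.3 as bytes."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    envelope.set(f"{{{SOAP_ENV}}}encodingStyle", SOAP_ENC)
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{service_ns}}}GetLastTradePrice")
    ET.SubElement(call, "symbol").text = symbol
    return ET.tostring(envelope, encoding="utf-8")


def build_http_request(url, soap_action, payload):
    """Wrap the envelope in an HTTP POST; nothing is sent until urlopen() is called."""
    return urllib.request.Request(
        url,
        data=payload,
        headers={
            "Content-Type": "text/xml; charset=utf-8",
            "SOAPAction": f'"{soap_action}"',
        },
        method="POST",
    )


# Placeholder endpoint and SOAPAction value, assumed for illustration.
request = build_http_request(
    "http://example.com/StockQuote",
    "Some-URI",
    build_trade_price_request("DIS"),
)
```

Passing the request object to urllib.request.urlopen() would perform the actual call; that step is left out because the endpoint here is fictitious.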

The SOAP response to the example stock price request is shown in Figure 2.4 [5]. The
response is syntactically the same as the request: it consists of an envelope that wraps
the message, it declares its encoding style, and it wraps the content of the message in
the SOAP body. Only the message inside the body is different. Under the SOAP-ENV:Body
tag, the message is wrapped in the GetLastTradePriceResponse tag, with the resulting
price in the Price tag.

<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
    SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
  <SOAP-ENV:Body>
    <m:GetLastTradePriceResponse xmlns:m="Some-URI">
      <Price>34.5</Price>
    </m:GetLastTradePriceResponse>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>

Figure 2.4: The SOAP response [5]
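On the client side, reading the result means unwrapping the envelope and the body to reach the Price element. The sketch below parses a well-formed copy of the response with Python's standard XML parser; the function name extract_price is an assumption made for the example.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"

RESPONSE = """\
<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
    SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
  <SOAP-ENV:Body>
    <m:GetLastTradePriceResponse xmlns:m="Some-URI">
      <Price>34.5</Price>
    </m:GetLastTradePriceResponse>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>
"""


def extract_price(response_xml):
    """Return the Price value of a GetLastTradePriceResponse, or None if absent."""
    root = ET.fromstring(response_xml)
    body = root.find(f"{{{SOAP_ENV}}}Body")
    if body is None:
        return None
    # Elements are matched by namespace URI, not by the prefix used in the text.
    price = body.find("{Some-URI}GetLastTradePriceResponse/Price")
    if price is None or not price.text:
        return None
    return float(price.text)
```

Matching on the namespace URI rather than on the SOAP-ENV or m prefixes keeps the client working when a service chooses different prefixes for the same namespaces.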



2.3 WSDL

SOAP is the communication language of web services, but speaking a universal language is
not very useful unless you can hold the basic conversations that let you achieve your
goals. How can we tell which messages must be exchanged to interact successfully with a
service? That role is filled by WSDL. The Web Services Description Language (WSDL) is
the way we describe the communication details and the application-specific messages that
can be sent in SOAP. WSDL, like SOAP, is an XML format, and it was developed by IBM and
Microsoft. The W3C defines WSDL as “an XML format for describing network services as a
set of endpoints operating on messages containing either document-oriented or
procedure-oriented information”. To know how to send messages to a particular web
service, an application can look at the WSDL and dynamically construct SOAP messages.
WSDL describes the operational information: where the service is located, what the
service does, and how to invoke it. The WSDL format is difficult to read, but it is not
really intended to be human-readable. Developers do not have to understand WSDL and SOAP
to create web services: when a developer creates a web service, most toolkits generate
the WSDL automatically. On the client side, tools read the WSDL and generate the code
for calling the web service, generally called a stub. The client application and the web
service can then communicate with each other.

A WSDL service description contains two pieces of information. The first is an abstract
interface, which is the application-level service description. The second is the
protocol-dependent detail that a client application must follow to access the service.
Both are necessary because similar application-level functionality is often deployed at
different end points with slightly different access protocol details. Separating the
description of these two aspects helps WSDL represent common functionality between
seemingly different end points.

The abstract description is defined in WSDL in terms of the messages exchanged between
the client and the web service. The abstract interface consists of three components: the
vocabulary, the message, and the interaction. The vocabulary is the type system that
provides the data type definitions used to exchange information. WSDL relies on external
type systems for this purpose; XSD is the most widely used, but WSDL supports any type
system. XSD itself defines standard data types such as string, int, and float, which are
supported by most languages, including C/C++, Java, and C#. An external schema is used
to define custom data types, for example when a developer wants to use his own class or
structure in the communication. Figure 2.5 [5] shows an example that uses two data types
defined in XSD (string and int) and two data types defined in external schemas
(FlightInfoType and Ticket).

<message name="GetFlightInfoInput">
  <part name="airlineName" type="xsd:string"/>
  <part name="flightNumber" type="xsd:int"/>
</message>
<message name="GetFlightInfoOutput">
  <part name="flightInfo" type="fixsd:FlightInfoType"/>
</message>
<message name="CheckInInput">
  <part name="body" element="eticketxsd:Ticket"/>
</message>
<portType name="AirportServicePortType">
  <operation name="GetFlightInfo">
    <input message="tns:GetFlightInfoInput"/>
    <output message="tns:GetFlightInfoOutput"/>
  </operation>
  <operation name="CheckIn">
    <input message="tns:CheckInInput"/>
  </operation>
</portType>

Figure 2.5: WSDL abstract description. This fragment shows the string and int data
types, which are defined in XSD, and two other data types defined in external schema:
FlightInfoType and Ticket, which we assume were imported earlier in the WSDL file. [5]



External XSD definitions are imported into WSDL using an “import” element, which
specifies the location of the schema. Message elements are defined in WSDL as
aggregations of parts, and each part is described by an XSD type or by an element from
an external schema. Messages provide an abstract, typed definition of the data sent to
and from the service. Figure 2.5 shows three messages that might appear during a web
services interaction. The message GetFlightInfoInput has two parts: airlineName, which
is an XSD string, and flightNumber, which is an XSD integer. The other two messages,
GetFlightInfoOutput and CheckInInput, have only one part each.

The interaction is defined by the operation and portType elements. Each operation
represents a message exchange pattern that the web service supports. An operation is
simply a combination of messages labeled as input, output, or fault to indicate what
part a particular message plays in the interaction. A portType is a collection of
operations that are collectively supported by an end point. In our example,
AirportServicePortType describes two operations: a request-response operation,
GetFlightInfo, which expects the GetFlightInfoInput message as input and returns a
GetFlightInfoOutput message as the response; and a one-way operation, CheckIn, which
takes only the CheckInInput message as input.
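The way a tool can read the message exchange pattern of each operation from a portType can be sketched as follows. The enclosing definitions element and its targetNamespace are assumptions added so that the fragment parses on its own; the four pattern names are those of WSDL 1.1, where the order of the input and output children decides the pattern.

```python
import xml.etree.ElementTree as ET

WSDL_NS = {"wsdl": "http://schemas.xmlsoap.org/wsdl/"}

# The <definitions> wrapper and targetNamespace are assumed; Figure 2.5 shows
# only the inner fragment.
WSDL = """\
<definitions xmlns="http://schemas.xmlsoap.org/wsdl/"
             targetNamespace="urn:example:airport">
  <portType name="AirportServicePortType">
    <operation name="GetFlightInfo">
      <input message="tns:GetFlightInfoInput"/>
      <output message="tns:GetFlightInfoOutput"/>
    </operation>
    <operation name="CheckIn">
      <input message="tns:CheckInInput"/>
    </operation>
  </portType>
</definitions>
"""

# Order of the input/output children -> WSDL 1.1 message exchange pattern.
PATTERNS = {
    ("input", "output"): "request-response",
    ("input",): "one-way",
    ("output", "input"): "solicit-response",
    ("output",): "notification",
}


def describe_operations(wsdl_text):
    """Map each operation name to its message exchange pattern."""
    root = ET.fromstring(wsdl_text)
    result = {}
    for op in root.findall("wsdl:portType/wsdl:operation", WSDL_NS):
        local_names = [child.tag.rsplit("}", 1)[-1] for child in op]
        kinds = tuple(n for n in local_names if n in ("input", "output"))
        result[op.get("name")] = PATTERNS.get(kinds, "unknown")
    return result
```

For the fragment above, GetFlightInfo is classified as request-response and CheckIn as one-way, matching the description in the text.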

Besides the application-level functionality of the service, we need three more pieces of
information:

what communication protocol to use (such as SOAP over HTTP),

how to accomplish individual service interactions over this protocol, and

where to terminate communication (the network address).

The “what” and “how” parts of this information are provided by the binding element of
the WSDL, including the communication protocol and data format specification. In short,
the binding element tells how a given interaction occurs over the specified protocol.
Figure 2.6 shows a fragment from our example.



<binding name="AirportServiceSoapBinding" type="tns:AirportServicePortType">
  <soap:binding transport="http://schemas.xmlsoap.org/soap/http"/>
  <operation name="GetFlightInfo">
    <soap:operation style="rpc" soapAction="http://acme-travel/flightinfo"/>
    <input>
      <soap:body use="encoded" namespace="http://acme-travel.com/flightinfo"
                 encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"/>
    </input>
    <output>
      <soap:body use="encoded" namespace="http://acme-travel.com/flightinfo"
                 encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"/>
    </output>
  </operation>
  <operation name="CheckIn">
    <soap:operation style="document" soapAction="http://acme-travel.com/checkin"/>
    <input>
      <soap:body use="literal"/>
    </input>
  </operation>
</binding>
<service name="travelservice">
  <port name="travelservicePort" binding="tns:AirportServiceSoapBinding">
    <soap:address location="http://acmetravel.com/travelservice"/>
  </port>
</service>

Figure 2.6: WSDL’s concrete binding information. GetFlightInfo is a SOAP RPC

interaction and CheckIn is a pure messaging interaction that uses XSD to describe the

transmitted XML. [5]

The binding describes how to use SOAP to access the service: it combines the abstract
interface with the protocol and data marshalling details. A port element describes a
single end point as a combination of a binding and a network address, and a service
element groups a set of related ports. In our travel service example, a single port
describes an end point that processes SOAP requests for the travelservice service. WSDL
provides a formalized description of client-service interaction for users and
developers. During development, developers use WSDL as the input to a proxy generator,
which produces client code according to the service requirements; a dynamic invocation
proxy can do the same at runtime. Proxy or stub generators relieve the developer of
having to remember or understand all the details of service access.
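The idea of a dynamic invocation proxy can be sketched in a few lines. The class below is a hypothetical illustration and not the API of any real toolkit: it receives the operation names and part names that a real generator would read from the WSDL, and it turns attribute access into functions that build the SOAP request. The send callable is injected so that the sketch works without a network.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"


class ServiceProxy:
    """Turns attribute access into SOAP request builders for known operations."""

    def __init__(self, endpoint, namespace, operations, send):
        self.endpoint = endpoint
        self._namespace = namespace
        self._operations = operations  # operation name -> ordered part names
        self._send = send              # callable(endpoint, payload_bytes) -> reply

    def __getattr__(self, name):
        # Called only for attributes not found normally, i.e. operation names.
        operations = self.__dict__.get("_operations", {})
        if name not in operations:
            raise AttributeError(name)

        def call(*args):
            parts = operations[name]
            if len(args) != len(parts):
                raise TypeError(f"{name} expects {len(parts)} argument(s)")
            envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
            body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
            request = ET.SubElement(body, f"{{{self._namespace}}}{name}")
            for part, value in zip(parts, args):
                ET.SubElement(request, part).text = str(value)
            return self._send(self.endpoint, ET.tostring(envelope))

        return call


# Usage with a fake transport that only records what would be sent.
sent = []
proxy = ServiceProxy(
    "http://acmetravel.com/travelservice",
    "http://acme-travel.com/flightinfo",
    {"GetFlightInfo": ["airlineName", "flightNumber"]},
    send=lambda endpoint, payload: sent.append((endpoint, payload)),
)
proxy.GetFlightInfo("Acme Air", 42)
```

The caller writes proxy.GetFlightInfo(...) as if it were a local function and never handles the envelope directly, which is the convenience the text attributes to proxy and stub generators.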

2.4 Discovering Web Service

As discussed, web services are now widely used in e-commerce applications and for doing
business on the web, and most organizations are moving towards them. We have also
covered the technical details of how web services work and which protocols they use to
communicate. Knowing this is not enough, however, if you do not know who provides which
web service and what that service does. Once you know this, you can obtain or build an
application that communicates with the service to expand your business. For this purpose
a registry is needed that maintains a list of web services and their features, so that
anyone can search the registry for the desired web service and communicate with it.
There are two major technologies for this purpose: UDDI (Universal Description,
Discovery, and Integration) and ebXML registries. UDDI was introduced in 2000 by Ariba,
Microsoft, and IBM to facilitate the discovery of business processes. OASIS introduced
ebXML in 1999, but its main focus was the consistent use of XML and standard protocols
in the EDI (Electronic Data Interchange) community. One part of this effort, ebXML
registries, is used for the discovery of ebXML business details. These are competing
technologies and both are worth discussing, so let's look at them one by one.

2.4.1 UDDI

Universal Description, Discovery, and Integration (UDDI) is an evolving technology and
is not yet a standard, but it is being implemented and embraced by major vendors such as
Microsoft and IBM [1]. Put simply, UDDI is like a phone book for web services:
organizations register information about their web services, and applications search the
registry for that information. UDDI thus allows applications to discover web services
dynamically. When an organization registers with UDDI, it has to provide the following
information [5]:

white pages of company contact information,

yellow pages that categorize businesses by standard categorization, and

green pages that document the technical information about web services, like WSDL.

White pages may include information about a company or organization, such as its name,
contact information like phone numbers and email addresses, a description of its
business, and links to external sources and documents where the business is described in
more detail.

The yellow pages describe the categories or classifications of the information the
organization's web services provide, plus its products, industry codes, and geographic
index.

The green pages describe the organization's e-business rules: how to do business with
the web services it has exposed, what its business rules are, and how to invoke the web
services, i.e. the WSDL.
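The three kinds of pages can be pictured as fields of one registry entry. The following toy in-memory registry is only a sketch of that idea, with hypothetical class and method names; a real UDDI registry is accessed through its own SOAP-based API rather than through Python objects.

```python
from dataclasses import dataclass, field


@dataclass
class RegistryEntry:
    # White pages: who the business is and how to reach it.
    name: str
    contact: str
    # Yellow pages: standard categories the business falls under.
    categories: set = field(default_factory=set)
    # Green pages: technical details, here service name -> WSDL location.
    services: dict = field(default_factory=dict)


class Registry:
    """Toy registry illustrating white, yellow, and green page lookups."""

    def __init__(self):
        self._entries = {}

    def publish(self, entry):
        self._entries[entry.name] = entry

    def find_by_category(self, category):
        """Yellow-pages lookup: names of businesses registered under a category."""
        return sorted(
            entry.name for entry in self._entries.values()
            if category in entry.categories
        )

    def wsdl_for(self, business, service):
        """Green-pages lookup: where the service description can be fetched."""
        entry = self._entries.get(business)
        return None if entry is None else entry.services.get(service)


# Example data is made up for illustration.
registry = Registry()
registry.publish(RegistryEntry(
    name="Acme Travel",
    contact="info@acmetravel.example",
    categories={"travel", "airline"},
    services={"travelservice": "http://acmetravel.com/travelservice?wsdl"},
))
```

An application would first search by category, then fetch the WSDL location of a chosen service, and finally generate a proxy from that WSDL.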

It is worth noting that these registries can be public, or private within an
organization where only internal communication is needed. For internal integration,
private UDDI registries are of much value today. A large organization in which several
large enterprise applications need to interoperate can use UDDI registries to discover
how to do so. Such a private registry could minimize the need for interoperation
documentation and reduce integration and development time for legacy enterprise
applications. Once applications have been built on web services, and once the interfaces
have been described in WSDL and published in a private UDDI registry, other programs and
projects within the organization can dynamically connect and begin to interoperate [1].
Private registries are more secure, since they are internal to the organization, no
outsider is allowed to interfere, and they are operated according to the organization's
policy.

You can also use public registries to expose your business. Whether putting your
organization's assets into a public registry is too risky is a long-running debate [1],
but there are ways to mitigate the risk. For example, you can maintain a list of
authorized users who may communicate with your web service; anyone who wants to talk to
the service must first register in your system as an authorized user. This is where the
need for state management in web services arises. If an authorized entity talks to your
web service, you need to maintain its state to know how many times it has called the
service, or whether it has attempted fraud or performed any illegal action. If you have
been managing its state, you can block the entity and put it on your block list. Doing
so requires statistics about the entity, and without managing its state you cannot
generate a report of those statistics to see who is doing fair business with you and who
is trying to cheat. Securing web services is a separate topic beyond the scope of this
study, but we will see why managing state is important and how state can be managed in a
web service environment.
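The kind of per-client state meant here, call counts, recorded violations, and a block list, can be sketched as below. The class name ClientMonitor and the threshold of three violations are assumptions chosen for the illustration; how a violation is detected is left to the service.

```python
from collections import Counter


class ClientMonitor:
    """Keeps simple per-client statistics and a block list."""

    def __init__(self, max_violations=3):
        self.calls = Counter()
        self.violations = Counter()
        self.blocked = set()
        self._max_violations = max_violations  # assumed policy threshold

    def record_call(self, client_id, ok=True):
        """Record one request; return False if the client is (now) blocked."""
        if client_id in self.blocked:
            return False
        self.calls[client_id] += 1
        if not ok:
            self.violations[client_id] += 1
            if self.violations[client_id] >= self._max_violations:
                self.blocked.add(client_id)
                return False
        return True

    def report(self, client_id):
        """Statistics for one client, as used to decide who is trading fairly."""
        return {
            "calls": self.calls[client_id],
            "violations": self.violations[client_id],
            "blocked": client_id in self.blocked,
        }
```

None of this is possible in a purely stateless service: the counters must survive from one request to the next and must be tied to an identity that the client presents with every call.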

2.4.2 ebXML Registries

The ebXML standard was created jointly by OASIS and UN/CEFACT. The main aim behind ebXML was to

enable business applications to exchange data in a uniform format and to enable intelligent

business processes using XML. ebXML was developed as an XML-based

business language, since XML by itself does not provide the semantics needed to solve

interoperability problems. In short, ebXML provides a common way for businesses to

quickly and dynamically perform business transactions based on common business

practices [1]. Figure 2.7 shows an example of an ebXML architecture in use. In the

diagram, company business process information and implementation details are found in

the ebXML registry and businesses can do business transactions after they agree on


trading arrangements. Information that can be described and discovered in an ebXML

architecture includes the following [1]:

Business processes and components described in XML

Capabilities of a trading partner

Trading partner agreements between companies

Figure 2.7 An ebXML architecture in use. [1]

The heart of the ebXML architecture is the ebXML registry, which is the system that is

used to store and discover this information. The ebXML registry contains

domain-specific semantics for B2B. These domain-specific semantics are the product of

agreement on many business technologies and protocols. Put simply, ebXML could be

described as the start of a domain-specific semantic Web. The focus of ebXML was not

initially on Web services, but it now uses SOAP as its message format. Therefore, many

believe that ebXML will have a large role in the future of Web services. Unlike UDDI,

ebXML is a standard. The ebXML standard does have support from many businesses, but

the most influential companies in Web services, IBM and Microsoft, would like to see

UDDI succeed as a registry for business information. However, it is possible that the two


technologies can complement each other, and ebXML could succeed in the B2B market,

while private UDDI registries succeed in the EAI market in the short term.


Chapter 3

Problem

Services manage state; this state is the very reason for their existence. Services look after

this state and ensure through their business logic that it is kept consistent and

accurate. This state is the only true and current source of information [7].

3.1 Introduction to State

Almost all services manage durable state; i.e., state that is stored on some durable

medium such as a file system or a database. The services receive a request from another

service, retrieve some state from that durable medium, and build a response or update the

state. The durable state allows services to be brought down without loss of context; when

they are brought up again, the durable state is still there and they can continue as if

nothing had happened. Services do their best to keep that durable state consistent; they

would like to keep their application state in memory consistent as well, but if something

happens, they can just abort the processing, forget their memory state, and set up anew

using the durable state. Services often use ACID (Atomicity, Consistency, Isolation,

Durability) transactions to maintain consistent durable state [7].

Services handle requests and sometimes they must handle multiple requests to complete a

business transaction. This may be organized using a business process, or process, which

controls the step-by-step actions of executing some work and moving the system from

one state to another [7]. At each step, it performs a business operation. For instance, a

business process may take an incoming order request, update the order system, send a

response, and then update the customer relationship management (CRM) system and the

production system. Another example would be a process that manages the complete

order, delivery, and payment process. It may accept a request for a quote, send the

requested quote, accept an order, check whether the order can be fulfilled, send an order

confirmation, arrange delivery, send an invoice, and so on. Each step moves the process

from one consistent state to the next.


Processes deal with three types of durable state [7]:

Permanent state. This is the state that is updated by the processes. This is what

we usually think of when talking about the state of a service. In the previous

examples, this is the order database and the CRM database. This state stays in

existence after the processes finish.

Process state. A business process may persist its state; this allows the service to

stop and restart a process. The business process saves its state and stops. When a

message is received for that process, the process is started again; it retrieves its

state and handles the request. We call the persisted state of a process the process

state. The process state exists only for the duration of the process; when the

process is finished, the process state is removed.

Message state. When messages are sent from one service to another, they are

created in memory. Then they are transported, and finally created in memory on

the other side. However, when using a store and forward mechanism such as a

queue, the message is persisted and then transported, persisted on the other side

and removed at the source. On the receiving side, a similar process takes place. The

message state exists while the message is being sent.

As stated earlier, processes move from one consistent state to the next; the combination

of all three types of state must be consistent after each step.

A process executes business activities in a step-by-step manner. Processes may update

their durable state, they may send messages, and they may persist their own state. Each

step may consist of multiple actions and each step takes the process from one consistent

state to the next consistent state. The state is considered consistent if the combination of

the permanent state, the process state, and the message state is consistent. For example, if

you want to do a debit and a credit posting, the state would be consistent if the debit

posting is done and the process state has the information that the credit posting still needs

to be done. The state would be consistent if the order is accepted and the process has the

information that the CRM and the production system need to be updated, or the state is


consistent if the order is accepted and the messages to the CRM system and the

production system have been sent.
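
The debit and credit example above can be sketched in code. The following is a minimal illustration, not taken from [7]: it assumes an SQLite database, and the table names (account, process_state) and function names are hypothetical. The debit posting (permanent state) and the note that the credit posting is still pending (process state) are written in one ACID transaction, so the combined state is consistent after the step whether it succeeds or fails.

```python
import sqlite3


def open_ledger() -> sqlite3.Connection:
    """Create an in-memory database holding permanent state and process state."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER NOT NULL)")
    conn.execute(
        "CREATE TABLE process_state (process_id TEXT PRIMARY KEY, pending_step TEXT NOT NULL)"
    )
    conn.executemany("INSERT INTO account VALUES (?, ?)", [("payer", 100), ("payee", 0)])
    conn.commit()
    return conn


def debit_step(conn: sqlite3.Connection, process_id: str, amount: int, fail: bool = False) -> bool:
    """Step 1: debit the payer and record that the credit posting is still pending.

    Both writes share one transaction, so either both become visible or neither does.
    The fail flag simulates a crash between the two writes.
    """
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = 'payer'", (amount,)
            )
            if fail:
                raise RuntimeError("simulated crash between the two writes")
            conn.execute("INSERT INTO process_state VALUES (?, 'credit')", (process_id,))
        return True
    except RuntimeError:
        return False


def credit_step(conn: sqlite3.Connection, process_id: str, amount: int) -> None:
    """Step 2: credit the payee and remove the process state, again atomically."""
    with conn:
        conn.execute("UPDATE account SET balance = balance + ? WHERE name = 'payee'", (amount,))
        conn.execute("DELETE FROM process_state WHERE process_id = ?", (process_id,))


def balance(conn: sqlite3.Connection, name: str) -> int:
    return conn.execute("SELECT balance FROM account WHERE name = ?", (name,)).fetchone()[0]


def pending_step(conn: sqlite3.Connection, process_id: str):
    row = conn.execute(
        "SELECT pending_step FROM process_state WHERE process_id = ?", (process_id,)
    ).fetchone()
    return None if row is None else row[0]
```

If the step fails halfway, the debit is rolled back and no process state is left behind; if it succeeds, the process state tells a restarted process that only the credit posting remains.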

Figure 3.1 illustrates how a process service steps through a series of transactions.

Figure 3.1: Transactional steps in a process [7]

Services contain state, and the current state is always kept inside the service, protected

from outsiders. Services not only work with the data that they manage themselves; in

many cases, they use data that is obtained from another service [7].

Services exist to guard state. You allow other services to access the state in your service

through a number of well-defined interfaces. For example, a message is sent to a service,

which includes a request to access state stored within the service. The service may then

change some of that state or send a copy of it back to the requesting service.

Messages may be sent or received within an ACID transaction; however, it is not

practical to wait a long time for a response within an atomic transaction.


3.2 Need for State Management

State management is the process by which state is maintained over multiple requests for

the same or different web services. As is true for any HTTP-based technology, Web

Services are stateless, which means that they do not automatically indicate whether the

requests in a sequence are all from the same client.

Resource state includes any piece of information or data that affects the behavior of the

resource, for example catalogs, shopping carts, user options, lists of reviews, and hit

counters. State management can be complex because a wide variety of usage patterns,

data types, and access methods are available to application developers building state-

based solutions [8].

There are several situations where state is needed. Stateful solutions do exist, but

technologies that are designed to be stateless, such as the HTTP

protocol [3] and Web Services, make it difficult to model state on top of them.

These protocols and technologies do not keep track of a sequence of requests. Yet state

management is difficult to avoid in a number of situations. One situation is establishing a

session between a consumer and a provider. A session is typically established for

efficiency reasons. For example, sending a security certificate with each request is a

serious burden for both the consumer and the provider. It is much quicker to replace the

certificate with a token shared just between the consumer and provider. Another situation

is providing a customized service. The following example illustrates this.

Consider an e-commerce application that exposes web services to its clients, which

interact with it through these web services. A B2B site, for example, needs the list

of products offered by its vendors and the prices of these products. These details can be retrieved at

runtime, on user request, using web services. If the user wants to buy a product, he

might be asked to log in. Once logged in, he can add products from

different vendors to his cart and finally check out. This whole interaction needs

to be managed so that it can be determined which products this particular user has chosen

to buy. If web services are used in the back end, maintaining this state is a problem for web

services. The web services provide stateless client-server interactions using the stateless

HTTP protocol. Stateless means client requests are independent and no memory of

previous requests is required. This approach mirrors file server designs like NFS [9] and

simplifies failure recovery. No dialogue is required with former clients after a crash. But

modern software like P2P applications frequently need state maintenance, and state must

be maintained on both sides since communicating entities are peers. Therefore, a flexible

and efficient state maintenance capability is necessary.

There are generally two kinds of state: session and application state [8]. Recent requests

from a particular client are in the same session. Session state is only visible within that

particular session. Website shopping carts are examples of session state. Session state is

per-user while application state is per-application. Note that web servers often support

multiple applications simultaneously. Application state is shared across sessions within

the same application but is not shared between distinct applications or instances.

Application state is typically maintained in an external database. A variety of techniques

are used to implement session state. The standard solution is to store most application and

session state in a database. But external databases are not well-suited for storing

application state for certain types of increasingly important applications. Pervasive

computing and sensor-based environments often involve dynamic, P2P interactions.

Many such systems can be viewed as managing streams of potentially high-volume data.

For example, surveillance applications require coordinating and processing many sensor

streams (video, audio, motion data). Such applications require:

(1) Performance: Database storage may be too slow to satisfy throughput and

latency requirements of the applications we consider.

(2) Fault-tolerance: Web services are frequently implemented using a single

shared web server resource. Web services are sometimes ill-behaved, requiring

occasional maintenance and restart of web servers. Memory-based application state will

be lost by these administrative restarts. In general, web services should provide some

form of fault-tolerance.


(3) Persistence: While much data in pervasive computing environments is

temporary, it is important to persist data periodically for subsequent analysis and

processing.

Different mechanisms are used to manage state in a stateless environment [8]. The choice

depends solely on the needs and requirements of the application. Sometimes state is

maintained in memory, but this is fragile, since an application crash, a power failure, or

any other kind of failure loses all state information held in memory. Sometimes

separate applications called state servers are used to manage state. This requires

interprocess communication, which is not feasible for real-time applications because it

is intensive and resource hungry and therefore degrades

performance. Another disadvantage is that a reboot or shutdown of the state server causes

loss of state. Databases are also used to save state. Storing session state in a database is normally

feasible for applications that do not require real-time processing, such as e-commerce

applications in which a user logs in, performs his tasks, buys something, and then logs out.

Applications of this kind need the state of that user: what he has viewed, which items

he has removed from the system, and what he has purchased. This kind of information can

be stored in a database. Let's look at these different approaches in detail.

3.3 State Management in Web Services

Managing state with fault tolerance in web services is a complex

problem, since several factors need to be considered. How state is implemented

in a web service environment depends on the requirements and the scenario.

Another consideration is which resource's state is to be maintained. Normally,

when calls to a service must be tracked across requests, it is the user's state that is managed. Several

mechanisms have been proposed. For example, the server can generate a token, store user-related

information against it in a repository, and return the token to the client. The client then provides the

token in each subsequent call to identify itself, and the server retrieves the client's state information

from the repository. Note that a repository is needed to store the state.

This repository can be the server's memory, a separate server, a database, or any other

store. Each repository has its own pros and cons. Let's start with in-memory state

management.

In-memory state management is suitable only for short periods and for

resources whose state is not crucial. This approach does not survive a

server shutdown or restart, so a separate mechanism is needed to handle those events.

Another possibility is to have a separate state server manage the state for the main server. This

approach has its own consequences. The main server can be restarted

without losing state, but if the state server is restarted the same problem arises

there.

A database is also a good option: state-related data is stored in the database. This

approach provides fault tolerance, since a server restart does not lose state-related data.

It is common for an e-commerce application to maintain state information by using a

relational database for the following reasons [10]:

Security

Personalization

Consistency

Data mining

The following are typical features of a database supported solution:

Security: The visitor types an account name and password into an application

logon dialog. The application infrastructure calls a web service which in turn

queries the database with the logon values to determine whether the user has

rights to use the application. If the database validates the user information, the

application issues a valid token containing a unique ID for that user on

that client computer. The application then grants access to the user.

Personalization: With security information in place, the application can distinguish

each user by reading the token on the client computer. Typically, applications


have information in the database that describes the preferences of a user

(identified by a unique ID). This relationship is known as personalization. The

application can look up the user's preferences using the unique ID contained in

the token, and then place content and information in front of the user that pertains

to the user's specific wishes, reacting to the user's preferences over time.

Consistency: The owner of a commerce application might want to

keep transactional records of purchases made for goods and services. This

information can be reliably saved in the database and referenced by the user's

unique ID. It can be used to determine whether a purchase transaction has been

completed, and to determine the course of action if a purchase transaction fails.

The information can also be used to inform the user of the status of an order

placed using the application.

Data mining: Information about application usage, application visitors, or product

transactions can be reliably stored in a database. For example, a business

development department might want to use the data collected from the application

to determine next year's product line or distribution policy. The marketing department

might want to examine demographic information about users. Engineering and

support departments might want to look at transactions and note areas where the

purchasing process could be improved. Most enterprise-level relational databases,

such as Microsoft SQL Server and Oracle, include an extensive toolset for most

data mining projects.

By designing the application to repeatedly query the database by using the unique Id

during each general stage in the above scenario, the application maintains state. In this

way, the user perceives that the application is remembering and reacting to him or her

personally.

This last, database-driven approach is described in more detail, with an abstract model, in

the next chapter.


Chapter 4

Best Practices of State Management

Online transactions require state to persist across many exchanges of data with the server.

A session creates a logical connection that maintains state between client and server on top of web

services, which are connectionless and stateless. The information relevant to a

particular session of a specific user is known as the session state.

Session state management is all about associating a web service request

with the previous requests generated from the same session, as these requests appear

unrelated to the web service because of the connectionless and stateless nature of

web services.

4.1 Stateful Model

Organizations have been using different models, according to their needs, to manage state in

stateless environments. State management on top of HTTP, which is a

stateless protocol, has long been in practice. There is no standard model for maintaining state in a

stateless environment, but a common generalized model can be proposed for certain environments

such as Web Services. Research organizations are working to propose a model and to

standardize state management using web services. The Web Services Resource

Framework (WSRF) [11] is a first step in this direction; it has been proposed to manage

resources and their state through a web service interface. The framework also requires

enhancements to other web services technologies such as WSDL [5], for which

WSDL 2.0 [12] has been proposed. The problem with WSRF is that its compatibility with

the WS-I architecture is very limited, because of which it was largely ignored, and major vendors such as

Microsoft and Sun chose to adopt alternatives.

To manage state in the current web service environment, software architects and developers

are using several methods as per their needs. We will discuss a generalized model for


managing state in web services, especially to maintain state of user conversation, or in

other words users’ session state.

4.2 State Management Techniques

SOAP is a connectionless specification for how a web service consumer communicates

with a web service. Due to the connectionless and statelessness nature of the web service,

session management is the challenge that developers come across while building Web

applications using web services. From the viewpoint of the Web server, each web service

request appears as though it is a separate and distinct request, unrelated to any previous

requests. That means the information a user gives on one web service request is not

automatically available on the next web service requested. The inability of the web

services to retain knowledge of previous requests means it is difficult to write

applications, such as an online catalog, where the application might need to track the

catalog items a user has selected while jumping between the various pages of the catalog.

For many reasons, such as data security, size limit, durability, etc., one might want to

implement a better and more robust session management technique. There are many

techniques used to manage session state, all of which require session state information to

be explicitly passed between the web service consumer and the web service. This

information can be a unique identifier of the session, generally called a session Id. There are

different techniques for storing the session Id. For example, the session Id can be stored on the

client-side which consumes the web service, and the rest of the data, such as user

information, can be stored on the server-side which exposes web services [13].

4.2.1 In-memory session

When a user requests a login, the server internally generates a session Id using a

complex algorithm. At the start of a new session, the server returns the session Id to

the client [13] and keeps a reference to it in its memory. Though the in-memory session

object works well in a single-server environment, it is not very useful in a farm [14]. In a

farm, where there is a cluster of servers, web service requests are routed to each server in

a round-robin fashion to distribute load and to allow more requests to be handled. The Session

object is tied to a single server and is not shared among servers in the farm. Because

requests from the same user can be routed to any available server for load balancing,

session information can potentially be lost between requests. In order to use the session

object in a clustered server environment, we can dedicate a single server to handle all

requests from a user for the lifetime of the session. In doing so, however, we will

compromise scalability as the distribution of load among multiple servers is not fairly

balanced.
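
The behaviour described above can be reproduced with a small simulation. This is only a sketch under stated assumptions: the class names Server, RoundRobinFarm, and StickyFarm are hypothetical, and real load balancers are considerably more involved. Each server keeps its sessions in its own memory, so a request routed to a different server does not find the session; pinning a session to the server that created it avoids the loss at the cost of uneven load.

```python
import itertools
import uuid


class Server:
    """A web server that keeps session state only in its own memory."""

    def __init__(self, name):
        self.name = name
        self.sessions = {}

    def login(self, user):
        session_id = uuid.uuid4().hex
        self.sessions[session_id] = {"user": user}
        return session_id

    def handle(self, session_id):
        """Return the user for a known session, or None if this server never saw it."""
        state = self.sessions.get(session_id)
        return None if state is None else state["user"]


class RoundRobinFarm:
    """Routes every request to the next server in turn, ignoring the session."""

    def __init__(self, servers):
        self.servers = servers
        self._cycle = itertools.cycle(servers)

    def route(self):
        return next(self._cycle)


class StickyFarm(RoundRobinFarm):
    """Dedicates one server to each session for the lifetime of that session."""

    def __init__(self, servers):
        super().__init__(servers)
        self._owner = {}

    def login(self, user):
        server = self.route()
        session_id = server.login(user)
        self._owner[session_id] = server
        return session_id

    def handle(self, session_id):
        server = self._owner.get(session_id)
        return None if server is None else server.handle(session_id)
```

With two servers in a RoundRobinFarm, a login handled by the first server is invisible to the second, so the next request from the same user fails; with StickyFarm every request for that session reaches the same server.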

4.2.2 Databases

In view of the limitations of the in-memory session objects, let’s look at the technique in

which session management is moved to a database server accessible to all other servers in

the farm. Any database can be used to maintain session state. The main advantage of

storing session state in the back-end database is that state information can be durable and

one can store state information as per his needs [13]. Each user will be given a unique

identifier that will serve as a key to the user's information in the database. This key is

normally called a Session Id.

Session information is stored based on the unique identifier, which is generated every

time a new session starts. So, even if the same user logs on twice, the two

sessions are given different identifiers and, therefore,

information associated with one session is not accessible from the other session.

The drawback of storing session information in a database is that it puts a greater load on

the server, because the application requires more time-consuming database transactions.

On the other hand, this technique limits the security risk: the client stores only the unique

identifier, while other sensitive data is stored in the database. It is generally

better to put a greater load on the server than to risk

security.


4.3 Generalized Model

As discussed, there are several techniques for state management. Let's now discuss a

generalized model that is in common use for managing state. First, a token generator is

needed, which generates a unique token or identifier for each client. The token acts as a

key for the client. It can be a universally unique ID such as a GUID or UUID. Second, to

maintain state there must be a repository in which session state can be stored. Databases

are a good choice for this; the reasons why a database is a preferred

choice for state management were discussed earlier.

Now let's look at how the whole process works. First, the user or client calls a login

web service and provides its credentials. The server authenticates the user and, if

authentication succeeds, asks the token generator for a unique token.

The token generator generates a new unique token. The server then stores the user's

session information against that token in the repository or database, and finally

returns the token to the client. The client subsequently calls further web services,

providing the token each time. If the server exposes a web service, say GetAppointments, the client

calls this web service and provides the token to identify itself as an authorized user.

The server accepts the token and checks it against the token stored in the database. If the token is

verified, the server returns the desired information to the client.
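
The flow described above can be sketched as follows. This is a minimal illustration and not a definitive implementation: the user table, the appointment data, and the class and function names are hypothetical, the repository is a plain dictionary standing in for a database, and the plain-text password comparison is for demonstration only.

```python
import uuid

# Hypothetical data used only for this illustration.
USERS = {"alice": "s3cret"}
APPOINTMENTS = {"alice": ["09:00 dentist", "14:30 review"]}


class Repository:
    """Stores session information against a token (a database in a real system)."""

    def __init__(self):
        self._sessions = {}

    def store(self, token, info):
        self._sessions[token] = info

    def lookup(self, token):
        return self._sessions.get(token)


def generate_token():
    """Token generator: a universally unique ID, as suggested in the text."""
    return str(uuid.uuid4())


class Server:
    def __init__(self, repository):
        self.repository = repository

    def login(self, username, password):
        """Authenticate, obtain a token, store the session, and return the token."""
        if USERS.get(username) != password:
            return None
        token = generate_token()
        self.repository.store(token, {"user": username})
        return token

    def get_appointments(self, token):
        """Counterpart of the GetAppointments service: serves verified tokens only."""
        session = self.repository.lookup(token)
        if session is None:
            raise PermissionError("unknown or expired token")
        return APPOINTMENTS.get(session["user"], [])
```

A client would call login once, keep the returned token, and pass it to get_appointments on every later call; a call with a token that is not in the repository is rejected.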

This is a simplified mock-up. There are several issues which have to be addressed to

properly implement this generalized model.


[Figure: diagram with Client, Server, Repository, and Token generator; steps numbered 1 to 4]

Login Process

1 – Client calls a web service

2 – Server gets a unique token from token generator

3 – Store that token in repository against client credentials

4 – Returns the token to client which client will use in subsequent calls

Figure 4.3: Login Process


[Figure: diagram with Client, Server, and Repository; steps numbered 1 to 3]

Service request providing token

1 – Client calls a web service providing token

2 – Server matches the token in repository and if verified then gets user and requested information from database

3 – Respond the client with information requested

Figure 4.4: Service request providing token


4.3.1 Functional Design

Figure 4.3 shows the sequence of events that take place with respect to session

management in various layers [13].

Figure 4.3: Sequence diagram for session management [13]

4.3.2 Middle Tier

The session management code serves application use cases, such as account

management and order purchase, by providing read and write services. Session state is

always read and written as a whole. For instance, the account management service reads

the whole session state and extracts only the user information to operate on it. It then

modifies the session state by including the updated user information and sends the whole

session state to the database.

The middle-tier components explicitly generate a globally unique identifier (GUID) that

is used as the unique session identifier for every user who logs on to the system and

associate it with the user's session information [13].

4.3.3 Data Layer

In this technique, the data layer maintains a Session table to store the session data. The

key to a user's session data is the unique session Id assigned to him by the middle-tier

components when he logs on to the system. There are certain interfaces exposed by the data


layer to access the Session table. These interfaces are called by the middle-tier

components to insert or update session information in the session table [13].
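
The division of work between the middle tier and the data layer can be sketched as follows. This is an illustration under assumptions, not the design from [13]: it uses SQLite, serializes the whole session state as JSON, and the table name, column names, and function names are hypothetical.

```python
import json
import sqlite3
import uuid


class SessionStore:
    """Data layer: a Session table keyed by the session Id."""

    def __init__(self, conn):
        self.conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS session ("
            "session_id TEXT PRIMARY KEY, state TEXT NOT NULL)"
        )
        conn.commit()

    def write(self, session_id, state):
        """Insert or update the whole session state for one session Id."""
        with self.conn:
            self.conn.execute(
                "INSERT OR REPLACE INTO session (session_id, state) VALUES (?, ?)",
                (session_id, json.dumps(state)),
            )

    def read(self, session_id):
        """Return the whole session state, or None for an unknown session Id."""
        row = self.conn.execute(
            "SELECT state FROM session WHERE session_id = ?", (session_id,)
        ).fetchone()
        return None if row is None else json.loads(row[0])


def start_session(store, user_name):
    """Middle tier: generate a GUID and associate it with the new session state."""
    session_id = str(uuid.uuid4())
    store.write(session_id, {"user": {"name": user_name}, "cart": []})
    return session_id


def update_email(store, session_id, email):
    """Middle tier use case: read the whole state, change the user part, write it all back."""
    state = store.read(session_id)
    if state is None:
        raise KeyError(session_id)
    state["user"]["email"] = email
    store.write(session_id, state)
```

Because each logon produces a new session Id, two sessions of the same user are stored as separate rows, and an update made through one session Id does not touch the other.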

4.4 Issues to Address

Several issues must be resolved when implementing this model. One

might ask:

Why generate a token instead of keeping a list of allowed IP addresses?

What about forged tokens?

What if a token gets stolen?

Does a token have a lifetime?

Most of these issues are security related. Security is certainly a concern when

managing user session state, but security experts have recommended best practices for these

kinds of problems.

4.4.1 Security Issues

One frequently asked and important question is why a randomly generated

unique token or identifier should be used for client authentication instead of a list of allowed IP

addresses.

The answer is that keeping a list of IP addresses works when you know who your

clients are, what their IP addresses are, and that those addresses will not change. This can

be implemented where the number of clients is small. When there is a huge number of clients,

a token is necessary. In addition, a client that connects to the network through a dial-up

connection gets a different IP address each time it connects, so a list of

IP addresses is not enough to authorize clients. Another factor is that more

than one user can be logged in from the same host or IP address, which is normally the case for

users behind a firewall. If only a list of IP addresses is maintained, either two

different users cannot be logged in from the same host or IP address, or their information will

be mixed up and they cannot be identified as two different users. So a

unique token is required for each user session.

Hackers constantly try to break security by any means, whether by sniffing

network traffic to capture the user's conversation, by breaking a security key, or otherwise.

A hacker may be able to guess or forge a token; how hard this is depends entirely on the token

generation algorithm and process. The other two issues, stolen tokens and token lifetime,

are related to this, and there are established practices for them. Using SSL is

recommended to prevent hackers from sniffing the conversation, and the session timeout should be

short: if a user is idle for a few minutes, expire the token. In other words, the token or session

lifetime should be short.
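
The two practices above, tokens that are hard to forge and a short idle timeout, can be sketched as follows. This is an illustration under assumptions: the class name ExpiringSessions and the 300-second default are hypothetical, and tokens are drawn from Python's secrets module, which is intended for security-sensitive random values.

```python
import secrets
import time


class ExpiringSessions:
    """Issues hard-to-guess tokens and expires them after a period of inactivity."""

    def __init__(self, idle_timeout_seconds=300.0, clock=time.monotonic):
        self.idle_timeout = idle_timeout_seconds
        self.clock = clock  # injectable so the timeout can be tested without waiting
        self._last_seen = {}

    def create(self):
        """Create a session and return its token (32 random bytes, URL-safe text)."""
        token = secrets.token_urlsafe(32)
        self._last_seen[token] = self.clock()
        return token

    def touch(self, token):
        """Return True and refresh the idle timer if the token is known and not expired."""
        last = self._last_seen.get(token)
        if last is None:
            return False  # never issued, already expired, or forged
        now = self.clock()
        if now - last > self.idle_timeout:
            del self._last_seen[token]  # idle for too long: expire the session
            return False
        self._last_seen[token] = now
        return True
```

Each accepted request resets the idle timer, so an active user keeps the session while an abandoned or stolen token stops working shortly after the last legitimate use.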

There are other considerations as well, such as the criteria on which the server, once the

client is authenticated, should return results for requests. This depends entirely on the

application that exposes the web services: how, and on what criteria, information is

returned to or accepted from the client is part of that application's internal business

logic. One practice that software architects normally follow is to store some information

about the user and the session in the database against the session Id. This information

can then be used in the criteria for accepting or returning information.
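As a hedged illustration of this practice, the sketch below stores a user name and a role against each session Id in an in-memory SQLite database and uses them to decide what a request may see. The table layout, the `role` column, the `ORDERS` data and the function `list_orders` are all assumptions made for the example.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE session (session_id TEXT PRIMARY KEY, user_name TEXT, role TEXT)"
)
db.execute("INSERT INTO session VALUES (?, ?, ?)", ("abc123", "alice", "customer"))
db.execute("INSERT INTO session VALUES (?, ?, ?)", ("def456", "carol", "admin"))

# Hypothetical business data served by the web service.
ORDERS = [{"id": 1, "owner": "alice"}, {"id": 2, "owner": "bob"}]

def list_orders(session_id):
    """Return only the orders that the session's user is allowed to see."""
    row = db.execute(
        "SELECT user_name, role FROM session WHERE session_id = ?", (session_id,)
    ).fetchone()
    if row is None:
        raise PermissionError("unknown session")
    user_name, role = row
    if role == "admin":
        return ORDERS  # assumed rule: administrators see every order
    return [order for order in ORDERS if order["owner"] == user_name]
```

The rule applied here (customers see their own orders, administrators see all) is only one possible criterion; the actual rule belongs to the business logic of the application.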

A client request is identified by its session Id. If a hacker were able to capture the session

Id in use by an active session, he could submit valid requests and, hence, access the user's

private data [13].

You may choose among several possible countermeasures to this problem. The session data

for any session should not be stored for long on the server; that is, the session lifetime

should be short.

The longer the session Id is stored, the higher the probability that it will be stolen. The

session Id can be changed periodically to reduce exposure without shortening the lifetime

of the associated session. How often the session Id is changed is a factor to consider

when using this method. SSL is also an option for keeping the conversation secure, but it

degrades performance because of the extra encryption and decryption overhead.
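Changing the session Id while keeping the session itself alive can be sketched as below. The store `active_sessions` and the function `rotate_session_id` are hypothetical names, and the sketch assumes that the new Id is returned to the client with the response.

```python
import secrets

# Hypothetical store: session Id -> session data.
active_sessions = {}

def rotate_session_id(old_id):
    """Move the session data to a fresh Id and invalidate the old one."""
    data = active_sessions.pop(old_id)  # raises KeyError if old_id is not active
    new_id = secrets.token_urlsafe(32)
    active_sessions[new_id] = data  # same data, so the session lifetime is unchanged
    return new_id
```

After rotation a captured copy of the old Id is useless, while the user's session data is carried over unchanged.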

4.4.2 Session Lifetime

In general, the lifetime of a session starts when the user visits for the first time and ends

when he leaves. In login-based schemes, a session starts when a user logs on. The session

ends when the user finally leaves by logging out [13] or when the session expires.

In online shopping applications, users typically spend most of their time browsing the

catalog before adding a single item to the shopping cart, and often leave the site after

browsing the catalog without selecting any items for purchase. Maintaining session state

for such users is unnecessary, because there really isn't any data that needs to be saved.

For this reason, the online application starts the session only after the logon process has

been completed. This helps to avoid maintaining unnecessary session state.
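The idea of starting the session only at logon can be sketched as follows. Everything named here (`USERS`, `session_state`, `browse_catalog`, `log_on`) is hypothetical, and the plain-text password table is a simplification for the example; a real system would store password hashes.

```python
import hmac
import secrets

USERS = {"alice": "correct horse"}  # hypothetical credentials, plain text only for brevity
session_state = {}  # session Id -> state, created only at logon

def browse_catalog():
    """Catalog browsing is stateless: nothing is stored on the server."""
    return ["book", "lamp", "pen"]

def log_on(user_name, password):
    """Create session state only after the credentials have been verified."""
    expected = USERS.get(user_name)
    if expected is None:
        return None
    if not hmac.compare_digest(expected.encode(), password.encode()):
        return None
    session_id = secrets.token_urlsafe(32)
    session_state[session_id] = {"user": user_name, "cart": []}
    return session_id
```

Visitors who only browse never cause an entry in `session_state`, so no state has to be stored or cleaned up for them.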

The session state is maintained in the database and removed on a daily basis, since a user

might leave without logging out or remain idle for a long time. This is necessary for the

security reasons discussed in the previous section. If longer sessions are needed, session

lifetime management can be extended by implementing a way for users to return to their

previous sessions at a later time, thereby increasing the lifetime of a session. In this

case, the session ends when the session state is removed from the system, which prevents

the user from returning to the earlier session.
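A daily clean-up of stored session state could look like the following sketch. It assumes an SQLite table named `session_state` with a `last_used` timestamp column and a job that calls `purge_stale` once a day; these names and the one-day threshold are assumptions for illustration, not the schema of any application cited here.

```python
import sqlite3

ONE_DAY_SECONDS = 24 * 60 * 60

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE session_state (session_id TEXT PRIMARY KEY, last_used REAL)"
)

def purge_stale(now, max_age=ONE_DAY_SECONDS):
    """Delete session rows not used within max_age seconds; return the count removed."""
    cursor = conn.execute(
        "DELETE FROM session_state WHERE last_used < ?", (now - max_age,)
    )
    conn.commit()
    return cursor.rowcount
```

Once a row has been purged, the user can no longer return to that session, which is the point at which the extended session ends.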

Conclusion

This study tried to assess different approaches toward building session-based or stateful

web services. In general, it is recommended that web services be designed according to

the principles of a service-oriented architecture. However, it is sometimes desirable to

build services capable of referencing each other, which may lead to a finer-grained,

session-oriented services design. When building a new service, it is worth considering

carefully the pros and cons of all design styles, which can result in a better integration

solution for a targeted domain.

REFERENCES

[1] Michael C. Daconta, Leo J. Obrst, Kevin T. Smith. Chapter 4: Understanding Web

Services. The Semantic Web: A Guide to the Future of XML, Web Services, and

Knowledge Management. Wiley Publishing, Inc., 2003.

[2] Extensible Markup Language (XML) – "http://www.w3.org/TR/xml/" Retrieved on

September 8, 2006

[3] Fielding, Gettys, Mogul, et al. Hypertext Transfer Protocol – HTTP/1.1, RFC 2616,

Internet Society, 1999. "http://www.w3.org/Protocols/rfc2616/rfc2616.html" Retrieved

on September 8, 2006

[4] Simple Object Access Protocol (SOAP) – "http://www.w3.org/TR/SOAP" Retrieved

on September 8, 2006

[5] Francisco Curbera, Matthew Duftler, Rania Khalaf, William Nagy, Nirmal Mukhi,

and Sanjiva Weerawarana. Unraveling the Web Services Web: An Introduction to SOAP,

WSDL, and UDDI. IEEE Internet Computing, pages 86-93.

[6] Doug Tidwell, James Snell, Pavel Kulchenko. Chapter 2: Introducing SOAP.

Programming Web Services with SOAP. O'Reilly, December 2001.

[7] Microsoft Developers Network (MSDN) – Application Architecture: Conceptual

View – Introduction to State "http://windowssdk.msdn.microsoft.com/en-us/library/z1hkazw7.aspx"

Retrieved on September 22, 2006

[8] Xiang Song, Namgeun Jeong, Phillip W. Hutto, Umakishore Ramachandran, James

M. Rehg. State Management in Web Services. Proceedings of the 10th IEEE International

Workshop on Future Trends of Distributed Computing Systems.

[9] B. Callaghan. NFS Illustrated. Addison-Wesley, 2000.

[10] Microsoft Developers Network (MSDN) – ASP.NET State Management

Recommendations "http://windowssdk.msdn.microsoft.com/en-us/library/z1hkazw7.aspx"

Retrieved on October 2, 2006

[11] Czajkowski D, Ferguson F, Foster I, Frey J, Graham S, Sedukhin I, Snelling D,

Tuecke S, Vambenepe W. The WS-Resource Framework, 2005.

"http://www.globus.org/wsrf/specs" Retrieved on October 15, 2006

[12] E. Christensen, F. Curbera, G. Meredith, and S. Weerawarana. (2001) Web Services

Description Language (WSDL). [Online]. "www.w3.org/TR/wsdl" Retrieved on October

25, 2006

[13] Microsoft Developers Network (MSDN) – Designing Session Management,

Duwamish Online

[14] Server Farm – "http://www.webopedia.com/TERM/S/server_farm.html" Retrieved

on November 6, 2006
