
Reliable Web Services by Fault Tolerant Techniques: Methodology, Experiment, Modeling and Evaluation

Term Presentation

Presented by Pat Chan, 3 May 2006

Outline

Introduction
Problem Statement
Methodologies for Web Service Reliability
New Reliable Web Service Paradigm
Road Map for Experiment
Experimental Results and Discussion
Conclusion

Introduction

Service-oriented computing is becoming a reality.
Web Services are a promising technology on the Internet.
They offer the benefits of interoperability, reusability, and adaptability.
Reliability is an important issue.
The existing Web service model needs to be extended to assure survivability and reliability.
We propose experimental settings and offer a roadmap to dependable Web services.

Reliability

"a measure of the success with which the system conforms to some authoritative specification"

Guaranteed delivery
Duplicate elimination
Ordering
Crash tolerance
State synchronization

What are Web Services?

Self-contained, modular applications built on deployed network infrastructure including XML and HTTP

Use open standards for description (WSDL), discovery (UDDI) and invocation (SOAP)

[Figure: Web Services communicating over the Internet via HTTP/SOAP]

Web Services Architecture

[Figure: Web Services architecture stack. Labels: The Internet; TCP/IP; XML; HTTP/SMTP; SOAP; Description; Directory; Inspection; Building Block Modules; Inter Application Protocols; Referral; Routing; Security; License; Eventing; Transactions; Reliable Messaging; …]

Web Services

Benefits of WS:

Service-oriented
Highly accessible
Open specification
Easy integration

Simplicity

Dynamic Standard

Web Services

Build common infrastructure reducing the barriers of business integration with lower costs and faster speed.

Problems of Web Services

Transaction: atomicity is not provided.

Security: Internet transport is insecure.

Reliability: the Internet is inherently unreliable. No single underlying transport protocol addresses all the reliability issues.

Problem Statement

Fault-tolerant techniques: replication and diversity.

Replication is one of the efficient ways of providing reliable systems, through time or space redundancy.
  It increases the availability of distributed systems.
  Key components are re-executed or replicated.
  It protects against hardware malfunctions or transient system faults.

Another efficient technique is design diversity.
  Software systems or services are designed independently by different programming teams.
  This helps defend against permanent software design faults.

We focus on the analysis of replication techniques when applied to Web services.

A generic Web service system with spatial as well as temporal replication is proposed and investigated.

Methodologies for Reliable Web services -- Redundancy

Spatial redundancy
  Static redundancy: all replicas are active at the same time, and voting takes place to obtain a correct result.
  Dynamic redundancy: one replica is active at a time, while the others are kept in an active or standby state.

Temporal redundancy
  Redundancy in time
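
Static redundancy with voting can be illustrated with a small sketch. This is not the presenter's implementation: the replicas are modeled as plain Python callables, and the names `majority_vote` and `invoke_static` are hypothetical. A result is accepted only if a strict majority of all replicas, including any that crashed, reported it.

```python
from collections import Counter
from typing import Callable, Hashable, Optional, Sequence


def majority_vote(results: Sequence[Hashable], total: int) -> Optional[Hashable]:
    """Return the value reported by a strict majority of `total` replicas, else None."""
    if not results:
        return None
    value, count = Counter(results).most_common(1)[0]
    return value if count > total / 2 else None


def invoke_static(replicas: Sequence[Callable[[int], int]], request: int) -> Optional[int]:
    """Call every replica (all are active) and vote on the replies that came back."""
    replies = []
    for replica in replicas:
        try:
            replies.append(replica(request))
        except Exception:
            # Assumption: a crashed replica simply contributes no vote.
            continue
    return majority_vote(replies, total=len(replicas))


if __name__ == "__main__":
    # Two correct replicas outvote one faulty replica.
    replicas = [lambda x: x * 2, lambda x: x * 2, lambda x: x + 1]
    print(invoke_static(replicas, 21))
```

Counting the majority against the total number of replicas, rather than against the replies received, is a design choice in this sketch: it stops a single surviving replica from winning the vote on its own.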

Methodologies for Reliable Web services -- Diversity

Protect redundant systems against common-mode failures

With different designs and implementations, common failure modes will probably cause different error effects.

N-version programming, recovery blocks…
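
As a rough illustration of the recovery block idea named above, the sketch below tries independently written alternates in order and returns the first result that passes an acceptance test. The function names, the square-root example, and the tolerance are assumptions made for illustration only.

```python
import math
from typing import Callable, Sequence


class AllAlternatesFailed(Exception):
    """Raised when no alternate produces an acceptable result."""


def recovery_block(
    alternates: Sequence[Callable[[float], float]],
    acceptance_test: Callable[[float, float], bool],
    request: float,
) -> float:
    """Run alternates in order; return the first result that passes the test."""
    for alternate in alternates:
        try:
            result = alternate(request)
        except Exception:
            # A crashing alternate is treated like a failed acceptance test.
            continue
        if acceptance_test(request, result):
            return result
    raise AllAlternatesFailed(f"no alternate passed for request {request!r}")


def buggy_sqrt(x: float) -> float:
    """Hypothetical primary version with a design fault."""
    return x / 2


def library_sqrt(x: float) -> float:
    """Hypothetical alternate version built on the standard library."""
    return math.sqrt(x)


def sqrt_is_acceptable(x: float, result: float) -> bool:
    """Acceptance test: squaring the result must give back the input."""
    return abs(result * result - x) < 1e-9


if __name__ == "__main__":
    print(recovery_block([buggy_sqrt, library_sqrt], sqrt_is_acceptable, 9.0))
```

The primary's fault is caught by the acceptance test rather than by a vote, which is the main difference from N-version programming.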

Failure Response Stages of Web Services

Fault confinement
Fault detection
Diagnosis
Fail-over
Reconfiguration
Recovery
Restart
Repair
Reintegration

[Figure: flow diagram of the stages above, with Online and Offline branches]

Proposed Paradigm

[Figure: proposed architecture. Labels: Replication Manager; Web service selection algorithm; WatchDog; Registry; three Web Service nodes (IIS, Application, Database); Client; Application; Database]

1. Create Web services
2. Select primary Web service (PWS)
3. Register
4. Look up
5. Get WSDL
6. Invoke Web service
7. Keep checking the availability of the PWS
8. If the PWS fails, reselect the PWS
9. Update the WSDL

Work Flow of the Replication Manager

[Figure: flowchart. RM sends message to the Web Service. Get reply: continue checking. Do not get reply: reselect a primary Web Service, then map the new address to the WSDL. All services failed: system fail.]
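
The Replication Manager work flow can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the presented system: the `is_alive` probe is a hypothetical stand-in for the RM's message to the Web service, the WSDL is reduced to a dictionary holding one `endpoint` entry, and "first replica that replies" is an assumed selection rule, since the slides do not spell out the selection algorithm.

```python
from typing import Callable, Dict, List, Optional


class ReplicationManager:
    """Tracks a primary Web service (PWS) and fails over when it stops replying."""

    def __init__(self, replicas: List[str], is_alive: Callable[[str], bool]) -> None:
        self.replicas = list(replicas)
        self.is_alive = is_alive          # hypothetical probe: True if a reply arrives
        self.wsdl: Dict[str, str] = {}    # stand-in for the published WSDL
        self.primary: Optional[str] = None
        self.select_primary()

    def select_primary(self) -> Optional[str]:
        """Pick the first replica that replies and map its address into the WSDL."""
        for address in self.replicas:
            if self.is_alive(address):
                self.primary = address
                self.wsdl["endpoint"] = address
                return address
        # No replica replied: there is no primary to publish.
        self.primary = None
        self.wsdl.pop("endpoint", None)
        return None

    def check_once(self) -> str:
        """One polling step; returns 'ok', 'failover', or 'system-fail'."""
        if self.primary is not None and self.is_alive(self.primary):
            return "ok"
        if self.select_primary() is None:
            return "system-fail"
        return "failover"


if __name__ == "__main__":
    alive = {"http://ws1.example", "http://ws2.example"}
    rm = ReplicationManager(
        ["http://ws1.example", "http://ws2.example"], lambda a: a in alive
    )
    alive.discard("http://ws1.example")   # simulate a crash of the primary
    print(rm.check_once(), rm.wsdl)
```

In a deployment, `check_once` would be called on a timer at the polling interval, and clients would keep resolving the service through the updated WSDL.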

Road Map for Experiment Research

Redundancy in time
Redundancy in space
Sequentially
Parallel
Majority voting using N modular redundancy
Diversified version of different services

Experiments

A series of experiments is designed and performed to evaluate the reliability of the Web service: a single service without replication, a single service with retry or reboot, and a service with spatial replication.

We will also perform retry or failover when the Web service is down.
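
The combination of retry and failover can be sketched from the client's point of view. This is an assumed illustration, not the experiment code: `invoke` is a hypothetical callable that sends one request to an endpoint and raises `TimeoutError` or `ConnectionError` when the client timeout expires or the connection fails.

```python
from typing import Callable, Sequence


class ServiceUnavailable(Exception):
    """Raised when every endpoint has failed on every attempt."""


def call_with_retry_and_failover(
    endpoints: Sequence[str],
    invoke: Callable[[str], str],
    max_retries: int = 1,
) -> str:
    """Retry each endpoint up to `max_retries` times, then fail over to the next."""
    for endpoint in endpoints:
        for _attempt in range(1 + max_retries):
            try:
                return invoke(endpoint)
            except (TimeoutError, ConnectionError):
                # Temporal redundancy: try the same endpoint again.
                continue
        # Spatial redundancy: this endpoint is given up, move to the next replica.
    raise ServiceUnavailable("all endpoints failed")


if __name__ == "__main__":
    def demo_invoke(endpoint: str) -> str:
        # Hypothetical transport: the primary always times out.
        if endpoint == "primary":
            raise TimeoutError("no reply within the client timeout")
        return f"reply from {endpoint}"

    print(call_with_retry_and_failover(["primary", "backup"], demo_invoke))
```

Setting `max_retries=0` gives failover only, and passing a single endpoint gives retry only, so the same helper covers the non-hybrid configurations as well.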

Summary of the Experiments

Experiment numbers by service configuration (rows) and recovery technique (columns):

                              None   Retry/Reboot   Failover   Both (hybrid)
Single service, no retry      0      --             --         --
Single service with retry     --     1              --         --
Single service with reboot    --     2              --         --
Spatial replication           --     --             3          4

Parameters of the Experiments

Parameter                          Current setting/metric
Request frequency                  1 req/min
Polling frequency                  5 ms
Number of replicas                 5
Client timeout period for retry    10 s
Failure rate λ                     # failures/hour
Load (profile of the program)      % or load function
Reboot time                        10 min
Failover time                      1 s

Experimental Results

Experiments over a 360-hour period (43200 reqs)

         Normal   Resource Problem   Entry Point Failure   Network Level Fault Injection
Exp 0    4928     6130               6492                  5324
Exp 1    2210     2327               2658                  2289
Exp 2    2561     3160               3323                  5211
Exp 3    1324     1711               1658                  5258
Exp 4    1089     1148               1325                  2210

Retry: 11.97% to 4.93%
Reboot: 11.97% to 6.44%
Failover: 11.97% to 3.56%
Retry and Failover: 11.97% to 2.59%

[Figure: Number of Failures When the Server Is in a Normal Situation]

[Figure: Number of Failures When the Server Is Busy]

[Figure: Number of Failures When the Server Reboots Periodically]

[Figure: Number of Failures under Network Level Fault Injection]

Reliability of the System Over Time

\[ \lambda(t) = \lim_{\Delta t \to 0} \frac{F(t + \Delta t) - F(t)}{\Delta t} = 0.025 \]

\[ R(t) = e^{-\lambda(t)\, t} \]
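
A small numeric sketch of the exponential reliability function R(t) = exp(-λt), using the failure rate 0.025 quoted on this slide. It assumes a constant failure rate and takes the time unit to be hours, following the "# failures/hour" metric in the experiment parameters; the function name `reliability` is chosen here for illustration.

```python
import math


def reliability(t: float, failure_rate: float = 0.025) -> float:
    """Probability of surviving until time t under a constant failure rate."""
    if t < 0 or failure_rate < 0:
        raise ValueError("time and failure rate must be non-negative")
    return math.exp(-failure_rate * t)


if __name__ == "__main__":
    for hours in (0, 10, 20, 40, 80):
        print(f"R({hours}) = {reliability(hours):.4f}")
```

With λ = 0.025, the exponent reaches -1 at t = 40, so reliability falls to about 0.368 at that point and continues to decay at the same relative rate.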

Reliability Model

[Figure: Markov reliability model. Surviving labels: states S-1 and S-2; failure rate λ*; repair transitions (1-c1)μ*, (1-c1)μ1, (1-c1)μ2, (1-c2)μ1, (1-c2)μ2]

ID    Description                                        Value
λn    Network failure rate                               0.02
λ*    Web service failure rate                           0.025
λ1    Resource problem rate                              0.142
λ2    Entry point failure rate                           0.150
μ*    Web service repair rate                            0.286
μ1    Resource problem repair rate                       0.979
μ2    Entry point failure repair rate                    0.979
C1    Probability that the RM responds on time           0.9
C2    Probability that the server reboots successfully   0.9

[Figure: Reliability with different failure rates, evaluated with SHARPE. Failure rates shown: 0.005, 0.01, 0.02, 0.03, 0.04, 0.05]

Conclusion

Surveyed replication and design diversity techniques for reliable services.

Proposed a hybrid approach to improving the availability of Web services.

Carried out a series of experiments to evaluate the availability and reliability of the proposed Web service system.

Developed the Reliability Model for the proposed system.
