+ All Categories
Home > Documents > Reliability on Web Services Pat Chan 31 Oct 2006.

Reliability on Web Services Pat Chan 31 Oct 2006.

Date post: 15-Jan-2016
Category:
View: 213 times
Download: 0 times
Share this document with a friend
Popular Tags:
34
Reliability on Web Services Pat Chan 31 Oct 2006
Transcript
Page 1: Reliability on Web Services Pat Chan 31 Oct 2006.

Reliability on Web ServicesPat Chan

31 Oct 2006

Page 2: Reliability on Web Services Pat Chan 31 Oct 2006.

Outline Introduction to Web Services Problem Statement Methodologies for Web Service Reliability New Reliable Web Service Paradigm Optimal Parameters Experimental Results and Discussion Conclusion Future Directions

Page 3: Reliability on Web Services Pat Chan 31 Oct 2006.

Introduction Service-oriented computing is becoming a reality. The problems of service dependability, security and

timeliness are becoming critical. We propose experimental settings and offer a

roadmap to dependable Web services.

Page 4: Reliability on Web Services Pat Chan 31 Oct 2006.

Problem Statement Fault-tolerant techniques

Replication Diversity

Replication is one of the efficient ways for providing reliable systems by time or space redundancy. Increasing the availability of distributed systems Key components are re-executed or replicated Protect against hardware malfunctions or transient system faults.

Another efficient technique is design diversity. By independently designing software systems or services with different programming

teams, Resort in defending against permanent software design faults.

We focus on the analysis of the replication techniques when applied to Web services.

A generic Web service system with spatial as well as temporal replication is proposed and investigated.

Page 5: Reliability on Web Services Pat Chan 31 Oct 2006.

Road Map for Research Redundancy in time Redundancy in space

Sequentially Parallel Majority voting using N modular redundancy Diversified version of different services

Page 6: Reliability on Web Services Pat Chan 31 Oct 2006.

Replication Manager

Web service selection algorithm

WatchDog

UDDI

Registry

WSDL

Web ServiceIIS

Application

Database

Web ServiceIIS

Application

Database

Web ServiceIIS

Application

Database

Client

Port

Application

Database

1. Create web services

2. Select primary web service (PWS)

3. Register

4. Look up

5. Get WSDL

6. Invoke web service

8.. Keep check the availability of all the web.

9. If Web service failed, update the list of availability of Web services

7. Update the WSDL

Proposed Paradigm

Page 7: Reliability on Web Services Pat Chan 31 Oct 2006.

Round Robin

Web ServiceIIS

Application

Database

Web ServiceIIS

Application

Database

Web ServiceIIS

Application

Database

Web ServiceIIS

Application

Database

Page 8: Reliability on Web Services Pat Chan 31 Oct 2006.

RM sends message to the Web Service

Update the Web service availability list

Do not get reply

Map the new address to the WSDL after RR

System Fail

Get reply

All Service failed

Work Flow of the Replication Manager

Page 9: Reliability on Web Services Pat Chan 31 Oct 2006.

Experiments A series of experiments are designed and performed for

evaluating the reliability of the Web service, Spatial replication 0 0 0 0 1 1 1 1

Reboot 0 0 1 1 0 0 1 1

Retry 0 1 0 1 0 1 0 1

Spatial replication

Reboot

Retry

(1,1,1)

(0,1,1)

(1,0,0)

(0,0,0)

(1,0,1)

(1,1,0)

(0,1,0)

(0,0,1)

Page 10: Reliability on Web Services Pat Chan 31 Oct 2006.

Parameters Computational load Communication Request frequency Load of the server Retry frequency Polling frequency Reboot time

Page 11: Reliability on Web Services Pat Chan 31 Oct 2006.

Fault Mode Machine – reboot the server with probability Network – Network Fault Injection Tool (NFIT) Application – inject fault in the application

Failure definition: 5 retries are allowed. If there is still no correct result

from the web service after 5 retries, it is considered as a failure.

Page 12: Reliability on Web Services Pat Chan 31 Oct 2006.

Experimental Setup Examine the computational to communication ratio Examine the request frequency to limit the load of

the server to 75% Fix the following parameters

Computational to communication ratio (e.g 10:1) Request frequency

Page 13: Reliability on Web Services Pat Chan 31 Oct 2006.

Experimental Setup -- Overloadprime(2, 12000)

Communication time: Computational time

143:14 (10:1)

Request frequency 5 req/s (200ms)

Load 78.5%

Timeout period of retry 10s

Timeout for web service in RM 1s (web service specific)

Polling frequency 5 requests per min

Number of replicas 3

Max number of retries 5

Round Robin rate 1 s

Page 14: Reliability on Web Services Pat Chan 31 Oct 2006.

Experiment Parameters Fault mode

Timing Calculation Temporary (fault probability: 0.01) Permanent (fault probability: 0.001)

Experiment time 30mins (9000 requests)

Measure: Number of failures Average response time (ms)

Page 15: Reliability on Web Services Pat Chan 31 Oct 2006.

Experimental Result (failures / response time in ms)

Experiments Single server Single server with retry

Single server with reboot (continues no response for 3 requests)

Single server with retry and reboot

Spatial ReplicationRR

Hybrid approachRR+Retry

Hybrid approachRR+Reboot

All round approach RR spatial + Retry(5 times) + reboots

Normal case 0 / 157 0 / 156 0 / 156 0 / 155 0 / 155 0 / 156 0 / 156 0 / 157

Timing Temp 93 / 158 0 / 269 91 / 160 0 / 256 89 / 156 0 / 258 85 / 155 0 / 243

Calculation Temp

87 / 157 0 / 254 88 / 157 0 / 261 78 / 155 0 / 244 87 / 155 0 / 231

Timing Perm 8174 / -- 8362 / -- 2505 / -- 3 / 2938 7177 / -- 7284 / -- 76 / 157 0 / 172

Calculation Perm

8256 / -- 8581 / -- 2782 / -- 2 / 2872 7529 / -- 7425 / -- 69 / 158 0 / 176

Page 16: Reliability on Web Services Pat Chan 31 Oct 2006.

Varying the parameters Number of tries Timeout period for retry in single server Timeout period for retry in our paradigm Polling frequency Number of replicas Load of server

Page 17: Reliability on Web Services Pat Chan 31 Oct 2006.

Number of tries

Number of tries Number of failures in Temp

Number of failures in Perm

0 95 76

1 2 2

2 0 0

3 0 0

4 0 0

5 0 0

Page 18: Reliability on Web Services Pat Chan 31 Oct 2006.

Timeout period for retry in single serverTimeout period for retry (s)

Number of failures in Temp

Number of failures in Perm

0 95 7265

2 2 7156

5 0 7314

6 0 6890

7 0 189

8 0 82

9 0 11

10 0 2

12 0 0

14 0 0

16 0 0

18 0 0

Page 19: Reliability on Web Services Pat Chan 31 Oct 2006.

Timeout period for retry in single server

# of failure

Timeout period

Page 20: Reliability on Web Services Pat Chan 31 Oct 2006.

Timeout period for retry in our paradigm

Timeout period for retry (s)

Number of failures in Temp

Number of failures in Perm

0 2 81

2 0 2

5 0 0

10 0 0

20 0 0

Page 21: Reliability on Web Services Pat Chan 31 Oct 2006.

Polling frequency

Polling frequency (number of requests per min)

Number of failures in Temp

Number of failures in Perm

0 0 7124

1 0 811

2 0 30

5 0 12

10 0 1

15 213 254

20 1124 1023

Page 22: Reliability on Web Services Pat Chan 31 Oct 2006.

Polling frequency

# of failure

Polling frequency

Page 23: Reliability on Web Services Pat Chan 31 Oct 2006.

Vary the reboot time with different polling frequency

# of failure

Polling frequency

Reboot time

Page 24: Reliability on Web Services Pat Chan 31 Oct 2006.

Number of Replicas

Number of replicas Number of failures in Temp Number of failures in Perm

No replica 91 8152

2 2 356

3 0 0

4 0 0

Page 25: Reliability on Web Services Pat Chan 31 Oct 2006.

Load of Web ServerLoad of the web server

(%)Number of failures in Temp

Number of failures in Perm

70 0 0

75 0 0

80 2 3

85 10 14

90 512 528

95 3214 3125

100 8792 8845

110 8997 8994

Page 26: Reliability on Web Services Pat Chan 31 Oct 2006.

Summary of Parameters Number of tries = 2 Timeout period for retry in single server = 10s Timeout period for retry in our paradigm = 5s Polling frequency = 10 request per min Number of replicas = 3 Load of server = < 75%

Page 27: Reliability on Web Services Pat Chan 31 Oct 2006.

Cross-continental Experiments cross-continental

(Berlin < - >CUHK) Setting: UDDI: gaia Replication Manager: gaia Client: 141.20.33.17 Replica: PC89084, PC89083, PC89082

Page 28: Reliability on Web Services Pat Chan 31 Oct 2006.

Experimental Result (failures / response time in ms)

Experiments Single server Single server with retry

Single server with reboot (continues no response for 3 requests)

Single server with retry and reboot

Spatial replicationRR

Hybrid approachRR+Retry

Hybrid approachRR+Reboot

All round approach RR spatial + Retry (5 times) + reboots

Normal case 0 / 672 0 / 673 0 / 673 0 / 671 0 / 671 0 / 672 0 / 673 0 / 671

Timing Temp

92 / 674 0 / 776 91 / 673 0 / 771 89 / 672 0 / 778 85 / 672 0 / 772

Calculation Temp

89 / 673 0 / 768 88 / 672 0 / 769 78 / 671 0 / 774 87 / 673 0 / 769

Timing Perm

8210 / -- 8258 / -- 2410 / -- 5 / 3076 7456 / -- 7310 / -- 71 / 674 0 / 712

Calculation Perm

8189 / -- 8412 / -- 2623 / -- 3 / 2985 7125 / -- 7514 / -- 68 / 671 0 / 718

Page 29: Reliability on Web Services Pat Chan 31 Oct 2006.

Cross-continental 2 Setting: UDDI: gaia Replication Manager: gaia Client: 141.20.33.17 Replica: luna, Uranus, PC89084, PC89083

Page 30: Reliability on Web Services Pat Chan 31 Oct 2006.

Experimental Result (failures / response time in ms)

Experiments Single server Single server with retry

Single server with reboot (continues no response for 3 requests)

Single server with retry and reboot

Spatial replicationRR

Hybrid approachRR+Retry

Hybrid approachRR+Reboot

All round approach RR spatial + Retry (5 times) + reboots

Normal case 0 / 410 0 / 412 0 / 411 0 / 411 0 / 410 0 / 412 0 / 411 0 / 410

Timing Temp

89 / 412 0 / 513 90 / 413 0 / 512 89 / 412 0 / 510 87 / 413 0 / 511

Calculation Temp

91 / 411 0 / 514 87 / 411 0 / 509 80 / 410 0 / 508 88 / 412 0 / 509

Timing Perm

8210 / -- 8412 / -- 2714 / -- 5 / 2856 7451 / -- 7247 / -- 75 / 411 0 / 431

Calculation Perm

8145 / -- 8247 / -- 2563 / -- 4 / 2745 7320 / -- 7235 / -- 68 / 410 0 / 433

Page 31: Reliability on Web Services Pat Chan 31 Oct 2006.

Average Average response time from local: 156ms Average response tine from CUHK: 670ms Average response time of the whole system:

411ms

Page 32: Reliability on Web Services Pat Chan 31 Oct 2006.

Conclusion Surveyed replication and design diversity

techniques for reliable services. Proposed a hybrid approach to improving the

availability of Web services. Carried out a series of experiments to evaluate

the availability and reliability of the proposed Web service system.

Optimal parameter are obtained.

Page 33: Reliability on Web Services Pat Chan 31 Oct 2006.

Future Directions Improve the current fault-tolerant techniques

Current approach can deal with hardware and software failures.

How about software fault detectors? N-version programming

Different providers provide different solutions. There is a problem in failover or switch between

the Web Services.

Page 34: Reliability on Web Services Pat Chan 31 Oct 2006.

Future Directions Modeling the Web Service behavior

The behavior of the We Services can be studied. Modeling the failure

The failure model of Web Service is different from traditional software.


Recommended