SERENE 2014 Workshop: Paper "Combined Error Propagation Analysis and Runtime Event Detection in...

Post on 06-Jul-2015

description

SERENE 2014 - 6th International Workshop on Software Engineering for Resilient Systems http://serene.disim.univaq.it/ Session 4: Monitoring Paper 3: Combined Error Propagation Analysis and Runtime Event Detection in Process-driven Systems

transcript

1. Quanopt Ltd.

Combined Error Propagation Analysis and Runtime Event Detection in

Process-driven Systems

Gábor Urbanics, László Gönczy, Balázs Urbán, János Hartwig, Imre Kocsis

2. Quanopt Ltd.

Motivation and our contributions

Approach

Motivational example

Design time analysis

Runtime analysis

Future work and conclusion

3. Quanopt Ltd.

Motivation

Analyse complex IT systems
o During development
o During integration
o At runtime
o Based on system models

Generate analysis for huge systems

Extendable

4. Quanopt Ltd.

Process modelling

Business process:
o Directly executed models (e.g. BPMN)

In complex systems there are many supporting resources
o We present a method that treats business processes and supporting resources together
o Only general tools:
  • Markov chains, event trees
  • Too general, modelling could be hard
o Development tools
  • Basic performance analysis
  • Business activity monitoring

5. Quanopt Ltd.

Contributions

Multi-aspect modelling of complex (IT) systems
o Custom, general process and resource model

Qualitative error propagation analysis
o Root cause and sensitivity analysis
o Using finite domain constraint satisfaction problems

Runtime process monitoring

6. Quanopt Ltd.

Motivation and our contributions

Approach

Motivational example

Design time analysis

Runtime analysis

Future work and conclusion

7. Quanopt Ltd.

Approach

[Figure: approach overview. Boxes: Process model, Resource model, Annotation model, System model, Error Propagation Analysis, Monitoring. Feedback labels: [New Monitoring Rule], [New Constraint]. Callouts: Physical and Logical; Can be imported; Failure modes; Error propagation behavior; Extra annotations for analysis.]

8. Quanopt Ltd.

Motivation and our contributions

Approach

Motivational example

Design time analysis

Runtime analysis

Future work and conclusion

9. Quanopt Ltd.

Motivational example

Design time analysis capabilities
o SPOF (single point of failure) analysis
o Process-level effects of resource faults
o Propagating process errors to the resource layer

10. Quanopt Ltd.

Case study

[Figure: business process of the case study, drawn in the Business Processes Layer with a Client lane. Activities: Form processing, Money takeover, Perform full check, Manual laundering check, Flag & report, Record transaction, Pay to $, Receipt. Decisions (Y/N): Large transaction?, Client checked earlier?, Laundering suspected?. Other labels: Timeout. Legend: Activity, Execution Path.]

11. Quanopt Ltd.

Process with resources

[Figure: the case study process extended with its supporting resources. Layers: Business Processes Layer, Supporting Applications Layer, Physical Resources Layer. Resources shown: Cashier Module, Customer & Account Identification, Compliance DB, DB, Application Server cluster, AppServ1, AppServ2, AppServ3 VM, AppServ4, DB1, DB2, Backend Server 1, Backend Server 2, Backend Server 3, Single Hypervisor, Blade Server. Legend: Activity, Resource, Dependency, Execution Path.]

12. Quanopt Ltd.

Single fault in physical layer

[Figure: the process-with-resources diagram of slide 11 (resource setup: Single Hypervisor, Blade Server) annotated for use case 1. Annotations: Single Fault 1, Outage 1 (three times), Stuck 1 (twice). Legend: an annotation such as "Outage 1" combines a Failure Mode and a Use Case Id; Resource Setup Identifier; Activity; Resource; Dependency; Execution Path.]

13. Quanopt Ltd.

Effects of a single fault

[Figure: the process-with-resources diagram of slide 11 with a redundant resource setup (Virtualized HA Cluster, Blade Server Farm), annotated for use case 2. Annotations: Single Fault 2, Degraded 2 (twice), Failover 2, Delayed 2 (twice), Delay-incurred Cost 2 (twice). Legend: an annotation such as "Outage 1" combines a Failure Mode and a Use Case Id; Resource Setup Identifier; Activity; Resource; Dependency; Execution Path.]

14. Quanopt Ltd.

Backwards error propagation

[Figure: the process-with-resources diagram of slide 11 (resource setup: Virtualized HA Cluster, Blade Server Farm) annotated for use case 3. Annotations: SQLInjected 3 (three times), OK 3 (three times). Legend: an annotation such as "Outage 1" combines a Failure Mode and a Use Case Id; Resource Setup Identifier; Activity; Resource; Dependency; Execution Path.]

15. Quanopt Ltd.

Motivational example

Design time analysis capabilities
o SPOF (single point of failure) analysis
o Process-level effects of resource faults
o Propagating process errors to the resource layer

16. Quanopt Ltd.

Motivation and our contributions

Approach

Motivational example

Design time analysis

Runtime analysis

Future work and conclusion

17. Quanopt Ltd.

Design time analysis

Error propagation rules
o Through the process's execution path
o Through dependencies

Translate the model to a constraint satisfaction problem (CSP)

The solutions of the CSP provide the results
o Of root cause analysis
o Of sensitivity analysis

[Figure: approach overview with labels Process model, Resource model, Annotation model, System model, Error Propagation Analysis, Monitoring.]
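The two propagation paths named above (execution path and dependencies) can be illustrated with a minimal sketch. It is not the authors' tool: it swaps the CSP solver for plain fixed-point propagation plus enumeration of single faults, and every resource name, activity name and relation in it is a hypothetical toy model.

```python
# Hypothetical toy model; all names and relations are illustrative assumptions.
RESOURCES = ["blade_server", "hypervisor", "app_server"]
# Dependency edges: resource -> resources it depends on.
DEPENDS_ON = {"hypervisor": ["blade_server"], "app_server": ["hypervisor"]}
# Which resource each activity uses (activities without an entry use none).
USES = {"form_processing": "app_server"}
# Execution path: activity -> activities that must complete before it.
PREDECESSORS = {"form_processing": [], "record_transaction": ["form_processing"]}
ACTIVITIES = list(PREDECESSORS)


def propagate(faulty):
    """Return every resource and activity that fails once `faulty` is injected."""
    failed = set(faulty)
    changed = True
    while changed:  # iterate until no new element fails (fixed point)
        changed = False
        # Rule 1: an error propagates through resource dependencies.
        for res, deps in DEPENDS_ON.items():
            if res not in failed and any(d in failed for d in deps):
                failed.add(res)
                changed = True
        # Rule 2: an error reaches an activity through the resource it uses
        # or through a failed predecessor on the execution path.
        for act, preds in PREDECESSORS.items():
            if act in failed:
                continue
            if USES.get(act) in failed or any(p in failed for p in preds):
                failed.add(act)
                changed = True
    return failed


def sensitivity(resource):
    """Activities affected by a single fault in `resource`."""
    return sorted(a for a in ACTIVITIES if a in propagate({resource}))


def root_causes(activity):
    """Single resource faults that are sufficient to fail `activity`."""
    return sorted(r for r in RESOURCES if activity in propagate({r}))


affected = sensitivity("blade_server")
causes = root_causes("record_transaction")
```

In this toy model a fault in the blade server reaches both activities, and any of the three resources is a possible root cause for a failed record_transaction, which is what a single-point-of-failure report would flag.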

18. Quanopt Ltd.

What is CSP?

Constraint satisfaction problem
o Problems defined mathematically
  • A set of variables
  • Constraints between them

A general solver can find the solution
o A single variable assignment or a list of them
o All constraints satisfied
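To make the definition concrete, here is a minimal sketch of a finite-domain CSP solved by exhaustive enumeration. The function name `solve`, the variables `x` and `y` and both constraints are assumptions chosen for illustration; a real solver would prune the search instead of testing every combination.

```python
from itertools import product


def solve(domains, constraints):
    """Return every assignment over the finite domains that satisfies all constraints."""
    names = list(domains)
    solutions = []
    # Try each combination of values, one value per variable.
    for values in product(*(domains[n] for n in names)):
        assignment = dict(zip(names, values))
        if all(constraint(assignment) for constraint in constraints):
            solutions.append(assignment)
    return solutions


# Two variables with domain {0, 1, 2} and two constraints between them.
domains = {"x": [0, 1, 2], "y": [0, 1, 2]}
constraints = [
    lambda a: a["x"] < a["y"],
    lambda a: a["x"] + a["y"] == 2,
]
solutions = solve(domains, constraints)
```

Only one assignment satisfies both constraints here (x = 0, y = 2); with looser constraints the solver returns the whole list of satisfying assignments.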

19. Quanopt Ltd.

Sample mapping to CSP

[Figure: Business Processes Layer with two activities on one execution path, Customer login and Form processing. Legend: Activity, Execution Path.]

(Customer_login_run)

(Form_processing_run)

20. Quanopt Ltd.

Sample mapping to CSP

[Figure: Business Processes Layer with two activities on one execution path, Customer login and Form processing. Legend: Activity, Execution Path.]

(Customer_login_delay & Customer_login_run)

(Form_processing_delay)
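The two sample mappings can be tried out as Boolean constraints. The sketch below assumes that each pair of expressions is meant as an implication along the execution path (the first expression implies the second); the slides do not state the connective, so that reading, the helper names and the enumeration-based solving are all assumptions.

```python
from itertools import product

VARS = [
    "Customer_login_run",
    "Customer_login_delay",
    "Form_processing_run",
    "Form_processing_delay",
]


def implies(p, q):
    return (not p) or q


def consistent(a):
    """Assumed constraints: each sample mapping is read as an implication."""
    return (
        implies(a["Customer_login_run"], a["Form_processing_run"])
        and implies(
            a["Customer_login_delay"] and a["Customer_login_run"],
            a["Form_processing_delay"],
        )
    )


def solutions(fixed):
    """Enumerate all consistent assignments that extend the fixed values."""
    free = [v for v in VARS if v not in fixed]
    result = []
    for values in product([False, True], repeat=len(free)):
        assignment = dict(fixed, **dict(zip(free, values)))
        if consistent(assignment):
            result.append(assignment)
    return result


# Forward (sensitivity-style) query: a delayed login that runs.
forward = solutions({"Customer_login_run": True, "Customer_login_delay": True})
# Backward (root-cause-style) query: the login ran, form processing was not delayed.
backward = solutions({"Customer_login_run": True, "Form_processing_delay": False})
```

Fixing the cause and reading off the effect gives the forward analysis; fixing the observed effect and reading off the admissible causes gives the backward one. Both use the same constraint set, which is the appeal of the CSP formulation.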

21. Quanopt Ltd.

Motivation and our contributions

Approach

Motivational example

Design time analysis

Runtime analysis

Future work and conclusion

22. Quanopt Ltd.

Runtime process monitoring

Runtime monitoring based on the same model

Rule-based online event processing
o Events captured during the execution
o Each time a rule is satisfied
  • A notification can be recorded
  • Rule-specific process metrics are updated

Coverage checks

Annotation-based rule synthesis

[Figure: approach overview with labels Process model, Resource model, Annotation model, System model, Error Propagation Analysis, Monitoring.]
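The rule-based event processing described on this slide can be sketched as a loop that checks each captured event against each rule, records a notification and updates a per-rule metric. The `Rule` class, the event fields (`activity`, `duration_s`, `status`) and the 30-second threshold are hypothetical; the prototype's actual rule language is not shown in the slides.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    name: str
    matches: Callable[[dict], bool]  # predicate over a single event


def process_events(events, rules):
    """Evaluate every rule on every event; collect notifications and metrics."""
    metrics = Counter()   # rule name -> number of times the rule was satisfied
    notifications = []    # (rule name, activity) pairs, in arrival order
    for event in events:
        for rule in rules:
            if rule.matches(event):
                metrics[rule.name] += 1
                notifications.append((rule.name, event["activity"]))
    return metrics, notifications


rules = [
    Rule("slow_activity", lambda e: e["duration_s"] > 30),
    Rule("failed_activity", lambda e: e["status"] == "failed"),
]
events = [
    {"activity": "Form processing", "duration_s": 12, "status": "ok"},
    {"activity": "Perform full check", "duration_s": 45, "status": "ok"},
    {"activity": "Record transaction", "duration_s": 50, "status": "failed"},
]
metrics, notifications = process_events(events, rules)
```

Keeping rules as plain predicates means a rule synthesised from a model annotation only has to produce a name and a condition, which is one plausible way to connect the annotation model to the monitor.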

23. Quanopt Ltd.

Architecture of the prototype

• Process Model
• Resource Model
• Fault model

• Process Execution Log

• Diagnostic Rules
• Propagation Rules
• Tagging

• Dependability bottleneck
• Process hotspots

• Runtime diagnostic metrics
• Runtime alerts

24. Quanopt Ltd.

Motivation and our contributions

Approach

Motivational example

Design time analysis

Runtime analysis

Future work and conclusion

25. Quanopt Ltd.

Future work

System model and fault model "libraries"

Hierarchical modelling

Hierarchical/Incremental CSP evaluation

Uncertain failure modes

Back annotation of monitoring results
o Qualitative abstraction

Precise modelling frontend

Connection with optimisation methods

26. Quanopt Ltd.

Conclusion

Design time analysis of business processes
o With the use of a resource model
o Root cause analysis
o Determine weak points

Rule-based runtime diagnostics
o Process monitoring based on event processing
o Rule synthesis
o Coverage test