Querying Business Processes Under Models of Uncertainty
Daniel Deutch, Tova Milo
Tel-Aviv University
[Figure: diagram with components labeled ERP, HR System, eComm, CRM, Logistics, Customer, Bank, and Supplier]
Querying BPs Under Models of Uncertainty2
Outline
Introduction & Motivation
External Events
Partial Tracing
Related and Future work
Web-based Business Processes are very popular
Querying possible / likely execution flows of such applications
Allows for optimization, personalized ads, improved business logic,…
Queries over specification structure
Database approach to specification analysis
Querying Business Processes (Introduction and Motivation)
BP specifications are compiled to running code
Modeled as nested DAGs
• Each DAG corresponds to a function / web page
• Node pairs model the activation and completion of activities
• Edges mark the flow relation
• Activities are atomic or compound
• Nesting models implementation relations / links
• Guarding formulas (on external events) model choices
Recursion is allowed
BP Specifications (Introduction and Motivation)
Example Specification (Introduction and Motivation)
[Figure: example BP specification for a travel-booking application. DAGs: F1, F2, F3, F4. Activities: chooseTravel, Login, Flights, Advertise, Hotels, Confirm. Guarding formulas: $searchType = “flights only” / “flights + hotels” / “flights+hotels+cars”; $airline = “BA” / “AF” / “AL”; $hotel = “Marriott” / “CrownePlaza”; $choice = “confirm” / “reset”.]
A specific run is called an EX-flow
Describes activities that occurred in practice + their flow relation
Obtained from a BP by choosing one implementation for each compound activity
Zoom-in edges connect implementations to activation and completion nodes of the corresponding compound activity
An execution trace (EX-trace) is the log recorded for an EX-flow
The size of the traces of a given BP may be unbounded
Execution Flow (Introduction and Motivation)
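A minimal sketch of how an EX-flow might be obtained from a specification by choosing one implementation for each compound activity. The dictionary encoding, the contents of `SPEC`, and the `expand` helper are illustrative assumptions rather than the authors' representation: the flow relation is simplified to a sequence, and a single choice is fixed per activity name.

```python
# Hypothetical encoding (not from the slides): a compound activity maps each
# guarding formula to one implementation, given here as a list of activities.
# The flow relation inside an implementation is simplified to a sequence.
SPEC = {
    "chooseTravel": {
        '$searchType="flights only"': ["Login", "Flights", "Confirm"],
        '$searchType="flights + hotels"': ["Login", "Flights", "Hotels", "Confirm"],
    },
    "Flights": {
        '$airline="BA"': ["Advertise"],
        '$airline="AF"': [],
    },
}


def expand(activity, choices):
    """Return the flow rooted at `activity` as (event, name) pairs."""
    flow = [("activate", activity)]
    if activity in SPEC:  # compound: zoom into the chosen implementation
        guard = choices[activity]
        for child in SPEC[activity][guard]:
            flow.extend(expand(child, choices))
    flow.append(("complete", activity))
    return flow


# The guard that holds for each compound activity (the "external events").
choices = {"chooseTravel": '$searchType="flights only"', "Flights": '$airline="BA"'}
ex_flow = expand("chooseTravel", choices)
```

Each activity contributes an activation/completion pair, and the events of a chosen implementation sit between the pair of its compound activity, which plays the role of the zoom-in edges.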
Example Execution Flow (Introduction and Motivation)
[Figure: example EX-flow of chooseTravel. Activities: Login, Flights, Advertise, Confirm, each shown as an activation/completion node pair. Guarding formulas: $searchType = “flight only”, $airline = “BA”, $choice = “confirm”. Nodes carry timestamps between 07:00 and 07:43.]
External Effects, such as user choices, server response time, etc.
Partial Tracing, due to lack of storage, confidentiality, etc.
Thus the past is uncertain, and the future unknown
Sources of Uncertainty (Introduction and Motivation)
Even when traces of past executions are known, future executions are hard to predict
The effect of external events is modeled by logical formulas, guarding implementations
At run-time, the formulas’ truth values determine the chosen implementations
Uncertainty (External Events)
Example queries:
What is the possible behavior (flow) for users that do not finalize their reservation?
Which hotels can be reserved by British Airways fliers?
The set of all EX-flows conforming to a query may be large (possibly infinite)
But some are more interesting than others…
TOP-K most likely EX-flows (External Events)
We define likelihood of EX-flows
Then find the top-k most likely among those conforming to a query (TOP-K-MATCHES)
What is the typical behavior (flow) for users that do not finalize their reservation?
Which hotels are preferred by British Airways fliers?
TOP-K answers reflect common usage patterns
TOP-K likely EX-flows, cont. (External Events)
Execution patterns
Intuitive, similar in structure to execution traces
Seek occurrences (homomorphisms) of the pattern within a sub-graph of any trace
May contain transitive nodes and edges
May contain a projection part
Query language (External Events)
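As a rough illustration of matching an execution pattern against a trace, the sketch below searches by brute force for a homomorphism from a small pattern into a trace graph. The graph encoding, the `matches` and `reachable` helpers, and the treatment of a transitive edge as "any path of length at least one" are assumptions made for this example; nesting, guarding formulas, and projection are ignored.

```python
from itertools import product


def reachable(edges, src, dst):
    """True if dst can be reached from src via one or more trace edges."""
    stack, seen = [src], set()
    while stack:
        node = stack.pop()
        for a, b in edges:
            if a == node and b not in seen:
                if b == dst:
                    return True
                seen.add(b)
                stack.append(b)
    return False


def matches(trace_nodes, trace_edges, pat_nodes, pat_edges):
    """Return a mapping pattern node -> trace node, or None if no match exists.

    pat_nodes maps a pattern node to a label ("Any" matches every label);
    pat_edges holds (source, target, transitive) triples.
    """
    ids = list(pat_nodes)
    options = [
        [n for n, label in trace_nodes.items() if pat_nodes[i] in ("Any", label)]
        for i in ids
    ]
    for combo in product(*options):
        h = dict(zip(ids, combo))
        if all(
            reachable(trace_edges, h[u], h[v]) if transitive
            else (h[u], h[v]) in trace_edges
            for u, v, transitive in pat_edges
        ):
            return h
    return None


# Hypothetical trace: Login -> Flights -> Advertise -> Confirm
trace_nodes = {1: "Login", 2: "Flights", 3: "Advertise", 4: "Confirm"}
trace_edges = {(1, 2), (2, 3), (3, 4)}

pattern_nodes = {"a": "Flights", "b": "Confirm"}
via_path = matches(trace_nodes, trace_edges, pattern_nodes, [("a", "b", True)])
direct = matches(trace_nodes, trace_edges, pattern_nodes, [("a", "b", False)])
```

Here the transitive edge matches because Confirm is reachable from Flights, while the same pattern with a direct edge does not match.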
Example Query (External Events)
[Figure: example query pattern. Node labels: chooseTravel, Any, Start, Flights (with $Airline = “BA”), Hotels, Confirm.]
We distinguish three classes of distributions, according to their level of dependency
Memory-less (Markovian): no dependencies between formulas.
Bounded-memory: dependency on (at most) the last B values of each formula.
General
Distribution Classes (External Events)
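The difference between the first two classes can be illustrated with a small sketch. It assumes, as the slides do not spell out, that the likelihood of a sequence of guard values is the product of the probabilities of the individual values; the probability tables and function names are invented for the example.

```python
from math import prod

# Memory-less: every value of the formula has a fixed probability.
MEMORYLESS = {"BA": 0.6, "AF": 0.4}

# Bounded-memory with B = 1: the probability of a value depends on the
# previous value of the same formula (empty history at the start).
BOUNDED = {
    (): {"BA": 0.6, "AF": 0.4},
    ("BA",): {"BA": 0.9, "AF": 0.1},
    ("AF",): {"BA": 0.2, "AF": 0.8},
}


def likelihood_memoryless(values):
    return prod(MEMORYLESS[v] for v in values)


def likelihood_bounded(values, table, b):
    p, history = 1.0, ()
    for v in values:
        p *= table[history][v]
        # Keep only the last b values as the memory.
        history = (history + (v,))[-b:] if b > 0 else ()
    return p


values = ["BA", "BA", "AF"]
p_memoryless = likelihood_memoryless(values)       # 0.6 * 0.6 * 0.4
p_bounded = likelihood_bounded(values, BOUNDED, 1)  # 0.6 * 0.9 * 0.1
```

In the general class the table would have to be indexed by the entire history, so no fixed-size memory suffices.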
Our Example Specification (External Events)
[Figure: the example BP specification for the travel-booking application shown earlier, repeated.]
For memory-less distributions, we find the TOP-K matches in PTIME (data complexity), by computing a compact representation of the output
For bounded-memory distributions, the problem is NP-complete in the data size, but we give powerful heuristics
In all settings, the problem is NP-complete in the query size
For general distributions, we show undecidability
Results (External Events)
For memory-less distributions, we use a dynamic programming algorithm
It gradually computes a table holding the i-th most probable trace rooted at any activity
Implemented with satisfactory performance
Algorithms: intuition (External Events)
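The table-based idea can be sketched for a simplified setting. This is not the authors' algorithm: it assumes a non-recursive specification, a memory-less distribution given as a fixed probability per implementation, and no query pattern. The `SPEC` contents and the `top_k` helper are hypothetical.

```python
import heapq
from functools import lru_cache

# activity -> list of (guard value, probability, child activities)
SPEC = {
    "chooseTravel": [
        ("flights only", 0.7, ("Flights",)),
        ("flights + hotels", 0.3, ("Flights", "Hotels")),
    ],
    "Flights": [("BA", 0.6, ()), ("AF", 0.4, ())],
    "Hotels": [("Marriott", 0.8, ()), ("CrownePlaza", 0.2, ())],
}


@lru_cache(maxsize=None)  # the cache acts as the table, one row per activity
def top_k(activity, k):
    """Return up to k (probability, flow) pairs rooted at `activity`, best first."""
    if activity not in SPEC:  # atomic activity: a single flow with probability 1
        return ((1.0, (activity, None, ())),)
    candidates = []
    for guard, p_guard, children in SPEC[activity]:
        partial = [(p_guard, ())]
        for child in children:
            # Combine with the child's best flows; keeping only the k best
            # partial combinations is safe because all factors are non-negative.
            partial = heapq.nlargest(
                k,
                [(p * pc, flows + (f,))
                 for p, flows in partial
                 for pc, f in top_k(child, k)],
                key=lambda c: c[0],
            )
        candidates.extend((p, (activity, guard, flows)) for p, flows in partial)
    return tuple(heapq.nlargest(k, candidates, key=lambda c: c[0]))


best = top_k("chooseTravel", 3)
```

With these made-up probabilities, the three best flows have likelihoods of about 0.42, 0.28, and 0.144. Supporting recursion would require computing the table entries gradually rather than by plain memoized recursion.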
For bounded-memory distributions, we build a memory-less BP in which each activity is a “state” holding all the information relevant for future computations.
This exponentially “explodes” the number of compound activity names.
Optimizations exploit conditional independencies between formulas.
We compute an approximation of the “actual memory” and maintain only that.
Algorithms: intuition, cont. (External Events)
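To give a feel for the blowup, the following sketch enumerates hypothetical "state" names formed by pairing an activity with a remembered history of up to B values of a single formula. The naming scheme and the `state_names` helper are assumptions for illustration only; the actual construction tracks every formula and is not detailed in the slides.

```python
from itertools import product


def state_names(activities, values, b):
    """Pair each activity with every possible history of at most b values."""
    histories = [h for n in range(b + 1) for h in product(values, repeat=n)]
    return [(a, h) for a in activities for h in histories]


names_b2 = state_names(["Flights", "Hotels"], ["BA", "AF"], 2)
names_b3 = state_names(["Flights", "Hotels"], ["BA", "AF"], 3)
```

With two activities and two possible values, B = 2 already yields 14 names and B = 3 yields 30, which is why maintaining only the memory that is actually needed matters.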
Partial Tracing, due to lack of storage, confidentiality,…
• Naïve tracing records all activities accurately
• Semi-Naïve tracing contains only partial information on the names of some activities
• Selective tracing may omit some activity occurrences
Tracing systems (called types) are represented by a renaming function and a deletion set
Types of Partial Traces (Partial Tracing)
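A minimal sketch of a tracing type as a renaming function plus a deletion set, applied to a run flattened into a list of activity names. The flat-list encoding, the `observe` helper, and the convention that the deletion set refers to original (pre-renaming) names are assumptions; the activity names follow the example BP of this section.

```python
def observe(run, rename, deleted):
    """Drop activities in `deleted`, then rename the remaining ones."""
    return [rename.get(a, a) for a in run if a not in deleted]


# A hypothetical run, flattened into a sequence of activity names.
run = ["Trip", "Search", "LuxHotel", "Credit2", "LuxFlight", "Credit2", "Luxury"]

# Renaming that hides which credit service and which service class was used.
hide_details = {
    "Credit1": "Credit", "Credit2": "Credit",
    "LuxHotel": "Hotel", "LuxFlight": "Flight",
}

naive = observe(run, rename={}, deleted=set())                  # everything recorded
semi_naive = observe(run, rename=hide_details, deleted=set())   # names blurred
selective = observe(run, rename=hide_details, deleted={"Luxury"})  # Luxury omitted
```

Naïve tracing is the special case with an identity renaming and an empty deletion set; semi-naïve tracing uses a non-trivial renaming only; selective tracing also uses a non-empty deletion set.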
[Figure: example BP for trip booking. Activities: Trip, Search, Hotel, Flight, Credit1, Luxury, LuxHotel, LuxFlight, Credit2.]
Example BP (Partial Tracing)
[Figure: two naïve traces of the example BP, with all activity names recorded exactly. Labels of the first: Trip, Search, Hotel, Flight, Credit1. Labels of the second: Trip, Search, LuxHotel, Credit2, LuxFlight, Credit2, Luxury, Credit1.]
Naïve Traces (Partial Tracing)
[Figure: the same two traces under semi-naïve tracing. Labels of the first: Trip, Search, Hotel, Flight, Credit. Labels of the second: Trip, Search, Hotel, Credit, Flight, Credit, Luxury, Credit.]
Semi-Naïve Traces (Partial Tracing)
[Figure: the same two traces under selective tracing, with Luxury omitted. Labels of the first: Trip, Search, Hotel, Flight, Credit. Labels of the second: Trip, Search, Hotel, Credit, Flight, Credit, Credit.]
Selective Traces (Partial Tracing)
VLDB sneak preview…
Type inference: Given an input type and a query over its traces, infer a type representing exactly the qualifying traces
Type checking: Given also an output type, verify that the qualifying traces conform to it
The practical motivation stems from optimizing queries over repositories of execution traces
Type Inference & Checking (Partial Tracing)
Type Inference
• Impossible with only naïve tracing
• Possible, but with exponential blowup, for semi-naïve tracing systems
• In PTIME for selective tracing systems
Type Checking
• NP-hard and solvable in EXPTIME for naïve and semi-naïve tracing systems
• Undecidable for selective tracing systems
Complexity (Partial Tracing)
Selective tracing is “ideal” for type inference
Other methods should be considered for type checking
Practical Implication (Partial Tracing)
Given a partial trace, what is its most likely origin?
Or, more generally, given a pattern (query) of partial traces, what are the most likely origins of partial traces of this pattern?
Good news: All query evaluation algorithms extend to this context (even without knowing the tracing system…)
Let's talk about (top-k) queries (Partial Tracing)
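The question "what is the most likely origin of a partial trace?" can be illustrated by brute force over an explicit list of candidate runs. This sketch is not the evaluation algorithm referred to above: it assumes a known tracing type, a finite candidate set, and made-up likelihoods, and the `observe` and `most_likely_origins` helpers are hypothetical.

```python
def observe(run, rename, deleted):
    """Partial trace of a run: drop deleted activities, rename the rest."""
    return tuple(rename.get(a, a) for a in run if a not in deleted)


def most_likely_origins(partial, candidates, rename, deleted, k=1):
    """Return the k most likely (likelihood, run) pairs that yield `partial`."""
    origins = [
        (p, run) for run, p in candidates
        if observe(run, rename, deleted) == partial
    ]
    return sorted(origins, key=lambda o: o[0], reverse=True)[:k]


# Candidate full runs with illustrative likelihoods (assumed, not from the slides).
candidates = [
    (("Trip", "Search", "Hotel", "Credit1"), 0.5),
    (("Trip", "Search", "LuxHotel", "Credit2"), 0.3),
    (("Trip", "Search", "Flight", "Credit1"), 0.2),
]
rename = {"Credit1": "Credit", "Credit2": "Credit",
          "LuxHotel": "Hotel", "LuxFlight": "Flight"}

seen = ("Trip", "Search", "Hotel", "Credit")
top = most_likely_origins(seen, candidates, rename, deleted=set(), k=2)
```

Two candidate runs collapse to the observed partial trace, and they are returned in decreasing order of likelihood.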
(Probabilistic) Recursive State Machines with temporal logic as query language
Probabilistic Relational DBs
Probabilistic XML
Graph grammars with MSO (or FO) as query language
BP and Web applications mining
Related work (Related & Future Work)
Practical applications:
• Web-site design
• On-line advertisements
• Improved business logic
Enriched Query Language
• Joins
• Data values
• Projection queries with further aggregation functions
Distribution
Optimization
Efficient Type checking under some restrictions
Inference of specifications/probability distributions
Future work (Related & Future Work)