Scalable Trigger Processing

Discussion of publication by Eric N. Hanson et al.

Int Conf Data Engineering 1999

CS561

Motivation

Triggers popular for:
  Integrity constraint checking
  Alerting, logging, etc.

Commercial database systems:
  Limited triggering capabilities
  1 trigger/update-type on table; or at best 100.

But:
  Current technology doesn’t scale well
  And, internet and web-based applications may need millions of triggers.

An Example Trigger

Example “stock ticker notification”:
  Stock holding: 100*IBM
  Query: Inform an agent whenever the price of the stock holding crosses $10,000

Create Trigger stock-watch
from quotes q
on update(q.price)
when q.name=‘IBM’ and 100*q.price > 10,000
do raise event ThresholdCrossed(100*q.price).

Note:
  We may need 1,000 or millions of such triggers
  Web interface may allow users to create such triggers

What Next?

Problem description
TriggerMan system architecture
Predicate index
Trigger processing

Problem Definition

Given: Relational DB, Trigger statements, Data Stream
Find: Triggers corresponding to each stream item
Objective: Scalable trigger processing system

Assumptions:
  Number of distinct structures of trigger expressions is relatively small
  All trigger expression structures small enough to fit in main memory

The Problem, once more.

Requires millions of triggers (on huge data).

Steps for trigger processing:
  Event monitoring
  Condition evaluation
  Executing triggered action

Response time for database operations critical!

Related Work

Range Predicates, Marking-Based [Hans96b, Ston90] (large memory, complicated storage)

AI [Forg82, Mira87] (smaller rule set)

Parallel Processing [Gupt89, Hell98]

Indexing ECA Model (not scalable)

Overall Driving Idea

If a large number of triggers are created, then many have the same format.

Such triggers share the same expression signature; only the substituted parameters (constants) differ.

Group predicates from trigger conditions based on expression signatures into equivalence classes

Store them in efficient main memory data structures

TriggerMan System

Components

TriggerMan Datablade (lives inside Informix)

Data Sources
  Local/remote tables/streams; must capture updates and transmit to TriggerMan (place in a queue)

TriggerMan Client applications
  Create/drop triggers, etc.

TriggerMan Driver
  Periodically invokes TmanTest() fn to perform condition testing and action execution.

TriggerMan console
  Direct user interaction interface for trigger creation, system shutdown, etc.

TriggerMan Syntax

Trigger syntax

create trigger <triggerName> [in setName]
[optionalFlags]
from fromList
[on eventSpec]
[when condition]
[group by attributeList]
[having groupCondition]
do action

Example : Salary Increases

Update Fred’s salary when Bob’s salary is updated

create trigger updateFred
from emp
on update (emp.salary)
when emp.name = 'Bob'
do execSQL 'update emp set salary=:NEW.emp.salary where emp.name=''Fred'''

Example : Real Estate Database

“If a new house is added in a neighborhood that salesperson Iris represents, then notify her”

House (hno, address, price, nno, spno)
Salesperson (spno, name, phone)
Represents (spno, nno)
Neighborhood (nno, name, location)

create trigger IrisHouseAlert
on insert to house
from salesperson s, house h, represents r
when s.name = ‘Iris’ and s.spno=r.spno and r.nno=h.nno
do raise event NewHouseInIrisNeighborhood(h.hno, h.address)

Trigger Condition Structure

Expression signature

Expression signature consists of:
  Data source ID
  Operation code, e.g. insert, delete, etc.
  Generalized Expression (parameterized)

Example generalized expression (syntax tree with "=" at the root): Emp.name = CONSTANT

FROM: Data src: emp
ON: Event: update
WHEN: boolean exp.

Condition structure (contd)

Steps to obtain canonical representation of WHEN clause:
  Translate expression to CNF
  Group each conjunct by the data sources it refers to

Selection predicate will be of form:

(C11 OR C12 OR ..) AND ... AND (Ck1 OR …),

where each Cij refers to the same tuple variable.

Each conjunct refers to zero, one, or more data sources
Group conjuncts by set of sources they refer to
  If one data source, then selection predicate
  If two data sources, then JOIN predicate
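The CNF translation step can be illustrated with a small Python sketch. It is not TriggerMan's implementation: the tuple-based expression encoding, the function name to_cnf, and the sample when clause are all assumptions made for illustration, and negation is assumed to have been pushed down into the atoms already.

```python
def to_cnf(expr):
    """Convert a boolean expression to CNF.

    expr is either an atom (a string) or a tuple (op, left, right) with
    op in {"and", "or"}. The result is a list of clauses; each clause is
    a list of atoms that are OR-ed together, and the clauses are AND-ed.
    """
    if isinstance(expr, str):
        return [[expr]]
    op, left, right = expr
    lhs, rhs = to_cnf(left), to_cnf(right)
    if op == "and":
        # AND of two CNFs: just concatenate the clause lists.
        return lhs + rhs
    if op == "or":
        # OR distributes over AND: one clause per pair of clauses.
        return [a + b for a in lhs for b in rhs]
    raise ValueError(f"unsupported operator: {op}")


# Hypothetical when clause: (dept = 'toys' AND sal > 100) OR name = 'Bob'
when_clause = ("or",
               ("and", "emp.dept = 'toys'", "emp.sal > 100"),
               "emp.name = 'Bob'")
cnf = to_cnf(when_clause)
for clause in cnf:
    print(" OR ".join(clause))
```

Every atom in this example refers to the single tuple variable emp, so both resulting clauses would be grouped as selection predicates on emp.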

Triggers for stock ticker notification

Create trigger T1 from stock when stock.ticker = ‘GOOG’ and stock.value < 500 do notify_person(P1)

Create trigger T2 from stock when stock.ticker = ‘MSFT’ and stock.value < 30 do notify_person(P2)

Create trigger T3 from stock when stock.ticker = ‘ORCL’ and stock.value < 20 do notify_person(P3)

Create trigger T4 from stock when stock.ticker = ‘GOOG’ do notify_person(P4)

Expression Signature

Idea: Common structures in condition of triggers

Expression Signature: E1: stock.ticker = const1 and stock.value < const2

Expression Signature:

E2: stock.ticker = const3

Expression signature defines equivalence class of all instantiations of expression with different constants

Equivalence class of E1:
T1: stock.ticker = ‘GOOG’ and stock.value < 500
T2: stock.ticker = ‘MSFT’ and stock.value < 30
T3: stock.ticker = ‘ORCL’ and stock.value < 20

Equivalence class of E2:
T4: stock.ticker = ‘GOOG’
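One way to picture how conditions fall into equivalence classes is to replace every constant with a placeholder and group by the resulting text. The Python sketch below does this with a regular expression; it is an illustration under assumptions, not the paper's algorithm (which works on parsed expressions). The names CONSTANT and signature are hypothetical, and placeholders are numbered per condition, so T4's signature comes out as "stock.ticker = const1" rather than the slide's const3.

```python
import re
from collections import defaultdict

# A constant is a single-quoted string or a standalone number.
CONSTANT = re.compile(r"'[^']*'|\b\d+(?:\.\d+)?\b")


def signature(condition):
    """Return (signature text, list of extracted constants)."""
    constants = []

    def replace(match):
        constants.append(match.group(0).strip("'"))
        return f"const{len(constants)}"

    return CONSTANT.sub(replace, condition), constants


triggers = {
    "T1": "stock.ticker = 'GOOG' and stock.value < 500",
    "T2": "stock.ticker = 'MSFT' and stock.value < 30",
    "T3": "stock.ticker = 'ORCL' and stock.value < 20",
    "T4": "stock.ticker = 'GOOG'",
}

# Group trigger IDs (with their constants) by expression signature.
classes = defaultdict(list)
for trigger_id, condition in triggers.items():
    sig, consts = signature(condition)
    classes[sig].append((trigger_id, consts))

for sig, members in classes.items():
    print(sig, "->", members)
```

The four triggers collapse into two classes: T1 to T3 share one signature and T4 has its own.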

What to do now

There are only a few distinct expression signatures, so build data structures to represent them explicitly (in memory)

Create constant tables that store all different constants, and link them to their expression signature

Main Structures

A-Treat Network
  Network for trigger condition testing
  For a trigger to fire, all conditions must be true

Expression Signature
  Common structure in a trigger
  E1: stock.ticker = const1 and stock.value < const2

Constant Tables
  Constants for each expression signature

A-Treat Network to represent a trigger

For each trigger condition: stock.ticker = const1 and stock.value < const2

[Figure: Root with two alpha-nodes, one per predicate]
  Node 1 (alpha-node): stock.ticker = const1
  Node 2 (alpha-node): stock.value < const2

Condition Testing

A-Treat network is a discrimination network for trigger condition testing.

For a predicate to be satisfied, all its conjuncts should be true.

This is checked using A-Treat network.

A-Treat network (Hanson 1992)

Define rule SalesClerk
If emp.sal>30,000
And emp.dno=dept.dno
And dept.name=“sales”
And emp.jno=job.jno
And job.title=“clerk”
Then Action

Expression Signature Table

Ex. ID   Data Source   Signature Description   Constant Table   Number of Constants   Constant Organization
E1       stock         …                       const_e1         2                     Main memory
E2       stock         …                       const_e2         1                     Main memory

E1: stock.ticker = const1 and stock.value < const2
E2: stock.ticker = const3

Constant Tables

Tables of constants in trigger conditions

const_e1:

Ex. ID   Trigger ID   Constant 1   Constant 2   Next Node   Rest
E1       T1           GOOG         500          Node 2
E1       T2           MSFT         30           Node 2
E1       T3           ORCL         20           Node 2

T1: stock.ticker = ‘GOOG’ and stock.value < 500
T2: stock.ticker = ‘MSFT’ and stock.value < 30
T3: stock.ticker = ‘ORCL’ and stock.value < 20

const_e2:

Ex. ID   Trigger ID   Constant 1   Next Node   Rest
E2       T4           GOOG         Null

T4: stock.ticker = ‘GOOG’

Tables

Primary tables
  trigger_set (tsID, name, comments, creation_date, isEnabled)
  Trigger (triggerID, tsID, name, comments, trigger_text, creation_date, isEnabled, …)

Trigger cache in main memory for recently accessed triggers.

Predicate Index

Tables
  expression_signature(sigID, dataSrcID, signatureDesc, constTableName, constantSetSize, constantSetOrganization)
  const_tableN(exprID, triggerID, nextNetworkNode, const1, … constK, restOfPredicate)

Root of predicate index linked to data source predicate indices
Each data source contains an expression signature list
Each expression signature links to its constant table.
Index expressions on most selective conjunct (rest tested on the fly).

Predicate Index

Goal: Given an update, identify all predicates that match it.

[Figure: the root locates each data source's predicate index via hash(src-ID).]

Processing Trigger Definition

Parse the trigger and validate it
Convert the when clause to conjunctive normal form
Group the conjuncts by the distinct sets of tuple variables they refer to
Form a trigger condition graph, that is, an undirected graph with a node for each tuple variable and an edge for each join predicate.
Build A-Treat network
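The trigger condition graph can be sketched in Python for the IrisHouseAlert example. This is a simplified illustration, not TriggerMan's internal representation: the function name build_condition_graph and the (text, variables) encoding of conjuncts are assumptions, and conjuncts over more than two tuple variables are rejected to keep the sketch short.

```python
def build_condition_graph(tuple_vars, conjuncts):
    """Build an undirected condition graph.

    Nodes are tuple variables, each carrying its selection predicates;
    edges are unordered pairs of tuple variables carrying join predicates.
    """
    selections = {var: [] for var in tuple_vars}
    joins = {}
    for text, variables in conjuncts:
        variables = frozenset(variables)
        if len(variables) == 1:
            (var,) = variables
            selections[var].append(text)
        elif len(variables) == 2:
            # frozenset key makes the edge undirected: {s, r} == {r, s}
            joins.setdefault(variables, []).append(text)
        else:
            raise ValueError(f"unsupported conjunct: {text}")
    return selections, joins


# Conjuncts of the IrisHouseAlert when clause, already in CNF.
conjuncts = [
    ("s.name = 'Iris'", {"s"}),
    ("s.spno = r.spno", {"s", "r"}),
    ("r.nno = h.nno", {"r", "h"}),
]
selections, joins = build_condition_graph(["s", "h", "r"], conjuncts)
print(selections)
for edge, predicates in joins.items():
    print(sorted(edge), predicates)
```

The graph has three nodes (s, h, r), one selection predicate on s, and two join edges: s to r and r to h.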

Processing trigger definition (2)

For each selection predicate:
  If predicate with same signature not seen before:
    Add signature of predicate to list
    And, add signature to expression_signature table
    If signature has a constant placeholder in it, create a constant table for the signature.
    Add constants
  Else if predicate has constants, add a row to the constant table for the expression
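The registration steps above can be sketched in Python with plain dictionaries standing in for the expression_signature table and the constant tables. This is a minimal sketch under assumptions: the function name add_selection_predicate, the generated IDs (E1, E2, ...), and the row layout are hypothetical, and the signature text is assumed to have been computed already.

```python
expression_signature = {}   # (data source, signature text) -> signature ID
const_tables = {}           # signature ID -> list of constant-table rows


def add_selection_predicate(trigger_id, data_src, sig_text, constants):
    """Register one selection predicate; return its signature ID."""
    key = (data_src, sig_text)
    if key not in expression_signature:
        # Signature not seen before: record it, and create a constant
        # table if the signature has constant placeholders.
        sig_id = f"E{len(expression_signature) + 1}"
        expression_signature[key] = sig_id
        if constants:
            const_tables[sig_id] = []
    sig_id = expression_signature[key]
    if constants:
        # New or existing signature: add a row with this trigger's constants.
        const_tables[sig_id].append(
            {"triggerID": trigger_id, "constants": constants})
    return sig_id


E1_TEXT = "stock.ticker = const1 and stock.value < const2"
E2_TEXT = "stock.ticker = const1"
add_selection_predicate("T1", "stock", E1_TEXT, ("GOOG", 500))
add_selection_predicate("T2", "stock", E1_TEXT, ("MSFT", 30))
add_selection_predicate("T3", "stock", E1_TEXT, ("ORCL", 20))
add_selection_predicate("T4", "stock", E2_TEXT, ("GOOG",))

print(expression_signature)
print(const_tables)
```

After the four calls there are two signatures; the first constant table holds three rows (T1 to T3) and the second holds one (T4).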

Alternate Organizations

Storage for the expression signature’s equivalence class:
  Main memory lists
  Main memory index
  Non-indexed database table
  Indexed database table

For each expression signature, choose a structure depending on number of triggers.

Trade-off: efficiency vs. scalability

Processing update descriptors

On getting an update descriptor (token) (data src ID, operator code, old/new tuple):
  Locate data source predicate index from root of predicate index.
  For each expression signature, find constants matching the token using index.
  Check additional predicate clauses against the token.
  When all predicate clauses of a trigger have matched, pin the trigger in main memory
  Bring in A-Treat network representing that trigger to process the remaining part of the trigger, like joins, etc.
  If trigger condition is satisfied, execute action.
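The selection-matching part of these steps can be sketched in Python using the stock triggers T1 to T4. This is an illustrative sketch only: the nested-dictionary layout, the field names, and the function match_token are assumptions, the operation code is ignored, and the later stages (pinning the trigger, running the A-Treat network for joins, executing the action) are left out.

```python
# Predicate index: data source ID -> expression signatures for that source.
# Each signature indexes its constants on the equality conjunct
# (stock.ticker = const1); the rest of the predicate is tested on the fly.
predicate_index = {
    "stock": [
        {   # E1: stock.ticker = const1 and stock.value < const2
            "sig_id": "E1",
            "indexed_attr": "ticker",
            "rest": lambda row, tup: tup["value"] < row["const2"],
            "const_index": {
                "GOOG": [{"trigger": "T1", "const2": 500}],
                "MSFT": [{"trigger": "T2", "const2": 30}],
                "ORCL": [{"trigger": "T3", "const2": 20}],
            },
        },
        {   # E2: stock.ticker = const3 (no remaining conjuncts)
            "sig_id": "E2",
            "indexed_attr": "ticker",
            "rest": lambda row, tup: True,
            "const_index": {"GOOG": [{"trigger": "T4"}]},
        },
    ],
}


def match_token(data_src, new_tuple):
    """Return IDs of triggers whose selection predicate matches the token."""
    matched = []
    # Step 1: locate the data source predicate index from the root.
    for sig in predicate_index.get(data_src, []):
        # Step 2: probe the constant index with the token's attribute value.
        key = new_tuple[sig["indexed_attr"]]
        for row in sig["const_index"].get(key, []):
            # Step 3: check the remaining predicate clauses.
            if sig["rest"](row, new_tuple):
                matched.append(row["trigger"])
    return matched


print(match_token("stock", {"ticker": "GOOG", "value": 495}))
```

For the token (ticker=GOOG, value=495), the probe on ticker finds T1 in E1 and T4 in E2; T1's remaining clause 495 < 500 holds, so both triggers match.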

Processing an Update

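[Figure: the token Update Stock (ticker=GOOG, value=495) enters at the Root. The Root leads to the stock predicate index and to the other source predicate indexes …. The stock index holds signatures E1 and E2, which link to constant tables const_e1 and const_e2; const_e1 is reached through the index of stock.ticker=const1.]

E1: stock.ticker = const1 and stock.value < const2

const_e1:

Trigger ID   Constant 1   Constant 2   Next Node
T1           GOOG         500          Node 2
T2           MSFT         30           Node 2
T3           ORCL         20           Node 2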

Concurrency

Better scalability even on single processor

Concurrency

Identified elements that can be parallelized:
  Token-level: multiple tokens processed in parallel
  Condition-level: multiple selection conditions tested concurrently
  Rule-action-level: multiple rule actions fired at the same time
  Data-level: set of data values in the network processed in parallel
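Token-level concurrency, the first item above, can be sketched in Python with a thread pool that hands each token to a worker. This is only a structural illustration under assumptions: the TRIGGERS list and process_token function are hypothetical, and CPython threads show the shape of the idea rather than real CPU parallelism.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory triggers: (trigger ID, ticker, upper limit on value).
TRIGGERS = [("T1", "GOOG", 500), ("T2", "MSFT", 30), ("T3", "ORCL", 20)]


def process_token(token):
    """Test one update token against all triggers; return matching IDs."""
    ticker, value = token
    return [tid for tid, t, limit in TRIGGERS if t == ticker and value < limit]


tokens = [("GOOG", 495), ("MSFT", 45), ("ORCL", 19)]

# Each token is processed independently, so tokens can be handled in parallel.
# pool.map returns results in the same order as the input tokens.
with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(process_token, tokens))

for token, fired in zip(tokens, results):
    print(token, "->", fired)
```

The tokens share only read-only trigger data here, which is what makes this level of parallelism straightforward; shared mutable state would need synchronization.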

Conclusion : Overall Key Points

If a large number of triggers are created, many of them have almost the same format

Group triggers with same structure together into expression signature equivalence classes

Number of distinct signatures is small enough to fit into main memory (index)

Develop a selection predicate index structure

Architecture to build a scalable trigger system.

