Taxonomy-Based Software Construction for Algorithmic Families

Bruce W. Watson

with Loek Cleophas & Derrick Kourie

VaMoS 2016 2016.01.28

Aim & Motivation

RuZA Workshop, CSIR 2014/08/14

Aim: Give an overview and examples of taxonomies and of the TABASCO approach (TAxonomy-BAsed Software COnstruction)

Motivation: Understand and bring order to an (algorithmic) domain, and construct reusable (algorithmic) software for it

Random Quotes

Bjarne Stroustrup: “infrastructure software” has stronger quality and elegance requirements

C.A.R. (Tony) Hoare: “…[your] taxonomies are to the field of algorithmics what the Standard Model is to Particle Physics…”

TABASCO Exercises

• Keyword pattern matching
• Finite automata construction
• Deterministic finite automata minimization
• Minimal acyclic deterministic finite automata construction
• Lempel-Ziv-style compression
• Tree automata (pattern matching & acceptance)
• Graph representations
• Approximate & 2D pattern matching

Case Study: Generalised Stringology

• Regular Grammar and Regular Expression
  – Different types, transformations between them
• Problems
  – Membership/Acceptance
  – Keyword Pattern Matching (KPM)
• Finite Automaton
  – Nondeterministic with/without epsilon-transitions, deterministic
• Theoretical Results (1950s)
  – Equivalence of NFA and DFA (subset construction)
  – Equivalence of RG, RE, and FA
  – Solve by constructing and using FA based on RG/RE

Case Study: Generalised Stringology (cont.)

• In practice (1960s to now):
  – Many applications
    • Natural language text search
    • DNA processing
    • Network intrusion and virus detection
  – Many FA constructions and acceptance/KPM algorithms, on the order of 10² of them
    • More efficient; for specific situations
  – Difficult to find, understand, compare
  – Separation between theory and practice
  – Hard to compare and choose implementations

Motivation: Arbology (Tree Formal Languages)

• Regular Tree Grammar (and Regular Tree Expression)
  – Different types, transformations between them
• Problems
  – Membership/Tree (Grammar) Acceptance (TGA), Tree Parsing
  – Tree Pattern Matching (TPM)
• Finite Tree Automaton (TA)
  – Nondeterministic with/without epsilon-transitions, deterministic
  – Undirected, root-to-frontier (RF), frontier-to-root (FR)
• Theoretical Results (1960s)
  – Equivalence of TAs (except DRFTA) (subset construction)
  – Equivalence of RTG, RTE, and TA (except DRFTA)
  – Solve by constructing and using TA based on RTG

Motivation: Tree Formal Languages/Algorithmics

• In practice (ca. 1975 to now):
  – Quite a few application domains as well
    • Code generation
    • Term rewriting
    • Model transformation
  – Many TA constructions, TGA/TPM algorithms
    • More efficient; for specific situations
  – Difficult to find, understand, compare
  – Separation between theory and practice
  – Hard to compare and choose implementations

Generic Domain “Attractions”

• Well-established theory
• Algorithmic problems: related, with related solutions
• Many algorithms
• Many applications

Generic Domain Deficiencies

• Inaccessibility of theory and algorithms
• Difficulty of understanding and comparing algorithms
  – Difference in style
  – Difference in formality
• Separation between theory and practice
• Lack of large collection of implementations
• Difficulty of choosing between algorithms

TABASCO: Domain Deficiencies & Taxonomies

To counter the generic domain deficiencies listed above:

• Classification (in particular Taxonomy)
  – Show commonality & variation in algorithm & data representation
  – Show correctness
  – Easily find and compare algorithms

TABASCO: Domain Deficiencies & Toolkits

To counter the generic domain deficiencies listed above:

• Toolkit, GUI, DSL
  – Give insight into algorithm properties, performance
  – Understand and compare algorithms in practice
  – Allow easy choice and use

TABASCO: Steps

Process consists of multiple steps:
1. Selection of domain
2. Literature survey
3. Classification construction
4. Toolkit design
5. Toolkit implementation
6. Benchmarking
7. DSL/GUI design
8. DSL/GUI implementation

Classifications: Biological Taxonomies

• Classify organisms
• From abstract, general to concrete, specific
• Properties (details) explicit
• Allow comparison

[Figure: excerpt of a biological taxonomy tree. Eukaryotes branch into Plantae and Animalia (among others); below Animalia, Mammalia leads via Proboscidea and Elephantidae to Loxodonta africana, and via Primates to Homo sapiens. Intermediate levels are elided.]

Classifications: Algorithm Taxonomies

• Similar to biological taxonomies
• Algorithm taxonomies classify algorithms based on essential details
• Depicted as tree/DAG: nodes refer to algorithms, branches to details
• Algorithms solving one algorithmic problem
  – From abstract, general to concrete, specific
  – Root represents high-level algorithm

[Figure: tree acceptance taxonomy graph. Labels: t-acceptor, rf, fr, det, match-set, rec, tabulate, filter, s-path, sp-matcher, aca-spm, drfta-spm.]

Taxonomies: Presentation & Correctness (Top-down)

• Root represents high-level algorithm
  – With pre-/postcondition, invariants, ...
  – Correctness easily shown
• Adding detail
  – Obtains refinement/variation (from literature or new)
  – Branch connecting algorithm node to child node
  – Associated correctness arguments: correctness-preserving
• Correctness of root and of details on root path imply correctness of node: correctness-by-construction approach (Dijkstra et al., Eindhoven; Kourie & Watson, 2012)

[Figure: keyword pattern matching taxonomy graph. Detail labels include P, S, E, AC, AC-OPT, AC-FAIL, KMP-FAIL, LS, OKW, INDICES, GS, NLAU, OLAU, NFS, OPT, BMCW, NLA, CW, BM, SP, LMIN, SSD, EGC, BMH, OBM, MO, SL, MI, FWD, REV, OM, NONE, SFC, FAST, SLFC and LSKP; subtrees are marked Aho-Corasick, Commentz-Walter, Boyer-Moore and Knuth-Morris-Pratt.]

Taxonomies: Presentation & Correctness (Top-down, cont.)

• Allow comparison
  – Commonalities lead to common path from root*
• Multiple paths to same solution possible
• Main goal: improve understanding of algorithms and their relations, i.e. commonalities and variabilities
• Taxonomy forms domain model, classification
  – So do feature model, formal concept lattice, topic map, ...

[Figure: the keyword pattern matching taxonomy graph, with subtrees marked Aho-Corasick, Commentz-Walter, Boyer-Moore and Knuth-Morris-Pratt.]

Dijkstra refinement

IF statement

Refine {P} S {Q} to

{ P }
if G0 → { P ∧ G0 } S0 { Q }
[] G1 → { P ∧ G1 } S1 { Q }
fi
{ Q }

if P ⇒ G0 ∨ G1

For example

{ pre m and n are integers }
if m ≥ n → x := m; y := n
[] m ≤ n → x := n; y := m
fi
{ post x = m max n ∧ y = m min n }

Note nondeterminism (both guards hold when m = n)
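
As a rough illustration, the max/min example can be transcribed into Java with the postcondition checked at run time. This is only a sketch: the class and method names are hypothetical, and because Java's if/else is deterministic, the first branch is arbitrarily chosen when m = n, whereas the guarded-command version leaves that choice open.

```java
// Hypothetical sketch: the guarded-command max/min example in Java.
final class MaxMin {
    /** Returns {x, y} with x = max(m, n) and y = min(m, n). */
    static int[] maxMin(int m, int n) {
        int x;
        int y;
        if (m >= n) {   // guard G0: m >= n (also taken when m == n)
            x = m;
            y = n;
        } else {        // remaining case m < n, covered by guard G1: m <= n
            x = n;
            y = m;
        }
        // Postcondition from the slide: x = m max n and y = m min n.
        if (x != Math.max(m, n) || y != Math.min(m, n)) {
            throw new IllegalStateException("postcondition violated");
        }
        return new int[] {x, y};
    }
}
```

Calling `MaxMin.maxMin(3, 7)` yields the pair 7, 3. The proof obligation P ⇒ G0 ∨ G1 corresponds to the fact that the two branches together cover every pair of integers.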

DO loops

For invariant I and variant expression V we get

{ P }
{ I }
do G → { I ∧ G }
       S0
       { I ∧ (V decreased) }
od
{ I ∧ ¬G }
{ Q }

Remember to check P ⇒ I and I ∧ ¬G ⇒ Q
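
The loop schema can be made concrete with a small Java sketch. The example (integer division by repeated subtraction) and all names in it are assumptions chosen for illustration, not taken from the slides. Here the invariant I is a = q·b + r ∧ r ≥ 0, the guard G is r ≥ b, and the variant V is r; the run-time checks stand in for the proof obligations.

```java
// Hypothetical sketch: a DO loop with its invariant and variant checked at run time.
final class DivMod {
    /** Computes {q, r} with a = q*b + r and 0 <= r < b, assuming a >= 0 and b > 0. */
    static int[] divMod(int a, int b) {
        if (a < 0 || b <= 0) {                  // precondition P
            throw new IllegalArgumentException("need a >= 0 and b > 0");
        }
        int q = 0;
        int r = a;                              // establishes I, so P => I
        while (r >= b) {                        // guard G
            int variantBefore = r;              // V = r
            r -= b;                             // loop body S0
            q += 1;
            check(a == q * b + r && r >= 0);    // I holds again
            check(r < variantBefore);           // V decreased
        }
        // I and not G imply Q: a = q*b + r and 0 <= r < b.
        check(a == q * b + r && 0 <= r && r < b);
        return new int[] {q, r};
    }

    private static void check(boolean condition) {
        if (!condition) {
            throw new IllegalStateException("assertion failed");
        }
    }
}
```

Because r is bounded below by 0 and strictly decreases on every iteration, the loop terminates; the invariant together with the negated guard then gives the postcondition.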

Taxonomies Example: Keyword Pattern Matching

• Detail choice and order depend on personal preference & domain understanding
• Inclusion of different orders for single algorithm leads to directed acyclic graph
• Initial version by Watson & Zwaan (1992-1996)
• Revised & extended
  – Cleophas (2003)
  – Cleophas, Watson & Zwaan (2004; 2010)

[Figure: the keyword pattern matching taxonomy graph, with subtrees marked Aho-Corasick, Commentz-Walter, Boyer-Moore and Knuth-Morris-Pratt.]

Taxonomies Example: Keyword Pattern Matching (cont.)

[Figure: revised keyword pattern matching taxonomy graph, with annotated regions:
• forward (prefix-based)
• backward (suffix, factor, factor oracle-based)
• shift functions (leading to sublinear algorithms)
• choice of f(P) & dR,f (automaton recognizing f(P)^R)]

Algorithm (and Problem) Details (e.g.)

okw (Problem detail 4.77) The set of keywords contains one keyword.

indices (Algorithm detail 4.82) Represent substrings by indices into the complete strings, converting a string-based algorithm into an indexing-based algorithm.

cw (Algorithm detail 4.90) Consider any shift distance that does not lead to the missing of any matches. Such shift distances are called safe.

nla (Algorithm detail 4.103) The left and right lookahead symbols are not taken into account when computing a safe shift distance. The computation of a shift distance is done by using two precomputed shift functions applied to the current longest partial match.

lla (Algorithm detail 4.104) The left lookahead symbol is taken into account when computing a safe shift distance.

cw-opt (Algorithm detail 4.108) Compute a shift distance using a single precomputed shift function applied to the current longest partial match and the left lookahead symbol.

bmcw (Algorithm detail 4.116) Compute a shift distance using a single precomputed shift function which is applied to the current longest partial match and the left lookahead symbol. The function yields shifts that are no greater than the function in detail (cw-opt).

near-opt (Algorithm detail 4.121) Compute a shift distance using a single precomputed shift function applied to the current longest partial match and the left lookahead symbol. The function is derived from the one in detail (bmcw), and it yields shifts which are no greater.

norm (Algorithm detail 4.127) Compute a shift distance as in (nla) but additionally use a third shift function applied to the lookahead symbol. The shift distance obtained is that of the normal Commentz-Walter algorithm.

bm (Algorithm detail 4.135) Compute a shift distance using one shift function applied to the lookahead symbol, and another shift function applied to the current longest partial match. The shift distance obtained is that of the Boyer-Moore algorithm.

rla (Algorithm detail 4.137) The right lookahead symbol is taken into account when computing a safe shift distance.
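
To give a feel for what a safe shift looks like in code, the following sketch implements the textbook Boyer-Moore-Horspool algorithm for a single keyword (the okw situation). It is an illustrative assumption, not the derivation used in the taxonomy: the class name, method name and the choice of a hash map for the shift function are all hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: single-keyword matching with a precomputed safe shift function.
final class HorspoolSketch {
    /** Returns the start indices of all occurrences of keyword in text. */
    static List<Integer> findAll(String text, String keyword) {
        List<Integer> matches = new ArrayList<>();
        int m = keyword.length();
        if (m == 0 || m > text.length()) {
            return matches;
        }
        // Shift function: for each symbol in keyword[0..m-2], the distance from its
        // rightmost occurrence there to the last keyword position.
        Map<Character, Integer> shift = new HashMap<>();
        for (int i = 0; i < m - 1; i++) {
            shift.put(keyword.charAt(i), m - 1 - i);
        }
        int pos = 0; // current alignment of the keyword against the text
        while (pos <= text.length() - m) {
            if (text.startsWith(keyword, pos)) {
                matches.add(pos);
            }
            // The text symbol under the last keyword position must line up with an
            // occurrence of that symbol in keyword[0..m-2]; any smaller shift cannot
            // yield a match, so this shift is safe. Unseen symbols allow a shift of m.
            char last = text.charAt(pos + m - 1);
            pos += shift.getOrDefault(last, m);
        }
        return matches;
    }
}
```

For example, `HorspoolSketch.findAll("abracadabra", "abra")` visits only the alignments 0, 3 and 7 and reports matches at 0 and 7. Skipping alignments in this way is what makes shift-based algorithms sublinear on typical inputs.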

Taxonomies Example: Tree Acceptance

[Figure: tree acceptance taxonomy graph. Labels: t-acceptor, rf, fr, det, match-set, rec, tabulate, filter (tfilt, sfilt, ifilt, cfilt), s-path, sp-matcher, aca-spm, drfta-spm. Parts of the graph are annotated with literature sources:
• van Dinther, 1987
• Brainerd, 1967 & 1969; Turner, 1986; van Dinther, 1987; Weisgerber & Wilhelm, 1989; Hemerik & Katoen, 1989; Ferdinand, Seidl & Wilhelm, 1994; Wilhelm & Mauer, 1995
• Chase, 1987; Hemerik & Katoen, 1989; Ferdinand, Seidl & Wilhelm, 1994; Cleophas, 2008
• Aho, Ganapathi & Tjang, 1985, 1988; van de Meerakker, 1988; Weisgerber & Wilhelm, 1989; Ferdinand, Seidl & Wilhelm, 1994; Wilhelm & Mauer, 1995; Cleophas, Hemerik & Zwaan, 2005 & 2006]

Tree Acceptance Taxonomy: One Algorithm Path

()

|[ const G = . . .; t : . . .;
   var b : B
 | b := t ∈ L(G)
]|

(t-acceptor)

|[ const G = . . .; t : . . .;
   var b : B
 | let M = . . . be a ta such that L(M) = L(G);
   b := t ∈ L(M)
]|

Tree Acceptance Taxonomy: One Algorithm Path (cont.)

(t-acceptor, fr, det)

|[ const G = . . .; t : . . .;
   var b : B
 | let M = . . . be a dfrta such that L(M) = L(G);
   b := Traverse(ε) ∈ Qra

   func Traverse(↓ n : D) : Q =
   |[ var q1, . . . , qn : Q
    | let a = t(n);
      if n > 0 → Traverse := Ra(Traverse(n · 1), . . . , Traverse(n · n))
      [] n = 0 → Traverse := Ra()
      fi
   ]|
]|

Construction of automaton separate issue
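
The frontier-to-root traversal can be sketched as a self-contained Java program. Everything below is an assumption made for illustration: the toy automaton has the two states true and false, its transition function evaluates Boolean expression trees over the ranked symbols true, false, not, and, or, and the only accepting root state is true. Input trees are assumed to respect the symbol ranks.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: acceptance by a deterministic frontier-to-root tree automaton.
final class DfrtaSketch {
    /** A ranked tree node: a symbol plus an ordered list of children. */
    static final class Node {
        final String symbol;
        final List<Node> children;

        Node(String symbol, Node... children) {
            this.symbol = symbol;
            this.children = Arrays.asList(children);
        }
    }

    /** Transition function R_a(q1, ..., qn) of the toy automaton. */
    static boolean nextState(String symbol, List<Boolean> childStates) {
        switch (symbol) {
            case "true":  return true;
            case "false": return false;
            case "not":   return !childStates.get(0);
            case "and":   return childStates.get(0) && childStates.get(1);
            case "or":    return childStates.get(0) || childStates.get(1);
            default: throw new IllegalArgumentException("unknown symbol: " + symbol);
        }
    }

    /** Computes the state of node n from the states of its children (children first). */
    static boolean traverse(Node n) {
        List<Boolean> childStates = new ArrayList<>();
        for (Node child : n.children) {
            childStates.add(traverse(child));
        }
        // For a leaf the list is empty, matching the R_a() case of the abstract algorithm.
        return nextState(n.symbol, childStates);
    }

    /** Accepts iff the state reached at the root is the accepting state true. */
    static boolean accepts(Node root) {
        return traverse(root);
    }
}
```

With this toy automaton, the tree and(true, not(false)) is accepted and or(false, false) is rejected. As on the slide, constructing the automaton from a grammar is a separate concern; here the transition function is simply written by hand.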

Taxonomies: Tree Acceptance and Tree Pattern Matching

[Figure: the tree acceptance taxonomy graph (root t-acceptor) shown together with the tree pattern matching taxonomy graph (root t-matcher). Both carry the labels rf, fr, det, match-set, rec, tabulate, filter (tfilt, sfilt, ifilt, cfilt), s-path, sp-matcher, aca-spm and drfta-spm; the t-matcher graph additionally shows ra-loops.]

Taxonomies: Tree Automata Constructions

• About 50 different constructions in tree acceptance and tree pattern matching taxonomies
  – differ in e.g. direction, epsilons, determinism, advanced techniques
• Construction presentation
  – uniform style
  – defines state set, transition relation, ...
  – gives example
  – correctness arguments
  – related constructions and literature
  – identified by sequence of labels indicating details, e.g. (TPM-TA:ALL-SUB:REM-Epsilon:FR:SUBSET)

Automata Construction Taxonomy

[Figure 6.1, reproduced from p. 144 of Chapter 6, "FA Construction Algorithms": taxonomy graph with detail labels rem-ε, rem-ε-dual, pd, filt (instantiations Wfilt, Xfilt), a-s, sym, e-mark, b-mark, use-s and subset. Labelled vertices include BS (6.39), MYG (6.44), ASU (6.86), Ant. (6.55) and Brz. (6.57, p. 158).]

Figure 6.1: A taxonomy of finite automata construction algorithms. The larger graph represents the main part of the taxonomy, while the smaller graph represents the two instantiations of the filt detail that are discussed in this dissertation. The numbers appearing at some of the vertices correspond to the algorithm or construction numbers in the text of this chapter. In some cases, the algorithm is not presented explicitly, and the page number is given instead. The use of duality is clearly shown by the symmetry in the graph. The algorithms in the dashed-line subtree (on the right of the graph) are not treated in this dissertation, since they are the duals of algorithms in the left half and it is not clear that the duals would be more efficient or enlightening.

DFA Minimization

[Figure 7.1, reproduced from p. 193 of Section 7.1, "Introduction": family trees of minimization algorithms. Vertices include Brzozowski (§ 7.2), Equivalence of states (§ 7.3), Naive, Improved, ASU (7.21), Hopcroft-Ullman (7.24) and Hopcroft (7.26); edge labels include equivalence relation, approx. from above, approx. from below, pointwise, memoization, layerwise, unordered state pairs, imperative program, eq. classes, lists and optimized list update.]

Figure 7.1: The family trees of finite automata minimization algorithms. Brzozowski’s minimization algorithm is unrelated to the others, and appears as a separate (single vertex) tree. Each algorithm presented in this chapter appears as a vertex in this tree. For each algorithm that appears explicitly in this chapter, the construction number appears in parentheses (indicating where it appears in this chapter). For algorithms that do not appear explicitly, a reference to the section or page number is given. Edges denote a refinement of the solution (and therefore explicit relationships between algorithms). They are labeled with the name of the refinement.

Taxonomies: Advantages and Disadvantages

+ Algorithm comparison easier
+ Clear and correct algorithm presentation
+ Orders field, usable as teaching aid
+ Well suited for exploratory algorithmics
+ Formal specifications
+ Aids in construction of toolkit
- Takes much time and effort (abstraction (bottom-up!), sequential addition of details)
- Overkill for some domains?

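Toolkit vs Taxonomy

Taxonomy (abstract algorithm):

func Traverse(↓ n : D) : Q =
|[ var q1, . . . , qn : Q
 | let a = t(n);
   if n > 0 → Traverse := Ra(Traverse(n · 1), . . . , Traverse(n · n))
   [] n = 0 → Traverse := Ra()
   fi
]|

Toolkit (Java implementation):

// Return type changed to AbstractTAState so that results can be stored in childStates.
private static AbstractTAState Traverse(AbstractDFRTA M, Node n) {
    // Compute the states of all children first (frontier-to-root order).
    AbstractTAState[] childStates = new AbstractTAState[n.children().size()];
    for (int i = 0; i < n.children().size(); i++) {
        childStates[i] = Traverse(M, n.children().get(i));
    }
    // The two branches mirror the two guards of the abstract algorithm; both make
    // the same call, since childStates is simply empty for a leaf.
    AbstractTAState state;
    if (n.children().size() > 0) {
        state = M.nextState(childStates, (RankedSymbol) n.symbol());
    } else {
        state = M.nextState(childStates, (RankedSymbol) n.symbol());
    }
    return state;
}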

Algorithm Performance

[Figure 13.8, reproduced from p. 299 of Section 13.4, "Results": plot of throughput in MB/s (axis 0 to 30) against shortest keyword length (axis 0 to 14) for the CW-WBM, CW-NORM, AC-OPT and AC-FAIL algorithms.]

Figure 13.8: Algorithm performance (in megabytes/second) versus the length of the shortest keyword in a given set. The performance of the CW-WBM and CW-NORM algorithms is almost coincident (shown as the ascending solid line).

The performance of the CW algorithms, which declined with increasing keyword set size, was consistently better than the AC-OPT algorithm. In some cases, the CW-NORM algorithm displayed a five to ten-fold improvement over the AC-OPT algorithm.

13.4.2 Performance versus minimum keyword length

For each algorithm, the average number of megabytes processed per second was graphed against the length of the shortest keyword in a set. For the multiple-keyword tests the graphs are superimposed in Figure 13.8.

Predictably, the AC-OPT algorithm has performance that is independent of the keyword set. The AC-FAIL algorithm has slightly lower performance, improving with longer minimum keywords. The average performance of the CW algorithms improves almost linearly with increasing minimum keyword lengths. The low performance of the CW algorithms for short minimum keyword lengths is explained by the fact that the CW-WBM and CW-NORM shift functions are bounded above by the length of the minimum keyword (see Chapter 4). For sets with minimum keywords no less than four characters, the CW algorithms outperform the AC algorithms.

As predicted, the CW-NORM algorithm outperforms the CW-WBM algorithm. The performance ratio of the CW-WBM algorithm to the CW-NORM algorithm is shown in Figure 13.9. The figure indicates that the performance gap is wide with small minimum keyword lengths, and diminishes with increasing minimum keyword lengths. (This effect

Ongoing and Future Work

• Existing taxonomies and toolkits developed over 20 years
  – update and integrate
    • e.g. >50 new keyword pattern matching algorithms in 2001-2010
  – need to be selective...
  – multiple DSLs and GUIs on top
    • bioinformatics, computational linguistics, network intrusion detection
    • student view?
• Application to other algorithmic or data structure fields

Concluding Remarks

Overview of taxonomy construction and TAxonomy-BAsed Software COnstruction
  – algorithmic domains
  – bringing order and improving understanding
    • bonus: exploratory algorithmics
  – also aimed at development of large-scale toolkit
    • allows comparison in practice
      – benchmarking data
      – algorithm selection
    • with DSLs/GUIs to simplify usage
  – TABASCO is the only such method that takes correctness-by-construction into account

References

• L. Cleophas, B.W. Watson, D.G. Kourie, A. Boake & S. Obiedkov, TABASCO: Using Concept-Based Taxonomies in Domain Engineering. SACJ, 37:30–40, December 2006.

• L. Cleophas & B.W. Watson, Taxonomy-based software construction of SPARE Time: a case study. In IEE Proceedings – Software, 152(1), February 2005.

• L. Cleophas & B.W. Watson, Applying and spicing up TABASCO: taxonomy-based software and how to increase its usability. In Formal Aspects of Computing: Essays dedicated to Derrick Kourie on the occasion of his 65th Birthday, 173–183, Shaker Verlag, 2013.

• D.G. Kourie & B.W. Watson, The Correctness-by-Construction Approach to Programming, Springer, 2012.

• B.W. Watson, D.G. Kourie & L. Cleophas, Experience with Correctness-by-Construction. To appear in Science of Computer Programming, special issue on New Ideas and Emerging Results in Understanding Software, 2013.