Higher Order Applicative XML (Monterey 2002)

Post on 10-Jun-2015

description

Slides for the paper "Higher Order Applicative XML", given at the Workshop on Radical Innovations of Software and Systems Engineering in the Future, Venice, Italy, October 2002. Published in Springer LNCS 2941, pages 91-107. The Springer URL is http://link.springer.com/chapter/10.1007%2F978-3-540-24626-8_6, with DOI 10.1007/978-3-540-24626-8_6 . A preprint is available at http://www.academia.edu/1413571/Higher_order_applicative_XML_documents .

transcript

Venice, 7-11 Oct 2002 Monterey Workshop 2002 1

Higher Order Applicative XML

Carlos Delgado Kloos

with P.T. Breuer, V. Luque, L. Sánchez

Universidad Carlos III de Madrid

www.it.uc3m.es

What is the most successful format to represent data?

XML

XML represents structure

XML allows hierarchical information to be represented

The elements of the hierarchy are not predefined

XML allows languages of multiple brackets to be invented

{ []() }

Structure: Hierarchy

"The structure of concepts is formally called a hierarchy and since ancient times has been a basic structure for all western knowledge. Kingdoms, empires, churches, armies have all been structured into hierarchies. Tables of contents of reference material are so structured, mechanical assemblies, computer software, all scientific and technical knowledge is so structured..."

-- Robert M. Pirsig: Zen and the Art of Motorcycle Maintenance

Areas of application

Accounting

Marketing

Business

Education

Communication

Banking

Automotive

Insurance

Human resources

Health

ERP

Chemistry

Mathematics

News

Law

Workflow

Software

Tourism

Success of XML

XML has been far more successful than its authors could have expected

Not just for the reason they expected: separation of form and content

But also for a reason they had not thought of: data has to travel through the net

The tree structure is a format useful for any kind of data

XML is used as a data transfer mechanism

What is XML?

"XML is ASCII for the 21st century."

-- Henry S. Thompson, U Edinburgh & W3C

HTML and JavaScript

<HTML>
  <HEAD><TITLE>JavaScript</TITLE></HEAD>
  <BODY>
    Text<P>
    i=1<BR>
    i=2<BR>
    i=3<BR>
  </BODY>
</HTML>

HTML and JavaScript

<HTML>
  <HEAD><TITLE>JavaScript</TITLE></HEAD>
  <BODY>
    Text<P>
    <SCRIPT LANGUAGE="JavaScript">
    <!--
      for (i=0; i<3; ++i)
        document.write("i=" + i + "<BR>");
    // -->
    </SCRIPT>
  </BODY>
</HTML>

Objective

Extend XML to "higher order texts"

do not add to it

do not change it

do not write a separate language

do reinterpret XML semantics in a larger universe

do conserve the initial semantics

do anything that is natural in a categorical sense

The problem with XML

XML can express data: basic types and free data types

XML cannot express functions

a separate language is used to traverse XML data

Advantage or disadvantage?

An idea: Syntax

Let

<f> a </f>

mean "apply function f to argument a"

An idea: Semantics

Let XML documents take arguments

An abstract document f is

function :: [Doc] -> [Doc]

A simple document s is a string or other basic type

string, integer :: Doc

A virtual document is calculated, sometimes trivially so

apply function f to argument a

XML documents use functions f which are free datatype constructors
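As a rough illustration of this reading, the following Python sketch treats every tag as a function from lists of documents to lists of documents, with the free constructor as the default. It is not the authors' prototype; the names `Element`, `free_constructor`, `apply_tag` and the `defs` table are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Union


@dataclass(frozen=True)
class Element:
    """A tagged node: what a free datatype constructor builds."""
    tag: str
    children: tuple


# A simple document is a string; a structured one is an Element.
Doc = Union[str, Element]
# An abstract document is a function :: [Doc] -> [Doc].
Fn = Callable[[List[Doc]], List[Doc]]


def free_constructor(tag: str) -> Fn:
    """Default meaning of a tag: wrap the arguments and compute nothing."""
    return lambda args: [Element(tag, tuple(args))]


def apply_tag(tag: str, args: List[Doc], defs: Dict[str, Fn]) -> List[Doc]:
    """Interpret <tag> args </tag>: use a definition if one is in scope,
    otherwise fall back to the free constructor (assumed lookup scheme)."""
    fn = defs.get(tag, free_constructor(tag))
    return fn(args)


# "double" is given a meaning; "a" is not, so it behaves like ordinary XML.
defs = {"double": lambda args: args + args}
print(apply_tag("double", ["bye"], defs))
print(apply_tag("a", ["hello"], defs))
```

In this sketch an ordinary XML document is simply one in which no tag has a definition, so its "calculation" is trivial.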

Example

<element name="double" type="string" maxOccurs="unbounded">
  <resultType maxOccurs="unbounded">string</resultType>
  <def var="x">x x</def>
</element>

Example

<element name="Point">
  <complexType>
    <element name="x" type="integer"/>
    <element name="y" type="integer"/>
  </complexType>
  <resultType>float</resultType>
  <def>sqrt(sqr(x)+sqr(y))</def>
</element>

Language Syntax: Basics

Juxtaposition is concatenation of lists

<a>hello</a> is both in type a and in type a*

<a>hello</a> <a>there</a> is of type a*

Language Syntax: Basics

Function application is via tags, but functions are by default the free datatype constructor

<a>hello</a> is of type a

Function definition needs a special metatag

<def name=a var=x><def.val> x x </def.val> ... </def>

Document Layout

Header section (DTD): types of functions and other tags

Definition section: semantics of functions

Text section (document): XML or HOAX document interpreted according to definitions given

Example

<!DOCTYPE example [
  <!ELEMENT example (ANY*)>
  <!ELEMENT ANY* dbl (ANY*)>
]>
<example>
  <def name="dbl" var="x">
    <def.val> x x </def.val>
    <dbl><dbl>bye</dbl></dbl>
  </def>
</example>

<example> byebyebyebye</example>
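Reading the second example element as the result of evaluating the first (an interpretation, since the slide does not say so explicitly), the reduction can be sketched in Python. This is a minimal sketch under simplifying assumptions: nodes are plain tuples, a definition is a (variable, body) pair, and a variable is recognised only as a top-level token of the body. The names `evaluate` and `defs` are hypothetical.

```python
from typing import Dict, List, Tuple, Union

# A node is either plain text or a (tag, children) pair.
Node = Union[str, Tuple[str, list]]
# A definition is (variable name, body), as in
# <def name=... var=...><def.val> body </def.val> ... </def>.
Definition = Tuple[str, List[Node]]


def evaluate(nodes: List[Node], defs: Dict[str, Definition]) -> List[Node]:
    """Reduce a sequence of nodes, innermost applications first."""
    out: List[Node] = []
    for node in nodes:
        if isinstance(node, str):
            out.append(node)
            continue
        tag, children = node
        args = evaluate(children, defs)
        if tag in defs:
            var, body = defs[tag]
            # Application is substitution: each top-level occurrence of the
            # variable in the body is replaced by the evaluated arguments.
            for item in body:
                out.extend(args if item == var else [item])
        else:
            # Tags without a definition remain free constructors.
            out.append((tag, args))
    return out


defs = {"dbl": ("x", ["x", "x"])}
doc = [("example", [("dbl", [("dbl", ["bye"])])])]
result = evaluate(doc, defs)
print(result)  # [('example', ['bye', 'bye', 'bye', 'bye'])]
```

The inner `dbl` yields two copies of `bye`, and the outer `dbl` doubles that list again, which matches the four repetitions shown in the result document.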

Definition of XML application

DTD or XML Schema can be used

DTD is used in the paper for brevity

DTD normally specifies syntax

in HOAX:

syntax = type

type = parser

there are more types than in XML

HOAX function types are not free type declarations

a "result type" precedes the function name

DTD declarations are local in HOAX

Meta Tags

HOAX introduces 4 (really 2) special tags

<def> for binding a name to a value

<var> for expressing abstraction

<eval> for application of a function to its arguments

<ref> for dereferencing a variable

Example: notes

Scope of a definition is strictly defined

<def ...> ... </def>

A definition has three parts

<def name=... var=... val=...> ...

Attributes can be made into tags

<def name= ... var=...><def.val> ... </def.val>

Type definitions use elements and C types

<!DOCTYPE [<!ELEMENT res fun (arg)>]>

Meta Tags

only <def> is common

<var> normally appears within a <def>

<eval> and <ref> are unusual.

<def name="a" var="b" val="c">…</def>

means

<def name="a"> <def.val><var name="b">c</var> </def.val>…</def>
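The equivalence between the attribute form and the tag form can be sketched as a small rewriting step. The Python below uses the standard `xml.etree.ElementTree` module; the function name `desugar` is hypothetical, and the sketch assumes the `val` attribute holds plain text, which is a simplification of what a HOAX value may contain.

```python
import xml.etree.ElementTree as ET


def desugar(elem: ET.Element) -> ET.Element:
    """Rewrite <def name=N var=V val=C>body</def> as
    <def name=N><def.val><var name=V>C</var></def.val>body</def>."""
    out = ET.Element("def", {"name": elem.get("name")})
    val = ET.SubElement(out, "def.val")
    var = ET.SubElement(val, "var", {"name": elem.get("var")})
    var.text = elem.get("val")
    # Leading text of the original body now follows the <def.val> block.
    val.tail = elem.text
    # Child elements of the original body are carried over unchanged.
    out.extend(list(elem))
    return out


short = ET.fromstring('<def name="a" var="b" val="c">body</def>')
long_form = desugar(short)
print(ET.tostring(long_form, encoding="unicode"))
```

After this step only `name` remains as an attribute of `<def>`, and the abstraction is carried by the nested `<var>` element.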

BNF of HOAX

HOAX ::= <def name=t><def.val>xs</def.val> xs' </def>
       | <var name=t> xs </var>
       | <eval name=t> xs </eval>
       | <t> xs </t>
       | <ref name=t>
       | t

where xs is a sequence of HOAX documents

Function names may be applied as tags. Simple variable names don't need ref tags.
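One way to make the grammar concrete is to give each alternative its own data type. The Python sketch below is an illustration only: the class names `Def`, `Var`, `Eval`, `Tag`, `Ref` and the `render` function are hypothetical, and attribute values are printed unquoted to mirror the notation of the grammar.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Def:   # <def name=t><def.val> xs </def.val> xs' </def>
    name: str
    val: List["Hoax"]
    body: List["Hoax"]


@dataclass
class Var:   # <var name=t> xs </var>
    name: str
    body: List["Hoax"]


@dataclass
class Eval:  # <eval name=t> xs </eval>
    name: str
    args: List["Hoax"]


@dataclass
class Tag:   # <t> xs </t>
    name: str
    children: List["Hoax"]


@dataclass
class Ref:   # <ref name=t>
    name: str


# A bare string stands for the plain token t.
Hoax = Union[Def, Var, Eval, Tag, Ref, str]


def render_all(xs: List[Hoax]) -> str:
    return " ".join(render(x) for x in xs)


def render(doc: Hoax) -> str:
    """Print a tree back in the notation used by the grammar."""
    if isinstance(doc, str):
        return doc
    if isinstance(doc, Ref):
        return f"<ref name={doc.name}>"
    if isinstance(doc, Tag):
        return f"<{doc.name}>{render_all(doc.children)}</{doc.name}>"
    if isinstance(doc, Var):
        return f"<var name={doc.name}>{render_all(doc.body)}</var>"
    if isinstance(doc, Eval):
        return f"<eval name={doc.name}>{render_all(doc.args)}</eval>"
    if isinstance(doc, Def):
        return (f"<def name={doc.name}><def.val>{render_all(doc.val)}</def.val>"
                f"{render_all(doc.body)}</def>")
    raise TypeError(f"not a HOAX document: {doc!r}")


doc = Def("dbl", [Var("x", ["x", "x"])], [Tag("dbl", ["bye"])])
print(render(doc))
```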

Types

The type system has to be adjusted to XML

certain things are indistinguishable, and hence have the same type

a string is indistinguishable from a singleton list of strings

a sequence of strings is indistinguishable from a string

a list of a's followed by a list of a's is indistinguishable from a list of a's, in general
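These identifications can be pictured as a normal form in which indistinguishable values become equal. The Python sketch below is one possible reading, not the paper's definition: the function `normalize` is hypothetical, elements are modelled as tuples, and adjacent strings are fused by plain concatenation.

```python
def normalize(doc) -> list:
    """Map a value to a flat list in which adjacent strings are fused."""
    # A single value is treated as a singleton list.
    items = doc if isinstance(doc, list) else [doc]
    flat = []
    for item in items:
        if isinstance(item, list):
            # A list of lists is indistinguishable from the flattened list.
            flat.extend(normalize(item))
        else:
            flat.append(item)
    merged = []
    for item in flat:
        if isinstance(item, str) and merged and isinstance(merged[-1], str):
            # A sequence of strings is indistinguishable from one string.
            merged[-1] += item
        else:
            merged.append(item)
    return merged


print(normalize("hi") == normalize(["hi"]))
print(normalize(["by", "e"]))
print(normalize([[("a", "hello")], [("a", "there")]]))
```

Two values that normalize to the same list would, under this reading, have to receive the same type.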

Types

Solved by using the parsers of XML as types

have inclusions

a | a* means "the parser of a*'s will parse a"

type equality is not identity, but it does not matter

HOAX types are parsers of HOAX texts

ambiguous parsers

alternative parsers p | q produce all possible outcomes

sequence of parsers p q partition the input in all possible ways and produce all possible results for each part, then resequence results

p* = p p* | ε
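The three constructions can be sketched as combinators over parsers that consume a whole input sequence and return every possible outcome. This is a minimal Python sketch under assumptions of my own: a parser is a function from a tuple of items to a list of result tuples (empty when the input is rejected), and the names `item`, `alt`, `seq` and `star` are hypothetical. To keep `star` terminating, each repetition is required to consume at least one item.

```python
from typing import Callable, List, Tuple

Parser = Callable[[Tuple], List[Tuple]]


def item(accept: Callable[[object], bool]) -> Parser:
    """Parser for exactly one item satisfying the predicate."""
    return lambda xs: [xs] if len(xs) == 1 and accept(xs[0]) else []


def alt(p: Parser, q: Parser) -> Parser:
    """p | q: collect the outcomes of both alternatives."""
    return lambda xs: p(xs) + q(xs)


def seq(p: Parser, q: Parser) -> Parser:
    """p q: try every split of the input, then resequence the results."""
    def parse(xs):
        results = []
        for i in range(len(xs) + 1):
            for a in p(xs[:i]):
                for b in q(xs[i:]):
                    results.append(a + b)
        return results
    return parse


def star(p: Parser) -> Parser:
    """p* = p p* | ε, with each repetition consuming at least one item."""
    def parse(xs):
        if not xs:
            return [()]  # ε
        results = []
        for i in range(1, len(xs) + 1):
            for a in p(xs[:i]):
                for b in parse(xs[i:]):
                    results.append(a + b)
        return results
    return parse


a = item(lambda x: x == "a")
print(star(a)(("a", "a")))      # [('a', 'a')]
print(alt(a, star(a))(("a",)))  # [('a',), ('a',)]
print(star(a)(("a", "b")))      # []
```

The second call returns the same outcome twice, once per alternative, which is the ambiguity the slide refers to; it also shows that anything accepted by `a` is accepted by `star(a)`.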

Algebra of types

The model gives rise to inclusions and equalities

p ⊆ p*, p p* ⊆ p*, ...

Type equality is difficult to decide, but type satisfaction is easy

each type is precisely a mechanism for checking type satisfaction

each type is a mechanism for evaluating a text

each document text has a "canonical type" calculated from its presentation.

Canonical types of document texts

"s" :: #PCDATA

<t> xs </t> :: t

if the construction meets the declared type constraints for the tag t.

<f> xs </f> :: r

where r is the result type declared for f, provided xs :: a and a is the declared argument type for f.

<def name="a" val="b"> c </def> :: t

where t is the type calculated for c given the hypothesis that the type of a, wherever it appears in c, is the same as the type calculated for b.
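The first three rules can be approximated by a short recursive function. This Python sketch is deliberately crude and rests on several assumptions: types are plain names, "xs :: a" is checked by comparing the canonical type of each child with the declared argument type, the constraints on free constructors are not checked, and the `<def>` rule is omitted. The names `canonical_type` and `sigs`, and the signature given to `dbl`, are hypothetical.

```python
from typing import Dict, Tuple, Union

Node = Union[str, Tuple[str, list]]  # text, or a (tag, children) pair
Signature = Tuple[str, str]          # (argument type, result type)


def canonical_type(doc: Node, sigs: Dict[str, Signature]) -> str:
    """Calculate a type name for a document from its presentation."""
    if isinstance(doc, str):
        return "#PCDATA"             # "s" :: #PCDATA
    tag, children = doc
    if tag not in sigs:
        return tag                   # <t> xs </t> :: t (constraints unchecked)
    arg_type, result_type = sigs[tag]
    for child in children:
        found = canonical_type(child, sigs)
        if found != arg_type:
            raise TypeError(f"<{tag}> expects {arg_type}, found {found}")
    return result_type               # <f> xs </f> :: r


# Assumed signature for illustration: dbl takes text and returns text.
sigs = {"dbl": ("#PCDATA", "#PCDATA")}
print(canonical_type("bye", sigs))                        # #PCDATA
print(canonical_type(("dbl", [("dbl", ["bye"])]), sigs))  # #PCDATA
print(canonical_type(("a", ["hello"]), sigs))             # a
```

A fuller treatment would use parsers as types, as the surrounding slides describe, so that inclusions such as p ⊆ p* are respected rather than requiring equal names.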

HOAX semantics

Two HOAX documents can look distinct but mean the same thing

The semantics is defined by "how to parse" instructions

The instructions are predicated on a type

"how to parse x using type t"

if t is the result type for a function tag f which takes argument type a, then <f> x </f> is parsed by t by first parsing x with a, then applying the definition of function f to the results of the parse, then parsing with t.

Can prove ...

That each document has a unique parse semantics

means that the semantics is well-defined

That the semantics of functional application is substitution

means that the parse of a document with references is the parse of the document with the references replaced by the text to which they refer.

Advantages

HOAX puts the semantics back into XML

not only functions were missing, but reductions of any form to any other form

Do not need a second "transformation language"

HOAX documents are self-transforming in situ

Clearly arbitrarily higher order functionality

"abstract documents are first-class documents"

Disadvantages

Might prefer to separate form and function

There is no mechanism for saying where to perform a reduction - here or there

What happens in the case of unavailable data?

Ditto for an unavailable function?

Open issues

Reflexivity

Can HOAX be expressed entirely within HOAX?

Probably, but meaningfully? Usefully?

Circular reference trails

HOAX is careful to apply definitions only in a well-defined scope.

External references drag in their definitions, but with scope localized to the reference.

Do mutually dependent external definitions resolve?

Conclusion

Experiment in extending XML to support functional semantics

Prototyped

Proposed approach to semantics of XML

use types=parsers, since XML is pure syntax

use interpreter=parser, since XML is pure syntax

ergo type=interpreter=parser

Conclusion

By giving an interpretation to tags, one can easily reduce documents to others (without XSLT!)

Exploratory work

Enjoy Venice!
