Monterey Workshop 2002, Venice, 7-11 Oct 2002
Higher Order Applicative XML
Carlos Delgado Kloos with P.T. Breuer, V. Luque, L. Sánchez
Universidad Carlos III de Madrid
www.it.uc3m.es
What is the most successful format to represent data?
XML
XML represents structure
XML makes it possible to represent hierarchical information
The elements of the hierarchy are not predefined
XML makes it possible to invent languages of multiple brackets
{ []() }
Structure: Hierarchy
"The structure of concepts is formally called a hierarchy and since ancient times has been a basic structure for all western knowledge. Kingdoms, empires, churches, armies have all been structured into hierarchies. Tables of contents of reference material are so structured, mechanical assemblies, computer software, all scientific and technical knowledge is so structured..."
-- Robert M. Pirsig: Zen and the Art of Motorcycle Maintenance
Areas of application
Accounting
Marketing
Business
Education
Communication
Banking
Automotive
Insurance
Human resources
Health
ERP
Chemistry
Mathematics
News
Law
Workflow
Software
Tourism
Success of XML
XML has been far more successful than its authors could have expected
Not just for the reason they expected
Separation of form and content
But for a reason they had not thought of
Data has to travel through the net
The tree structure is a format useful for any kind of data
XML is used as a data transfer mechanism
What is XML?
"XML is ASCII for the 21st century."
-- Henry S. Thompson, U Edinburgh & W3C
HTML and JavaScript
<HTML> <HEAD><TITLE>JavaScript</TITLE></HEAD>
<BODY> Text<P>
i=1<BR>
i=2<BR>
i=3<BR>
</BODY></HTML>
HTML and JavaScript
<HTML> <HEAD><TITLE>JavaScript</TITLE></HEAD>
<BODY> Text<P>
<SCRIPT LANGUAGE="JavaScript">
<!--
for (i=1; i<=3; ++i)
  document.write("i=" + i + "<BR>");
// -->
</SCRIPT>
</BODY></HTML>
Objective
Extend XML to "higher order texts"
do not add to it
do not change it
do not write a separate language
do reinterpret XML semantics in a larger universe
do conserve the initial semantics
do anything that is natural in a categorical sense
The problem with XML
XML can express data
basic types and free data types
XML cannot express functions
a separate language is used to traverse XML data
Advantage or disadvantage?
An idea: Syntax
Let
<f> a </f>
mean "apply function f to argument a"
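A minimal sketch of this reading of tags, in Python rather than HOAX itself. The registry FUNCTIONS and the names upper, rev and apply_tags are hypothetical, chosen only for illustration; the slides do not define them.

```python
import xml.etree.ElementTree as ET

# Hypothetical registry: tag name -> function on text (not part of HOAX).
FUNCTIONS = {
    "upper": str.upper,
    "rev": lambda s: s[::-1],
}

def apply_tags(elem):
    """Read <f> a </f> as 'apply function f to argument a'."""
    # The argument is the element's own text plus its evaluated children, in order.
    arg = elem.text or ""
    for child in elem:
        arg += apply_tags(child) + (child.tail or "")
    return FUNCTIONS[elem.tag](arg.strip())

result = apply_tags(ET.fromstring("<upper><rev>olleh</rev></upper>"))
print(result)  # HELLO
```

Inner tags are evaluated first, so nesting tags corresponds to composing functions.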
An idea: Semantics
Let XML documents take arguments
An abstract document f is
function :: [Doc] -> [Doc]
A simple document s is a string or other basic type
string, integer :: Doc
Virtual document is calculated, sometimes trivially so
apply function f to argument a
XML documents use f which are free datatype constructors
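The semantic idea can be sketched as follows, assuming documents are modelled as Python lists. The names Doc, Fn, free, double and lookup are illustrative assumptions, not part of HOAX.

```python
from typing import Callable, List, Tuple, Union

# A simple document is a basic value; a constructed node is (tag, children).
Doc = Union[str, int, Tuple[str, list]]
# An abstract document is a function from documents to documents.
Fn = Callable[[List[Doc]], List[Doc]]

def free(tag: str) -> Fn:
    """Free datatype constructor: wraps its arguments and computes nothing."""
    return lambda args: [(tag, list(args))]

# A defined abstract document: 'double' repeats its argument list.
double: Fn = lambda args: list(args) + list(args)

defined = {"double": double}

def lookup(tag: str) -> Fn:
    """Tags without a definition default to the free constructor."""
    return defined.get(tag, free(tag))

print(lookup("double")(["bye"]))  # ['bye', 'bye']
print(lookup("a")(["hello"]))     # [('a', ['hello'])]
```

Ordinary XML is the special case in which every tag falls through to the free constructor, so the document is "calculated" trivially.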
Example
<element name="double" type="string" maxOccurs="unbounded">
<resultType maxOccurs="unbounded">string
</resultType>
<def var="x">x x</def>
</element>
Example
<element name="Point">
<complexType>
<element name="x" type="integer"/>
<element name="y" type="integer"/>
</complexType>
<resultType>float</resultType>
<def>sqrt(sqr(x)+sqr(y))</def>
</element>
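As a worked instance of the Point definition, the sketch below applies sqrt(sqr(x)+sqr(y)) to one document. The instance syntax <Point><x>3</x><y>4</y></Point> and the helper name eval_point are assumptions made for illustration.

```python
import math
import xml.etree.ElementTree as ET

def eval_point(xml_text: str) -> float:
    """Apply the definition sqrt(sqr(x)+sqr(y)) to a <Point> instance."""
    point = ET.fromstring(xml_text)
    x = int(point.findtext("x"))   # declared type: integer
    y = int(point.findtext("y"))   # declared type: integer
    return math.sqrt(x * x + y * y)  # declared result type: float

print(eval_point("<Point><x>3</x><y>4</y></Point>"))  # 5.0
```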
Language Syntax: Basics
Juxtaposition is concatenation of lists
<a>hello</a> is both in type a and in type a*
<a>hello</a> <a>there</a> is of type a*
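A small sketch of these two facts, assuming a document is modelled as a Python list of (tag, text) pairs. The helpers elem, is_a and is_a_star are hypothetical names.

```python
def elem(tag, text):
    """A single element, represented as a one-item list."""
    return [(tag, text)]

def is_a(doc, tag="a"):
    """Type a: exactly one element with the given tag."""
    return len(doc) == 1 and doc[0][0] == tag

def is_a_star(doc, tag="a"):
    """Type a*: zero or more elements with the given tag."""
    return all(t == tag for t, _ in doc)

hello = elem("a", "hello")
both = hello + elem("a", "there")   # juxtaposition is list concatenation

print(is_a(hello), is_a_star(hello))  # True True
print(is_a(both), is_a_star(both))    # False True
```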
Language Syntax: Basics
Function application is via tags, but functions are by default the free datatype constructor
<a>hello</a> is of type a
Function definition needs a special metatag
<def name=a var=x><def.val> x x </def.val> ... </def>
Document Layout
Header section (DTD)
types of functions and other tags
Definition section
semantics of functions
Text section (document)
XML or HOAX document interpreted according to definitions given
Example
<!DOCTYPE example [
 <!ELEMENT example (ANY*)>
 <!ELEMENT ANY* dbl (ANY*)>
]>
<example>
 <def name="dbl" var="x">
  <def.val> x x </def.val>
  <dbl><dbl>bye</dbl></dbl>
 </def>
</example>
reduces to
<example> byebyebyebye</example>
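The reduction in this example can be reproduced with a toy evaluator. This is a sketch under stated assumptions, not the authors' prototype: variables are recognised as whitespace-separated words in text, the DOCTYPE header is ignored, and the function names evaluate, content and words are hypothetical.

```python
import xml.etree.ElementTree as ET

def words(text, variables):
    """Split text into tokens, replacing variable names by their bound values."""
    out = []
    for w in (text or "").split():
        out += variables.get(w, [w])
    return out

def content(elem, funs, variables):
    """Evaluate the text and children of elem, in document order."""
    out = words(elem.text, variables)
    for child in elem:
        if child.tag != "def.val":   # a body is only evaluated on application
            out += evaluate(child, funs, variables)
        out += words(child.tail, variables)
    return out

def evaluate(elem, funs, variables):
    if elem.tag == "def":
        # Bind the function name for the scope of this <def> only.
        body = next(c for c in elem if c.tag == "def.val")
        funs = {**funs, elem.get("name"): (elem.get("var"), body)}
        return content(elem, funs, variables)
    if elem.tag in funs:
        var, body = funs[elem.tag]
        arg = content(elem, funs, variables)
        return content(body, funs, {**variables, var: arg})
    # Free constructor: keep the tag around the evaluated content.
    inner = "".join(content(elem, funs, variables))
    return [f"<{elem.tag}>{inner}</{elem.tag}>"]

doc = ET.fromstring(
    '<example><def name="dbl" var="x"><def.val> x x </def.val>'
    '<dbl><dbl>bye</dbl></dbl></def></example>'
)
reduced = evaluate(doc, {}, {})[0]
print(reduced)  # <example>byebyebyebye</example>
```

The inner <dbl> yields two copies of bye, and the outer <dbl> doubles that again, giving four.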
Definition of XML application
DTD or XML Schema can be used
DTD is used in the paper for brevity
DTD normally specifies syntax
In HOAX
syntax = type
type = parser
there are more types than in XML
HOAX function types are not free type declarations
a "result type" precedes the function name
DTD declarations are local in HOAX
Meta Tags
HOAX introduces 4 (really 2) special tags
<def> for binding a name to a value
<var> for expressing abstraction
<eval> for application of a function to its arguments
<ref> for dereferencing a variable
Example: notes
Scope of a definition is strictly defined
<def ...> ... </def>
A definition has three parts
<def name=... var=... val=...> ...
Attributes can be made into tags
<def name= ... var=...><def.val> ... </def.val>
Type definitions use elements and C types
<!DOCTYPE [<!ELEMENT res fun (arg)>]>
Meta Tags
only <def> is common
<var> normally appears within a <def>
<eval> and <ref> are unusual.
<def name="a" var="b" val="c">…</def> means
<def name="a"> <def.val><var name="b">c</var></def.val> … </def>
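The expansion of the attribute form into the nested-tag form can be sketched as a tree rewrite. The function name desugar is hypothetical, and the sketch assumes both the var and val attributes are present.

```python
import xml.etree.ElementTree as ET

def desugar(def_elem):
    """Rewrite <def name=a var=b val=c>...</def> into the nested-tag form."""
    attrs = dict(def_elem.attrib)
    var, val = attrs.pop("var"), attrs.pop("val")
    new = ET.Element("def", attrs)               # keeps only name="a"
    body = ET.SubElement(new, "def.val")
    abstraction = ET.SubElement(body, "var", {"name": var})
    abstraction.text = val
    body.tail = def_elem.text                    # original content follows the body
    new.extend(list(def_elem))
    return new

src = ET.fromstring('<def name="a" var="b" val="c">rest</def>')
expanded = ET.tostring(desugar(src), encoding="unicode")
print(expanded)
# <def name="a"><def.val><var name="b">c</var></def.val>rest</def>
```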
BNF of HOAX
HOAX ::= <def name=t><def.val>xs</def.val> xs' </def>
       | <var name=t> xs </var>
       | <eval name=t> xs </eval>
       | <t> xs </t>
       | <ref name=t>
       | t
where xs is a sequence of HOAX documents
Function names may be applied as tags.
Simple variable names don't need ref tags.
Types
The type system has to be adjusted to XML
certain things are indistinguishable, and hence have the same type
a string is indistinguishable from a singleton list of strings
a sequence of strings is indistinguishable from a string
a list of a's followed by a list of a's is indistinguishable from a list of a's, in general
Types
Solved by using the parsers of XML as types
have inclusions
a | a* means "the parser of a*'s will parse a"
type equality is not identity, but it does not matter
HOAX types are parsers of HOAX texts
ambiguous parsers
alternative parsers p | q produce all possible outcomes
a sequence of parsers p q partitions the input in all possible ways, produces all possible results for each part, then resequences the results
p* = p p* | ε
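These ambiguous parsers can be sketched as list-of-successes combinators: a parser maps an input list to every (result, remaining input) pair. The names lit, alt, seq, eps, star and accepts are illustrative assumptions, and the recursion in star assumes p always consumes at least one token.

```python
def lit(tok):
    """Parser that accepts exactly one token equal to tok."""
    return lambda inp: [([inp[0]], inp[1:])] if inp and inp[0] == tok else []

def alt(p, q):
    """p | q: all outcomes of p together with all outcomes of q."""
    return lambda inp: p(inp) + q(inp)

def seq(p, q):
    """p q: every way p can parse a prefix and q can parse what follows."""
    return lambda inp: [(r1 + r2, rest2)
                        for r1, rest1 in p(inp)
                        for r2, rest2 in q(rest1)]

def eps(inp):
    """ε: succeeds without consuming input."""
    return [([], inp)]

def star(p):
    """p* = p p* | ε, written as a recursive definition."""
    def p_star(inp):
        return alt(seq(p, p_star), eps)(inp)
    return p_star

def accepts(p, inp):
    """Type satisfaction: at least one parse consumes the whole input."""
    return any(rest == [] for _, rest in p(inp))

a = lit("a")
outcomes = star(a)(["a", "a"])
print(len(outcomes))                 # 3
print(accepts(star(a), ["a"]))       # True
print(accepts(a, ["a", "a"]))        # False
```

The three outcomes for the input a a correspond to consuming two, one or zero tokens, which is what "produce all possible outcomes" amounts to here.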
Algebra of types
The model gives rise to inclusions and equalities
p ⊆ p*, p p* ⊆ p*, ...
Type equality is difficult to decide, but type satisfaction is easy
each type is precisely a mechanism for checking type satisfaction
each type is a mechanism for evaluating a text
each document text has a "canonical type" calculated from its presentation.
Canonical types of document texts
"s" :: #PCDATA
<t> xs </t> :: t
if the construction meets the declared type constraints for the tag t.
<f> xs </f> :: r
where r is the result type declared for f, provided xs :: a and a is the declared argument type for f.
<def name="a" val="b"> c </def> :: t
where t is the type calculated for c given the hypothesis that the type of a, wherever it appears in c, is the same as the type calculated for b.
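The four rules can be sketched as a recursive function over a small tree representation. Several things here are assumptions for illustration: documents are Python strings, (tag, children) pairs or ("def", name, val, body) tuples; the signature table SIGS and its len entry are invented; a declared argument type is taken to constrain every item of xs; and the constraint check for free tags is omitted.

```python
# Hypothetical signatures: function tag -> (argument item type, result type).
SIGS = {"len": ("#PCDATA", "integer")}

def canonical_type(doc, sigs=SIGS, env=None):
    """Calculate the canonical type of a document from its presentation."""
    env = env or {}
    if isinstance(doc, str):
        # A bound name has the type of its value; other text is #PCDATA.
        return env.get(doc, "#PCDATA")
    if doc[0] == "def":
        _, name, val, body = doc
        hypothesis = {**env, name: canonical_type(val, sigs, env)}
        return canonical_type(body, sigs, hypothesis)
    tag, children = doc
    child_types = [canonical_type(c, sigs, env) for c in children]
    if tag in sigs:
        arg, result = sigs[tag]
        if any(t != arg for t in child_types):
            raise TypeError(f"<{tag}> expects {arg}, got {child_types}")
        return result
    return tag   # free constructor: <t> xs </t> :: t

print(canonical_type("s"))                              # #PCDATA
print(canonical_type(("p", ["s"])))                     # p
print(canonical_type(("len", ["abc"])))                 # integer
print(canonical_type(("def", "a", ("p", ["s"]), "a")))  # p
```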
HOAX semantics
Two HOAX documents can look distinct but mean the same thing
The semantics is defined by "how to parse" instructions
The instructions are predicated on a type
"how to parse x using type t"
If t is the result type of a function tag f that takes argument type a, then <f> x </f> is parsed by t as follows: first parse x with a, then apply the definition of f to the results of that parse, then parse the outcome with t.
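The three steps can be sketched with types modelled as Python functions that parse a list of tokens. The function tag sum, its types and the table FUNS are hypothetical examples, not taken from the slides.

```python
def integers(tokens):
    """Type integer*: parse every token as an integer."""
    return [int(t) for t in tokens]

def integer(tokens):
    """Type integer: exactly one integer token."""
    if len(tokens) != 1:
        raise ValueError("expected exactly one integer")
    return [int(tokens[0])]

# Hypothetical function tag: (argument type a, definition of f, result type t).
FUNS = {"sum": (integers, lambda xs: [str(sum(xs))], integer)}

def parse_application(tag, x_tokens):
    """Parse <tag> x </tag> by the result type t declared for tag."""
    arg_type, definition, result_type = FUNS[tag]
    parsed_arg = arg_type(x_tokens)      # 1. parse x with a
    produced = definition(parsed_arg)    # 2. apply the definition of f
    return result_type(produced)         # 3. parse the outcome with t

print(parse_application("sum", ["1", "2", "3"]))  # [6]
```

An argument that does not satisfy a, such as a non-numeric token here, makes step 1 fail before the function is ever applied.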
Can prove ...
That each document has a unique parse semantics
means that the semantics is well-defined
That the semantics of functional application is substitution
means that the parse of a document with references is the parse of the document with the references replaced by the text to which they refer.
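The substitution property can be checked on a toy model. This is only an illustration of the statement, not the proof; the token-list representation and the names substitute and parse are assumptions.

```python
def substitute(tokens, env):
    """Replace each reference by the text to which it refers."""
    out = []
    for tok in tokens:
        out += env.get(tok, [tok])
    return out

def parse(tokens, env=None):
    """Toy parse: resolve references while reading, then join the words."""
    env = env or {}
    parts = []
    for tok in tokens:
        if tok in env:
            parts.append(parse(env[tok]))  # parse the referenced text in place
        else:
            parts.append(tok)
    return " ".join(parts)

doc = ["say", "x", "x"]
env = {"x": ["bye", "now"]}

with_refs = parse(doc, env)               # parse the document with references
replaced = parse(substitute(doc, env))    # parse the document after replacement
print(with_refs)              # say bye now bye now
print(with_refs == replaced)  # True
```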
Advantages
HOAX puts the semantics back into XML
not only functions were missing, but reductions of any form to any other form
Do not need a second "transformation language"
HOAX documents are self-transforming in situ
Clearly arbitrarily higher order functionality
"abstract documents are first-class documents"
Disadvantages
Might prefer to separate form and function
There is no mechanism for saying where to perform a reduction - here or there
What happens in the case of unavailable data?
Ditto function?
Open issues
Reflexivity
Can HOAX be expressed entirely within HOAX?
Probably, but meaningfully? Usefully?
Circular reference trails
HOAX is careful to apply definitions only in a well-defined scope.
External references drag in their definitions, but with scope localized to the reference.
Do mutually dependent external definitions resolve?
Conclusion
Experiment in extending XML to support functional semantics
Prototyped
Proposed approach to semantics of XML
use types=parsers, since XML is pure syntax
use interpreter=parser, since XML is pure syntax
ergo type=interpreter=parser
Conclusion
By giving an interpretation to tags, one can easily reduce documents to others (without XSLT!)
Exploratory work
Enjoy Venice!