Date post: | 06-Dec-2014 |
Category: |
Education |
Upload: | tonny-madsen |
L0080 - 2010-11-02
Redistribution and other use of this material requires written permission from The RCP Company.
ITU - MDD – Textual Languages and Grammars
This presentation describes the use and design of textual domain-specific languages (DSLs). It has two basic purposes:
- Introduce you to some of the more important design criteria in language design
- Introduce you to BNF
This presentation was developed for the MDD 2010 course at ITU, Denmark.
Some materials are taken from Artur Boronat, University of Leicester, with permission.
Textual DSLs versus Graphical DSLs
With both textual and graphical syntax you can:
- model any meta model
- verify constraints in real time (Eclipse)
- write ordinary EMF models
Graphical editors:
- are good at showing structural relationships
- are sexy and often appeal to management
Textual editors:
- are better for "algorithmic" aspects or hierarchical models
- integrate better with CVS etc. (diff, merge)
- often provide better support for quick assist and quick fixes
If you must work with a DSL many hours every day, consider textual DSLs over graphical DSLs!
What is needed for a successful DSL
You must know:
- The intended audience
- The expressiveness of the language
- The extendibility of the language
The intended audience
Who will be using the language?
- Temporary help in a call center
- Full-time clerks
- Accountants
- Engineers – like yourselves
Any mismatch between the language and the audience is likely to alienate your audience.
Keywords:
- concise vs. redundant
- intuitive
- simple to write and read
Concise:
    (host "heimdal"
      (ipif 192.167.54.55)
      (disk:type SSD 500GB))
Redundant:
    Define host name = "heimdal"
    with IP interface
      IPv4 address = 192.167.54.55
    end IP address
    with disk
      type = SSD
      size = 500GB
    end disk
    End host
The expressiveness of the language
What type of information must be expressed in the language?
- Simple static information
- Complex network of objects
- Algorithms
- Must the DSL be Turing complete?
What type of constructs must be supported?
- Lists
- Tables
- Recursive data structures
- Variables
- Macros
- Conditions and loops
- Functions
- Comments
Example: E = mc²
The extendibility of the language
Can you be absolutely certain that the language will never be extended? "The last language…"
Backward compatibility:
- If your grammar is successful, then it will be used. So what should you do with all the existing "stuff" when you need to extend your grammar with new functionality?
Open versus closed grammars:
- An open grammar can be extended in the future with new constructs or statements in a natural manner
- A closed grammar can be very difficult to extend while retaining backward compatibility
Closed (positional):
    heimdal 192.167.54.55 SSD 500
Open (keyworded):
    (host "heimdal"
      (ipif 192.167.54.55)
      (disk type SSD 500GB))
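The difference between the two host notations can be illustrated with a small reader. This is only a sketch under stated assumptions: the helper names `parse_sexpr` and `find` and the added `(location ...)` construct are hypothetical and not part of the original material. It shows that a reader which looks constructs up by keyword keeps working when a new construct is added, while a positional reader does not.

```python
import re

def parse_sexpr(text):
    """Parse a parenthesised expression into nested Python lists."""
    tokens = re.findall(r'\(|\)|"[^"]*"|[^\s()]+', text)
    pos = 0

    def read():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        if token == "(":
            items = []
            while tokens[pos] != ")":
                items.append(read())
            pos += 1  # skip the closing ")"
            return items
        return token.strip('"')

    return read()

def find(node, keyword):
    """Return the first sub-list of node that starts with keyword, or None."""
    for child in node[1:]:
        if isinstance(child, list) and child and child[0] == keyword:
            return child
    return None

# Open notation: the second text uses a hypothetical new (location ...) construct.
old = parse_sexpr('(host "heimdal" (ipif 192.167.54.55) (disk type SSD 500GB))')
new = parse_sexpr('(host "heimdal" (location "Copenhagen") (ipif 192.167.54.55) (disk type SSD 500GB))')
print(find(old, "ipif")[1], find(new, "ipif")[1])  # same lookup works for both

# Closed notation: the meaning of a field depends on its position,
# so inserting a field shifts everything that follows it.
closed_old = "heimdal 192.167.54.55 SSD 500".split()
closed_new = "heimdal Copenhagen 192.167.54.55 SSD 500".split()
print(closed_old[1], closed_new[1])  # index 1 is no longer the IP address
```

The keyword lookup is what makes the open notation tolerant: an old reader simply never asks for the constructs it does not know.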
Internal versus External DSLs
Textual DSLs are normally divided into two distinct groups:
Internal DSL (or extension language)
- Uses the syntax and semantics of a host language or system, e.g. Java or Excel
- You can argue that any API is in itself a DSL
External DSL
- A completely new language
Some Examples: Internal DSL
- 4GL: an attempt in the 1990s to raise the productivity of programmers by adding more high-level constructs to 3GL languages (such as C, Pascal, etc.)
- JSP: a current attempt to ease the creation of web pages by adding Java as an embedded scripting language
- Some languages that have been designed to be good host languages for DSLs: Tcl, Python, Groovy, Ruby
Example in Java:
    final IFormCreator detailsSection = myForm.addSection("Details",
            table.getBinding().getSingleSelection());
    detailsSection.addObjectMessages();
    detailsSection.addField("contact.name(w=200)");
    detailsSection.addField("logoFileName(w=200)");
    detailsSection.addField("loyalty").arg(Constants.ARG_PREFERRED_CONTROL,
            RadioGroup.class.getName());
Example in Tcl (Expect):
    # Assume $remote_server, $my_user_id, $my_password, and $my_command
    # were read in earlier in the script.
    spawn telnet $remote_server
    expect "username:"
    send "$my_user_id\r"
    expect "password:"
    send "$my_password\r"
    expect "%"
    send "$my_command\r"
    expect "%"
    set results $expect_out(buffer)
    send "exit\r"
    expect eof
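Since Python is listed among the good host languages, here is a minimal sketch of what an internal DSL could look like there. It reuses the host description from the earlier slides; the `Host` class and the `host`, `ipif` and `disk` names are assumptions made up for this illustration, not an existing API. The only technique involved is that every method returns `self`, so ordinary method calls chain into something that reads like a small language.

```python
class Host:
    """Hypothetical builder; each method returns self so calls can be chained."""

    def __init__(self, name):
        self.name = name
        self.interfaces = []
        self.disks = []

    def ipif(self, address):
        """Add an IP interface with the given address."""
        self.interfaces.append(address)
        return self

    def disk(self, kind, size_gb):
        """Add a disk of the given type and size in GB."""
        self.disks.append((kind, size_gb))
        return self

def host(name):
    """Entry point of the hypothetical DSL."""
    return Host(name)

# The "program" is plain Python: syntax and semantics come from the host language.
heimdal = host("heimdal").ipif("192.167.54.55").disk("SSD", 500)
print(heimdal.name, heimdal.interfaces, heimdal.disks)
```

The trade-off is the one named on the previous slide: no new parser is needed, but the notation can never stray from what the host language allows.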
Formal Syntax of a DSL
Consists of two layers:
- a lexical layer
- a grammar layer
Lexical syntax: the spelling of words and punctuation
Grammars: the key formalism for describing syntax; universal across programming languages
Example: the same construct in four different syntaxes
    if (<expr>) <statement>
    if <expr> then <statement>
    <statement> if <expr>
    ( if <expr> <statement>)
Source text:
    if (a != null) b = a.getArg();
Tokens:
    "if" "(" "a" "!=" "null" ")" "b" "=" "a" "." "getArg" "(" ")" ";"
[Figure: parse tree of the statement, with an "if" root, an expression subtree ("!=" over the variable a and the constant null) and a statement subtree (an assignment).]
Lexical Syntax
Tokens/Terminals: the basic units of a programming language
Lexical syntax: the correspondence between the written representation (spelling) and the tokens or terminals in a grammar
- Keywords: alphabetic character sequences that form a unit in the language, such as if and while
- Punctuation: such as '(', ')', ':='
- Whitespace
Source text:
    if (a != null) b = a.getArg();
Tokens:
    "if" "(" "a" "!=" "null" ")" "b" "=" "a" "." "getArg" "(" ")" ";"
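The step from source text to tokens can be sketched with regular expressions. This is a minimal illustration, not a complete lexer: the token category names (`KEYWORD`, `CONSTANT`, `ID`, `PUNCT`) and the exact patterns are assumptions chosen to cover only the example statement.

```python
import re

# Order matters: keywords and constants are tried before general identifiers.
TOKEN_SPEC = [
    ("KEYWORD", r"\b(?:if|while)\b"),
    ("CONSTANT", r"\bnull\b"),
    ("ID", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("PUNCT", r"!=|[().;=]"),
    ("SKIP", r"\s+"),  # whitespace separates tokens but is not a token itself
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def tokenize(text):
    """Split text into (category, spelling) pairs, dropping whitespace."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at position {pos}")
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens

for category, spelling in tokenize("if (a != null) b = a.getArg();"):
    print(category, spelling)
```

The spellings produced for the example statement are the fourteen tokens listed above; the categories are what a grammar would then treat as its terminals.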
Context-Free Grammars
Concrete syntax: describes the written representation of a language, including lexical details such as the placement of keywords and punctuation marks
- Context-free grammars: a notation for specifying concrete syntax
Context-free means that the interpretation of the next token does not depend on the current context
Example from PL/1 of a context-dependent grammar:
    IF IF THEN THEN ELSE ELSE;
  This is legal if "IF" is a Boolean variable and "THEN" and "ELSE" are valid procedures
Most modern languages have a context-free grammar. Notable exceptions are C and C++!
Describing Context-Free Grammars
A grammar has four parts:
- A set of tokens or terminals: atomic symbols that cannot be subdivided
- A set of nonterminals: identifiers or variables representing constructs
- A set of rules (called productions) that identify the components of a construct:
    <nonterminal> ::= terminal | <nonterminal>
- A starting nonterminal that represents the main construct
Example:
    assignment => ID = expression ;
    expression => expression + term
                | expression - term
                | term
    term       => term * factor
                | term / factor
                | factor
    factor     => ( expression )
                | ID
                | NUMBER
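The four parts of a grammar map directly onto a small data structure. The sketch below writes the assignment grammar down as plain Python data; the names `PRODUCTIONS`, `START`, `NONTERMINALS` and `TERMINALS` are assumptions for illustration. It uses the rule of thumb that a nonterminal is anything with a production, and every other symbol is a terminal.

```python
# Productions: each nonterminal maps to a list of alternatives,
# and each alternative is a sequence of symbols.
PRODUCTIONS = {
    "assignment": [["ID", "=", "expression", ";"]],
    "expression": [["expression", "+", "term"], ["expression", "-", "term"], ["term"]],
    "term": [["term", "*", "factor"], ["term", "/", "factor"], ["factor"]],
    "factor": [["(", "expression", ")"], ["ID"], ["NUMBER"]],
}

# The starting nonterminal represents the main construct.
START = "assignment"

# Nonterminals are the symbols that have productions.
NONTERMINALS = set(PRODUCTIONS)

# Terminals are the symbols that appear in a production but are never defined.
TERMINALS = {
    symbol
    for alternatives in PRODUCTIONS.values()
    for alternative in alternatives
    for symbol in alternative
    if symbol not in NONTERMINALS
}

print(sorted(NONTERMINALS))
print(sorted(TERMINALS))
```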
Writing Grammars
The productions are rules for building strings
Parse trees: show how a string can be built
Notations for writing grammars:
- Backus-Naur Form (BNF)
- Extended BNF (EBNF)
Backus-Naur Form (Backus Normal Form)
Grammar rules or productions: define symbols.
    assignment_stmt ::= id = expression ;
  - assignment_stmt, to the left of ::=, is the non-terminal symbol being defined
  - id = expression ; to the right of ::=, is the definition (production)
Non-terminal symbols: anything that is defined on the left side of some production.
Terminal symbols: things that are not defined by productions. They can be literals, symbols, and other lexemes of the language defined by lexical rules.
- Identifiers: id ::= [A-Za-z_]+
- Delimiters: ;
- Operators: = + - * / %
Backus-Naur Form (Backus Normal Form)
Different notations (same meaning):
    assignment_stmt ::= id = expression + term
    <assignment-stmt> => <id> = <expr> + <term>
    AssignmentStmt → id = expression + term
::=, => and → mean "consists of" or "defined as"
Alternatives ("|"):
    expression => expression + term
                | expression - term
                | term
Concatenation:
    number => DIGIT number | DIGIT
Backus-Naur Form (Backus Normal Form)
Another way to write alternatives:
    expression => expression + term
               => expression - term
               => term
Null symbol, ε or @: used to allow a production to match nothing.
Example: a variable is an identifier followed by an optional subscript
    variable  => identifier subscript
    subscript => [ expression ] | ε
Arithmetic Grammar
Here is a grammar for assignment with arithmetic operations, e.g. y = (2*x + 5)*x - 7;
    assignment => ID = expression ;
    expression => expression + term
                | expression - term
                | term
    term       => term * factor
                | term / factor
                | factor
    factor     => ( expression )
                | ID
                | NUMBER
Example of a Parse Tree
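Tokens:
    "if" "(" "a" "!=" "null" ")" "b" "=" "a" "." "getArg" "(" ")" ";"
[Figure: parse tree built over these tokens, with an "if" root, an expression subtree ("!=" over the variable a and the constant null) and a statement subtree (an assignment).]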
Ambiguity
Grammar rules:
    expr => expr + expr
          | expr * expr
          | ( expr )
          | NUMBER
Expression: 2 + 3 * 4
Two possible parse trees:
- "+" at the root, with NUMBER (2) on the left and the subtree NUMBER (3) * NUMBER (4) on the right, i.e. 2 + (3 * 4)
- "*" at the root, with the subtree NUMBER (2) + NUMBER (3) on the left and NUMBER (4) on the right, i.e. (2 + 3) * 4
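The ambiguity can be made concrete by enumerating every parse tree the grammar allows. The sketch below is an illustration under simplifying assumptions: it covers only the rules expr => expr + expr | expr * expr | NUMBER (the parenthesis rule is left out), it expects a token list that alternates numbers and operators, and the function names are made up for this example.

```python
def parse_trees(tokens):
    """Return every parse tree for tokens under expr => expr + expr | expr * expr | NUMBER."""
    if len(tokens) == 1:
        return [tokens[0]]  # a single NUMBER
    trees = []
    # Operators sit at the odd positions; try each one as the root of the tree.
    for i in range(1, len(tokens), 2):
        for left in parse_trees(tokens[:i]):
            for right in parse_trees(tokens[i + 1:]):
                trees.append((left, tokens[i], right))
    return trees

def tree_value(tree):
    """Evaluate a parse tree; a tree is either a number or (left, operator, right)."""
    if isinstance(tree, int):
        return tree
    left, operator, right = tree
    if operator == "+":
        return tree_value(left) + tree_value(right)
    return tree_value(left) * tree_value(right)

for tree in parse_trees([2, "+", 3, "*", 4]):
    print(tree, "=", tree_value(tree))
# (2, '+', (3, '*', 4)) = 14
# ((2, '+', 3), '*', 4) = 20
```

Two trees for one string means the grammar alone does not determine the meaning, which is why the arithmetic grammar with separate expression, term and factor rules is preferred.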
EBNF – Extended BNF
The standard metalanguage: Extended BNF
- Terminal symbols of the language are quoted
- [ and ] enclose optional symbols
- { and } indicate repetition
- ( and ) are used to group items
- | is the definition-separator symbol and stands for "or"
- In EBNF we need to quote ( and ) literals as '(' ... ')'
- * is the repetition symbol
- = is the defining symbol
- ; is the terminator symbol
BNF versus EBNF
BNF:
    expression ::= expression + term
                 | expression - term
                 | term
    term       ::= term * factor
                 | term / factor
                 | factor
    factor     ::= ( expression )
                 | id
                 | number
EBNF:
    expression ::= term { ('+'|'-') term }
    term       ::= factor { ('*'|'/') factor }
    factor     ::= '(' expression ')'
                 | id
                 | number
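One practical advantage of the EBNF form is that it translates almost line by line into a hand-written parser: each rule becomes a function and each { ... } repetition becomes a loop. The following is a rough sketch of that idea, not a reference implementation. It evaluates while it parses, the `id` alternative of factor is left out to keep it short, and the name `calculate` is an assumption for this example.

```python
import re

def calculate(text):
    """Parse and evaluate an arithmetic expression by following the EBNF rules."""
    # Sketch only: characters other than digits, operators and parentheses are ignored.
    tokens = re.findall(r"\d+|[-+*/()]", text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        token = tokens[pos]
        pos += 1
        return token

    def expression():
        # expression ::= term { ('+'|'-') term }
        value = term()
        while peek() in ("+", "-"):
            if advance() == "+":
                value += term()
            else:
                value -= term()
        return value

    def term():
        # term ::= factor { ('*'|'/') factor }
        value = factor()
        while peek() in ("*", "/"):
            if advance() == "*":
                value *= factor()
            else:
                value /= factor()
        return value

    def factor():
        # factor ::= '(' expression ')' | number
        token = advance()
        if token == "(":
            value = expression()
            if advance() != ")":
                raise SyntaxError("expected ')'")
            return value
        if not token.isdigit():
            raise SyntaxError(f"unexpected token {token!r}")
        return int(token)

    result = expression()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return result

print(calculate("2 + 3 * 4"))    # 14
print(calculate("(2 + 3) * 4"))  # 20
```

The left-recursive BNF rules cannot be translated this directly, because a function for expression that starts by calling itself would never terminate; the loops in the EBNF version avoid that while still grouping operators from left to right.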
More Information
- "Model-Driven Software Development" by Thomas Stahl and Markus Völter, Chapter 8.1 "DSL Construction"
  Good high-level introduction to DSLs
- "Domain-Specific Languages: An Annotated Bibliography" - http://homepages.cwi.nl/~arie/papers/dslbib/
  Not very interesting except for the very long annotated bibliography!
- "BNF and EBNF: What are they and how do they work?" by Lars Marius Garshol - http://www.garshol.priv.no/download/text/bnf.html
  About Backus-Naur Form and parsing
- "Internal Domain-Specific Languages" - http://fragmental.tw/research-on-dsls/domain-specific-languages-dsls/internal-dsls/
  Outlines use of DSL in Java and Ruby
Exercise 1
Design a textual language (grammar) for a travel package.
- How would you like to specify a travel package in text?
- Do you think you can teach a travel administrator to use the language?
- Is the grammar open, i.e. can it be extended in the future?
For the grammar you can use ID, STRING and NUMBER as terminals where relevant.