Intro – Lecture 9
1
Chair of Software Engineering
Introduction to Programming
Michela Pedroni
16 November 2003
Lecture 9: Describing the Syntax
Why describe the syntax formally?
We know syntax descriptions for human languages:
e.g. grammar for German, French, …
Expressed in natural language
Good enough for human use
Ambiguous, like human languages themselves
Syntax: Conditional
A conditional instruction consists, in order, of:
An “If part”, of the form if condition.
A “Then part”, of the form then compound.
Zero or more “Else if parts”, each of the form elseif condition then compound.
Zero or one “Else part”, of the form else compound.
The keyword end.
Here each condition is a boolean expression, and each compound is a compound instruction.
Why describe the syntax formally?
Programming languages need better descriptions:
More precise: must tell us unambiguously whether given program text is legal or not
Use formalism similar to mathematics
Can be fed into compilers for automatic processing of programs
Why describe the syntax formally?
Compilers use algorithms to
Determine if input is correct program text
Analyse program text to extract specimens
Translate program text to machine instructions
Compilers need strict formal definition of programming language
Formal Description of Syntax
Use formal language to describe programming languages.
Languages used to describe other languages are called Meta-Languages
Meta-Language used to describe Eiffel: BNF-E (variant of the Backus-Naur Form, BNF)
History
1954 FORTRAN: First widely recognized programming language (developed by John Backus et al.)
1958 ALGOL 58: Joint work of European and American groups
1960 ALGOL 60: Preparation showed a need for a formal description; John Backus (member of the ALGOL team) proposed the Backus Normal Form (BNF)
1964: Donald Knuth suggested acknowledging Peter Naur for his contribution: Backus-Naur Form
Many variants since then, e.g. the graphical variant by Niklaus Wirth
Formal description of a language
BNF lets us describe syntactical properties of a language
Remember: the description of a programming language also includes lexical and semantic properties, which are handled by other tools
Formal Description of Syntax
A language is a set of phrases
A phrase is a finite sequence of tokens from a certain “vocabulary”
Not every possible sequence is a phrase of the language
A grammar specifies which sequences are phrases and which are not
BNF is used to define a grammar for a programming language
Grammar
Definition: A grammar for a language is a finite set of rules for producing phrases, such that:
1. Any sequence obtained by a finite number of applications of rules from the grammar is a phrase of the language.
2. Any phrase of the language can be obtained by a finite number of applications of rules from the grammar.
Elements of a grammar: Terminals
Terminals
Tokens of the language that are not defined by a production of the grammar.
E.g. keywords from Eiffel such as if, then, end, or symbols such as the semicolon “;” or the assignment “:=”
Elements of a grammar: Nonterminals
Nonterminals
Names of syntactical structures or substructures used to build phrases.
Elements of a grammar: Productions
Productions
Rules that define nonterminals of the grammar using a combination of terminals and (other) nonterminals
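A minimal Python sketch of these three elements, assuming a made-up toy grammar (the names GRAMMAR, Greeting, Name and phrases are illustrative and not part of the lecture). Keys of the table are nonterminals, each inner list is one production alternative, and any symbol without an entry is a terminal. Because this toy grammar is not recursive, the set of phrases it produces is finite and can be enumerated.

```python
from itertools import product

# Hypothetical toy grammar: nonterminal -> list of alternatives.
# Symbols that do not appear as a key are terminals.
GRAMMAR = {
    "Greeting": [["hello", "Name"], ["hi", "Name"]],
    "Name": [["world"], ["Eiffel"]],
}


def phrases(symbol):
    """Return every phrase (tuple of terminals) that symbol can produce."""
    if symbol not in GRAMMAR:
        # A terminal produces only itself.
        return [(symbol,)]
    result = []
    for alternative in GRAMMAR[symbol]:
        # Expand each symbol of the alternative, then combine the expansions.
        expansions = [phrases(s) for s in alternative]
        for combination in product(*expansions):
            result.append(tuple(tok for part in combination for tok in part))
    return result


def is_phrase(tokens):
    """A token sequence is a phrase if the grammar can produce it."""
    return tuple(tokens) in phrases("Greeting")
```

For example, is_phrase(["hi", "Eiffel"]) holds, while is_phrase(["Eiffel", "hi"]) does not, since no sequence of rule applications yields that order.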
An example production
Conditional: if Condition then Instruction else Instruction end
Terminals: if, then, else, end
Nonterminals: Condition, Instruction
Production: the whole rule defining Conditional
BNF Elements: Concatenation
BNF: A B
Meaning: A followed by B
Graphical representation: [diagram: A B]
BNF Elements: Optional
BNF: [ A ]
Meaning: A or nothing
Graphical representation: [diagram: A]
BNF Elements: Choice
BNF: A | B
Meaning: either A or B
Graphical representation: [diagram: A, B]
BNF Elements: Repetition
BNF: { A }*
Meaning: sequence of zero or more A
Graphical representation: [diagram: A]
BNF Elements: Repetition, once or more
BNF: { A }+
Meaning: sequence of one or more A
Graphical representation: [diagram: A]
BNF elements: Overview
Concatenation: A B
Optional: [ A ]
Choice: A | B
Repetition (zero or more): { A }*
Repetition (at least once): { A }+
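The five elements can be mirrored one-to-one by small recognizer functions. The following Python sketch is only an illustration under assumptions: the function names (term, concat, optional, choice, star, plus, matches) are invented here, and each recognizer returns the set of token positions it can reach, which is one of several possible designs.

```python
def term(t):
    """Recognizer for a single terminal t."""
    def parse(tokens, i):
        return {i + 1} if i < len(tokens) and tokens[i] == t else set()
    return parse


def concat(*parsers):
    """A B: each parser continues where the previous one stopped."""
    def parse(tokens, i):
        positions = {i}
        for p in parsers:
            positions = {j for k in positions for j in p(tokens, k)}
        return positions
    return parse


def optional(p):
    """[ A ]: either skip A (stay at i) or match it."""
    def parse(tokens, i):
        return {i} | p(tokens, i)
    return parse


def choice(*parsers):
    """A | B: union of what each alternative can reach."""
    def parse(tokens, i):
        return set().union(*(p(tokens, i) for p in parsers))
    return parse


def star(p):
    """{ A }*: apply p zero or more times."""
    def parse(tokens, i):
        reached, frontier = {i}, {i}
        while frontier:
            frontier = {j for k in frontier for j in p(tokens, k)} - reached
            reached |= frontier
        return reached
    return parse


def plus(p):
    """{ A }+: one A followed by zero or more A."""
    return concat(p, star(p))


def matches(p, tokens):
    """True if p can consume the whole token list."""
    return len(tokens) in p(tokens, 0)
```

With A = term("a") and B = term("b"), matches(concat(A, B), ["a", "b"]) holds, matches(star(A), []) holds, and matches(plus(A), []) does not.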
BNF Elements Combined
[diagram: Conditional: if condition then instruction, optionally else instruction, then end]
written in BNF:
Conditional ≜ if condition then instruction [ else instruction ] end
BNF: Conditional with elseif
Conditional ≜ if Then_part_list [ Else_part ] end
Then_part_list ≜ Then_part { elseif Then_part }*
Then_part ≜ Boolean_expression then Compound
Else_part ≜ else Compound
Different Grammar for Conditional
Conditional ≜ If_part Then_part Else_list end
If_part ≜ if Boolean_expression
Then_part ≜ then Compound
Else_list ≜ { Elseif_part }* [ else Compound ]
Elseif_part ≜ elseif Boolean_expression Then_part
BNF-E
Used in the official description of Eiffel.
Every production is one of:
Concatenation: A ≜ B C [ D ]
Choice: A ≜ B | C | D
Repetition: A ≜ { B terminal ... }*
  which stands for the plain BNF production A ≜ [ B { terminal B }* ]
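The BNF-E repetition with a separator can be checked by following its plain BNF expansion literally. This is a rough Python sketch; the function name matches_separated and the is_b predicate are assumptions made for illustration, with B simplified to a single token.

```python
def matches_separated(tokens, is_b, separator):
    """Recognize { B separator ... }* via [ B { separator B }* ]."""
    if not tokens:
        # The outer brackets make the whole construct optional.
        return True
    if not is_b(tokens[0]):
        # A non-empty sequence must start with a B.
        return False
    i = 1
    while i < len(tokens):
        # Each further step must be: separator, then another B.
        if tokens[i] != separator:
            return False
        if i + 1 >= len(tokens) or not is_b(tokens[i + 1]):
            return False
        i += 2
    return True
```

With str.isalpha as is_b and ";" as separator, the sequences [], ["x"] and ["x", ";", "y"] are accepted, while ["x", ";"] and ["x", "y"] are rejected: the separator appears only between items, never at the end.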
BNF-E Rules
Every nonterminal must appear on the left-hand side of exactly one production, called its defining production
Every production must be of one kind: Concatenation, Choice or Repetition
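These two rules are mechanical enough to check automatically. The Python sketch below assumes a hypothetical representation in which each production is a tuple (left-hand side, kind, right-hand-side symbols); check_bnf_e and KINDS are names invented for this example.

```python
from collections import Counter

KINDS = {"concatenation", "choice", "repetition"}


def check_bnf_e(productions, terminals):
    """Return a list of rule violations (empty if none are found)."""
    problems = []
    # Rule 2: every production is of exactly one known kind.
    for lhs, kind, _ in productions:
        if kind not in KINDS:
            problems.append(f"{lhs}: unknown production kind {kind!r}")
    # Rule 1: no nonterminal has more than one defining production ...
    defined = Counter(lhs for lhs, _, _ in productions)
    for name, count in defined.items():
        if count > 1:
            problems.append(f"{name}: {count} defining productions")
    # ... and every nonterminal that is used has one.
    used = {s for _, _, rhs in productions for s in rhs if s not in terminals}
    for name in sorted(used - set(defined)):
        problems.append(f"{name}: no defining production")
    return problems


PRODUCTIONS = [
    ("Conditional", "concatenation", ["if", "Then_part_list", "Else_part", "end"]),
    ("Then_part_list", "repetition", ["Then_part", "elseif"]),
    ("Then_part", "concatenation", ["Boolean_expression", "then", "Compound"]),
    ("Else_part", "concatenation", ["else", "Compound"]),
]
TERMINALS = {"if", "then", "elseif", "else", "end"}
```

Run on this fragment, the check reports that Boolean_expression and Compound still lack a defining production, which is expected for an excerpt of a larger grammar.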
Conditional with elseif (BNF)
Conditional ≜ if Then_part_list [ Else_part ] end
Then_part_list ≜ Then_part { elseif Then_part }*
Then_part ≜ Boolean_expression then Compound
Else_part ≜ else Compound
BNF-E: Conditional
Conditional ≜ if Then_part_list [ Else_part ] end
Then_part_list ≜ { Then_part elseif ... }+
Then_part ≜ Boolean_expression then Compound
Else_part ≜ else Compound
Recursive grammars
Constructs may be nested
Express this in BNF with recursive grammars
Recursion: circular dependency of productions
Recursive grammars
Conditionals can be nested within conditionals:
Else_part ≜ else Compound
Compound ≜ { Instruction ; ... }*
Instruction ≜ Conditional | Loop | Call | ...
Recursive grammars
Production name can be used in its own definition
Definition of Then_part_list with repetition:
Then_part_list ≜ { Then_part elseif ... }+
Recursive definition of Then_part_list:
Then_part_list ≜ Then_part [ elseif Then_part_list ]
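To see that the two styles describe the same sequences, one can write a recognizer for each and compare them exhaustively on short inputs. In this Python sketch the repetition form is read as one or more Then_part separated by elseif, Then_part is simplified to a single token, and all function names are hypothetical.

```python
from itertools import product


def list_by_repetition(tokens):
    """One or more Then_part separated by elseif, recognized with a loop."""
    if not tokens or tokens[0] != "Then_part":
        return False
    i = 1
    while i < len(tokens):
        if tokens[i] != "elseif":
            return False
        if i + 1 >= len(tokens) or tokens[i + 1] != "Then_part":
            return False
        i += 2
    return True


def list_by_recursion(tokens):
    """Then_part [ elseif Then_part_list ], recognized with recursion."""
    if not tokens or tokens[0] != "Then_part":
        return False
    if len(tokens) == 1:
        # The optional part is absent.
        return True
    # The optional part is present: elseif followed by another list.
    return tokens[1] == "elseif" and list_by_recursion(tokens[2:])


def agree_up_to(length):
    """Compare both recognizers on every token sequence up to length."""
    symbols = ("Then_part", "elseif")
    return all(
        list_by_repetition(list(seq)) == list_by_recursion(list(seq))
        for n in range(length + 1)
        for seq in product(symbols, repeat=n)
    )
```

The loop in the first function plays the role of the repetition braces, and the self-call in the second plays the role of the production name appearing in its own definition.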
Guidelines for Grammars
Keep productions short.
  easier to read
  better assessment of language size
Conditional ≜ if Boolean_expression then Compound { elseif Boolean_expression then Compound }* [ else Compound ] end
Guidelines for Grammars
Treat lexical constructs like terminals
  Identifiers
  Constant values
Identifier ≜ Letter (Letter | Digit | "_")*
Integer_constant ≜ Digit+
Floating_point ≜ [-] Digit* "." Digit+
Letter ≜ "A" | "B" | ... | "Z" | "a" | ... | "z"
Digit ≜ "0" | "1" | ... | "9"
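In practice such lexical constructs are commonly recognized by a separate scanner, so that the grammar only sees finished tokens. A possible Python sketch using the standard re module follows. The three named patterns transcribe the productions above, reading [-] as an optional minus sign; the Symbol group (":=" and ";") and the function name tokenize are assumptions added to make the example usable.

```python
import re

# One named alternative per lexical construct. Floating_point is tried
# before Integer_constant so that "3.14" is not split at the dot.
TOKEN = re.compile(
    r"""
    (?P<Floating_point>-?[0-9]*\.[0-9]+)
  | (?P<Integer_constant>[0-9]+)
  | (?P<Identifier>[A-Za-z][A-Za-z0-9_]*)
  | (?P<Symbol>:=|;)
  | (?P<Space>\s+)
    """,
    re.VERBOSE,
)


def tokenize(text):
    """Split text into (kind, lexeme) pairs, skipping white space."""
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        if m.lastgroup != "Space":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens
```

The parser can then treat Identifier, Integer_constant and Floating_point as if they were terminals and never has to look at individual letters or digits.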
Guidelines for Grammars
Use unambiguous productions.
Applicable production can be found by looking at one lexical element at a time
Conditional ≜ if Then_part_list [ Else_part ] end
Compound ≜ { Instruction }*
Instruction ≜ Conditional | Loop | Call | ...
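One way to test whether a choice can be decided from a single lexical element is to compute, for each alternative, the set of tokens it can start with and to check that these sets do not overlap. The Python sketch below does this under simplifying assumptions: no alternative is empty, the grammar has no left recursion, and the GOOD and BAD grammars are invented examples (the Loop and Assignment shapes in particular are illustrative only).

```python
def first(symbol, grammar):
    """Set of terminals that a phrase produced from symbol can start with."""
    if symbol not in grammar:
        # A terminal starts with itself.
        return {symbol}
    result = set()
    for alternative in grammar[symbol]:
        result |= first(alternative[0], grammar)
    return result


def choice_is_predictive(nonterminal, grammar):
    """True if one token suffices to select an alternative."""
    seen = set()
    for alternative in grammar[nonterminal]:
        starts = first(alternative[0], grammar)
        if starts & seen:
            # Two alternatives can begin with the same token.
            return False
        seen |= starts
    return True


GOOD = {
    "Instruction": [["Conditional"], ["Loop"], ["Call"]],
    "Conditional": [["if", "Condition", "then", "Instruction", "end"]],
    "Loop": [["from", "Instruction", "until", "Condition",
              "loop", "Instruction", "end"]],
    "Call": [["identifier"]],
    "Condition": [["identifier"]],
}

BAD = {
    "Instruction": [["Assignment"], ["Call"]],
    "Assignment": [["identifier", ":=", "identifier"]],
    "Call": [["identifier"]],
}
```

In GOOD the three alternatives of Instruction start with if, from and identifier respectively, so one token decides. In BAD both alternatives start with identifier, so a single token is not enough.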
Writing a Parser
One feature per production
Concatenation: sequence of feature calls for nonterminals, checks for terminals
Choice: conditional with compound per alternative
Repetition: loop
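The scheme above can be sketched in Python, with one method per production of the conditional grammar. This is a simplified illustration, not the lecture's implementation: tokens are obtained by splitting on white space, Boolean_expression and Call are each reduced to a single name, Loop is left out, and the class and method names are invented.

```python
KEYWORDS = {"if", "then", "elseif", "else", "end"}


class Parser:
    def __init__(self, text):
        self.tokens = text.split()
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def expect(self, terminal):
        """Check for a terminal and consume it."""
        if self.peek() != terminal:
            raise SyntaxError(f"expected {terminal!r}, found {self.peek()!r}")
        self.pos += 1

    def name(self):
        """Stand-in for Boolean_expression and Call: any non-keyword token."""
        tok = self.peek()
        if tok is None or tok in KEYWORDS:
            raise SyntaxError(f"expected a name, found {tok!r}")
        self.pos += 1
        return tok

    def conditional(self):
        # Concatenation: terminal checks and calls, in order.
        self.expect("if")
        parts = self.then_part_list()
        otherwise = self.else_part() if self.peek() == "else" else None
        self.expect("end")
        return ("Conditional", parts, otherwise)

    def then_part_list(self):
        # Repetition: a loop driven by the separator elseif.
        parts = [self.then_part()]
        while self.peek() == "elseif":
            self.expect("elseif")
            parts.append(self.then_part())
        return parts

    def then_part(self):
        condition = self.name()
        self.expect("then")
        return (condition, self.compound())

    def else_part(self):
        self.expect("else")
        return self.compound()

    def compound(self):
        # Repetition: instructions until a token that closes the compound.
        instructions = []
        while self.peek() is not None and self.peek() not in {"elseif", "else", "end"}:
            instructions.append(self.instruction())
        return instructions

    def instruction(self):
        # Choice: one token decides which alternative applies.
        if self.peek() == "if":
            return self.conditional()
        return ("Call", self.name())


def parse(text):
    parser = Parser(text)
    tree = parser.conditional()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected {parser.peek()!r} after end")
    return tree
```

Calling parse("if a then x elseif b then y else z end") returns a nested tuple describing the conditional, and a nested conditional such as "if a then if b then x end end" is handled by instruction calling conditional again.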
Writing a Parser: EiffelParse
Automatic generation of abstract syntax tree for phrase
Based on BNF-E
One class per production
Classes inherit from predefined classes AGGREGATE, CHOICE, REPETITION, TERMINAL
Feature production defines Production
Writing a Parser: Tools
Yooc
Translates BNF-E to EiffelParse classes
Yacc / Bison
Translates BNF to C parser
End lecture 9