Date post: | 04-Apr-2018 |
Upload: | mohsan-naqi |
7/31/2019 parallel computing tutorial
Defining Program Syntax
Chapter Two Modern Programming Languages, 2nd ed. 1
Syntax And Semantics
Programming language syntax: how programs look, their form and structure
Syntax is defined using a kind of formal grammar
Programming language semantics: what programs do, their behavior and meaning
Semantics is harder to define (more on this in Chapter 23)
Outline
Grammar and parse tree examples
BNF and parse tree definitions
Constructing grammars
Phrase structure and lexical structure
Other grammar forms
An English Grammar
A sentence is a noun phrase, a verb, and a noun phrase.
A noun phrase is an article and a noun.
A verb is...
An article is...
A noun is...
<S> ::= <NP> <V> <NP>
<NP> ::= <A> <N>
<V> ::= loves | hates | eats
<A> ::= a | the
<N> ::= dog | cat | rat
How The Grammar Works
The grammar is a set of rules that say how to build a tree: a parse tree
You put <S> at the root of the tree
The grammar's rules say how children can be added at any point in the tree
For instance, the rule
    <S> ::= <NP> <V> <NP>
says you can add nodes <NP>, <V>, and <NP>, in that order, as children of <S>
A Parse Tree
[Figure: parse tree for "the dog loves the cat", with <S> at the root, children <NP> <V> <NP>, and each <NP> expanding to <A> <N>]
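As a rough illustration, the parse tree above can be written down as nested data and checked against the grammar. This is only a sketch: the dictionary encoding, the nonterminal names S, NP, V, A, N, and the helper names noun_phrase, is_valid and leaves are assumptions made for this example, not part of the slides.

```python
# Hypothetical encoding of the English grammar. Any string that is not a
# key of GRAMMAR is treated as a token.
GRAMMAR = {
    "S": [["NP", "V", "NP"]],
    "NP": [["A", "N"]],
    "V": [["loves"], ["hates"], ["eats"]],
    "A": [["a"], ["the"]],
    "N": [["dog"], ["cat"], ["rat"]],
}


def noun_phrase(article, noun):
    """Build the subtree for <NP> ::= <A> <N>."""
    return ("NP", [("A", [article]), ("N", [noun])])


# A tree node is a (symbol, children) pair; a token leaf is a plain string.
tree = ("S", [noun_phrase("the", "dog"), ("V", ["loves"]), noun_phrase("the", "cat")])


def is_valid(node):
    """Check that every interior node uses one of its productions."""
    if isinstance(node, str):
        return node not in GRAMMAR  # leaves must be tokens
    symbol, children = node
    labels = [c if isinstance(c, str) else c[0] for c in children]
    return labels in GRAMMAR.get(symbol, []) and all(is_valid(c) for c in children)


def leaves(node):
    """Read the tokens off the tree from left to right."""
    if isinstance(node, str):
        return [node]
    return [tok for child in node[1] for tok in leaves(child)]


print(is_valid(tree))
print(" ".join(leaves(tree)))
```

Reading the leaves from left to right recovers the sentence the tree derives.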
A Programming Language Grammar
<exp> ::= <exp> + <exp> | <exp> * <exp> | ( <exp> )
        | a | b | c
An expression can be the sum of two expressions, or the product of two expressions, or a parenthesized subexpression
Or it can be one of the variables a, b or c
A Parse Tree
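[Figure: parse tree for ((a+b)*c): the root <exp> expands to ( <exp> ); the inner <exp> expands to <exp> * <exp>; the left operand expands to ( <exp> ) containing <exp> + <exp> with leaves a and b; the right operand is c]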
Outline
Grammar and parse tree examples
BNF and parse tree definitions
Constructing grammars
Phrase structure and lexical structure
Other grammar forms
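<S> ::= <NP> <V> <NP>
<NP> ::= <A> <N>
<V> ::= loves | hates | eats
<A> ::= a | the
<N> ::= dog | cat | rat
Labeled parts: tokens (loves, hates, eats, a, the, dog, cat, rat); non-terminal symbols (<S>, <NP>, <V>, <A>, <N>); start symbol (<S>); a production (for example, <NP> ::= <A> <N>)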
BNF Grammar Definition
A BNF grammar consists of four parts:
The set of tokens
The set of non-terminal symbols
The start symbol
The set of productions
Definition, Continued
The tokens are the smallest units of syntax
  Strings of one or more characters of program text
  They are atomic: not treated as being composed from smaller parts
The non-terminal symbols stand for larger pieces of syntax
  They are strings enclosed in angle brackets, as in <NP>
  They are not strings that occur literally in program text
  The grammar says how they can be expanded into strings of tokens
The start symbol is the particular non-terminal that forms the root of any parse tree for the grammar
Definition, Continued
The productions are the tree-building rules
  Each one has a left-hand side, the separator ::=, and a right-hand side
  The left-hand side is a single non-terminal
  The right-hand side is a sequence of one or more things, each of which can be either a token or a non-terminal
A production gives one possible way of building a parse tree: it permits the non-terminal symbol on the left-hand side to have the things on the right-hand side, in order, as its children in a parse tree
Alternatives
When there is more than one production with the same left-hand side, an abbreviated form can be used
The BNF grammar can give the left-hand side, the separator ::=, and then a list of possible right-hand sides separated by the special symbol |
Example
<exp> ::= <exp> + <exp> | <exp> * <exp> | ( <exp> )
        | a | b | c
Note that there are six productions in this grammar. It is equivalent to this one:
<exp> ::= <exp> + <exp>
<exp> ::= <exp> * <exp>
<exp> ::= ( <exp> )
<exp> ::= a
<exp> ::= b
<exp> ::= c
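The equivalence between the abbreviated and the expanded form can be sketched mechanically. The function below is a hypothetical helper (the name expand and the text layout it accepts are assumptions for this example): it splits each rule at the | separators and emits one production per alternative.

```python
def expand(bnf_text):
    """Split abbreviated BNF rules into one (lhs, rhs) pair per production.

    Assumes each rule starts on a line containing '::=', continuation
    lines start with '|', and '|' is never a token of the language itself.
    """
    productions = []
    lhs = None
    for line in bnf_text.strip().splitlines():
        line = line.strip()
        if "::=" in line:
            lhs, line = (part.strip() for part in line.split("::=", 1))
        elif line.startswith("|"):
            line = line[1:]
        for alternative in line.split("|"):
            productions.append((lhs, alternative.split()))
    return productions


abbreviated = """
<exp> ::= <exp> + <exp> | <exp> * <exp> | ( <exp> )
        | a | b | c
"""

for lhs, rhs in expand(abbreviated):
    print(lhs, "::=", " ".join(rhs))
```

Under these assumptions the two-line abbreviated grammar yields six separate productions, matching the count on the slide.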
Empty
The special nonterminal <empty> is for places where you want the grammar to generate nothing
For example, this grammar defines a typical if-then construct with an optional else part:
<if-stmt> ::= if <expr> then <stmt> <else-part>
<else-part> ::= else <stmt> | <empty>
Parse Trees
To build a parse tree, put the start symbol at the root
Add children to every non-terminal, following any one of the productions for that non-terminal in the grammar
Done when all the leaves are tokens
Read off leaves from left to right: that is the string derived by the tree
Practice
Show a parse tree for each of these strings:
a+b
a*b+c
(a+b)
(a+(b))
<exp> ::= <exp> + <exp> | <exp> * <exp> | ( <exp> )
        | a | b | c
Compiler Note
What we just did is parsing: trying to find a parse tree for a given string
That's what compilers do for every program you try to compile: try to build a parse tree for your program, using the grammar for whatever language you used
Take a course in compiler construction to learn about algorithms for doing this efficiently
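To make "trying to find a parse tree" concrete, here is a deliberately naive sketch for the expression grammar with productions exp + exp, exp * exp, ( exp ), a, b and c. It is not one of the efficient algorithms taught in a compiler course: it simply tries every production and every split point, with memoization. The function name parse and the tuple representation of trees are assumptions for this example, and tokens are taken to be single characters with no white space.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def parse(s):
    """Return one parse tree for s under the expression grammar, or None.

    A tree is a tuple ("exp", child, ...) whose children are tokens or
    subtrees. The grammar is ambiguous, so other trees may exist too.
    """
    if s in ("a", "b", "c"):  # exp ::= a | b | c
        return ("exp", s)
    if len(s) >= 3 and s[0] == "(" and s[-1] == ")":
        inner = parse(s[1:-1])  # exp ::= ( exp )
        if inner is not None:
            return ("exp", "(", inner, ")")
    for i, ch in enumerate(s):  # exp ::= exp + exp | exp * exp
        if ch in "+*":
            left = parse(s[:i])
            right = parse(s[i + 1:]) if left is not None else None
            if right is not None:
                return ("exp", left, ch, right)
    return None


for text in ["a+b", "a*b+c", "(a+b)", "(a+(b))", "a+"]:
    print(text, "->", parse(text))
```

Strings outside the language, such as a+, produce None because no combination of productions derives them.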
Language Definition
We use grammars to define the syntax of programming languages
The language defined by a grammar is the set of all strings that can be derived by some parse tree for the grammar
As in the previous example, that set is often infinite (though grammars are finite)
Constructing grammars is a little like programming...
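One way to get a feel for "the set of all strings" is to enumerate the part of the language below a length bound. The sketch below does this for the expression grammar (sums, products, parenthesized subexpressions, and the variables a, b, c), treating each character as a token. The function name language_up_to is an assumption for this example.

```python
def language_up_to(max_len):
    """All strings of length <= max_len derivable from the expression grammar."""
    found = {"a", "b", "c"}  # exp ::= a | b | c
    while True:
        new = set(found)
        for x in found:
            if len(x) + 2 <= max_len:  # exp ::= ( exp )
                new.add("(" + x + ")")
            for y in found:
                if len(x) + 1 + len(y) <= max_len:  # exp ::= exp + exp | exp * exp
                    new.add(x + "+" + y)
                    new.add(x + "*" + y)
        if new == found:  # nothing new fits under the bound
            return found
        found = new


for bound in (1, 3, 5):
    print(bound, len(language_up_to(bound)))
```

The count keeps growing as the bound is raised, which is the sense in which a finite grammar defines an infinite language.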
Outline
Grammar and parse tree examples
BNF and parse tree definitions
Constructing grammars
Phrase structure and lexical structure
Other grammar forms
Constructing Grammars
Most important trick: divide and conquer
Example: the language of Java declarations: a type name, a list of variables separated by commas, and a semicolon
Each variable can be followed by an initializer:
float a;
boolean a,b,c;
int a=1, b, c=1+2;
Example, Continued
Easy if we postpone defining the comma-separated list of variables with initializers:
<var-dec> ::= <type-name> <declarator-list> ;
Primitive type names are easy enough too:
<type-name> ::= boolean | byte | short | int
              | long | char | float | double
(Note: skipping constructed types: class names, interface names, and array types)
Example, Continued
That leaves the comma-separated list of variables with initializers
Again, postpone defining variables with initializers, and just do the comma-separated list part:
<declarator-list> ::= <declarator>
                    | <declarator> , <declarator-list>
Example, Continued
That leaves the variables with initializers:
<declarator> ::= <variable-name>
               | <variable-name> = <expr>
For full Java, we would need to allow pairs of square brackets after the variable name
There is also a syntax for array initializers
And definitions for <variable-name> and <expr>
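The divide-and-conquer structure carries over directly to code: one small function per piece of the declaration (type name, comma-separated list, variable with optional initializer). The recognizer below is only a sketch under stated assumptions. The slides leave variable names and initializer expressions undefined, so here a variable name is assumed to be a plain identifier and an initializer is assumed to be identifiers or integer literals joined by +. All function names are hypothetical.

```python
import re

TYPE_NAMES = {"boolean", "byte", "short", "int", "long", "char", "float", "double"}
TOKEN = re.compile(r"\s*([A-Za-z_]\w*|\d+|[=,;+])")


def tokenize(text):
    """Split text into identifiers, integers, and the symbols = , ; +"""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def is_var_dec(text):
    """True if text is: type name, declarator list, semicolon."""
    try:
        toks = tokenize(text)
    except ValueError:
        return False
    pos = 0

    def peek():
        return toks[pos] if pos < len(toks) else None

    def expect(pred):
        nonlocal pos
        tok = peek()
        if tok is None or not pred(tok):
            raise SyntaxError(f"unexpected token {tok!r}")
        pos += 1

    def is_name(t):  # assumed: a variable name is an identifier
        return re.fullmatch(r"[A-Za-z_]\w*", t) is not None and t not in TYPE_NAMES

    def is_operand(t):
        return is_name(t) or t.isdigit()

    def expr():  # assumed: operands joined by +
        expect(is_operand)
        while peek() == "+":
            expect(lambda t: t == "+")
            expect(is_operand)

    def declarator():  # variable name, optionally followed by = and an initializer
        expect(is_name)
        if peek() == "=":
            expect(lambda t: t == "=")
            expr()

    def declarator_list():  # one declarator, or a declarator, a comma, and a list
        declarator()
        if peek() == ",":
            expect(lambda t: t == ",")
            declarator_list()

    try:
        expect(lambda t: t in TYPE_NAMES)
        declarator_list()
        expect(lambda t: t == ";")
    except SyntaxError:
        return False
    return pos == len(toks)


for line in ["float a;", "boolean a,b,c;", "int a=1, b, c=1+2;", "int a,;"]:
    print(line, is_var_dec(line))
```

Each nested function mirrors one production, so postponing a definition in the grammar corresponds to calling a function that is written later.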
Outline
Grammar and parse tree examples
BNF and parse tree definitions
Constructing grammars
Phrase structure and lexical structure
Other grammar forms
Where Do Tokens Come From?
Tokens are pieces of program text that we do not choose to think of as being built from smaller pieces
Identifiers (count), keywords (if), operators (==), constants (123.4), etc.
Programs stored in files are just sequences of characters
How is such a file divided into a sequence of tokens?
Lexical Structure And Phrase Structure
Grammars so far have defined phrase structure: how a program is built from a sequence of tokens
We also need to define lexical structure: how a text file is divided into tokens
One Grammar For Both
You could do it all with one grammar by using characters as the only tokens
Not done in practice: things like white space and comments would make the grammar too messy to be readable
<if-stmt> ::= if <white-space> <expr> <white-space> then <white-space> <stmt> <white-space> <else-part>
<else-part> ::= else <white-space> <stmt> | <empty>
Separate Compiler Passes
The scanner reads the input file and divides it into tokens according to the first grammar (the one for lexical structure)
The scanner discards white space and comments
The parser constructs a parse tree (or at least goes through the motions; more about this later) from the token stream according to the second grammar (the one for phrase structure)
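A scanner pass of this kind can be sketched with regular expressions. The token categories, patterns, keyword set and comment syntax below are illustrative assumptions for a tiny made-up language, not the lexical grammar of any real one.

```python
import re

# Hypothetical lexical rules. Earlier entries win when several could match.
TOKEN_SPEC = [
    ("COMMENT", r"//[^\n]*"),
    ("WHITESPACE", r"\s+"),
    ("NUMBER", r"\d+(?:\.\d+)?"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OPERATOR", r"==|[=+*();]"),
]
KEYWORDS = {"if", "then", "else"}
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def scan(text):
    """Divide text into (kind, lexeme) tokens, discarding white space and comments."""
    tokens, pos = [], 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise ValueError(f"illegal character {text[pos]!r} at position {pos}")
        kind, lexeme = match.lastgroup, match.group()
        if kind == "IDENT" and lexeme in KEYWORDS:
            kind = "KEYWORD"  # keywords look like identifiers to the pattern
        if kind not in ("COMMENT", "WHITESPACE"):
            tokens.append((kind, lexeme))
        pos = match.end()
    return tokens


print(scan("if count == 123.4 then x = 1 // set x"))
```

The parser then works on the returned token list and never sees white space or comments, which keeps the phrase-structure grammar readable.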
Historical Note #1
Early languages sometimes did not separate
lexical structure from phrase structure
Early Fortran and Algol dialects allowed spaces anywhere, even in the middle of a keyword
Other languages like PL/I allow keywords to be
used as identifiers
This makes them harder to scan and parse
It also reduces readability
Historical Note #2
Some languages have a fixed-format lexical structure: column positions are significant
  One statement per line (i.e. per card)
  First few columns for statement label
  Etc.
Early dialects of Fortran, Cobol, and Basic
Most modern languages are free-format: column positions are ignored
Outline
Grammar and parse tree examples
BNF and parse tree definitions
Constructing grammars
Phrase structure and lexical structure
Other grammar forms
Other Grammar Forms
BNF variations
EBNF variations
Syntax diagrams
BNF Variations
Some use → or = instead of ::=
Some leave out the angle brackets and use a distinct typeface for tokens
Some allow single quotes around tokens, for example to distinguish '|' as a token from | as a meta-symbol
EBNF Variations
Additional syntax to simplify some grammar chores:
  {x} to mean zero or more repetitions of x
  [x] to mean x is optional (i.e. x | <empty>)
  () for grouping
  | anywhere to mean a choice among alternatives
  Quotes around tokens, if necessary, to distinguish from all these meta-symbols
EBNF Examples
<stmt-list> ::= {<stmt> ;}
<if-stmt> ::= if <expr> then <stmt> [else <stmt>]
<thing-list> ::= { (<stmt> | <declaration>) ;}
<mystery1> ::= a[1]
<mystery2> ::= 'a[1]'
Anything that extends BNF this way is called an Extended BNF: EBNF
There are many variations
Syntax Diagrams
Syntax diagrams (railroad diagrams)
Start with an EBNF grammar
A simple production is just a chain of boxes (for nonterminals) and ovals (for terminals):
<if-stmt> ::= if <expr> then <stmt> else <stmt>
[Figure: syntax diagram for if-stmt, chaining the oval "if", box "expr", oval "then", box "stmt", oval "else", box "stmt"]
Bypasses
Square-bracket pieces from the EBNF get paths that bypass them
<if-stmt> ::= if <expr> then <stmt> [else <stmt>]
[Figure: syntax diagram for if-stmt in which a path bypasses the "else stmt" portion]
Branching
Use branching for multiple productions
<exp> ::= <exp> + <exp> | <exp> * <exp> | ( <exp> )
        | a | b | c
[Figure: syntax diagram for exp with six parallel branches: exp + exp, exp * exp, ( exp ), a, b, c]
Loops
Use loops for EBNF curly brackets
<exp> ::= <addend> {+ <addend>}
[Figure: syntax diagram for exp: an "addend" box with a path that loops back through a "+" oval]
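The loop in the diagram corresponds naturally to a loop in a hand-written parser: one addend, then zero or more repetitions of + followed by another addend. The sketch below assumes, purely for illustration, that an addend is one of the tokens a, b or c (the slide does not define it). The function name parse_exp is hypothetical.

```python
def parse_exp(tokens):
    """Parse one addend followed by zero or more '+ addend'; return the addends."""
    pos = 0

    def addend():  # assumed for this sketch: an addend is a, b or c
        nonlocal pos
        if pos < len(tokens) and tokens[pos] in ("a", "b", "c"):
            pos += 1
            return tokens[pos - 1]
        raise SyntaxError(f"expected an addend at position {pos}")

    addends = [addend()]
    while pos < len(tokens) and tokens[pos] == "+":  # the EBNF {...} repetition
        pos += 1
        addends.append(addend())
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return addends


print(parse_exp(list("a+b+c")))
```

Compared with a recursive BNF rule, the EBNF form gives a flat list of addends rather than a nested tree, which is one reason the exact shape of the parse tree is less obvious from EBNF or a syntax diagram.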
Syntax Diagrams, Pro and Con
Easier for people to read casually
Harder to read precisely: what will the parse tree look like?
Harder to make machine readable (for automatic parser-generators)
Formal Context-Free Grammars
In the study of formal languages and automata, grammars are expressed in yet another notation:
S → aSb | X
X → cX | ε
These are called context-free grammars
Other kinds of grammars are also studied: regular grammars (weaker), context-sensitive grammars (stronger), etc.
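Reading the two rules as S deriving aSb or X, and X deriving cX or the empty string, the language consists of some number of a's, then any number of c's, then the same number of b's. A direct, unoptimized transcription of the rules into two mutually recursive functions might look like this (the function names are assumptions for this example):

```python
def derives_S(s):
    """S -> aSb | X"""
    if len(s) >= 2 and s[0] == "a" and s[-1] == "b" and derives_S(s[1:-1]):
        return True
    return derives_X(s)


def derives_X(s):
    """X -> cX | (empty string)"""
    if s == "":
        return True
    return s[0] == "c" and derives_X(s[1:])


for text in ["", "ccc", "ab", "aacbb", "aab", "abc"]:
    print(repr(text), derives_S(text))
```

Matching the count of a's against the count of b's is what a regular grammar cannot express, which is why regular grammars are described as weaker.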
Many Other Variations
BNF and EBNF ideas are widely used
Exact notation differs, in spite of occasional
efforts to get uniformity
But as long as you understand the ideas,
differences in notation are easy to pick up
Example
WhileStatement:
    while ( Expression ) Statement
DoStatement:
    do Statement while ( Expression ) ;
BasicForStatement:
    for ( ForInit_opt ; Expression_opt ; ForUpdate_opt ) Statement
[from The Java Language Specification, Third Edition, James Gosling et al.]
Conclusion
We use grammars to define programming
language syntax, both lexical structure and
phrase structure
Connection between theory and practice
Two grammars, two compiler passes
Parser-generators can write code for those two
passes automatically from grammars
Conclusion, Continued
Multiple audiences for a grammar
  Novices want to find out what legal programs look like
  Experts (advanced users and language system implementers) want an exact, detailed definition
  Tools (parser and scanner generators) want an exact, detailed definition in a particular, machine-readable form