Programming Languages, Third Edition
Chapter 6 Part II
Syntax / Grammars and Parsing
Objectives
• Understand context-free grammars and BNFs
• Become familiar with parse trees
• Understand ambiguity, associativity, and precedence
• Read Sections 6.2 – 6.4, pp. 204-220
Parsing
Context-Free Grammars and BNFs
• Context-free grammar: consists of
– A series of grammar rules (productions)
• Each rule has a single phrase structure name on the left, then a ::= metasymbol, followed by a sequence of symbols or other phrase structure names on the right
– Nonterminals: names for phrase structures, so called because they are broken down into further phrase structures
– Start symbol: one of the nonterminals
– Terminals: words or token symbols that cannot be broken down further
Example 1: Unsigned Integers
<num> ::= <digit> | <num> <digit>
<digit> ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
Terminals: 0, 1, … , 9
Nonterminals: <num> , <digit>
Start Symbol: <num>
Productions: there are 12
Metasymbols: “::=” , “|”
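The parts of the grammar listed above can be written down as plain data. The following is a minimal sketch, not a standard representation: the names GRAMMAR, START, and production_count are chosen here for illustration only. It counts each alternative separated by “|” as one production, which is how the total of 12 arises.

```python
# Hypothetical encoding of the Example 1 grammar: each nonterminal maps to a
# list of alternatives, and each alternative is a list of symbols.
GRAMMAR = {
    "<num>": [["<digit>"], ["<num>", "<digit>"]],
    "<digit>": [[d] for d in "0123456789"],
}
START = "<num>"

# Nonterminals are the names that appear on a left-hand side.
nonterminals = set(GRAMMAR)

# Terminals are the symbols on right-hand sides that are not nonterminals.
terminals = {
    sym for alts in GRAMMAR.values() for alt in alts for sym in alt
} - nonterminals

# One production per alternative: 2 for <num> plus 10 for <digit>.
production_count = sum(len(alts) for alts in GRAMMAR.values())

print(sorted(nonterminals))  # ['<digit>', '<num>']
print(sorted(terminals))     # ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9']
print(production_count)      # 12
```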
Example 1 (cont’d)
• Derivation: the process of building a string in a language by beginning with the start symbol and replacing left-hand sides by choices of right-hand sides in the rules
• Let’s derive the number 123 (on board)
• Parse tree: graphical depiction of the replacement process in a derivation
• Let’s draw parse tree for 123 (on board)
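For readers without the board work, one possible derivation of 123 can be generated mechanically. This is a sketch under the assumption that we always replace the leftmost nonterminal (a leftmost derivation); the function name derive_num is hypothetical, and other replacement orders give different but equally valid derivations.

```python
def derive_num(digits):
    """Return the sentential forms of a leftmost derivation of `digits`
    using <num> ::= <digit> | <num> <digit>."""
    if not digits or not digits.isdigit():
        raise ValueError("expected a non-empty string of digits")
    current = ["<num>"]
    forms = [current]
    # Apply <num> => <num> <digit> once for every digit after the first.
    for _ in range(len(digits) - 1):
        current = ["<num>", "<digit>"] + current[1:]
        forms.append(current)
    # Apply <num> => <digit> to end the recursion.
    current = ["<digit>"] + current[1:]
    forms.append(current)
    # Replace each <digit>, left to right, by the matching terminal.
    for i, d in enumerate(digits):
        current = current[:i] + [d] + current[i + 1:]
        forms.append(current)
    return [" ".join(form) for form in forms]


for step in derive_num("123"):
    print(step)
# <num>
# <num> <digit>
# <num> <digit> <digit>
# <digit> <digit> <digit>
# 1 <digit> <digit>
# 1 2 <digit>
# 1 2 3
```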
Example 1 (cont’d)
<num> ::= <digit> | <num> <digit>
<digit> ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
• Notice the recursion in the second <num> rule
• Notice the recursive symbol is leftmost on the right-hand side
• This is a left-recursive grammar
• This is a left-associative grammar
• Notice how the parse tree cascades to the left
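The leftward cascade can be seen by building the tree as nested tuples. This is an illustrative sketch only: the tuple encoding and the names left_tree and show are assumptions made here, and the input is assumed to be a non-empty string of digits.

```python
def left_tree(digits):
    """Parse tree for `digits` under <num> ::= <digit> | <num> <digit>.
    A node is a tuple: (label, child, ...); leaves are digit characters."""
    tree = ("num", ("digit", digits[0]))
    for d in digits[1:]:
        # Each further digit wraps the tree built so far as the LEFT child.
        tree = ("num", tree, ("digit", d))
    return tree


def show(tree, depth=0):
    """Return the tree as a list of indented lines, one node per line."""
    label, *children = tree
    lines = ["  " * depth + label]
    for child in children:
        if isinstance(child, tuple):
            lines.extend(show(child, depth + 1))
        else:
            lines.append("  " * (depth + 1) + child)
    return lines


print(left_tree("123"))
# ('num', ('num', ('num', ('digit', '1')), ('digit', '2')), ('digit', '3'))
print("\n".join(show(left_tree("123"))))
```

The deepest <num> holds the first digit, so the digits are grouped as ((1 2) 3).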
Example 2: Unsigned Integers
<num> ::= <digit> | <digit> <num>
<digit> ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
Only one change was made, so now the grammar is
• Right-recursive
• Right-associative
• Let’s draw the parse tree for 123
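A sketch of the parse tree for 123 under this right-recursive grammar, again using a nested-tuple encoding that is an assumption of this example (right_tree is a hypothetical name, and the input is assumed to be a non-empty string of digits):

```python
def right_tree(digits):
    """Parse tree for `digits` under <num> ::= <digit> | <digit> <num>.
    A node is a tuple: (label, child, ...); leaves are digit characters."""
    head = ("digit", digits[0])
    if len(digits) == 1:
        return ("num", head)
    # The rest of the number hangs off the RIGHT child.
    return ("num", head, right_tree(digits[1:]))


print(right_tree("123"))
# ('num', ('digit', '1'), ('num', ('digit', '2'), ('num', ('digit', '3'))))
```

Here the deepest <num> holds the last digit, so the digits are grouped as (1 (2 3)) and the tree cascades to the right.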
Ex 3: Simple Expression Grammar
<expr> ::= <expr> + <expr> | <expr> * <expr> | ( <expr> ) | <num>
<num> ::= <digit> | <num> <digit>
<digit> ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
Let’s derive parse tree for: 3 + 4 + 5
Ex 3 (cont’d)
<expr> ::= <expr> + <expr> | <expr> * <expr> | ( <expr> ) | <num>
<num> ::= <digit> | <num> <digit>
<digit> ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
Is there another parse tree for: 3 + 4 + 5?
Ex 3 (cont’d)
<expr> ::= <expr> + <expr> | <expr> * <expr> | ( <expr> ) | <num>
<num> ::= <digit> | <num> <digit>
<digit> ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
A grammar is ambiguous if some string has two or more distinct parse trees
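Ambiguity can be checked for a particular string by counting its parse trees. The sketch below does this for the <expr> rules, under two simplifying assumptions: the input is already split into tokens, and parenthesized expressions are left out for brevity. The name count_trees is hypothetical.

```python
from functools import lru_cache


def count_trees(tokens):
    """Count the parse trees of `tokens` under
    <expr> ::= <expr> + <expr> | <expr> * <expr> | <num>
    (the parenthesis rule is omitted to keep the sketch short)."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(lo, hi):
        # Number of ways tokens[lo:hi] can be derived from <expr>.
        if hi - lo == 1:
            # A single token must be a <num>, which has exactly one tree.
            return 1 if tokens[lo].isdigit() else 0
        total = 0
        # Try every operator position as the root of the tree.
        for k in range(lo + 1, hi - 1):
            if tokens[k] in ("+", "*"):
                total += count(lo, k) * count(k + 1, hi)
        return total

    return count(0, len(tokens))


print(count_trees("3 + 4 + 5".split()))      # 2
print(count_trees("1 + 2 + 3 + 4".split()))  # 5
```

A count above 1 for any string is enough to show that the grammar is ambiguous.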
Ex 3 (cont’d)
<expr> ::= <expr> + <expr> | <expr> * <expr> | ( <expr> ) | <num>
<num> ::= <digit> | <num> <digit>
<digit> ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
Ambiguity is undesirable
Let’s see why it’s undesirable: Derive parse trees for 3 + 4 * 5
Ex 3 (cont’d)
<expr> ::= <expr> + <expr> | <expr> * <expr> | ( <expr> ) | <num>
<num> ::= <digit> | <num> <digit>
<digit> ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
So what was the problem?
Which tree provides correct arithmetic interpretation?
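The problem shows up as soon as the two trees for 3 + 4 * 5 are evaluated. In this sketch the trees are simplified to (operator, left, right) tuples with integers at the leaves; that encoding and the name evaluate are assumptions made for illustration.

```python
def evaluate(tree):
    """Evaluate a tree of the form (op, left, right) with integer leaves."""
    if isinstance(tree, int):
        return tree
    op, left, right = tree
    lhs, rhs = evaluate(left), evaluate(right)
    return lhs + rhs if op == "+" else lhs * rhs


plus_at_root = ("+", 3, ("*", 4, 5))   # groups as 3 + (4 * 5)
times_at_root = ("*", ("+", 3, 4), 5)  # groups as (3 + 4) * 5

print(evaluate(plus_at_root))   # 23
print(evaluate(times_at_root))  # 35
```

Only the tree with + at the root matches the usual rule that multiplication has higher precedence than addition, yet the ambiguous grammar allows both.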
Ex 3 (cont’d)
Can we modify the grammar to “fix” the problem? YES!
Add more levels of productions:
<expr> ::= <expr> + <term> | <term>
<term> ::= <term> * <factor> | <factor>
<factor> ::= ( <expr> ) | <num>
<num> ::= <digit> | <num> <digit>
<digit> ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
Ex 3 (cont’d)
<expr> ::= <expr> + <term> | <term>
<term> ::= <term> * <factor> | <factor>
<factor> ::= ( <expr> ) | <num>
<num> ::= <digit> | <num> <digit>
<digit> ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
Redraw parse trees for 3 + 4 + 5 and 3 + 4 * 5
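One way to check the redrawn trees is a small hand-written parser with one function per nonterminal. This is a sketch, not the only way to parse this grammar: a recursive-descent parser cannot call a left-recursive rule directly, so each left-recursive rule is replaced here by a loop that builds the same left-leaning grouping. The result is a simplified (operator, left, right) tree rather than the full parse tree, and all function names are hypothetical.

```python
import re


def tokenize(text):
    """Split text into number, '+', '*', '(' and ')' tokens."""
    tokens = re.findall(r"\d+|[+*()]", text)
    if "".join(tokens) != "".join(text.split()):
        raise SyntaxError(f"unexpected character in {text!r}")
    return tokens


def parse(text):
    tokens = tokenize(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = peek()
        pos += 1
        return tok

    def expr():
        # <expr> ::= <expr> + <term> | <term>, with the recursion as a loop
        tree = term()
        while peek() == "+":
            advance()
            tree = ("+", tree, term())
        return tree

    def term():
        # <term> ::= <term> * <factor> | <factor>, with the recursion as a loop
        tree = factor()
        while peek() == "*":
            advance()
            tree = ("*", tree, factor())
        return tree

    def factor():
        # <factor> ::= ( <expr> ) | <num>
        tok = advance()
        if tok == "(":
            tree = expr()
            if advance() != ")":
                raise SyntaxError("expected ')'")
            return tree
        if tok is not None and tok.isdigit():
            return int(tok)
        raise SyntaxError(f"unexpected token {tok!r}")

    tree = expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree


print(parse("3 + 4 + 5"))  # ('+', ('+', 3, 4), 5)
print(parse("3 + 4 * 5"))  # ('+', 3, ('*', 4, 5))
```

Each string now has a single grouping: + associates to the left, and * binds tighter than + because <term> sits one level below <expr>.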
Chapter 6
Final Thoughts
• A grammar is context-free when nonterminals appear singly on the left sides of productions
– There is no context under which only certain replacements can occur
• Anything not expressible using context-free grammars is a semantic, not a syntactic, issue
• The BNF form of a language’s syntax makes it easier to write translators
• The parsing stage can be automated (e.g., the yacc tool on Unix, or yacc-style tools for Python)
Chapter 6
Final Thoughts
• Syntax establishes structure, not meaning
– But meaning is related to syntax
• Syntax-directed semantics: the process of associating the semantics of a construct with its syntactic structure
– Must construct the syntax so that it reflects the semantics to be attached later
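As a small illustration of syntax-directed semantics, a numeric value can be attached to each rule of the left-recursive unsigned-integer grammar <num> ::= <digit> | <num> <digit>. The nested-tuple tree encoding and the name value are assumptions of this sketch.

```python
def value(tree):
    """Compute the value of a <num> or <digit> parse tree.
    A node is a tuple: (label, child, ...); leaves are digit characters."""
    label, *children = tree
    if label == "digit":
        return int(children[0])
    if len(children) == 1:
        # <num> ::= <digit>
        return value(children[0])
    # <num> ::= <num> <digit>: shift the prefix one place, add the last digit.
    prefix, last = children
    return 10 * value(prefix) + value(last)


tree_123 = ("num", ("num", ("num", ("digit", "1")), ("digit", "2")), ("digit", "3"))
print(value(tree_123))  # 123
```

The left-recursive shape makes the semantic rule a simple "10 * prefix + digit". With the right-recursive grammar, the rule for <digit> <num> would also need the number of digits in the trailing <num>, which is one concrete sense in which the syntax should be built to reflect the semantics attached later.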