
Top Down Parser

Date post:23-Jan-2016
  • 1 Top-Down Parsing

    The parse tree is created from top to bottom.

    Top-down parsers come in two kinds:

    Recursive-Descent Parsing

    Backtracking is needed (if a choice of a production rule does not work, we backtrack and try the other alternatives).

    It is a general parsing technique, but not widely used.

    Not efficient.

    Predictive Parsing

    No backtracking.

    Needs a special form of grammar (LL(1) grammars).

    The Non-Recursive (Table-Driven) Predictive Parser is also known as the LL(1) parser.

  • 2 Top-Down Parsing

    Begin with the start symbol at the root of the parse tree

    Build the parse tree from the top down

  • 3 Top-Down Parsing

    S → aSbS | bSaS | ε

    (Figure: parse tree with root S, expanded via S → aSbS and then S → bSaS, with ε leaves.)


  • 4 Parsing Decisions

    Which nonterminal in the parse tree should be expanded?

    Which of its grammar rules should be used to expand it?

  • 5 Nondeterministic Parser

    Expand any nonterminal.

    Expand it using a grammar rule that occurs in the derivation of the input string.

  • 6 Backtracking Parser

    Expand the leftmost nonterminal in the parse tree.

    Try a grammar rule for the nonterminal. If it does not work out, try another one.

  • 7 Backtracking Parser

    S → aSa | bSb | a | b

    Input: b b b

    (Figure: parse tree with root S expanded as a S a.)

  • 8 Backtracking Parser

    S → aSa | bSb | a | b

    Input: b b b

    (Figure: parse tree with root S expanded as b S b; the inner S is expanded as a S a.)

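  • 9 Backtracking Parser

    S → aSa | bSb | a | b

    Input: b b b

    (Figure: parse tree with root S expanded as b S b; the inner S is expanded as b S b; the innermost S is expanded as a S a.)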

  • 10

    Backtracking Parser

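    S → aSa | bSb | a | b

    Input: b b b

    (Figure: parse tree with root S expanded as b S b; the inner S is expanded as b S b; the innermost S is expanded as b S b.)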

  • 11

    Backtracking Parser

    S → aSa | bSb | a | b

    Input: b b b

    (Figure: parse tree with root S expanded as b S b; the inner S is expanded as b S b.)


  • 12

    Backtracking Parser

    S → aSa | bSb | a | b

    Input: b b b

    (Figure: parse tree with root S expanded as b S b; the inner S is expanded as b S b.)


  • 13

    Backtracking Parser

    S → aSa | bSb | a | b

    Input: b b b

    (Figure: parse tree with root S expanded as b S b.)
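
    The backtracking strategy traced on the slides above can be sketched in a few lines. This is only an illustration, not the slides' own code: the names GRAMMAR, derive and accepts are made up here, and the sketch assumes a grammar without left recursion (otherwise the search would not terminate). It always expands the leftmost symbol and tries the alternatives in order, falling back to the next one when a choice fails.

```python
# Hypothetical encoding of the slide grammar S -> aSa | bSb | a | b.
GRAMMAR = {"S": [["a", "S", "a"], ["b", "S", "b"], ["a"], ["b"]]}


def derive(symbols, text, pos):
    """Yield every input position reachable after matching `symbols` from `pos`."""
    if not symbols:
        yield pos
        return
    head, rest = symbols[0], symbols[1:]
    if head in GRAMMAR:
        # Leftmost nonterminal: try each production in turn (backtracking).
        for production in GRAMMAR[head]:
            for mid in derive(production, text, pos):
                yield from derive(rest, text, mid)
    elif pos < len(text) and text[pos] == head:
        # Terminal: it must match the current input character.
        yield from derive(rest, text, pos + 1)


def accepts(text):
    """True if the whole of `text` can be derived from S."""
    return any(end == len(text) for end in derive(["S"], text, 0))
```

    For the input b b b, the search first fails on S → aSa, then explores S → bSb with several choices for the inner S before it finds the one that consumes the whole input.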


  • 14

    Recursive Descent Parsing

    Basic idea:

    Write a routine to recognize each left-hand side (nonterminal).

    This produces a parser with mutually recursive routines.

    Good for hand-coded parsers.

    Ex: A → aBb (This is only the production rule for A)

    proc A {

    - match the current token with a, and move to the next token;

    - call B;

    - match the current token with b, and move to the next token;

    }

  • 15

    Recursive Descent Parsing (cont.)

    When to apply ε-productions.

    A → aA | bB | ε

    If all other productions fail, we should apply an ε-production. For example, if the current token is not a or b, we may apply the ε-production.

  • 16

    Recursive Descent Parsing (cont.)

    A → aBb | bAB

    proc A {

    case of the current token {

    a: - match the current token with a, and move to the next token;

    - call B;

    - match the current token with b, and move to the next token;

    b: - match the current token with b, and move to the next token;

    - call A;

    - call B;

    }

    }


  • 17

    Recursive Descent Parser for a Simple Declaration Statement

    decl_stmt → type idlist ;

    type → int | float

    idlist → id | id , idlist

    proc declstmt() {

    - call type();

    - call idlist();

    - match the current token with ;, and move to the next token;

    }

    proc type() {

    case of the current token {

    int: - match the current token with int, and move to the next token;

    float: - match the current token with float, and move to the next token;

    }

    }

    Write the code for the nonterminal idlist
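
    One possible answer, sketched in Python rather than the slide pseudocode. Everything named here (ParseError, DeclParser, parse_decl) is hypothetical, and the sketch assumes the input has already been split into tokens, with any identifier-like string other than int or float counting as id. Because both idlist alternatives start with id, the routine matches id first and then looks at the next token to decide whether a comma and another idlist follow.

```python
class ParseError(Exception):
    """Raised when the token stream does not match the grammar."""


class DeclParser:
    KEYWORDS = ("int", "float")

    def __init__(self, tokens):
        self.tokens = list(tokens) + ["$"]  # "$" marks the end of the input
        self.pos = 0

    @property
    def lookahead(self):
        return self.tokens[self.pos]

    def match(self, expected):
        """Check the current token and move to the next one."""
        if self.lookahead != expected:
            raise ParseError(f"expected {expected!r}, found {self.lookahead!r}")
        self.pos += 1

    def decl_stmt(self):  # decl_stmt -> type idlist ;
        kind = self.type_()
        names = self.idlist()
        self.match(";")
        return kind, names

    def type_(self):  # type -> int | float
        kind = self.lookahead
        if kind not in self.KEYWORDS:
            raise ParseError(f"expected a type, found {kind!r}")
        self.match(kind)
        return kind

    def idlist(self):  # idlist -> id | id , idlist
        name = self.lookahead
        if not name.isidentifier() or name in self.KEYWORDS:
            raise ParseError(f"expected an identifier, found {name!r}")
        self.match(name)
        if self.lookahead == ",":  # one token of lookahead picks the alternative
            self.match(",")
            return [name] + self.idlist()
        return [name]


def parse_decl(tokens):
    """Parse a full declaration and require that no input is left over."""
    parser = DeclParser(tokens)
    result = parser.decl_stmt()
    parser.match("$")
    return result
```

    For example, parse_decl(["int", "x", ",", "y", ";"]) returns the type together with the list of declared names.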

  • 18

    A → aB | b will correspond to

    A() {

    if (lookahead == 'a') {

    match('a');

    B();

    }

    else if (lookahead == 'b')

    match('b');

    else error();

    }

  • 19

    Recursive descent parser for expression







    parse() {
        token = get_next_token();
        if (E() and token == '$') then return true
        else return false
    }

    E() {
        if (T()) then return Eprime()
        else return false
    }

    Eprime() {
        if (token == '+') then {
            token = get_next_token()
            if (T()) then return Eprime()
            else return false
        }
        else if (token == ')' or token == '$') then return true
        else return false
    }

    The remaining procedures are similar.
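
    A runnable sketch of the same recognizer, including the procedures the slide leaves out. The grammar is not visible on this slide, so the sketch assumes the usual expression grammar E → T E', E' → + T E' | ε, T → F T', T' → * F T' | ε, F → ( E ) | id; the class name ExprRecognizer and the token spelling "id" are also assumptions.

```python
class ExprRecognizer:
    def __init__(self, tokens):
        self.tokens = list(tokens) + ["$"]  # "$" marks the end of the input
        self.pos = 0
        self.token = self.tokens[0]

    def advance(self):
        """Plays the role of token = get_next_token() in the pseudocode."""
        self.pos += 1
        self.token = self.tokens[self.pos]

    def parse(self):
        return self.E() and self.token == "$"

    def E(self):  # E -> T E'
        return self.T() and self.Eprime()

    def Eprime(self):  # E' -> + T E' | epsilon
        if self.token == "+":
            self.advance()
            return self.T() and self.Eprime()
        # Epsilon alternative: allowed only if the token can follow E'.
        return self.token in (")", "$")

    def T(self):  # T -> F T'
        return self.F() and self.Tprime()

    def Tprime(self):  # T' -> * F T' | epsilon
        if self.token == "*":
            self.advance()
            return self.F() and self.Tprime()
        # Epsilon alternative: allowed only if the token can follow T'.
        return self.token in ("+", ")", "$")

    def F(self):  # F -> ( E ) | id
        if self.token == "(":
            self.advance()
            if self.E() and self.token == ")":
                self.advance()
                return True
            return False
        if self.token == "id":
            self.advance()
            return True
        return False
```

    For example, ExprRecognizer(["id", "+", "id", "*", "id"]).parse() accepts, while a dangling operator such as ["id", "+"] is rejected.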

  • 20

    When Top-Down Parsing Doesn't Work Well

    Consider the productions S → S a | a:

    In the process of parsing S we try the above rules.

    Applied consistently in this order, we get an infinite loop.

    Could re-order productions, but search will have

    lots of backtracking and general rule for ordering is


    The problem here is that the grammar is left-recursive: some nonterminal S has a derivation S ⇒+ S α for some α.

  • 21

    Left Recursion

    E → E + T | T

    T → T * F | F

    F → n | ( E )

    (Figure: parse tree in which the leftmost E is repeatedly expanded with E → E + T.)

  • 22

    Elimination of Left recursion

    Consider the left-recursive grammar

    S → S α | β

    S generates all strings starting with a β and followed by any number of α's.

    Can rewrite using right-recursion

    S → β S'

    S' → α S' | ε

  • 23

    Elimination of left Recursion. Example

    Consider the grammar

    S → 1 | S 0 (β = 1 and α = 0)

    can be rewritten as

    S → 1 S'

    S' → 0 S' | ε

  • 24

    More Elimination of Left Recursion

    In general

    S → S α1 | ... | S αn | β1 | ... | βm

    All strings derived from S start with one of β1, ..., βm and continue with several instances of α1, ..., αn

    Rewrite as

    S → β1 S' | ... | βm S'

    S' → α1 S' | ... | αn S' | ε
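
    The rewrite above is mechanical, so it can be sketched as a small function. This is an illustrative sketch, not a full algorithm: the function name and the list-of-symbols encoding are assumptions, an empty list stands for ε, and only immediate left recursion is handled (each left-recursive alternative is assumed to have something after the leading nonterminal).

```python
def eliminate_immediate_left_recursion(nonterminal, alternatives):
    """Rewrite S -> S x1 | ... | S xn | y1 | ... | ym without left recursion.

    `alternatives` is a list of right-hand sides, each a list of symbols.
    Returns a dict mapping each nonterminal to its new alternatives.
    """
    # Tails x_i of the left-recursive alternatives S -> S x_i.
    recursive = [alt[1:] for alt in alternatives if alt and alt[0] == nonterminal]
    # The remaining alternatives y_j, which do not start with S.
    others = [alt for alt in alternatives if not alt or alt[0] != nonterminal]
    if not recursive:
        return {nonterminal: alternatives}
    fresh = nonterminal + "'"  # assumed not to clash with an existing name
    return {
        nonterminal: [other + [fresh] for other in others],
        fresh: [tail + [fresh] for tail in recursive] + [[]],  # [] is epsilon
    }
```

    Applied to S → S 0 | 1, it produces S → 1 S' and S' → 0 S' | ε, matching the example on the earlier slide.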

  • 25

    General Left Recursion

    The grammar

    S → A α | δ (1)

    A → S β (2)

    is also left-recursive because

    S ⇒+ S β α

    This left recursion can also be eliminated by first substituting (2) into (1)

    There is a general algorithm (see e.g. Aho, Sethi, Ullman)


  • 26

    Predictive Parsing

    Wouldn't it be nice if

    the recursive-descent parser just knew which production to expand next?

    switch ( something ) {

    case L1: return E1();

    case L2: return E2();

    otherwise: print "syntax error";

    }

    What's "something", L1, L2?

    The parser will do lookahead (look at the next token).

  • 27

    Predictive Parsing (cont.)

    A modification of recursive-descent top-down parsing in which the parser predicts which production to use by looking at the next few tokens

    No backtracking

    Predictive parsers accept LL(k) grammars

    The first L means left-to-right scan of the input

    The second L means leftmost derivation

    k means predict based on k tokens of lookahead

    In practice, LL(1) is used

  • 28

    LL(1) Languages

    For each non-terminal and input token there is at most one production that could lead to success.

    LL(k) means that for each non-terminal and k tokens of lookahead, there is only one production that could lead to success.

  • 29

    But First: Left Factoring

    Consider the grammar

    E → T + E | T

    T → int | int * T | ( E )

    Impossible to predict because

    For T, two productions start with int

    For E, it is not clear how to predict (both productions start with T)

    A grammar must be left-factored before it is used for predictive parsing

  • 30

    Left-Factoring Example

    Starting with the grammar

    E → T + E | T

    T → int | int * T | ( E )

    Factor out common prefixes of productions

    E → T X

    X → + E | ε

    T → ( E ) | int Y

    Y → * T | ε

  • 31

    Left-Factoring (cont.)

    In general,

    A → αβ1 | αβ2 where α is non-empty and the first symbols of β1 and β2 (if they have one) are different.

    When processing α we cannot know whether to expand

    A to αβ1 or

    A to αβ2

    But if we rewrite the grammar as follows

    A → α A'

    A' → β1 | β2

    we can immediately expand A to α A'

  • 32

    Left-Factoring -- Algorithm

    For each non-terminal A with two or more alternatives (production rules) with a common non-empty prefix, say

    A → αβ1 | ... | αβn | γ1 | ... | γm

    convert it into

    A → α A' | γ1 | ... | γm

    A' → β1 | ... | βn

  • 33

    Left-Factoring Example 1

    A → abB | aB | cdg | cdeB | cdfB

    A → aA' | cdA''

    A' → bB | B

    A'' → g | eB | fB
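
    The left-factoring procedure can be sketched as follows. This is a rough sketch under assumptions, not a reference implementation: the function names and the dict-of-lists grammar encoding are invented for illustration, an empty list stands for ε, and fresh nonterminals are named by appending primes. Alternatives are grouped by their first symbol, and each group of two or more is replaced by its longest common prefix followed by a new nonterminal.

```python
def common_prefix(sequences):
    """Longest list of symbols that every sequence starts with."""
    prefix = []
    for column in zip(*sequences):
        if len(set(column)) != 1:
            break
        prefix.append(column[0])
    return prefix


def left_factor(grammar):
    """grammar maps a nonterminal to its alternatives (lists of symbols)."""
    result = {}
    pending = list(grammar.items())
    while pending:
        head, alternatives = pending.pop(0)
        groups = {}  # first symbol -> alternatives starting with it
        for alt in alternatives:
            groups.setdefault(alt[0] if alt else None, []).append(alt)
        new_alternatives = []
        for first, group in groups.items():
            if first is None or len(group) == 1:
                new_alternatives.extend(group)  # nothing to factor
                continue
            prefix = common_prefix(group)
            fresh = head + "'"
            while (fresh in grammar or fresh in result
                   or any(fresh == name for name, _ in pending)):
                fresh += "'"  # keep adding primes until the name is unused
            new_alternatives.append(prefix + [fresh])
            # The new nonterminal is factored again later if needed.
            pending.append((fresh, [alt[len(prefix):] for alt in group]))
        result[head] = new_alternatives
    return result
```

    Run on the grammar of the example above, with each terminal as one symbol, it yields the same three rules for A, A' and A''.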

  • 34

    Predictive Parser (example)

    stmt → if ...... |

    while ...... |

    begin ...... |

    for .....

    When we are trying to expand the non-terminal stmt, if the current token is if, we have to choose the first production rule.

    When we are trying to expand the non-terminal stmt, we can uniquely choose the production rule by just looking at the current token.

    Even after we eliminate the left recursion in the grammar and left factor it, it may still not be suitable for predictive parsing (it may not be an LL(1) grammar).

  • 35

    Non-Recursive Predictive Parsing -- LL(1) Parser

    Non-recursive predictive parsing is table-driven.

    It is a top-down parser.

    It is also known as LL(1) Parser.

    (Figure: the non-recursive predictive parser reads the input buffer, uses a stack and a parsing table, and produces the output.)

  • 36

    LL(1) Parser

    input buffer

    our string to be parsed. We will assume that its end is marked with a special symbol $.

    output

    a production rule representing a step of the derivation sequence (left-most derivation) of the string in the input buffer.

  • 37


    stack

    contains the grammar symbols

    at the bottom of the stack, there is a special end marker symbol $.

    initially the stack contains only the symbol $ and the starting symbol S ($S is the initial stack).
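
    How the input buffer, stack, parsing table and output fit together can be sketched as a short driver loop. This is a minimal sketch under stated assumptions: the function name ll1_parse is invented, and the parsing table is written by hand for a small illustrative grammar S → a S b | ε that does not appear in the slides. A missing table entry means a syntax error, and an empty list stands for ε.

```python
def ll1_parse(tokens, table, start, nonterminals):
    """Return the productions used, in leftmost-derivation order."""
    buffer = list(tokens) + ["$"]  # input buffer, end marked with $
    stack = ["$", start]           # initial stack $S; the top is the list's end
    output = []
    i = 0
    while True:
        top, current = stack[-1], buffer[i]
        if top == "$" and current == "$":
            return output          # stack and input both exhausted: accept
        if top in nonterminals:
            body = table.get((top, current))
            if body is None:
                raise ValueError(f"no rule for {top} on {current!r}")
            stack.pop()
            stack.extend(reversed(body))  # leftmost symbol ends up on top
            output.append((top, body))
        elif top == current:
            stack.pop()            # terminal on the stack matches the input
            i += 1
        else:
            raise ValueError(f"expected {top!r}, found {current!r}")


# Hand-built table for the assumed grammar S -> a S b | epsilon.
TABLE = {
    ("S", "a"): ["a", "S", "b"],
    ("S", "b"): [],
    ("S", "$"): [],
}
```

    Calling ll1_parse("aabb", TABLE, "S", {"S"}) applies S → a S b twice and then S → ε, which is the leftmost derivation of the input.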
