
Top-down Parsing By Georgi Boychev, Rafal Kala, Ildus Mukhametov.

Date post: 16-Dec-2015

Top-down Parsing

By Georgi Boychev, Rafal Kala, Ildus Mukhametov

Outline

• Introduction

• Directional top-down parsing

• Imitating leftmost derivations

• The pushdown automaton

• Breadth-first top-down parsing

• Depth-first (backtracking) parsers

• Conclusion

Introduction

• A parser checks whether an input string (a sentence) can be generated by a given grammar and, if so, provides one or more syntactic analyses of it.

• There are two basic types:

– Top-down parsing

• Directional (goes from left to right).

• Non-directional.

– Bottom-up parsing

Introduction

Today we will discuss directional top-down parsing.

Directional Top-down Parsing

• Begin with the start symbol S.

• Apply productions until we arrive at the input string.

• We draw the prediction right under the part of the input it predicts.

Imitating Leftmost derivations

The sentential form (our prediction) consists of both terminals and non-terminals.

If a terminal symbol is in front, we match it with the current input symbol. If a non-terminal is in front, we pick one of its right-hand sides.

This way we always replace the leftmost non-terminal, and in the end, if we succeed, we have imitated a leftmost derivation.

Example

This is our grammar (the rules used in this example):

S → aB | bA

B → b | bS | aBB

The input sentence is aabb.

Example ctd.

We try to rederive the input aabb from the start symbol S. The first symbol of our prediction is a non-terminal, so we have to replace it by one of its right-hand sides.

S → aB | bA

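We apply the first option (S → aB), because its leading terminal a matches the first input symbol. Now we have to derive abb from B, and again we pick the right-hand side whose leading terminal matches the input (B → aBB).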

B → b | bS | aBB

Example ctd.

We're now left with BB for bb.

B → b | bS | aBB

Then we have to replace the leftmost B by one of its choices (B → b), and afterwards the remaining B in the same way. In the end we obtain the following derivation:

S → aB → aaBB → aabB → aabb
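
The derivation above can be replayed mechanically. The following is a minimal sketch, not taken from the slides: the helper names `apply_leftmost` and `derive` are hypothetical, and it assumes that non-terminals are written as upper-case letters and terminals as lower-case letters.

```python
def apply_leftmost(form, lhs, rhs):
    """Replace the leftmost non-terminal of `form` (which must be `lhs`) by `rhs`."""
    for i, symbol in enumerate(form):
        if symbol.isupper():  # assumption: upper-case letters are non-terminals
            if symbol != lhs:
                raise ValueError(f"leftmost non-terminal is {symbol}, not {lhs}")
            return form[:i] + rhs + form[i + 1:]
    raise ValueError("no non-terminal left to replace")


def derive(start, steps):
    """Apply the given rules one by one and collect every sentential form."""
    forms = [start]
    for lhs, rhs in steps:
        forms.append(apply_leftmost(forms[-1], lhs, rhs))
    return forms


# The rules chosen in the example, in the order they were applied.
STEPS = [("S", "aB"), ("B", "aBB"), ("B", "b"), ("B", "b")]
FORMS = derive("S", STEPS)
print(" → ".join(FORMS))  # S → aB → aaBB → aabB → aabb
```

Because every step rewrites the leftmost non-terminal, the list of sentential forms is exactly a leftmost derivation.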

The Pushdown Automaton

A stack is a first-in, last-out (FILO) list. The pushdown automaton (PDA) operates by popping a symbol off the stack (which holds symbols of the stack alphabet) and reading an input symbol.

These two symbols give us a choice of several lists of stack symbols to be pushed back on the stack.

So there is a mapping of (input symbol, stack symbol) pairs to lists of stack symbols. The automaton accepts the input sentence when the stack is empty at the end of the input.
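
As an illustration, the mapping can be written down as a table and simulated. This is a hedged sketch rather than the PDA from the slides: the name `pda_accepts` is hypothetical, each rule X → aα is assumed to yield the entry (a, X) → α, and the entries for A are an assumption mirrored from the B rules, since the transcript does not show the A rules. Non-determinism is handled by keeping the set of all stacks that are still possible.

```python
# (input symbol, stack symbol) -> alternative strings to push back.
# S and B entries follow the rules shown in the example; the A entries
# are an ASSUMPTION (mirrored from B), as the transcript omits them.
TRANSITIONS = {
    ("a", "S"): ["B"],
    ("b", "S"): ["A"],
    ("a", "A"): ["", "S"],
    ("b", "A"): ["AA"],
    ("b", "B"): ["", "S"],
    ("a", "B"): ["BB"],
}


def pda_accepts(sentence, start="S"):
    """Return True if some sequence of choices empties the stack at end of input."""
    # A stack is a string whose first character is the top of the stack.
    stacks = {start}
    for symbol in sentence:
        next_stacks = set()
        for stack in stacks:
            if not stack:
                continue  # stack already empty but input remains: dead branch
            top, rest = stack[0], stack[1:]
            for pushed in TRANSITIONS.get((symbol, top), []):
                next_stacks.add(pushed + rest)
        stacks = next_stacks
    return "" in stacks


print(pda_accepts("aabb"))  # True
print(pda_accepts("aab"))   # False
```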

Example

Grammar: Input: aabb

PDA:

Breadth-first Top-Down Parsing

There are two strategies for exploring the tree of choices: breadth-first and depth-first.

In breadth-first parsing we maintain a list of all possible predictions.

We process it in the following way:

If there is a non-terminal on top, we replace the prediction stack by several new prediction stacks, one for each right-hand side of this non-terminal.

If there is a terminal on top, we compare it with the current input symbol and eliminate all the prediction stacks that do not match.

Example

Grammar: S → AB | DC

A → a | aA

B → bc | bBc

D → ab | aDb

C → c | cC

Input: aabc
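
The two processing steps can be sketched for the grammar and input above. This is a minimal illustration, not the authors' implementation: the function names `expand` and `breadth_first_accepts` are hypothetical, predictions are kept as plain strings with the top of the stack first, and the sketch assumes a grammar without left recursion and without empty right-hand sides (true for this grammar).

```python
GRAMMAR = {
    "S": ["AB", "DC"],
    "A": ["a", "aA"],
    "B": ["bc", "bBc"],
    "D": ["ab", "aDb"],
    "C": ["c", "cC"],
}


def expand(predictions):
    """Replace leading non-terminals until every prediction starts with a terminal."""
    done = []
    todo = list(predictions)
    while todo:
        prediction = todo.pop(0)
        if prediction and prediction[0] in GRAMMAR:
            # One new prediction per right-hand side of the leading non-terminal.
            todo.extend(rhs + prediction[1:] for rhs in GRAMMAR[prediction[0]])
        else:
            done.append(prediction)
    return done


def breadth_first_accepts(sentence, start="S"):
    """Recognize `sentence` by advancing all predictions in lockstep over the input."""
    predictions = [start]
    for symbol in sentence:
        predictions = expand(predictions)
        # Keep only predictions whose leading terminal matches, and consume it.
        predictions = [p[1:] for p in predictions if p and p[0] == symbol]
    # Accept if some prediction is used up exactly when the input is.
    return "" in predictions


print(breadth_first_accepts("aabc"))  # True
print(breadth_first_accepts("aab"))   # False
```

For aabc, the predictions that start from S → DC die out after the third symbol, and only the S → AB branch survives.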


Depth-first (Backtracking) Parsers

The breadth-first method can use a lot of memory, because it stores a list of all possible predictions.

The depth-first method doesn't have this problem, because we look at only one path at a time.

First we follow one path. If it turns out to be a failure, we roll back our actions and continue with the other possibilities.

Backtracking

• Sometimes we have multiple right-hand sides and we have to choose one.

• But if we choose the wrong one, we come to a dead end.

• So, we have to go back to the point where we made the choice, and try an alternative path.

• We do this until we succeed, or run out of choices.
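
The steps above can be sketched as a recursive recognizer. This is an illustrative sketch under stated assumptions, not the parser from the slides: the names `parse` and `backtrack_parse` are hypothetical, the grammar is the one from the breadth-first example, and backtracking is implemented implicitly, by returning from a failed recursive call to the loop that made the choice.

```python
GRAMMAR = {
    "S": ["AB", "DC"],
    "A": ["a", "aA"],
    "B": ["bc", "bBc"],
    "D": ["ab", "aDb"],
    "C": ["c", "cC"],
}


def parse(prediction, sentence, pos, rules_used):
    """Match sentence[pos:] against `prediction`; return the rules used, or None."""
    if not prediction:
        # Success only if the input is used up together with the prediction.
        return rules_used if pos == len(sentence) else None
    head, rest = prediction[0], prediction[1:]
    if head in GRAMMAR:
        for rhs in GRAMMAR[head]:  # choice point
            result = parse(rhs + rest, sentence, pos, rules_used + [(head, rhs)])
            if result is not None:
                return result
            # Dead end: we are back at the choice point, try the next alternative.
        return None  # ran out of choices
    # `head` is a terminal: it must match the current input symbol.
    if pos < len(sentence) and sentence[pos] == head:
        return parse(rest, sentence, pos + 1, rules_used)
    return None


def backtrack_parse(sentence, start="S"):
    return parse(start, sentence, 0, [])


print(backtrack_parse("aabc"))
# [('S', 'AB'), ('A', 'aA'), ('A', 'a'), ('B', 'bc')]
```

On aabc the parser first tries A → a, fails when B cannot match the second a, returns to the choice for A, and succeeds with A → aA. Since `pos` is passed by value, returning from a call also moves the input position back.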

Example

Backtracking over a terminal is done by moving the vertical line, which separates the matched part of the input from the rest, backwards.

Conclusion

We always process the leftmost symbol of the prediction.

If this symbol is a terminal, we have no choice: we have to match it with the current input symbol or reject the parse.

If this symbol is a non-terminal, we have to make a prediction: it has to be replaced by one of its right-hand sides. Since we always process the leftmost non-terminal first, we get a leftmost derivation.

As a result, a top-down method recognizes the nodes of the parse tree in pre-order: the parent is identified before any of its children.
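
The link between pre-order and the leftmost derivation can be checked on the parse tree of aabb from the earlier example. In this sketch the tree encoding (a pair of symbol and list of children) and the name `preorder_rules` are assumptions made for illustration.

```python
# A node is (symbol, children); terminal leaves have an empty child list.
TREE = ("S", [("a", []),
              ("B", [("a", []),
                     ("B", [("b", [])]),
                     ("B", [("b", [])])])])


def preorder_rules(node):
    """List the rule at each non-terminal node: parent first, then children left to right."""
    symbol, children = node
    if not children:
        return []  # terminal leaf: no rule was applied here
    rules = [(symbol, "".join(child[0] for child in children))]
    for child in children:
        rules.extend(preorder_rules(child))
    return rules


RULES = preorder_rules(TREE)
print(RULES)
# [('S', 'aB'), ('B', 'aBB'), ('B', 'b'), ('B', 'b')]
```

The rules come out in the same order in which they were applied in the derivation S → aB → aaBB → aabB → aabb.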

