Date post: | 16-Dec-2015 |
Upload: | maude-golden |
Outline
Introduction. Directional top-down parsing. Imitating leftmost derivations. The pushdown automaton. Breadth-first top-down parsing. Depth-first (backtracking) parsers. Conclusion.
Introduction
• Parsers check whether an input string can be derived from a given grammar and, if so, provide one or more syntactic analyses (parse trees).
• There are two basic types:
  – Top-down parsing
    • Directional (goes from left to right).
    • Non-directional.
  – Bottom-up parsing
Directional Top-down Parsing
• Begin with the start symbol S.
• Apply productions until we arrive at the input string.
• We draw the prediction right under the part of the input it predicts.
Imitating Leftmost Derivations
The sentential form (our prediction) consists of both terminals and non-terminals.
If a terminal is in front, we match it against the current input symbol; if a non-terminal is in front, we replace it by one of its right-hand sides.
This way we always replace the leftmost non-terminal, so if we succeed in the end, we have imitated a leftmost derivation.
Example
We try to rederive the input aabb from the start symbol S. The first symbol of our prediction is a non-terminal, so we have to replace it by one of its right-hand sides.
S → aB | bA
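We choose the first alternative, aB, because its leading terminal a matches the first input symbol. After matching it, the prediction is B and the remaining input is abb. Among the right-hand sides of B, only aBB starts with a, so we choose it and match the terminal again.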
B → b | bS | aBB
Example ctd.
We are now left with the prediction BB for the remaining input bb.
B → b | bS | aBB
We then replace the leftmost B by one of its right-hand sides (B → b), match the b, and do the same for the remaining B. In the end we obtain the following derivation:
S → aB → aaBB → aabB → aabb
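The derivation above can be replayed mechanically. The following is a minimal Python sketch, not part of the original slides: the function name leftmost_derivation and the convention that upper-case letters are non-terminals are assumptions made for illustration. Only the rules for S and B shown in the text are included; the rules for A are not given there, so A is left without alternatives.

```python
# Rules as shown in the text. A's rules are not listed there, so A is omitted.
GRAMMAR = {
    "S": ["aB", "bA"],
    "B": ["b", "bS", "aBB"],
}


def leftmost_derivation(start, choices):
    """Apply each chosen right-hand side to the leftmost non-terminal in turn.

    Upper-case letters are non-terminals, lower-case letters are terminals.
    Returns the list of sentential forms, starting with `start`.
    """
    form = start
    steps = [form]
    for rhs in choices:
        # Locate the leftmost non-terminal of the current sentential form.
        pos = next(i for i, sym in enumerate(form) if sym.isupper())
        nonterminal = form[pos]
        if rhs not in GRAMMAR.get(nonterminal, []):
            raise ValueError(f"{nonterminal} -> {rhs} is not a rule")
        form = form[:pos] + rhs + form[pos + 1:]
        steps.append(form)
    return steps


steps = leftmost_derivation("S", ["aB", "aBB", "b", "b"])
print(" => ".join(steps))  # S => aB => aaBB => aabB => aabb
```

Because the replacement position is always the first upper-case symbol, any sequence of choices this function accepts is a leftmost derivation by construction.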
The Pushdown Automaton
A stack is a last-in, first-out (LIFO) list. The PDA operates by popping a symbol off the stack (which holds symbols from the stack alphabet) and reading an input symbol.
These two symbols give us a choice of several lists of stack symbols to be pushed back on the stack.
So there is a mapping of (input symbol, stack symbol) pairs to lists of stack symbols. The automaton accepts the input sentence when the stack is empty at the end of the input.
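As an illustration of this mapping, here is a small Python sketch. The table TRANSITIONS and the function accepts are hypothetical names, and the table is an assumption: it is derived from the example rules S → aB | bA and B → b | bS | aBB by taking the leading terminal as the input symbol and the rest of the right-hand side as the list to push. The rules for A are not shown in the text, so A has no entries and any stack with A on top fails. The choice between several lists is simulated here by trying each in turn.

```python
# (input symbol, stack symbol) -> possible lists of stack symbols to push.
# Assumed table, built from the S and B rules of the running example.
TRANSITIONS = {
    ("a", "S"): [["B"]],
    ("b", "S"): [["A"]],
    ("b", "B"): [[], ["S"]],
    ("a", "B"): [["B", "B"]],
}


def accepts(text, stack=("S",)):
    """Return True if some sequence of moves empties the stack at end of input.

    The top of the stack is stack[0]. Every move pops one stack symbol and
    reads one input symbol, so the recursion depth is bounded by len(text).
    """
    if not text:
        return not stack  # accept only with an empty stack
    if not stack:
        return False  # input left over but nothing to pop
    top, rest = stack[0], stack[1:]
    for pushed in TRANSITIONS.get((text[0], top), []):
        if accepts(text[1:], tuple(pushed) + rest):
            return True
    return False


print(accepts("aabb"))  # True
print(accepts("aab"))   # False
```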
Breadth-first Top-Down Parsing
There are two strategies for exploring the tree of choices: breadth-first and depth-first.
In breadth-first we maintain a list of all possible predictions.
We process it in the following way:
• If there is a non-terminal on top, we replace the prediction stack by several new prediction stacks, one for each right-hand side of this non-terminal.
• If there is a terminal on top, we match it against the current input symbol and eliminate all prediction stacks that do not match.
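The two steps above can be sketched in Python as follows. This is a recognizer only, written under stated assumptions: predictions are plain strings with the top of the stack at index 0, upper-case letters are non-terminals, and the grammar has no left recursion (otherwise the expansion step would not terminate). The names expand and breadth_first_recognize are hypothetical, and A has no rules because the text does not list any.

```python
# Rules as shown in the text. A's rules are not listed there, so A is omitted.
GRAMMAR = {"S": ["aB", "bA"], "B": ["b", "bS", "aBB"]}


def expand(predictions):
    """Replace leading non-terminals until each prediction is empty
    or starts with a terminal."""
    done = []
    work = list(predictions)
    while work:
        pred = work.pop()
        if pred and pred[0].isupper():
            # One new prediction per right-hand side of the non-terminal.
            for rhs in GRAMMAR.get(pred[0], []):
                work.append(rhs + pred[1:])
        else:
            done.append(pred)
    return done


def breadth_first_recognize(text, start="S"):
    """Keep every viable prediction alive while reading the input once."""
    predictions = [start]
    for symbol in text:
        predictions = expand(predictions)
        # Keep predictions whose leading terminal matches, minus that terminal.
        predictions = [p[1:] for p in predictions if p and p[0] == symbol]
        if not predictions:
            return False
    # Accept if some prediction is exhausted exactly at the end of the input.
    return "" in expand(predictions)


print(breadth_first_recognize("aabb"))  # True
```

The list predictions is exactly the list of all possible predictions mentioned above; its length at each step gives a feel for how much the method has to store.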
Depth-first (Backtracking) Parsers
The breadth-first method can use a lot of memory, because it stores a list of all predictions that are still possible.
The depth-first method doesn't have this problem because we look at only one path at a time.
First we follow one path; if it turns out to be a failure, we roll back our actions and continue with the other possibilities.
Backtracking
• Sometimes we have multiple right-hand sides and we have to choose one.
• But if we choose the wrong one, we come to a dead end.
• So, we have to go back to the point where we made the choice, and try an alternative path.
• We do this until we succeed, or run out of choices.
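A minimal Python sketch of this backtracking scheme follows. It is an illustration under assumptions, not the slides' own algorithm: the function name parse is hypothetical, upper-case letters stand for non-terminals, and the grammar must not be left-recursive or the search would never end. A has no rules because the text does not list any. Returning None from a call is what sends control back to the point where the choice was made.

```python
# Rules as shown in the text. A's rules are not listed there, so A is omitted.
GRAMMAR = {"S": ["aB", "bA"], "B": ["b", "bS", "aBB"]}


def parse(prediction, text, used=()):
    """Depth-first search for a leftmost derivation of `text`.

    Returns the tuple of (non-terminal, right-hand side) rules used,
    or None if every choice leads to a dead end.
    """
    if not prediction:
        return used if not text else None
    head, rest = prediction[0], prediction[1:]
    if head.isupper():
        # Try the right-hand sides one by one; a None result means the
        # choice was wrong, so we fall through to the next alternative.
        for rhs in GRAMMAR.get(head, []):
            result = parse(rhs + rest, text, used + ((head, rhs),))
            if result is not None:
                return result
        return None  # out of choices: the caller backtracks further
    if text and text[0] == head:
        return parse(rest, text[1:], used)  # terminal matches the input
    return None  # terminal mismatch: dead end


rules = parse("S", "aabb")
print(rules)
```

For the input aabb the search first tries B → b and B → bS for the second symbol, fails on both, and only then succeeds with B → aBB. Only one path is held in memory at any time.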
Conclusion
We always process the leftmost symbol of the prediction.
If this symbol is a terminal, we have no choice: we have to match it with the current input symbol or reject the parse.
If this symbol is a non-terminal, we have to make a prediction: it has to be replaced by one of its right-hand sides. Thus, we always process the leftmost non-terminal first, so we get a leftmost derivation.
As a result, a top-down method recognizes the nodes of the parse tree in pre-order: the parent is identified before any of its children.
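The pre-order property can be checked on the example derivation of aabb. The sketch below is an illustration only: build_tree and preorder are hypothetical helper names, trees are represented as (symbol, children) pairs, and the rule list is the leftmost derivation S → aB, B → aBB, B → b, B → b from the example. The tree is rebuilt by consuming the rules strictly in the order in which they were applied.

```python
def build_tree(rules):
    """Rebuild a parse tree from the rules of a leftmost derivation.

    A node is a (symbol, children) pair. Each non-terminal node takes the
    next unused rule, so nodes are created parent first, left to right.
    """
    remaining = iter(rules)

    def node(symbol):
        if symbol.islower():
            return (symbol, [])  # terminal: a leaf
        lhs, rhs = next(remaining)
        assert lhs == symbol, "rule order does not fit a leftmost derivation"
        return (symbol, [node(child) for child in rhs])

    return node(rules[0][0])


def preorder(tree):
    """Yield the symbols of the tree, parent before children."""
    symbol, children = tree
    yield symbol
    for child in children:
        yield from preorder(child)


rules = [("S", "aB"), ("B", "aBB"), ("B", "b"), ("B", "b")]
tree = build_tree(rules)
visited = list(preorder(tree))
print("".join(visited))  # SaBaBbBb
```

The non-terminals in the pre-order listing (S, B, B, B) appear in the same order as the left-hand sides of the applied rules, and the terminals spell out the input aabb.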