ParsingParsing. 2 Front-End: Parser Checks the stream of words and their parts of speech for...

Date post: 22-Dec-2015
Upload: elisabeth-cobb
Transcript
Parsing

Front-End: Parser

Checks the stream of words and their parts of speech for grammatical correctness

Determines if the input is syntactically well formed

Guides context-sensitive (“semantic”) analysis (type checking)

Builds IR for source program

[Figure: source code → scanner → tokens → parser → IR; errors are reported as a separate output]

Syntactic Analysis

Natural language analogy: consider the sentence

He        wrote        the        program
noun      verb         article    noun
subject   predicate    object
sentence

Syntactic Analysis

Programming language

if ( b <= 0 ) a = b

b <= 0 is a bool expr
a = b is an assignment
the whole statement is an if-statement

Syntactic Analysis

syntax errors

int* foo(int i, int j))
{
    for(k=0; i j; )
    fi( i > j )
        return j;
}

Compiler Construction

Sohail Aslam

Lecture 11

Syntactic Analysis

int* foo(int i, int j))        <- extra parenthesis
{
    for(k=0; i j; )            <- missing expression
    fi( i > j )                <- not a keyword
        return j;
}

Semantic Analysis

Grammatically correct, but semantically (meaning) wrong!

He        wrote        the        computer
noun      verb         article    noun
subject   predicate    object
sentence

Semantic Analysis

int* foo(int i, int j)
{
    for(k=0; i < j; j++ )
        if( i < j-2 )
            sum = sum+i;
    return sum;
}

undeclared var (k and sum are never declared)

return type mismatch (foo is declared to return int*, but returns sum)

Role of the Parser

Not all sequences of tokens are programs.

The parser must distinguish between valid and invalid sequences of tokens.

Role of the Parser

What we need

An expressive way to describe the syntax

An acceptor mechanism that determines if the input token stream satisfies the syntax

Study of Parsing

Parsing is the process of discovering a derivation for some sentence

Study of Parsing

Mathematical model of syntax: a grammar G.

Algorithm for testing membership in L(G).

Context Free Grammars

A CFG is a four-tuple G = (S, N, T, P)

S is the start symbol
N is a set of non-terminals
T is a set of terminals
P is a set of productions
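The four components map directly onto a small data structure. The following is a minimal sketch, not something taken from the slides: the class name `Grammar`, the method `is_well_formed`, and the nested-parentheses grammar are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grammar:
    """A context-free grammar G = (S, N, T, P)."""
    start: str               # S: the start symbol
    nonterminals: frozenset  # N: set of non-terminals
    terminals: frozenset     # T: set of terminals
    productions: tuple       # P: (lhs, rhs) pairs, rhs is a tuple of symbols

    def is_well_formed(self) -> bool:
        """Check that S is in N, N and T are disjoint, and P uses known symbols."""
        symbols = self.nonterminals | self.terminals
        return (
            self.start in self.nonterminals
            and not (self.nonterminals & self.terminals)
            and all(
                lhs in self.nonterminals and all(s in symbols for s in rhs)
                for lhs, rhs in self.productions
            )
        )


# Hypothetical example grammar (an assumption, not from the slides):
#   Parens -> ( Parens ) | ( )
PARENS = Grammar(
    start="Parens",
    nonterminals=frozenset({"Parens"}),
    terminals=frozenset({"(", ")"}),
    productions=(
        ("Parens", ("(", "Parens", ")")),
        ("Parens", ("(", ")")),
    ),
)

print(PARENS.is_well_formed())
```

Each production has a single non-terminal on its left-hand side, which is what makes the grammar context free.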

Why Not Regular Expressions?

Reason: regular languages do not have enough power to express the syntax of programming languages.

Limitations of Regular Languages

A finite automaton can’t remember the number of times it has visited a particular state
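A standard illustration of this limit, not taken from the slides, is the language of balanced parentheses. Recognizing it requires counting how many parentheses are currently open, and that count can grow without bound, while a finite automaton has only a fixed number of states. The sketch below uses an integer counter for exactly the memory the automaton lacks; the function name `balanced` is an assumption.

```python
def balanced(text: str) -> bool:
    """Accept strings of '(' and ')' whose parentheses nest correctly."""
    depth = 0  # unbounded counter: the memory a finite automaton does not have
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:   # a ')' with no matching '('
                return False
        else:
            return False    # this toy alphabet contains only parentheses
    return depth == 0       # every '(' must have been closed


print(balanced("(())"))
print(balanced("(()"))
```

Nested constructs such as parenthesized expressions and block structure in programming languages have the same shape, which is why a more powerful formalism is needed.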

Example of CFG

Context-free syntax is specified with a CFG

Example of CFG

Example

SheepNoise → SheepNoise baa
           | baa

This CFG defines the set of noises sheep make
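The set this grammar defines is every sequence of one or more baa. As a hedged sketch of an acceptor for it, the function below mirrors the two productions directly; the name `is_sheep_noise` and the choice to pass words as a list are assumptions for illustration.

```python
from typing import Sequence


def is_sheep_noise(words: Sequence[str]) -> bool:
    """Return True if words can be derived from SheepNoise."""
    if len(words) == 0:
        return False                  # the grammar derives no empty sentence
    if len(words) == 1:
        return words[0] == "baa"      # SheepNoise -> baa
    # SheepNoise -> SheepNoise baa: the last word is baa, the rest is a SheepNoise
    return words[-1] == "baa" and is_sheep_noise(words[:-1])


print(is_sheep_noise("baa baa baa".split()))
print(is_sheep_noise("baa moo".split()))
```

Peeling words off the right end follows the left-recursive shape of the first production.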

Example of CFG

We can use the SheepNoise grammar to create sentences

We use the productions as rewriting rules

Example of CFG

1  SheepNoise → SheepNoise baa
2             | baa

Rule  Sentential Form
-     SheepNoise
2     baa

Rule  Sentential Form
-     SheepNoise
1     SheepNoise baa
2     baa baa

Rule  Sentential Form
-     SheepNoise
1     SheepNoise baa
1     SheepNoise baa baa
2     baa baa baa

And so on ...
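The rewriting can be replayed mechanically. This is a small sketch under stated assumptions: the names `RULES`, `rewrite`, and `derive` are hypothetical, and rule 1 and rule 2 refer to the two SheepNoise alternatives in the order they are written.

```python
# Rule number -> (left-hand side, right-hand side)
RULES = {
    1: ("SheepNoise", ["SheepNoise", "baa"]),
    2: ("SheepNoise", ["baa"]),
}


def rewrite(form, rule):
    """Replace the first occurrence of the rule's left-hand side in form."""
    lhs, rhs = RULES[rule]
    i = form.index(lhs)  # raises ValueError if lhs does not occur in form
    return form[:i] + rhs + form[i + 1:]


def derive(rule_sequence):
    """Return every sentential form, beginning with the start symbol."""
    forms = [["SheepNoise"]]
    for rule in rule_sequence:
        forms.append(rewrite(forms[-1], rule))
    return forms


for form in derive([1, 1, 2]):
    print(" ".join(form))
```

Applying rule 1 more times before the final rule 2 yields longer sentences, which is the "and so on" of the last table.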

Example of CFG

While it is cute, this example quickly runs out of intellectual steam

To explore uses of CFGs, we need a more complex grammar

More Useful Grammar

1  expr → expr op expr
2       | num
3       | id
4  op   → +
5       | –
6       | *
7       | /

Backus-Naur Form (BNF)

Grammar rules in a similar form were first used in the description of the Algol60 language.

Backus-Naur Form (BNF)

The notation was developed by John Backus and adapted by Peter Naur for the Algol60 report.

Thus the term Backus-Naur Form (BNF)

Derivation

Let us use the expression grammar to derive the sentence

x – 2 * y

Derivation: x – 2 * y

Rule  Sentential Form
-     expr
1     expr op expr
3     <id,x> op expr
5     <id,x> – expr
1     <id,x> – expr op expr
2     <id,x> – <num,2> op expr
6     <id,x> – <num,2> * expr
3     <id,x> – <num,2> * <id,y>
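The pairs <id,x> and <num,2> in the table are tokens: a part of speech together with the word itself, as a scanner would hand them to the parser. The following is a minimal scanner sketch for this expression language only. The names `TOKEN_RE` and `scan` are assumptions, and the sketch expects an ASCII hyphen for the minus sign.

```python
import re

# One token per match: a number, an identifier, or an operator.
TOKEN_RE = re.compile(
    r"\s*(?:(?P<num>\d+)|(?P<id>[A-Za-z_]\w*)|(?P<op>[-+*/]))"
)


def scan(text):
    """Turn source text into a list of (part of speech, word) pairs."""
    text = text.rstrip()
    tokens = []
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected input near position {pos}")
        kind = m.lastgroup  # name of the alternative that matched
        tokens.append((kind, m.group(kind)))
        pos = m.end()
    return tokens


print(scan("x - 2 * y"))
```

The parser works on the first component of each pair (id, num, op); the second component is carried along for later phases.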

Derivation

Such a process of rewrites is called a derivation.

The process of discovering a derivation is called parsing

Derivation

We denote this derivation as:

expr →* id – num * id

Derivations

At each step, we choose a non-terminal to replace

Different choices can lead to different derivations.

Derivations

Two derivations are of interest

1. Leftmost derivation

2. Rightmost derivation

Derivations

Leftmost derivation: replace leftmost non-terminal (NT) at each step

Rightmost derivation: replace rightmost NT at each step

Derivations

The example on the preceding slides was a leftmost derivation

There is also a rightmost derivation

Rightmost Derivation: x – 2 * y

Rule  Sentential Form
-     expr
1     expr op expr
3     expr op <id,y>
6     expr * <id,y>
1     expr op expr * <id,y>
2     expr op <num,2> * <id,y>
5     expr – <num,2> * <id,y>
3     <id,x> – <num,2> * <id,y>
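Both derivations can be replayed with the same routine, differing only in which non-terminal is picked at each step. This is a sketch under stated assumptions: `PRODUCTIONS`, `derive`, `LEFTMOST`, and `RIGHTMOST` are hypothetical names, rule numbers follow the expression grammar (2 for num, 3 for id), tokens are reduced to their part of speech, and an ASCII hyphen stands for the minus sign.

```python
# Rule number -> (left-hand side, right-hand side)
PRODUCTIONS = {
    1: ("expr", ["expr", "op", "expr"]),
    2: ("expr", ["num"]),
    3: ("expr", ["id"]),
    4: ("op", ["+"]),
    5: ("op", ["-"]),
    6: ("op", ["*"]),
    7: ("op", ["/"]),
}
NONTERMINALS = {"expr", "op"}


def derive(rule_sequence, leftmost=True):
    """Apply each rule to the leftmost (or rightmost) non-terminal in turn."""
    form = ["expr"]
    forms = [form]
    for rule in rule_sequence:
        lhs, rhs = PRODUCTIONS[rule]
        positions = [i for i, s in enumerate(form) if s in NONTERMINALS]
        if not positions:
            raise ValueError("no non-terminal left to replace")
        i = positions[0] if leftmost else positions[-1]
        if form[i] != lhs:
            raise ValueError(f"rule {rule} does not apply to {form[i]!r}")
        form = form[:i] + rhs + form[i + 1:]
        forms.append(form)
    return forms


LEFTMOST = [1, 3, 5, 1, 2, 6, 3]
RIGHTMOST = [1, 3, 6, 1, 2, 5, 3]

for form in derive(RIGHTMOST, leftmost=False):
    print(" ".join(form))
```

Calling `derive(LEFTMOST)` replays the leftmost table in the same way; both calls end in the sentence id - num * id.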

Derivations

In both cases we have

expr →* id – num * id

Derivations

The two derivations produce different parse trees.

The parse trees imply different evaluation orders!
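To see why the evaluation order matters, the two groupings can be evaluated on sample values. As the derivations were carried out on the slides, the leftmost one expands the second expr into 2 * y, giving x – (2 * y), while the rightmost one expands the first expr into x – 2, giving (x – 2) * y. The sketch below is an illustration only: the nested-tuple tree encoding, the `evaluate` helper, and the values for x and y are assumptions.

```python
import operator

OPS = {"-": operator.sub, "*": operator.mul}


def evaluate(tree, env):
    """Evaluate a tree given as (left, op, right), a variable name, or a number."""
    if isinstance(tree, tuple):
        left, op, right = tree
        return OPS[op](evaluate(left, env), evaluate(right, env))
    if isinstance(tree, str):
        return env[tree]   # look up a variable
    return tree            # a numeric literal


leftmost_tree = ("x", "-", (2, "*", "y"))    # x - (2 * y)
rightmost_tree = (("x", "-", 2), "*", "y")   # (x - 2) * y

env = {"x": 10, "y": 3}  # arbitrary sample values
print(evaluate(leftmost_tree, env))   # 10 - (2 * 3) = 4
print(evaluate(rightmost_tree, env))  # (10 - 2) * 3 = 24
```

The same token sequence therefore yields two different results, depending on which tree the parser builds.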
