
Chapter 9

Date post: 05-Jan-2016
Upload: ernst
Page 1: Chapter 9

Chapter 9

Compilers and Language Translation

Page 2: Chapter 9

The Compilation Process

Phase I: Lexical analysis
Phase II: Parsing
Phase III: Semantics and code generation
Phase IV: Code optimization

Page 3: Chapter 9

Introduction

High-level languages are more difficult to “translate” than assembly languages.

Assembly language and machine language are related 1-to-1.

The relationship between a high-level language and machine language is 1-to-many.

Page 4: Chapter 9

Compiler

The piece of software that translates high-level programming language code into machine language code.

Two distinct goals of a compiler:
• Correctness
• Efficient and concise code

Example: 2x_0 + 2x_1 + … + 2x_50000
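The slide gives only the expression, so the following is one plausible reading rather than the author's stated point: a compiler aiming for efficient code could pull the common factor 2 out of the sum and replace 50,001 multiplications with one. A minimal Python sketch of the two equivalent computations, using made-up integer data:

```python
# Hypothetical data: the slide does not say what x_0 ... x_50000 hold.
xs = list(range(50001))

def naive_sum(values):
    # One multiplication per term: 2*x_0 + 2*x_1 + ... + 2*x_50000.
    return sum(2 * x for x in values)

def factored_sum(values):
    # Same result with a single multiplication: 2 * (x_0 + ... + x_50000).
    return 2 * sum(values)

print(naive_sum(xs) == factored_sum(xs))
```

Both functions are correct; only the second is also efficient, which is the distinction between the two goals listed above.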

Page 5: Chapter 9

The Compilation Process

[Figure: Scanner → Parser → Code Generator → Optimizer → Object file]

Page 6: Chapter 9

Lexical Analysis

The compiler examines the individual characters in the source program and groups them into syntactical units, called tokens, that will be analyzed in succeeding stages.

Analogous to grouping letters into words prior to analyzing text.

Page 7: Chapter 9

Parsing

During this stage the sequence of tokens formed by the scanner is checked to see whether it is syntactically correct according to the rules of the programming language.

Equivalent to checking whether the words in the text form grammatically correct sentences.

Page 8: Chapter 9

Semantic Analysis and Code Generation

If the high-level language statement is structurally correct, then the compiler analyzes its meaning and generates the proper sequence of machine language instructions to carry out these actions.

Page 9: Chapter 9

Code Optimization

The compiler takes the generated code and sees whether it can be made more efficient, either by making it run faster or by having it occupy less memory.

Page 10: Chapter 9

Phase I: Lexical Analysis

Scanner, or lexical analyzer, groups input characters into tokens.

Example:
a = b + 319 - delta;

The scanner discards nonessential characters, such as blanks and tabs, and then groups the remaining characters into high-level syntactic units such as symbols, numbers, and operators.

Page 11: Chapter 9

Token Classifications

Token type    Classification number
symbol        1
number        2
=             3
+             4
-             5
;             6
==            7
if            8
else          9
(             10
)             11
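A scanner for these token classes can be sketched in a few lines of Python. This is a minimal illustration, not the textbook's algorithm: the function name `scan`, the regular expression, and the rule that a symbol is a letter or underscore followed by letters, digits, or underscores are all assumptions. Only the classification numbers come from the slide.

```python
import re

# Classification numbers taken from the token classification slide.
KEYWORDS = {"if": 8, "else": 9}
OPERATORS = {"==": 7, "=": 3, "+": 4, "-": 5, ";": 6, "(": 10, ")": 11}

# One alternative per token shape; "==" is tried before "=" so the
# longer operator wins. Leading blanks and tabs are skipped by \s*.
TOKEN_RE = re.compile(
    r"\s*(?:(?P<word>[A-Za-z_]\w*)"
    r"|(?P<number>\d+)"
    r"|(?P<op>==|[=+\-;()]))"
)

def scan(source):
    """Return a list of (text, classification number) pairs."""
    tokens = []
    pos = 0
    source = source.rstrip()
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise ValueError(f"cannot form a token near position {pos}")
        pos = match.end()
        if match.group("word"):
            text = match.group("word")
            tokens.append((text, KEYWORDS.get(text, 1)))  # 1 = symbol
        elif match.group("number"):
            tokens.append((match.group("number"), 2))      # 2 = number
        else:
            text = match.group("op")
            tokens.append((text, OPERATORS[text]))
    return tokens

print(scan("a = b + 319 - delta;"))
```

For the example statement, the scanner yields eight tokens: a, b, and delta are classified as symbols (1), 319 as a number (2), and =, +, -, and ; receive classifications 3, 4, 5, and 6.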

Page 12: Chapter 9

Phase II: Parsing

During the parsing phase, a compiler determines whether the tokens recognized by the scanner fit together in a grammatically meaningful way.

Analogous to the operation of “diagramming a sentence”.

Page 13: Chapter 9

Example

To prove that the sequence of words:

The man bit the dog

is a correctly formed sentence.

Page 14: Chapter 9

Another Example

The man bit the

Page 15: Chapter 9

Programming Language Example

Statement: a = b + c

Page 16: Chapter 9

Parse Tree

The structure shown in the previous example is called a parse tree.

It starts from the individual tokens a, =, b, +, c and shows how these tokens can be grouped together into predefined grammatical categories such as <symbol>, <addition operator> and <expression> until the desired goal is reached (in this case, <assignment statement>).

Page 17: Chapter 9

Grammars, Languages and BNF

How does a parser know how to construct the parse tree?

The parser must be given a formal description of the syntax, the grammatical structure, of the language that it is going to analyze.

The most widely used notation for representing the syntax of a programming language is called BNF, an acronym for Backus-Naur form.

Page 18: Chapter 9

BNF

The syntax of a language is specified as a set of rules, also called productions.

The entire collection of rules is called a grammar.

BNF rule:
left-hand side ::= “definition”

Page 19: Chapter 9

BNF Example

<assignment statement> ::= <symbol> = <expression>

The rule says that the syntactical construct called <assignment statement> is defined as a <symbol> followed by the token = followed by the syntactical construct called <expression>.

Page 20: Chapter 9

Terminals/Nonterminals

BNF uses two types of objects on the right-hand side of a production:
• Terminals: actual tokens of the language recognized and returned by a scanner.
• Nonterminals: intermediate grammatical categories used to help explain and organize the language.

Page 21: Chapter 9

Goal Symbol

The goal symbol is the highest-level nonterminal.

When the goal symbol has been produced, the parser has finished building the tree, and the statements have been successfully parsed.

The collection of all statements that can be successfully parsed is called the language defined by a grammar.

Page 22: Chapter 9

Meta-symbols

Meta-symbol: used to describe the characteristics of another language.

BNF has five meta-symbols:
• <
• >
• ::=
• | : OR. Ex: <digit> ::= 0|1|2|3|4|5|6|7|8|9
• Λ : null string. Ex:
  <signed integer> ::= <sign><number>
  <sign> ::= +|-|Λ

Page 23: Chapter 9

Fundamental Rule of Parsing

If, by repeated applications of the rules of the grammar, a parser can convert the sequence of input tokens into the goal symbol, then that sequence of tokens is a syntactically valid statement of the language.

Page 24: Chapter 9

Example

A three-rule grammar
1. <sentence> ::= <noun> <verb>
2. <noun> ::= bees | dogs
3. <verb> ::= buzz | bite

• Example 1: Dogs bite.
• Example 2: Bees dogs.
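Because this grammar has no recursion, checking a sentence against it takes only a few lines. The sketch below is illustrative only; the function name `is_sentence` and the handling of capital letters and the trailing period are assumptions.

```python
NOUNS = {"bees", "dogs"}   # rule 2
VERBS = {"buzz", "bite"}   # rule 3

def is_sentence(text):
    """Rule 1: <sentence> ::= <noun> <verb>."""
    words = text.lower().rstrip(".").split()
    return len(words) == 2 and words[0] in NOUNS and words[1] in VERBS

print(is_sentence("Dogs bite."))   # True: <noun> <verb>
print(is_sentence("Bees dogs."))   # False: second word is not a <verb>
```

Example 1 reduces to the goal symbol <sentence>, so it is valid. Example 2 reduces to <noun> <noun>, which no rule can convert into <sentence>.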

Page 25: Chapter 9

Another Example

Grammar for a simplified assignment statement

1. <assignment statement> ::= <variable> = <expression>
2. <expression> ::= <variable> | <variable> + <variable>
3. <variable> ::= x | y | z

Page 26: Chapter 9

Generated Parse Tree

Page 27: Chapter 9

Wrong Path

Page 28: Chapter 9

How to parse?

Parsing is a complex sequence of applying rules, building grammatical constructs, and seeing whether things are moving toward the correct answer (the goal symbol). If not, the parser “undoes” the rule just applied and tries another.

Look-ahead parsing algorithm: “looking down the road” a few tokens to see what would happen if a certain choice were made.
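The try-a-rule-then-undo idea can be sketched with a small backtracking recognizer for the simplified assignment grammar on the "Another Example" slide. Note the swap in technique: this sketch works top-down from the goal symbol, whereas the slides describe grouping tokens bottom-up, and it uses no look-ahead. The names `GRAMMAR`, `match`, and `is_valid` are hypothetical.

```python
# Grammar from the "Another Example" slide. Names in angle brackets are
# nonterminals; everything else is a terminal token.
GRAMMAR = {
    "<assignment statement>": [["<variable>", "=", "<expression>"]],
    "<expression>": [["<variable>"], ["<variable>", "+", "<variable>"]],
    "<variable>": [["x"], ["y"], ["z"]],
}

def match(symbols, tokens):
    """Return True if the symbol sequence can derive exactly the token list."""
    if not symbols:
        return not tokens
    first, rest = symbols[0], symbols[1:]
    if first in GRAMMAR:
        # Try each alternative in turn; a failed attempt is "undone" simply
        # by returning and moving on to the next alternative.
        return any(match(alt + rest, tokens) for alt in GRAMMAR[first])
    # Terminal: it must equal the next input token.
    return bool(tokens) and tokens[0] == first and match(rest, tokens[1:])

def is_valid(statement):
    # Assumes tokens are separated by blanks, so no scanner is needed.
    return match(["<assignment statement>"], statement.split())

print(is_valid("x = y + z"))
print(is_valid("x = x + y + z"))
```

The first statement is accepted after the parser abandons the alternative <expression> ::= <variable>. The second is rejected because rule 2 allows at most one addition.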

Page 29: Chapter 9

Example

Not possible to build a parse tree with the grammar.

Page 30: Chapter 9

Major Challenge

Design a grammar that:
• Includes every valid statement that we want to be in the language
• Excludes every invalid statement that we do not want to be in the language

Page 31: Chapter 9

Assignment Statement (2nd try)

1. <assignment statement> ::= <variable> = <expression>
2. <expression> ::= <variable> | <expression> + <expression> (recursive definition)
3. <variable> ::= x | y | z

Page 32: Chapter 9

Resulting Parse Tree

Page 33: Chapter 9

Using Recursive Definition

Page 34: Chapter 9

Validity vs. Ambiguity

It is possible to construct two parse trees of x=x+y+z using the 2nd grammar. The two trees give the statement two different meanings:

x=(x+y)+z
x=x+(y+z)
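The ambiguity can be made concrete by enumerating every way the recursive rule <expression> ::= <variable> | <expression> + <expression> can group a list of operands. This is a small illustrative sketch; the function name `parse_trees` and the parenthesized-string representation of a tree are assumptions.

```python
def parse_trees(operands):
    """List every grouping of the operands that the recursive rule allows."""
    if len(operands) == 1:
        return [operands[0]]            # <expression> ::= <variable>
    trees = []
    # Choose which "+" is applied last, i.e. sits at the root of the tree.
    for split in range(1, len(operands)):
        for left in parse_trees(operands[:split]):
            for right in parse_trees(operands[split:]):
                trees.append(f"({left}+{right})")
    return trees

for tree in parse_trees(["x", "y", "z"]):
    print("x =", tree)
```

With three operands the function returns exactly two groupings, matching the two parse trees above. A grammar is unambiguous only if every valid statement has a single tree.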

Page 35: Chapter 9

If-else grammar

Page 36: Chapter 9

Parse Tree

Page 37: Chapter 9

Phase III: Semantics and Code Generation

1. <sentence> ::= <noun> <verb>
2. <noun> ::= bees | dogs
3. <verb> ::= buzz | bite

Possible combinations:
• Dogs bite.
• Dogs buzz.
• Bees bite.
• Bees buzz.

Not all combinations make sense.

Page 38: Chapter 9

Semantics and Code Generation

A compiler examines the semantics of a programming language statement. It analyzes the meaning of the tokens and tries to understand the actions they perform.

If the statement is meaningless, it is semantically rejected. Otherwise it is translated into machine language.

Page 39: Chapter 9

Example

The statement
sum=a+b;
is syntactically correct. But what if the variables are defined as follows:
char a;
double b;
int sum;
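To decide what this statement means, the compiler has to consult the declared types. The sketch below assumes a simplified, C-like ordering of the three types; the slides do not state the conversion rules, and the names `DECLARED`, `RANK`, `result_type`, and `check_assignment` are hypothetical.

```python
# Declarations from the slide.
DECLARED = {"a": "char", "b": "double", "sum": "int"}

# Simplified, C-like ordering of the three types involved (an assumption;
# real C conversion rules have more cases than this).
RANK = {"char": 0, "int": 1, "double": 2}

def result_type(left, right):
    """Type of left + right: the operand type with the higher rank."""
    return max(DECLARED[left], DECLARED[right], key=RANK.get)

def check_assignment(target, left, right):
    expr_type = result_type(left, right)
    target_type = DECLARED[target]
    if RANK[expr_type] > RANK[target_type]:
        return (f"{left}+{right} is {expr_type}; storing it in "
                f"{target_type} {target} may lose information")
    return f"{left}+{right} is {expr_type}; it fits in {target_type} {target}"

print(check_assignment("sum", "a", "b"))
```

Whether such a statement is rejected, accepted with a warning, or silently converted depends on the language. In every case the decision requires type information that the parser alone does not have.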

Page 40: Chapter 9

Semantic Records

Each nonterminal symbol is associated with a semantic record, a data structure that stores information about a nonterminal, such as the actual name of the object and its data type.
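A semantic record can be pictured as a small data structure attached to each nonterminal. The following is a sketch only: the field names, the `combine` helper, the temporary name `temp1`, and the type rule inside `combine` are all illustrative assumptions, not the textbook's definitions.

```python
from dataclasses import dataclass

@dataclass
class SemanticRecord:
    name: str        # actual name of the object, e.g. "b"
    data_type: str   # declared or computed type, e.g. "double"

def combine(left, right, temp_name):
    """Build the record for a nonterminal formed from two operands joined by +.

    The type rule here (result is double if either side is double,
    otherwise the left operand's type) is a placeholder assumption.
    """
    if "double" in (left.data_type, right.data_type):
        data_type = "double"
    else:
        data_type = left.data_type
    return SemanticRecord(temp_name, data_type)

a = SemanticRecord("a", "char")
b = SemanticRecord("b", "double")
print(combine(a, b, "temp1"))
```

As the parse tree is built upward, each new nonterminal receives a record computed from the records of its children.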

Page 41: Chapter 9

Semantic Records (II)

Grows gradually.

Page 42: Chapter 9

Another Situation

Page 43: Chapter 9

Two-Stage Process

Semantic analysis: a pass over the parse tree to determine whether all branches of the tree are semantically valid.

Code generation: the compiler makes a 2nd pass over the parse tree to produce the translated code.
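The two passes can be sketched on a tiny parse tree for a = b + c. Everything here is an assumption made for illustration: the tuple representation of the tree, the declared types, the rule that both operands must have the same type, and the LOAD/ADD/STORE instructions of a hypothetical one-register machine.

```python
# Parse tree for "a = b + c" as nested tuples: (operator, left, right).
TREE = ("=", "a", ("+", "b", "c"))
TYPES = {"a": "int", "b": "int", "c": "int"}   # assumed declarations

def analyze(node):
    """Pass 1: return the node's type, or raise if a branch is invalid."""
    if isinstance(node, str):
        if node not in TYPES:
            raise NameError(f"undeclared variable {node}")
        return TYPES[node]
    _, left, right = node
    left_type, right_type = analyze(left), analyze(right)
    if left_type != right_type:
        raise TypeError(f"type mismatch: {left_type} vs {right_type}")
    return left_type

def generate(node):
    """Pass 2: emit instructions for a hypothetical one-register machine."""
    op, left, right = node
    if op == "=":
        return generate(right) + [f"STORE {left}"]
    # op == "+": both operands are plain variables in this sketch.
    return [f"LOAD {left}", f"ADD {right}"]

analyze(TREE)                      # raises if the tree is not valid
print("\n".join(generate(TREE)))
```

Code generation runs only after the first pass has accepted every branch, so the second pass can assume the tree is meaningful.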

Page 44: Chapter 9

Example

Page 45: Chapter 9

Example (cont’d)

Page 46: Chapter 9

Example (cont’d)

Page 47: Chapter 9

Example (cont’d)

Page 48: Chapter 9

Example (cont’d)

Page 49: Chapter 9

Code Optimization

To make the code more efficient:
• Local optimization
• Global optimization

Different from programmer optimization with compiler tools such as:
• Visual development environments
• On-line debuggers
• Reusable code libraries

Page 50: Chapter 9

Local Optimization

Look at a very small block of instructions and try to improve it.

Possible approaches
• Constant evaluation: x=1+1;
• Strength reduction: x=x*2;
• Eliminating unnecessary operations
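The first two approaches can be sketched as rewrites of a single statement of the form target = left op right. The slide shows only the statements before optimization, so the rewritten forms are assumptions: x=1+1 becoming x=2, and x=x*2 becoming x=x+x on the premise that addition is cheaper than multiplication on the target machine. The function name `optimize` is hypothetical.

```python
def optimize(target, left, op, right):
    """Apply two local optimizations to the statement  target = left op right.

    Operands are ints (constants) or strings (variable names).
    Returns the rewritten statement as a string.
    """
    # Constant evaluation: both operands are known at compile time.
    if isinstance(left, int) and isinstance(right, int) and op in ("+", "*"):
        value = left + right if op == "+" else left * right
        return f"{target}={value};"
    # Strength reduction: replace multiplication by 2 with an addition.
    if op == "*" and right == 2 and isinstance(left, str):
        return f"{target}={left}+{left};"
    # No local improvement found: leave the statement unchanged.
    return f"{target}={left}{op}{right};"

print(optimize("x", 1, "+", 1))    # x=2;
print(optimize("x", "x", "*", 2))  # x=x+x;
```

Both rewrites inspect a single statement, which is what makes them local.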

Page 51: Chapter 9

Global Optimization

Look at large segments of the program and decide how to improve performance.

A much harder problem.

