Unit-1 Introduction
PREPARED BY:
PROF. HARISH I RATHOD
COMPUTER ENGINEERING DEPARTMENT
GUJARAT POWER ENGINEERING & RESEARCH INSTITUTE
COMPILER DESIGN (170701)
Introduction
• Programming languages are notations for describing computations to people and to machines.
• The world depends on programming languages because:
  • All the software running on all computers is written in some programming language.
  • Before a program can be run, it must first be translated into a form in which it can be executed by a computer.
  • The software systems that do this translation are called compilers.
GPERI – CD - UNIT-1 2
Language Processors
• Compiler:
  • It is a program that can read a program in one language (the source language) and translate it into an equivalent program in another language (the target language).
  • An important role of the compiler is to report any errors in the source program that it detects during the translation process.
Language Processors
• Compiler:
• If the target program is an executable machine language program, it can then be called by the user to process inputs and produce outputs.
Fig 1: A compiler (source program → Compiler → target program)
Fig 2: Running the target program (input → Target Program → output)
Language Processors
• Interpreter:
  • Instead of producing a target program as a translation, an interpreter appears to directly execute the operations specified in the source program on inputs supplied by the user.
Fig 3: An interpreter (source program and input → Interpreter → output)
Language Processors
• Difference between compiler and interpreter:
  • The machine-language target program produced by a compiler is usually much faster than an interpreter at mapping inputs to outputs.
  • An interpreter can usually give better error diagnostics than a compiler, because it executes the source program statement by statement.
• In addition to a compiler, several other programs may be required to create an executable target program.
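The contrast between the two language processors can be sketched in Python. This is only an illustration under stated assumptions: the "source language" is a single arithmetic expression, Python's standard `ast` module stands in for the analysis phases, and the "target program" is a Python closure rather than machine code. The names `interpret` and `compile_to_target` are hypothetical.

```python
import ast
import operator

# Operators supported by this toy language (an assumption of the sketch).
OPS = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}

def interpret(source, env):
    """Interpreter: execute the source directly on the user's input (env)."""
    def walk(node):
        if isinstance(node, ast.Constant):
            return node.value
        if isinstance(node, ast.Name):
            return env[node.id]
        if isinstance(node, ast.BinOp):
            return OPS[type(node.op)](walk(node.left), walk(node.right))
        raise SyntaxError("unsupported construct")
    return walk(ast.parse(source, mode="eval").body)

def compile_to_target(source):
    """Compiler: translate once into a target program (here a closure)."""
    def gen(node):
        if isinstance(node, ast.Constant):
            value = node.value
            return lambda env: value
        if isinstance(node, ast.Name):
            name = node.id
            return lambda env: env[name]
        if isinstance(node, ast.BinOp):
            op = OPS[type(node.op)]
            left, right = gen(node.left), gen(node.right)
            return lambda env: op(left(env), right(env))
        raise SyntaxError("unsupported construct")
    return gen(ast.parse(source, mode="eval").body)

source = "initial + rate * 60"
target = compile_to_target(source)             # translation happens once
print(target({"initial": 10, "rate": 2}))      # the target program runs on input
print(interpret(source, {"initial": 10, "rate": 2}))
```

The compiled `target` can be called repeatedly without re-reading the source, while `interpret` analyses the source on every call.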
Language Processors
Fig 4: A language-processing system
source program → Preprocessor → modified source program → Compiler → target assembly program → Assembler → relocatable machine code → Linker/Loader → target machine code
(the Linker/Loader also takes library files and relocatable object files as input)
Language Processors
• A source program may be divided into modules stored in separate files.
• The task of collecting the source program is sometimes entrusted to a separate program, called a preprocessor.
• The modified source program is then fed to a compiler.
• The compiler may produce an assembly-language program as its output.
Language Processors
• The assembly language is then processed by a program called an assembler.
• An assembler produces relocatable machine code as its output.
• Large programs are often compiled in pieces, so the relocatable machine code may have to be linked together with other relocatable object files and library files into the code that actually runs on the machine.
Language Processors
• The linker resolves external memory addresses, where the code in one file may refer to a location in another file.
• The loader then puts together all of the executable object files into memory for execution.
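How a linker resolves an external address can be sketched as a two-pass procedure. The object-file layout used here (a dictionary with `code`, `defs` and `refs`) and the instruction names are hypothetical, chosen only to keep the example small; real object formats are far richer.

```python
def link(object_files):
    """Combine relocatable object files into one memory image.

    Assumed (hypothetical) object-file layout:
      "code": list of words, with None where an external address is missing
      "defs": {symbol: offset of its definition inside this file}
      "refs": {index of a missing word: symbol it refers to}
    """
    addresses = {}
    base = 0
    for obj in object_files:          # pass 1: place files, record symbol addresses
        for symbol, offset in obj["defs"].items():
            addresses[symbol] = base + offset
        base += len(obj["code"])

    image = []
    for obj in object_files:          # pass 2: patch every external reference
        code = list(obj["code"])
        for index, symbol in obj["refs"].items():
            if symbol not in addresses:
                raise NameError(f"undefined symbol {symbol}")
            code[index] = addresses[symbol]
        image.extend(code)
    return image

main_obj = {"code": ["CALL", None, "HALT"], "defs": {"main": 0}, "refs": {1: "square"}}
lib_obj = {"code": ["MUL", "RET"], "defs": {"square": 0}, "refs": {}}
image = link([main_obj, lib_obj])
print(image)
```

Here the reference to `square` in the first file is replaced by the address 3, the position where the second file's code was placed.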
Structure of Compiler (Front end and Back end)
• So far we have treated a compiler as a single box that maps a source program into a semantically equivalent target program.
• If we open this box, there are two parts to this mapping:
  • Analysis, and
  • Synthesis.
Structure of Compiler (Front end and Back end)
• Analysis part:
  • Breaks up the source program into constituent pieces and imposes a grammatical structure on them.
  • Then uses this structure to create an intermediate representation of the source program.
  • If this part detects that the source program is either syntactically ill formed or semantically unsound, it must provide informative messages so the user can take corrective action.
Structure of Compiler (Front end and Back end)
• Analysis part:
  • This part also collects information about the source program and stores it in a data structure called a symbol table.
  • Analysis determines the operations implied by the source program, which are recorded in a tree structure.
  • The analysis part is often called the front end of the compiler.
Structure of Compiler (Front end and Back end)
• Synthesis part:
  • Synthesis takes the tree structure and translates the operations therein into the target program.
  • Equivalently, it constructs the target program from the intermediate representation and the information in the symbol table.
  • The synthesis part is the back end.
Analysis of the source program
• Lexical Analysis (Linear Analysis):
  • The source program is read from left to right and grouped into tokens, e.g. constants, variable names, keywords, etc. (check for a valid token set).
Analysis of the source program
• Hierarchical Analysis (Syntax Analysis or Parsing):
  • Tokens are grouped into grammatical phrases and a parse tree is constructed (check for valid syntax).
• Semantic Analysis:
  • Certain checks are performed to ensure that the components of a program fit together meaningfully, i.e. its task is to determine the meaning of the source program (check for semantic errors).
Phases of compiler
Fig: Phases of a compiler
character stream
→ Lexical Analyzer → token stream
→ Syntax Analyzer → syntax tree
→ Semantic Analyzer → syntax tree
→ Intermediate Code Generator → intermediate representation
→ Machine-Independent Code Optimizer → intermediate representation
→ Code Generator → target machine code
→ Machine-Dependent Code Optimizer → target machine code
(the Symbol Table is shared by all phases)
Lexical Analysis
• First phase of a compiler.
• Also called linear analysis or scanning.
• The lexical analyzer reads the stream of characters of the source program and groups the characters into meaningful sequences called lexemes.
• For each lexeme, the lexical analyzer produces a token as output.
• The form of a token is:
(token-name, attribute-value)
Lexical Analysis
• The token is passed to the next phase, syntax analysis.
• In a token:
  • The first component, token-name, is an abstract symbol that is used during syntax analysis.
  • The second component, attribute-value, points to an entry in the symbol table for this token.
Lexical Analysis
• Example:
  • A source program contains the assignment statement
position = initial + rate * 60
• It could be grouped into the following lexemes and mapped into the following tokens.
Lexical Analysis
• position is a lexeme, mapped into the token (id,1), where:
  • id (identifier) is an abstract symbol, and
  • 1 points to the symbol table entry for position.
• The assignment symbol = is a lexeme, mapped into the token (=). This token needs no attribute value, so the second component is omitted.
• initial is a lexeme, mapped into the token (id,2), where:
  • 2 points to the symbol table entry for initial.
Lexical Analysis
• + is a lexeme, mapped into the token (+).
• rate is a lexeme, mapped into the token (id,3), where:
  • 3 points to the symbol table entry for rate.
• * is a lexeme, mapped into the token (*).
• 60 is a lexeme, mapped into the token (60).
The resulting token sequence is:
(id,1) (=) (id,2) (+) (id,3) (*) (60)
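The mapping from lexemes to tokens can be reproduced with a small scanner. This is a minimal sketch, assuming a language with only identifiers, unsigned integers and the operators `=`, `+` and `*`; it uses Python's standard `re` module, and the function name `tokenize` and the tuple representation of tokens are choices made for this example.

```python
import re

# One alternative per token class; leading whitespace is skipped.
TOKEN_RE = re.compile(r"\s*(?:(?P<id>[A-Za-z_]\w*)|(?P<num>\d+)|(?P<op>[=+*]))")

def tokenize(source):
    """Return (tokens, symbol_table) for one statement."""
    symbol_table = []      # entry i (1-based) holds the i-th distinct identifier
    tokens = []
    source = source.rstrip()
    pos = 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise SyntaxError(f"invalid character at position {pos}")
        pos = match.end()
        if match.group("id"):
            name = match.group("id")
            if name not in symbol_table:
                symbol_table.append(name)          # enter new identifiers
            tokens.append(("id", symbol_table.index(name) + 1))
        elif match.group("num"):
            tokens.append((match.group("num"),))   # no attribute value
        else:
            tokens.append((match.group("op"),))    # no attribute value
    return tokens, symbol_table

tokens, table = tokenize("position = initial + rate * 60")
print(tokens)
print(table)
```

The printed token list corresponds to (id,1) (=) (id,2) (+) (id,3) (*) (60), and the symbol table holds position, initial and rate as entries 1, 2 and 3.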
Syntax Analysis (parsing)
• The second phase of a compiler.
• It uses the first components of the tokens produced by the lexical analyzer to create a tree-like intermediate representation, known as a syntax tree, in which:
  • each interior node represents an operation, and
  • the children of the node represent the arguments of the operation.
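A syntax tree for position = initial + rate * 60 can be built with a small recursive-descent parser. This is a sketch under assumptions: tokens are written as Python tuples such as `("id", 1)` and `("+",)`, the grammar covers only one assignment with `+` and `*`, and `*` binds tighter than `+`. Interior nodes are tuples `(operator, left, right)`.

```python
def parse(tokens):
    """Parse  id = expr,  expr -> term (+ term)*,  term -> factor (* factor)*."""
    pos = 0

    def peek():
        return tokens[pos][0] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def factor():
        if peek() is None:
            raise SyntaxError("unexpected end of input")
        tok = advance()
        if tok[0] == "id" or tok[0].isdigit():
            return tok                       # leaf: identifier or number
        raise SyntaxError(f"unexpected token {tok}")

    def term():
        node = factor()
        while peek() == "*":
            advance()
            node = ("*", node, factor())     # interior node for multiplication
        return node

    def expr():
        node = term()
        while peek() == "+":
            advance()
            node = ("+", node, term())       # interior node for addition
        return node

    target = advance()
    if target[0] != "id" or peek() != "=":
        raise SyntaxError("expected: id = expression")
    advance()
    tree = ("=", target, expr())
    if pos != len(tokens):
        raise SyntaxError("trailing tokens")
    return tree

tokens = [("id", 1), ("=",), ("id", 2), ("+",), ("id", 3), ("*",), ("60",)]
tree = parse(tokens)
print(tree)
```

The multiplication node ends up below the addition node, so rate * 60 is evaluated before the sum, as the usual precedence rules require.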
Semantic Analysis
• Uses the syntax tree and the information in the symbol table to check the source program for semantic consistency.
• It also gathers type information and saves it in either the syntax tree or the symbol table for use in the next phase.
• An important task is type checking, where the compiler checks that each operator has matching operands.
Semantic Analysis
• For example:
  • Many programming languages require an array index to be an integer.
  • The compiler must report an error if a floating-point number is used to index an array.
• The language specification may also permit some type conversions (coercions).
  • For example, a binary arithmetic operator may be applied to either a pair of integers or a pair of floating-point numbers. If it is applied to a floating-point number and an integer, the compiler may convert the integer into a floating-point number.
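Type checking with an inserted conversion can be sketched as a walk over the syntax tree. The assumptions here are loud ones: the tree is written as nested Python tuples, all three identifiers are taken to be declared as float (the slides do not state their types), and the conversion operator is named `inttofloat`.

```python
def check(node, types):
    """Return (typed_tree, type); wrap int operands of mixed arithmetic in inttofloat.

    types maps a symbol-table entry number to "int" or "float" (assumed input).
    """
    if node[0] == "id":
        return node, types[node[1]]
    if len(node) == 1:                       # number leaf such as ("60",)
        return node, "int"
    op, left, right = node
    left, ltype = check(left, types)
    right, rtype = check(right, types)
    if op == "=":
        if ltype == "float" and rtype == "int":
            right = ("inttofloat", right)    # widen the value being assigned
        elif ltype != rtype:
            raise TypeError(f"cannot assign {rtype} to {ltype}")
        return (op, left, right), ltype
    if ltype != rtype:                       # one int operand, one float operand
        if ltype == "int":
            left = ("inttofloat", left)
        else:
            right = ("inttofloat", right)
        return (op, left, right), "float"
    return (op, left, right), ltype

tree = ("=", ("id", 1), ("+", ("id", 2), ("*", ("id", 3), ("60",))))
types = {1: "float", 2: "float", 3: "float"}     # assumption: all declared float
typed_tree, result_type = check(tree, types)
print(typed_tree)
```

Only the integer 60 is wrapped, because it is the one operand whose type differs from its sibling's.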
Intermediate Code Generation
• During the process of translation, a compiler may construct one or more intermediate representations.
• Syntax trees are a form of intermediate representation; they are commonly used during syntax and semantic analysis.
• After syntax and semantic analysis of the source program, many compilers generate an explicit low-level or machine-like intermediate representation.
Intermediate Code Generation
• This intermediate representation should have two important properties:
  • It should be easy to produce.
  • It should be easy to translate into the target machine.
• We consider an intermediate form called three-address code, which consists of a sequence of assembly-like instructions with three operands per instruction.
Intermediate Code Generation
• Each operand can act like a register.
• For position = initial + rate * 60, the three-address code is:
t1 = inttofloat(60)
t2 = id3 * t1
t3 = id2 + t2
id1 = t3
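Three-address code of this shape can be produced by a post-order walk of the typed syntax tree, creating one temporary per operation. The sketch assumes the tree is given as nested Python tuples and that temporaries are named t1, t2, and so on; `gen_tac` is a hypothetical name.

```python
def gen_tac(tree):
    """Translate a typed syntax tree for one assignment into three-address code."""
    code = []
    counter = 0

    def new_temp():
        nonlocal counter
        counter += 1
        return f"t{counter}"

    def operand(node):
        if node[0] == "id":
            return f"id{node[1]}"            # refer to the symbol-table entry
        if len(node) == 1:
            return node[0]                   # number leaf
        if node[0] == "inttofloat":
            source = operand(node[1])
            temp = new_temp()
            code.append(f"{temp} = inttofloat({source})")
            return temp
        op, left, right = node
        left_name = operand(left)            # operands first (post-order)
        right_name = operand(right)
        temp = new_temp()
        code.append(f"{temp} = {left_name} {op} {right_name}")
        return temp

    _, target, value = tree
    result = operand(value)
    code.append(f"{operand(target)} = {result}")
    return code

typed_tree = ("=", ("id", 1),
              ("+", ("id", 2), ("*", ("id", 3), ("inttofloat", ("60",)))))
tac = gen_tac(typed_tree)
print("\n".join(tac))
```

Each instruction has at most one operator on its right-hand side, which is what makes the form easy to translate further.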
Code optimization
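• Attempts to improve the intermediate code so that better target code will result.
• For example, the conversion of 60 can be done once at compile time, and t3 is used only to pass its value to id1, so the code shortens to:
t1 = id3 * 60.0
id1 = id2 + t1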
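The two improvements used in this example, folding the conversion of a constant and removing a copy through a temporary, can be sketched as a single pass. The tuple encoding `(dest, op, arg1, arg2)` is an assumption of this example, and so is the premise that a temporary copied into a variable is not used again afterwards; a real optimizer would verify that with data-flow analysis.

```python
def optimize(code):
    """code: list of (dest, op, arg1, arg2); 'inttofloat' and 'copy' use arg1 only."""
    constants = {}                      # temporary -> folded constant text
    out = []
    for dest, op, a, b in code:
        a = constants.get(a, a)         # substitute already-folded constants
        b = constants.get(b, b)
        if op == "inttofloat" and a.isdigit():
            constants[dest] = f"{float(a)}"        # fold at compile time: 60 -> 60.0
        elif op == "copy" and out and out[-1][0] == a and a.startswith("t"):
            prev = out.pop()                        # t3 = x op y; id1 = t3
            out.append((dest,) + prev[1:])          # becomes id1 = x op y
        else:
            out.append((dest, op, a, b))
    return out

code = [
    ("t1", "inttofloat", "60", None),
    ("t2", "*", "id3", "t1"),
    ("t3", "+", "id2", "t2"),
    ("id1", "copy", "t3", None),
]
optimized = optimize(code)
for dest, op, a, b in optimized:
    print(f"{dest} = {a} {op} {b}")
```

The result has the same two instructions as the slide, except that the surviving temporary keeps the name t2 because this sketch does not renumber temporaries.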
Code Generation
• Final phase of the compiler; it generates the target code.
• Memory locations or registers are selected for each variable used by the program.
• Intermediate instructions are translated into sequences of machine instructions that perform the same task.
• For example, using registers R1 and R2:
Code Generation
LDF R2, id3
MULF R2, R2, #60.0
LDF R1, id2
ADDF R1, R1, R2
STF id1, R1
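The listing above can be reproduced by a very small code generator. This is a sketch that handles only the instruction shape of this example: the first operand is a variable in memory, and the second is either a floating-point constant or a temporary already held in a register. The register order R2, R1 is chosen only to match the listing, registers are never freed, and the tuple encoding of the intermediate code is an assumption.

```python
OPCODES = {"*": "MULF", "+": "ADDF"}

def generate(code, registers=("R2", "R1")):
    """Translate (dest, op, arg1, arg2) instructions into target instructions."""
    free = list(registers)
    in_register = {}                   # temporary -> register that holds it
    asm = []
    for dest, op, a, b in code:
        reg = free.pop(0)              # naive allocation: next unused register
        asm.append(f"LDF {reg}, {a}")  # load the first operand from memory
        if b in in_register:
            src = in_register[b]       # operand computed earlier
        elif b[0].isdigit():
            src = f"#{b}"              # immediate constant
        else:
            raise NotImplementedError("second operand must be a temporary or a constant")
        asm.append(f"{OPCODES[op]} {reg}, {reg}, {src}")
        if dest.startswith("t"):
            in_register[dest] = reg    # keep temporaries in registers
        else:
            asm.append(f"STF {dest}, {reg}")   # store variables back to memory
    return asm

code = [("t1", "*", "id3", "60.0"), ("id1", "+", "id2", "t1")]
asm = generate(code)
print("\n".join(asm))
```

A production code generator would also decide when registers can be reused and when values must be spilled to memory, which this sketch leaves out.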
Symbol Table Management
• It is a data structure that contains a record for each identifier, with fields for the attributes of the identifier.
• When an identifier is recognized by the scanner (lexical analyzer), it is entered into the symbol table.
• An essential function of a compiler is to record the identifiers with their attributes (type, scope, storage location, etc.).
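A symbol table of this kind can be sketched as a list of records plus a name index. The class and method names are hypothetical, and the attribute set is left open so that later phases can add fields such as the type.

```python
class SymbolTable:
    """One record per identifier; entries are numbered from 1."""

    def __init__(self):
        self._index = {}       # name -> entry number
        self._records = []     # attribute record for each entry

    def insert(self, name, **attributes):
        """Enter name if it is new, merge attributes, and return its entry number."""
        if name not in self._index:
            self._records.append({"name": name})
            self._index[name] = len(self._records)
        self._records[self._index[name] - 1].update(attributes)
        return self._index[name]

    def lookup(self, entry):
        """Return the record stored at the given entry number."""
        return self._records[entry - 1]

table = SymbolTable()
for lexeme in ["position", "initial", "rate"]:      # entered by the scanner
    table.insert(lexeme)
table.insert("position", type="float")              # added by a later phase
print(table.lookup(1))
```

Returning the entry number from `insert` is what lets a token such as (id,1) point back at the record for position.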
The grouping of phases
• Compiler front and back ends:
• Front end: analysis (machine independent):
  • It consists of those phases, or parts of phases, that depend primarily on the source language and are largely independent of the target machine.
The grouping of phases
• Compiler front and back ends:
• Back end: synthesis (machine dependent):
  • It includes those portions of the compiler that depend on the target machine and, generally, do not depend on the source language.
The grouping of phases
• Advantage of the analysis-synthesis concept:
  • One can take the front end of a compiler and redo its associated back end to produce a compiler for the same source language on a different machine.
  • If the back end is designed carefully, it may not even be necessary to redesign too much of the back end.
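The retargeting idea can be sketched by feeding one intermediate representation to two interchangeable back ends. Both target machines here are hypothetical (a stack machine and a three-operand machine), as are the mnemonics and the tuple encoding of the intermediate code.

```python
# Output of the (shared) front end: machine-independent intermediate code.
ir = [("t1", "*", "id3", "60.0"), ("id1", "+", "id2", "t1")]

def stack_backend(code):
    """Hypothetical back end for a stack machine."""
    names = {"*": "MUL", "+": "ADD"}
    out = []
    for dest, op, a, b in code:
        out += [f"PUSH {a}", f"PUSH {b}", names[op], f"POP {dest}"]
    return out

def three_operand_backend(code):
    """Hypothetical back end for a machine with three-operand instructions."""
    names = {"*": "MULF", "+": "ADDF"}
    return [f"{names[op]} {dest}, {a}, {b}" for dest, op, a, b in code]

BACK_ENDS = {"stack": stack_backend, "three-operand": three_operand_backend}

def compile_ir(code, machine):
    """Select the back end for the requested machine; the front end is untouched."""
    return BACK_ENDS[machine](code)

print(compile_ir(ir, "stack"))
print(compile_ir(ir, "three-operand"))
```

Supporting a further machine means adding one more entry to `BACK_ENDS`, while everything that produced `ir` stays the same.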
Compiler Construction Tools
• Software development tools are available to implement one or more compiler phases:
  • Scanner generators
  • Parser generators
  • Syntax-directed translation engines
  • Automatic code generators
  • Data-flow engines