Date post: | 03-Jan-2016 |
D. M. Akbar Hussain: Department of Software & Media Technology
Introduction
A compiler is a tool that translates a program from one notation to another, usually from source code (high-level code) to machine code (object code, target code, low-level code).
Compiler
[Figure: the Compiler takes Source Code as input and produces Target Code and Error Messages]
What is Involved
• Programming Languages
• Components of a Compiler
• Applications
• Formal Languages
Programming Languages
• We use natural languages to communicate
• We use programming languages to speak with computers
Components of a Compiler
• Analysis
  – Lexical Analysis
  – Syntax Analysis
  – Semantic Analysis
• Synthesis
  – Intermediate Code Generation
  – Code Optimization
  – Code Generation
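To make the phase list concrete, here is a minimal Python sketch of a toy pipeline for integer expressions with + and *. It is an illustration only, not the structure of any particular compiler: the function names, the tuple-based tree, and the PUSH/LOAD/ADD/MUL stack instructions are all assumptions made for this example, semantic analysis is left out, and the input is assumed to be well formed.

```python
import re

def lex(src):
    """Lexical analysis: characters -> list of token strings."""
    return re.findall(r"\d+|[A-Za-z_]\w*|[+*()]", src)

def parse(tokens):
    """Syntax analysis: tokens -> tree for expr := term {'+' term}."""
    pos = 0

    def factor():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            node = expr()
            pos += 1  # skip the closing ")"
            return node
        return ("num", int(tok)) if tok.isdigit() else ("var", tok)

    def term():
        nonlocal pos
        node = factor()
        while pos < len(tokens) and tokens[pos] == "*":
            pos += 1
            node = ("*", node, factor())
        return node

    def expr():
        nonlocal pos
        node = term()
        while pos < len(tokens) and tokens[pos] == "+":
            pos += 1
            node = ("+", node, term())
        return node

    return expr()

def optimize(node):
    """Code optimization: fold operators whose operands are both constants."""
    if node[0] not in ("+", "*"):
        return node
    op, left, right = node[0], optimize(node[1]), optimize(node[2])
    if left[0] == "num" and right[0] == "num":
        value = left[1] + right[1] if op == "+" else left[1] * right[1]
        return ("num", value)
    return (op, left, right)

def generate(node):
    """Code generation: tree -> instructions for a hypothetical stack machine."""
    if node[0] == "num":
        return [f"PUSH {node[1]}"]
    if node[0] == "var":
        return [f"LOAD {node[1]}"]
    operator = "ADD" if node[0] == "+" else "MUL"
    return generate(node[1]) + generate(node[2]) + [operator]

def compile_expr(src):
    """Run the phases in order: analysis first, then synthesis."""
    return generate(optimize(parse(lex(src))))
```

For example, compile_expr("x + 2 * 3") folds 2 * 3 during optimization and yields the three instructions LOAD x, PUSH 6, ADD.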
The Input
Read string input
– Sequence of characters
– Character set:
  • ASCII
  • ISO Latin-1
  • ISO 10646 (Unicode; handled as 16-bit characters in Ada and Java)
  • Others (EBCDIC, JIS, etc.)
The Output
Tokens: kind, name
– Punctuation ( ) ; , [ ]
– Operators + - ** :=
– Keywords begin end if while try catch
– Identifiers Square_Root
– String literals “press Enter to continue”
– Character literals ‘x’
– Numeric literals
  • Integer
  • Floating_point
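The token kinds listed above can be sketched as a small regular-expression scanner in Python. This is a minimal sketch under stated assumptions: the kind names (FLOAT, IDENT, and so on), the keyword set, and the exact literal syntax are chosen for this example, and string and character literals are assumed to use plain ASCII quotes.

```python
import re

# Assumed keyword set, taken from the examples on the slide.
KEYWORDS = {"begin", "end", "if", "while", "try", "catch"}

# Order matters: FLOAT is tried before INTEGER, and ERROR comes last.
TOKEN_SPEC = [
    ("FLOAT",   r"\d+\.\d+"),
    ("INTEGER", r"\d+"),
    ("STRING",  r'"[^"\n]*"'),
    ("CHAR",    r"'[^'\n]'"),
    ("IDENT",   r"[A-Za-z_]\w*"),
    ("OP",      r"\*\*|:=|[+\-]"),
    ("PUNCT",   r"[()\[\];,]"),
    ("SKIP",    r"\s+"),
    ("ERROR",   r"."),
]
MASTER = re.compile("|".join(f"(?P<{kind}>{pattern})" for kind, pattern in TOKEN_SPEC))

def tokenize(text):
    """Return a list of (kind, name) pairs for the given source text."""
    tokens = []
    for match in MASTER.finditer(text):
        kind, name = match.lastgroup, match.group()
        if kind == "SKIP":
            continue  # white space separates tokens but is not a token
        if kind == "ERROR":
            raise ValueError(f"unexpected character {name!r}")
        if kind == "IDENT" and name in KEYWORDS:
            kind = "KEYWORD"  # keywords look like identifiers until checked
        tokens.append((kind, name))
    return tokens
```

Each token carries the two fields named on the slide: its kind and its name (the matched text). Recognizing keywords by first scanning an identifier and then looking it up in a set keeps the pattern list short.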
Lexical Analysis
Free form languages
– Most modern languages are free form (indentation-sensitive languages such as Python are an exception)
– White space, tabs, newlines, and carriage returns do not matter; just ignore them
– Only the ordering of tokens is important
Fixed format languages
– Layout is critical
  • Fortran: statement label in cols 1-5, continuation mark in col 6
  • COBOL: areas A and B
  • Lexical analyzer must know about layout to find tokens
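As an illustration of why a fixed-format scanner must know the layout, the following Python sketch splits one fixed-form Fortran line by column position before any tokens are found. It assumes the classic fixed-form layout (C or * in column 1 marks a comment, statement label in columns 1-5, continuation mark in column 6, statement text in columns 7-72). The function name and the returned dictionary are hypothetical.

```python
def split_fixed_form(line):
    """Split one fixed-form Fortran source line into its column fields."""
    padded = line.rstrip("\n").ljust(72)
    if padded[0] in "Cc*":
        # Comment line: the rest of the line is ignored by the compiler.
        return {"comment": padded[1:].strip()}
    return {
        "label": padded[0:5].strip(),            # cols 1-5
        "continuation": padded[5] not in " 0",   # col 6: anything but blank or 0
        "statement": padded[6:72].strip(),       # cols 7-72
    }
```

For a line such as "  100 X = X + 1" the sketch reports label 100, no continuation, and the statement text X = X + 1. A free-form scanner, by contrast, never needs to look at column numbers.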
Relevant Formalisms
• Type 3 (Regular) Grammars
• Regular Expressions
• Finite State Machines
Useful for program construction, even if hand-written
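A hand-written finite state machine can be very small. The Python sketch below recognizes identifiers that correspond to the regular expression [A-Za-z_][A-Za-z0-9_]*. The state names and the table layout are assumptions made for this example.

```python
def char_class(ch):
    """Map a character to the input class used by the transition table."""
    if (ch.isascii() and ch.isalpha()) or ch == "_":
        return "letter"
    if ch.isascii() and ch.isdigit():
        return "digit"
    return "other"

# Transition table: state -> character class -> next state.
# A missing entry means the machine rejects the input.
TRANSITIONS = {
    "start": {"letter": "in_id"},
    "in_id": {"letter": "in_id", "digit": "in_id"},
}
ACCEPTING = {"in_id"}

def is_identifier(text):
    """Run the machine over text and report whether it ends in an accepting state."""
    state = "start"
    for ch in text:
        state = TRANSITIONS[state].get(char_class(ch))
        if state is None:
            return False
    return state in ACCEPTING
```

Keeping the transitions in a table separates the formalism from the driver loop, so adding a token kind means adding states rather than rewriting control flow.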
Interface to Lexical Analyzer
Either: Convert entire file to a file of tokens
– Lexical analyzer is a separate phase
Or: Parser calls lexical analyzer to supply next token
– This approach avoids extra I/O
– Parser builds tree incrementally, using successive tokens as tree nodes
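The second interface, where the parser asks for one token at a time, can be sketched in Python with a generator. This is a minimal sketch that assumes well-formed input consisting of integers separated by +. The names tokens and parse_sum and the tuple-based tree are hypothetical.

```python
import re

def tokens(text):
    """Lexical analyzer as a generator: produces a token only when asked."""
    for match in re.finditer(r"\d+|[+]", text):
        yield match.group()

def parse_sum(text):
    """Parser for sum := number {'+' number} that pulls tokens on demand."""
    stream = tokens(text)
    tree = ("num", int(next(stream)))
    for operator in stream:        # each iteration requests the next token
        operand = next(stream)     # assumes a number follows every '+'
        tree = (operator, tree, ("num", int(operand)))
    return tree
```

No intermediate token file is written: each token is consumed as soon as it is produced and becomes part of the growing tree.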
Performance Issues
Speed
– Lexical analysis can become a bottleneck
– Minimize processing per character
  • Skip blanks fast
  • I/O is also an issue (read large blocks)
– We compile frequently
  • Compilation time is important, especially during development
– Communicate with parser through global variables
Syntax Analysis (SA)
Semantic Analyzer
Intermediate Code Generation
Code Optimization
Data structure tools
• Syntax tree
• Literal table
• Symbol table
Error handler
One of the most difficult parts of a compiler.
• Must handle a wide range of errors
• Must handle multiple errors
• Must not get stuck
• Must not get into an infinite loop (typical simple-minded strategy: count errors, stop if the count gets too high)
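The count-and-stop strategy mentioned above can be sketched in a few lines of Python. The checker below is a stand-in, not a real compiler pass: it only reports statements that lack a trailing semicolon, and the limit of 3, the function name, and the message wording are all assumptions for illustration.

```python
MAX_ERRORS = 3  # hypothetical limit; real compilers choose their own

def check_statements(statements, max_errors=MAX_ERRORS):
    """Report every statement lacking ';' but give up after max_errors."""
    errors = []
    for line_no, stmt in enumerate(statements, start=1):
        if not stmt.rstrip().endswith(";"):
            errors.append(f"line {line_no}: missing ';'")
            if len(errors) >= max_errors:
                errors.append("too many errors, stopping")
                break
        # The loop always advances to the next statement after an error,
        # so one bad statement cannot stall the checker.
    return errors
```

Reporting an error and then moving on covers the "multiple errors" requirement, while the counter bounds the total work so the handler terminates even on badly damaged input.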
Sample compiler
TINY: pages 22-26, a 4-pass compiler for the TINY language, which is based on Pascal.
C-Minus: pages 26-27 and Appendix A, a language based on C.
TINY Example
read x;
if 0 < x then
  fact := 1;
  repeat
    fact := fact * x;
    x := x - 1
  until x = 0;
  write fact
end
C-Minus Example
int fact( int x )
{
  if (x > 1)
    return x * fact(x-1);
  else
    return 1;
}

void main( void )
{
  int x;
  x = read();
  if (x > 0)
    write( fact(x) );
}
Structure of the TINY Compiler
globals.h main.c
util.h util.c
scan.h scan.c
parse.h parse.c
symtab.h symtab.c
analyze.h analyze.c
code.h code.c
cgen.h cgen.c
Conditional Compilation Options
NO_PARSE: Builds a scanner-only compiler.
NO_ANALYZE: Builds a compiler that parses and scans only.
NO_CODE: Builds a compiler that performs semantic analysis, but generates no code.
Listing Options (built in - not flags)
EchoSource: Echoes the TINY source program to the listing, together with line numbers.
TraceScan: Displays information on each token as the scanner recognizes it.
TraceParse: Displays the syntax tree in a linearized format.
TraceAnalyze: Displays summary information on the symbol table and type checking.
TraceCode: Prints code generation-tracing comments to the code file.