Compiler Ch1

Date post: 13-Nov-2014
Upload: api-3712520
Transcript
Page 1: Compiler Ch1

Chapter 1

CSE309N

Chapter 1: Introduction to Compiling

Page 2: Compiler Ch1

Introduction to Compilers

As a Discipline, Involves Multiple CS&E Areas:

– Programming Languages and Algorithms

– Theory of Computing & Software Engineering

– Computer Architecture & Operating Systems

Has Deceptively Simple Intent:

Source program → Compiler → Target Program

Compiler → Error messages

Source and target programs are Diverse & Varied

Page 3: Compiler Ch1

Classifications of Compilers

Compilers Viewed from Many Perspectives

Construction:

– Single Pass

– Multiple Pass

– Load & Go

Functional:

– Debugging

– Optimizing

However, all utilize the same basic tasks to accomplish their actions

Page 4: Compiler Ch1

The Model

The TWO Fundamental Parts:

Analysis: Decompose the source into an intermediate representation

Synthesis: Generate the target program from that representation

We will discuss both in this class, and FOCUS on analysis.

Page 5: Compiler Ch1

Important Notes

Today: There are many software tools for helping with the analysis part. This wasn’t the case in the early days.

(Some) analysis is also important in:

Structure / syntax-directed editors: force “syntactically” correct code to be entered

Takes input as a sequence of commands to build a source program.

Performs:

– Text creation

– Text modification

– Analysis of the source program

Page 6: Compiler Ch1

Important Notes (Continued)

Pretty Printers: Standardized version for program structure (i.e., blank space, indenting, etc.). Analyzes the source program and prints it in such a way that the structure of the program becomes clearly visible.

Examples:

– Comments may appear in a special font

– Statements may appear with an amount of indentation proportional to the depth of their nesting in the hierarchical organization of the statements

Static Checkers: A “quick” compilation to detect rudimentary errors

Examples:

– Detects parts of the program that can never be executed

– Detects a variable used before it is defined

Interpreters: “real”-time execution of code, a “line at a time”

Page 7: Compiler Ch1

Important Notes (Continued)

Compilation Is Not Limited to Programming Language Applications

Text Formatters

– LATEX & TROFF are languages whose commands format text (paragraphs, figures, mathematical structures, etc.)

Silicon Compilers

– Textual / graphical: take input and generate a circuit design

Database Query Processors

– Database query languages are also programming languages

– Input is compiled into a set of operations for accessing the database

Page 8: Compiler Ch1

The Many Phases of a Compiler

Source Program

1. Lexical Analyzer

2. Syntax Analyzer

3. Semantic Analyzer

4. Intermediate Code Generator

5. Code Optimizer

6. Code Generator

Target Program

Alongside the phases: Symbol-table Manager, Error Handler

Page 9: Compiler Ch1

Language-Processing System

Skeleton Source Program

→ Pre-Processor

→ Source program

→ Compiler

→ Target Assembly program

→ Assembler

→ Relocatable Machine Code

→ Loader / Link-Editor (also takes: library, relocatable object files)

→ Executable

Page 10: Compiler Ch1

The Analysis Task For Compilation

Three Phases:

Linear / Lexical Analysis:

– L-to-R scan to identify tokens

– Token: sequence of chars having a collective meaning

Hierarchical Analysis:

– Grouping of tokens into meaningful collections

Semantic Analysis:

– Checking to ensure correctness of components

Page 11: Compiler Ch1

Phase 1. Lexical Analysis

Easiest Analysis - Identify tokens, which are the basic building blocks

For Example:

Position := initial + rate * 60 ;

All are tokens: Position, :=, initial, +, rate, *, 60, ;

Blanks, line breaks, etc. are scanned out
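The left-to-right scan can be sketched in a few lines of Python. This is only an illustration for the one example statement: the token class names (`ID`, `ASSIGN`, and so on) and the `tokenize` function are assumptions made here, not part of the course material.

```python
import re

# Token classes for the toy assignment language (names are illustrative).
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("ID",     r"[A-Za-z_]\w*"),
    ("ASSIGN", r":="),
    ("PLUS",   r"\+"),
    ("TIMES",  r"\*"),
    ("SEMI",   r";"),
    ("SKIP",   r"\s+"),   # blanks and line breaks are scanned out
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def tokenize(text):
    """Scan left to right and return a list of (kind, lexeme) pairs."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at {pos}")
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


tokens = tokenize("Position := initial + rate * 60 ;")
```

Here `tokens` holds eight pairs, one per token of the example, and the blanks between them have been discarded.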

Page 12: Compiler Ch1

Phase 2. Hierarchical Analysis

Parsing or Syntax Analysis

For the previous example, we would have the Parse Tree:

assignment statement
    identifier: position
    :=
    expression
        expression
            identifier: initial
        +
        expression
            expression
                identifier: rate
            *
            expression
                number: 60

Nodes of the tree are constructed using a grammar for the language

Page 13: Compiler Ch1

What is a Grammar?

A grammar is a set of rules which govern the interdependencies & structure among the tokens

statement is an
    assignment statement, or
    while statement, or
    if statement, or ...

assignment statement is an
    identifier := expression ;

expression is an
    (expression), or
    expression + expression, or
    expression * expression, or
    number, or
    identifier, or ...
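A minimal recursive-descent sketch of these rules in Python follows. The grammar as written does not say which operator binds tighter, so the sketch assumes the usual convention (`*` before `+`, both left-associative). The token kinds, the `LPAREN`/`RPAREN` names, and the nested-tuple tree shape are illustrative choices, not taken from the slides.

```python
def parse_assignment(tokens):
    """Parse 'identifier := expression ;' and return a nested-tuple tree."""
    pos = 0

    def peek():
        return tokens[pos][0] if pos < len(tokens) else "EOF"

    def take(kind):
        nonlocal pos
        if peek() != kind:
            raise SyntaxError(f"expected {kind}, found {peek()}")
        lexeme = tokens[pos][1]
        pos += 1
        return lexeme

    def factor():
        # (expression), number, or identifier
        if peek() == "LPAREN":
            take("LPAREN")
            node = expression()   # recursion: expressions nest inside expressions
            take("RPAREN")
            return node
        if peek() == "NUMBER":
            return ("number", take("NUMBER"))
        return ("identifier", take("ID"))

    def term():
        # expression * expression (assumed to bind tighter than +)
        node = factor()
        while peek() == "TIMES":
            take("TIMES")
            node = ("*", node, factor())
        return node

    def expression():
        # expression + expression
        node = term()
        while peek() == "PLUS":
            take("PLUS")
            node = ("+", node, term())
        return node

    target = ("identifier", take("ID"))
    take("ASSIGN")
    value = expression()
    take("SEMI")
    if peek() != "EOF":
        raise SyntaxError("trailing input after ';'")
    return (":=", target, value)


tokens = [("ID", "position"), ("ASSIGN", ":="), ("ID", "initial"), ("PLUS", "+"),
          ("ID", "rate"), ("TIMES", "*"), ("NUMBER", "60"), ("SEMI", ";")]
tree = parse_assignment(tokens)
```

For the running example, `rate * 60` ends up as the right operand of `+`, which matches the shape of the parse tree for `position := initial + rate * 60 ;`.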

Page 14: Compiler Ch1

Why Have We Divided Analysis in This Manner?

Lexical Analysis - scans input; its linear actions are not recursive

– Identify only individual “words” that are the tokens of the language

Recursion is required to identify the structure of an expression, as indicated in the parse tree

– Verify that the “words” are correctly assembled into “sentences”

What is the third phase?

– Determine whether the sentences have one and only one unambiguous interpretation … and do something about it!

– e.g. “John Took Picture of Mary Out on the Patio”

Page 15: Compiler Ch1

Phase 3. Semantic Analysis

Find more complicated semantic errors and support code generation

Parse tree is augmented with semantic actions

Compressed Tree:

:=
    position
    +
        initial
        *
            rate
            60

Conversion Action:

:=
    position
    +
        initial
        *
            rate
            inttoreal
                60

Page 16: Compiler Ch1

Phase 3. Semantic Analysis

Most Important Activity in This Phase:

Type Checking - Legality of Operands

Many Different Situations:

Real := int + char ;

A[int] := A[real] + int ;

while char <> int do

…. Etc.
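Type checking over the compressed tree might be sketched as below. The sketch assumes only two arithmetic types (`int` and `real`), assumes that every number literal is an integer, and takes the declared types from a plain dictionary. The function name `check` and the tuple layout are illustrative, not from the slides.

```python
def check(node, types):
    """Return (typed_node, type), inserting inttoreal where int meets real."""
    kind = node[0]
    if kind == "number":
        return node, "int"            # assumption: literals are integers
    if kind == "identifier":
        if node[1] not in types:
            raise TypeError(f"undeclared identifier {node[1]!r}")
        return node, types[node[1]]

    op, left, right = node
    left, left_type = check(left, types)
    right, right_type = check(right, types)

    if op == ":=":
        if left_type == "real" and right_type == "int":
            right = ("inttoreal", right)
        elif left_type != right_type:
            raise TypeError(f"cannot assign {right_type} to {left_type}")
        return (op, left, right), left_type

    # Arithmetic operators + and *: operands must be int or real.
    for operand_type in (left_type, right_type):
        if operand_type not in ("int", "real"):
            raise TypeError(f"operator {op} does not accept {operand_type}")
    if left_type == right_type:
        return (op, left, right), left_type
    if left_type == "int":
        left = ("inttoreal", left)
    else:
        right = ("inttoreal", right)
    return (op, left, right), "real"


declared = {"position": "real", "initial": "real", "rate": "real"}
tree = (":=", ("identifier", "position"),
        ("+", ("identifier", "initial"),
         ("*", ("identifier", "rate"), ("number", "60"))))
typed_tree, result_type = check(tree, declared)
```

With all three identifiers declared `real`, the only change to the tree is that `60` becomes `inttoreal(60)`. A statement such as `Real := int + char` is rejected with a `TypeError` under these assumptions.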

Page 17: Compiler Ch1

Supporting Phases / Activities for Analysis

Symbol Table Creation / Maintenance

– Contains info (storage, type, scope, args) on each “meaningful” token, typically identifiers

– Data structure created / initialized during lexical analysis

– Utilized / updated during later analysis & synthesis

Error Handling

– Detection of different errors, which correspond to all phases

– What kinds of errors are found during the analysis phase?

– What happens when an error is found?
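One way to picture the symbol table is a small class that the lexical analyzer fills in and that later phases update. The class, its method names, and the record fields below are hypothetical, chosen only to mirror the attributes listed above.

```python
class SymbolTable:
    """Maps each identifier to a record of attributes and a stable index."""

    def __init__(self):
        self._index = {}      # name -> 1-based position in _records
        self._records = []

    def enter(self, name):
        """Used during lexical analysis: add the name once, return its id."""
        if name not in self._index:
            self._records.append({"name": name, "type": None, "scope": None})
            self._index[name] = len(self._records)
        return self._index[name]

    def set_attr(self, name, key, value):
        """Used by later phases to record type, scope, storage, and so on."""
        self._records[self._index[name] - 1][key] = value

    def lookup(self, name):
        """Return the attribute record for a name that was entered earlier."""
        return self._records[self._index[name] - 1]


table = SymbolTable()
ids = [table.enter(name) for name in ("position", "initial", "rate", "initial")]
table.set_attr("rate", "type", "real")
```

Entering `position`, `initial`, and `rate` in that order yields the ids 1, 2, and 3, and entering `initial` a second time returns the id it already has.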

Page 18: Compiler Ch1

The Many Phases of a Compiler

Source Program

1. Lexical Analyzer

2. Syntax Analyzer

3. Semantic Analyzer

4. Intermediate Code Generator

5. Code Optimizer

6. Code Generator

Target Program

Alongside the phases: Symbol-table Manager, Error Handler

Page 19: Compiler Ch1

The Synthesis Task For Compilation

Intermediate Code Generation

– Abstract machine version of code, independent of architecture

– Easy to produce and easy to translate into the target program

Code Optimization

– Find more efficient ways to execute code

– Replace code with more optimal statements

Final Code Generation

– Generate relocatable machine-dependent code

Page 20: Compiler Ch1

Reviewing the Entire Process

Source:

position := initial + rate * 60

After the lexical analyzer:

id1 := id2 + id3 * 60

After the syntax analyzer:

:=
    id1
    +
        id2
        *
            id3
            60

After the semantic analyzer:

:=
    id1
    +
        id2
        *
            id3
            inttoreal
                60

Next: intermediate code generator

Symbol Table:

position ....

initial ....

rate ....

Errors

Page 21: Compiler Ch1

Reviewing the Entire Process

After the intermediate code generator (3 address code):

    temp1 := inttoreal(60)
    temp2 := id3 * temp1
    temp3 := id2 + temp2
    id1 := temp3

After the code optimizer:

    temp1 := id3 * 60.0
    id1 := id2 + temp1

After the final code generator:

    MOVF id3, R2
    MULF #60.0, R2
    MOVF id2, R1
    ADDF R2, R1
    MOVF R1, id1

Symbol Table:

position ....

initial ....

rate ....

Errors
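The intermediate code step can be sketched as a tree walk that emits one instruction per operator and stores each result in a fresh temporary. The function name `three_address` and the tuple encoding of the tree are assumptions made for this illustration. The optimizer and final code generator are not covered by the sketch.

```python
def three_address(tree):
    """Emit one instruction per operator, using fresh temporaries."""
    code = []
    counter = 0

    def new_temp():
        nonlocal counter
        counter += 1
        return f"temp{counter}"

    def walk(node):
        kind = node[0]
        if kind in ("identifier", "number"):
            return node[1]                      # leaves need no instruction
        if kind == "inttoreal":
            operand = walk(node[1])
            temp = new_temp()
            code.append(f"{temp} := inttoreal({operand})")
            return temp
        left = walk(node[1])                    # binary operator: + or *
        right = walk(node[2])
        temp = new_temp()
        code.append(f"{temp} := {left} {kind} {right}")
        return temp

    _, target, value = tree
    code.append(f"{target[1]} := {walk(value)}")
    return code


tree = (":=", ("identifier", "id1"),
        ("+", ("identifier", "id2"),
         ("*", ("identifier", "id3"), ("inttoreal", ("number", "60")))))
code = three_address(tree)
```

For the tree produced by the semantic analyzer, `code` holds four instructions, in the same order as the unoptimized three-address code for this example.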

Page 22: Compiler Ch1

Assemblers

Assembly code: names are used for instructions, and names are used for memory addresses.

    MOV a, R1
    ADD #2, R1
    MOV R1, b

Two-pass Assembly:

– First Pass: all identifiers are assigned to memory addresses (0-offset), e.g. substitute 0 for a, and 4 for b

– Second Pass: produce relocatable machine code:

    0001 01 00 00000000 *    (Load)
    0011 01 10 00000010      (add)
    0010 01 00 00000100 *    (Store)

    * = relocation bit
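The two passes can be sketched in Python for just the three instructions of this example. The word layout (4-bit opcode, 2-bit register, 2-bit mode, 8-bit operand) is read off the bit patterns above. The 4-byte spacing of addresses, the helper names, and the rule that any operand that is neither `Rn` nor `#n` is a memory name are assumptions of the sketch.

```python
OPCODES = {"load": "0001", "store": "0010", "add": "0011"}


def is_register(operand):
    return operand[0] == "R" and operand[1:].isdigit()


def is_immediate(operand):
    return operand.startswith("#")


def parse(line):
    mnemonic, operands = line.split(None, 1)
    src, dst = (part.strip() for part in operands.split(","))
    return mnemonic, src, dst


def assemble(lines, word_size=4):
    program = [parse(line) for line in lines]

    # Pass 1: give every memory name an address, starting at offset 0.
    addresses = {}
    for _, src, dst in program:
        for operand in (src, dst):
            if not is_register(operand) and not is_immediate(operand):
                addresses.setdefault(operand, len(addresses) * word_size)

    # Pass 2: emit "opcode register mode operand"; '*' marks relocatable words.
    output = []
    for mnemonic, src, dst in program:
        if mnemonic == "MOV" and is_register(dst):
            opcode, reg, operand = OPCODES["load"], dst, src
        elif mnemonic == "MOV":
            opcode, reg, operand = OPCODES["store"], src, dst
        elif mnemonic == "ADD":
            opcode, reg, operand = OPCODES["add"], dst, src
        else:
            raise ValueError(f"unknown instruction {mnemonic}")
        reg_bits = format(int(reg[1:]), "02b")
        if is_immediate(operand):
            word = f"{opcode} {reg_bits} 10 {int(operand[1:]):08b}"
        else:
            word = f"{opcode} {reg_bits} 00 {addresses[operand]:08b} *"
        output.append(word)
    return output


words = assemble(["MOV a, R1", "ADD #2, R1", "MOV R1, b"])
```

Only the two words that refer to `a` and `b` carry the relocation mark. The immediate operand `#2` is a constant, so a loader has nothing to adjust in that word.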

Page 23: Compiler Ch1

Loaders and Link-Editors

Loader: takes relocatable machine code, alters the addresses, and places the altered instructions into memory.

Link-editor: takes many (relocatable) machine code programs (with cross-references) and produces a single file.

– Needs to keep track of the correspondence between variable names and corresponding addresses in each piece of code.
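The address-altering step of a loader can be sketched as follows. The input format (space-separated bit fields with a trailing `*` on relocatable words) and the load address of 15 are assumptions for illustration. A real loader works on binary object files, not strings.

```python
def load(words, base):
    """Add the load address to every word flagged with the relocation mark."""
    loaded = []
    for word in words:
        fields = word.split()
        if fields[-1] == "*":
            address = int(fields[3], 2) + base      # relocate the operand field
            fields = fields[:3] + [format(address, "08b")]
        loaded.append(" ".join(fields))
    return loaded


relocatable = [
    "0001 01 00 00000000 *",
    "0011 01 10 00000010",
    "0010 01 00 00000100 *",
]
in_memory = load(relocatable, base=15)
```

With `base=15`, the operand addresses 0 and 4 become 15 and 19, while the unmarked word passes through unchanged.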

Page 24: Compiler Ch1

Compiler Cousins: Preprocessors Provide Input to Compilers

1. Macro Processing

#define in C: does text substitution before compiling

#define X 3

#define Y A*B+C

#define Z getchar()
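The text-substitution idea can be sketched in Python. This is a deliberate simplification and not how a real C preprocessor works: it handles only parameterless macros, does not rescan expansions, and would also replace names inside string literals. The function name `expand` is made up for the example.

```python
import re


def expand(source):
    """Collect '#define NAME body' lines and substitute NAME in later lines."""
    macros = {}
    output = []
    for line in source.splitlines():
        definition = re.match(r"\s*#define\s+(\w+)\s+(.*)", line)
        if definition:
            macros[definition.group(1)] = definition.group(2).strip()
            continue                      # the directive itself is not emitted
        for name, body in macros.items():
            # Whole-word match; the lambda keeps the body literal.
            line = re.sub(rf"\b{re.escape(name)}\b", lambda _m, b=body: b, line)
        output.append(line)
    return "\n".join(output)


source = "#define X 3\n#define Y A*B+C\nint n = X + 2*Y;"
expanded = expand(source)
```

Because the substitution is purely textual, `2*Y` expands to `2*A*B+C` rather than `2*(A*B+C)`, which is why macro bodies are usually wrapped in parentheses.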

Page 25: Compiler Ch1

2. File Inclusion

#include in C - bring in another file before compiling

defs.h:

    //////
    //////
    //////

main.c:

    #include "defs.h"
    …---…---…---…---…---…---…---…---…---

main.c after inclusion:

    //////
    //////
    //////
    …---…---…---…---…---…---…---…---…---
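File inclusion can be sketched the same way. In this illustration a dictionary stands in for the file system, only the quoted form of `#include` is recognized, and there is no protection against a file that includes itself. The function name `include` is hypothetical.

```python
import re


def include(name, files):
    """Replace each '#include "file"' line with that file's expanded text."""
    output = []
    for line in files[name].splitlines():
        directive = re.match(r'\s*#include\s+"([^"]+)"', line)
        if directive:
            # Recursive call so that included files may include others.
            output.extend(include(directive.group(1), files).splitlines())
        else:
            output.append(line)
    return "\n".join(output)


files = {
    "defs.h": "int x;\nint y;",
    "main.c": '#include "defs.h"\nint main(void) { return x + y; }',
}
result = include("main.c", files)
```

The compiler then sees the contents of `defs.h` followed by the rest of `main.c`, as if it had been written as a single file.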

Page 26: Compiler Ch1

3. Rational Preprocessors

Augment “Old” Languages With Modern Constructs

Add Macros for If - Then, While, Etc.

#define can make C code more Pascal-like

#define begin {

#define end }

Page 27: Compiler Ch1

4. Language Extensions for a Database System

EQUEL - Database query language embedded in C

## Retrieve (DN=Department.Dnum) where

## Department.Dname = ‘Research’

is Preprocessed into:

ingres_system(“Retr…..Research’”,____,____);

a procedure call in a programming language.

Page 28: Compiler Ch1

The Grouping of Phases

Front End : Analysis + Intermediate Code Generation

Back End : Code Generation + Optimization

vs.

Number of Passes:

A pass: requires reading and writing intermediate files

Fewer passes: more efficiency.

However: fewer passes require more sophisticated memory management and compiler phase interaction.

Tradeoffs ……..

Page 29: Compiler Ch1

Compiler Construction Tools

Parser Generators:

Produce Syntax Analyzers

Scanner Generators:

Produce Lexical Analyzers

Syntax-directed Translation Engines:

Generate Intermediate Code

Automatic Code Generators:

Generate Actual Code

Data-Flow Engines:

Support Optimization

Page 30: Compiler Ch1

The End

