Welcome to CSE131b: Compiler Construction

pic from: http://xkcd.com/303/

Lingjia Tang

Monday, April 1, 13

• Course Information

• What are compilers?

• Why do we learn about them?

• History of compilers

• Structure of compilers

• A bit of info on course projects

Course Staff

• Instructor: Lingjia Tang (2108)

• lingjia@cs.ucsd.edu

• TA: Xinxin Jing

• x7jin@cs.ucsd.edu

• Tutors: Haronid Moncivais Miller and more

http://cseweb.ucsd.edu/classes/sp13/cse131-b

“Dragon Book”

Grading Policies

[Pie chart: Project, Midterm, Final; slices of 35%, 20%, and 45%]

Extra credits:

• Class participation

• Piazza participation

• Projects/exam extra credits

Prerequisites

• CSE 70 / CSE 110

• CSE 100

• CSE 105

• CSE 130

What is a compiler?

• Compiler

[Diagram: program in source language -> Compiler -> program in target language; the compiler also reports error messages. The target program takes Input and produces Output.]

• Translates

• Typically lowers the level of abstraction of the program

  • High-level programming language -> machine code (C/C++ -> X86, etc.)

• We expect the program produced by the compiler to be better (faster, consuming less memory, etc.) than the original program

• Compiler

[Diagram: program in source language -> Compiler -> program in target language, plus error messages; Input -> target program -> Output]

Static: before runtime

• Interpreter

[Diagram: Program + Input -> Interpreter -> output]

Dynamic: during runtime

Examples: Python, shell, JavaScript, PHP

• Hybrid

• Example: Java

[Diagram: program in source language -> Compiler -> intermediate program (static); intermediate program + Input -> Virtual Machine -> output (runtime). The virtual machine contains an interpreter, a just-in-time compiler that produces native machine code, and garbage collection.]

Sacrifices efficiency for portability, “safety” and other dynamic features

Why Learn About Compilers?

• Build awesome, complex software

• Theory meets practice

  • Remember DFAs and CFGs from CSE 105?

• Understand how SW interacts with HW

• Understand programming language designs and features

Why Learn About Compilers? cont.

• My code runs faster than your code...

  • consumes less power

  • consumes less memory

  • is more reliable...

Why Learn About Compilers? cont.

• Who develops compilers?

  • many, many companies...

• Who uses compilers?

  • everyone!

[Diagram: program in PHP, program in C++]

Goal of this course

• For you to become a compiler NINJA!

History of Compilers

• Machine Language

• Assembly

  • programming too time-consuming

• High-level language

• Grace Hopper

conceptualized the idea of machine-independent programming languages

"Nobody believed that," she said. "I had a running compiler and nobody would touch it. They told me computers could only do arithmetic."

COBOL (1959)

• John Backus

Won the ACM Turing Award “for profound, influential, and lasting contributions to the design of practical high-level programming systems, notably through his work on FORTRAN”

pics from wikipedia

• Compilers: map a high-level abstract language to assembly language

• A simple mapping of a program to assembly language produces inefficient execution

  • The higher the level of abstraction, the greater the inefficiency

• If not efficient, high-level abstractions are useless

• Compilers need to:

  • provide a high-level abstraction

  • with the performance of giving low-level instructions directly

  • bridge the efficiency gap between high-level PLs and low-level processors/ISAs

Saman Amarasinghe’s slide

The Structure of a Modern Compiler

Source Code

Lexical Analysis

Syntax Analysis

Semantic Analysis

IR Generation

IR Optimization

Code Generation

Optimization

Machine Code

[Diagram builds: the earlier phases are bracketed as the Front end, the later phases as the Back end]

Lexical Analysis

while (y < z) {
    int x = a + b;
    y += x;
}

Lexical analysis (Scanning): Identify the logical pieces of the description. Group sequences of characters into lexemes, the smallest meaningful entities in a language (keywords, identifiers, constants).

T_While T_LeftParen T_Identifier y T_Less T_Identifier z T_RightParen T_OpenBrace
T_Int T_Identifier x T_Assign T_Identifier a T_Plus T_Identifier b T_Semicolon
T_Identifier y T_PlusAssign T_Identifier x T_Semicolon
T_CloseBrace
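A scanner for the fragment on this slide can be sketched with regular expressions. The token names are the slide's; the patterns, the `tokenize` helper, and the choice to attach a lexeme only to identifiers are assumptions made for illustration, not the course's actual scanner.

```python
import re

# Token names follow the slide; the patterns are assumptions for this sketch.
# Order matters: keywords before identifiers, "+=" before "+".
TOKEN_SPEC = [
    ("T_While",      r"while\b"),
    ("T_Int",        r"int\b"),
    ("T_Identifier", r"[A-Za-z_]\w*"),
    ("T_PlusAssign", r"\+="),
    ("T_Plus",       r"\+"),
    ("T_Less",       r"<"),
    ("T_Assign",     r"="),
    ("T_LeftParen",  r"\("),
    ("T_RightParen", r"\)"),
    ("T_OpenBrace",  r"\{"),
    ("T_CloseBrace", r"\}"),
    ("T_Semicolon",  r";"),
    ("SKIP",         r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source):
    """Return a list of (token_kind, lexeme_or_None) pairs for the source text."""
    tokens = []
    position = 0
    while position < len(source):
        match = MASTER.match(source, position)
        if match is None:
            raise SyntaxError(f"unexpected character {source[position]!r} at offset {position}")
        kind = match.lastgroup
        if kind != "SKIP":  # whitespace separates tokens but is not a token itself
            lexeme = match.group() if kind == "T_Identifier" else None
            tokens.append((kind, lexeme))
        position = match.end()
    return tokens
```

Calling `tokenize` on the `while` loop above yields the same token sequence as the slide, with each identifier carrying its name.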

Syntax Analysis

Syntax analysis (Parsing): Identify how those pieces relate to each other. Convert a linear structure (a sequence of tokens) into a hierarchical, tree-like structure: the abstract syntax tree (AST).

While
├── <
│   ├── y
│   └── z
└── Sequence
    ├── =
    │   ├── x
    │   └── +
    │       ├── a
    │       └── b
    └── =
        ├── y
        └── +
            ├── y
            └── x
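One common way to build such a tree is a recursive-descent parser, with one function per grammar rule. The sketch below handles only the constructs in the example loop. The grammar, the tuple-based AST encoding, and the names `Parser` and `parse` are assumptions for illustration; `y += x` is expanded to `y = y + x`, as in the slide's tree.

```python
import re


def tokenize(source):
    # Minimal scanner for this sketch: identifiers/keywords, "+=", single-character symbols.
    return re.findall(r"[A-Za-z_]\w*|\+=|[+<=(){};]", source)


class Parser:
    """Assumed grammar:
    stmt  := "while" "(" expr ")" block | ["int"] IDENT ("=" | "+=") expr ";"
    block := "{" stmt* "}"
    expr  := sum ["<" sum]
    sum   := IDENT ("+" IDENT)*
    """

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def expect(self, expected=None):
        token = self.peek()
        if token is None or (expected is not None and token != expected):
            raise SyntaxError(f"expected {expected or 'a token'}, found {token!r}")
        self.pos += 1
        return token

    def identifier(self):
        token = self.expect()
        if token in ("while", "int") or not re.fullmatch(r"[A-Za-z_]\w*", token):
            raise SyntaxError(f"expected identifier, found {token!r}")
        return token

    def statement(self):
        if self.peek() == "while":
            self.expect("while")
            self.expect("(")
            condition = self.expression()
            self.expect(")")
            return ("While", condition, self.block())
        declared = self.peek() == "int"
        if declared:
            self.expect("int")
        target = self.identifier()
        operator = self.expect()
        if operator not in ("=", "+=") or (declared and operator != "="):
            raise SyntaxError(f"unexpected operator {operator!r}")
        value = self.expression()
        self.expect(";")
        if operator == "+=":  # desugar "y += x" into "y = y + x"
            value = ("+", target, value)
        return ("=", target, value)

    def block(self):
        self.expect("{")
        statements = []
        while self.peek() != "}":
            statements.append(self.statement())
        self.expect("}")
        return ("Sequence", statements)

    def expression(self):
        left = self.additive()
        if self.peek() == "<":
            self.expect("<")
            return ("<", left, self.additive())
        return left

    def additive(self):
        node = self.identifier()
        while self.peek() == "+":
            self.expect("+")
            node = ("+", node, self.identifier())
        return node


def parse(source):
    parser = Parser(tokenize(source))
    tree = parser.statement()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected trailing token {parser.peek()!r}")
    return tree
```

For the example loop, `parse` returns a `While` node whose children are the `<` comparison and a `Sequence` of two assignments. Malformed input such as `while (y z) { }` raises `SyntaxError`.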

Syntax Analyzer (Parser)

int * foo(i, j, k))        <- Extra parentheses
int i;
int j;
{
  for(i=0; i j) {          <- Not an expression; Missing increment
    fi(i>j)                <- Not a keyword
      return j;
}

Saman Amarasinghe, 6.035, ©MIT Fall 1998

Semantic Analysis

Semantic Analysis: Identify the meaning of the overall structure. Rules of the language are checked (variable declaration, type checking).

While : void
├── < : bool
│   ├── y : int
│   └── z : int
└── Sequence : void
    ├── = : int
    │   ├── x : int
    │   └── + : int
    │       ├── a : int
    │       └── b : int
    └── = : int
        ├── y : int
        └── + : int
            ├── y : int
            └── x : int
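The type annotations on the tree can be computed bottom-up by a recursive walk. The following is a minimal sketch: the tuple encoding of the AST, the `check` function, and the pre-filled symbol table (all five variables declared as `int`) are assumptions for illustration, and the typing rules are only those visible on the slide.

```python
# The slide's AST as nested tuples (an encoding assumed for this sketch).
TREE = (
    "While",
    ("<", "y", "z"),
    ("Sequence", [("=", "x", ("+", "a", "b")), ("=", "y", ("+", "y", "x"))]),
)


def check(node, symbols):
    """Return the type of `node`; raise TypeError when a language rule is violated."""
    if isinstance(node, str):  # identifier leaf: must have been declared
        if node not in symbols:
            raise TypeError(f"undeclared variable {node!r}")
        return symbols[node]
    kind = node[0]
    if kind == "While":
        _, condition, body = node
        if check(condition, symbols) != "bool":
            raise TypeError("while condition must have type bool")
        check(body, symbols)
        return "void"
    if kind == "Sequence":
        for statement in node[1]:
            check(statement, symbols)
        return "void"
    if kind not in ("<", "+", "="):
        raise ValueError(f"unknown node kind {kind!r}")
    left = check(node[1], symbols)
    right = check(node[2], symbols)
    if kind == "=":  # both sides of an assignment must agree
        if left != right:
            raise TypeError(f"cannot assign {right} to {left}")
        return left
    if (left, right) != ("int", "int"):  # "<" and "+" take int operands here
        raise TypeError(f"operator {kind!r} needs int operands, got {left} and {right}")
    return "bool" if kind == "<" else "int"
```

With `symbols = {name: "int" for name in "abxyz"}`, `check(TREE, symbols)` returns `"void"`, while a reference to an undeclared name raises `TypeError`.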

Semantic Analyzer

int * foo(i, j, k)         <- Type not declared (k)
int i;
int j;
{
  int x;
  x = x + j + N;           <- Uninitialized variable used (x); Undeclared variable (N)
  return j;                <- Mismatched return type
}

Saman Amarasinghe, 6.035, ©MIT Fall 1998

IR Generation

Intermediate Representation (IR) Generation: Generate intermediate code

Loop: x = a + b
      y = x + y
      _t1 = y < z
      if _t1 goto Loop

IR Generation:

• Makes it easy to port the compiler to other architectures (e.g. X86 to MIPS)

• Enables optimizations that are not machine-specific
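Three-address code like the listing above can be produced by walking the AST. In the sketch below, the tuple AST encoding, the `generate` function, and the label and temporary naming are assumptions for illustration. The slide's listing tests the condition only after the body; the sketch adds an initial `goto` to the test so that the body is skipped when the condition is false on entry.

```python
import itertools


def generate(tree):
    """Translate a tuple-encoded AST into a list of three-address instructions."""
    code = []
    temps = (f"_t{n}" for n in itertools.count(1))
    labels = (f"L{n}" for n in itertools.count(1))

    def expression(node, target=None):
        """Emit code for `node`; return the name that holds its value."""
        if isinstance(node, str):
            return node
        operator, left, right = node
        lhs, rhs = expression(left), expression(right)
        result = target or next(temps)  # fresh temporary unless a target is given
        code.append(f"{result} = {lhs} {operator} {rhs}")
        return result

    def statement(node):
        kind = node[0]
        if kind == "Sequence":
            for child in node[1]:
                statement(child)
        elif kind == "=":
            _, target, value = node
            if isinstance(value, str):
                code.append(f"{target} = {value}")
            else:
                expression(value, target)
        elif kind == "While":
            _, condition, body = node
            body_label, test_label = next(labels), next(labels)
            code.append(f"goto {test_label}")  # evaluate the condition before the first iteration
            code.append(f"{body_label}:")
            statement(body)
            code.append(f"{test_label}:")
            test = expression(condition)
            code.append(f"if {test} goto {body_label}")
        else:
            raise ValueError(f"unknown statement kind {kind!r}")

    statement(tree)
    return code


TREE = (
    "While",
    ("<", "y", "z"),
    ("Sequence", [("=", "x", ("+", "a", "b")), ("=", "y", ("+", "y", "x"))]),
)
```

`generate(TREE)` yields `goto L2`, `L1:`, `x = a + b`, `y = y + x`, `L2:`, `_t1 = y < z`, `if _t1 goto L1`. Apart from the entry jump and the label names, this is the same shape as the slide's listing.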

IR Optimization

IR Optimization: Optimize intermediate code (machine-independent)

Before:

Loop: x = a + b
      y = x + y
      _t1 = y < z
      if _t1 goto Loop

After:

      x = a + b
Loop: y = x + y
      _t1 = y < z
      if _t1 goto Loop
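The transformation on this slide moves `x = a + b` out of the loop because its operands never change inside it (loop-invariant code motion). Below is a simplified single-pass sketch. The instruction encoding and the `hoist_invariants` function are assumptions for illustration, and it is only safe when the body is straight-line code without side effects and runs at least once, as in the slide's bottom-tested loop.

```python
def hoist_invariants(body):
    """Split a straight-line loop body into (hoisted, remaining) instructions.

    Each instruction is a (dest, op, left, right) tuple. An instruction is
    hoisted when its operands are not redefined in the loop (or are themselves
    hoisted), its destination is assigned exactly once in the loop, and no
    earlier instruction in the body reads that destination.
    """
    definitions = {}
    for dest, *_ in body:
        definitions[dest] = definitions.get(dest, 0) + 1

    hoisted, remaining = [], []
    invariant_names = set()
    read_so_far = set()
    for instruction in body:
        dest, _, left, right = instruction
        operands_fixed = all(
            name not in definitions or name in invariant_names
            for name in (left, right)
        )
        if operands_fixed and definitions[dest] == 1 and dest not in read_so_far:
            hoisted.append(instruction)
            invariant_names.add(dest)
        else:
            remaining.append(instruction)
        read_so_far.update((left, right))
    return hoisted, remaining


# Body of the slide's loop, before optimization.
LOOP_BODY = [
    ("x", "+", "a", "b"),
    ("y", "+", "x", "y"),
    ("_t1", "<", "y", "z"),
]
```

`hoist_invariants(LOOP_BODY)` hoists only `x = a + b`. The other two instructions stay in the loop because `y` is redefined on every iteration.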

Code Generation

Code Generation: Intermediate code is translated into native code

      add $1, $2, $3
Loop: add $4, $1, $4
      slt $6, $4, $5
      bne $6, $0, Loop
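A naive translation of the optimized IR into MIPS maps each IR instruction to one machine instruction. In the sketch below, the register assignment (x in `$1`, a in `$2`, b in `$3`, y in `$4`, z in `$5`, `_t1` in `$6`) is read off the slide's listing. The IR encoding, the `to_mips` function, and emitting the conditional jump as `bne` against `$0` are assumptions for illustration.

```python
# Variable-to-register assignment inferred from the slide's listing.
REGISTERS = {"x": "$1", "a": "$2", "b": "$3", "y": "$4", "z": "$5", "_t1": "$6"}


def to_mips(ir):
    """Translate tuple-encoded three-address code into MIPS assembly lines."""
    assembly = []
    for instruction in ir:
        kind = instruction[0]
        if kind == "label":
            assembly.append(f"{instruction[1]}:")
        elif kind in ("+", "<"):
            _, dest, left, right = instruction
            opcode = "add" if kind == "+" else "slt"
            assembly.append(
                f"{opcode} {REGISTERS[dest]}, {REGISTERS[left]}, {REGISTERS[right]}"
            )
        elif kind == "if_goto":
            _, condition, label = instruction
            # Branch when the condition register is non-zero (assumed encoding).
            assembly.append(f"bne {REGISTERS[condition]}, $0, {label}")
        else:
            raise ValueError(f"unknown IR instruction {kind!r}")
    return assembly


OPTIMIZED_IR = [
    ("+", "x", "a", "b"),
    ("label", "Loop"),
    ("+", "y", "x", "y"),
    ("<", "_t1", "y", "z"),
    ("if_goto", "_t1", "Loop"),
]
```

`to_mips(OPTIMIZED_IR)` produces one `add` before the loop, then `add`, `slt` and a conditional branch inside it.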

Code Optimization

Code Optimization: machine-dependent optimization (register allocation, instruction selection, peephole, etc.)

Before:

      add $1, $2, $3
Loop: add $4, $1, $4
      slt $6, $4, $5
      bne $6, $0, Loop

After:

      add $1, $2, $3
Loop: add $4, $1, $4
      blt $4, $5, Loop
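The step from a compare-then-branch pair to a single `blt` is a typical peephole optimization: a small window slides over the instruction list and replaces a known pattern with a shorter one. The sketch below is a hypothetical illustration. The tuple encoding and `fuse_compare_and_branch` are assumptions, as are keeping y in `$4` and z in `$5` and writing the branch as `bne` against `$0`. It also assumes the temporary register is not read after the branch.

```python
def fuse_compare_and_branch(instructions):
    """Rewrite `slt t, a, b` followed by `bne t, $0, label` as `blt a, b, label`.

    Assumes register t is dead after the branch; a real pass would confirm
    this with liveness information before removing the slt.
    """
    output = []
    index = 0
    while index < len(instructions):
        current = instructions[index]
        following = instructions[index + 1] if index + 1 < len(instructions) else None
        if (
            following is not None
            and current[0] == "slt"
            and following[0] == "bne"
            and following[1] == current[1]
            and following[2] == "$0"
        ):
            _, _, left, right = current
            output.append(("blt", left, right, following[3]))
            index += 2  # both instructions of the pattern are consumed
        else:
            output.append(current)
            index += 1
    return output


PROGRAM = [
    ("add", "$1", "$2", "$3"),
    ("label", "Loop"),
    ("add", "$4", "$1", "$4"),
    ("slt", "$6", "$4", "$5"),
    ("bne", "$6", "$0", "Loop"),
]
```

Applied to `PROGRAM`, the pass keeps the two `add` instructions and the label, and replaces the final pair with `("blt", "$4", "$5", "Loop")`.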

Modern Compilers

• Mature front end

• Heavy back end

  • many passes of optimization

Compiler Optimization

• Frances E. Allen

  • first woman to win the Turing Award

  • “Introduced many of the abstractions, algorithms, and implementations that laid the groundwork for automatic program optimization technology”

Architecture of GCC (GNU Compiler Collection)

[Diagram: Source Code -> AST -> GENERIC -> High GIMPLE -> SSA -> Low GIMPLE -> RTL -> Machine Code]

IRs: many optimization passes here

Front end supports: C (gcc), C++ (g++), Objective-C, Objective-C++, Fortran (gfortran), Java (gcj), Ada (GNAT), and Go (gccgo)

Machine code: X86, ARM, Alpha, MIPS, SPARC, PowerPC, etc.

Jeanne Ferrante et al.

Architecture of LLVM

• LLVM (Low Level Virtual Machine)

• developed at UIUC (2000)

• “language-agnostic” design

This course covers

Lexical Analysis

Syntax Analysis

Semantic Analysis

IR Generation

IR Optimization

Code Generation

Optimization

Midterm

The Course Project

• Compiler: Decaf -> MIPS

  • have a working compiler by the end of the quarter

  • will be able to run your Decaf programs!

• Source language: Decaf

  • Custom programming language: similar to Java or C++ (simplified feature set)

  • Object-oriented, with inheritance, strongly typed

• Target language: MIPS

  • RISC ISA

• Implementation: C++

Programming Assignments

[Diagram: Decaf Source Code -> Lexical Analysis -> Syntax Analysis -> Semantic Analysis -> IR Generation -> IR Optimization -> Code Generation -> Optimization -> MIPS Machine Code, divided among assignments P1 to P4]

Assignment   Weight   Duration
P1           10%      ~1 week
P2           10%      1.5 weeks
P3           10%      3 weeks
P4           15%      4 weeks

• 2-person groups

  • find your partner by the end of this week

  • Group members get the same grade

• Honor code:

  • Discussion among groups is encouraged

  • But implement the solution on your own

• Discussion Session?