
Compilation: A Retrospective CS 671 April 29, 2008.

Page 1: Compilation: A Retrospective CS 671 April 29, 2008.

Compilation: A Retrospective

CS 671, April 29, 2008

Page 2: Compilation: A Retrospective CS 671 April 29, 2008.

2 CS 671 – Spring 2008

What You’ve Learned This Semester

What happens when you compile your code

Theory of compilers
– Internal challenges and solutions: widely used algorithms
– Available tools and their other applications
– Traditional and modern challenges

[Diagram: High-Level Programming Languages → Compiler → Machine Code, with Error Messages as a second output]

Page 3: Compilation: A Retrospective CS 671 April 29, 2008.

An Interface to High-Level Languages

Programmers write in high-level languages
• Increases productivity
• Easier to maintain/debug
• More portable

HLLs also protect the programmer from low-level details
• Registers and caches – the register keyword
• Instruction selection
• Instruction-level parallelism

The catch: HLLs are less efficient

[Diagram: High-Level Programming Languages → Compiler → Machine Code]

Page 4: Compilation: A Retrospective CS 671 April 29, 2008.

An Interface to Computer Architectures

Parallelism
• Instruction level
– multiple operations at once; minimize dependences
• Processor level
– multiple threads at once; minimize synchronization

Memory Hierarchies
• Register allocation (only portion explicitly managed in SW)
• Code and data layout (helps the hardware cache manager)

Designs driven by how well compilers can leverage new features!

[Diagram: High-Level Programming Languages → Compiler → Machine Code]

Page 5: Compilation: A Retrospective CS 671 April 29, 2008.

Simplified Compiler Structure

Source code (character stream): if (b==0) a = b;
→ Lexical Analysis → Token stream
→ Parsing → Abstract syntax tree
→ Intermediate Code Generation → Intermediate code
→ Code Optimization → IR
→ Code Generation
→ Assembly code (character stream): CMP CX, 0 / CMOVZ CX, DX

Front End: machine independent
Back End: machine dependent

Page 6: Compilation: A Retrospective CS 671 April 29, 2008.

Building a Lexer

Specification:
"if"
"while"
[a-zA-Z][a-zA-Z0-9]*
[0-9][0-9]*
(
)

Specification → NFA for each RE → Giant NFA → Giant DFA → Table-driven code
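The last step of the pipeline above, table-driven code, can be sketched roughly as follows. This is a minimal hand-built example, not the output of any generator: the state names, the `char_class` helper, and the token names are assumptions made for illustration, and keywords are resolved by a lookup after matching an identifier rather than by dedicated DFA states.

```python
# Hypothetical character classes; a generated scanner would use a full table.
def char_class(c):
    if c.isascii() and c.isalpha():
        return "letter"
    if c.isascii() and c.isdigit():
        return "digit"
    if c in "()":
        return c
    return "other"

# Transition table: state -> {character class -> next state}
DFA = {
    "start": {"letter": "id", "digit": "num", "(": "lparen", ")": "rparen"},
    "id": {"letter": "id", "digit": "id"},
    "num": {"digit": "num"},
    "lparen": {},
    "rparen": {},
}
ACCEPT = {"id": "ID", "num": "NUM", "lparen": "LPAREN", "rparen": "RPAREN"}
KEYWORDS = {"if": "IF", "while": "WHILE"}

def scan(text):
    """Return (token kind, lexeme) pairs using longest-match scanning."""
    tokens, pos = [], 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        state, i, last = "start", pos, None
        while i < len(text):
            nxt = DFA[state].get(char_class(text[i]))
            if nxt is None:
                break
            state, i = nxt, i + 1
            if state in ACCEPT:
                last = (ACCEPT[state], i)   # remember the longest match so far
        if last is None:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        kind, end = last
        lexeme = text[pos:end]
        tokens.append((KEYWORDS.get(lexeme, kind), lexeme))
        pos = end
    return tokens
```

Calling `scan("if (b0) while 42")` walks the table once per character, which is the property that makes the table-driven form attractive.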

Page 7: Compilation: A Retrospective CS 671 April 29, 2008.

High Level View

Regular expressions = specification
Finite automata = implementation
Every regex has an FSA that recognizes its language

[Diagram: at design time, a Scanner Generator turns the specification into a Scanner; at compile time, the Scanner turns source code into tokens]

Page 8: Compilation: A Retrospective CS 671 April 29, 2008.

Lex: A Lexical Analyzer Generator

• Lex produces a C program from a lexical specification
• http://www.epaperpress.com/lexandyacc/index.html

DIGITS [0-9]+
ALPHA [A-Za-z]
CHARACTER {ALPHA}|_
IDENTIFIER {ALPHA}({CHARACTER}|{DIGITS})*
%%
if {return IF; }
{IDENTIFIER} {return ID; }
{DIGITS} {return NUM; }
([0-9]+"."[0-9]*)|([0-9]*"."[0-9]+) {return REAL; }
. {error(); }


Page 10: Compilation: A Retrospective CS 671 April 29, 2008.

Parsing – Syntax Analysis

• Convert a linear structure – sequence of tokens – to a hierarchical tree-like structure – an AST
• The parser imposes the syntax rules of the language
• Work should be linear in the size of the input, so type consistency cannot be checked in this phase
• Deterministic context-free languages form the basis
• Bison and yacc allow a user to construct parsers from CFG specifications

Page 11: Compilation: A Retrospective CS 671 April 29, 2008.

Context-Free Grammar Terminology

• Terminals
– Token or ε
• Non-terminals
– Syntactic variables
• Start symbol
– A special nonterminal is designated (S)
• Productions
– Specify how non-terminals may be expanded to form strings
– LHS: single non-terminal, RHS: string of terminals or non-terminals
• Vertical bar is shorthand for multiple productions

S → ( S ) S
S → ε
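To make the two productions concrete, here is a small recursive-descent recognizer for the balanced-parentheses grammar S → ( S ) S | ε. It is only a sketch: the function names `parse_S` and `is_balanced` are made up for this example, and the input is assumed to be a plain string of parenthesis characters.

```python
def parse_S(s, i=0):
    """Return the index just past the prefix of s[i:] derived from S."""
    # S -> ( S ) S
    if i < len(s) and s[i] == "(":
        j = parse_S(s, i + 1)
        if j < len(s) and s[j] == ")":
            return parse_S(s, j + 1)
        raise SyntaxError(f"expected ')' at position {j}")
    # S -> epsilon: consume nothing
    return i

def is_balanced(s):
    """True when the whole string is derivable from S."""
    try:
        return parse_S(s) == len(s)
    except SyntaxError:
        return False
```

Each production maps to one branch of `parse_S`; the ε production is simply the fall-through case.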

Page 12: Compilation: A Retrospective CS 671 April 29, 2008.

Shift-Reduce Parsing

Bottom-up parsing uses two kinds of actions: Shift and Reduce

Shift: Move the marker | one place to the right
• Shifts a terminal to the left string
E + ( | int )  ⇒  E + ( int | )

Reduce: Apply an inverse production at the right end of the left string
• If E → E + ( E ) is a production, then
E + ( E + ( E ) | )  ⇒  E + ( E | )

Page 13: Compilation: A Retrospective CS 671 April 29, 2008.

Shift-Reduce Parsing Table

Action table
1. shift and goto state n
2. reduce using X → γ
– pop symbols γ off stack
– using state label of top (end) of stack, look up X in goto table and goto that state

• DFA + stack = push-down automaton (PDA)

[Table layout: one row per state; the columns for terminal symbols give the next actions, and the columns for non-terminal symbols give the next state on reduction]

Page 14: Compilation: A Retrospective CS 671 April 29, 2008.

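Parsing Table

state | (     | )     | id    | ,     | $      | S  | L
1     | s3    |       | s2    |       |        | g4 |
2     | S→id  | S→id  | S→id  | S→id  | S→id   |    |
3     | s3    |       | s2    |       |        | g7 | g5
4     |       |       |       |       | accept |    |
5     |       | s6    |       | s8    |        |    |
6     | S→(L) | S→(L) | S→(L) | S→(L) | S→(L)  |    |
7     | L→S   | L→S   | L→S   | L→S   | L→S    |    |
8     | s3    |       | s2    |       |        | g9 |
9     | L→L,S | L→L,S | L→L,S | L→L,S | L→L,S  |    |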
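A table like this one can be driven by a very small loop. The sketch below encodes the table for the grammar implied by its reduce entries (S → ( L ) | id, L → S | L , S); the dictionary layout and the name `parse` are assumptions for illustration, and tokens are assumed to arrive as the strings "(", ")", "id", and ",".

```python
# Shift and accept entries, keyed by state then terminal.
ACTION = {
    1: {"(": ("s", 3), "id": ("s", 2)},
    3: {"(": ("s", 3), "id": ("s", 2)},
    4: {"$": ("acc",)},
    5: {")": ("s", 6), ",": ("s", 8)},
    8: {"(": ("s", 3), "id": ("s", 2)},
}
# States whose whole row is a reduction: state -> (LHS, length of RHS).
REDUCE = {2: ("S", 1), 6: ("S", 3), 7: ("L", 1), 9: ("L", 3)}
# Goto entries, keyed by state then non-terminal.
GOTO = {1: {"S": 4}, 3: {"S": 7, "L": 5}, 8: {"S": 9}}

def parse(tokens):
    """Return True when the token list is accepted by the table."""
    stack = [1]                      # stack of states, start state at bottom
    tokens = list(tokens) + ["$"]    # append the end-of-input marker
    pos = 0
    while True:
        state = stack[-1]
        if state in REDUCE:
            lhs, n = REDUCE[state]
            del stack[-n:]                       # pop one state per RHS symbol
            stack.append(GOTO[stack[-1]][lhs])   # goto on the LHS
            continue
        act = ACTION[state].get(tokens[pos])
        if act is None:
            return False                         # empty table cell: syntax error
        if act[0] == "acc":
            return True
        stack.append(act[1])                     # shift and goto state n
        pos += 1
```

For example, `parse(["(", "id", ",", "id", ")"])` shifts into state 3, reduces each `id` to S, builds L through states 7, 5, 8, and 9, and finally accepts in state 4.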

Page 15: Compilation: A Retrospective CS 671 April 29, 2008.

Yacc / Bison

• Yet Another Compiler Compiler
• Automatically constructs an LALR(1) parsing table from a set of grammar rules
• Yacc/Bison specification:

parser declarations
%%
grammar rules
%%
auxiliary code

bison -vd file.y
-or-
yacc -vd file.y

file.y → y.tab.c, y.tab.h, y.output
(bison names these file.tab.c, file.tab.h, and file.output unless run with -y)

Page 16: Compilation: A Retrospective CS 671 April 29, 2008.


Parsing - Semantic Analysis

• Calculates the program’s “meaning”

• Rules of the language are checked (variable declaration, type checking)

• Type checking also needed for code generation (code gen for a + b depends on the type of a and b)

Page 17: Compilation: A Retrospective CS 671 April 29, 2008.

Symbol Table Implementation

Can consist of:
• a hash table for all names, and
• a stack to keep track of scope

[Diagram: hash table holding entries for x, y, and aa, next to a stack that records the names entered since each scope change]
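One way to combine the two structures is sketched below. The class and method names are hypothetical, and the design (a dict of binding lists plus a stack of per-scope name lists) is just one of several reasonable layouts, not necessarily the one drawn on the slide.

```python
class SymbolTable:
    """Hash table for all names plus a stack that tracks scopes."""

    def __init__(self):
        self.table = {}          # name -> list of bindings, innermost last
        self.scope_stack = [[]]  # names declared in each open scope

    def enter_scope(self):
        self.scope_stack.append([])

    def exit_scope(self):
        # Undo every declaration made in the scope being closed.
        for name in self.scope_stack.pop():
            self.table[name].pop()
            if not self.table[name]:
                del self.table[name]

    def declare(self, name, info):
        if name in self.scope_stack[-1]:
            raise KeyError(f"multiple declaration of {name!r}")
        self.table.setdefault(name, []).append(info)
        self.scope_stack[-1].append(name)

    def lookup(self, name):
        bindings = self.table.get(name)
        if not bindings:
            raise KeyError(f"undeclared variable {name!r}")
        return bindings[-1]      # innermost binding shadows outer ones
```

Lookups stay a single hash probe, while leaving a scope costs time proportional to the number of names declared in it.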

Page 18: Compilation: A Retrospective CS 671 April 29, 2008.


Typical Semantic Errors

Traverse the AST created by the parser to find:

•Multiple declarations: a variable should be declared (in the same scope) at most once

•Undeclared variable: a variable should not be used before being declared

•Type mismatch: type of the left-hand side of an assignment should match the type of the right-hand side

•Wrong arguments: methods should be called with the right number and types of arguments

Page 19: Compilation: A Retrospective CS 671 April 29, 2008.

Stack Frames

Activation record or stack frame stores:
• local vars
• parameters
• return address
• temporaries
• (…etc)

(Frame size not known until late in the compilation process)

[Diagram: stack layout. The previous frame ends with the incoming arguments (Arg n … Arg 2, Arg 1) and the static link; fp marks the start of the current frame, which holds local vars, return address, temporaries, saved regs, and the outgoing arguments (Arg m … Arg 1, static link); sp marks its end, where the next frame begins]


Page 21: Compilation: A Retrospective CS 671 April 29, 2008.

Intermediate Code Generation

• Makes it easy to port compiler to other architectures
– e.g., Pentium to MIPS
• Can also be the basis for interpreters – such as in Java
• Enables optimizations that are not machine specific

[Diagram: AST → IR (optimize) → PowerPC, Alpha, x86]

Page 22: Compilation: A Retrospective CS 671 April 29, 2008.

Phases of a Compiler

Source program → Lexical analyzer → Syntax analyzer → Semantic analyzer → Intermediate code generator → Code optimizer → Code generator → Target program

Intermediate Code Optimization
• Constant propagation, dead code elimination, common sub-expression elimination, strength reduction, etc.
• Based on dataflow analysis – properties that are independent of execution paths
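As a small illustration of the first item, the sketch below performs constant propagation and folding over straight-line three-address code. The tuple encoding `(dest, op, a, b)` and the function name are assumptions for this example, and the pass deliberately ignores branches, so no dataflow analysis is needed.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def propagate_and_fold(code):
    """code: list of (dest, op, a, b); a copy is (dest, "copy", a, None)."""
    consts = {}   # variable name -> known constant value
    out = []
    for dest, op, a, b in code:
        # Propagation: replace operands whose value is a known constant.
        a = consts.get(a, a)
        b = consts.get(b, b)
        if op == "copy":
            value = a
        elif isinstance(a, int) and isinstance(b, int):
            # Folding: evaluate the operation at compile time.
            value = OPS[op](a, b)
            op, a, b = "copy", value, None
        else:
            value = None
        if isinstance(value, int):
            consts[dest] = value
        else:
            consts.pop(dest, None)   # dest no longer holds a known constant
        out.append((dest, op, a, b))
    return out
```

For instance, `x = 4; y = x * 2; z = y + w` becomes `x = 4; y = 8; z = 8 + w`, after which dead code elimination could remove `x` and `y` if they have no other uses.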

Page 23: Compilation: A Retrospective CS 671 April 29, 2008.

Analysis and Transformation

Most optimizations require some global understanding of program flow
• Moving, removing, rearranging instructions

Achieve understanding by discovering the control flow of the procedure
• What blocks follow/are reachable from other blocks
• Where loops exist (focus optimization efforts)
• We call this Control-Flow Analysis

Connect definitions and uses of variables
• We call this Data-Flow Analysis

Page 24: Compilation: A Retrospective CS 671 April 29, 2008.

Data-Flow Analysis

Properties:
• either a forward analysis (out as a function of in) or
• a backward analysis (in as a function of out).

• either an “along some path” problem or
• an “along all paths” problem.

Forward, along-some-path form:
out[I] = ( in[I] – kill[I] ) ∪ gen[I]
in[B] = ∪ out[B′], over all B′ ∈ pred(B)
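A forward, along-some-path problem of this shape is usually solved by iterating the equations until nothing changes. The following is a minimal sketch under that assumption; `solve_forward` and its dictionary-based inputs are names chosen for this example, and union is used as the merge operator (an along-all-paths problem would use intersection instead).

```python
def solve_forward(blocks, preds, gen, kill):
    """Iterate out[B] = (in[B] - kill[B]) | gen[B] to a fixed point.

    blocks: list of block names; preds: block -> list of predecessors;
    gen, kill: block -> set of facts.
    """
    in_ = {b: set() for b in blocks}
    out = {b: set(gen[b]) for b in blocks}
    changed = True
    while changed:
        changed = False
        for b in blocks:
            # Merge: union of the out sets of all predecessors.
            in_[b] = set().union(*(out[p] for p in preds[b]))
            new_out = (in_[b] - kill[b]) | gen[b]
            if new_out != out[b]:
                out[b] = new_out
                changed = True
    return in_, out
```

On a diamond-shaped flow graph where two branches each define X and the join block uses it, both definitions of X appear in the `in` set of the join block, which is exactly the reaching-definitions answer.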

Page 25: Compilation: A Retrospective CS 671 April 29, 2008.

Static Single Assignment Form

(B1 branches to B2 and B3, which both flow into B4)

Before SSA:
B1: if (…)
B2: X ← 5
B3: X ← 3
B4: Y ← X

After SSA:
B1: if (…)
B2: X0 ← 5
B3: X1 ← 3
B4: X2 ← φ(X0, X1); Y0 ← X2

Page 26: Compilation: A Retrospective CS 671 April 29, 2008.

Optimizations

What are they?
• Code transformations
• Improve some metric

Metrics
• Performance: time, instructions, cycles
• Space: Reduce memory usage
• Code Size
• Energy

Page 27: Compilation: A Retrospective CS 671 April 29, 2008.

Optimizations

• Inlining
• Constant folding
• Algebraic simplification
• Constant propagation
• Dead code elimination
• Loop-invariant code motion
• Common sub-expression elimination
• Strength reduction
• Branch prediction/optimization
• Register allocation
• Loop unrolling
• Cache optimization

[Diagram: the list runs from optimizations on high-level IR (top) to optimizations on low-level IR (bottom)]

Page 28: Compilation: A Retrospective CS 671 April 29, 2008.

Scope of Optimization

Local (or single block)
• Confined to straight-line code
• Simplest to analyze

Intraprocedural (or global)
• Consider the whole procedure

Interprocedural (or whole program)
• Consider the whole program

Page 29: Compilation: A Retrospective CS 671 April 29, 2008.

Phases of a Compiler

Source program → Lexical analyzer → Syntax analyzer → Semantic analyzer → Intermediate code generator → Code optimizer → Code generator → Target program

Native Code Generation
• Intermediate code is translated into native code
• Register allocation, instruction selection

Native Code Optimization
• Peephole optimizations – small window is optimized at a time
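A peephole pass can be sketched as a scan that looks at one or two adjacent instructions at a time. The three rules below are illustrative picks rather than a list from the course, the tuple encoding `(opcode, dst, src)` is hypothetical, and the third rule assumes the condition flags set by the dropped instruction are never read.

```python
def peephole(instrs):
    """instrs: list of (opcode, dst, src) tuples in destination-first order."""
    out = []
    for ins in instrs:
        op, dst, src = ins
        # Rule 1: a move of a location onto itself does nothing.
        if op == "mov" and dst == src:
            continue
        # Rule 2: "mov a, b" right after "mov b, a" is redundant,
        # because a and b already hold the same value.
        if op == "mov" and out and out[-1] == ("mov", src, dst):
            continue
        # Rule 3: adding or subtracting zero leaves the value unchanged
        # (assumes the flags result is unused).
        if op in ("add", "sub") and src == 0:
            continue
        out.append(ins)
    return out
```

Real peephole optimizers typically repeat such a pass until no rule fires, since one rewrite can expose another.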

Page 30: Compilation: A Retrospective CS 671 April 29, 2008.

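Instruction Tiling

x = x + 1;

IR tree: MOVE( MEM( +(FP, x) ), +( MEM( +(FP, x) ), 1 ) )

[Diagram: the tree is covered by tiles; t1 labels the MEM load on the right-hand side and t2 labels the + node above it]

Generated code:
mov t1, [bp+x]
mov t2, t1
add t2, 1
mov [bp+x], t2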

Page 31: Compilation: A Retrospective CS 671 April 29, 2008.

Register Allocation

Code:
s1 ← 2
s2 ← 4
s3 ← s1 + s2
s4 ← s1 + 1
s5 ← s1 * s2
s6 ← s4 * 2

[Diagram: live ranges of s1 through s6, the interference graph over S1 to S6, and its graph coloring with r1 green, r2 blue, r3 pink]

Result:
r1 ← 2
r2 ← 4
r3 ← r1 + r2
r3 ← r1 + 1
r1 ← r1 * r2
r2 ← r3 * 2
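The path from code to colored graph can be sketched as follows for straight-line code. This is an assumption-laden illustration: nothing is taken to be live at the end of the block, the coloring is a simple greedy pass in program order (real allocators use more careful orderings and spilling), and the function names are made up. Its register choices can differ from the slide's result while still being a valid coloring.

```python
def interference(code):
    """code: list of (dest, [variable operands]). Returns a set of edges."""
    edges = set()
    live = set()                       # variables live after the instruction
    for dest, operands in reversed(code):
        # A definition interferes with everything live across it.
        for v in live - {dest}:
            edges.add(frozenset((dest, v)))
        live.discard(dest)
        live.update(operands)
    return edges

def color(nodes, edges, registers):
    """Greedily give each node the first register unused by its neighbors."""
    assignment = {}
    for n in nodes:
        taken = {assignment[m]
                 for e in edges if n in e
                 for m in e - {n} if m in assignment}
        free = [r for r in registers if r not in taken]
        if not free:
            raise RuntimeError(f"would need to spill {n}")
        assignment[n] = free[0]
    return assignment

code = [
    ("s1", []), ("s2", []),
    ("s3", ["s1", "s2"]), ("s4", ["s1"]),
    ("s5", ["s1", "s2"]), ("s6", ["s4"]),
]
edges = interference(code)
regs = color([d for d, _ in code], edges, ["r1", "r2", "r3"])
```

Three registers suffice here because s1, s2, and one of s3 or s4 are the most values ever live at once.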

Page 32: Compilation: A Retrospective CS 671 April 29, 2008.

Instruction Scheduling

• Create a DAG of dependences
• Determine priority
• Schedule instructions with
– Ready operands
– Highest priority
• Heuristics: If multiple possibilities, fall back on other priority functions
– Height, slack, register usage, etc.

[Diagram: dependence DAG over instructions 1 to 10; nodes 1, 2, 4, and 9 carry the label m]
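The steps above amount to list scheduling. The sketch below assumes a single-issue machine, uses height (the longest latency path to the end of the DAG) as the only priority function, and takes hypothetical inputs: `succs` maps each instruction to the instructions that depend on it, and `latency` gives each instruction's delay in cycles.

```python
from functools import lru_cache

def list_schedule(succs, latency):
    """Return a list of (cycle, instruction) pairs, one issue per cycle."""
    nodes = list(succs)
    preds_left = {n: 0 for n in nodes}
    for n in nodes:
        for s in succs[n]:
            preds_left[s] += 1

    @lru_cache(maxsize=None)
    def height(n):
        # Priority: longest latency-weighted path from n to a leaf.
        return latency[n] + max((height(s) for s in succs[n]), default=0)

    ready_at = {n: 0 for n in nodes}     # earliest cycle operands are ready
    ready = [n for n in nodes if preds_left[n] == 0]
    schedule, cycle = [], 0
    while ready:
        candidates = [n for n in ready if ready_at[n] <= cycle]
        if not candidates:
            cycle += 1                   # stall: nothing has ready operands
            continue
        n = max(candidates, key=height)  # highest priority wins
        ready.remove(n)
        schedule.append((cycle, n))
        for s in succs[n]:
            ready_at[s] = max(ready_at[s], cycle + latency[n])
            preds_left[s] -= 1
            if preds_left[s] == 0:
                ready.append(s)
        cycle += 1
    return schedule
```

With two 2-cycle loads feeding an add, plus an independent multiply, the scheduler issues both loads first and fills the load delay with the multiply.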

Page 33: Compilation: A Retrospective CS 671 April 29, 2008.


Modern Topics!

• Alternatives to the static compilation model

• Compiling for the Core2

• Compiling for GPUs

• Compiling for power and temperature

• Compiling for parallelism

Page 34: Compilation: A Retrospective CS 671 April 29, 2008.

Alternatives to the Traditional Model

Static Compilation (traditional compilation model)
• All work is done “ahead-of-time”

Just-in-Time Compilation
• Postpone some compilation tasks

Multiversioning and Dynamic Feedback
• Include multiple options in binary

Dynamic Binary Optimization
• Executables can adapt

Page 35: Compilation: A Retrospective CS 671 April 29, 2008.

Just-in-Time Compilation

[Diagram: High-Level Programming Languages → Compiler (Front End, then Back End) → Machine Code, with Error Messages as a second output]

Ship bytecodes (think IR) rather than binaries
• Binaries execute on machines
• Bytecodes execute on virtual machines

Page 36: Compilation: A Retrospective CS 671 April 29, 2008.

Compiling for the Core Architecture

Netburst architecture: has trace cache with decoded instructions

Core architecture: instructions are repeatedly fetched and decoded

Problems:
• Code alignment! (Fetch and branch-prediction effects)

4008c0 44 8b

4008d0 04 8d A0 64 56 00 8b 34 8d 20 4a 50 00 44 01 C6

4008e0 42 8d 3c 06 41 83 C0 F6 01 F2 89 34 8d 40 7f 5c

4008f0 00 89 3c 8d 20 4a 50 00 44 89 04 8d A0 64 56 00

400900 41 29 F8 44 29 C2 48 83 c1 01 48 81 F9 A0 86 01

400910 00 7c Bb

20% Speedup!!!

Page 37: Compilation: A Retrospective CS 671 April 29, 2008.

Compiling for GPUs

CPUs
• Lots of instructions, little data
– Out of order exec
– Branch prediction
• Reuse and locality
• Task parallel
• Needs OS
• Complex sync
• Latency machines

GPUs
• Few instructions, lots of data
– SIMD
– Hardware threading
• Little reuse
• Data parallel
• No OS
• Simple sync
• Throughput machines

Page 38: Compilation: A Retrospective CS 671 April 29, 2008.

Bug Report

Graphics: shadow acne
Compiler: round-off error


Page 40: Compilation: A Retrospective CS 671 April 29, 2008.

Shader Compiler (SC)

Developers ship games in byte code
– Each time a game starts the shader is compiled

Compiler is hidden in driver
– No user gets to set options or flags

Compiler updates with new driver (once a month)

Compile done each time game is run

About ½ of the code is a traditional compiler, with all the usual stuff
• Strange features: SC fights MS compiler

Page 41: Compilation: A Retrospective CS 671 April 29, 2008.


Power-Aware Computing

Page 42: Compilation: A Retrospective CS 671 April 29, 2008.

Power Issues in Microprocessors

[Figure: panels on Temperature, Capacitive (Dynamic) Power, Static (Leakage) Power, and Di/Dt (Vdd/Gnd Bounce); a plot of Voltage (V) and Current (A) annotated “Minimum Voltage” and “20 cycles”; circuit sketches labeled Vdd, Vin, Vout, CL, ISub, and IGate]

Page 43: Compilation: A Retrospective CS 671 April 29, 2008.

Code Optimizations for Low Power

Reorder instructions to reduce switching effect at functional units and I/O buses

Operand swapping
• Swap the operands at the input of multiplier
• Result is unaltered, but power changes significantly!

Other standard compiler optimizations
• Software pipelining, dead code elimination, redundancy elimination

Use processor-specific instruction styles
• on ARM the default int type is ~20% more efficient than char or short (sign/zero extension)

Page 44: Compilation: A Retrospective CS 671 April 29, 2008.

Code Optimization of Parallel Programs

• Crises lead to paradigm shifts
• Most of our algorithms break down in light of parallelism!
• Many “band aids” exist, need a unified solution
• Need a way to represent synchronization, parallelism in our dataflow analyses
• Need a new IR … PIR
– CFG + PDG = PPG
• Can then update our dataflow analyses

Page 45: Compilation: A Retrospective CS 671 April 29, 2008.

The Big Picture

We now know:
– All of the components of a compiler
– What needs to be done statically vs. dynamically
– The potential impact of language or architecture changes
– Why Java moved the “back-end” to run time
– What compiler researchers are working on in 2008!

