6.035 Lecture 14, Spring 2010

Loop Optimizations

Instruction Scheduling

Source: http://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-035-computer-language-engineering-spring-2010/lecture-notes/ Saylor Course: http://www.saylor.org/courses/cs304/

The Saylor Foundation 1


Outline

• Scheduling for loops
• Loop unrolling
• Software pipelining
• Interaction with register allocation
• Hardware vs. Compiler
• Induction Variable Recognition
• Loop invariant code motion

Saman Amarasinghe 2 6.035 ©MIT Fall 1998


Scheduling Loops

• Loop bodies are small
• But a lot of time is spent in loops due to the large number of iterations
• Need better ways to schedule loops


Loop Example

• Machine
  – One load/store unit
    • load 2 cycles
    • store 2 cycles
  – Two arithmetic units
    • add 2 cycles
    • branch 2 cycles
    • multiply 3 cycles
  – Both units are pipelined (initiate one op each cycle)

• Source Code
    for i = 1 to N
        A[i] = A[i] * b


Loop Example

• Source Code
    for i = 1 to N
        A[i] = A[i] * b

• Assembly Code (%rdi holds the base, %rax the offset)
    loop:
        mov  (%rdi,%rax), %r10
        imul %r11, %r10
        mov  %r10, (%rdi,%rax)
        sub  $4, %rax
        jz   loop


Loop Example

• Assembly Code
    loop:
        mov  (%rdi,%rax), %r10
        imul %r11, %r10
        mov  %r10, (%rdi,%rax)
        sub  $4, %rax
        jz   loop

[Figure: dependence graph of the loop body with edge latencies. Node priorities: mov (load) d=7, imul d=5, mov (store) d=2, sub d=2, jz d=0.]

• Schedule (9 cycles per iteration)

[Figure: schedule table for one iteration; slot placement is not recoverable from the transcript.]
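The d= labels on the dependence graph can be read as each instruction's longest latency-weighted path to the end of the block, a common priority for list scheduling. The following is a minimal Python sketch of that computation. The edge list is an assumption reconstructed from the machine model (load 2, multiply 3, add 2) and the d= values on the slide, not a copy of the slide's graph, and the name `priorities` is hypothetical.

```python
from functools import lru_cache

# Assumed dependence edges for one iteration: (from, to, latency).
# The 0-latency edges model anti-dependences on %rax (an assumption).
EDGES = [
    ("load", "imul", 2),
    ("imul", "store", 3),
    ("load", "sub", 0),
    ("store", "sub", 0),
    ("sub", "jz", 2),
]

def priorities(edges):
    """Return d[n]: the longest latency-weighted path from n to any sink."""
    succs = {}
    nodes = set()
    for src, dst, lat in edges:
        succs.setdefault(src, []).append((dst, lat))
        nodes.update((src, dst))

    @lru_cache(maxsize=None)
    def d(node):
        # A node with no successors has priority 0.
        return max((lat + d(nxt) for nxt, lat in succs.get(node, [])), default=0)

    return {n: d(n) for n in nodes}

D = priorities(EDGES)
```

Under these assumed edges the result reproduces the slide's labels: load 7, imul 5, store 2, sub 2, jz 0.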



Loop Unrolling

• Unroll the loop body a few times
• Pros:
  – Creates a much larger basic block for the body
  – Eliminates a few loop bounds checks
• Cons:
  – Much larger program
  – Setup code (# of iterations < unroll factor)
  – Beginning and end of the schedule can still have unused slots
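As a source-level illustration of the trade-off above, here is a minimal Python sketch of the example loop A[i] = A[i] * b unrolled by a factor of 2. The function names are hypothetical, and a real compiler would do this on the intermediate or machine code rather than on source.

```python
def scale_rolled(a, b):
    """Reference loop: one element and one bounds check per iteration."""
    for i in range(len(a)):
        a[i] = a[i] * b
    return a

def scale_unrolled2(a, b):
    """Same loop unrolled by 2, with cleanup code for the leftover element."""
    n = len(a)
    i = 0
    # Main loop: two copies of the body, one bounds check per two elements.
    while i + 1 < n:
        a[i] = a[i] * b
        a[i + 1] = a[i + 1] * b
        i += 2
    # Extra code needed because n may be odd or smaller than the unroll factor.
    if i < n:
        a[i] = a[i] * b
    return a
```

The extra `if` at the end is the "setup code" cost listed under Cons: it exists only because the trip count need not be a multiple of the unroll factor.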


Loop Example

    loop:
        mov  (%rdi,%rax), %r10
        imul %r11, %r10
        mov  %r10, (%rdi,%rax)
        sub  $4, %rax
        jz   loop


Loop Example

    loop:
        mov  (%rdi,%rax), %r10
        imul %r11, %r10
        mov  %r10, (%rdi,%rax)
        sub  $4, %rax
        mov  (%rdi,%rax), %r10
        imul %r11, %r10
        mov  %r10, (%rdi,%rax)
        sub  $4, %rax
        jz   loop


Loop Example

    loop:
        mov  (%rdi,%rax), %r10
        imul %r11, %r10
        mov  %r10, (%rdi,%rax)
        sub  $4, %rax
        mov  (%rdi,%rax), %r10
        imul %r11, %r10
        mov  %r10, (%rdi,%rax)
        sub  $4, %rax
        jz   loop

[Figure: dependence graph of the unrolled body with edge latencies. Node priorities, in program order: mov d=14, mul d=12, mov d=9, sub d=9, mov d=7, mul d=5, mov d=2, sub d=2, jz d=0.]

• Schedule (8 cycles per iteration)

[Figure: schedule table for the unrolled body; slot placement is not recoverable from the transcript.]


Loop Unrolling

• Rename registers
  – Use different registers in different iterations


Loop Example

    loop:
        mov  (%rdi,%rax), %r10
        imul %r11, %r10
        mov  %r10, (%rdi,%rax)
        sub  $4, %rax
        mov  (%rdi,%rax), %r10
        imul %r11, %r10
        mov  %r10, (%rdi,%rax)
        sub  $4, %rax
        jz   loop

[Figure: dependence graph of the unrolled body before renaming. Node priorities, in program order: mov d=14, mul d=12, mov d=9, sub d=9, mov d=7, mul d=5, mov d=2, sub d=2, jz d=0.]


Loop Example

    loop:
        mov  (%rdi,%rax), %r10
        imul %r11, %r10
        mov  %r10, (%rdi,%rax)
        sub  $4, %rax
        mov  (%rdi,%rax), %rcx
        imul %r11, %rcx
        mov  %rcx, (%rdi,%rax)
        sub  $4, %rax
        jz   loop

[Figure: dependence graph of the unrolled body, second copy renamed from %r10 to %rcx. Node priorities, in program order: mov d=14, mul d=12, mov d=9, sub d=9, mov d=7, mul d=5, mov d=2, sub d=2, jz d=0.]


Loop Unrolling

• Rename registers
  – Use different registers in different iterations
• Eliminate unnecessary dependencies
  – again, use more registers to eliminate true, anti and output dependencies
  – eliminate dependent chains of calculations when possible
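One way to picture "eliminate dependent chains" is to split the single index update, which every copy of the body must wait on, into two independent counters that each step by twice the stride. The Python sketch below only records the order in which indices are visited. It assumes an even trip count, and the function names are hypothetical.

```python
def visit_chained(n):
    """One counter decremented twice per trip: the second use waits on the first update."""
    order = []
    i = n
    while i > 0:
        order.append(i)
        i -= 1
        order.append(i)
        i -= 1
    return order

def visit_split(n):
    """Two counters, each stepping by 2: the two halves of the body are independent."""
    order = []
    i, k = n, n - 1
    while i > 0:
        order.append(i)
        i -= 2
        order.append(k)
        k -= 2
    return order
```

Both versions visit the same indices in the same order for even `n`; the second simply removes the serial dependence between the two index updates, at the cost of one more register.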


Loop Example

    loop:
        mov  (%rdi,%rax), %r10
        imul %r11, %r10
        mov  %r10, (%rdi,%rax)
        sub  $4, %rax
        mov  (%rdi,%rax), %rcx
        imul %r11, %rcx
        mov  %rcx, (%rdi,%rax)
        sub  $4, %rax
        jz   loop

[Figure: dependence graph after register renaming. Node priorities, in program order: mov d=14, mul d=12, mov d=9, sub d=9, mov d=7, mul d=5, mov d=2, sub d=2, jz d=0.]


Loop Example

    loop:
        mov  (%rdi,%rax), %r10
        imul %r11, %r10
        mov  %r10, (%rdi,%rax)
        sub  $8, %rax
        mov  (%rdi,%rbx), %rcx
        imul %r11, %rcx
        mov  %rcx, (%rdi,%rbx)
        sub  $8, %rbx
        jz   loop

[Figure: dependence graph with separate offset registers %rax and %rbx. Node priorities, first copy: mov d=5, mul d=3, mov d=0, sub d=0; second copy: mov d=7, mul d=5, mov d=2, sub d=2, jz d=0.]


Loop Example

    loop:
        mov  (%rdi,%rax), %r10
        imul %r11, %r10
        mov  %r10, (%rdi,%rax)
        sub  $8, %rax
        mov  (%rdi,%rbx), %rcx
        imul %r11, %rcx
        mov  %rcx, (%rdi,%rbx)
        sub  $8, %rbx
        jz   loop

[Figure: dependence graph with separate offset registers %rax and %rbx. Node priorities, first copy: mov d=5, mul d=3, mov d=0, sub d=0; second copy: mov d=7, mul d=5, mov d=2, sub d=2, jz d=0.]

• Schedule (4.5 cycles per iteration)

[Figure: schedule table for the unrolled, renamed body; slot placement is not recoverable from the transcript.]



Software Pipelining

• Try to overlap multiple iterations so that the slots will be filled
• Find the steady-state window so that:
  – all the instructions of the loop body are executed
  – but from different iterations
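The steady-state idea can be sketched at source level for the running example A[i] = A[i] * b. In the Python sketch below, each trip of the main loop performs one store, one multiply and one load, each belonging to a different original iteration. The three-stage split and the name `scale_pipelined` are illustrative assumptions; a real software pipeliner works on machine instructions and resource constraints.

```python
def scale_pipelined(a, b):
    """Multiply every element of a by b, with iterations overlapped by hand."""
    n = len(a)
    if n < 2:
        # Too few iterations to fill the pipeline: fall back to the plain loop.
        for i in range(n):
            a[i] = a[i] * b
        return a

    # Prologue: start iterations 0 and 1 without finishing either.
    loaded = a[0]            # load for iteration 0
    product = loaded * b     # multiply for iteration 0
    loaded = a[1]            # load for iteration 1

    # Steady state: every stage of the body runs, each for a different iteration.
    for i in range(2, n):
        a[i - 2] = product       # store for iteration i-2
        product = loaded * b     # multiply for iteration i-1
        loaded = a[i]            # load for iteration i

    # Epilogue: drain the iterations still in flight.
    a[n - 2] = product
    a[n - 1] = loaded * b
    return a
```

The prologue and epilogue correspond to the code that fills and drains the pipeline around the steady-state window.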


Loop Example

• Assembly Code
    loop:
        mov  (%rdi,%rax), %r10
        imul %r11, %r10
        mov  %r10, (%rdi,%rax)
        sub  $4, %rax
        jz   loop

• Schedule

[Figure: schedule table for a single iteration (mov, mul, mov, sub, jz); slot placement is not recoverable from the transcript.]


Loop Example

• Assembly Code
    loop:
        mov  (%rdi,%rax), %r10
        imul %r11, %r10
        mov  %r10, (%rdi,%rax)
        sub  $4, %rax
        jz   loop

• Schedule

[Figure: schedule table overlapping successive iterations. Instructions are subscripted by iteration (mov, mov1, mov2, ...; mul, mul1, ...; sub, sub1, ...; jz, jz1, ...); slot placement is not recoverable from the transcript.]


Loop Example

• Assembly Code
    loop:
        mov  (%rdi,%rax), %r10
        imul %r11, %r10
        mov  %r10, (%rdi,%rax)
        sub  $4, %rax
        jz   loop

• Schedule (2 cycles per iteration)

[Figure: schedule table overlapping successive iterations. Instructions are subscripted by iteration (mov, mov1, mov2, ...; mul, mul1, ...; sub, sub1, ...; jz, jz1, ...); slot placement is not recoverable from the transcript.]


Loop Example

• 4 iterations are overlapped
  – value of %r11 doesn't change
  – 4 regs for (%rdi,%rax)
  – each addr. incremented by 4*4
  – 4 regs to keep value %r10
  – Same registers can be reused after 4 of these blocks: generate code for 4 blocks, otherwise need to move

    loop:
        mov  (%rdi,%rax), %r10
        imul %r11, %r10
        mov  %r10, (%rdi,%rax)
        sub  $4, %rax
        jz   loop

[Figure: steady-state window of the overlapped schedule (mov4, mov2, mul3, jz1, mul2, sub2 and neighboring slots).]


Software Pipelining

• Optimal use of resources
• Need a lot of registers
  – Values in multiple iterations need to be kept
• Issues in dependencies
  – Executing a store instruction in an iteration before the branch instruction is executed for a previous iteration (writing when it should not have)
  – Loads and stores are issued out-of-order (need to figure out dependencies before doing this)
• Code generation issues
  – Generate pre-amble and post-amble code
  – Multiple blocks so no register copy is needed



Register Allocation and Instruction Scheduling

• If register allocation is before instruction scheduling
  – restricts the choices for scheduling


Example

    1: mov 4(%rbp), %rax
    2: add %rax, %rbx
    3: mov 8(%rbp), %rax
    4: add %rax, %rcx


Example

    1: mov 4(%rbp), %rax
    2: add %rax, %rbx
    3: mov 8(%rbp), %rax
    4: add %rax, %rcx

[Figure: dependence graph over instructions 1 to 4 with edge latencies.]


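Example

    1: mov 4(%rbp), %rax
    2: add %rax, %rbx
    3: mov 8(%rbp), %rax
    4: add %rax, %rcx

[Figure: dependence graph over instructions 1 to 4 with edge latencies.]

[Figure: schedule table with rows ALUop (instructions 2, 4), MEM 1 (instructions 1, 3) and MEM 2 (instructions 1, 3); cycle columns are not recoverable from the transcript.]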


Example

    1: mov 4(%rbp), %rax
    2: add %rax, %rbx
    3: mov 8(%rbp), %rax
    4: add %rax, %rcx

[Figure: dependence graph over instructions 1 to 4 with edge latencies.]

Anti-dependence: how about a different register?


Example

    1: mov 4(%rbp), %rax
    2: add %rax, %rbx
    3: mov 8(%rbp), %r10
    4: add %r10, %rcx

[Figure: dependence graph over instructions 1 to 4 after renaming, with edge latencies.]

Anti-dependence: how about a different register?


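Example

    1: mov 4(%rbp), %rax
    2: add %rax, %rbx
    3: mov 8(%rbp), %r10
    4: add %r10, %rcx

[Figure: dependence graph over instructions 1 to 4 after renaming, with edge latencies.]

[Figure: schedule table with rows ALUop (instructions 2, 4), MEM 1 (instructions 1, 3) and MEM 2 (instructions 1, 3); cycle columns are not recoverable from the transcript.]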
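To make the effect of the renaming concrete, the following Python sketch lists the register dependences in the four-instruction example before and after %rax is replaced by %r10 in instructions 3 and 4. The read/write sets are written out by hand, memory dependences are ignored, and the name `dependences` is hypothetical; this is a simplified illustration, not a full dependence analysis.

```python
# Each entry: (label, registers read, registers written).
# For AT&T `add %src, %dst`, the destination is both read and written.
BEFORE = [
    (1, {"rbp"}, {"rax"}),          # mov 4(%rbp), %rax
    (2, {"rax", "rbx"}, {"rbx"}),   # add %rax, %rbx
    (3, {"rbp"}, {"rax"}),          # mov 8(%rbp), %rax
    (4, {"rax", "rcx"}, {"rcx"}),   # add %rax, %rcx
]
AFTER = [
    (1, {"rbp"}, {"rax"}),          # mov 4(%rbp), %rax
    (2, {"rax", "rbx"}, {"rbx"}),   # add %rax, %rbx
    (3, {"rbp"}, {"r10"}),          # mov 8(%rbp), %r10
    (4, {"r10", "rcx"}, {"rcx"}),   # add %r10, %rcx
]

def dependences(instrs):
    """Return (earlier, later, kind) register dependences in program order."""
    deps = []
    last_writer = {}   # register -> label of its most recent writer
    readers = {}       # register -> labels that read it since that write
    for label, reads, writes in instrs:
        for reg in sorted(reads):
            if reg in last_writer:
                deps.append((last_writer[reg], label, "true"))
        for reg in sorted(writes):
            for reader in readers.get(reg, []):
                deps.append((reader, label, "anti"))
            if reg in last_writer:
                deps.append((last_writer[reg], label, "output"))
        for reg in reads:
            readers.setdefault(reg, []).append(label)
        for reg in writes:
            last_writer[reg] = label
            readers[reg] = []
    return deps
```

With the original allocation the sketch reports the anti-dependence from 2 to 3 (plus an output dependence from 1 to 3), which forces the second load to wait. After renaming only the two true dependences remain, so the pairs (1, 2) and (3, 4) can be scheduled independently.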



Register Allocation and Instruction Scheduling

• If register allocation is before instruction scheduling
  – restricts the choices for scheduling
• If instruction scheduling is before register allocation
  – Register allocation may spill registers
  – Will change the carefully done schedule!!!



Superscalar: Where have all the transistors gone?

• Out of order execution
  – If an instruction stalls, go beyond that and start executing non-dependent instructions
  – Pros:
    • Hardware scheduling
    • Tolerates unpredictable latencies
  – Cons:
    • Instruction window is small


Superscalar: Where have all the transistors gone?

• Register renaming
  – If there is an anti or output dependency of a register that stalls the pipeline, use a different hardware register
  – Pros:
    • Avoids anti and output dependencies
  – Cons:
    • Cannot do more complex transformations to eliminate dependencies


Hardware vs. Compiler

• In a superscalar, hardware and compiler scheduling can work hand-in-hand
• Hardware can reduce the burden when latencies are not predictable by the compiler
• Compiler can still greatly enhance the performance
  – Large instruction window for scheduling
  – Many program transformations that increase parallelism
• Compiler is even more critical when there is no hardware support
  – VLIW machines (Itanium, DSPs)



Induction Variables

• Example
    i = 200
    for j = 1 to 100
        a(i) = 0
        i = i - 1


Induction Variables

• Example
    i = 200
    for j = 1 to 100
        a(i) = 0
        i = i - 1

Basic induction variable:
    J = 1, 2, 3, 4, ...

Index variable i in a(i):
    I = 200, 199, 198, 197, ...


Induction Variables

• Example
    i = 200
    for j = 1 to 100
        a(i) = 0
        i = i - 1

Basic induction variable:
    J = 1, 2, 3, 4, ...

Index variable i in a(i):
    I = 200, 199, 198, 197, ... = 201 - J


Induction Variables

• Example
    i = 200
    for j = 1 to 100
        a(201 - j) = 0
        i = i - 1

Basic induction variable:
    J = 1, 2, 3, 4, ...

Index variable i in a(i):
    I = 200, 199, 198, 197, ... = 201 - J


Induction Variables

• Example
    for j = 1 to 100
        a(201 - j) = 0

Basic induction variable:
    J = 1, 2, 3, 4, ...

Index variable i in a(i):
    I = 200, 199, 198, 197, ... = 201 - J
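The rewrite above can be checked with a small Python sketch that runs the loop both ways: once with the separately maintained variable i, and once with i replaced by 201 - j. The array is sized 201 so that the slide's 1-based indices can be used directly; the function names are hypothetical.

```python
def with_separate_index():
    """Original form: i is updated by hand on every iteration."""
    a = [None] * 201          # slots 1..200 used, slot 0 unused
    i = 200
    for j in range(1, 101):
        assert i == 201 - j   # the relation the optimization relies on
        a[i] = 0
        i = i - 1
    return a

def with_closed_form():
    """Rewritten form: i has been eliminated in favor of 201 - j."""
    a = [None] * 201
    for j in range(1, 101):
        a[201 - j] = 0
    return a
```

Both versions zero exactly elements 101 through 200, so the variable i and its update can be dropped once every use has been rewritten.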


What are induction variables?

• x is an induction variable of a loop L if
  – the variable changes its value every iteration of the loop
  – the value is a function of the number of iterations of the loop

• In compilers this function is normally a linear function
  – Example: for loop index variable j, function c*j + d


What can we do with induction variables?

• Use them to perform strength reduction

• Get rid of them
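
As a rough illustration of the first use, strength reduction replaces the multiplication in a dependent induction variable c*j + d with one addition per iteration. The constants and function names in this Python sketch are illustrative assumptions, not taken from the lecture.

```python
def values_with_multiply(c, d, n):
    """Recompute k = c*j + d from scratch on every iteration."""
    out = []
    for j in range(1, n + 1):
        out.append(c * j + d)
    return out


def values_strength_reduced(c, d, n):
    """Keep k in a temporary and bump it by c each iteration."""
    out = []
    k = d                # value of c*j + d at j = 0
    for _ in range(n):
        k = k + c        # an add replaces the multiply
        out.append(k)
    return out


print(values_with_multiply(-2, 202, 5))
print(values_strength_reduced(-2, 202, 5))
```

Both calls should print the same list, since the running sum tracks c*j + d exactly.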


Classification of induction variables

• Basic induction variables
  – Explicitly modified by the same constant amount once during each iteration of the loop
  – Example: loop index variable

• Dependent induction variables
  – Can be expressed in the form a*x + b, where a and b are loop invariant and x is an induction variable
  – Example: 202 - 2*j


Classification of induction variables

• Class of induction variables: all induction variables with the same basic variable in their linear equations

• Basis of a class: the basic variable that determines that class
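
One way to picture the classes is to group variables by the basic variable in their linear form. In this Python sketch the table of linear forms is a made-up example, and the variables i and m are hypothetical additions for illustration.

```python
from collections import defaultdict

# (basis, a, b) means the variable equals a*basis + b.
linear_forms = {
    "j": ("j", 1, 0),      # basic induction variable
    "k": ("j", -2, 202),   # dependent: 202 - 2*j
    "i": ("i", 1, 0),      # a second, hypothetical basic variable
    "m": ("i", 4, 0),      # hypothetical dependent variable: 4*i
}

# Each class is keyed by its basis and lists every variable built on it.
grouped = defaultdict(list)
for var, (basis, _a, _b) in linear_forms.items():
    grouped[basis].append(var)
classes = dict(grouped)
print(classes)
```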


Finding Basic Induction Variables

• Look inside loop nodes
• Find variables whose only modification is of the form j = j + d where d is a loop constant
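
The search described above might be sketched as follows. The tuple-based loop representation and the function name are assumptions made for this example; a real compiler would walk its own intermediate representation.

```python
def find_basic_induction_variables(body):
    """Return {var: (op, d)} for variables whose only assignment in the
    loop is var = var + d or var = var - d, with d a loop constant.

    body is a list of (dest, op, left, right) tuples; operands are
    variable names (str) or integer literals.
    """
    assigned = [dest for dest, _, _, _ in body]

    def is_loop_constant(operand):
        # A literal, or a variable that is never assigned inside the loop.
        return isinstance(operand, int) or operand not in assigned

    basic = {}
    for dest, op, left, right in body:
        if assigned.count(dest) != 1:
            continue                      # modified more than once
        if op in ("+", "-") and left == dest and is_loop_constant(right):
            basic[dest] = (op, right)     # j = j + d  or  j = j - d
        elif op == "+" and right == dest and is_loop_constant(left):
            basic[dest] = ("+", left)     # j = d + j
    return basic


# Hypothetical loop body with the loop increment made explicit.
loop_body = [
    ("i", "-", "i", 1),   # i = i - 1
    ("j", "+", "j", 1),   # j = j + 1
    ("k", "*", "j", 2),   # k = j * 2 (not a basic induction variable)
]
basic_vars = find_basic_induction_variables(loop_body)
print(basic_vars)
```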


Finding Dependent Induction Variables

• Find all the basic induction variables

• Search for variables k with a single assignment in the loop

• Look for assignments of the form k = e op j or k = -j, where j is an induction variable and e is loop invariant


Finding Dependent Induction Variables

• Example

    for i = 1 to 100
        j = i*c
        k = j+1
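
For this example, a small Python sketch can derive the linear form of each variable. It rests on several simplifying assumptions that are not from the lecture: the loop body is a list of tuples, loop-invariant operands are integer literals (c is taken to be 3), and every use of a variable follows its definition in the body. The function name is hypothetical.

```python
def find_dependent_induction_variables(body, basic):
    """Map each induction variable to (basis, a, b), meaning a*basis + b.

    body:  list of (dest, op, left, right); unary minus is written as
           (dest, "neg", operand, None).
    basic: set of basic induction variable names.
    """
    forms = {v: (v, 1, 0) for v in basic}
    assigned = [stmt[0] for stmt in body]
    changed = True
    while changed:                        # repeat until nothing new is found
        changed = False
        for dest, op, left, right in body:
            if dest in forms or assigned.count(dest) != 1:
                continue                  # already known, or assigned twice
            form = None
            if op == "neg" and left in forms:
                basis, a, b = forms[left]            # k = -j
                form = (basis, -a, -b)
            elif left in forms and isinstance(right, int):
                basis, a, b = forms[left]            # k = j op e
                if op == "+":
                    form = (basis, a, b + right)
                elif op == "-":
                    form = (basis, a, b - right)
                elif op == "*":
                    form = (basis, a * right, b * right)
            elif right in forms and isinstance(left, int):
                basis, a, b = forms[right]           # k = e op j
                if op == "+":
                    form = (basis, a, left + b)
                elif op == "-":
                    form = (basis, -a, left - b)
                elif op == "*":
                    form = (basis, left * a, left * b)
            if form is not None:
                forms[dest] = form
                changed = True
    return forms


loop_body = [
    ("j", "*", "i", 3),   # j = i*c, with c assumed to be 3
    ("k", "+", "j", 1),   # k = j + 1
]
forms = find_dependent_induction_variables(loop_body, {"i"})
print(forms)
```

Here j and k both land in the class whose basis is i, with k = 3*i + 1 under the assumed value of c.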


A special case

    t = 202
    for j = 1 to 100
        t = t - 2
        a(j) = t
        t = t - 2
        b(j) = t


A special case

Original:

    t = 202
    for j = 1 to 100
        t = t - 2
        a(j) = t
        t = t - 2
        b(j) = t

Transformed:

    u1 = 204
    u2 = 202
    for j = 1 to 100
        u1 = u1 - 4
        a(j) = u1
        u2 = u2 - 4
        b(j) = u2
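
Tracing the original loop gives a(j) = 204 - 4*j and b(j) = 202 - 4*j, so each use of t behaves like its own induction variable with step -4. The Python sketch below checks this under the assumption that each decrement comes before its store, which requires initial values u1 = 204 and u2 = 202. Dictionaries stand in for the arrays, and the function names are hypothetical.

```python
def original(n=100):
    """t is modified twice per iteration, so it is not a basic induction variable."""
    a, b = {}, {}
    t = 202
    for j in range(1, n + 1):
        t = t - 2
        a[j] = t
        t = t - 2
        b[j] = t
    return a, b


def split(n=100):
    """Each use of t gets its own variable that steps by -4 once per iteration."""
    a, b = {}, {}
    u1 = 204   # chosen so the first store is a(1) = 200
    u2 = 202   # chosen so the first store is b(1) = 198
    for j in range(1, n + 1):
        u1 = u1 - 4
        a[j] = u1
        u2 = u2 - 4
        b[j] = u2
    return a, b


print(original() == split())
```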


Outline

• Scheduling for loops
• Loop unrolling
• Software pipelining
• Interaction with register allocation
• Hardware vs. Compiler
• Induction Variable Recognition
• Loop invariant code motion


Loop Invariant Code Motion

• If a computation produces the same value in every loop iteration, move it out of the loop


Loop Invariant Code Motion

• If a computation produces the same value in every loop iteration, move it out of the loop

    for i = 1 to N
        x = x + 1
        for j = 1 to N
            a(i,j) = 100*N + 10*i + j + x


Loop Invariant Code Motion

• If a computation produces the same value in every loop iteration, move it out of the loop

    t1 = 100*N
    for i = 1 to N
        x = x + 1
        for j = 1 to N
            a(i,j) = 100*N + 10*i + j + x


Loop Invariant Code Motion

• If a computation produces the same value in every loop iteration, move it out of the loop

    t1 = 100*N
    for i = 1 to N
        x = x + 1
        for j = 1 to N
            a(i,j) = t1 + 10*i + j + x


Loop Invariant Code Motion

• If a computation produces the same value in every loop iteration, move it out of the loop

    t1 = 100*N
    for i = 1 to N
        x = x + 1
        t2 = t1 + 10*i + x
        for j = 1 to N
            a(i,j) = t1 + 10*i + j + x


Loop Invariant Code Motion

• If a computation produces the same value in every loop iteration, move it out of the loop

    t1 = 100*N
    for i = 1 to N
        x = x + 1
        t2 = t1 + 10*i + x
        for j = 1 to N
            a(i,j) = t2 + j
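
The hoisting steps can be checked with a short Python sketch that runs the original loop nest and the fully hoisted one and compares the results. The dictionary standing in for the 2-D array, the function names, and the sample arguments are assumptions made for illustration.

```python
def fill_original(n, x):
    """Original loop nest: the whole expression is evaluated in the inner loop."""
    a = {}
    for i in range(1, n + 1):
        x = x + 1
        for j in range(1, n + 1):
            a[i, j] = 100 * n + 10 * i + j + x
    return a


def fill_hoisted(n, x):
    """After loop invariant code motion at both nesting levels."""
    a = {}
    t1 = 100 * n                  # invariant in both loops
    for i in range(1, n + 1):
        x = x + 1
        t2 = t1 + 10 * i + x      # invariant in the inner loop only
        for j in range(1, n + 1):
            a[i, j] = t2 + j
    return a


print(fill_original(4, 7) == fill_hoisted(4, 7))
```

Note that t2 has to stay inside the outer loop because both i and x change on every outer iteration; only 100*N can move all the way out.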


MIT OpenCourseWare
http://ocw.mit.edu

6.035 Computer Language Engineering Spring 2010

For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.
