[email protected] 9 June 2013

Featured Invited Talk; The 24th IEEE International Conference on Application-specific Systems, Architectures and Processors (ASAP 2013), 5-7 June 2013, Washington, DC, USA 1

Reiner Hartenstein, TU Kaiserslautern, Germany http://hartenstein.de

The Tunnel Vision Syndrome: Massively Delaying Progress

Reiner Hartenstein

VIPSI-2012 MONTENEGRO

Hotel Splendid in Becici Dec 31, 2012 to Jan 1, 2013

IEEE fellow

FPL fellow

SDPS fellow

Kaiserslautern University of Technology

The George Washington University, 5-7 June 2013, Washington, DC

© 2012, [email protected] http://hartenstein.de

TU Kaiserslautern

Living in Baden-Baden


Outline (1)

•Preface

•History of Computing

•The Systolic Array

•The Kress Array

•The Xputer Paradigm

•The Twin Paradigm Approach

•More Tunnel Vision

•We must Reinvent Computing

•Conclusions

http://www.uni-kl.de

Power consumption by ICT infrastructures: ×30 by 2030 if trends continue

G. Fettweis, E. Zimmermann: ICT Energy Consumption - Trends and Challenges; WPMC'08, Lapland, Finland, 8-11 Sep 2008

© New York Times at Dallas

Google's Electricity Bill

Cost of a data center determined by the monthly power bill

"The possibility of computer equipment power consumption spiraling out of control could have serious consequences for the overall affordability of computing." [L. A. Barroso, Google]

Patent for water-based data centers

Google going to sell electricity

http://hartenstein.de/ComputerStromverbrauch.pdf

PATMOS 2013 - 23rd International Workshop on Power And Timing Modeling, Optimization and Simulation

co-located with VARI 2013 - 4th European Workshop on CMOS Variability

http://xputer.de/PATMOS/

the leading conference on power optimization


The End of Evolution

The end of evolution: we need a revolution!

Power efficiency progress too slow

We need orders of magnitude more parallelism and power efficiency

Disruptive research is urgently required and fundamental issues have to be revisited

We must overcome the von Neumann syndrome and also the widespread Tunnel Vision dementia

re-implementing

„mini“computers: VAX-11/750

quasi-standard around 1980

(personal experience:)

UC Berkeley (visiting professor)

E.I.S. project (my M-&-C action)

my Xputer lab at Kaiserslautern

my CS department at Kaiserslautern

NATO ASI, Urbino 1981

my von Neumann syndrome experience also here

Too many terminals

NATO ASI on VLSI 1981, SOGESTA, Urbino, Italy

orders of magnitude speed-up by using museum equipment

Paolo Antognetti

Donald O. Pederson


The History of Computing (1)

The first electrical computer, prototyped and ready for mass production?

Guess: which year, which company?

The History of Computing (2)

Prototype 1884: Herman Hollerith

the first reconfigurable computer

first Xilinx FPGA 100 years later

1890 US census use

The LUT (lookup table)

datastream-based!

non-volatile!!

size: <3 refrigerators!


Punched Card Data Memory

Not yet invented in 1884:

• magnetic tape (1898),

• the vacuum tube (1904),

• magnetic drum (1932),

• the transistor (1934),

• ferrite core memory (1949),

• hard disc (1956).

… state of the art …

1969: MPGA

1971: PLD

1973: PLA

1982: EPLD

1984: FPGA

6 decades later: paradigm shift

1946

from data streams to instruction streams

30 tons, 178 kW

almost 1000 square feet of floor space

about 3 hours MTBF

[Figure: speed-up factors by software to FPGA migration; logarithmic speed-up axis from 10^0 to 10^6]

DSP and wireless:

•FFT: 100
•Reed-Solomon Decoding: 2400
•Viterbi Decoding: 400
•MAC: 1000

Bioinformatics:

•molecular dynamics simulation: 88
•BLAST: 52
•protein identification: 40
•Smith-Waterman pattern matching: 288

Astrophysics:

•GRAPE: 20

Image processing, Pattern matching, Multimedia:

•SPIHT wavelet-based image compression: 457
•real-time face detection: 6000
•video-rate stereo vision: 900
•pattern recognition: 730

Other labeled values: CT imaging 3000; crypto 1000; DES breaking 28500; DNA seq. 8723

Values without a recoverable label: 3439, 1116* (*: DES br. equipment size)

Speed-up factors are not new: software to DPLA migration (PISA project), >15000

DPLA replacing 256 FPGAs, 1984 (E.I.S. project)

What Parallelism?

why such massive speed-up?

datastream parallelism: no von Neumann bottleneck

instead of:

instruction stream parallelism: many von Neumann bottlenecks

[Reiner's watering can model]

Look at the Reconfigurable Computing Paradox

Instruction Stream Parallelism

instruction stream parallelism: many von Neumann bottlenecks

[Reiner Hartenstein's watering can model]

better illustration

Lambert M. Surhone, Mariam T. Tennoe, Susan F. Henssonow (eds.): Von Neumann Syndrome; Betascript Publishing, 2011

The Beauty and the Joy of Computing …

Max Planck: Replacement of false doctrines by new insights needs 50 years waiting for not only old professors but also their scholars to die off.

50 Years of Software Crisis

F. L. Bauer coined this term in 1961


Early ASAP Conferences: tend back to datastreams!

The 3rd International Conference on Systolic Arrays, 1989 in Killarney, Ireland, in the historic Great Southern Hotel

our contributions:

CHDL-Based CAD System for the Synthesis of Systolic Architectures

Mapping Systolic Arrays onto the Map-Oriented Machine (MoM)

systolic general purpose processor

The ASAP series is an important forum to discuss cures for healing from the von Neumann syndrome

Our ASAP'94 thru ASAP'97 Papers

ASAP'94, San Francisco, CA, USA: A Dynamically Reconfigurable Wavefront Array Architecture for Evaluation of Expressions

ASAP'95, Strasbourg, France: A Parallelizing Compilation Method for the Map-oriented Machine (MoM)

ASAP'96, Chicago, Ill., USA: A Synthesis System for Bus-based Wavefront Array Architectures

ASAP'97, Zurich, CH: A Novel Sequencer Hardware for Application Specific Computing

http://xputer.de/ASAP/

Systolic Arrays (1)

"VAX? That's why it is so slow"

IEEE 7th ISCA, La Baule, France, May 6-8, 1980

from the airport to La Baule

Mario Barbacci:

M. J. Foster and H. T. Kung: The Design of Special-Purpose VLSI Chips ...


Systolic Arrays (3)

M. J. Foster and H. T. Kung: "The Design of Special-Purpose VLSI Chips ..."

why not general purpose?

Karl Steinbuch: It is not sufficient to invent something. You need to recognize that you have invented something.

http://www.fpl.uni-kl.de/papers/publications/karl-steinbuch.html


What Synthesis Method? (2)

of course algebraic! (linear projection)

supports only applications with strictly regular data dependencies

1995: Rainer Kress replaced it by simulated annealing*:

supports also any irregular & wild form pipe networks

*) KressArray [ASP-DAC-1995]

http://kressarray.de/
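
The difference between an algebraic projection and simulated annealing can be made concrete with a small placement sketch. This is not the KressArray mapper: the operator graph, the 3-by-3 grid, the Manhattan wire-length cost and the geometric cooling schedule are all assumptions chosen for illustration, and the names `EDGES`, `wire_length` and `anneal` are hypothetical.

```python
import math
import random

# Hypothetical, deliberately irregular operator graph (not from the talk).
EDGES = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d"), ("d", "e"), ("a", "e")]
NODES = sorted({n for edge in EDGES for n in edge})
GRID = [(x, y) for x in range(3) for y in range(3)]  # assumed 3x3 array of cells


def wire_length(placement):
    """Total Manhattan distance over all edges: the cost to minimize."""
    return sum(
        abs(placement[u][0] - placement[v][0]) + abs(placement[u][1] - placement[v][1])
        for u, v in EDGES
    )


def anneal(seed=0, t_start=5.0, t_end=0.01, alpha=0.95, moves_per_t=50):
    """Place NODES on GRID by simulated annealing; returns (initial, best cost, best placement)."""
    rng = random.Random(seed)
    cells = GRID[:]
    rng.shuffle(cells)
    placement = dict(zip(NODES, cells))      # random initial placement
    free = cells[len(NODES):]                # cells not occupied by any operator
    initial_cost = cost = wire_length(placement)
    best, best_cost = dict(placement), cost
    t = t_start
    while t > t_end:
        for _ in range(moves_per_t):
            node = rng.choice(NODES)
            idx = rng.randrange(len(free))
            old = placement[node]
            placement[node], free[idx] = free[idx], old      # move operator to a free cell
            new_cost = wire_length(placement)
            delta = new_cost - cost
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                cost = new_cost                              # accept (sometimes uphill)
                if cost < best_cost:
                    best, best_cost = dict(placement), cost
            else:
                free[idx], placement[node] = placement[node], old  # reject: undo the move
        t *= alpha                                           # cool down
    return initial_cost, best_cost, best
```

Because the search only needs a cost function, nothing in it requires the graph to be regular, which is the property the slide attributes to the annealing approach.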

Systolic Array Generalization

Rainer Kress: A Fast Reconfigurable ALU for Xputers, Dissertation 1996, Kaiserslautern University of Technology

Ulrich Nageldinger: Coarse-grained Reconfigurable Architectures Design Space Exploration; Dissertation, 2001, Kaiserslautern University of Technology

http://xputers.informatik.uni-kl.de/papers/publications/NageldingerDiss.html

DATE 2001, Munich, Germany ?

Ulrich‘s tie


“It’s not our job”

[Figure: data streams drawn as rows and columns of x, - and | symbols; layout not recoverable from the transcript]

*) or receives

Another Tunnel Vision Symptom

without a sequencer: missed the chance to invent a new machine paradigm

Who generates the data streams?

http://xputer.de/ http://data-streams.org/

*) publ. 1989

[Figure: speed-up factors by software to DPLA migration (PISA project); logarithmic speed-up axis from 10^0 to 10^5; the unlabeled bar values repeat those of the speed-up chart shown earlier in the talk]

>15000

Design Rule Check on MoM Xputer, programmed by MoPL (Lynn Conway's λ-grid-based CMOS rules)

DPLA replacing 256 FPGAs, 1984 (E.I.S. project)

Complexity: linear with area

Processing 4-by-4 Reference Patterns

Design Rule Check accelerator*: DPLA, fabricated by the E.I.S. Multi University Project

*) PISA DRC accelerator [ICCAD 1984]

1984: 1 DPLA replaces 256 FPGAs

DPLA was more area-efficient than FPGA, by several orders of magnitude

Reconfigurable Address Generator

no instruction streams needed even to generate most complex address sequences

GAU

Scan patterns (figure panels a to g):

•atomic scan
•linear scan
•video scan
•-90º rotated video scan
•-45º rotated (mirx (v scan))
•sheared video scan
•non-rectangular video scan
•zigzag video scan
•spiral scan
•feed-back-driven scans
•perfect shuffle

auto-sequencing memory: generalized the DMA
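
A few of the scan patterns named on this slide can be sketched as address sequences. The following Python generators are only a software illustration, not a model of the GAU hardware; the function names, the 0-based coordinates, the row-major memory layout and the reading of "-90º rotated" as a column-by-column raster are all assumptions.

```python
def video_scan(width, height):
    """Raster order: row by row, left to right."""
    for y in range(height):
        for x in range(width):
            yield (x, y)


def rotated_video_scan(width, height):
    """Column by column: the raster turned by 90 degrees (one possible reading)."""
    for x in range(width):
        for y in range(height):
            yield (x, y)


def spiral_scan(width, height):
    """Clockwise inward spiral starting at the top-left corner."""
    left, right, top, bottom = 0, width - 1, 0, height - 1
    while left <= right and top <= bottom:
        for x in range(left, right + 1):          # top edge, going east
            yield (x, top)
        for y in range(top + 1, bottom + 1):      # right edge, going south
            yield (right, y)
        if top < bottom:
            for x in range(right - 1, left - 1, -1):  # bottom edge, going west
                yield (x, bottom)
        if left < right:
            for y in range(bottom - 1, top, -1):      # left edge, going north
                yield (left, y)
        left, right, top, bottom = left + 1, right - 1, top + 1, bottom - 1


def to_addresses(scan, width, base=0):
    """Linearize (x, y) positions into addresses of a row-major memory."""
    return [base + y * width + x for x, y in scan]
```

Each generator produces the complete address sequence from a handful of parameters, which is the sense in which no instruction stream is needed to step through memory.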


Dual paradigm mind set: an old hat - but ignored

time to space mapping: procedural to structural: loop to pipe mapping

1967: W. A. Clark: Macromodular Computer Systems; 1967 SJCC, AFIPS Conf. Proc.

C. G. Bell et al: The Description and Use of Register-Transfer Modules (RTM's); IEEE Trans-C21/5, May 1972

1971

[Figure: chain of flip-flops (FF) passing a token bit; evoke]

why did it take 25 years to find out?
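
The idea of mapping a loop to a pipe can be illustrated in software. In the sketch below the three operations and all function names are hypothetical; Python generators only imitate the structure of a pipe, whereas in hardware the stages would work concurrently on successive data items.

```python
def scale(x):
    return 3 * x


def offset(x):
    return x + 1


def square(x):
    return x * x


def loop_version(data):
    """Procedural (time domain): one processor runs the three steps in sequence per item."""
    out = []
    for x in data:
        t = scale(x)
        t = offset(t)
        t = square(t)
        out.append(t)
    return out


def stage(op, upstream):
    """One pipe stage: applies a fixed operation to every item flowing through it."""
    for item in upstream:
        yield op(item)


def pipe_version(data):
    """Structural (space domain): each step becomes its own stage, chained into a pipe."""
    stream = iter(data)
    for op in (scale, offset, square):
        stream = stage(op, stream)
    return list(stream)
```

Both versions compute the same results; what changes is that the sequence of loop-body steps in time becomes a sequence of stages in space.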

Duality of procedural Languages

Software Languages | Flowware Languages
read next instruction | read next data item
goto (instruction address) | goto (data address)
jump to (instruction address) | jump to (data address)
instruction loop | data loop
instruction loop nesting | data loop nesting
instruction loop escape | data loop escape
instruction stream branching | data stream branching
internally parallel loops: no | internally parallel loops: yes
state register(s): program counter | state register(s): data counter(s)

But there is an asymmetry

more simple: no ALU tasks

MoPL

A Clean Terminology, please

Generalization of the term "Program"

program source | compilation result
Software | instruction streams
Flowware | data streams
Configware | datapath structures configured

Data-Flow Execution Models for Extreme Scale Computing (DFM 2013)

http://www.cs.ucy.ac.cy/dfmworkshop/

in conjunction with PACT 2013, http://www.pactconf.org/

This new series started 10 Oct 2011 at Galveston Island, Texas, USA

Edinburgh, UK, 19-23 Sep 2013

The 1st Dual Paradigm Co-Compiler

Juergen Becker's CoDe-X, 1996

X-C is C language extended by MoPL

[Figure: CoDe-X compilation flow. Labels: X-C; Resource Parameters (supporting different platforms); Partitioner; Analyzer / Profiler; Loop Transformations; Extended Leslie Lamport Hyperplane Theorem. Computer machine paradigm: GNU C compiler, Host Software. Xputer machine paradigm: X-C compiler, DPSS (Rainer Kress), KressArray Flowware, KressArray Configware.]

all 3 types of sources: Software, Configware, Flowware


More Tunnel Vision (1)

Term Rewriting (TR) use in EDA: > 20 years ignored by TR scene

automatic floor plan design by TR: an integer multiplier example, published 1979

specification:

some re-write rules:

solution illustration:

Universidade de Brasilia in 2001: TR-expert Prof. Mauricio Ayala-Rincón: "still the only top-down example - only in verification: is bottom-up."

http://www.fpl.uni-kl.de/karl/karl_history_fbi.html#3.1 algebraic structured design

More Tunnel Vision (2)

20 years ignored by TR scene

microchip layout integer multiplier example, published 1979

floor plan of one slice

http://www.fpl.uni-kl.de/karl/karl_history_fbi.html#3.1 algebraic structured design


A huge design space

Mike Flynn's taxonomy: Single vs. Multiple Instruction vs. Data

Diana's Taxonomy (Diana Göhringer's Ph.D. thesis): reconfigurable or not

Reiner's Taxonomy: datastream-based (anti-machine); noI versus SI or MI

extending Flynn's taxonomy by going heterogeneous

Programmability crisis solution impossible without mastering the entire design space

The tunnel vision of the pre-manycore age

Heterogeneous Taxonomy

The extension into this huge design space is mainly ignored by curricula and even by most R&D scenes

Education Revolution: the M-&-C VLSI Design Revolution

[1980]

The E.I.S. Project: http://xputer.de/EIS/

The by far most effective project in the history of modern computer science

Education Revolution: the M-&-C Design Revolution

traditional division of specialization:

Application level

Register Transfer (RT) level

Logic level

Switching level

Circuit level

Layout level

submit / reject hand-offs between the levels

in-house technology

width of specialization: fragmentation

The Mead-&-Conway strategy: clearing out & intuitive models

reduced width of specialization

"tall thin man": coherence

Application

Silicon Foundry

Removal of the education dilemma

The E.I.S. Project: http://xputer.de/EIS/

More Connected Thinking

The new strategy: heterogeneous platforms

traditional division of specialization:

Application level

Register Transfer (RT) level

Logic level

Switching level

Circuit level

Layout level

"tall thin man": coherence

Interconnect at all levels:

•inter-processor NoC
•pipe-networks
•arbiter
•multiplexing
•distribution
•off-chip communication
•global and local memory
•distributed memory

even more

"tall thin woman": coherence

The Tunnel Vision of Language Designers

… in teaching to students …

Absurdly incomprehensible abstractions are the problem in "standard" languages

[E. A. Lee: Are new languages necessary for multicore? 2007]

[E. A. Lee. The problem with threads. Computer, 2006.]

Concurrency models should operate at component architecture level rather than programming languages. [E. A. Lee]

The High Cost of Movement of Data and Instructions

Less Abstraction Layers ?

Removing Abstraction Layers hides critical sources of efficiency limits.

Hides issues to detect overhead and bottlenecks

We must change how programmers think, also by …

… fusion of abstraction layers: challenging educators

… opening the borders between paradigm domains

CS Education

You cannot* teach Hardware to a Programmer

To a Hardware Guy you always can teach Programming

*) efficiently

[Figure labels: procedural, structural; have, have not]


Conclusions

We need orders of magnitude more parallelism and power efficiency

Disruptive research is urgently required and fundamental issues have to be revisited

We must overcome the von Neumann syndrome and the widespread Tunnel Vision dementia

A reductionist attitude of most R&D areas massively delays the solution of urgent problems

We must disruptively reinvent CS education

We need "une levée en masse"

ASAP 2013, Washington, DC, 5-7 June 2013

Reiner Hartenstein: The Tunnel Vision Syndrome: Massively Delaying Progress

backup for discussion

Program Engineering (2)

The Generalization of Software Engineering

PE: Program Engineering

SE: Software Engineering (CPU)

FE: Flowware Engineering (auto-sequencing Memory, asM)

CE: Configware Engineering (structures: pipe network model etc.)

*) do not confuse with "dataflow"!

The Reconfigurable Computing Paradox


• The spirit from the Mainframe Age is collapsing under the von Neumann syndrome

• There is something fundamentally wrong in using the von Neumann paradigm

• Up to 4 orders of magnitude speedup + tremendously slashing the electricity bill by migration to FPGA

• Bad FPGA technology: reconfigurability overhead, wiring overhead, routing congestion, slow clock speed

• The reason of this paradox ?


Data Sequencing by the MoM


2-dimensional memory organization

adjustable size Scan patterns for sequencing inside the Scan Window and for moving it

(2-D example)


Flowware language example (MoPL): programming the datastream

JPEG zigzag scan pattern

[Figure: 8 × 8 PixMap with x and y axes numbered 1 to 8, showing the scan path; data counter positions 1 to 4 and the two HalfZigZag halves are marked]

*> Declarations

EastScan is step by [1,0]
end EastScan;

SouthScan is step by [0,1]
end SouthScan;

NorthEastScan is loop 6 times until [*,1]
  step by [1,-1] endloop
end NorthEastScan;

SouthWestScan is loop 7 times until [1,*]
  step by [-1,1] endloop
end SouthWestScan;

HalfZigZag is EastScan
  loop 3 times SouthWestScan
    SouthScan NorthEastScan
    EastScan endloop
end HalfZigZag;

Main program:

goto PixMap[1,1]
HalfZigZag;
SouthWestScan
uturn (HalfZigZag)

MoPL: no instruction streams!

(an animation)
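
The MoPL program on this slide can be replayed in Python to check that it visits all 64 cells in JPEG zigzag order. This is a transliteration under assumptions: `DataCounter` and the function names are hypothetical, "until" is read as stopping once the named coordinate is reached, and `uturn (HalfZigZag)` is modelled as replaying the recorded steps of HalfZigZag in reverse order, an interpretation chosen because it reproduces the zigzag; the real MoPL semantics may differ.

```python
SIZE = 8  # 8 x 8 PixMap, coordinates 1..8 as on the slide


class DataCounter:
    """Holds the current [x, y] position and logs every step vector taken."""

    def __init__(self, x, y):
        self.pos = (x, y)
        self.visited = [self.pos]
        self.steps = []

    def step(self, dx, dy):
        self.pos = (self.pos[0] + dx, self.pos[1] + dy)
        self.visited.append(self.pos)
        self.steps.append((dx, dy))


def east_scan(dc):
    dc.step(1, 0)


def south_scan(dc):
    dc.step(0, 1)


def north_east_scan(dc):
    for _ in range(6):              # loop 6 times until [*,1]
        dc.step(1, -1)
        if dc.pos[1] == 1:
            break


def south_west_scan(dc):
    for _ in range(7):              # loop 7 times until [1,*]
        dc.step(-1, 1)
        if dc.pos[0] == 1:
            break


def half_zigzag(dc):
    east_scan(dc)
    for _ in range(3):
        south_west_scan(dc)
        south_scan(dc)
        north_east_scan(dc)
        east_scan(dc)


def zigzag():
    dc = DataCounter(1, 1)                  # goto PixMap[1,1]
    half_zigzag(dc)
    first_half = list(dc.steps)             # step vectors of HalfZigZag
    south_west_scan(dc)                     # the long anti-diagonal
    for dx, dy in reversed(first_half):     # uturn (HalfZigZag), as interpreted here
        dc.step(dx, dy)
    return dc.visited


def jpeg_zigzag_reference(n=SIZE):
    """Independent reference: walk the anti-diagonals x + y = s, alternating direction."""
    order = []
    for s in range(2, 2 * n + 1):
        diag = [(x, s - x) for x in range(max(1, s - n), min(n, s - 1) + 1)]
        order.extend(diag if s % 2 == 0 else reversed(diag))
    return order
```

The whole traversal is expressed as movements of a data counter; no per-pixel address arithmetic appears in the scan declarations themselves.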

The End of Evolution

The end of evolutionary extension of current models

Disruptive research is required in programmability

An entirely new software stack is needed

Fundamental programming issues have to be revisited

We need a much deeper integration between applications and data inside all kinds of memory

Rethink all disciplines from circuit design and test, up to architecture, system design, storage behavior, compilers, run time systems, operating systems, and programming.

What does zettaFLOPS mean?

BTW: July 2005: an early trailblazer

Name | FLOPS
yottaFLOPS | 10^24
zettaFLOPS | 10^21
exaFLOPS | 10^18
petaFLOPS | 10^15
teraFLOPS | 10^12
gigaFLOPS | 10^9
megaFLOPS | 10^6
kiloFLOPS | 10^3

ASCI Red: 850 kW

IBM Roadrunner: 2,483 kW

Following the exemplar: our ideal:

(~1980-…) The VLSI Revolution

The most influential research project in modern computer history.

The brain child of: Carver Mead, Lynn Conway

text book [1980] (Bestseller!)

Solving the VLSI design crisis: missing reply to Moore's Law

missing designer population, design tools (SW), HW for a design as a whole

Project has provided an educational framework to create a population of designers (and researchers) needed

Incubator of EDA* industry, workstations…

comparable to our scenario

VLSI Design Education Spreading Rapidly

1980 - 1983: world-wide

Carver Mead, Lynn Conway

incubator of the EDA industry etc.

The most effective project in the history of modern computer science

Flowware Languages


MoPL: fully supporting the anti machine paradigm – the counterpart of the von Neumann paradigm

general purpose:

Streams-C: defines 1-D streams; generates VHDL

DSP-C: allows to describe key features of DSPs

specialized:

Brook: for modern graphics hardware

StreaMIT

