Source: ECE699 lecture 9, George Mason University, ece.gmu.edu › ... › S15 › viewgraphs › ECE699_lecture_9.pdf

High-Level Synthesis Part 1

ECE 699: Lecture 9

Required Reading

The Zynq Book
•  Chapter 14: Spotlight on High-Level Synthesis
•  Chapter 15: Vivado HLS: A Closer Look

S. Neuendorffer and F. Martinez-Vallina, Building Zynq Accelerators with Vivado High Level Synthesis, FPGA 2013 Tutorial (selected slides on Piazza)

Recommended Reading

•  G. Martin and G. Smith, “High-Level Synthesis: Past, Present, and Future,” IEEE Design & Test of Computers, vol. 26, no. 4, pp. 18–25, July 2009.
•  Vivado Design Suite Tutorial, High-Level Synthesis, UG871, Nov. 2014.
•  Vivado Design Suite User Guide, High-Level Synthesis, UG902, Oct. 2014.
•  Introduction to FPGA Design with Vivado High-Level Synthesis, UG998, Jul. 2013.

4 ECE 448 – FPGA and ASIC Design with VHDL

Behavioral Synthesis

[Figure: Algorithm, I/O Behavior, and Target Library → Behavioral Synthesis → RTL Design → Logic Synthesis → Gate-level Netlist. One part of the figure is labeled "Classic RTL Design Flow".]

Need for High-Level Design

•  Higher level of abstraction
•  Modeling complex designs
•  Reduce design efforts
•  Fast turnaround time
•  Technology independence
•  Ease of HW/SW partitioning

Platform Mapping: SW/HW Partitioning

[Figure: a Program is split into Software (executed in the microprocessor system) and Hardware (executed in the reconfigurable processor system).]

SW/HW Partitioning & Coding: Traditional Approach

[Figure: Specification → SW/HW Partitioning → SW Coding | HW Coding → SW Compilation | HW Compilation → SW Profiling | HW Profiling]

SW/HW Partitioning & Coding: New Approach

[Figure: Specification → SW/HW Coding → SW Compilation | HW Compilation → SW Profiling | HW Profiling → SW/HW Partitioning]

Advantages of Behavioral Synthesis

•  Easy to model higher levels of complexity
•  Source code is smaller than the equivalent RTL code
•  Generates RTL much faster than manual coding
•  Multi-cycle functionality
•  Loops
•  Memory access

Short History of High-Level Synthesis

Generation 1 (1980s-early 1990s): research period

Generation 2 (mid 1990s-early 2000s):
•  Commercial tools from Synopsys, Cadence, Mentor Graphics, etc.
•  Input languages: behavioral HDLs
•  Target: ASIC
Outcome: Commercial failure

Generation 3 (from early 2000s):
•  Domain oriented commercial tools: in particular for DSP
•  Input languages: C, C++, C-like languages (Impulse C, Handel C, etc.), Matlab + Simulink, Bluespec
•  Target: FPGA, ASIC, or both
Outcome: First success stories

Hardware-Oriented High-Level Languages

•  C-Based System level languages
    •  Commercial
        •  Handel C -- Celoxica Ltd.
        •  Impulse C -- Impulse Accelerated Technologies
        •  Carte C -- SRC Computers
        •  SystemC -- The Open SystemC Initiative
    •  Research
        •  Streams-C -- Los Alamos National Laboratory
        •  SA-C -- Colorado State University, University of California, Riverside, Khoral Research, Inc.
        •  SpecC -- University of California, Irvine and SpecC Technology Open Consortium

Other High-Level Design Flows

•  Matlab-based
    •  AccelChip DSP Synthesis -- AccelChip
    •  System Generator for DSP -- Xilinx
•  GUI Data-Flow based
    •  Corefire -- Annapolis Microsystems
•  Java-based
    •  Commercial
        •  Forge -- Xilinx
    •  Research
        •  JHDL -- Brigham Young University

Handel-C Overview

•  High-level language based on ISO/ANSI-C for the implementation of algorithms in hardware

•  Allows software engineers to design hardware without retraining

•  Clean extensions for hardware design including flexible data widths, parallelism and communications

•  Well defined timing model
    •  Each statement takes a single clock cycle

•  Includes extended operators for bit manipulation, and high-level mathematical macros (including floating point)

Handel-C/ANSI-C Comparisons

[Figure: diagram comparing the features of ANSI-C and Handel-C. Labels, in the order they appear on the slide:]
•  Preprocessors, i.e. #define
•  Structures
•  ANSI-C constructs: for, while, if, switch
•  Functions
•  Arrays
•  Pointers
•  Arithmetic operators
•  Bitwise logical operators
•  Logical operators
•  ANSI-C Standard Library
•  Recursion
•  Floating point
•  Handel-C Standard Library
•  Parallelism
•  Arbitrary width variables
•  RAM, ROM
•  Signals
•  Interfaces
•  Enhanced bit manipulation

Handel-C Design Flow

[Figure: flow diagram with the blocks Executable Specification, Handel-C, VHDL, Synthesis, EDIF (twice), and Place & Route.]

Different Levels of C/C++ Synthesis Abstraction

[Figure: languages placed along an axis running from "less abstract, more implementation-specific" to "more abstract, less implementation-specific". Languages shown: Verilog and VHDL, SystemC, augmented C/C++, pure C/C++. Domains shown: RTL domain (implementation-specific), timed C domain (implementation-specific), untimed C domain (non-implementation-specific).]

Source: The Design Warrior’s Guide to FPGAs: Devices, Tools, and Flows. ISBN 0750676043
Copyright © 2004 Mentor Graphics Corp. (www.mentor.com)

Pure Untimed C/C++ Design Flow

[Figure: Pure C/C++ (non-implementation-specific, easy to create, fast to simulate, easy to modify) → Pure C/C++ Synthesis, with user interaction and guidance → auto-generated, implementation-specific Verilog/VHDL RTL → RTL Synthesis → gate-level netlist (ASIC target) or LUT/CLB-level netlist (FPGA target).]

Source: The Design Warrior’s Guide to FPGAs: Devices, Tools, and Flows. ISBN 0750676043
Copyright © 2004 Mentor Graphics Corp. (www.mentor.com)

Mentor Graphics – Catapult C

•  Catapult C automatically converts un-timed C/C++ descriptions into synthesizable RTL.

SystemC-based design-flow alternatives

[Figure: alternative SystemC flows. SystemC → Auto-RTL Translation → Verilog/VHDL RTL → RTL Synthesis → gate-level netlist, or SystemC → SystemC Synthesis → gate-level netlist. Annotation on the figure: implementation-specific, relatively slow to simulate, relatively difficult to modify.]

SystemC Evolution

[Figure: SystemC 1.0 and SystemC 2.0 plotted against abstraction levels (RTL, behavioral/transaction-level, algorithmic, system) and against timed versus untimed modeling.]

Source: The Design Warrior’s Guide to FPGAs: Devices, Tools, and Flows. ISBN 0750676043
Copyright © 2004 Mentor Graphics Corp. (www.mentor.com)

Reconfigurable Supercomputers

What is a Reconfigurable Computer?

[Figure: a microprocessor system (several µPs, each with its own µP memory, plus I/O and an interface) connected through the interfaces to a reconfigurable system (several FPGAs, each with its own FPGA memory, plus I/O and an interface).]

Reconfigurable Supercomputers

Machine                              Released
SRC 6 from SRC Computers             2002
Cray XD1 from Cray                   2005
SGI Altix from SGI                   2005
SRC 7 from SRC Computers, Inc.       2006

Pros and cons of reconfigurable computers

+ can be programmed using high-level programming languages, such as C, by mathematicians and scientists themselves
+ facilitates hardware/software co-design
+ shortens development time, encourages experimentation and complex optimizations
+ allows sharing costs among users of various applications
- high entry cost (~$100,000)
- hardware-aware programming
- limited portability
- limited availability of libraries
- limited maturity of tools

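SRC Programming Model

[Figure: main.c, written in ANSI C, runs on the microprocessor and calls function_1() and function_2(). function_1 and function_2 are written in MAP C (a subset of ANSI C) and run on the FPGA, where they call macros: macro_1(a, b, c), macro_2(b, d), macro_2(c, e) in one function and macro_3(s, t), macro_1(n, b), macro_4(t, k) in the other. The macros (macro_1, macro_2, macro_3, macro_4, ...) come from libraries of macros written in VHDL and appear on the FPGA as blocks (Macro_1, Macro_2, Macro_2) connected by the signals a, b, c, d, e and I/O.]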

SRC Compilation Process

[Figure: application sources (.c or .f files) go through the µP compiler to object files (.o files). The .mc or .mf files go through the MAP compiler, which produces object files (.o files) and HDL sources (.v files). Macro sources (.vhd or .v files) and the generated HDL sources go through logic synthesis to netlists (.ngo files) and then through place & route to configuration bitstreams (.bin files). The linker combines the object files and bitstreams into the application executable.]

Library Development - SRC

[Figure: table of the languages used by each role on each subsystem. Roles: Application Programmer, Library Developer. Subsystems: µP system, FPGA system. Languages shown: HLL (C, Fortran), LLL (ASM), HDL (VHDL, Verilog).]

SRC Programming Environment

+ very easy to learn and use
+ standard ANSI C
+ hides implementation details
+ very well integrated environment
+ mature: in production use for over 4 years with constant improvements
- subset of C
- legacy C code requires rewriting
- C limitations in describing HW (parallelism, data types)
- closed environment, limited portability of code to HW platforms other than SRC

Application Development for Reconfigurable Computers

[Figure: development stages shown: Program Entry, Platform mapping, Compilation, Execution, Debugging & Verification.]

Ideal Program Entry

Program Entry

Function

Actual Program Entry

[Figure: in practice, Program Entry covers the Function plus the SW/HW Interface and FPGA Mapping. Items shown:]
•  SW/HW Partitioning
•  Data Transfers & Synchronization
•  Use of Internal and External Memories
•  Sequence of Run-time Reconfigurations
•  Use of FPGA Resources (multipliers, µP cores)
•  Preferred Architectures

Cinderella Story

AutoESL Design Technologies, Inc. (25 employees)
Flagship product: AutoPilot, translating C/C++/SystemC to VHDL or Verilog
•  Acquired by the biggest FPGA company, Xilinx Inc., in 2011
•  AutoPilot integrated into the primary Xilinx toolset, Vivado, as Vivado HLS, released in 2012
“High-Level Synthesis for the Masses”

Vivado HLS

[Figure: High Level Language (C, C++, SystemC) → Vivado HLS → Hardware Description Language (VHDL or Verilog)]

HLS-Based Development and Benchmarking Flow

[Figure: Reference Implementation in C → Manual Modifications (pragmas, tweaks) → HLS-ready C code → High-Level Synthesis → HDL Code → Physical Implementation (FPGA Tools) → Netlist → Post Place & Route Results. Test Vectors feed Functional Verification and Timing Verification.]

LegUp – Academic Tool for HLS

–  Open-source HLS Tool
    •  Developed at the University of Toronto
    •  Faculty supervisors: Jason H. Anderson and Stephen Brown
    •  FPL Community Award 2014
–  High-Level Synthesis from C to Verilog
–  Targets Altera FPGAs (extension to Xilinx relatively simple)
–  Two flows
    •  Pure Hardware
    •  Hardware/Software Hybrid = Tiger MIPS + hardware accelerator(s) + Avalon bus + shared on-chip and off-chip memory

Cryptol – New Language for Cryptology

–  Domain specific language for cryptology: Cryptol
    •  High-level programming language similar to Haskell
    •  Developed by Galois Inc. based in Portland, USA
–  High-Level Synthesis from Cryptol to efficient Software and Hardware

[Figure: two flows side by side. Labels: Cryptol, Reference C, Modified C, Optimized C, HLS, SW HLS, HW HLS, HDL, SW benchmarking, HW benchmarking.]

Source: The Zynq Book

Levels of Abstraction in FPGA Design

Source: The Zynq Book

High-Level Synthesis vs. Logic Synthesis

Source: The Zynq Book

Algorithm and Interface Synthesis

Source: The Zynq Book

Vivado HLS Design Flow

Source: The Zynq Book

Design Trade-offs Explored Using HLS

Source: The Zynq Book

C Functional Verification and C/RTL Cosimulation in Vivado HLS

Vivado HLS

Source: The Zynq Book

Vivado HLS Scheduling and Binding

Source: The Zynq Book

Vivado HLS Scheduling and Binding

Scheduling – translation of the RTL statements interpreted from the C code into a set of operations, each with an associated duration in terms of clock cycles. Affected by the clock frequency, uncertainty, target technology, and user directives.

Binding – association of the scheduled operations with the physical resources of the target device.
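
As an illustration of what "assigning operations to clock cycles" can mean, here is a minimal sketch of as-soon-as-possible (ASAP) scheduling, a textbook technique. It is not claimed to be the algorithm Vivado HLS uses; the `op_t` record, the function name, and the latencies are hypothetical.

```c
#include <stddef.h>

/* Hypothetical operation record: latency in clock cycles and up to two
   predecessor indices (-1 means "no predecessor"). */
typedef struct {
    int latency;
    int pred[2];
} op_t;

/* ASAP scheduling: each operation starts as soon as all of its
   predecessors have finished. Operations must be listed in topological
   order. Writes start cycles into start[] and returns the total latency. */
int asap_schedule(const op_t *ops, size_t n, int *start)
{
    int total = 0;
    for (size_t i = 0; i < n; i++) {
        int s = 0;
        for (int p = 0; p < 2; p++) {
            int j = ops[i].pred[p];
            if (j >= 0) {
                int done = start[j] + ops[j].latency;
                if (done > s)
                    s = done;
            }
        }
        start[i] = s;
        if (s + ops[i].latency > total)
            total = s + ops[i].latency;
    }
    return total;
}
```

For y = (a*b) + (c*d) with assumed 2-cycle multipliers and a 1-cycle adder, both multiplications start in cycle 0 and the addition in cycle 2. Binding would then decide, for example, whether the two multiplications use two multipliers or share one.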

Source: The Zynq Book

Three Possible Outcomes from HLS Average of 10 numbers
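
The figure for this slide is not recoverable from the transcript. As a point of reference, a C function averaging 10 numbers might look like the sketch below; the function name and the use of `int` are assumptions. An HLS tool may map such a loop to hardware in different ways, trading area against latency.

```c
#define N 10

/* Average of N integers (integer division truncates toward zero). */
int average10(const int x[N])
{
    int sum = 0;
    for (int i = 0; i < N; i++)
        sum += x[i];
    return sum / N;
}
```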

Source: The Zynq Book

Vivado HLS Synthesis Process

Native Integer Data Types of C

Source: The Zynq Book

Arbitrary Precision Integer Data Types of C and C++ Accepted by Vivado HLS

Source: The Zynq Book
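
The table on this slide is not recoverable. To give a feel for what a non-native bit width means, the sketch below emulates a 12-bit unsigned value with a standard C bit-field. It is plain C, not the Vivado HLS arbitrary-precision types; `u12_t` and `add_u12` are hypothetical names.

```c
/* A 12-bit unsigned quantity emulated with a C bit-field (holds 0..4095). */
typedef struct {
    unsigned int v : 12;
} u12_t;

/* Adds two values and keeps only the low 12 bits of the result. */
unsigned add_u12(unsigned a, unsigned b)
{
    u12_t r;
    r.v = a + b;   /* reduced modulo 2^12 on assignment to the bit-field */
    return r.v;
}
```

In hardware, a narrower type of this kind translates directly into narrower adders and registers, which is the motivation for arbitrary-precision types.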

Arbitrary Precision Integer Types of C and C++

Source: The Zynq Book

Native Floating-Point Data Types of C

Source: The Zynq Book

Fixed-point Word Format

Source: The Zynq Book

Arbitrary Precision Fixed-Point Data Types used in Vivado HLS

Source: The Zynq Book

W – total width, I – number of integer bits, Q – quantization mode, O – overflow mode, N – number of saturation bits in overflow wrap modes
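
The effect of the W, I, Q, and O parameters can be sketched in plain C. The code below is an emulation under stated assumptions, not the Vivado HLS fixed-point types: it fixes W = 8 and I = 4 (so 4 fractional bits, with the sign bit counted among the integer bits), and offers two quantization behaviours (truncation toward minus infinity, round to nearest) and two overflow behaviours (saturation, wrap-around). All names are hypothetical.

```c
#define FX_W 8                 /* total width in bits (W) */
#define FX_I 4                 /* integer bits, including sign here (I) */
#define FX_FRAC (FX_W - FX_I)  /* fractional bits */

/* Floor without libm: the cast truncates toward zero, so fix up negatives. */
static long floor_to_long(double v)
{
    long t = (long)v;
    return (v < (double)t) ? t - 1 : t;
}

/* Converts x to a signed FX_W-bit raw word with FX_FRAC fractional bits.
   round_nearest: 0 = truncate toward minus infinity, 1 = round to nearest.
   saturate:      0 = wrap around, 1 = clamp to the representable range. */
int to_fixed(double x, int round_nearest, int saturate)
{
    double scaled = x * (double)(1 << FX_FRAC);
    long q = round_nearest ? floor_to_long(scaled + 0.5)
                           : floor_to_long(scaled);
    const long max = (1L << (FX_W - 1)) - 1;   /* 127 */
    const long min = -(1L << (FX_W - 1));      /* -128 */
    if (saturate) {
        if (q > max) q = max;
        if (q < min) q = min;
    } else {
        const long span = 1L << FX_W;          /* 256 */
        q = ((q - min) % span + span) % span + min;
    }
    return (int)q;
}

/* Converts a raw word back to its real value. */
double from_fixed(int raw)
{
    return (double)raw / (double)(1 << FX_FRAC);
}
```

With these settings 1.30 is stored as raw 20 (1.25) when truncated and raw 21 (1.3125) when rounded, while 9.0, which is out of range, becomes 7.9375 with saturation and -7.0 with wrap-around.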

Quantization modes for the C++ ap_fixed and ap_ufixed types

Source: The Zynq Book

Truncation to zero

Source: UG902 Vivado Design Suite User Guide, High-Level Synthesis

Overflow modes for the C++ ap_fixed and ap_ufixed types

Source: The Zynq Book

Wraparound

Source: UG902 Vivado Design Suite User Guide, High-Level Synthesis

C++ code with the declaration of fixed point variables

Source: The Zynq Book

System C Data Types

Source: The Zynq Book

An Example Top-Level Function for HLS

Source: The Zynq Book

Simplified Interface Diagram for the Example Top-Level Function

Source: The Zynq Book

Synthesis of Port Directions

Source: The Zynq Book
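
The slide's figure is not recoverable. The sketch below shows a hypothetical top-level function (not the one from the slides) whose arguments are used in three different ways. The port directions in the comments reflect the usual expectation that direction follows from how each argument is read and written; they are an assumption, not tool output.

```c
/* Hypothetical top-level function.
   a    : scalar passed by value, only read  -> expected input port
   in   : array that is only read            -> expected input
   out  : pointer that is only written       -> expected output
   acc  : pointer that is read and written   -> expected inout
   return value                              -> expected output */
int scale_and_sum(int a, const int in[4], int *out, int *acc)
{
    int sum = 0;
    for (int i = 0; i < 4; i++)
        sum += a * in[i];
    *out = sum;      /* write only */
    *acc += sum;     /* read, then write */
    return sum > 0;
}
```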

Default Port Level Types and Protocols

Source: The Zynq Book

Data flow between Vivado HLS blocks

Source: The Zynq Book

RTL Interface Diagram Showing Default Block Level Ports and Protocols

Source: The Zynq Book

Can High-Level Synthesis Compete Against a Hand-Written Code in the Cryptographic Domain? A Case Study

Ekawat Homsirikamol & Kris Gaj
George Mason University, USA

Project supported by NSF Grant #1314540

Primary Author

Ekawat Homsirikamol, a.k.a. “Ice”

Working on the PhD thesis entitled “A New Approach to the Development of Cryptographic Standards Based on the Use of High-Level Synthesis Tools”

Traditional Development and Benchmarking Flow

[Figure: Informal Specification → Manual Design → HDL Code → Manual Optimization (FPGA Tools) → Netlist → Post Place & Route Results. Test Vectors feed Functional Verification and Timing Verification.]

Extended Traditional Development and Benchmarking Flow

[Figure: Informal Specification → Manual Design → HDL Code → Option Optimization (FPGA Tools, with GMU ATHENa) → Netlist → Post Place & Route Results. Test Vectors feed Functional Verification and Timing Verification.]

ATHENa – Automated Tool for Hardware EvaluatioN

Open-source benchmarking tool, written in Perl, aimed at an AUTOMATED generation of OPTIMIZED results for MULTIPLE hardware platforms

Currently under development at George Mason University

http://cryptography.gmu.edu/athena

Generation of Results Facilitated by ATHENa

•  batch mode of FPGA tools
•  ease of extraction and tabulation of results
    •  Text Reports, Excel, CSV (Comma-Separated Values)
•  optimized choice of tool options
    •  GMU_optimization_1 strategy

HLS-Based Development and Benchmarking Flow

[Figure: Reference Implementation in C → Manual Modifications (pragmas, tweaks) → HLS-ready C code → High-Level Synthesis → HDL Code → Option Optimization (FPGA Tools, with GMU ATHENa) → Netlist → Post Place & Route Results. Test Vectors feed Functional Verification and Timing Verification.]

Case Study

•  Algorithm: AES-128
•  Mode of operation: Counter (CTR)
•  Protocol and interface: GMU proposal
•  Two vendors: Xilinx & Altera
•  Four different FPGA families
    ✓  Xilinx Spartan-6 (X-S6)
    ✓  Xilinx Virtex-7 (X-V7)
    ✓  Altera Cyclone IV (A-CIV)
    ✓  Altera Stratix V (A-SV)

Tools & Tool Versions

•  Vivado HLS 2014.1
•  Xilinx ISE v14.7
•  Altera Quartus II v13.0sp1
•  ATHENa v0.6.4 (with GMU_optimization_1)

Interface & Protocol

Top-Level

Reference Hardware Design in RTL VHDL

RTL Result

Latency = 11 cycles
Time between two consecutive outputs = 10 cycles
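
The time between consecutive outputs, rather than the latency, determines steady-state throughput. A minimal sketch of that calculation follows; the function name is hypothetical, and any clock frequency passed to it is an assumed value for illustration, not a result from the case study.

```c
/* Throughput in Mbit/s of a core that produces one block of `block_bits`
   bits every `cycles_per_block` clock cycles at a clock of `f_mhz` MHz. */
double throughput_mbps(unsigned block_bits, unsigned cycles_per_block,
                       double f_mhz)
{
    return (double)block_bits * f_mhz / (double)cycles_per_block;
}
```

For AES-128 (128-bit blocks) with one output every 10 cycles, an assumed 100 MHz clock would give 128 × 100 / 10 = 1280 Mbit/s.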

Software Design

Reference Code
•  Source: P. Barreto and V. Rijmen, “Reference code in ANSI C v2.2,” Mar. 2002.

HLSv0
•  Removed support for decryption
•  Removed support for different AES variants

HLSv0: Xilinx Results

Latency = 7367 cycles

HLSv1: Code Refactoring

Refactor the code to match the target AES architecture

•  KeyScheduling is performed once per round
•  Improved Galois field multiplication operation
•  Included last round as part of the core loop
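
The slides do not show how the Galois field multiplication was improved. One common hardware-friendly formulation is a shift-and-add multiplication in GF(2^8) with a fixed trip count and no table lookups, sketched below with the AES polynomial x^8 + x^4 + x^3 + x + 1. The function names are hypothetical, and this is not claimed to be the authors' version.

```c
#include <stdint.h>

/* Multiply by x (i.e. {02}) in GF(2^8), reducing by the AES polynomial. */
static uint8_t xtime(uint8_t a)
{
    return (uint8_t)((a << 1) ^ ((a & 0x80) ? 0x1b : 0x00));
}

/* Shift-and-add multiplication in GF(2^8). The loop always runs 8 times,
   so its bounds are known at compile time. */
uint8_t gf_mul(uint8_t a, uint8_t b)
{
    uint8_t p = 0;
    for (int i = 0; i < 8; i++) {
        if (b & 1)
            p ^= a;       /* addition in GF(2^8) is XOR */
        a = xtime(a);
        b >>= 1;
    }
    return p;
}
```

As a check against the worked example in FIPS-197, gf_mul(0x57, 0x83) evaluates to 0xc1.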

83

HLSv1: Xilinx Results

Latency = 3224 cycles

84

HLSv2: Optimization directives: ARRAY_RESHAPE

Ø  Change an array shape in the output hardware

void AES_encrypt (word8 a[4][4], word8 k[4][4], word8 b[4][4]) {
#pragma HLS ARRAY_RESHAPE variable=a[0] complete dim=1 reshape
#pragma HLS ARRAY_RESHAPE variable=a[1] complete dim=1 reshape
#pragma HLS ARRAY_RESHAPE variable=a[2] complete dim=1 reshape
#pragma HLS ARRAY_RESHAPE variable=a[3] complete dim=1 reshape
#pragma HLS ARRAY_RESHAPE variable=a complete dim=1 reshape

85

HLSv2: Optimization directives: UNROLL & INLINE

Ø  Unroll a loop

OutputLoop:
for (i = 0; i < 4; i ++) {
#pragma HLS UNROLL
    for (j = 0; j < 4; j ++) {
#pragma HLS UNROLL
        b[i][j] = s[i][j];
    }
}

Ø  Flatten a function's hierarchy for improved performance

void KeyUpdate (word8 k[4][4], word8 round) {
#pragma HLS INLINE
    ...
}

86

HLSv2: Optimization directives: RESOURCE & INTERFACE

Ø  Specify the type of FPGA resource to be used by the target variable

word32 rcon[10] = { 0x01, 0x02, 0x04, 0x08, 0x10, 0x20, 0x40, 0x80, 0x1b, 0x36 };

#pragma HLS RESOURCE variable=rcon core=ROM_1P_1S
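As a cross-check on the ten round constants listed above, the sketch below regenerates them by repeated doubling in GF(2^8), starting from 0x01. The function name `make_rcon` is a hypothetical helper introduced here for illustration; the slides store the constants in a ROM rather than computing them.

```c
#include <stdint.h>

/* Fill out[0..n-1] with AES round constants:
 * out[0] = 0x01 and out[i] = 2 * out[i-1] in GF(2^8),
 * reduced by the AES polynomial (XOR with 0x1b on overflow).
 * Hypothetical helper, not part of the lecture's source code. */
void make_rcon(uint8_t out[], int n)
{
    uint8_t r = 0x01;
    int i;
    for (i = 0; i < n; i++) {
        out[i] = r;
        r = (uint8_t)((r << 1) ^ ((r & 0x80) ? 0x1b : 0x00));
    }
}
```

Calling `make_rcon(t, 10)` yields 0x01, 0x02, 0x04, 0x08, 0x10, 0x20, 0x40, 0x80, 0x1b, 0x36, matching the table. A lookup table is still the natural choice in hardware, since ten small constants map directly onto a ROM.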

Ø  Direct how an input/output port should behave, i.e., registered or handshake mode

void AES_encrypt (word8 a[4][4], word8 k[4][4], word8 b[4][4]) {
#pragma HLS INTERFACE register port=b

87

HLSv2: Xilinx Results

Latency = 11 cycles

88

HLSv2: HLS vs. RTL, Frequency - Area

89

HLSv2: HLS vs. RTL, Throughput - Area

90

Source of Inefficiencies: Datapath vs. Control Unit

[Block diagram: Datapath (Data Inputs, Data Outputs) and Control Unit (Control Inputs, Control Outputs), linked by Control Signals and Status Signals]

Datapath determines
•  Area
•  Clock Frequency

Control Unit determines
•  Number of clock cycles

91

Source of Inefficiencies

Datapath inferred correctly
•  Frequency and area within 10% of manual designs

Control Unit suboptimal
•  Difficulty in inferring an overlap between completing the last round and reading the next input block
•  One additional clock cycle used for initialization of the state at the beginning of each round
•  The formulas for throughput:

RTL: Throughput = Block_size / (#Rounds * TCLK)
HLS: Throughput = Block_size / ((#Rounds+2) * TCLK)
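To make the two throughput formulas concrete, here is a small sketch that evaluates them for AES-128 (128-bit block, 10 rounds). The clock period of 5 ns is a purely hypothetical value chosen for the arithmetic, not a measured result from the slides, and the function names are illustrative.

```c
/* Throughput in Mbit/s for a design that needs cycles_per_block clock
 * cycles per block, with clock period tclk_ns given in nanoseconds.
 * bits/ns equals Gbit/s, hence the factor of 1000 for Mbit/s. */
double throughput_mbps(int block_size_bits, int cycles_per_block, double tclk_ns)
{
    return 1000.0 * block_size_bits / (cycles_per_block * tclk_ns);
}

/* RTL: Throughput = Block_size / (#Rounds * TCLK) */
double rtl_throughput_mbps(int block_size_bits, int rounds, double tclk_ns)
{
    return throughput_mbps(block_size_bits, rounds, tclk_ns);
}

/* HLS: Throughput = Block_size / ((#Rounds + 2) * TCLK) */
double hls_throughput_mbps(int block_size_bits, int rounds, double tclk_ns)
{
    return throughput_mbps(block_size_bits, rounds + 2, tclk_ns);
}
```

With the assumed 5 ns clock, `rtl_throughput_mbps(128, 10, 5.0)` gives 2560 Mbit/s and `hls_throughput_mbps(128, 10, 5.0)` gives about 2133 Mbit/s. At equal clock period the RTL/HLS throughput ratio is (#Rounds+2)/#Rounds = 1.2 for 10 rounds, independent of the clock value chosen.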

92

AES-ECB-ENC x2: HLS vs. RTL, Frequency - Area

93

AES-ECB-ENC x2: HLS vs. RTL, Throughput - Area

94

AES-CTR

95

AES-CTR Results

96

Full AES-CTR with I/O processors

97

AES-CTR with IO Results

98

Results for AES

gen_mod_add: if (G_OPERATOR = ADDER) generate end generate;

99

Conclusions

•  Area and frequency of designs produced by High-Level Synthesis are comparable to handwritten RTL code
•  Small increase in the number of clock cycles reduces throughput of HLS-based approach
•  Complex I/O units can be created by HLS-based approach
•  HLS-based design can compete against handwritten RTL code when we have a specific architecture and latency in mind while preparing an HLS-ready HLL code

Hardware Benchmarking of SHA-3 Finalists using High-Level Synthesis

Ekawat Homsirikamol & Kris Gaj
George Mason University

101

Our Test Case

•  5 final SHA-3 candidates + old standard SHA-2
•  Most efficient sequential architectures (/2h for BLAKE, x4 for Skein, x1 for others)
•  GMU VHDL codes developed during SHA-3 contest
•  Reference software implementations in C included in the submission packages

Hypotheses:
•  Ranking of candidates will remain the same
•  Performance ratios HDL/HLS similar across candidates

102

Manual RTL vs. HLS-based Results: Altera Stratix III

RTL HLS

103

Manual RTL vs. HLS-based Results: Altera Stratix IV

RTL HLS

104

Lack of Correlation for Xilinx Virtex 6

RTL HLS

105

Lack of Correlation for Xilinx Virtex 6

RTL HLS

106

Lack of Correlation for Xilinx Virtex 7

RTL HLS

107

Ratios of Major Results RTL/HLS for Altera Stratix IV

108

Ratios of Major Results RTL/HLS for Xilinx Virtex 6

109

Datapath vs. Control Unit

[Block diagram: Datapath (Data Inputs, Data Outputs) and Control Unit (Control Inputs, Control Outputs), linked by Control Signals and Status Signals]

Datapath determines
•  Area
•  Clock Frequency

Control Unit determines
•  Number of clock cycles

110

Encountered Problems

Datapath inferred correctly
•  Frequency and area within 30% of manual designs

Control Unit suboptimal
•  Difficulty in inferring an overlap between completing the last round and reading the next input block
•  One additional clock cycle used for initialization of the state at the beginning of each round
•  The formulas for throughput:

RTL: Throughput = Block_size / (#Rounds * TCLK)
HLS: Throughput = Block_size / ((#Rounds+2) * TCLK)

111

Hypothesis Check

Hypothesis I:
•  Ranking of candidates in terms of throughput, area, and throughput/area ratio will remain the same
   TRUE for Altera Stratix III, Stratix IV
   FALSE for Xilinx Virtex 5, Virtex 6, and Virtex 7

Hypothesis II:
•  Performance ratios HDL/HLS similar across candidates

                   Stratix III    Stratix IV
Frequency          0.99-1.30      0.98-1.19
Area               0.71-1.01      0.68-1.02
Throughput         1.10-1.33      1.09-1.27
Throughput/Area    1.14-1.55      1.17-1.59

112

Correlation Between Altera FPGA Results and ASICs

Stratix III FPGA ASIC

113

Most Promising Methodology & Toolset

Design flow:
•  Reference Implementation in C
•  Manual Modifications
•  HLS-ready C code
•  High-Level Synthesis: Xilinx Vivado HLS
•  HDL Code
•  FPGA Tools: Altera Quartus II; Option Optimization: GMU ATHENa
•  Results

Frequency & Throughput decrease, and Area increases, by no more than 30% compared to manual RTL

