
HW/SW Co-design

Date post: 27-Jan-2016
Description:
HW/SW Co-design. Lecture 5: Lab 3 – Active HW Accelerator Design. Course material designed by Professor Yarsun Hsu, EE Dept, NTHU. RA: Yi-Chiun Fang, EE Dept, NTHU.
Transcript
Page 1: HW/SW Co-design

HW/SW Co-design

Lecture 5: Lab 3 – Active HW Accelerator Design

Course material designed by Professor Yarsun Hsu, EE Dept, NTHU

RA: Yi-Chiun Fang, EE Dept, NTHU

Page 2: HW/SW Co-design

Outline

Active Hardware Design

Co-designed System on FPGA

Page 3: HW/SW Co-design

ACTIVE HARDWARE DESIGN

Page 4: HW/SW Co-design

Active Hardware

Most devices in the real world can actively generate interrupts

When the CPU detects that an interrupt is asserted, it saves a small amount of state and jumps to the kernel interrupt handler at a fixed address in memory

The handler performs the corresponding processing (the interrupt service routine, ISR), then executes a “return from interrupt” instruction to return the CPU to the execution state prior to the interrupt

Page 5: HW/SW Co-design

GRLIB IRQMP (1/2)

Multiprocessor Interrupt Controller

Attached to the AMBA bus as an APB slave

The interrupts generated on the interrupt bus are all forwarded to the interrupt controller

The interrupt controller prioritizes and masks the interrupts, then propagates the one with the highest priority to the processor

Page 6: HW/SW Co-design

GRLIB IRQMP (2/2)

IRQMP implements a two-level interrupt controller for 15 interrupts

When any of the IRQ lines is asserted high, the corresponding bit in the interrupt pending register is set

The pending bits stay set even if the IRQ line is de-asserted, until cleared by software or by an interrupt acknowledge from the processor
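
The pending and mask behaviour described on these two slides can be illustrated with a small software model. This is only a sketch, not the GRLIB implementation: the type irq_model_t and the functions irq_assert_line, irq_clear and irq_select are hypothetical names, the two priority levels are not modelled, and the code assumes that a set mask bit enables an interrupt and that a higher IRQ number wins.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of a 15-line interrupt controller (hypothetical, not the GRLIB RTL). */
typedef struct {
    uint32_t pending; /* bit n set: IRQ n is pending (n = 1..15) */
    uint32_t mask;    /* bit n set: IRQ n may reach the processor (assumption) */
} irq_model_t;

/* An asserted IRQ line latches its pending bit; the bit stays set afterwards. */
static void irq_assert_line(irq_model_t *c, unsigned irq)
{
    assert(irq >= 1 && irq <= 15);
    c->pending |= (UINT32_C(1) << irq);
}

/* A software clear or a processor acknowledge removes the pending bit. */
static void irq_clear(irq_model_t *c, unsigned irq)
{
    assert(irq >= 1 && irq <= 15);
    c->pending &= ~(UINT32_C(1) << irq);
}

/* Return the highest-numbered pending and enabled IRQ, or 0 if there is none.
   Assumption: a higher IRQ number means a higher priority. */
static unsigned irq_select(const irq_model_t *c)
{
    uint32_t active = c->pending & c->mask;
    unsigned irq;

    for (irq = 15; irq >= 1; --irq) {
        if (active & (UINT32_C(1) << irq))
            return irq;
    }
    return 0;
}
```

In this model a pending interrupt that is masked is remembered but not forwarded, and it is delivered as soon as its mask bit is set.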

Page 7: HW/SW Co-design

Active 1-D IDCT HW Acc. (1/3)

The data path is identical to its passive version

The registered IRQ number is 15

The HIRQ line is raised for exactly one clock cycle right after the second stage completes

[Figure: timing diagram showing the address phase, data phase, stage 1 and stage 2; the HIRQ signal is raised for one clock cycle.]

Page 8: HW/SW Co-design

Active 1-D IDCT HW Acc. (2/3)

Every time the system is interrupted by the IDCT accelerator, its ISR sets a global variable idct_flag to 1

    cyg_uint32
    idct_isr(cyg_vector_t vector, cyg_addrword_t data)
    {
        /* data carries the address of the flag; volatile because the
           main code reads the flag while the ISR writes it */
        volatile unsigned long *idct_flag = (volatile unsigned long *) data;

        (*idct_flag) = 1;

        cyg_interrupt_acknowledge(vector);
        return CYG_ISR_HANDLED;
    }
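
How the data argument reaches the ISR can be sketched with a host-side model of a vector table. This is not the eCos kernel: isr_fn, isr_slot, isr_register, isr_dispatch and flag_isr are hypothetical names chosen for illustration, and the return value 1 merely stands in for a "handled" status. The sketch assumes that a data word is stored together with the handler at registration time and handed back on every dispatch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_VECTORS 16 /* hypothetical table size covering vectors 0..15 */

/* Handler signature modelled on the slide's ISR: vector number plus a data word. */
typedef uint32_t (*isr_fn)(unsigned vector, uintptr_t data);

typedef struct {
    isr_fn    isr;  /* NULL when no handler is registered */
    uintptr_t data; /* word passed to the handler on every dispatch */
} isr_slot;

static isr_slot isr_table[NUM_VECTORS];

/* Remember the handler and its data word for one vector. */
static void isr_register(unsigned vector, isr_fn isr, uintptr_t data)
{
    assert(vector < NUM_VECTORS);
    isr_table[vector].isr = isr;
    isr_table[vector].data = data;
}

/* Model of the dispatch step: look up the vector and call its handler.
   Returns the handler's result, or 0 when nothing is registered. */
static uint32_t isr_dispatch(unsigned vector)
{
    assert(vector < NUM_VECTORS);
    if (isr_table[vector].isr == NULL)
        return 0;
    return isr_table[vector].isr(vector, isr_table[vector].data);
}

/* Same idea as the slide's ISR: the data word is the address of a flag. */
static uint32_t flag_isr(unsigned vector, uintptr_t data)
{
    volatile unsigned long *flag = (volatile unsigned long *)data;

    (void)vector;
    *flag = 1;
    return 1; /* stand-in for a "handled" status */
}
```

Registering flag_isr for vector 15 with the address of a flag variable and then dispatching vector 15 sets that flag, which mirrors what the accelerator's interrupt does to idct_flag.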

Page 9: HW/SW Co-design

Active 1-D IDCT HW Acc. (3/3)

Instead of polling the device registers, we now wait for idct_flag to become 1

We reset the flag back to 0 afterwards

    static void
    hw_idct_1d(short *dst, short *src, unsigned int mode)
    {
        ...

        *c_reg = (long)((mode << 1) | 0x1);   /* mode in the upper bits, bit 0 set */

        /* idct_flag is written by the ISR, so it must be declared volatile */
        while (idct_flag == 0) {
            /* busy waiting loop */
        }
        idct_flag = 0;
        ...
    }
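
The wait-then-reset pattern can be tried on an ordinary host with standard C signals as a stand-in for the hardware interrupt. This is an analogy under stated assumptions, not the lab code: done_flag, on_done and start_and_wait are hypothetical names, SIGINT plays the role of the accelerator's IRQ, and raise() plays the role of the hardware completing.

```c
#include <assert.h>
#include <signal.h>

/* volatile: the flag changes outside the normal control flow of the waiter. */
static volatile sig_atomic_t done_flag = 0;

/* Stand-in for the ISR: it only records that the event happened. */
static void on_done(int signo)
{
    (void)signo;
    done_flag = 1;
}

/* Start the "device", wait for the flag, then re-arm it.
   Returns 0 on success and -1 if the handler could not be installed
   or the signal could not be raised. */
static int start_and_wait(void)
{
    void (*previous)(int);

    assert(done_flag == 0); /* the flag must have been reset by the last run */

    previous = signal(SIGINT, on_done);
    if (previous == SIG_ERR)
        return -1;

    /* Stand-in for the accelerator raising its interrupt. */
    if (raise(SIGINT) != 0) {
        signal(SIGINT, previous);
        return -1;
    }

    while (done_flag == 0) {
        /* busy waiting loop */
    }
    done_flag = 0; /* reset the flag for the next run */

    signal(SIGINT, previous); /* restore the earlier handler */
    return 0;
}
```

Without the volatile qualifier a compiler may read the flag once and turn the loop into an endless one, which is why the flag shared with an ISR needs it as well.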

Page 10: HW/SW Co-design

CO-DESIGNED SYSTEM ON FPGA

Page 11: HW/SW Co-design

Build SW Application

In addition to the flags mentioned in the previous labs, we use the -D_HW_ACTIVE_ flag to enable the IDCT ISR

This flag only takes effect if the -D_HW_ACC_ flag is also set

Use make to build the new version

Page 12: HW/SW Co-design

Install IDCT Accelerator

We replace grlib-gpl-1.0.19-b3188/lib/esw/idct_acc/idct_1x8.vhd with lab_pkg/lab3/hw/idct_1x8.vhd

Use make ise | tee ise_log to build the bitstream

Page 13: HW/SW Co-design

Profiling Results (1/2)

Build the program with the -D_PROFILING_ flag on

Compare the computation results of sw_idct_2d() and hw_idct_2d()

Compare the computation results with and without the -D_HW_ACTIVE_ flag

Page 14: HW/SW Co-design

Profiling Results (2/2)

The active version is still faster than the pure SW implementation, but much slower than its passive version

  Interrupt latency

  The calculation is too fast

    It only lasts for two clock cycles

    The action bit is already reset to 0 when the CPU polls the device registers for the first time

Interrupts are useful when the CPU can do other meaningful work before the hardware completes
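
The trade-off on this slide can be put into a rough cost model. This is a sketch under simplifying assumptions, not a measurement: the function names and all cycle counts passed to them are hypothetical, polling is assumed to cost a fixed number of cycles per register read, and the interrupt-driven CPU is assumed to have useful work available for the whole computation.

```c
#include <assert.h>

/* Cycles the CPU spends polling until the hardware is done.
   At least one poll is always needed to see the result. */
static unsigned long polling_lost_cycles(unsigned long hw_cycles,
                                         unsigned long poll_cycles)
{
    unsigned long polls;

    assert(poll_cycles > 0);
    polls = (hw_cycles + poll_cycles - 1) / poll_cycles; /* round up */
    if (polls == 0)
        polls = 1;
    return polls * poll_cycles;
}

/* Cycles lost with interrupts: only the entry and exit overhead, because
   the CPU is assumed to do other work during the computation itself. */
static unsigned long interrupt_lost_cycles(unsigned long irq_overhead_cycles)
{
    return irq_overhead_cycles;
}

/* Returns 1 when the interrupt-driven version loses fewer cycles, else 0. */
static int interrupt_pays_off(unsigned long hw_cycles,
                              unsigned long poll_cycles,
                              unsigned long irq_overhead_cycles)
{
    return interrupt_lost_cycles(irq_overhead_cycles)
         < polling_lost_cycles(hw_cycles, poll_cycles);
}
```

With a two-cycle computation the first poll already finds the result, so any realistic interrupt overhead makes the active version lose; the balance tips only when the computation runs much longer than the interrupt overhead.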

