DSP Implementation on FPGA
Ahmed Elhossini
ENGG*6090: Reconfigurable Computing Systems
Winter 2006
ENGG*6090 – Winter 2006 DSP Implementation on FPGA 2
References
Reconfigurable Computing for Digital Signal Processing: A Survey, Russell Tessier and Wayne Burleson, Journal of VLSI Signal Processing 28, 7–27, 2001.
FPGA implementations of fast Fourier transforms for real-time signal and image processing, I.S. Uzun, A. Amira and A. Bouridane, IEE Proc.-Vis. Image Signal Process., Vol. 152, No. 3, June 2005.
Image Processing Algorithms on Reconfigurable Architecture using HandelC, V. Muthukumar and Daggu Venkateshwar Rao, Proceedings of the EUROMICRO Symposium on Digital System Design (DSD’04).
Experiences on developing computer vision hardware algorithms using Xilinx system generator, Ana Toledo Moreo, Pedro Navarro Lorente, F. Soto Valles, Juan Suardíaz Muro, Carlos Fernández Andrés, Microprocessors and Microsystems 29 (2005) 411–419.
Introduction
The application domain of digital signal processing has expanded over the past decade because of advances in VLSI technology.
ASICs and programmable DSP processors were the implementation mechanisms of choice for many DSP applications.
In the last few decades, new system implementations based on reconfigurable computing have come under consideration.
They offer the functional efficiency of hardware and the programmability of software.
These flexible platforms are maturing quickly, both in the logic capacity of programmable devices and in the availability of embedded modules (multipliers and hard cores).
Architectural Requirements for DSP
Data path configured for DSP
  Fixed-point arithmetic
  MAC (multiply-accumulate)
Multiple memory banks and buses
Specialized addressing modes
  Bit-reversed addressing
  Circular buffers
Specialized execution control
Specialized peripherals for DSP
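Three of the requirements above (fixed-point MAC, circular buffers, bit-reversed addressing) can be illustrated with a small software model. This is only a sketch: the names `bit_reverse` and `CircularFir` and the Q15 number format are assumptions made for the example, not something taken from the slides.

```python
def bit_reverse(index, bits):
    """Return `index` with its lowest `bits` bits in reversed order."""
    reversed_index = 0
    for _ in range(bits):
        reversed_index = (reversed_index << 1) | (index & 1)
        index >>= 1
    return reversed_index


class CircularFir:
    """FIR filter on Q15 integers using a circular delay line and a MAC loop."""

    def __init__(self, coeffs_q15):
        self.coeffs = list(coeffs_q15)
        self.delay = [0] * len(self.coeffs)
        self.head = 0  # position of the newest sample

    def step(self, sample_q15):
        n = len(self.delay)
        self.head = (self.head - 1) % n       # circular-buffer pointer update
        self.delay[self.head] = sample_q15    # overwrite the oldest sample
        acc = 0                               # wide accumulator (Q30 products)
        for k, coeff in enumerate(self.coeffs):
            acc += coeff * self.delay[(self.head + k) % n]  # multiply-accumulate
        return acc >> 15                      # scale the Q30 sum back to Q15
```

For an 8-point transform, `bit_reverse(i, 3)` gives the order in which a radix-2 FFT reads or writes its samples. The FIR model never moves data in the delay line; only the pointer wraps around, which is what a circular addressing mode provides in hardware.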
Choice Measures
Performance
Cost
Power
Flexibility
DSP Implementation
Pure HW implementation
  ASIC
  Reconfigurable devices
Hybrid HW/SW implementation (HW/SW codesign)
  Soft or hard core processor on reconfigurable device / hardware accelerator
  DSP processor / hardware accelerator on reconfigurable device
Pure SW implementation
  General purpose processor
  DSP processor
  Hard/soft core on reconfigurable device (MicroBlaze, PowerPC, Nios)
Topics Covered
FFT implementation on FPGA.
Image processing algorithms on reconfigurable architecture using Handel-C.
Experiences on developing computer vision hardware algorithms using Xilinx System Generator.
Handel-C
Handel-C is essentially an extended subset of the standard ANSI-C language, specifically designed for use in a hardware environment.
Unlike other C-to-FPGA tools, Handel-C allows hardware to be targeted directly from software, allowing a more efficient implementation to be created.
Xilinx System Generator
System Generator is a toolbox added to MATLAB Simulink.
It allows a graphical representation of the algorithm.
It includes many blocks that are commonly used by DSP algorithms.
It allows direct conversion to HDLs.
FPGA implementations of fast Fourier transforms for real-time signal and image processing
I.S. Uzun, A. Amira and A. Bouridane
IEE Proc.-Vis. Image Signal Process., Vol. 152, No. 3, June 2005
Target
The design and implementation of a parametrisable architecture that provides a framework for implementing different types of 1-D FFT algorithms.
The development of an FPGA-based FFT library implementing radix-2, radix-4, split-radix and FHT algorithms, giving system designers and engineers the flexibility to meet different system requirements (such as chip area and memory) with given hardware resources.
The evaluation and comparison of hardware implementations of the aforementioned FFT algorithms. The performance measures considered are computation speed, maximum system frequency, chip area and memory usage.
The design and implementation of a generic parallel 2-D FFT architecture for real-time image processing, used to enhance large medical and astronomical images with frequency-domain filtering techniques.
The development of an FPGA-based parametrisable system for frequency-domain filtering of large images.
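Frequency-domain filtering of an image can be sketched in software as a 2-D transform, an element-wise multiplication by a mask, and an inverse transform. The sketch below uses row-column decomposition (1-D transforms on rows, then on columns). It is not the paper's architecture: a naive O(N^2) DFT stands in for the FFT processor, and the names `dft`, `transform_2d` and `frequency_filter` are hypothetical.

```python
import cmath


def dft(values, inverse=False):
    """Naive O(N^2) 1-D DFT, standing in for a hardware FFT processor."""
    n = len(values)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        total = sum(values[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                    for t in range(n))
        out.append(total / n if inverse else total)
    return out


def transform_2d(image, inverse=False):
    """Row-column decomposition: 1-D transforms on rows, then on columns."""
    rows = [dft(row, inverse) for row in image]
    cols = [dft(list(col), inverse) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]  # transpose back to row order


def frequency_filter(image, mask):
    """Multiply the 2-D spectrum by `mask` element-wise and transform back."""
    spectrum = transform_2d(image)
    filtered = [[s * m for s, m in zip(s_row, m_row)]
                for s_row, m_row in zip(spectrum, mask)]
    return [[value.real for value in row]
            for row in transform_2d(filtered, inverse=True)]
```

Because the row transforms are independent of each other (and likewise the column transforms), this decomposition is what makes it natural to spread the work across several 1-D processing elements.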
FFT Implementation on FPGA
Implementing four different transforms:
  Radix-2 FFT
  Radix-4 FFT
  Split-radix FFT
  Fast Hartley transform (FHT)
Introduce a parallel version of the 2-D FFT based on radix-2 and radix-4.
Make use of more FFT processing elements to perform the computation.
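As a point of reference for the simplest of the transforms listed, a radix-2 decimation-in-time FFT can be written as a short recursive software model. This is a sketch for illustration only (the function name `fft_radix2` is hypothetical); it does not reproduce the hardware design, and the radix-4, split-radix and FHT variants are not shown.

```python
import cmath


def fft_radix2(x):
    """Recursive radix-2 decimation-in-time FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return list(x)
    even = fft_radix2(x[0::2])   # transform of the even-indexed samples
    odd = fft_radix2(x[1::2])    # transform of the odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle            # butterfly: upper output
        out[k + n // 2] = even[k] - twiddle   # butterfly: lower output
    return out
```

Each loop iteration is one radix-2 butterfly: one complex multiplication by a twiddle factor followed by an addition and a subtraction.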
Proposed system for FFT implementation
Butterfly Used With Different Architectures
Functional block diagram of 1-D FFT processor architecture
Block diagram of radix-2 butterfly used in FPGA FFT processor
Architectural block diagram of AGU
Computation time (µs) of different algorithms for 1024-point FFT
Functional block diagram of parallel 2-D FFT processor architecture
Computation time and device utilization
2-D FFT performance comparison with existing FPGA-based designs
Conclusion
This work introduces an implementation platform for FFT algorithms.
Handel-C is used as the description language.
A comparison with existing implementations shows a lower execution time with reasonable resource utilization.
Image Processing Algorithms on Reconfigurable Architecture using HandelC
V. Muthukumar and Daggu Venkateshwar Rao
Proceedings of the EUROMICRO Symposium on Digital System Design (DSD’04)
Target
In this work, a Canny edge detection architecture for 2-D images was developed on a reconfigurable architecture, with the hardware modeled in Handel-C.
The algorithm involves several image processing operations:
  Gaussian smoothing, a 5x5 convolution applied to the image first.
  A morphological operation, a 3x3 operator applied to the image.
  2-D convolution.
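The 2-D convolution underlying the Gaussian smoothing step can be sketched in software as follows. The function name `convolve2d` and the integer 5x5 kernel are assumptions chosen for the example (a commonly used Gaussian approximation whose weights sum to 273); the slides do not give the coefficients used in the paper.

```python
def convolve2d(image, kernel):
    """'Valid' 2-D convolution of a list-of-lists image with a square kernel."""
    k = len(kernel)
    flipped = [row[::-1] for row in kernel[::-1]]  # convolution flips the kernel
    height = len(image) - k + 1
    width = len(image[0]) - k + 1
    out = []
    for r in range(height):
        out_row = []
        for c in range(width):
            acc = 0
            for i in range(k):
                for j in range(k):
                    acc += flipped[i][j] * image[r + i][c + j]
            out_row.append(acc)
        out.append(out_row)
    return out


# Assumed integer approximation of a 5x5 Gaussian; the weights sum to 273.
GAUSSIAN_5X5 = [
    [1, 4, 7, 4, 1],
    [4, 16, 26, 16, 4],
    [7, 26, 41, 26, 7],
    [4, 16, 26, 16, 4],
    [1, 4, 7, 4, 1],
]
```

Dividing each output by 273 normalizes the result. Keeping the weights as integers and deferring the division means the inner loop needs only integer multiply-accumulate operations.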
Implementation
The algorithm is modeled using Handel-C.
It is implemented using EDK2 and the RC1000-PP board with a Xilinx Virtex-E FPGA.
This chip has no embedded multipliers.
The hardware implementation is compared with a software implementation on a PC with a Pentium processor running at 1300 MHz.
Architecture of 3x3 moving window
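The figure for this slide is not reproduced here. A common way to build a 3x3 moving window over a raster-order pixel stream is with two line buffers and a 3x3 register array, and the sketch below models that approach in software. It is an assumption about the general technique, not a transcription of the paper's design, and the name `sliding_windows_3x3` is hypothetical.

```python
from collections import deque


def sliding_windows_3x3(pixels, width):
    """Yield (row, col, window) for each full 3x3 window of a raster-order stream."""
    line1 = deque([0] * width, maxlen=width)  # delays the stream by one image row
    line2 = deque([0] * width, maxlen=width)  # delays the stream by two image rows
    window = [[0] * 3 for _ in range(3)]      # 3x3 register array
    for index, pixel in enumerate(pixels):
        row, col = divmod(index, width)
        one_row_ago = line1[0]
        two_rows_ago = line2[0]
        line2.append(one_row_ago)  # the oldest entry drops out automatically
        line1.append(pixel)
        # Shift each window row left and load the new column on the right.
        for regs, new in zip(window, (two_rows_ago, one_row_ago, pixel)):
            regs.pop(0)
            regs.append(new)
        if row >= 2 and col >= 2:
            # The window is centred on the pixel at (row - 1, col - 1).
            yield row - 1, col - 1, [list(regs) for regs in window]
```

Each incoming pixel produces at most one window, so a 3x3 operator fed by this structure can process one pixel per step without re-reading the image.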
Edge Detection Architecture
Results
Conclusion
Handel-C is used to implement 2-D convolution, which in turn is used to implement edge detection.
The implementation is compared with a VC++ implementation on a 1300 MHz Pentium III processor and shows better performance.
Experiences on developing computer vision hardware algorithms using Xilinx system generator
Ana Toledo Moreo, Pedro Navarro Lorente, F. Soto Valles, Juan Suardíaz Muro, Carlos Fernández Andrés
Microprocessors and Microsystems 29 (2005) 411–419
Target
This paper shows how the Xilinx system generator (XSG) environment can be used to develop hardware-based computer vision algorithms from a system level approach, which makes it suitable for developing co-design environments.
Application Examples
Binarization algorithm
  Converts a grayscale image into a black-and-white binary image.
  Xilinx System Generator is used to implement this unit.
  Compared with a VHDL implementation.
Generalized convolution blocks
  Convolution is one of the basic image processing algorithms.
  Xilinx System Generator is used to implement different types of algorithms.
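Binarization amounts to comparing every pixel against a threshold. A minimal software sketch is shown below; the function name `binarize`, the default threshold of 128, the output levels and the `>=` comparison are all assumptions for illustration, since the slides do not state the values used in the paper.

```python
def binarize(image, threshold=128, high=255, low=0):
    """Map each grayscale pixel to `high` if it is >= threshold, else to `low`."""
    return [[high if pixel >= threshold else low for pixel in row]
            for row in image]
```

In hardware this is a single comparator per pixel, which is why it serves as a convenient first block for comparing a block-diagram design flow against hand-written VHDL.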
Modular-blockset-based hardware binarization block
VHDL-based hardware binarization block
Hardware convolution block
Hardware binarization block implementation results
Generalized hardware convolution implementation results
Results
Conclusion
This work demonstrates the use of Xilinx System Generator to implement image processing algorithms.
A comparison with the VHDL implementation shows competitive results.
Results
System: FPGA implementations of fast Fourier transforms for real-time signal and image processing
  FPGA: Xilinx XCV2000E, RC1000-PP board
  Tool: Handel-C with EDK2
  Alternative implementation: comparison with different implementations
  Main algorithm: 4 FFT algorithms
System: Image Processing Algorithms on Reconfigurable Architecture using HandelC
  FPGA: Xilinx XCV2000E, RC1000-PP board
  Tool: Handel-C with EDK2
  Alternative implementation: VC++ program on P3 1300 MHz processor
  Main algorithm: 2-D convolution and edge detection
System: Experiences on developing computer vision hardware algorithms using Xilinx system generator
  FPGA: Xilinx XCV800
  Tool: Xilinx System Generator
  Alternative implementation: VHDL
  Main algorithm: binarization and 2-D convolution
Conclusion
In this review, three papers on implementing DSP algorithms on FPGAs are presented.
Handel-C is an efficient tool for implementing DSP algorithms and gives results competitive with those of current HDLs.
Xilinx System Generator, a tool based on MathWorks MATLAB, is a good tool for implementing DSP systems.
Modern tools for implementing DSP algorithms could be used to replace the current HDLs.
Thank You
Questions?