Abstract
FPGA Implementation Of Non Linear Filters For
Image Processing
Mr. Hirschl [email protected]
Guide : Prof L. P. Yaroslavsky
Agenda
Background
Non Linear Filters
Hardware and Flow
Research: Research goals, Related work, Algorithms
Conclusion: Results, Demo, Bibliography
The big picture
Bio-medical imaging systems require massive image processing
Image processing solution
Real time
Implemented in hardware
Focus on non linear filters
FPGA
Non Linear Filters
Non Linear Filters topics
Unified approach - definitions
What is a window
Example of sliding window
Types of non linear filters
Neighborhood & Estimation
Non linear filters examples
Image enhancement
Histogram equalization
Other
Unification approach definition
Filters work in a moving window.
For each window position, a filter generates an output value by applying a certain estimation operation, ESTM, to a certain set of values that we will call the neighborhood, NBH.
L.P. Yaroslavsky, Nonlinear Signal Processing Filters: A Unification Approach.
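The window / NBH / ESTM decomposition can be sketched in software as a function that takes the neighborhood builder and the estimation operation as parameters. This is only an illustrative Python model, not the hardware design: the function names, the small test image and the choice to skip border pixels are all assumptions.

```python
from statistics import median

def sliding_window_filter(image, n, nbh, estm):
    """Apply estm(nbh(window, center)) wherever an n x n window fits.

    image is a list of rows; border pixels are skipped (an assumption).
    """
    rows, cols = len(image), len(image[0])
    half = n // 2
    out = []
    for r in range(half, rows - half):
        out_row = []
        for c in range(half, cols - half):
            # Collect the n x n window around (r, c), row by row.
            window = [image[r + dr][c + dc]
                      for dr in range(-half, half + 1)
                      for dc in range(-half, half + 1)]
            out_row.append(estm(nbh(window, image[r][c])))
        out.append(out_row)
    return out

def whole_window(window, center):
    """Simplest NBH: every pixel of the window."""
    return window

# Hypothetical 3 x 4 test image.
image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12]]

median_result = sliding_window_filter(image, 3, whole_window, median)
max_result = sliding_window_filter(image, 3, whole_window, max)
```

Swapping `estm` (median, max, min) or `nbh` changes the filter without touching the window logic, which is the point of the unified description.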
What is a window: example
We take an image and look at a small part in its upper left corner.
It is made of 7 x 5 pixels.
[Figure: the 7 x 5 block of pixel gray levels, with values from 99 to 118.]
A 3 x 3 sliding window example
N = n x n: number of elements in the sliding window
[Figure: the 3 x 3 window shown at four successive positions over the 7 x 5 pixel block.]
NBH example: K nearest value (100,1)
Sliding example
Unification approach: pixel
Unification approach: NBH and ESTM
Window operations example
Let's look at a 3 x 3 window.
[Figure: the 7 x 5 pixel block with a 3 x 3 window highlighted.]
Vector it: 105 108 110 102 104 99 100 101 103
Sort it: 99 100 101 102 103 104 105 108 110
Min, Max, Median
Rank
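The window operations listed above can be checked on the nine values of the example window. This is a minimal Python sketch; the rank is taken here as the 1-based position in the sorted vector, and all values in the window are assumed to be distinct.

```python
# The 3 x 3 window of the example, written as a vector.
window = [105, 108, 110, 102, 104, 99, 100, 101, 103]

# Sort it.
ordered = sorted(window)

# Order statistics read directly from the sorted vector.
minimum = ordered[0]
maximum = ordered[-1]
median = ordered[len(ordered) // 2]

# Rank of each pixel: its 1-based position in the sorted vector.
ranks = [ordered.index(v) + 1 for v in window]

print(minimum, maximum, median)
print(ranks)
```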
Median example
Example of a 5 x 5 window median filter.
The images are shown before and after running in the hardware simulator.
Window operations example
Let's look at a 3 x 3 window.
[Figure: the 7 x 5 pixel block with a 3 x 3 window highlighted.]
Vector it: 105 108 110 102 104 99 100 101 103
Get rank order statistics: 7 8 9 4 6 1 2 3 5
Min, Max, Median
Create look up table: 28 57 85 113 142 170 198 227 255
Histogram equalization: 198 227 255 113 170 28 57 85 142
Histogram
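The rank-then-look-up-table steps of this example can be reproduced with a short Python model. The mapping used below, round(255 * rank / N), is an assumption: it was chosen because it reproduces the table values shown on the slide, and the function name is hypothetical.

```python
def equalize_window(window, max_level=255):
    """Local histogram equalization of one window via ranks and a LUT.

    Assumes distinct pixel values, as in the slide example.
    """
    n = len(window)
    ordered = sorted(window)
    # 1-based rank of every pixel inside the window.
    ranks = [ordered.index(v) + 1 for v in window]
    # LUT indexed by rank: round(max_level * rank / n), integer arithmetic only.
    lut = [(max_level * r + n // 2) // n for r in range(n + 1)]
    return ranks, lut, [lut[r] for r in ranks]

window = [105, 108, 110, 102, 104, 99, 100, 101, 103]
ranks, lut, equalized = equalize_window(window)
```

Because the LUT depends only on the window size, it can be precomputed once, which is why a ROM is enough on the hardware side.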
Unification approach – hist eq
Unification approach - example
Hardware and flow
Hardware implementations topics
FPGA
VHDL
Tools
Flow: Generation, Implementation, Verification, Analysis
VHDL code generator
Verification suite
FPGA - Architecture
CLB
IOB
OSC
Startup
JTAG
Routes
Configuration Memory
Configure the FPGA to a specific application
CLB
IOB
ROT
LOG
Field Programmable Gate Array
FPGA – Building blocks: CLB
Look Up Table - LUT
FF
Routes
FPGA – Building blocks: IOB
PAD
BUFFER
FF
FPGA – Building blocks: ROUTE
PSM - Programmable Switching Matrix
VHDL: VHSIC Hardware Description Language
Standard IEEE language for hardware generation & simulation (IEEE 1076-87, IEEE 1076-93)
Top-down design
Design reuse
Behavioral description
RTL: Register Transfer Level
VHDL Hardware Description Language
[Figure: SER_PAR block with inputs nRESET, BIT_CLK, CE, SD_IN and output PD_OUT.]
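library IEEE;
use IEEE.STD_LOGIC_1164.all;
-- PIXEL_TYPE, PIXEL_LINE_TYPE, ZERO_LINE and IMAGE_DIMENSION are assumed
-- to be declared in a project package that is not shown on the slide.

entity ser_par is
  port (
    nRESET  : in  STD_LOGIC;
    BIT_CLK : in  STD_LOGIC;
    CE      : in  STD_LOGIC;
    SD_IN   : in  PIXEL_TYPE;
    PD_OUT  : out PIXEL_LINE_TYPE
  );
end ser_par;

architecture behave of ser_par is
  signal SER_REG : PIXEL_LINE_TYPE;
begin
  -- Shift register: on each enabled clock edge the new pixel enters at
  -- position 1 and the stored pixels move one position up.
  D_IN_LABEL : process (BIT_CLK, nRESET)
  begin
    if (nRESET = '0') then
      SER_REG <= ZERO_LINE;
    elsif (rising_edge(BIT_CLK)) then  -- CLK rising edge
      if (CE = '1') then
        SER_REG(1) <= SD_IN;
        SER_REG(2 to IMAGE_DIMENSION) <= SER_REG(1 to IMAGE_DIMENSION - 1);
      end if;
    end if;
  end process;

  PD_OUT <= ZERO_LINE when (nRESET = '0') else SER_REG;
end behave;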
example
FLOW – General
Entering the design
Synthesizing
Functional simulation
Implementation
Timing simulation
Programming file
Simulation
Design entry: VHDL, schematics, state machine editor
?
Synthesiser
Implementation
Simulation?
FPGA Design Flow
TOOLS
MATLAB - modeling of a filter in a HW writing style
Xilinx WebPACK - synthesizer, mapper, place and route
ModelSim - VHDL model simulation
VHDL code generator
VHDL code generator
One of the novelties in our work
Creates the required VHDL code
Supports all window sizes
Vendor independent
Simple to use
FPGA Design – verification
Take an image
In MATLAB, turn it into stream files
Send them to the simulator
Receive the simulator output vector stream
Verify in the MATLAB environment: VHDL model result vs. MATLAB model result
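The verification loop above (image to stream file, simulator, output stream, comparison) can be sketched in Python. Everything here is a stand-in under stated assumptions: the file name, the one-pixel-per-line text format and the helper names are hypothetical, and the "simulator" simply echoes its input so that the sketch runs on its own.

```python
import os
import tempfile

def image_to_stream(image):
    """Flatten a 2-D image (list of rows) into a row-by-row pixel stream."""
    return [pixel for row in image for pixel in row]

def write_stream(path, pixels):
    """Write one pixel value per line (assumed text format)."""
    with open(path, "w") as f:
        f.writelines(f"{p}\n" for p in pixels)

def read_stream(path):
    """Read back a stream file written by write_stream."""
    with open(path) as f:
        return [int(line) for line in f if line.strip()]

def mismatches(reference, simulated):
    """Indices where the streams differ; extra or missing samples count too."""
    bad = [i for i, (r, s) in enumerate(zip(reference, simulated)) if r != s]
    shorter, longer = sorted((len(reference), len(simulated)))
    return bad + list(range(shorter, longer))

image = [[118, 114, 107], [108, 104, 110]]  # hypothetical test image
reference = image_to_stream(image)

with tempfile.TemporaryDirectory() as tmp:
    vec_in = os.path.join(tmp, "vec_in.txt")  # hypothetical file name
    write_stream(vec_in, reference)
    # Stand-in for the HDL simulator run: echo the input stream.
    vec_out = read_stream(vec_in)

errors = mismatches(reference, vec_out)
```

In a real flow the reference stream would come from the software model of the filter and `vec_out` from the simulator's output file; an empty `errors` list means the two models agree sample by sample.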
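Simulation & Testing Environment
[Figure: a MATLAB image generator writes the input vector (VEC_IN) as text files; the VHDL model (signals nRESET, BIT_CLK, CE, LINE_CLK, SD_IN, SD_OUT) produces VEC_OUT, which is compared with the MATLAB model. Tools: MATLAB, ModelSim, Foundation.]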
Research goals
Research goals topics
Algorithms implementation study
Create building blocks for real time image processing – LEGO style
Graphic Co-Processor
Long term goals
Algorithms implementation study
Compare different implementations of the same algorithms
Compare variations of the same algorithms
Area
Speed - Performance
Latency
Power
Other studies:
• Silicon regularity
• Primitives usage
• Pipelining and routing issues
Create Processing Blocks
Serial / Parallel sorter
Serial / Parallel Rank computer
Serial / Parallel Occurrences computer
Serial Histogrammer
Histogram equalization
Focus on the engine
Intellectual Property (IP) philosophy
Create Processing Blocks
A sorter: in this example, a 3-input vector
[Figure: SORTER processing block. Successive input vectors (102, 99, 101), (105, 102, 104), (102, 100, 101), (104, 102, 103) enter; the sorted outputs (99, 101, 102), (102, 104, 105), (100, 101, 102) follow.]
Create Processing Blocks
A median filter to denoise an image
[Figure: MEDIAN processing block; noisy image in, denoised image out.]
Graphic Co-Processor
Advanced bio-medical imaging systems
Accelerate graphic performance
Concentrate on non linear filters
Dedicated hardware
Single Instruction Multiple Data – SIMD
Configurable processor
Artificial retina
Numerous works are trying to make progress in the field.
Related work
Related work topics
Graphic processing hardware language
Specific image processors
Application Specific Integrated Circuits (ASICs) and boards
Sorters
Histogrammer
Image language - Crookes
In these works the group developed a high level language based on a set of image processing commands.
This language can be synthesized into a flexible HW solution.
Based on specific HW – non generic
Limited abilities
P. Donachy, Design and Implementation of a High Level Image Processing Machine Using reconfigurable Hardware. PhD thesis, The Queen’s University of Belfast, Ireland 1996.
D. Crookes, K. Benkrid, J. Smith, A. Benkrid, High Level Programming for Real Time FPGA-Based Video Processing, Proceedings of ICASSP2000, Istanbul 2000.
D. Crookes, K. Benkrid, A. Bourdane, K. Alotaibi, A. Benkrid, Design and implementation of high level programming environment for FPGA-based image processing, IEEE Proc visual image process, Vol. 147 No. 4 August 2000.
ASIC Image processor
A full fixed image processor
Implemented in ASIC
Required large memory
Parallel approach
Off line processing
100 MHz = 0.1 GHz = 10 ns
S. Muller, A New Programmable VLSI Architecture for Histogram and Statistics Computation In Different Windows, IEEE 08186-7310-9/95, Hamburg, Germany 1995.
Fixed Image processor
An image processor that, for a 3 x 3 window, is able to do:
Median, morphological, addition, subtraction; mostly linear
100 MHz = 0.1 GHz = 10 ns
K. Wiatr, Pipeline Architecture of specialized reconfigurable processor in FPGA structures for real time pre-processing, IEEE 1089-6503/98, University of Krakow, Poland 1998.
Other
Other sorters used specific cells
Combination of HW and software solution
R. Lin, S. Olariu, “Efficient VLSI Architecture for column sort”. IEEE Transactions on VLSI system Vol 7, NO 1, March 1999.
M. Bednara, O. Beyer, J. Teich, R. Wanka, “Tradeoff Analysis And Architecture Design Of Hybrid Hardware/Software Sorter”, Application-Specific Systems, Architectures, and Processors, 2000. Proceedings., 10-12 July 2000 pg 299 –308.
Algorithms
Algorithms topics
Sorters: Serial / Parallel
Rank computer: Serial / Parallel
Histogrammer: Serial / Parallel
Histogram equalization
Sorter Serial - basic
Cell: Value, Age
Sorter cells: main, shadow
Full sorter
Not a First In First Out (FIFO)
[Figure: a three-cell serial sorter (each cell holds Value and Age) with an overflow cell, fed by the stream 7 3 5 1 9 0. Successive contents: 0 1 9 -; 0 1 5 9 with 9 in the overflow cell; 1 5 9 -; 1 3 5 9 with 9 in the overflow cell; 1 3 5 -.]
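The behavior of the serial sorter on this stream can be modeled in a few lines of Python: a new value is inserted at its sorted position, and the oldest value retires once the sorter is full. The class and attribute names are hypothetical, and a queue of arrival order stands in for the hardware's per-cell age counters.

```python
from collections import deque

class SerialSorter:
    """Software model of a sorter that keeps the last `size` values sorted."""

    def __init__(self, size):
        self.size = size
        self.sorted_values = []   # always kept in ascending order
        self.arrival = deque()    # arrival order, stands in for the Age fields

    def push(self, value):
        # Insert the incoming value at its sorted position.
        i = 0
        while i < len(self.sorted_values) and self.sorted_values[i] <= value:
            i += 1
        self.sorted_values.insert(i, value)
        self.arrival.append(value)
        # Retire the oldest value when the sorter overflows.
        if len(self.arrival) > self.size:
            oldest = self.arrival.popleft()
            self.sorted_values.remove(oldest)
        return list(self.sorted_values)

sorter = SerialSorter(size=3)
# The stream 7 3 5 1 9 0 of the figure, taken with 0 arriving first.
states = [sorter.push(v) for v in [0, 9, 1, 5, 3, 7]]
```

The sorted contents after each step follow the figure: 0 1 9, then 1 5 9 once 0 retires, then 1 3 5 once 9 retires.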
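Sorter Serial – cells
Main cell
Shadow cell
[Figure: main element with inputs B (Pixel In) and A_{n-1}/Age, comparators B > A_{n-1} and B > A_n, multiplexers and a flip-flop, output A_Out (A_n/Age). Shadow element with inputs C_n/Age, C_{n+1} and Retire_{n-1}, an Age++ counter, a multiplexer and a flip-flop, outputs C_Out and Retire_n.]
[Equations: A_Out selects among A_n, B and A_{n-1} according to the comparisons of B with A_n and A_{n-1}; C_Out selects between C_n and C_{n+1} according to the age/retirement test and Retire_{n-1}.]
A 3 bit sorter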
Parallel Sorter - basic
Distributed Arithmetic (DA)
[Figure: a DA comparator cell takes two values (112, 101) and routes the smaller one up and the bigger one down.]
[Figure: C-C&S cell (inputs a_in, b_in; outputs a1_out, b1_out; comparator and multiplexer) and S-C&S cell (comparator, multiplexer and flip-flop).]
Example
[Figure: three compare-and-swap (C&S) cells sorting the inputs 3, 2, 1 into 1, 2, 3.]
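A compare-and-swap cell and a small network built from it can be modeled in Python as follows. The three-cell arrangement and the odd-even transposition generalization are assumptions used for illustration; the slides do not spell out the exact wiring of the array.

```python
def compare_and_swap(a, b):
    """C&S cell: return (smaller, bigger)."""
    return (a, b) if a <= b else (b, a)

def sort3(x0, x1, x2):
    """Three C&S cells in sequence sort any three inputs."""
    x0, x1 = compare_and_swap(x0, x1)
    x1, x2 = compare_and_swap(x1, x2)
    x0, x1 = compare_and_swap(x0, x1)
    return x0, x1, x2

def transposition_sort(values):
    """Odd-even transposition network: n stages of neighboring C&S cells.

    The cells inside one stage touch disjoint pairs, so in hardware they
    can all operate in parallel.
    """
    v = list(values)
    n = len(v)
    for stage in range(n):
        for i in range(stage % 2, n - 1, 2):
            v[i], v[i + 1] = compare_and_swap(v[i], v[i + 1])
    return v

small = sort3(3, 2, 1)
window_sorted = transposition_sort([105, 108, 110, 102, 104, 99, 100, 101, 103])
```

Placing registers between stages turns each stage into a pipeline step, which is the trade-off between fully and partly pipelined sorters.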
Parallel Sorter - pipeline
Fully pipelined sorter
Partly pipelined sorter
Interestingly, the partly pipelined sorter is faster in some cases.
For example, the adjustable parallel sorter works 15% faster than the fully pipelined sorter at 150 MHz.
Parallel Rank computer
Compare each pair
Sum up the comparisons
Use of comparator primitives
[Figure: registers (flip-flops) holding the values 1, 3, 2 feed a matrix of comparators; an adder sums the comparison results for each value to give its rank.]
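The compare-each-pair and sum-the-comparisons scheme can be written directly in Python. This is a sketch with a hypothetical function name; the rank is taken as the number of elements with a lower value, so it starts at 0.

```python
def parallel_ranks(values):
    """Rank of each element = number of elements with a strictly lower value.

    Every comparison is independent of the others, which is what lets the
    hardware perform them all at once and only sum the results.
    """
    return [sum(1 for other in values if other < v) for v in values]

small = parallel_ranks([1, 3, 2])
window_ranks = parallel_ranks([105, 108, 110, 102, 104, 99, 100, 101, 103])
```

For N elements this needs on the order of N x N comparators, which is the area cost paid for the speed.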
Based on Prof Yaroslavsky work
SRC
HIST
OCC
Serial Rank computer - basic
Cell: Value, Rank
[Figure: rank computer built as a FIFO of Value/Rank cells, clocked by the input stream 7 3 5 1 9 0 (Number In, sampling time, current time). Successive (value, rank) contents: (1, 2), (9, 3), (0, 1); then (5, 2), (1, 1), (9, 3); then (3, 2), (5, 3), (1, 1).]
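The serial rank computer can be modeled as a FIFO of (value, rank) cells whose ranks are corrected as values leave and enter. The update rule below is an assumption written to match the figure (ranks start at 1, and of two equal values the newer one gets the higher rank); the class name is hypothetical.

```python
from collections import deque

class SerialRankComputer:
    """FIFO of [value, rank] cells, newest cell first."""

    def __init__(self, size):
        self.size = size
        self.cells = deque()

    def push(self, b):
        # The oldest value leaves: every stored value at or above it drops one rank.
        if len(self.cells) == self.size:
            leaving, _ = self.cells.pop()
            for cell in self.cells:
                if leaving <= cell[0]:
                    cell[1] -= 1
        # The new value b enters: larger stored values move up one rank,
        # and b's own rank is 1 plus the number of stored values not above it.
        rank_b = 1
        for cell in self.cells:
            if b < cell[0]:
                cell[1] += 1
            else:
                rank_b += 1
        self.cells.appendleft([b, rank_b])
        return [tuple(cell) for cell in self.cells]

computer = SerialRankComputer(size=3)
# The stream 7 3 5 1 9 0 of the figure, taken with 0 arriving first.
states = [computer.push(v) for v in [0, 9, 1, 5, 3, 7]]
```

Each step costs two comparisons per cell, one against the entering value and one against the leaving value, rather than a full re-sort.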
Serial Rank computer - cells
First cell
Rank cell
[Figure: first rank cell: the incoming value B_In is compared with every stored value (A_0 < B_In, A_1 < B_In, ..., A_n < B_In) and an adder sums the results; the rank is stored in a flip-flop. Rank cell: holds A_n/Rank_n in a flip-flop and uses two comparators (against B_In and against the last value), a subtractor (-1) and an adder (+1) to update Rank_n as cells shift from A_{n-1}/Rank_{n-1} to A_n/Rank_n.]
[Equations: update rules for Rank_n in terms of Rank_{n-1}, Rank_Last, A_n and B_in.]
Serial Occurrences computer
Based on the rank computer
First occurrence cell
Occurrence cell
[Figure: first occurrence cell: comparators A_1 = B_In, ..., A_n = B_In feed an adder and a flip-flop. Occurrence cell: holds A_n and its occurrence count OCC_n in flip-flops, with +1 and -1 updates driven by B_In and the last number in, shifting from OCC_{n-1} to OCC_n. The occurrence counts form the histogram.]
Histogrammer
A 5 pixel FIFO, 256 level example
[Figure: flip-flop based histogrammer (FF HIST) with bins 0 to 255 and multiplexers: the bin of the pixel entering the FIFO gets +1 and the bin of the pixel leaving gets -1, with an equality (=) check between the two. Pixel stream: 108 103 101 104 101 103 113. Steps shown: 108 leaves, 103 enters; then 103 leaves, 113 enters.]
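The sliding histogram update (+1 for the entering pixel, -1 for the leaving pixel) can be modeled in Python. The class name is hypothetical, and the model simply applies both updates; when the entering and leaving pixels are equal the two updates cancel.

```python
from collections import deque

class SlidingHistogram:
    """Histogram of the last `fifo_length` pixels of a stream."""

    def __init__(self, fifo_length, levels=256):
        self.fifo_length = fifo_length
        self.fifo = deque()
        self.bins = [0] * levels

    def push(self, pixel):
        # The entering pixel increments its bin.
        self.fifo.append(pixel)
        self.bins[pixel] += 1
        # Once the FIFO is full, the leaving pixel decrements its bin.
        if len(self.fifo) > self.fifo_length:
            leaving = self.fifo.popleft()
            self.bins[leaving] -= 1

hist = SlidingHistogram(fifo_length=5)
for pixel in [108, 103, 101, 104, 101]:
    hist.push(pixel)
count_101_initial = hist.bins[101]

hist.push(103)   # 108 leaves, 103 enters
after_first = (hist.bins[108], hist.bins[103])

hist.push(113)   # 103 leaves, 113 enters
after_second = (hist.bins[103], hist.bins[113])
```

Only two bins change per pixel, regardless of the window length, which is what makes the histogrammer cheap to update.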
Histogrammer - DPR
A dual port RAM (DPR) histogrammer
DPR: 8 bit data, 9 bit address
[Figure: DPR with ports DA_IN, DA_OUT, DB_IN, DB_OUT and +1 / -1 update paths for the last and first pixels.]
Two ports enable access to two memory cells at the same time.
Histogram equalization
Mapping from a window gray scale range [0 - max pixel value] to the full dynamic range [0-255]
Calculate the rank vector
Create a divider using a look up table
Integrate both to achieve this functionality
[Figure: histogram equalization block: Rank Computer followed by LUT.]
Results
Results topics
Analysis of algorithms
Area
Speed - performance
Power
Latency
Conclusions
Results Speed - basic
The speed is for one operation on N elements and is defined in MHz.
For reference, an 8 bit counter runs at 300 MHz.
An 81 pixel sorter works at 147 MHz, so 81 pixels will be sorted every 6.8 ns.
The speed is limited by the time it takes for a signal to propagate from one state element to the next state element.
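The 6.8 ns figure is the clock period at 147 MHz. A small Python check, with a hypothetical helper name:

```python
def period_ns(freq_mhz):
    """Clock period in nanoseconds for a clock frequency given in MHz."""
    return 1000.0 / freq_mhz

sorter_period = period_ns(147)    # 81 pixel sorter
counter_period = period_ns(300)   # 8 bit reference counter
print(round(sorter_period, 1), round(counter_period, 2))
```

This assumes one sorted result per clock cycle once the pipeline is full, which is how the slide's "every 6.8 ns" reads.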
Results Speed
The results are normalized to the slowest algorithm, working at 96 MHz.
Results Size
For an N x N window
• Or a more general realization
[Equations: area expressions as functions of N and S.]
Results Latency
The latency depends on the architecture, mainly on the number of state elements.
[Equations: latency expressions as functions of N and S.]
Results power
The power depends on:
activity factor
area used
[Equations: power expressions as functions of N.]
Results Histograms
A histogram using DPR is very inexpensive in terms of FPGA area.
Each 256 level DPR histogram takes about 1/32 of the available DPR.
Results Histogram equalization
Histogram equalization makes use of the rank computer.
The look up table used to equalize the histogram is a ROM that is “free” of charge.
Results Uniqueness
Focus on non linear filtering
Support any window size
Pipeline adjustable sorter
VHDL generator – configurable processor
HW oriented MATLAB models
Full verification suite
IP approach
Analysis based on implementations
Conclusion
Parallel algorithms are faster than serial ones.
Parallel algorithms are more costly than serial ones.
The ADA sorter is better than the DA sorter.
FPGAs are fit to process high volume data.
The usage of FPGAs for NLF is feasible.
Algorithms implementation study
Create building blocks for real time image processing – LEGO style
Graphic Co-Processor
Long term goals - After the engine is ready we need the body and interface.
Further work
Graphic Co-Processor
Long term goals - After the engine is ready we need the body and interface.
Building more blocks, such as neighborhood creation
Extending estimation operations
Demo
VHDL – code generator
Implementation
Simulation
Image example
Thanks to:
My wife Nava for her devoted support
Prof. Yaroslavsky for patient guidance
Mr. Shalom Danny for helping with the GUI
Bibliography
Non linear filters
Artificial retina
Image processor
Sorters
Rank computer
Histogramming
Bibliography
[1] J. Astola, P. Kuosmanen, Fundamentals of Nonlinear Digital Processing, CRC Press, Boca Raton, N.Y., 1997
[2] L. Yaroslavsky, Nonlinear Filters for Image Processing in Neuromorphic Parallel Networks, Optical Memory and Neural Networks, vol. 12, No. 1, 2003
[3] L. Yaroslavsky, Digital Holography and Digital Image Processing, Kluwer scientific publications, Boston, 2003, ch.12.
[4] A. Asano, K. Kazuyoshi, Y. Ichioka, “The nearest neighbor median filter: some deterministic properties and implementations”. Pattern Recognition Vol23, No. 10, pp.1059-1066, Great Britain 1990.
[5] P. Donachy, “Design and Implementation of a High Level Image Processing Machine Using reconfigurable Hardware”. PhD thesis, The Queen’s university of Belfast , Ireland 1996.
[6] D. Crookes, K. Benkrid, J. Smith, A. Benkrid, “High Level Programming for Real Time FPGA-Based Video Processing”. Proceedings of ICASSP2000, Istanbul 2000.
[7] D. Crookes, K. Benkrid, A. Bourdane, K. Alotaibi, A. Benkrid, “Design and implementation of high level programming environment for FPGA-based image processing”. IEEE Proc visual image process, Vol. 147 No. 4 August 2000.
[8] R. Lin, S.Olariu, “Efficient VLSI Architecture for column sort”. IEEE Transactions on VLSI system Vol 7, NO 1, March 1999.
[9] C. Hennind, T. G. Noll, “Architecture And Implementation Of BitSerial Sorter For Weighted Median Filter”. Custom Integrated Circuits Conference, Proceedings of the IEEE 1998, pg 189–192, University Of Technology RWTH Aachen, Germany.
[10] L.Lin, G.B. Adams II, E.J. Coyle, “Input Compression and Efficient Algorithms and Architectures for Stack filters”. IEEE proc. Winter Workshop on non linear digital signal processing, Tempere Finland pp.5.2-5 Jan 1993
[11] M. Bednara, O. Beyer, J. Teich, R. Wanka, “Tradeoff Analysis And Architecture Design Of Hybrid Hardware/Software Sorter”, Application-Specific Systems, Architectures, and Processors, 2000. Proceedings., 10-12 July 2000 pg 299 –308.
[12] N. Woolfries, P. Lysaght, S. Marshall, G. McGregor, D. Robinson, “Fast Implementations Of Non Linear Filters using FPGA’s”, Non-Linear Signal and Image Processing (Ref. No. 1998/284), IEE Colloquium on , 22 pg. 13/1-13/5 May 1998.
[13] J. H. Koo, T. S. Kim, S. S. Dong, C. H. Lee, “Development Of FPGA Based Adaptive Image Enhancement Filter System Using Genetic Algorithm” , Evolutionary Computation, 2002. CEC '02. Proceedings of the 2002 Congress on , Volume: 2 , pg 1480-1485 12-17 May 2002.
Sorters
[1] J. Wiseman, A Hardware architecture for efficient Implementation of Real-Time Weighted median filter. www.
[2] L. Lin, G.B. Adams II, E.J. Coyle, Input Compression and Efficient Algorithms and Architectures for Stack filters, IEEE proc. Winter Workshop on non linear digital signal processing, Tampere, Finland, pp. 5.2-5, Jan 1993.
[3] N. Woolfries, P. Lysaght, S. Marshall, G. McGregor, D. Robinson, Fast implementation of Non-linear filters using FPGA.
[4] R. Lin, S. Olariu, Efficient VLSI Architecture for column sort, IEEE Transactions on VLSI system, Vol 7, No 1, March 1999.
[5] I. Hatirans., Y. Leblebci, Scalable Binary Sorting Architecture based on Rank Ordering with Linear Area Time Complexity, IEEE 0-7803-6598-4/00, 2000.
[6] M. Bednara, O. Beyer, J. Teich, R. Wanka, Tradeoff Analysis And Architecture Design Of Hybrid Hardware/Software Sorter, Paderborn University, Germany 2000.
[7] C. Hennind, T. G. Noll, Architecture And Implementation Of Bit Serial Sorter For Weighted Median Filter, RWTH Aachen, Germany 1998.
Sorters
K. Wiatr, Pipeline Architecture of specialized reconfigurable processor in FPGA structures for real time pre-processing, IEEE 1089-6503/98, University of Cracow, Poland 1998.
S. Muller, A New Programmable VLSI Architecture for Histogram and Statistics Computation In Different Windows,IEEE 08186-7310-9/95 Hamburg Germany 1995.
design Implementation And Evaluation of a VLSI High Speed array Processor for real time image processing morphology operations 1990 !!!
A. Raghupathy,P. Hsu,K.J. Liu,N. Chandraxhoodan,VLSI Architecture and Design for High Performance Adaptive Video Scaling, IEEE 0-7803-5471-0/99, University of Maryland, USA 1999.
M. Kelly, K. W. Kenneth, W. Hsu, A flexible pipelined image processor, IEEE 0-7803-4980-6/98 NY,USA 1998
G. Angelopoulos,I. Pitas, A Fast Implementation of 2-D Weighted Median Filter,IEEE 1051-4691/94 University of Thessalonica Greece, 1994.
P.S. Windyga, Fast Impulsive Noise Removal, IEEE 1057–7149/01, University of central Florida Orlando 2001
2D median filter algorithm for parallel reconfigurable computers 1995
Any questions
END
VHDL - exampleCounter example
entity D_FF is
  port (
    D, CLK_S : in  Bit;
    Q        : out Bit := '0';
    NQ       : out Bit := '1'
  );
end entity D_FF;

architecture Behave of D_FF is
begin
  BIN_FF : process (CLK_S)
  begin
    if (CLK_S = '1' and CLK_S'event) then
      Q  <= D;
      -- An out port cannot be read back in VHDL-87/93, so NQ is derived
      -- from D; after the edge NQ is the complement of Q.
      NQ <= not D;
    end if;
  end process;
end architecture Behave;

entity COUNTER_BIN_N is
  generic (N : Integer := 4);
  port (
    IN_1 : in  Bit;
    Q    : out Bit_Vector (0 to N-1)
  );
end entity COUNTER_BIN_N;

architecture Behave of COUNTER_BIN_N is
  component D_FF
    port (D, CLK_S : in Bit; Q, NQ : out Bit);
  end component D_FF;
  signal S : Bit_Vector (0 to N);
begin
  S(0) <= IN_1;
  -- Ripple counter: each stage feeds NQ back to D and clocks the next stage.
  G_1 : for I in 0 to N-1 generate
    D_Flip_Flop : D_FF port map (S(I+1), S(I), Q(I), S(I+1));
  end generate;
end architecture Behave;
[Figure: D flip-flop symbol (D, Q, nQ, CLK_S) and the COUNTER_BIN_N block (input IN_1, output Q[0:3]) built from four chained D flip-flops.]
entity COUNTER_BIN_3 is
  port (
    IN_1 : in  Bit;
    Q    : out Bit_Vector (0 to 3)
  );
end entity COUNTER_BIN_3;

architecture Behave of COUNTER_BIN_3 is
  component D_FF
    port (D, CLK_S : in Bit; Q, NQ : out Bit);
  end component D_FF;
  signal S : Bit_Vector (0 to 4);
begin
  S(0) <= IN_1;
  -- The generate loop unrolled by hand; Q is driven by the flip-flop outputs.
  G_1 : D_FF port map (S(1), S(0), Q(0), S(1));
  G_2 : D_FF port map (S(2), S(1), Q(1), S(2));
  G_3 : D_FF port map (S(3), S(2), Q(2), S(3));
  G_4 : D_FF port map (S(4), S(3), Q(3), S(4));
end architecture Behave;
Rank
1) Number of the neighboring elements with values lower than a
2) Position of value a in a variational row (the neighborhood elements ordered in ascending value)
Original vector: 105 108 110 102 104 99 100 101 103
Rank vector: 7 8 9 4 6 1 2 3 5
Variational vector: 99 100 101 102 103 104 105 108 110
Histogram
Number of the neighboring elements with the same value as that of the element a (defined for quantized values).
Original vector: 105 108 110 102 104 99 100 101 103
Variational vector: 99 100 101 102 103 104 105 108 110
Histogram: 1 1 1 1 1 1 1 1 1
Non Linear filters
Histogram equalization
Let's look at a 3 x 3 window.
[Figure: the 7 x 5 pixel block with a 3 x 3 window highlighted.]
Get pixel ranks: 7 8 9 4 6 1 2 3 5
Create look up table: 28 57 85 113 142 170 198 227 255
Histogram equalization: 198 227 255 113 170 28 57 85 142
Neighborhood example I
For this 3 x 3 window:
Morphological cross / lower part
Value +-2
Rank +-1
[Figure: the 7 x 5 pixel block repeated for each neighborhood type, with the selected neighborhood highlighted.]
Operation on a sliding window
Running a window of n x n pixels
N = n x n
N = number of pixels
FPGA – Programmable Logic
Logic functions:
• AND, OR, etc.
Math functions: +, *
Memory, FF, state elements:
• Flip Flop - FF
• Latch
• Random Access Memory - RAM
• Read Only Memory - ROM
• First In First Out - FIFO
• Dual Port RAM - DPR
[Figure: AND function (inputs B and C, output A, with its truth table) and multiply function.]
[Figure: D flip-flop with Data In (D), Clock (CLK) and Data Out (Q), with a timing waveform.]
Field Programmable Gate Array
FLOW – General
Functional specification
Design specification
MATLAB simulation
Design and verification
Implementation and analysis
TOOLS FPGA Design flow
FPGA Design - Synthesizer
Translates VHDL into physical components like gates and FFs.
Optimizes Boolean logic.
Uses constraints to define its goals.
Uses specific vendor primitives.
FPGA Design - Simulator
Sorter Serial – Main cell
[Figure: main element with inputs B (Pixel In) and A_{n-1}/Age, comparators B > A_{n-1} and B > A_n, output A_n/Age.]
[Equations: A_Out selects among A_n, B and A_{n-1} according to the comparisons of B with A_n and A_{n-1}.]
Sorter Serial – Shadow cell
[Figure: shadow element with inputs C_n/Age+1, C_{n+1} and Retire_{n-1}, an age comparison (Age =), outputs C_n and Retire_n.]
[Equations: C_Out selects between C_n and C_{n+1} according to the age/retirement test and Retire_{n-1}.]
Sorter Serial – 3 bit sorter
[Figure: main and shadow cells chained to form the sorter; each stage has inputs B (Pixel In), A_{n-1}/Age, C_{n+1} and Retire_{n-1}, and outputs A_n/Age, C_n and Retire_n.]
Parallel Sorter - array
Distributed Arithmetic
[Figure: array of three compare-and-swap (C&S) cells sorting the inputs 3, 2, 1 into 1, 2, 3.]
Histogrammer – FF
A dual port single state element cell
This cell enables:
MUX on I/O
Write enable
Memory
[Figure: cell with ports D_inA, D_inB, AnB, D_outA, D_outB, D_out, Rd, Wr, clk, Reset.]
[Figure: histogrammer step with bins 0 to 255 and +1 / -1 updates; FIFO contents 103 101 104 101 103 113.]
Histogram equalization Divider
The divider is a ROM, a look up table (LUT).
The input is the address of the memory cell.
The memory cell stores the result of the division.
The LUT gives the result for a given constant coefficient.
[Figure: Divider block: address (8 bit) = input value; output (8 bit) = division result.]
Results for Parallel Sorter
Analysis
                      Parallel sorter w/o counter     Only median pixel
Size                       3        9       25         3       9      25
Registers                  9       81      625        14      93     653
Slices                    40      460     3600        55     391    2873
FF                        72      568     4792        66     522    3850
LUT                       49      864     7200        69     669    5439
Gate equivalent         1080    11232    90400      1160    9056   69196
Memory                    57       68      123        57      62     109
IOB                       49      145      401        19      19      19
Gclk fan-out              52      364        -        50     278    1943
Av conn. delay (10)      2.5      4.8        -       2.2     3.7     4.5
Result for Serial Sorter
Analysis of the Xilinx mapper and place and route reports
Size                       3        9       81
Registers                 18      282     2278
Slices                   132      345     2664
FF                       108      276     2292
LUT                      233      668     5192
Gate equivalent         2460     6720    53658
Memory                    58       61       93
IOB                       19       19       19
Gclk fan-out              75      210     1771
Av conn. delay (10)      5.9      9.2      8.8
Parallel