
0270_PDF_C06.pdf

Date post: 07-Nov-2015
Upload: bao-tram-nguyen
  • Chapter 6

    Performing Bit-Reversal by Repeated Permutation of Intermediate Results

    It has been shown in the previous chapter that if the input data are first permuted into bit-reversed order, then the radix-2 DIF$_{RN}$ FFT can be used to obtain naturally ordered output. This process is depicted for an N = 8 example in Figure 6.1. When the static permutation step is not performed in place, the bit-reversed input data are available in array b after the reordering.

    Figure 6.1 Bit-reversing the input before performing in-place DIF$_{RN}$ FFT.

    © 2000 by CRC Press LLC

  • Of course, the same result can be accomplished by bit-reversing the output from an NR FFT algorithm, as depicted in Figure 6.2 for the same example.

    Figure 6.2 Bit-reversing the output after performing in-place DIF$_{NR}$ FFT.
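
The static reordering used in Figures 6.1 and 6.2 can be sketched in a few lines of Python. This is only an illustration, not code from the book: the helper names `bit_reverse` and `bit_reverse_permute` are hypothetical, and the input length is assumed to be a power of two.

```python
def bit_reverse(index, nbits):
    """Return index with its nbits-bit binary representation reversed."""
    result = 0
    for _ in range(nbits):
        result = (result << 1) | (index & 1)  # move the lowest bit of index into result
        index >>= 1
    return result


def bit_reverse_permute(a):
    """Out-of-place reordering: b[r] = a[i], where r is i with its bits reversed."""
    n = len(a)
    nbits = n.bit_length() - 1  # log2(n), assuming n is a power of two
    b = [None] * n
    for i in range(n):
        b[bit_reverse(i, nbits)] = a[i]
    return b


print(bit_reverse_permute(list(range(8))))  # [0, 4, 2, 6, 1, 5, 3, 7]
```

Because the result is written to a second array b, this corresponds to the case where the permutation is not performed in place.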

    6.1 Combining Permutation with Butterfly Computation

    The cost of extra memory accesses in a separate bit-reversing phase can be completely eliminated if data permutation is combined with the butterfly computation at each step. Such an alternative is presented in this section.

    6.1.1 The ordered radix-2 DIF$_{NN}$ FFT

    When input x and output X are both in natural order, the algorithm is referred to as an ordered FFT in the literature. The ordered radix-2 DIF FFT procedure was originally proposed by Stockham [30, 89]. The key to understanding what is required is to view each butterfly computation as consisting of one permutation step followed by one in-place computation step. These permutation steps reorder the initial input as well as the input to each subsequent subproblem, and the notation introduced in the previous chapters can be used to describe this process in a natural way.

    Again using the N = 8 example above, with the input in the natural order, i.e., $a[i_2 i_1 i_0] = x_{i_2 i_1 i_0}$, the first in-place butterfly is denoted by $\overset{\triangle}{i_2}\, i_1 i_0$. If this butterfly operation is preceded by permuting the data in $a[i_2 i_1 i_0]$ to $b[i_1 i_0 i_2]$, it is natural to use

    $$\underset{a}{i_2 i_1 i_0} \;\longrightarrow\; \underset{b}{i_1 i_0 i_2}$$

    to denote the permutation, which is followed by in-place butterfly computation denoted by

    $$i_1 i_0 i_2 \;\longrightarrow\; i_1 i_0 \overset{\triangle}{i_2}$$

    To show the combined effect, the two sequences above are condensed into

    $$\underset{a}{i_2 i_1 i_0} \;\longrightarrow\; \underset{b}{i_1 i_0 \overset{\triangle}{i_2}}$$

    If the next step involves permuting the derivative $x^{(1)}_{i_2 i_1 i_0}$ in $b[i_1 i_0 i_2]$ to $a[i_0 i_1 i_2]$, then the derivatives $x^{(2)}_{i_2 i_1 i_0}$ and $x^{(3)}_{i_2 i_1 i_0}$ can both be computed in-place in $a[i_0 i_1 i_2]$. Since $x^{(3)}_{i_2 i_1 i_0} = X_{i_0 i_1 i_2}$ is contained in $a[i_0 i_1 i_2]$, the output frequencies $X_m$ are naturally ordered in array a as desired.

    However, the easiest way to understand an algorithm may not be the most efficient way to implement it. For example, two implementations of a single butterfly computation step involving naturally ordered input elements $a[2] = x_2$ and $a[6] = x_6$ are depicted in Figures 6.3 and 6.4.

    In Figure 6.3, the ordered DIF FFT is implemented as one understands it; i.e., a permutation step actually precedes the butterfly computation. As reflected by the fragment of pseudo-code displayed in Figure 6.3, memory locations b[4] and b[5] are each modified twice.

    In Figure 6.4, the ordered DIF FFT is implemented without first permuting a[2] to b[4], a[6] to b[5], and so on. Instead, the derivative $x^{(1)}_2$ is computed and stored directly into b[4], and so on. As reflected by the fragment of pseudo-code displayed in Figure 6.4, memory locations b[4] and b[5] are each modified only once. Since the same memory accessing pattern applies to all butterflies in every stage, this implementation eliminates all extra memory accesses in reordering intermediate results, and it is a more efficient way to implement the ordered DIF FFT algorithm. The complete pseudo-code program is given as Algorithm 6.1 below.
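
The write-count argument for the butterfly on a[2] and a[6] can be mirrored in Python. The sketch below does not reproduce the pseudo-code of Figures 6.3 and 6.4. It assumes the sign convention $\omega_N = e^{-2\pi i/N}$, applies only to the first stage (where the pair a[k], a[k + N/2] lands in b[2k], b[2k + 1]), and uses the hypothetical function names `permute_then_butterfly` and `fused_butterfly`.

```python
import cmath

N = 8
OMEGA = cmath.exp(-2j * cmath.pi / N)  # assumed sign convention for the twiddle factor


def permute_then_butterfly(a, b, k):
    """Two-pass version: copy the pair into b, then do the butterfly in place in b."""
    writes = 0
    b[2 * k], b[2 * k + 1] = a[k], a[k + N // 2]  # permutation step
    writes += 2
    total = b[2 * k] + b[2 * k + 1]
    diff = (b[2 * k] - b[2 * k + 1]) * OMEGA ** k
    b[2 * k], b[2 * k + 1] = total, diff          # in-place butterfly step
    writes += 2
    return writes


def fused_butterfly(a, b, k):
    """One-pass version: compute the butterfly from a and store it directly into b."""
    b[2 * k] = a[k] + a[k + N // 2]
    b[2 * k + 1] = (a[k] - a[k + N // 2]) * OMEGA ** k
    return 2


a = [complex(n) for n in range(N)]
b_naive, b_fused = [0j] * N, [0j] * N
print(permute_then_butterfly(a, b_naive, 2), fused_butterfly(a, b_fused, 2))
```

For k = 2, both versions leave the same values in b[4] and b[5], but the two-pass version writes each of those locations twice.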

  • Figure 6.3 Naive implementation of the (ordered) DIF$_{NN}$ FFT.

  • Figure 6.4 Implementing the (ordered) DIF$_{NN}$ FFT with no extra memory access.

  • Algorithm 6.1 The (ordered) radix-2 DIF$_{NN}$ FFT algorithm.

    begin
      NumOfProblems := 1                  {Initially: one problem of size N}
      ProblemSize := N
      HalfSize := ProblemSize/2
      Distance := 1
      NotSwitchInput := true
      while ProblemSize > 1 do            {Halve each problem}
        if NotSwitchInput                 {Array a contains input; array b contains output}
          for JFirst := 0 to NumOfProblems - 1 do
            J := JFirst; Jtwiddle := 0
            K := JFirst
            while J < N - 1 do
              W := w[Jtwiddle]
              b[J] := a[K] + a[K + N/2]
              b[J + Distance] := (a[K] - a[K + N/2]) * W
              Jtwiddle := Jtwiddle + NumOfProblems    {Assume $w[\ell] = \omega_N^{\ell}$}
              J := J + 2 * NumOfProblems
              K := K + NumOfProblems
            end while
          end for
          NotSwitchInput := false
        else                              {Array b contains input; array a contains output}
          for JFirst := 0 to NumOfProblems - 1 do
            J := JFirst; Jtwiddle := 0
            K := JFirst
            while J < N - 1 do
              W := w[Jtwiddle]
              a[J] := b[K] + b[K + N/2]
              a[J + Distance] := (b[K] - b[K + N/2]) * W
              Jtwiddle := Jtwiddle + NumOfProblems    {Assume $w[\ell] = \omega_N^{\ell}$}
              J := J + 2 * NumOfProblems
              K := K + NumOfProblems
            end while
          end for
          NotSwitchInput := true
        end if
        NumOfProblems := NumOfProblems * 2
        ProblemSize := ProblemSize/2
        Distance := Distance * 2
      end while
    end
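
A runnable Python transcription of Algorithm 6.1 may help in checking the index arithmetic. It is a sketch under stated assumptions rather than the book's program: the twiddle table is taken to be $w[\ell] = e^{-2\pi i \ell/N}$, N is assumed to be a power of two, the two branches of the if statement are replaced by swapping the roles of two lists, and the function name `ordered_dif_fft` is hypothetical.

```python
import cmath


def ordered_dif_fft(x):
    """Radix-2 ordered (Stockham-style) DIF FFT; input and output in natural order."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    # Assumed twiddle table: w[l] = omega_N^l with omega_N = exp(-2*pi*i/N).
    w = [cmath.exp(-2j * cmath.pi * l / n) for l in range(n // 2)]
    src, dst = list(x), [0j] * n   # src plays the input array, dst the output array
    num_problems, distance = 1, 1
    while num_problems < n:        # one pass per stage; equivalent to ProblemSize > 1
        for j_first in range(num_problems):
            j, k, j_twiddle = j_first, j_first, 0
            while j < n - 1:
                u, v = src[k], src[k + n // 2]
                dst[j] = u + v
                dst[j + distance] = (u - v) * w[j_twiddle]
                j_twiddle += num_problems
                j += 2 * num_problems
                k += num_problems
        src, dst = dst, src        # the arrays alternate roles, like a and b
        num_problems *= 2
        distance *= 2
    return src                     # the array written by the last stage


print([round(abs(v), 6) for v in ordered_dif_fft([1, 0, 0, 0, 0, 0, 0, 0])])
```

The unit impulse above transforms to eight values of magnitude 1. Comparing the result against a direct evaluation of the DFT sum for a few inputs is a reasonable sanity check for the index pattern.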

  • 6.1.2 The shorthand notation

    As usual, assuming that x is initially contained in a in the natural order, a second array b would alternately contain the data. The entire computation process, along with the use of the two arrays, is depicted below.

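    $$\underset{a}{i_2 i_1 i_0} \;\longrightarrow\; \underset{b}{i_1 i_0 \overset{\triangle}{i_2}} \;\longrightarrow\; \underset{a}{i_0 \overset{\triangle}{i_1} \tau_2} \;\longrightarrow\; \underset{b}{\overset{\triangle}{i_0} \tau_1 \tau_2}$$

    Note that the corresponding twiddle factors are

    $$\omega_N^{i_1 i_0}, \quad \omega_N^{i_0 0}, \quad \omega_N^{0} = 1,$$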

    because DIF$_{NR}$, DIF$_{RN}$, and DIF$_{NN}$ FFT algorithms all transform the same element $x_{i_2 i_1 i_0}$, although they refer to the different addresses of $x_{i_2 i_1 i_0}$ in expressing the same algorithm.

    Once again, all details of the (ordered) DIF$_{NN}$ FFT can be captured by a shorthand notation together with the twiddle factors.
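
The address bookkeeping in the shorthand can be checked by running only the data movement of the ordered algorithm on labels instead of values. In the sketch below, each slot carries the index it would occupy in an in-place DIF computation: the sum of a butterfly inherits the label of its first input and the difference inherits the label of its second input. The function name `trace_layout` is hypothetical, and N is assumed to be a power of two.

```python
def trace_layout(n):
    """Return the label layout of the working array before and after every stage."""
    labels = list(range(n))          # natural order: slot i holds label i
    layouts = [labels[:]]
    num_problems, distance = 1, 1
    while num_problems < n:
        out = [None] * n
        for j_first in range(num_problems):
            j, k = j_first, j_first
            while j < n - 1:
                out[j] = labels[k]                        # where the sum is stored
                out[j + distance] = labels[k + n // 2]    # where the difference is stored
                j += 2 * num_problems
                k += num_problems
        labels = out
        layouts.append(labels[:])
        num_problems *= 2
        distance *= 2
    return layouts


for layout in trace_layout(8):
    print(layout)
# [0, 1, 2, 3, 4, 5, 6, 7]
# [0, 4, 1, 5, 2, 6, 3, 7]
# [0, 4, 2, 6, 1, 5, 3, 7]
# [0, 4, 2, 6, 1, 5, 3, 7]
```

After the second stage the labels are already in bit-reversed order, and the third stage moves data between the arrays without changing the layout. Since the slot labeled $i_2 i_1 i_0$ ends up holding $X_{i_0 i_1 i_2}$, the output is in natural order.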

    6.2 Applying the Ordered DIF FFT to an N = 32 Example

    Generalizing the shorthand notation for N = 32, the following sequence represents all five stages of permutation and computation depicted in Figure 6.5.

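    $$\underset{a}{i_4 i_3 i_2 i_1 i_0} \;\longrightarrow\; \underset{b}{i_3 i_2 i_1 i_0 \overset{\triangle}{i_4}} \;\longrightarrow\; \underset{a}{i_2 i_1 i_0 \overset{\triangle}{i_3} \tau_4} \;\longrightarrow\; \underset{b}{i_1 i_0 \overset{\triangle}{i_2} \tau_3 \tau_4} \;\longrightarrow\; \underset{a}{i_0 \overset{\triangle}{i_1} \tau_2 \tau_3 \tau_4} \;\longrightarrow\; \underset{b}{\overset{\triangle}{i_0} \tau_1 \tau_2 \tau_3 \tau_4}$$

    The corresponding twiddle factors are

    $$\omega_N^{i_3 i_2 i_1 i_0}, \quad \omega_N^{i_2 i_1 i_0 0}, \quad \omega_N^{i_1 i_0 0 0}, \quad \omega_N^{i_0 0 0 0}, \quad \omega_N^{0} = 1.$$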
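
Reading each twiddle exponent above as a binary number, the exponents used in a given stage are the multiples of the current number of subproblems. The following sketch enumerates them in the same way that Jtwiddle advances in Algorithm 6.1. The function name `twiddle_exponents` is hypothetical, and N is assumed to be a power of two.

```python
def twiddle_exponents(n):
    """List, stage by stage, the exponents l for which omega_N^l is used."""
    stages = []
    num_problems = 1
    while num_problems < n:
        butterflies_per_problem = n // (2 * num_problems)
        # Jtwiddle starts at 0 and grows by num_problems for each butterfly.
        stages.append([p * num_problems for p in range(butterflies_per_problem)])
        num_problems *= 2
    return stages


for stage, exponents in enumerate(twiddle_exponents(32), start=1):
    print(stage, exponents)
```

For N = 32 the first stage uses the exponents 0 through 15, the second stage the even values 0 through 14, and the last stage only 0, in agreement with the five twiddle factors listed above.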

    By comparing Figure 6.6, where the butterflies associated with a particular pair of resulting subproblems are shown without the cluttering of others, with the two unordered DIF FFTs in Figures 4.4 and 5.3, one immediately observes that

    all three variants of the DIF FFT treat exactly the same pairs of subproblems during each stage of the computation.

    Thus they all implement the same radix-2 DIF FFT algorithm.

  • Figure 6.5 Butterflies of the (ordered) DIF$_{NN}$ FFT algorithm.

  • Figure 6.6 Identifying the subproblems paired up by the (ordered) DIF$_{NN}$ FFT.

  • 6.3 In-Place Ordered (or Self-Sorting) Radix-2 FFT Algorithms

    Another class of ordered FFTs performs in-place permutation and consequently does not need a second array; they are the so-called self-sorting in-place algorithms. This class contains variants of the prime-factor algorithms [20, 81, 99] and a radix-2 FFT [58]. This class has been further extended to include self-sorting in-place radix-3, radix-4, radix-5, and finally mixed-radix FFTs [101]. The radix-2 algorithm is relevant to the discussion in this chapter. Using the notation developed earlier, the process of applying the self-sorting in-place radix-2 DIF FFT to array a, which contains naturally ordered x, is depicted below for N = 32.

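    $$\underset{a}{i_4 i_3 i_2 i_1 i_0} \;\longrightarrow\; \underset{a}{i_0 i_3 i_2 i_1 \overset{\triangle}{i_4}} \;\longrightarrow\; \underset{a}{i_0 i_1 i_2 \overset{\triangle}{i_3} \tau_4} \;\longrightarrow\; \underset{a}{i_0 i_1 \overset{\triangle}{i_2} \tau_3 \tau_4} \;\longrightarrow\; \underset{a}{i_0 \overset{\triangle}{i_1} \tau_2 \tau_3 \tau_4} \;\longrightarrow\; \underset{a}{\overset{\triangle}{i_0} \tau_1 \tau_2 \tau_3 \tau_4}$$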

    Observe that the permutation always involves bits in symmetric positions: e.g., in step 1, the left-most bit $i_4$ switches with the right-most bit $i_0$, and in step 2, bit $i_3$, the second bit from the left end, switches with bit $i_1$, the second bit from the right end. Accordingly, the ordering of the bits is reversed after only two steps, and the permutation can be implemented using pairwise interchanges. The contents in $a[0\,i_3 i_2 i_1\,1]$ and $a[1\,i_3 i_2 i_1\,0]$ are switched in step 1, and the contents in $a[i_0\,0\,i_2\,1\,\tau_4]$ and $a[i_0\,1\,i_2\,0\,\tau_4]$ are switched in step 2. Since each pairwise interchange can be done using a single temporary location, the array b is not needed.
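
The claim that two symmetric bit exchanges reverse a 5-bit address can be checked with a small Python sketch. It performs only the permutations of steps 1 and 2 and omits the butterfly computations that the self-sorting algorithm interleaves with them, so it is not an FFT. The function name `swap_address_bits` is hypothetical.

```python
def swap_address_bits(a, p, q):
    """In place: exchange a[addr] with the element whose address has bits p and q swapped."""
    for addr in range(len(a)):
        bit_p = (addr >> p) & 1
        bit_q = (addr >> q) & 1
        if bit_p == 1 and bit_q == 0:                # visit each interchanged pair exactly once
            partner = addr ^ ((1 << p) | (1 << q))   # flip both bits to get the partner address
            a[addr], a[partner] = a[partner], a[addr]  # pairwise interchange, no second array


a = list(range(32))        # slot i holds label i (natural order)
swap_address_bits(a, 4, 0) # step 1: left-most bit with right-most bit
swap_address_bits(a, 3, 1) # step 2: second bit from each end
print(a[:8])               # [0, 16, 8, 24, 4, 20, 12, 28]
```

After the two calls, slot p holds the label whose 5-bit representation is p reversed. Addresses whose two selected bits are equal are left untouched, which is why each step reduces to independent pairwise interchanges.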

    Inside the FFT Black Box: Serial and Parallel Fast Fourier Transform Algorithms. Part II: Sequential FFT Algorithms. Chapter 6: Performing Bit-Reversal by Repeated Permutation of Intermediate Results.

