49926338 Fast Fourier Transform Algorithm and Application

FAST FOURIER TRANSFORM ALGORITHMS WITH APPLICATIONS

A Dissertation
Presented to
the Graduate School of
Clemson University

In Partial Fulfillment
of the Requirements for the Degree
Doctor of Philosophy
Mathematical Sciences

by
Todd Mateer
August 2008

Accepted by:
Dr. Shuhong Gao, Committee Chair
Dr. Joel Brawley
Dr. Neil Calkin
Dr. Kevin James

ABSTRACT

This manuscript describes a number of algorithms that can be used to quickly evaluate a polynomial over a collection of points and interpolate these evaluations back into a polynomial. Engineers define the Fast Fourier Transform as a method of solving the interpolation problem where the coefficient ring used to construct the polynomials has a special multiplicative structure. Mathematicians define the Fast Fourier Transform as a method of solving the multipoint evaluation problem. One purpose of the document is to provide a mathematical treatment of the topic of the Fast Fourier Transform that can also be understood by someone who has an understanding of the topic from the engineering perspective.

The manuscript will also introduce several new algorithms that efficiently solve the multipoint evaluation problem over certain finite fields and require fewer finite field operations than existing techniques. The document will also demonstrate that these new algorithms can be used to multiply polynomials with finite field coefficients with fewer operations than Schönhage's algorithm in most circumstances.

A third objective of this document is to provide a mathematical perspective of several algorithms which can be used to multiply polynomials whose size is not a power of two. Several improvements to these algorithms will also be discussed.

Finally, the document will describe several applications of the Fast Fourier Transform algorithms presented and will introduce improvements in several of these applications. In addition to polynomial multiplication, the applications of polynomial division with remainder, the greatest common divisor, decoding of Reed-Solomon error-correcting codes, and the computation of the coefficients of a discrete Fourier series will be addressed.

DEDICATION

I dedicate this work to my wife Jennifer and my children Nathan, Laura, Jonathan, and Daniel.
In terms of our family, the progress of my graduate research program has been measured through a collection of fifty quarters produced by the United States Mint over roughly the same ten-year period while I completed my graduate studies. I look forward to placing the final quarter on our "doctor school" map at the end of this year (2008) when I anticipate being finished with several publications related to the research presented in this manuscript.

I have really treasured this time to be at home with my family while my children were young and for their company and support while completing this project. Much of this dissertation was written with little children in my lap or by my side as my wife and I worked together to get through these first few years of parenthood. I consider this time to be more valuable than the degree for which this dissertation was written.

ACKNOWLEDGMENTS

There are many people who deserve recognition for their role in my education which has led me to this point in my academic career. I will attempt to be as complete as possible here, knowing that there are likely several people that I have left out.

Obviously, my advisor Shuhong Gao and committee members deserve mention for being willing to mentor me through this process. This was an especially challenging task given the fact that we were separated geographically during the entire writing of this dissertation. In fact, my advisor and I only saw each other two times in the three years it took to complete the research program. I would also like to thank Shuhong Gao for teaching a computer algebra class in 2001 which got me interested in this topic for my doctoral studies.

I would also like to thank several anonymous reviewers who read over this entire manuscript several times. There are likely more places where the document can be improved and if there is an interest, I will make revisions to the manuscript as these places are pointed out to me.

I would also like to take this opportunity to thank the teachers who taught me the subjects related to this research program.
In particular, I would like to thank Rebecca Nowakowski, James Payne, Dale McIntyre, Yao-Huan Xu, Jenny Key, and Joel Brawley for equipping me with the algebra and complex variables background to take on this assignment. Also, Timothy Mohr, Frank Duda and Robert Mueller taught me the signals analysis which introduced me to the engineering perspective of the FFT. In classes taught by Joe Churm and Robert Jamison, I learned how the material of this dissertation is closely related to basic music theory. Through the instruction of Curt Frank, Jim Kendall, Fred Jenny, Michelle Claus, and James Peterson I learned the computer science and programming skills needed for this research project. Finally, I would like to thank my father Robert Mateer who taught me the trigonometry that is so fundamental to the topic of this dissertation. While I turned out to be an algebraist instead of following in his footsteps and making analysis (i.e. Calculus) my specialty, I still hope to become as good of a teacher as my father someday.

...but there is a God in heaven who reveals mysteries.
Daniel 2:28a

TABLE OF CONTENTS

TITLE PAGE
ABSTRACT
DEDICATION
ACKNOWLEDGMENTS
LIST OF TABLES
LIST OF FIGURES

CHAPTER

1. INTRODUCTION
    1.1 The mathematician's perspective
    1.2 Prerequisite mathematics
    1.3 Operation counts of the algorithms
    1.4 Multipoint polynomial evaluation
    1.5 Fast multipoint evaluation
    1.6 Lagrangian interpolation
    1.7 Fast interpolation
    1.8 Concluding remarks
2. MULTIPLICATIVE FAST FOURIER TRANSFORM ALGORITHMS
    2.1 The bit reversal function
    2.2 Classical radix-2 FFT
    2.3 Twisted radix-2 FFT
    2.4 Hybrid radix-2 FFTs
    2.5 Classical radix-4 FFT
    2.6 Twisted radix-4 FFT
    2.7 Radix-8 FFT
    2.8 Split-radix FFT
    2.9 Modified split-radix FFT
    2.10 The ternary reversal function
    2.11 Classical radix-3 FFT
    2.12 Twisted radix-3 FFT
    2.13 Hybrid radix-3 FFTs
    2.14 Radix-3 FFT for symbolic roots of unity
    2.15 Concluding remarks
3. ADDITIVE FAST FOURIER TRANSFORM ALGORITHMS
    3.1 Von zur Gathen-Gerhard additive FFT
    3.2 Wang-Zhu-Cantor additive FFT
    3.3 Shifted additive FFT
    3.4 Gao's additive FFT
    3.5 A new additive FFT
    3.6 Concluding remarks
4. INVERSE FAST FOURIER TRANSFORM ALGORITHMS
    4.1 Classical radix-2 IFFT
    4.2 Twisted radix-2 IFFT
    4.3 Other multiplicative IFFTs
    4.4 Wang-Zhu-Cantor additive IFFT
    4.5 A new additive IFFT
    4.6 Concluding remarks
5. POLYNOMIAL MULTIPLICATION ALGORITHMS
    5.1 Karatsuba multiplication
    5.2 Karatsuba's algorithm for other sizes
    5.3 FFT-based multiplication
    5.4 Schönhage's algorithm
    5.5 FFT-based multiplication using the new additive FFT algorithm
    5.6 Comparison of the multiplication algorithms
    5.7 Concluding remarks
6. TRUNCATED FAST FOURIER TRANSFORM ALGORITHMS
    6.1 A truncated FFT algorithm
    6.2 An inverse truncated FFT algorithm
    6.3 Illustration of truncated FFT algorithms
    6.4 Truncated algorithms based on roots of $x^N - 1$
    6.5 Truncated algorithms based on roots of $x^N - x$
    6.6 Concluding remarks
7. POLYNOMIAL DIVISION WITH REMAINDER
    7.1 Classical division
    7.2 Newton division
    7.3 Newton division using the multiplicative FFT
    7.4 Newton division for finite fields of characteristic 2
    7.5 Concluding remarks
8. THE EUCLIDEAN ALGORITHM
    8.1 The Euclidean Algorithm
    8.2 The Extended Euclidean Algorithm
    8.3 Normalized Extended Euclidean Algorithm
    8.4 The Fast Euclidean Algorithm
    8.5 Algorithm improvements due to the Fast Fourier Transform
    8.6 Concluding remarks
9. REED-SOLOMON ERROR-CORRECTING CODES
    9.1 Systematic encoding of Reed-Solomon codewords
    9.2 A transform of the Reed-Solomon codeword
    9.3 Decoding of systematic Reed-Solomon codewords
    9.4 Pseudocode and operation count of the simple decoding algorithm
    9.5 Concluding remarks
10. FURTHER APPLICATIONS AND CONCLUDING REMARKS
    10.1 Computing the coefficients of a discrete Fourier series
    10.2 Fast multipoint evaluation: revisited
    10.3 Fast interpolation: revisited
    10.4 Other research areas involving FFT algorithms
    10.5 Concluding remarks

APPENDICES

A. Master equations for algorithm operation counts
B. Operation count: split-radix FFT
C. Additional details of the modified split-radix FFT
D. Complex conjugate properties
E. Proof of the existence of the Cantor basis
F. Taylor shift of a polynomial with finite field coefficients
G. Taylor expansion of a polynomial with finite field coefficients at x
H. Additional recurrence relation solutions for additive FFT algorithms
I. Operation count: Karatsuba's multiplication algorithm
J. Operation count: Schönhage's algorithm
K. Karatsuba's algorithm in FFT-based multiplication using the new additive FFT
L. Reischert's multiplication method
M. Two positions on future polynomial multiplication algorithm performance
N. Complexity of truncated algorithms
O. Alternative derivation of Newton's Method

BIBLIOGRAPHY

LIST OF TABLES

5.1 Addition cost comparison between Schönhage's algorithm and FFT-based multiplication using the new additive FFT algorithm
L.1 Operation counts of pointwise products involved in Reischert's multiplication method

LIST OF FIGURES

1.1 Pseudocode for fast multipoint evaluation (iterative implementation)
1.2 Pseudocode for fast multipoint evaluation (recursive implementation)
1.3 Pseudocode for fast multipoint evaluation (8 points)
1.4 Pseudocode for fast interpolation (recursive implementation)
2.1 Pseudocode for classical radix-2 FFT
2.2 Pseudocode for twisted radix-2 FFT
2.3 Pseudocode for classical radix-4 FFT
2.4 Pseudocode for split-radix FFT (conjugate-pair version)
2.5 Pseudocode for new classical radix-3 FFT
2.6 Pseudocode for improved twisted radix-3 FFT
3.1 Pseudocode for von zur Gathen-Gerhard additive FFT
3.2 Pseudocode for Wang-Zhu-Cantor additive FFT
3.3 Pseudocode for shifted additive FFT
3.4 Pseudocode for Gao's additive FFT
3.5 Pseudocode for new additive FFT
4.1 Pseudocode for classical radix-2 IFFT
4.2 Pseudocode for twisted radix-2 IFFT
4.3 Pseudocode for split-radix IFFT (conjugate-pair version)
4.4 Pseudocode for new twisted radix-3 IFFT
4.5 Pseudocode for Wang-Zhu-Cantor additive IFFT
4.6 Pseudocode for the new additive IFFT
5.1 Pseudocode for Karatsuba multiplication
5.2 Pseudocode for FFT-based multiplication
5.3 Pseudocode for Schönhage's multiplication
6.1 Pseudocode for truncated FFT
6.2 Pseudocode for inverse truncated FFT
6.3 Illustration of truncated FFT algorithms
7.1 Pseudocode for classical division
7.2 Pseudocode for Newton division algorithm
7.3 Pseudocode for improved Newton division
8.1 Pseudocode for Euclidean Algorithm
8.2 Pseudocode for Extended Euclidean Algorithm
8.3 Pseudocode for Fast Euclidean Algorithm
9.1 Pseudocode for simple Reed-Solomon decoding algorithm
C.1 Pseudocode for modified split-radix FFT (version A reduction step)
C.2 Pseudocode for modified split-radix FFT (version B reduction step)
C.3 Pseudocode for modified split-radix FFT (version C reduction step)
C.4 Pseudocode for modified split-radix FFT (version D reduction step)
F.1 Pseudocode for Taylor expansion of a polynomial at
G.1 Pseudocode for Taylor expansion at x
L.1 Pseudocode for Reischert multiplication

CHAPTER 1
INTRODUCTION

Around 1805, Carl Friedrich Gauss invented a revolutionary technique for efficiently computing the coefficients of what is now called¹ a discrete Fourier series. Unfortunately, Gauss never published his work and it was lost for over one hundred years. During the rest of the nineteenth century, variations of the technique were independently discovered several more times, but never appreciated. In the early twentieth century, Carl Runge derived an algorithm similar to that of Gauss that could compute the coefficients on an input with size equal to a power of two and was later generalized to powers of three. According to Pierre Duhamel and M. Hollmann [22], this technique was widely known and used in the 1940s. However, after World War II, Runge's work appeared to have been forgotten for an unknown reason. Then in 1965, J. W. Cooley and J. W. Tukey published a short five-page paper [16] based on some other works of the early twentieth century which again introduced the technique which is now known as the Fast Fourier Transform. This time, however, the technique could be implemented on a new invention called a computer and could compute the coefficients of a discrete Fourier series faster than many ever thought possible. Since the publication of the Cooley-Tukey paper, engineers have found many applications for the algorithm. Over 2,000 additional papers have been published on the topic [39], and the Fast Fourier Transform (FFT) has become one of the most important techniques in the field of Electrical Engineering. The revolution had finally started.

¹ If the year of this discovery is accurate as claimed in [40], then Gauss discovered the Fourier series even before Fourier introduced the concept in his rejected 1807 work which was later published in 1822 as a book [27]. Nevertheless, Fourier's work was better publicized than that of Gauss among the scientific community and the term "Fourier series" has been widely adopted to describe this mathematical construction.

In [25], Charles Fiduccia showed for the first time that the FFT can be computed in terms of algebraic modular reductions. As with the early FFT publications, this idea has been generally ignored.
However, Daniel Bernstein recently wrote several unpublished works ([2], [3], [4]) which expand upon the observations of Fiduccia and show the algebraic transformations involved in this approach to computing the FFT.

The main purpose of this document is to provide a mathematical treatment of FFT algorithms, extending the work of Fiduccia and Bernstein. Many of the algorithms contained in this manuscript have appeared in the literature before, but not from this algebraic perspective. While it is unlikely that the completion of this thesis will trigger a revolution similar to that which followed the publication of the Cooley and Tukey paper, it is hoped that this document will help to popularize this mathematical perspective of FFT algorithms.

Another purpose of the document is to introduce a new algorithm originally invented by Shuhong Gao [30] that quickly evaluates a polynomial over special collections of finite field elements. Although it turned out that this algorithm is less efficient than existing techniques for all practical sizes, the careful study of the Cooley-Tukey algorithms through this research effort resulted in a new version of the algorithm that is superior to existing techniques for all practical sizes. The new version of this algorithm will also be introduced as part of this manuscript. We will then show how the new algorithm can be used to multiply polynomials with coefficients over a finite field more efficiently than Schönhage's algorithm, the most efficient polynomial multiplication algorithm for finite fields currently known.

Most FFT algorithms only work when the input size is the power of a small prime. This document will also introduce new algorithms that work for an arbitrary input size. We will then explore several applications of the FFT that can be improved using the new algorithms including polynomial division, the computation of the greatest common divisor, and decoding Reed-Solomon codes.

Another motivation for writing this document is to provide a treatment of the FFT that takes the perspective of both mathematicians and engineers into account so that these two communities may better communicate with each other. The engineering perspective of the FFT has been briefly introduced in these opening remarks. We will now consider the mathematician's perspective of the FFT.

1.1 The mathematician's perspective

It has already been mentioned that engineers originally defined the Fast Fourier Transform as a technique which efficiently computes the coefficients of a discrete Fourier series. As introduced by Fiduccia [25], mathematicians developed an alternative definition of the Fast Fourier Transform (FFT) as a method of efficiently evaluating a polynomial at the powers of a primitive root of unity. Unfortunately, this interpretation is completely the opposite of that of the engineer, who views the inverse of the Fast Fourier Transform as a solution to this multipoint evaluation problem using the discrete Fourier series. Similarly, the mathematician defines the inverse FFT as a method of interpolating a set of these evaluations back into a polynomial. We will see in Chapter 10 that this interpolation is the goal of what the engineers call the FFT. One of the challenges of studying the FFT literature is reading papers written by authors who view the FFT problem from a different perspective. Further distorting the engineer's original meaning of the phrase "Fast Fourier Transform", the "additive FFT" has been defined [29] as an algorithm which exploits the additive vector space construction of finite fields to efficiently evaluate a polynomial at a special collection of these finite field elements. This technique has no relation to the discrete Fourier series at all.

In this manuscript, the FFT will be presented from the mathematician's point of view. In other words, we will define the FFT as a technique which efficiently evaluates a polynomial over a special collection of points and the inverse FFT will be defined as a technique which efficiently interpolates a collection of evaluations of some polynomial at a special set of points back into this polynomial. Two types of FFT algorithms will be considered: (1) the "multiplicative FFT" which works with the powers of a primitive root of unity; and (2) the "additive FFT" which works over a special collection of finite field elements.
Again, in Chapter 10, we will show how some of the algorithms developed in this document can be used to solve the engineering applications for which they were originally designed and put some of the "Fourier" back into the Fast Fourier Transform.

At this point, it might be appropriate to point out an additional difference of opinion between mathematicians and engineers relevant to the algorithms in this document. The complex numbers are the collection of elements of the form $A + \sqrt{-1} \cdot B$ where $A$ and $B$ are real numbers. Mathematicians typically use $i$ to represent $\sqrt{-1}$ while the engineers typically use the symbol $j$. In this document, the symbol $I$ will be used to represent $\sqrt{-1}$, following the convention used in several popular computer algebra packages.

1.2 Prerequisite mathematics

Much of the material in this document can be understood by a reader who has completed a curriculum in undergraduate mathematics. Specifically, one should have completed a course in Discrete Mathematics based on material similar to [66], a course in Linear Algebra based on material similar to [51], and an introductory course in Modern Algebra based on material similar to [60]. In particular, one should have a basic understanding of binary numbers, trees, recursion, recurrence relations, solving linear systems of equations, inverses, roots of unity, groups, rings, fields, and vector spaces.

Additionally, some background in the algebraic structures used throughout this document is highly recommended. To understand the multiplicative FFTs, one needs to know the basic properties of complex numbers (see chapter 1 in [67]). To understand the more advanced material in the document, one should have an understanding of polynomial rings and finite fields, also known as Galois fields. One can study Chapter 2 of [52] or Chapters 1-6 of [80] to learn this material.

1.3 Operation counts of the algorithms

Mathematically modeling the effort needed to implement any type of algorithm is a difficult topic that has changed several times over the years. Originally, one would only count the number of multiplication operations required by the computer and express the result as a function of the problem size represented by $n$.
The "big-O" notation was later invented to simplify these expressions when comparing different algorithms. A function $f(x)$ is said to be $O(g(x))$ if there exist constants $C$ and $k$ such that

    $|f(x)| \le C \cdot |g(x)|$    (1.1)

whenever $x > k$. Unfortunately, this notation was misused over the years and was later replaced with the "big-$\Theta$" notation by Don Knuth [48].² A function $f(x)$ is said to be $\Theta(g(x))$ if there exist constants $C_1$, $C_2$, and $k$ such that

    $C_1 \cdot |g(x)| \le |f(x)| \le C_2 \cdot |g(x)|$    (1.2)

whenever $x > k$. The "big-$\Theta$" notation provides a better measure of the effort needed to complete a particular algorithm. In this manuscript, we will give a precise operation count for every FFT algorithm discussed in Chapters 2-4 and will also express an operation count using the "big-$\Theta$" notation when appropriate. For the applications discussed in Chapters 5-10, it will often be difficult to obtain a precise operation count or even a lower bound of the number of operations required. In these cases, the "big-O" notation will be used instead and the author will attempt to present as tight of an upper bound as possible.

² To demonstrate the misuse of the "big-O" notation, one can show that any algorithm in this paper is $O(x^{7399})$. This gives us no information whatsoever that may help us estimate how much effort a particular algorithm costs or that may help us compare two algorithms.

In the early days of FFT analysis, only the number of multiplications required was considered significant and the number of additions needed was ignored. This led to an algorithm by Winograd [83] which provided a constructive lower bound on the number of multiplications needed to compute the FFT of size $2^k$. Problems arose, however, when people attempted to implement the algorithm and only found it practical for computing FFTs of size up to $2^8 = 64$, much smaller than practical FFT sizes. It turned out that in order to achieve a modest reduction in the number of multiplications, a tradeoff of many more additions was required. As a result, Winograd's algorithm is only of theoretical interest and the number of additions is now computed as well as the number of multiplications. Even more advanced models of FFT algorithms also include the contribution from memory accesses and copies. However, these models are often dependent on the architecture of a computer and will not be used for the cost analyses presented in this document.

Instead, we will typically count the number of multiplications and the number of additions needed to implement a particular algorithm. Sometimes, the counts will be given in terms of the algebraic structure used in the algorithm and sometimes the counts will be given in terms of a component of the structure.³ These results will usually not be combined, but occasionally an analysis will be presented that relates the two operations. In cases where a conservative operation count of the algorithm is desired, the number of copies will also be counted when data needs to be shuffled around. When this applies, a copy will be modeled as equivalent to the cost of an addition.

³ For example, in the case of complex numbers ($\mathbb{C}$), we can either count the number of additions and multiplications in $\mathbb{C}$, or we can count the number of additions and multiplications in $\mathbb{R}$, the real numbers. We will see that there are several different strategies for computing complex number arithmetic in terms of the real numbers.

1.4 Multipoint polynomial evaluation

From the mathematician's perspective, the FFT is a special case of the multipoint evaluation problem. In the next few sections, we will explore algorithms for solving this more general problem.

Let $f(x)$ be a polynomial of degree less than $n$ with coefficients in some ring $R$. We can express $f(x)$ as

    $f(x) = f_{n-1} \cdot x^{n-1} + f_{n-2} \cdot x^{n-2} + \cdots + f_1 \cdot x + f_0$,    (1.3)

where $f_0, f_1, \ldots, f_{n-2}, f_{n-1} \in R$. We wish to evaluate $f$ at some set of points $S = \{\varepsilon_0, \varepsilon_1, \ldots, \varepsilon_{n-1}\} \subseteq R$.

Let $\varepsilon_j$ be one of the points in $S$. Then

    $f(\varepsilon_j) = f_{n-1} \cdot \varepsilon_j^{n-1} + f_{n-2} \cdot \varepsilon_j^{n-2} + \cdots + f_1 \cdot \varepsilon_j + f_0$.    (1.4)

If (1.4) is used to determine $f(\varepsilon_j)$ without seeking to minimize the computations, it would require $\frac{1}{2} n^2 - \frac{1}{2} n$ multiplications and $n - 1$ additions. The evaluation of $f$ at each point $\varepsilon_j$ in $S$ would require $\Theta(n^3)$ multiplications and $\Theta(n^2)$ additions.

In a typical high school algebra course (e.g. [5]), the technique of synthetic division is introduced.
This method is based on the so-called Remainder Theorem which states that $f(\varepsilon)$ is equal to the remainder when $f(x)$ is divided by the polynomial $x - \varepsilon$. Synthetic division is equivalent to a technique called Horner's method which involves rewriting (1.4) as

    $f(\varepsilon_j) = ((\cdots((f_{n-1} \cdot \varepsilon_j) + f_{n-2}) \cdot \varepsilon_j + \cdots) + f_1) \cdot \varepsilon_j + f_0$.    (1.5)

Synthetic division and Horner's method require $n - 1$ multiplications and $n - 1$ additions to evaluate $f(\varepsilon_j)$. The evaluation of $f$ at each point $\varepsilon_j$ in $S$ now only requires $n^2 - n$ multiplications and $n^2 - n$ additions. These techniques are said to be $\Theta(n^2)$.

1.5 Fast multipoint evaluation

To introduce a number of concepts relevant to the study of FFT algorithms, we will now consider an algorithm based on material found in [25] and [57] which efficiently solves the multipoint evaluation problem. The presentation of the algorithm in this section is based on [34] and assumes that $n$ is of the form $2^k$. However, the algorithm can easily be adapted to let $n$ be of the form $p^k$ if desired.

The algorithm works by performing a series of "reduction steps" which are designed to efficiently compute the polynomial evaluations. Mathematicians view the reduction steps as transformations between various quotient rings, but use representative elements to compute in these algebraic structures. In this document, "$f(x) \bmod M(x)$" will be interpreted as the remainder which results when $f(x)$ is divided by some other polynomial $M(x)$ called the "modulus polynomial." Each reduction step receives some "residue polynomial" $f^{\circ} = f \bmod M_A$ as input and produces as output the residue polynomials $f^{\circ} \bmod M_B = f \bmod M_B$ and $f^{\circ} \bmod M_C = f \bmod M_C$ where $M_A = M_B \cdot M_C$.

The reduction steps can be organized into a binary tree with $k + 1$ levels. On level $i$, there will be $2^{k-i}$ nodes for each $i$ in $0 \le i \le k$. These nodes will be labeled using the notation $(i, j)$ where $i$ denotes the level of the node on the binary tree and $j$ is used to label the nodes on each level from left to right. Here, $0 \le j$ ... $0$ and $j$ in the range $0 \le j$ ...
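
The reduction steps of Section 1.5 can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the pseudocode of the manuscript: the helper names, the lowest-degree-first coefficient lists, and the use of integer coefficients are all choices made for this example. Each call reduces its input modulo $M_B$ and $M_C$, the products of $x - \varepsilon$ over the left and right halves of the current points, and recurses until each modulus is a single $x - \varepsilon_j$. The moduli are rebuilt at every level with schoolbook arithmetic, so the sketch mirrors the structure of the binary tree but does not attain the operation counts of a fast implementation.

```python
def poly_mul(a, b):
    """Schoolbook product of two polynomials (lowest degree first)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def poly_mod(f, m):
    """Remainder of f divided by the monic polynomial m (lowest degree first)."""
    r = list(f)
    d = len(m) - 1  # degree of the modulus polynomial
    for top in range(len(r) - 1, d - 1, -1):
        q = r[top]
        if q:
            for t in range(d + 1):
                r[top - d + t] -= q * m[t]
    return r[:d]


def modulus(points):
    """Product of (x - e) over the given points; always monic."""
    m = [1]
    for e in points:
        m = poly_mul(m, [-e, 1])
    return m


def fast_evaluate(f, points):
    """Evaluate f at every point; len(points) is assumed to be a power of two."""
    if len(points) == 1:
        # Leaf of the tree: f mod (x - e) is the constant f(e).
        return [poly_mod(f, [-points[0], 1])[0]]
    half = len(points) // 2
    left, right = points[:half], points[half:]
    f_b = poly_mod(f, modulus(left))   # f mod M_B
    f_c = poly_mod(f, modulus(right))  # f mod M_C
    return fast_evaluate(f_b, left) + fast_evaluate(f_c, right)


if __name__ == "__main__":
    f = [5, 0, -3, 1]  # f(x) = x^3 - 3x^2 + 5
    print(fast_evaluate(f, [0, 1, 2, 3]))
```

For the sample polynomial, the first reduction step produces $5 - 2x$ modulo $x(x - 1)$ and $4x - 7$ modulo $(x - 2)(x - 3)$, and the leaves then yield the four evaluations.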
