| Chris@10 | 1 <html lang="en"> | 
| Chris@10 | 2 <head> | 
| Chris@10 | 3 <title>Introduction - FFTW 3.3.3</title> | 
| Chris@10 | 4 <meta http-equiv="Content-Type" content="text/html"> | 
| Chris@10 | 5 <meta name="description" content="FFTW 3.3.3"> | 
| Chris@10 | 6 <meta name="generator" content="makeinfo 4.13"> | 
| Chris@10 | 7 <link title="Top" rel="start" href="index.html#Top"> | 
| Chris@10 | 8 <link rel="prev" href="index.html#Top" title="Top"> | 
| Chris@10 | 9 <link rel="next" href="Tutorial.html#Tutorial" title="Tutorial"> | 
| Chris@10 | 10 <link href="http://www.gnu.org/software/texinfo/" rel="generator-home" title="Texinfo Homepage"> | 
| Chris@10 | 11 <!-- | 
| Chris@10 | 12 This manual is for FFTW | 
| Chris@10 | 13 (version 3.3.3, 25 November 2012). | 
| Chris@10 | 14 | 
| Chris@10 | 15 Copyright (C) 2003 Matteo Frigo. | 
| Chris@10 | 16 | 
| Chris@10 | 17 Copyright (C) 2003 Massachusetts Institute of Technology. | 
| Chris@10 | 18 | 
| Chris@10 | 19      Permission is granted to make and distribute verbatim copies of | 
| Chris@10 | 20      this manual provided the copyright notice and this permission | 
| Chris@10 | 21      notice are preserved on all copies. | 
| Chris@10 | 22 | 
| Chris@10 | 23      Permission is granted to copy and distribute modified versions of | 
| Chris@10 | 24      this manual under the conditions for verbatim copying, provided | 
| Chris@10 | 25      that the entire resulting derived work is distributed under the | 
| Chris@10 | 26      terms of a permission notice identical to this one. | 
| Chris@10 | 27 | 
| Chris@10 | 28      Permission is granted to copy and distribute translations of this | 
| Chris@10 | 29      manual into another language, under the above conditions for | 
| Chris@10 | 30      modified versions, except that this permission notice may be | 
| Chris@10 | 31      stated in a translation approved by the Free Software Foundation. | 
| Chris@10 | 32    --> | 
| Chris@10 | 33 <meta http-equiv="Content-Style-Type" content="text/css"> | 
| Chris@10 | 34 <style type="text/css"><!-- | 
| Chris@10 | 35   pre.display { font-family:inherit } | 
| Chris@10 | 36   pre.format  { font-family:inherit } | 
| Chris@10 | 37   pre.smalldisplay { font-family:inherit; font-size:smaller } | 
| Chris@10 | 38   pre.smallformat  { font-family:inherit; font-size:smaller } | 
| Chris@10 | 39   pre.smallexample { font-size:smaller } | 
| Chris@10 | 40   pre.smalllisp    { font-size:smaller } | 
| Chris@10 | 41   span.sc    { font-variant:small-caps } | 
| Chris@10 | 42   span.roman { font-family:serif; font-weight:normal; } | 
| Chris@10 | 43   span.sansserif { font-family:sans-serif; font-weight:normal; } | 
| Chris@10 | 44 --></style> | 
| Chris@10 | 45 </head> | 
| Chris@10 | 46 <body> | 
| Chris@10 | 47 <div class="node"> | 
| Chris@10 | 48 <a name="Introduction"></a> | 
| Chris@10 | 49 <p> | 
| Chris@10 | 50 Next: <a rel="next" accesskey="n" href="Tutorial.html#Tutorial">Tutorial</a>, | 
| Chris@10 | 51 Previous: <a rel="previous" accesskey="p" href="index.html#Top">Top</a>, | 
| Chris@10 | 52 Up: <a rel="up" accesskey="u" href="index.html#Top">Top</a> | 
| Chris@10 | 53 <hr> | 
| Chris@10 | 54 </div> | 
| Chris@10 | 55 | 
| Chris@10 | 56 <h2 class="chapter">1 Introduction</h2> | 
| Chris@10 | 57 | 
| Chris@10 | 58 <p>This manual documents version 3.3.3 of FFTW, the | 
| Chris@10 | 59 <em>Fastest Fourier Transform in the West</em>.  FFTW is a comprehensive | 
| Chris@10 | 60 collection of fast C routines for computing the discrete Fourier | 
| Chris@10 | 61 transform (DFT) and various special cases thereof. | 
| Chris@10 | 62 <a name="index-discrete-Fourier-transform-1"></a><a name="index-DFT-2"></a> | 
| Chris@10 | 63      <ul> | 
| Chris@10 | 64 <li>FFTW computes the DFT of complex data, real data, even- | 
| Chris@10 | 65   or odd-symmetric real data (these symmetric transforms are usually | 
| Chris@10 | 66   known as the discrete cosine or sine transform, respectively), and the | 
| Chris@10 | 67   discrete Hartley transform (DHT) of real data. | 
| Chris@10 | 68 | 
| Chris@10 | 69      <li>The input data can have arbitrary length. | 
| Chris@10 | 70        FFTW employs <i>O</i>(<i>n</i> log <i>n</i>) algorithms for all lengths, including | 
| Chris@10 | 71        prime numbers. | 
| Chris@10 | 72 | 
| Chris@10 | 73      <li>FFTW supports arbitrary multi-dimensional data. | 
| Chris@10 | 74 | 
| Chris@10 | 75      <li>FFTW supports the SSE, SSE2, AVX, Altivec, and MIPS PS instruction | 
| Chris@10 | 76        sets. | 
| Chris@10 | 77 | 
| Chris@10 | 78      <li>FFTW includes parallel (multi-threaded) transforms | 
| Chris@10 | 79        for shared-memory systems. | 
| Chris@10 | 80 <li>Starting with version 3.3, FFTW includes distributed-memory parallel | 
| Chris@10 | 81        transforms using MPI. | 
| Chris@10 | 82 </ul> | 
| Chris@10 | 83 | 
| Chris@10 | 84    <p>We assume herein that you are familiar with the properties and uses of | 
| Chris@10 | 85 the DFT that are relevant to your application.  Otherwise, see | 
| Chris@10 | 86 e.g. <cite>The Fast Fourier Transform and Its Applications</cite> by E. O. Brigham | 
| Chris@10 | 87 (Prentice-Hall, Englewood Cliffs, NJ, 1988). | 
| Chris@10 | 88 <a href="http://www.fftw.org">Our web page</a> also has links to FFT-related | 
| Chris@10 | 89 information online. | 
| Chris@10 | 90 <a name="index-FFTW-3"></a> | 
| Chris@10 | 91 <!-- TODO: revise.  We don't need to brag any longer --> | 
| Chris@10 | 92 <!-- FFTW is usually faster (and sometimes much faster) than all other --> | 
| Chris@10 | 93 <!-- freely-available Fourier transform programs found on the Net.  It is --> | 
| Chris@10 | 94 <!-- competitive with (and often faster than) the FFT codes in Sun's --> | 
| Chris@10 | 95 <!-- Performance Library, IBM's ESSL library, HP's CXML library, and --> | 
| Chris@10 | 96 <!-- Intel's MKL library, which are targeted at specific machines. --> | 
| Chris@10 | 97 <!-- Moreover, FFTW's performance is @emph{portable}.  Indeed, FFTW is --> | 
| Chris@10 | 98 <!-- unique in that it automatically adapts itself to your machine, your --> | 
| Chris@10 | 99 <!-- cache, the size of your memory, your number of registers, and all the --> | 
| Chris@10 | 100 <!-- other factors that normally make it impossible to optimize a program --> | 
| Chris@10 | 101 <!-- for more than one machine.  An extensive comparison of FFTW's --> | 
| Chris@10 | 102 <!-- performance with that of other Fourier transform codes has been made, --> | 
| Chris@10 | 103 <!-- and the results are available on the Web at --> | 
| Chris@10 | 104 <!-- @uref{http://fftw.org/benchfft, the benchFFT home page}. --> | 
| Chris@10 | 105 <!-- @cindex benchmark --> | 
| Chris@10 | 106 <!-- @fpindex benchfft --> | 
| Chris@10 | 107 | 
| Chris@10 | 108    <p>In order to use FFTW effectively, you need to learn one basic concept | 
| Chris@10 | 109 of FFTW's internal structure: FFTW does not use a fixed algorithm for | 
| Chris@10 | 110 computing the transform, but instead it adapts the DFT algorithm to | 
| Chris@10 | 111 details of the underlying hardware in order to maximize performance. | 
| Chris@10 | 112 Hence, the computation of the transform is split into two phases. | 
| Chris@10 | 113 First, FFTW's <dfn>planner</dfn> “learns” the fastest way to compute the | 
| Chris@10 | 114 transform on your machine.  The planner | 
| Chris@10 | 115 <a name="index-planner-4"></a>produces a data structure called a <dfn>plan</dfn> that contains this | 
| Chris@10 | 116 <a name="index-plan-5"></a>information.  Subsequently, the plan is <dfn>executed</dfn> | 
| Chris@10 | 117 <a name="index-execute-6"></a>to transform the array of input data as dictated by the plan.  The | 
| Chris@10 | 118 plan can be reused as many times as needed.  In typical | 
| Chris@10 | 119 high-performance applications, many transforms of the same size are | 
| Chris@10 | 120 computed and, consequently, a relatively expensive initialization of | 
| Chris@10 | 121 this sort is acceptable.  On the other hand, if you need a single | 
| Chris@10 | 122 transform of a given size, the one-time cost of the planner becomes | 
| Chris@10 | 123 significant.  For this case, FFTW provides fast planners based on | 
| Chris@10 | 124 heuristics or on previously computed plans. | 
| Chris@10 | 125 | 
| Chris@10 | 126    <p>FFTW supports transforms of data with arbitrary length, rank, | 
| Chris@10 | 127 multiplicity, and a general memory layout.  In simple cases, however, | 
| Chris@10 | 128 this generality may be unnecessary and confusing.  Consequently, we | 
| Chris@10 | 129 organized the interface to FFTW into three levels of increasing | 
| Chris@10 | 130 generality. | 
| Chris@10 | 131      <ul> | 
| Chris@10 | 132 <li>The <dfn>basic interface</dfn> computes a single | 
| Chris@10 | 133       transform of contiguous data. | 
| Chris@10 | 134 <li>The <dfn>advanced interface</dfn> computes transforms | 
| Chris@10 | 135       of multiple or strided arrays. | 
| Chris@10 | 136 <li>The <dfn>guru interface</dfn> supports the most general data | 
| Chris@10 | 137       layouts, multiplicities, and strides. | 
| Chris@10 | 138 </ul> | 
| Chris@10 | 139    We expect that most users will be best served by the basic interface, | 
| Chris@10 | 140 whereas the guru interface requires careful attention to the | 
| Chris@10 | 141 documentation to avoid problems. | 
| Chris@10 | 142 <a name="index-basic-interface-7"></a><a name="index-advanced-interface-8"></a><a name="index-guru-interface-9"></a> | 
| Chris@10 | 143 | 
| Chris@10 | 144    <p>Besides the automatic performance adaptation performed by the planner, | 
| Chris@10 | 145 it is also possible for advanced users to customize FFTW manually.  For | 
| Chris@10 | 146 example, if code space is a concern, we provide a tool that links only | 
| Chris@10 | 147 the subset of FFTW needed by your application.  Conversely, you may need | 
| Chris@10 | 148 to extend FFTW because the standard distribution is not sufficient for | 
| Chris@10 | 149 your needs.  For example, the standard FFTW distribution works most | 
| Chris@10 | 150 efficiently for arrays whose size can be factored into small primes | 
| Chris@10 | 151 (2, 3, 5, and 7), and otherwise it uses a | 
| Chris@10 | 152 slower general-purpose routine.  If you need efficient transforms of | 
| Chris@10 | 153 other sizes, you can use FFTW's code generator, which produces fast C | 
| Chris@10 | 154 programs (“codelets”) for any particular array size you may care | 
| Chris@10 | 155 about. | 
| Chris@10 | 156 <a name="index-code-generator-10"></a><a name="index-codelet-11"></a>For example, if you need transforms of size | 
| Chris@10 | 157 513 = 19*3<sup>3</sup>,you can customize FFTW to support the factor 19 efficiently. | 
| Chris@10 | 158 | 
| Chris@10 | 159    <p>For more information regarding FFTW, see the paper, “The Design and | 
| Chris@10 | 160 Implementation of FFTW3,” by M. Frigo and S. G. Johnson, which was an | 
| Chris@10 | 161 invited paper in <cite>Proc. IEEE</cite> <b>93</b> (2), p. 216 (2005).  The | 
| Chris@10 | 162 code generator is described in the paper “A fast Fourier transform | 
| Chris@10 | 163 compiler”, | 
| Chris@10 | 164 <a name="index-compiler-12"></a>by M. Frigo, in the <cite>Proceedings of the 1999 ACM SIGPLAN Conference | 
| Chris@10 | 165 on Programming Language Design and Implementation (PLDI), Atlanta, | 
| Chris@10 | 166 Georgia, May 1999</cite>.  These papers, along with the latest version of | 
| Chris@10 | 167 FFTW, the FAQ, benchmarks, and other links, are available at | 
| Chris@10 | 168 <a href="http://www.fftw.org">the FFTW home page</a>. | 
| Chris@10 | 169 | 
| Chris@10 | 170    <p>The current version of FFTW incorporates many good ideas from the past | 
| Chris@10 | 171 thirty years of FFT literature.  In one way or another, FFTW uses the | 
| Chris@10 | 172 Cooley-Tukey algorithm, the prime factor algorithm, Rader's algorithm | 
| Chris@10 | 173 for prime sizes, and a split-radix algorithm (with a | 
| Chris@10 | 174 “conjugate-pair” variation pointed out to us by Dan Bernstein). | 
| Chris@10 | 175 FFTW's code generator also produces new algorithms that we do not | 
| Chris@10 | 176 completely understand. | 
| Chris@10 | 177 <a name="index-algorithm-13"></a>The reader is referred to the cited papers for the appropriate | 
| Chris@10 | 178 references. | 
| Chris@10 | 179 | 
| Chris@10 | 180    <p>The rest of this manual is organized as follows.  We first discuss the | 
| Chris@10 | 181 sequential (single-processor) implementation.  We start by describing | 
| Chris@10 | 182 the basic interface/features of FFTW in <a href="Tutorial.html#Tutorial">Tutorial</a>. | 
| Chris@10 | 183 Next, <a href="Other-Important-Topics.html#Other-Important-Topics">Other Important Topics</a> discusses data alignment | 
| Chris@10 | 184 (see <a href="SIMD-alignment-and-fftw_005fmalloc.html#SIMD-alignment-and-fftw_005fmalloc">SIMD alignment and fftw_malloc</a>), | 
| Chris@10 | 185 the storage scheme of multi-dimensional arrays | 
| Chris@10 | 186 (see <a href="Multi_002ddimensional-Array-Format.html#Multi_002ddimensional-Array-Format">Multi-dimensional Array Format</a>), and FFTW's mechanism for | 
| Chris@10 | 187 storing plans on disk (see <a href="Words-of-Wisdom_002dSaving-Plans.html#Words-of-Wisdom_002dSaving-Plans">Words of Wisdom-Saving Plans</a>).  Next, | 
| Chris@10 | 188 <a href="FFTW-Reference.html#FFTW-Reference">FFTW Reference</a> provides comprehensive documentation of all | 
| Chris@10 | 189 FFTW's features.  Parallel transforms are discussed in their own | 
| Chris@10 | 190 chapters: <a href="Multi_002dthreaded-FFTW.html#Multi_002dthreaded-FFTW">Multi-threaded FFTW</a> and <a href="Distributed_002dmemory-FFTW-with-MPI.html#Distributed_002dmemory-FFTW-with-MPI">Distributed-memory FFTW with MPI</a>.  Fortran programmers can also use FFTW, as described in | 
| Chris@10 | 191 <a href="Calling-FFTW-from-Legacy-Fortran.html#Calling-FFTW-from-Legacy-Fortran">Calling FFTW from Legacy Fortran</a> and <a href="Calling-FFTW-from-Modern-Fortran.html#Calling-FFTW-from-Modern-Fortran">Calling FFTW from Modern Fortran</a>.  <a href="Installation-and-Customization.html#Installation-and-Customization">Installation and Customization</a> explains how to | 
| Chris@10 | 192 install FFTW in your computer system and how to adapt FFTW to your | 
| Chris@10 | 193 needs.  License and copyright information is given in <a href="License-and-Copyright.html#License-and-Copyright">License and Copyright</a>.  Finally, we thank all the people who helped us in | 
| Chris@10 | 194 <a href="Acknowledgments.html#Acknowledgments">Acknowledgments</a>. | 
| Chris@10 | 195 | 
| Chris@10 | 196    </body></html> | 
| Chris@10 | 197 |