d@0
|
1 <html lang="en">
|
d@0
|
2 <head>
|
d@0
|
3 <title>Introduction - FFTW 3.2.1</title>
|
d@0
|
4 <meta http-equiv="Content-Type" content="text/html">
|
d@0
|
5 <meta name="description" content="FFTW 3.2.1">
|
d@0
|
6 <meta name="generator" content="makeinfo 4.8">
|
d@0
|
7 <link title="Top" rel="start" href="index.html#Top">
|
d@0
|
8 <link rel="prev" href="index.html#Top" title="Top">
|
d@0
|
9 <link rel="next" href="Tutorial.html#Tutorial" title="Tutorial">
|
d@0
|
10 <link href="http://www.gnu.org/software/texinfo/" rel="generator-home" title="Texinfo Homepage">
|
d@0
|
11 <!--
|
d@0
|
12 This manual is for FFTW
|
d@0
|
13 (version 3.2.1, 5 February 2009).
|
d@0
|
14
|
d@0
|
15 Copyright (C) 2003 Matteo Frigo.
|
d@0
|
16
|
d@0
|
17 Copyright (C) 2003 Massachusetts Institute of Technology.
|
d@0
|
18
|
d@0
|
19 Permission is granted to make and distribute verbatim copies of
|
d@0
|
20 this manual provided the copyright notice and this permission
|
d@0
|
21 notice are preserved on all copies.
|
d@0
|
22
|
d@0
|
23 Permission is granted to copy and distribute modified versions of
|
d@0
|
24 this manual under the conditions for verbatim copying, provided
|
d@0
|
25 that the entire resulting derived work is distributed under the
|
d@0
|
26 terms of a permission notice identical to this one.
|
d@0
|
27
|
d@0
|
28 Permission is granted to copy and distribute translations of this
|
d@0
|
29 manual into another language, under the above conditions for
|
d@0
|
30 modified versions, except that this permission notice may be
|
d@0
|
31 stated in a translation approved by the Free Software Foundation.
|
d@0
|
32 -->
|
d@0
|
33 <meta http-equiv="Content-Style-Type" content="text/css">
|
d@0
|
34 <style type="text/css"><!--
|
d@0
|
35 pre.display { font-family:inherit }
|
d@0
|
36 pre.format { font-family:inherit }
|
d@0
|
37 pre.smalldisplay { font-family:inherit; font-size:smaller }
|
d@0
|
38 pre.smallformat { font-family:inherit; font-size:smaller }
|
d@0
|
39 pre.smallexample { font-size:smaller }
|
d@0
|
40 pre.smalllisp { font-size:smaller }
|
d@0
|
41 span.sc { font-variant:small-caps }
|
d@0
|
42 span.roman { font-family:serif; font-weight:normal; }
|
d@0
|
43 span.sansserif { font-family:sans-serif; font-weight:normal; }
|
d@0
|
44 --></style>
|
d@0
|
45 </head>
|
d@0
|
46 <body>
|
d@0
|
47 <div class="node">
|
d@0
|
48 <p>
|
d@0
|
49 <a name="Introduction"></a>
|
d@0
|
50 Next: <a rel="next" accesskey="n" href="Tutorial.html#Tutorial">Tutorial</a>,
|
d@0
|
51 Previous: <a rel="previous" accesskey="p" href="index.html#Top">Top</a>,
|
d@0
|
52 Up: <a rel="up" accesskey="u" href="index.html#Top">Top</a>
|
d@0
|
53 <hr>
|
d@0
|
54 </div>
|
d@0
|
55
|
d@0
|
56 <h2 class="chapter">1 Introduction</h2>
|
d@0
|
57
|
d@0
|
58 <p>This manual documents version 3.2.1 of FFTW, the
|
d@0
|
59 <em>Fastest Fourier Transform in the West</em>. FFTW is a comprehensive
|
d@0
|
60 collection of fast C routines for computing the discrete Fourier
|
d@0
|
61 transform (DFT) and various special cases thereof.
|
d@0
|
62 <a name="index-discrete-Fourier-transform-1"></a><a name="index-DFT-2"></a>
|
d@0
|
63 <ul>
|
d@0
|
64 <li>FFTW computes the DFT of complex data, real data, even-
|
d@0
|
65 or odd-symmetric real data (these symmetric transforms are usually
|
d@0
|
66 known as the discrete cosine or sine transform, respectively), and the
|
d@0
|
67 discrete Hartley transform (DHT) of real data.
|
d@0
|
68
|
d@0
|
69 <li>The input data can have arbitrary length.
|
d@0
|
70 FFTW employs <i>O</i>(<i>n</i> log <i>n</i>) algorithms for all lengths, including
|
d@0
|
71 prime numbers.
|
d@0
|
72
|
d@0
|
73 <li>FFTW supports arbitrary multi-dimensional data.
|
d@0
|
74
|
d@0
|
75 <li>FFTW supports the SSE, SSE2, Altivec, and MIPS PS instruction
|
d@0
|
76 sets.
|
d@0
|
77
|
d@0
|
78 <li>FFTW 3.2.1 includes parallel (multi-threaded) transforms
|
d@0
|
79 for shared-memory systems.
|
d@0
|
80 FFTW 3.2.1 does not include distributed-memory parallel
|
d@0
|
81 transforms, but we plan to implement an MPI version soon. (Meanwhile,
|
d@0
|
82 you can use the MPI implementation from FFTW 2.1.3.)
|
d@0
|
83 </ul>
|
d@0
|
84
|
d@0
|
85 <p>We assume herein that you are familiar with the properties and uses of
|
d@0
|
86 the DFT that are relevant to your application. Otherwise, see
|
d@0
|
87 e.g. <cite>The Fast Fourier Transform and Its Applications</cite> by E. O. Brigham
|
d@0
|
88 (Prentice-Hall, Englewood Cliffs, NJ, 1988).
|
d@0
|
89 <a href="http://www.fftw.org">Our web page</a> also has links to FFT-related
|
d@0
|
90 information online.
|
d@0
|
91 <a name="index-FFTW-3"></a>
|
d@0
|
92 <!-- TODO: revise. We don't need to brag any longer -->
|
d@0
|
93 <!-- FFTW is usually faster (and sometimes much faster) than all other -->
|
d@0
|
94 <!-- freely-available Fourier transform programs found on the Net. It is -->
|
d@0
|
95 <!-- competitive with (and often faster than) the FFT codes in Sun's -->
|
d@0
|
96 <!-- Performance Library, IBM's ESSL library, HP's CXML library, and -->
|
d@0
|
97 <!-- Intel's MKL library, which are targeted at specific machines. -->
|
d@0
|
98 <!-- Moreover, FFTW's performance is @emph{portable}. Indeed, FFTW is -->
|
d@0
|
99 <!-- unique in that it automatically adapts itself to your machine, your -->
|
d@0
|
100 <!-- cache, the size of your memory, your number of registers, and all the -->
|
d@0
|
101 <!-- other factors that normally make it impossible to optimize a program -->
|
d@0
|
102 <!-- for more than one machine. An extensive comparison of FFTW's -->
|
d@0
|
103 <!-- performance with that of other Fourier transform codes has been made, -->
|
d@0
|
104 <!-- and the results are available on the Web at -->
|
d@0
|
105 <!-- @uref{http://fftw.org/~benchfft, the benchFFT home page}. -->
|
d@0
|
106 <!-- @cindex benchmark -->
|
d@0
|
107 <!-- @fpindex benchfft -->
|
d@0
|
108
|
d@0
|
109 <p>In order to use FFTW effectively, you need to learn one basic concept
|
d@0
|
110 of FFTW's internal structure: FFTW does not use a fixed algorithm for
|
d@0
|
111 computing the transform, but instead it adapts the DFT algorithm to
|
d@0
|
112 details of the underlying hardware in order to maximize performance.
|
d@0
|
113 Hence, the computation of the transform is split into two phases.
|
d@0
|
114 First, FFTW's <dfn>planner</dfn> “learns” the fastest way to compute the
|
d@0
|
115 transform on your machine. The planner
|
d@0
|
116 <a name="index-planner-4"></a>produces a data structure called a <dfn>plan</dfn> that contains this
|
d@0
|
117 <a name="index-plan-5"></a>information. Subsequently, the plan is <dfn>executed</dfn>
|
d@0
|
118 <a name="index-execute-6"></a>to transform the array of input data as dictated by the plan. The
|
d@0
|
119 plan can be reused as many times as needed. In typical
|
d@0
|
120 high-performance applications, many transforms of the same size are
|
d@0
|
121 computed and, consequently, a relatively expensive initialization of
|
d@0
|
122 this sort is acceptable. On the other hand, if you need a single
|
d@0
|
123 transform of a given size, the one-time cost of the planner becomes
|
d@0
|
124 significant. For this case, FFTW provides fast planners based on
|
d@0
|
125 heuristics or on previously computed plans.
|
d@0
|
126
|
d@0
|
127 <p>FFTW supports transforms of data with arbitrary length, rank,
|
d@0
|
128 multiplicity, and a general memory layout. In simple cases, however,
|
d@0
|
129 this generality may be unnecessary and confusing. Consequently, we
|
d@0
|
130 organized the interface to FFTW into three levels of increasing
|
d@0
|
131 generality.
|
d@0
|
132 <ul>
|
d@0
|
133 <li>The <dfn>basic interface</dfn> computes a single
|
d@0
|
134 transform of contiguous data.
|
d@0
|
135 <li>The <dfn>advanced interface</dfn> computes transforms
|
d@0
|
136 of multiple or strided arrays.
|
d@0
|
137 <li>The <dfn>guru interface</dfn> supports the most general data
|
d@0
|
138 layouts, multiplicities, and strides.
|
d@0
|
139 </ul>
|
d@0
|
140 We expect that most users will be best served by the basic interface,
|
d@0
|
141 whereas the guru interface requires careful attention to the
|
d@0
|
142 documentation to avoid problems.
|
d@0
|
143 <a name="index-basic-interface-7"></a><a name="index-advanced-interface-8"></a><a name="index-guru-interface-9"></a>
|
d@0
|
144 Besides the automatic performance adaptation performed by the planner,
|
d@0
|
145 it is also possible for advanced users to customize FFTW manually. For
|
d@0
|
146 example, if code space is a concern, we provide a tool that links only
|
d@0
|
147 the subset of FFTW needed by your application. Conversely, you may need
|
d@0
|
148 to extend FFTW because the standard distribution is not sufficient for
|
d@0
|
149 your needs. For example, the standard FFTW distribution works most
|
d@0
|
150 efficiently for arrays whose size can be factored into small primes
|
d@0
|
151 (2, 3, 5, and 7), and otherwise it uses a
|
d@0
|
152 slower general-purpose routine. If you need efficient transforms of
|
d@0
|
153 other sizes, you can use FFTW's code generator, which produces fast C
|
d@0
|
154 programs (“codelets”) for any particular array size you may care
|
d@0
|
155 about.
|
d@0
|
156 <a name="index-code-generator-10"></a><a name="index-codelet-11"></a>For example, if you need transforms of size
|
d@0
|
157 513 = 19*3<sup>3</sup>,you can customize FFTW to support the factor 19 efficiently.
|
d@0
|
158
|
d@0
|
159 <p>For more information regarding FFTW, see the paper, “The Design and
|
d@0
|
160 Implementation of FFTW3,” by M. Frigo and S. G. Johnson, which was an
|
d@0
|
161 invited paper in <cite>Proc. IEEE</cite> <b>93</b> (2), p. 216 (2005). The
|
d@0
|
162 code generator is described in the paper “A fast Fourier transform
|
d@0
|
163 compiler”,
|
d@0
|
164 <a name="index-compiler-12"></a>by M. Frigo, in the <cite>Proceedings of the 1999 ACM SIGPLAN Conference
|
d@0
|
165 on Programming Language Design and Implementation (PLDI), Atlanta,
|
d@0
|
166 Georgia, May 1999</cite>. These papers, along with the latest version of
|
d@0
|
167 FFTW, the FAQ, benchmarks, and other links, are available at
|
d@0
|
168 <a href="http://www.fftw.org">the FFTW home page</a>.
|
d@0
|
169
|
d@0
|
170 <p>The current version of FFTW incorporates many good ideas from the past
|
d@0
|
171 thirty years of FFT literature. In one way or another, FFTW uses the
|
d@0
|
172 Cooley-Tukey algorithm, the prime factor algorithm, Rader's algorithm
|
d@0
|
173 for prime sizes, and a split-radix algorithm (with a variation due to
|
d@0
|
174 Dan Bernstein). FFTW's code generator also produces new algorithms
|
d@0
|
175 that we do not completely understand.
|
d@0
|
176 <a name="index-algorithm-13"></a>The reader is referred to the cited papers for the appropriate
|
d@0
|
177 references.
|
d@0
|
178
|
d@0
|
179 <p>The rest of this manual is organized as follows. We first discuss the
|
d@0
|
180 sequential (single-processor) implementation. We start by describing
|
d@0
|
181 the basic interface/features of FFTW in <a href="Tutorial.html#Tutorial">Tutorial</a>. The following
|
d@0
|
182 chapter discusses <a href="Other-Important-Topics.html#Other-Important-Topics">Other Important Topics</a>, including <a href="Data-Alignment.html#Data-Alignment">Data Alignment</a>, the storage scheme of multi-dimensional arrays
|
d@0
|
183 (see <a href="Multi_002ddimensional-Array-Format.html#Multi_002ddimensional-Array-Format">Multi-dimensional Array Format</a>), and FFTW's mechanism for
|
d@0
|
184 storing plans on disk (see <a href="Words-of-Wisdom_002dSaving-Plans.html#Words-of-Wisdom_002dSaving-Plans">Words of Wisdom-Saving Plans</a>). Next,
|
d@0
|
185 <a href="FFTW-Reference.html#FFTW-Reference">FFTW Reference</a> provides comprehensive documentation of all
|
d@0
|
186 FFTW's features. Parallel transforms are discussed in their own
|
d@0
|
187 chapters: <a href="Multi_002dthreaded-FFTW.html#Multi_002dthreaded-FFTW">Multi-threaded FFTW</a>. Fortran programmers can also use
|
d@0
|
188 FFTW, as described in <a href="Calling-FFTW-from-Fortran.html#Calling-FFTW-from-Fortran">Calling FFTW from Fortran</a>.
|
d@0
|
189 <a href="Installation-and-Customization.html#Installation-and-Customization">Installation and Customization</a> explains how to install FFTW in
|
d@0
|
190 your computer system and how to adapt FFTW to your needs. License and
|
d@0
|
191 copyright information is given in <a href="License-and-Copyright.html#License-and-Copyright">License and Copyright</a>.
|
d@0
|
192 Finally, we thank all the people who helped us in
|
d@0
|
193 <a href="Acknowledgments.html#Acknowledgments">Acknowledgments</a>.
|
d@0
|
194
|
d@0
|
195 <!-- ************************************************************ -->
|
d@0
|
196 </body></html>
|
d@0
|
197
|