annotate src/fftw-3.3.5/doc/FAQ/fftw-faq.html/section3.html @ 83:ae30d91d2ffe

Replace these with versions built using an older toolset (so as to avoid ABI compatibilities when linking on Ubuntu 14.04 for packaging purposes)
author Chris Cannam
date Fri, 07 Feb 2020 11:51:13 +0000
parents 2cd0e3b3e1fd
children
rev   line source
Chris@42 1 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
Chris@42 2 <html>
Chris@42 3 <head><title>
Chris@42 4 FFTW FAQ - Section 3
Chris@42 5 </title>
Chris@42 6 <link rev="made" href="mailto:fftw@fftw.org">
Chris@42 7 <link rel="Contents" href="index.html">
Chris@42 8 <link rel="Start" href="index.html">
Chris@42 9 <link rel="Next" href="section4.html"><link rel="Previous" href="section2.html"><link rel="Bookmark" title="FFTW FAQ" href="index.html">
Chris@42 10 </head><body text="#000000" bgcolor="#FFFFFF"><h1>
Chris@42 11 FFTW FAQ - Section 3 <br>
Chris@42 12 Using FFTW
Chris@42 13 </h1>
Chris@42 14
Chris@42 15 <ul>
Chris@42 16 <li><a href="#fftw2to3" rel=subdocument>Q3.1. Why not support the FFTW 2 interface in FFTW
Chris@42 17 3?</a>
Chris@42 18 <li><a href="#planperarray" rel=subdocument>Q3.2. Why do FFTW 3 plans encapsulate the input/output arrays and not just
Chris@42 19 the algorithm?</a>
Chris@42 20 <li><a href="#slow" rel=subdocument>Q3.3. FFTW seems really slow.</a>
Chris@42 21 <li><a href="#slows" rel=subdocument>Q3.4. FFTW slows down after repeated calls.</a>
Chris@42 22 <li><a href="#segfault" rel=subdocument>Q3.5. An FFTW routine is crashing when I call it.</a>
Chris@42 23 <li><a href="#fortran64" rel=subdocument>Q3.6. My Fortran program crashes when calling FFTW.</a>
Chris@42 24 <li><a href="#conventions" rel=subdocument>Q3.7. FFTW gives results different from my old
Chris@42 25 FFT.</a>
Chris@42 26 <li><a href="#nondeterministic" rel=subdocument>Q3.8. FFTW gives different results between runs</a>
Chris@42 27 <li><a href="#savePlans" rel=subdocument>Q3.9. Can I save FFTW's plans?</a>
Chris@42 28 <li><a href="#whyscaled" rel=subdocument>Q3.10. Why does your inverse transform return a scaled
Chris@42 29 result?</a>
Chris@42 30 <li><a href="#centerorigin" rel=subdocument>Q3.11. How can I make FFTW put the origin (zero frequency) at the center of
Chris@42 31 its output?</a>
Chris@42 32 <li><a href="#imageaudio" rel=subdocument>Q3.12. How do I FFT an image/audio file in <i>foobar</i> format?</a>
Chris@42 33 <li><a href="#linkfails" rel=subdocument>Q3.13. My program does not link (on Unix).</a>
Chris@42 34 <li><a href="#linkheader" rel=subdocument>Q3.14. I included your header, but linking still
Chris@42 35 fails.</a>
Chris@42 36 <li><a href="#nostack" rel=subdocument>Q3.15. My program crashes, complaining about stack
Chris@42 37 space.</a>
Chris@42 38 <li><a href="#leaks" rel=subdocument>Q3.16. FFTW seems to have a memory leak.</a>
Chris@42 39 <li><a href="#allzero" rel=subdocument>Q3.17. The output of FFTW's transform is all zeros.</a>
Chris@42 40 <li><a href="#vbetalia" rel=subdocument>Q3.18. How do I call FFTW from the Microsoft language du
Chris@42 41 jour?</a>
Chris@42 42 <li><a href="#pruned" rel=subdocument>Q3.19. Can I compute only a subset of the DFT outputs?</a>
Chris@42 43 <li><a href="#transpose" rel=subdocument>Q3.20. Can I use FFTW's routines for in-place and out-of-place matrix
Chris@42 44 transposition?</a>
Chris@42 45 </ul><hr>
Chris@42 46
Chris@42 47 <h2><A name="fftw2to3">
Chris@42 48 Question 3.1. Why not support the FFTW 2 interface in FFTW
Chris@42 49 3?
Chris@42 50 </A></h2>
Chris@42 51
Chris@42 52 FFTW 3 has semantics incompatible with earlier versions: its plans can
Chris@42 53 only be used for a given stride, multiplicity, and other
Chris@42 54 characteristics of the input and output arrays; these stronger
Chris@42 55 semantics are necessary for performance reasons. Thus, it is
Chris@42 56 impossible to efficiently emulate the older interface (whose plans can
Chris@42 57 be used for any transform of the same size). We believe that it
Chris@42 58 should be possible to upgrade most programs without any difficulty,
Chris@42 59 however.
Chris@42 60 <h2><A name="planperarray">
Chris@42 61 Question 3.2. Why do FFTW 3 plans encapsulate the input/output arrays
Chris@42 62 and not just the algorithm?
Chris@42 63 </A></h2>
Chris@42 64
Chris@42 65 There are several reasons:
Chris@42 66 <ul>
Chris@42 67 <li>It was important for performance reasons that the plan be specific to
Chris@42 68 array characteristics like the stride (and alignment, for SIMD), and
Chris@42 69 requiring that the user maintain these invariants is error prone.
Chris@42 70
Chris@42 71 <li>In most high-performance applications, as far as we can tell, you are
Chris@42 72 usually transforming the same array over and over, so FFTW's semantics
Chris@42 73 should not be a burden.
Chris@42 74 <li>If you need to transform another array of the same size, creating a
Chris@42 75 new plan once the first exists is a cheap operation.
Chris@42 76
Chris@42 77 <li>If you need to transform many arrays of the same size at once, you
Chris@42 78 should really use the <code>plan_many</code> routines in FFTW's &quot;advanced&quot;
Chris@42 79 interface.
Chris@42 80 <li>If the abovementioned array characteristics are the same, you are
Chris@42 81 willing to pay close attention to the documentation, and you really
Chris@42 82 need to, we provide a &quot;new-array execution&quot; interface to
Chris@42 83 apply a plan to a new array.
Chris@42 84 </ul>
Chris@42 85
Chris@42 86 <h2><A name="slow">
Chris@42 87 Question 3.3. FFTW seems really slow.
Chris@42 88 </A></h2>
Chris@42 89
Chris@42 90 You are probably recreating the plan before every transform, rather
Chris@42 91 than creating it once and reusing it for all transforms of the same
Chris@42 92 size. FFTW is designed to be used in the following way:
Chris@42 93
Chris@42 94 <ul>
Chris@42 95 <li>First, you create a plan. This will take several seconds.
Chris@42 96
Chris@42 97 <li>Then, you reuse the plan many times to perform FFTs. These are fast.
Chris@42 98
Chris@42 99 </ul>
Chris@42 100 If you don't need to compute many transforms and the time for the
Chris@42 101 planner is significant, you have two options. First, you can use the
Chris@42 102 <code>FFTW_ESTIMATE</code> option in the planner, which uses heuristics
Chris@42 103 instead of runtime measurements and produces a good plan in a short
Chris@42 104 time. Second, you can use the wisdom feature to precompute the plan;
Chris@42 105 see <A href="#savePlans">Q3.9 `Can I save FFTW's plans?'</A>
Chris@42 106 <h2><A name="slows">
Chris@42 107 Question 3.4. FFTW slows down after repeated
Chris@42 108 calls.
Chris@42 109 </A></h2>
Chris@42 110
Chris@42 111 Probably, NaNs or similar are creeping into your data, and the
Chris@42 112 slowdown is due to the resulting floating-point exceptions. For
Chris@42 113 example, be aware that repeatedly FFTing the same array is a diverging
Chris@42 114 process (because FFTW computes the unnormalized transform).
Chris@42 115
Chris@42 116 <h2><A name="segfault">
Chris@42 117 Question 3.5. An FFTW routine is crashing when I call
Chris@42 118 it.
Chris@42 119 </A></h2>
Chris@42 120
Chris@42 121 Did the FFTW test programs pass (<code>make check</code>, or <code>cd tests; make bigcheck</code> if you want to be paranoid)? If so, you almost
Chris@42 122 certainly have a bug in your own code. For example, you could be
Chris@42 123 passing invalid arguments (such as wrongly-sized arrays) to FFTW, or
Chris@42 124 you could simply have memory corruption elsewhere in your program that
Chris@42 125 causes random crashes later on. Please don't complain to us unless
Chris@42 126 you can come up with a minimal self-contained program (preferably
Chris@42 127 under 30 lines) that illustrates the problem.
Chris@42 128
Chris@42 129 <h2><A name="fortran64">
Chris@42 130 Question 3.6. My Fortran program crashes when calling
Chris@42 131 FFTW.
Chris@42 132 </A></h2>
Chris@42 133
Chris@42 134 As described in the manual, on 64-bit machines you must store the
Chris@42 135 plans in variables large enough to hold a pointer, for example
Chris@42 136 <code>integer*8</code>. We recommend using <code>integer*8</code> on 32-bit machines as well, to simplify porting.
Chris@42 137
Chris@42 138 <h2><A name="conventions">
Chris@42 139 Question 3.7. FFTW gives results different from my old
Chris@42 140 FFT.
Chris@42 141 </A></h2>
Chris@42 142
Chris@42 143 People follow many different conventions for the DFT, and you should
Chris@42 144 be sure to know the ones that we use (described in the FFTW manual).
Chris@42 145 In particular, you should be aware that the
Chris@42 146 <code>FFTW_FORWARD</code>/<code>FFTW_BACKWARD</code> directions correspond to signs of -1/+1 in the exponent of the DFT definition.
Chris@42 147 (<i>Numerical Recipes</i> uses the opposite convention.)
Chris@42 148 <p>
Chris@42 149 You should also know that we compute an unnormalized transform. In
Chris@42 150 contrast, Matlab is an example of program that computes a normalized
Chris@42 151 transform. See <A href="#whyscaled">Q3.10 `Why does your inverse transform return a scaled
Chris@42 152 result?'</A>.
Chris@42 153 <p>
Chris@42 154 Finally, note that floating-point arithmetic is not exact, so
Chris@42 155 different FFT algorithms will give slightly different results (on the
Chris@42 156 order of the numerical accuracy; typically a fractional difference of
Chris@42 157 1e-15 or so in double precision).
Chris@42 158 <h2><A name="nondeterministic">
Chris@42 159 Question 3.8. FFTW gives different results between
Chris@42 160 runs
Chris@42 161 </A></h2>
Chris@42 162
Chris@42 163 If you use <code>FFTW_MEASURE</code> or <code>FFTW_PATIENT</code> mode, then the algorithm FFTW employs is not deterministic: it depends on
Chris@42 164 runtime performance measurements. This will cause the results to vary
Chris@42 165 slightly from run to run. However, the differences should be slight,
Chris@42 166 on the order of the floating-point precision, and therefore should
Chris@42 167 have no practical impact on most applications.
Chris@42 168
Chris@42 169 <p>
Chris@42 170 If you use saved plans (wisdom) or <code>FFTW_ESTIMATE</code> mode, however, then the algorithm is deterministic and the results should be
Chris@42 171 identical between runs.
Chris@42 172 <h2><A name="savePlans">
Chris@42 173 Question 3.9. Can I save FFTW's plans?
Chris@42 174 </A></h2>
Chris@42 175
Chris@42 176 Yes. Starting with version 1.2, FFTW provides the
Chris@42 177 <code>wisdom</code> mechanism for saving plans; see the FFTW manual.
Chris@42 178
Chris@42 179 <h2><A name="whyscaled">
Chris@42 180 Question 3.10. Why does your inverse transform return a scaled
Chris@42 181 result?
Chris@42 182 </A></h2>
Chris@42 183
Chris@42 184 Computing the forward transform followed by the backward transform (or
Chris@42 185 vice versa) yields the original array scaled by the size of the array.
Chris@42 186 (For multi-dimensional transforms, the size of the array is the
Chris@42 187 product of the dimensions.) We could, instead, have chosen a
Chris@42 188 normalization that would have returned the unscaled array. Or, to
Chris@42 189 accomodate the many conventions in this matter, the transform routines
Chris@42 190 could have accepted a &quot;scale factor&quot; parameter. We did not
Chris@42 191 do this, however, for two reasons. First, we didn't want to sacrifice
Chris@42 192 performance in the common case where the scale factor is 1. Second, in
Chris@42 193 real applications the FFT is followed or preceded by some computation
Chris@42 194 on the data, into which the scale factor can typically be absorbed at
Chris@42 195 little or no cost.
Chris@42 196 <h2><A name="centerorigin">
Chris@42 197 Question 3.11. How can I make FFTW put the origin (zero frequency) at
Chris@42 198 the center of its output?
Chris@42 199 </A></h2>
Chris@42 200
Chris@42 201 For human viewing of a spectrum, it is often convenient to put the
Chris@42 202 origin in frequency space at the center of the output array, rather
Chris@42 203 than in the zero-th element (the default in FFTW). If all of the
Chris@42 204 dimensions of your array are even, you can accomplish this by simply
Chris@42 205 multiplying each element of the input array by (-1)^(i + j + ...),
Chris@42 206 where i, j, etcetera are the indices of the element. (This trick is a
Chris@42 207 general property of the DFT, and is not specific to FFTW.)
Chris@42 208
Chris@42 209 <h2><A name="imageaudio">
Chris@42 210 Question 3.12. How do I FFT an image/audio file in
Chris@42 211 <i>foobar</i> format?
Chris@42 212 </A></h2>
Chris@42 213
Chris@42 214 FFTW performs an FFT on an array of floating-point values. You can
Chris@42 215 certainly use it to compute the transform of an image or audio stream,
Chris@42 216 but you are responsible for figuring out your data format and
Chris@42 217 converting it to the form FFTW requires.
Chris@42 218
Chris@42 219 <h2><A name="linkfails">
Chris@42 220 Question 3.13. My program does not link (on
Chris@42 221 Unix).
Chris@42 222 </A></h2>
Chris@42 223
Chris@42 224 The libraries must be listed in the correct order
Chris@42 225 (<code>-lfftw3 -lm</code> for FFTW 3.x) and <i>after</i> your program sources/objects. (The general rule is that if <i>A</i> uses <i>B</i>, then <i>A</i> must be listed before <i>B</i> in the link command.).
Chris@42 226 <h2><A name="linkheader">
Chris@42 227 Question 3.14. I included your header, but linking still
Chris@42 228 fails.
Chris@42 229 </A></h2>
Chris@42 230
Chris@42 231 You're a C++ programmer, aren't you? You have to compile the FFTW
Chris@42 232 library and link it into your program, not just
Chris@42 233 <code>#include &lt;fftw3.h&gt;</code>. (Yes, this is really a FAQ.)
Chris@42 234 <h2><A name="nostack">
Chris@42 235 Question 3.15. My program crashes, complaining about stack
Chris@42 236 space.
Chris@42 237 </A></h2>
Chris@42 238
Chris@42 239 You cannot declare large arrays with automatic storage (e.g. via
Chris@42 240 <code>fftw_complex array[N]</code>); you should use <code>fftw_malloc</code> (or equivalent) to allocate the arrays you want
Chris@42 241 to transform if they are larger than a few hundred elements.
Chris@42 242
Chris@42 243 <h2><A name="leaks">
Chris@42 244 Question 3.16. FFTW seems to have a memory
Chris@42 245 leak.
Chris@42 246 </A></h2>
Chris@42 247
Chris@42 248 After you create a plan, FFTW caches the information required to
Chris@42 249 quickly recreate the plan. (See <A href="#savePlans">Q3.9 `Can I save FFTW's plans?'</A>) It also maintains a small amount of other persistent memory. You can deallocate all of
Chris@42 250 FFTW's internally allocated memory, if you wish, by calling
Chris@42 251 <code>fftw_cleanup()</code>, as documented in the manual.
Chris@42 252 <h2><A name="allzero">
Chris@42 253 Question 3.17. The output of FFTW's transform is all
Chris@42 254 zeros.
Chris@42 255 </A></h2>
Chris@42 256
Chris@42 257 You should initialize your input array <i>after</i> creating the plan, unless you use <code>FFTW_ESTIMATE</code>: planning with <code>FFTW_MEASURE</code> or <code>FFTW_PATIENT</code> overwrites the input/output arrays, as described in the manual.
Chris@42 258
Chris@42 259 <h2><A name="vbetalia">
Chris@42 260 Question 3.18. How do I call FFTW from the Microsoft language du
Chris@42 261 jour?
Chris@42 262 </A></h2>
Chris@42 263
Chris@42 264 Please <i>do not</i> ask us Windows-specific questions. We do not
Chris@42 265 use Windows. We know nothing about Visual Basic, Visual C++, or .NET.
Chris@42 266 Please find the appropriate Usenet discussion group and ask your
Chris@42 267 question there. See also <A href="section2.html#runOnWindows">Q2.2 `Does FFTW run on Windows?'</A>.
Chris@42 268 <h2><A name="pruned">
Chris@42 269 Question 3.19. Can I compute only a subset of the DFT
Chris@42 270 outputs?
Chris@42 271 </A></h2>
Chris@42 272
Chris@42 273 In general, no, an FFT intrinsically computes all outputs from all
Chris@42 274 inputs. In principle, there is something called a
Chris@42 275 <i>pruned FFT</i> that can do what you want, but to compute K outputs out of N the
Chris@42 276 complexity is in general O(N log K) instead of O(N log N), thus saving
Chris@42 277 only a small additive factor in the log. (The same argument holds if
Chris@42 278 you instead have only K nonzero inputs.)
Chris@42 279
Chris@42 280 <p>
Chris@42 281 There are some specific cases in which you can get the O(N log K)
Chris@42 282 performance benefits easily, however, by combining a few ordinary
Chris@42 283 FFTs. In particular, the case where you want the first K outputs,
Chris@42 284 where K divides N, can be handled by performing N/K transforms of size
Chris@42 285 K and then summing the outputs multiplied by appropriate phase
Chris@42 286 factors. For more details, see <A href="http://www.fftw.org/pruned.html">pruned FFTs with FFTW</A>.
Chris@42 287 <p>
Chris@42 288 There are also some algorithms that compute pruned transforms
Chris@42 289 <i>approximately</i>, but they are beyond the scope of this FAQ.
Chris@42 290
Chris@42 291 <h2><A name="transpose">
Chris@42 292 Question 3.20. Can I use FFTW's routines for in-place and
Chris@42 293 out-of-place matrix transposition?
Chris@42 294 </A></h2>
Chris@42 295
Chris@42 296 You can use the FFTW guru interface to create a rank-0 transform of
Chris@42 297 vector rank 2 where the vector strides are transposed. (A rank-0
Chris@42 298 transform is equivalent to a 1D transform of size 1, which. just
Chris@42 299 copies the input into the output.) Specifying the same location for
Chris@42 300 the input and output makes the transpose in-place.
Chris@42 301
Chris@42 302 <p>
Chris@42 303 For double-valued data stored in row-major format, plan creation looks
Chris@42 304 like this: <pre>
Chris@42 305 fftw_plan plan_transpose(int rows, int cols, double *in, double *out)
Chris@42 306 {
Chris@42 307 const unsigned flags = FFTW_ESTIMATE; /* other flags are possible */
Chris@42 308 fftw_iodim howmany_dims[2];
Chris@42 309
Chris@42 310 howmany_dims[0].n = rows;
Chris@42 311 howmany_dims[0].is = cols;
Chris@42 312 howmany_dims[0].os = 1;
Chris@42 313
Chris@42 314 howmany_dims[1].n = cols;
Chris@42 315 howmany_dims[1].is = 1;
Chris@42 316 howmany_dims[1].os = rows;
Chris@42 317
Chris@42 318 return fftw_plan_guru_r2r(/*rank=*/ 0, /*dims=*/ NULL,
Chris@42 319 /*howmany_rank=*/ 2, howmany_dims,
Chris@42 320 in, out, /*kind=*/ NULL, flags);
Chris@42 321 }
Chris@42 322 </pre>
Chris@42 323 (This entry was written by Rhys Ulerich.)
Chris@42 324 <hr>
Chris@42 325 Next: <a href="section4.html" rel=precedes>Internals of FFTW</a>.<br>
Chris@42 326 Back: <a href="section2.html" rev=precedes>Installing FFTW</a>.<br>
Chris@42 327 <a href="index.html" rev=subdocument>Return to contents</a>.<p>
Chris@42 328 <address>
Chris@42 329 <A href="http://www.fftw.org">Matteo Frigo and Steven G. Johnson</A> / <A href="mailto:fftw@fftw.org">fftw@fftw.org</A>
Chris@42 330 - 30 July 2016
Chris@42 331 </address><br>
Chris@42 332 Extracted from FFTW Frequently Asked Questions with Answers,
Chris@42 333 Copyright &copy; 2016 Matteo Frigo and Massachusetts Institute of Technology.
Chris@42 334 </body></html>