annotate src/fftw-3.3.8/doc/FAQ/fftw-faq.bfnn @ 83:ae30d91d2ffe

Replace these with versions built using an older toolset (so as to avoid ABI compatibilities when linking on Ubuntu 14.04 for packaging purposes)
author Chris Cannam
date Fri, 07 Feb 2020 11:51:13 +0000
parents d0c2a83c1364
children
rev   line source
Chris@82 1 \comment This is the source for the FFTW FAQ list, in
Chris@82 2 \comment the Bizarre Format With No Name. It is turned into Lout
Chris@82 3 \comment input, HTML, plain ASCII and an Info document by a Perl script.
Chris@82 4 \comment
Chris@82 5 \comment The format and scripts come from the Linux FAQ, by
Chris@82 6 \comment Ian Jackson.
Chris@82 7 \set brieftitle FFTW FAQ
Chris@82 8 \set author <A href="http://www.fftw.org">Matteo Frigo and Steven G. Johnson</A> / <A href="mailto:fftw@fftw.org">fftw@fftw.org</A>
Chris@82 9 \set authormail fftw@fftw.org
Chris@82 10 \set title FFTW Frequently Asked Questions with Answers
Chris@82 11 \set copyholder Matteo Frigo and Massachusetts Institute of Technology
Chris@82 12 \call-html startup html.refs2
Chris@82 13 \copyto ASCII
Chris@82 14 FFTW FREQUENTLY ASKED QUESTIONS WITH ANSWERS
Chris@82 15 `date '+%d %h %Y'`
Chris@82 16 Matteo Frigo
Chris@82 17 Steven G. Johnson
Chris@82 18 <fftw@fftw.org>
Chris@82 19
Chris@82 20 \endcopy
Chris@82 21 \copyto INFO
Chris@82 22 INFO-DIR-SECTION Development
Chris@82 23 START-INFO-DIR-ENTRY
Chris@82 24 * FFTW FAQ: (fftw-faq). FFTW Frequently Asked Questions with Answers.
Chris@82 25 END-INFO-DIR-ENTRY
Chris@82 26
Chris@82 27 
Chris@82 28 File: $prefix.info, Node: Top, Next: Question 1.1, Up: (dir)
Chris@82 29
Chris@82 30 FFTW FREQUENTLY ASKED QUESTIONS WITH ANSWERS
Chris@82 31 `date '+%d %h %Y'`
Chris@82 32 Matteo Frigo
Chris@82 33 Steven G. Johnson
Chris@82 34 <fftw@fftw.org>
Chris@82 35
Chris@82 36 \endcopy
Chris@82 37
Chris@82 38 This is the list of Frequently Asked Questions about FFTW, a
Chris@82 39 collection of fast C routines for computing the Discrete Fourier
Chris@82 40 Transform in one or more dimensions.
Chris@82 41
Chris@82 42 \section Index
Chris@82 43
Chris@82 44 \index
Chris@82 45
Chris@82 46 \comment ######################################################################
Chris@82 47
Chris@82 48 \section Introduction and General Information
Chris@82 49
Chris@82 50 \question 26aug:whatisfftw What is FFTW?
Chris@82 51
Chris@82 52 FFTW is a free collection of fast C routines for computing the
Chris@82 53 Discrete Fourier Transform in one or more dimensions. It includes
Chris@82 54 complex, real, symmetric, and parallel transforms, and can handle
Chris@82 55 arbitrary array sizes efficiently. FFTW is typically faster than
Chris@82 56 other publically-available FFT implementations, and is even
Chris@82 57 competitive with vendor-tuned libraries. (See our web page for
Chris@82 58 extensive benchmarks.) To achieve this performance, FFTW uses novel
Chris@82 59 code-generation and runtime self-optimization techniques (along with
Chris@82 60 many other tricks).
Chris@82 61
Chris@82 62 \question 26aug:whereisfftw How do I obtain FFTW?
Chris@82 63
Chris@82 64 FFTW can be found at \docref{the FFTW web page\}. You can also
Chris@82 65 retrieve it from \ftpon ftp.fftw.org in \ftpin /pub/fftw.
Chris@82 66
Chris@82 67 \question 26aug:isfftwfree Is FFTW free software?
Chris@82 68
Chris@82 69 Starting with version 1.3, FFTW is Free Software in the technical
Chris@82 70 sense defined by the Free Software Foundation (see \docref{Categories
Chris@82 71 of Free and Non-Free Software\}), and is distributed under the terms
Chris@82 72 of the GNU General Public License. Previous versions of FFTW were
Chris@82 73 distributed without fee for noncommercial use, but were not
Chris@82 74 technically ``free.''
Chris@82 75
Chris@82 76 Non-free licenses for FFTW are also available that permit different
Chris@82 77 terms of use than the GPL.
Chris@82 78
Chris@82 79 \question 10apr:nonfree What is this about non-free licenses?
Chris@82 80
Chris@82 81 The non-free licenses are for companies that wish to use FFTW in their
Chris@82 82 products but are unwilling to release their software under the GPL
Chris@82 83 (which would require them to release source code and allow free
Chris@82 84 redistribution). Such users can purchase an unlimited-use license
Chris@82 85 from MIT. Contact us for more details.
Chris@82 86
Chris@82 87 We could instead have released FFTW under the LGPL, or even disallowed
Chris@82 88 non-Free usage. Suffice it to say, however, that MIT owns the
Chris@82 89 copyright to FFTW and they only let us GPL it because we convinced
Chris@82 90 them that it would neither affect their licensing revenue nor irritate
Chris@82 91 existing licensees.
Chris@82 92
Chris@82 93 \question 24oct:west In the West? I thought MIT was in the East?
Chris@82 94
Chris@82 95 Not to an Italian. You could say that we're a Spaghetti Western
Chris@82 96 (with apologies to Sergio Leone).
Chris@82 97
Chris@82 98 \comment ######################################################################
Chris@82 99
Chris@82 100 \section Installing FFTW
Chris@82 101
Chris@82 102 \question 26aug:systems Which systems does FFTW run on?
Chris@82 103
Chris@82 104 FFTW is written in ANSI C, and should work on any system with a decent
Chris@82 105 C compiler. (See also \qref runOnWindows, \qref compilerCrashes.)
Chris@82 106 FFTW can also take advantage of certain hardware-specific features,
Chris@82 107 such as cycle counters and SIMD instructions, but this is optional.
Chris@82 108
Chris@82 109 \question 26aug:runOnWindows Does FFTW run on Windows?
Chris@82 110
Chris@82 111 Yes, many people have reported successfully using FFTW on Windows with
Chris@82 112 various compilers. FFTW was not developed on Windows, but the source
Chris@82 113 code is essentially straight ANSI C. See also the \docref{FFTW
Chris@82 114 Windows installation notes\}, \qref compilerCrashes, and \qref
Chris@82 115 vbetalia.
Chris@82 116
Chris@82 117 \question 26aug:compilerCrashes My compiler has trouble with FFTW.
Chris@82 118
Chris@82 119 Complain fiercely to the vendor of the compiler.
Chris@82 120
Chris@82 121 We have successfully used \courier{gcc\} 3.2.x on x86 and PPC, a
Chris@82 122 recent Compaq C compiler for Alpha, version 6 of IBM's \courier{xlc\}
Chris@82 123 compiler for AIX, Intel's \courier{icc\} versions 5-7, and Sun
Chris@82 124 WorkShop \courier{cc\} version 6.
Chris@82 125
Chris@82 126 FFTW is likely to push compilers to their limits, however, and several
Chris@82 127 compiler bugs have been exposed by FFTW. A partial list follows.
Chris@82 128
Chris@82 129 \courier{gcc\} 2.95.x for Solaris/SPARC produces incorrect code for
Chris@82 130 the test program (workaround: recompile the \courier{libbench2\}
Chris@82 131 directory with \courier{-O2\}).
Chris@82 132
Chris@82 133 NetBSD/macppc 1.6 comes with a \courier{gcc\} version that also
Chris@82 134 miscompiles the test program. (Please report a workaround if you know
Chris@82 135 one.)
Chris@82 136
Chris@82 137 \courier{gcc\} 3.2.3 for ARM reportedly crashes during compilation.
Chris@82 138 This bug is reportedly fixed in later versions of \courier{gcc\}.
Chris@82 139
Chris@82 140 Versions 8.0 and 8.1 of Intel's \courier{icc\} falsely claim to be
Chris@82 141 \courier{gcc\}, so you should specify \courier{CC="icc -no-gcc"\};
Chris@82 142 this is automatic in FFTW 3.1. \courier{icc-8.0.066\} reportely
Chris@82 143 produces incorrect code for FFTW 2.1.5, but is fixed in version 8.1.
Chris@82 144 \courier{icc-7.1\} compiler build 20030402Z appears to produce
Chris@82 145 incorrect dependencies, causing the compilation to fail.
Chris@82 146 \courier{icc-7.1\} build 20030307Z appears to work fine. (Use
Chris@82 147 \courier{icc -V\} to check which build you have.) As of 2003/04/18,
Chris@82 148 build 20030402Z appears not to be available any longer on Intel's
Chris@82 149 website, whereas the older build 20030307Z is available.
Chris@82 150
Chris@82 151 \courier{ranlib\} of GNU \courier{binutils\} 2.9.1 on Irix has been
Chris@82 152 observed to corrupt the FFTW libraries, causing a link failure when
Chris@82 153 FFTW is compiled. Since \courier{ranlib\} is completely superfluous
Chris@82 154 on Irix, we suggest deleting it from your system and replacing it with
Chris@82 155 a symbolic link to \courier{/bin/echo\}.
Chris@82 156
Chris@82 157 If support for SIMD instructions is enabled in FFTW, further compiler
Chris@82 158 problems may appear:
Chris@82 159
Chris@82 160 \courier{gcc\} 3.4.[0123] for x86 produces incorrect SSE2 code for
Chris@82 161 FFTW when \courier{-O2\} (the best choice for FFTW) is used, causing
Chris@82 162 FFTW to crash (\courier{make check\} crashes). This bug is fixed in
Chris@82 163 \courier{gcc\} 3.4.4. On x86_64 (amd64/em64t), \courier{gcc\} 3.4.4
Chris@82 164 reportedly still has a similar problem, but this is fixed as of
Chris@82 165 \courier{gcc\} 3.4.6.
Chris@82 166
Chris@82 167 \courier{gcc-3.2\} for x86 produces incorrect SIMD code if
Chris@82 168 \courier{-O3\} is used. The same compiler produces incorrect SIMD
Chris@82 169 code if no optimization is used, too. When using \courier{gcc-3.2\},
Chris@82 170 it is a good idea not to change the default \courier{CFLAGS\} selected
Chris@82 171 by the \courier{configure\} script.
Chris@82 172
Chris@82 173 Some 3.0.x and 3.1.x versions of \courier{gcc\} on \courier{x86\} may
Chris@82 174 crash. \courier{gcc\} so-called 2.96 shipping with RedHat 7.3 crashes
Chris@82 175 when compiling SIMD code. In both cases, please upgrade to
Chris@82 176 \courier{gcc-3.2\} or later.
Chris@82 177
Chris@82 178 Intel's \courier{icc\} 6.0 misaligns SSE constants, but FFTW has a
Chris@82 179 workaround. \courier{icc\} 8.x fails to compile FFTW 3.0.x because it
Chris@82 180 falsely claims to be \courier{gcc\}; we believe this to be a bug in
Chris@82 181 \courier{icc\}, but FFTW 3.1 has a workaround.
Chris@82 182
Chris@82 183 Visual C++ 2003 reportedly produces incorrect code for SSE/SSE2 when
Chris@82 184 compiling FFTW. This bug was reportedly fixed in VC++ 2005;
Chris@82 185 alternatively, you could switch to the Intel compiler. VC++ 6.0 also
Chris@82 186 reportedly produces incorrect code for the file
Chris@82 187 \courier{reodft11e-r2hc-odd.c\} unless optimizations are disabled for
Chris@82 188 that file.
Chris@82 189
Chris@82 190 \courier{gcc\} 2.95 on MacOS X miscompiles AltiVec code (fixed in
Chris@82 191 later versions). \courier{gcc\} 3.2.x miscompiles AltiVec
Chris@82 192 permutations, but FFTW has a workaround. \courier{gcc\} 4.0.1 on
Chris@82 193 MacOS for Intel crashes when compiling FFTW; a workaround is to
Chris@82 194 compile one file without optimization: \courier{cd kernel; make
Chris@82 195 CFLAGS=" " trig.lo\}.
Chris@82 196
Chris@82 197 \courier{gcc\} 4.1.1 reportedly crashes when compiling FFTW for MIPS;
Chris@82 198 the workaround is to compile the file it crashes on
Chris@82 199 (\courier{t2_64.c\}) with a lower optimization level.
Chris@82 200
Chris@82 201 \courier{gcc\} versions 4.1.2 to 4.2.0 for x86 reportedly miscompile
Chris@82 202 FFTW 3.1's test program, causing \courier{make check\} to crash
Chris@82 203 (\courier{gcc\} bug #26528). The bug was reportedly fixed in
Chris@82 204 \courier{gcc\} version 4.2.1 and later. A workaround is to compile
Chris@82 205 \courier{libbench2/verify-lib.c\} without optimization.
Chris@82 206
Chris@82 207 \question 26aug:solarisSucks FFTW does not compile on Solaris, complaining about \courier{const\}.
Chris@82 208
Chris@82 209 We know that at least on Solaris 2.5.x with Sun's compilers 4.2 you
Chris@82 210 might get error messages from \courier{make\} such as
Chris@82 211
Chris@82 212 \courier{"./fftw.h", line 88: warning: const is a keyword in ANSI C\}
Chris@82 213
Chris@82 214 This is the case when the \courier{configure\} script reports that
Chris@82 215 \courier{const\} does not work:
Chris@82 216
Chris@82 217 \courier{checking for working const... (cached) no\}
Chris@82 218
Chris@82 219 You should be aware that Solaris comes with two compilers, namely,
Chris@82 220 \courier{/opt/SUNWspro/SC4.2/bin/cc\} and \courier{/usr/ucb/cc\}. The
Chris@82 221 latter compiler is non-ANSI. Indeed, it is a perverse shell script
Chris@82 222 that calls the real compiler in non-ANSI mode. In order
Chris@82 223 to compile FFTW, change your path so that the right \courier{cc\}
Chris@82 224 is used.
Chris@82 225
Chris@82 226 To know whether your compiler is the right one, type
Chris@82 227 \courier{cc -V\}. If the compiler prints ``\courier{ucbcc\}'',
Chris@82 228 as in
Chris@82 229
Chris@82 230 \courier{ucbcc: WorkShop Compilers 4.2 30 Oct 1996 C 4.2\}
Chris@82 231
Chris@82 232 then the compiler is wrong. The right message is something like
Chris@82 233
Chris@82 234 \courier{cc: WorkShop Compilers 4.2 30 Oct 1996 C 4.2\}
Chris@82 235
Chris@82 236 \question 19mar:3dnow What's the difference between \courier{--enable-3dnow\} and \courier{--enable-k7\}?
Chris@82 237
Chris@82 238 \courier{--enable-k7\} enables 3DNow! instructions on K7 processors
Chris@82 239 (AMD Athlon and its variants). K7 support is provided by assembly
Chris@82 240 routines generated by a special purpose compiler.
Chris@82 241 As of fftw-3.2, --enable-k7 is no longer supported.
Chris@82 242
Chris@82 243 \courier{--enable-3dnow\} enables generic 3DNow! support using
Chris@82 244 \courier{gcc\} builtin functions. This works on earlier AMD
Chris@82 245 processors, but it is not as fast as our special assembly routines.
Chris@82 246 As of fftw-3.1, --enable-3dnow is no longer supported.
Chris@82 247
Chris@82 248 \question 18apr:fma What's the difference between the fma and the non-fma versions?
Chris@82 249
Chris@82 250 The fma version tries to exploit the fused multiply-add instructions
Chris@82 251 implemented in many processors such as PowerPC, ia-64, and MIPS. The
Chris@82 252 two FFTW packages are otherwise identical. In FFTW 3.1, the fma and
Chris@82 253 non-fma versions were merged together into a single package, and the
Chris@82 254 \courier{configure\} script attempts to automatically guess which
Chris@82 255 version to use.
Chris@82 256
Chris@82 257 The FFTW 3.1 \courier{configure\} script enables fma by default on
Chris@82 258 PowerPC, Itanium, and PA-RISC, and disables it otherwise. You can
Chris@82 259 force one or the other by using the \courier{--enable-fma\} or
Chris@82 260 \courier{--disable-fma\} flag for \courier{configure\}.
Chris@82 261
Chris@82 262 Definitely use fma if you have a PowerPC-based system with
Chris@82 263 \courier{gcc\} (or IBM \courier{xlc\}). This includes all GNU/Linux
Chris@82 264 systems for PowerPC and the older PowerPC-based MacOS systems. Also
Chris@82 265 use it on PA-RISC and Itanium with the HP/UX compiler.
Chris@82 266
Chris@82 267 Definitely do not use the fma version if you have an ia-32 processor
Chris@82 268 (Intel, AMD, MacOS on Intel, etcetera).
Chris@82 269
Chris@82 270 For other architectures/compilers, the situation is not so clear. For
Chris@82 271 example, ia-64 has the fma instruction, but \courier{gcc-3.2\} appears
Chris@82 272 not to exploit it correctly. Other compilers may do the right thing,
Chris@82 273 but we have not tried them. Please send us your feedback so that we
Chris@82 274 can update this FAQ entry.
Chris@82 275
Chris@82 276 \question 26aug:languages Which language is FFTW written in?
Chris@82 277
Chris@82 278 FFTW is written in ANSI C. Most of the code, however, was
Chris@82 279 automatically generated by a program called \courier{genfft\}, written
Chris@82 280 in the Objective Caml dialect of ML. You do not need to know ML or to
Chris@82 281 have an Objective Caml compiler in order to use FFTW.
Chris@82 282
Chris@82 283 \courier{genfft\} is provided with the FFTW sources, which means that
Chris@82 284 you can play with the code generator if you want. In this case, you
Chris@82 285 need a working Objective Caml system. Objective Caml is available
Chris@82 286 from \docref{the Caml web page\}.
Chris@82 287
Chris@82 288 \question 26aug:fortran Can I call FFTW from Fortran?
Chris@82 289
Chris@82 290 Yes, FFTW (versions 1.3 and higher) contains a Fortran-callable
Chris@82 291 interface, documented in the FFTW manual.
Chris@82 292
Chris@82 293 By default, FFTW configures its Fortran interface to work with the
Chris@82 294 first compiler it finds, e.g. \courier{g77\}. To configure for a
Chris@82 295 different, incompatible Fortran compiler \courier{foobar\}, use
Chris@82 296 \courier{./configure F77=foobar\} when installing FFTW. (In the case
Chris@82 297 of \courier{g77\}, however, FFTW 3.x also includes an extra set of
Chris@82 298 Fortran-callable routines with one less underscore at the end of
Chris@82 299 identifiers, which should cover most other Fortran compilers on Linux
Chris@82 300 at least.)
Chris@82 301
Chris@82 302 \question 26aug:cplusplus Can I call FFTW from C++?
Chris@82 303
Chris@82 304 Most definitely. FFTW should compile and/or link under any C++
Chris@82 305 compiler. Moreover, it is likely that the C++ \courier{<complex>\}
Chris@82 306 template class is bit-compatible with FFTW's complex-number format
Chris@82 307 (see the FFTW manual for more details).
Chris@82 308
Chris@82 309 \question 26aug:whynotfortran Why isn't FFTW written in Fortran/C++?
Chris@82 310
Chris@82 311 Because we don't like those languages, and neither approaches the
Chris@82 312 portability of C.
Chris@82 313
Chris@82 314 \question 29mar:singleprec How do I compile FFTW to run in single precision?
Chris@82 315
Chris@82 316 On a Unix system: \courier{configure --enable-float\}. On a non-Unix
Chris@82 317 system: edit \courier{config.h\} to \courier{#define\} the symbol
Chris@82 318 \courier{FFTW_SINGLE\} (for FFTW 3.x). In both cases, you must then
Chris@82 319 recompile FFTW. In FFTW 3, all FFTW identifiers will then begin with
Chris@82 320 \courier{fftwf_\} instead of \courier{fftw_\}.
Chris@82 321
Chris@82 322 \question 28mar:64bitk7 --enable-k7 does not work on x86-64
Chris@82 323
Chris@82 324 Support for --enable-k7 was discontinued in fftw-3.2.
Chris@82 325
Chris@82 326 The fftw-3.1 release supports --enable-k7. This option only works on
Chris@82 327 32-bit x86 machines that implement 3DNow!, including the AMD Athlon
Chris@82 328 and the AMD Opteron in 32-bit mode. --enable-k7 does not work on AMD
Chris@82 329 Opteron in 64-bit mode. Use --enable-sse for x86-64 machines.
Chris@82 330
Chris@82 331 FFTW supports 3DNow! by means of assembly code generated by a
Chris@82 332 special-purpose compiler. It is hard to produce assembly code that
Chris@82 333 works in both 32-bit and 64-bit mode.
Chris@82 334
Chris@82 335 \comment ######################################################################
Chris@82 336
Chris@82 337 \section Using FFTW
Chris@82 338
Chris@82 339 \question 15mar:fftw2to3 Why not support the FFTW 2 interface in FFTW 3?
Chris@82 340
Chris@82 341 FFTW 3 has semantics incompatible with earlier versions: its plans can
Chris@82 342 only be used for a given stride, multiplicity, and other
Chris@82 343 characteristics of the input and output arrays; these stronger
Chris@82 344 semantics are necessary for performance reasons. Thus, it is
Chris@82 345 impossible to efficiently emulate the older interface (whose plans can
Chris@82 346 be used for any transform of the same size). We believe that it
Chris@82 347 should be possible to upgrade most programs without any difficulty,
Chris@82 348 however.
Chris@82 349
Chris@82 350 \question 20mar:planperarray Why do FFTW 3 plans encapsulate the input/output arrays and not just the algorithm?
Chris@82 351
Chris@82 352 There are several reasons:
Chris@82 353
Chris@82 354 \call startlist
Chris@82 355 \call item
Chris@82 356 It was important for performance reasons that the plan be specific to
Chris@82 357 array characteristics like the stride (and alignment, for SIMD), and
Chris@82 358 requiring that the user maintain these invariants is error prone.
Chris@82 359 \call item
Chris@82 360 In most high-performance applications, as far as we can tell, you are
Chris@82 361 usually transforming the same array over and over, so FFTW's semantics
Chris@82 362 should not be a burden.
Chris@82 363 \call item
Chris@82 364 If you need to transform another array of the same size, creating a
Chris@82 365 new plan once the first exists is a cheap operation.
Chris@82 366 \call item
Chris@82 367 If you need to transform many arrays of the same size at once, you
Chris@82 368 should really use the \courier{plan_many\} routines in FFTW's "advanced"
Chris@82 369 interface.
Chris@82 370 \call item
Chris@82 371 If the abovementioned array characteristics are the same, you are
Chris@82 372 willing to pay close attention to the documentation, and you really
Chris@82 373 need to, we provide a "new-array execution" interface to apply a plan
Chris@82 374 to a new array.
Chris@82 375 \call endlist
Chris@82 376
Chris@82 377 \question 25may:slow FFTW seems really slow.
Chris@82 378
Chris@82 379 You are probably recreating the plan before every transform, rather
Chris@82 380 than creating it once and reusing it for all transforms of the same
Chris@82 381 size. FFTW is designed to be used in the following way:
Chris@82 382
Chris@82 383 \call startlist
Chris@82 384 \call item
Chris@82 385 First, you create a plan. This will take several seconds.
Chris@82 386 \call item
Chris@82 387 Then, you reuse the plan many times to perform FFTs. These are fast.
Chris@82 388 \call endlist
Chris@82 389
Chris@82 390 If you don't need to compute many transforms and the time for the
Chris@82 391 planner is significant, you have two options. First, you can use the
Chris@82 392 \courier{FFTW_ESTIMATE\} option in the planner, which uses heuristics
Chris@82 393 instead of runtime measurements and produces a good plan in a short
Chris@82 394 time. Second, you can use the wisdom feature to precompute the plan;
Chris@82 395 see \qref savePlans
Chris@82 396
Chris@82 397 \question 22oct:slows FFTW slows down after repeated calls.
Chris@82 398
Chris@82 399 Probably, NaNs or similar are creeping into your data, and the
Chris@82 400 slowdown is due to the resulting floating-point exceptions. For
Chris@82 401 example, be aware that repeatedly FFTing the same array is a diverging
Chris@82 402 process (because FFTW computes the unnormalized transform).
Chris@82 403
Chris@82 404 \question 22oct:segfault An FFTW routine is crashing when I call it.
Chris@82 405
Chris@82 406 Did the FFTW test programs pass (\courier{make check\}, or \courier{cd
Chris@82 407 tests; make bigcheck\} if you want to be paranoid)? If so, you almost
Chris@82 408 certainly have a bug in your own code. For example, you could be
Chris@82 409 passing invalid arguments (such as wrongly-sized arrays) to FFTW, or
Chris@82 410 you could simply have memory corruption elsewhere in your program that
Chris@82 411 causes random crashes later on. Please don't complain to us unless
Chris@82 412 you can come up with a minimal self-contained program (preferably
Chris@82 413 under 30 lines) that illustrates the problem.
Chris@82 414
Chris@82 415 \question 22oct:fortran64 My Fortran program crashes when calling FFTW.
Chris@82 416
Chris@82 417 As described in the manual, on 64-bit machines you must store the
Chris@82 418 plans in variables large enough to hold a pointer, for example
Chris@82 419 \courier{integer*8\}. We recommend using \courier{integer*8\} on
Chris@82 420 32-bit machines as well, to simplify porting.
Chris@82 421
Chris@82 422 \question 24mar:conventions FFTW gives results different from my old FFT.
Chris@82 423
Chris@82 424 People follow many different conventions for the DFT, and you should
Chris@82 425 be sure to know the ones that we use (described in the FFTW manual).
Chris@82 426 In particular, you should be aware that the
Chris@82 427 \courier{FFTW_FORWARD\}/\courier{FFTW_BACKWARD\} directions correspond
Chris@82 428 to signs of -1/+1 in the exponent of the DFT definition.
Chris@82 429 (\italic{Numerical Recipes\} uses the opposite convention.)
Chris@82 430
Chris@82 431 You should also know that we compute an unnormalized transform. In
Chris@82 432 contrast, Matlab is an example of program that computes a normalized
Chris@82 433 transform. See \qref whyscaled.
Chris@82 434
Chris@82 435 Finally, note that floating-point arithmetic is not exact, so
Chris@82 436 different FFT algorithms will give slightly different results (on the
Chris@82 437 order of the numerical accuracy; typically a fractional difference of
Chris@82 438 1e-15 or so in double precision).
Chris@82 439
Chris@82 440 \question 31aug:nondeterministic FFTW gives different results between runs
Chris@82 441
Chris@82 442 If you use \courier{FFTW_MEASURE\} or \courier{FFTW_PATIENT\} mode,
Chris@82 443 then the algorithm FFTW employs is not deterministic: it depends on
Chris@82 444 runtime performance measurements. This will cause the results to vary
Chris@82 445 slightly from run to run. However, the differences should be slight,
Chris@82 446 on the order of the floating-point precision, and therefore should
Chris@82 447 have no practical impact on most applications.
Chris@82 448
Chris@82 449 If you use saved plans (wisdom) or \courier{FFTW_ESTIMATE\} mode,
Chris@82 450 however, then the algorithm is deterministic and the results should be
Chris@82 451 identical between runs.
Chris@82 452
Chris@82 453 \question 26aug:savePlans Can I save FFTW's plans?
Chris@82 454
Chris@82 455 Yes. Starting with version 1.2, FFTW provides the \courier{wisdom\}
Chris@82 456 mechanism for saving plans; see the FFTW manual.
Chris@82 457
Chris@82 458 \question 14sep:whyscaled Why does your inverse transform return a scaled result?
Chris@82 459
Chris@82 460 Computing the forward transform followed by the backward transform (or
Chris@82 461 vice versa) yields the original array scaled by the size of the array.
Chris@82 462 (For multi-dimensional transforms, the size of the array is the
Chris@82 463 product of the dimensions.) We could, instead, have chosen a
Chris@82 464 normalization that would have returned the unscaled array. Or, to
Chris@82 465 accomodate the many conventions in this matter, the transform routines
Chris@82 466 could have accepted a "scale factor" parameter. We did not do this,
Chris@82 467 however, for two reasons. First, we didn't want to sacrifice
Chris@82 468 performance in the common case where the scale factor is 1. Second, in
Chris@82 469 real applications the FFT is followed or preceded by some computation
Chris@82 470 on the data, into which the scale factor can typically be absorbed at
Chris@82 471 little or no cost.
Chris@82 472
Chris@82 473 \question 02dec:centerorigin How can I make FFTW put the origin (zero frequency) at the center of its output?
Chris@82 474
Chris@82 475 For human viewing of a spectrum, it is often convenient to put the
Chris@82 476 origin in frequency space at the center of the output array, rather
Chris@82 477 than in the zero-th element (the default in FFTW). If all of the
Chris@82 478 dimensions of your array are even, you can accomplish this by simply
Chris@82 479 multiplying each element of the input array by (-1)^(i + j + ...),
Chris@82 480 where i, j, etcetera are the indices of the element. (This trick is a
Chris@82 481 general property of the DFT, and is not specific to FFTW.)
Chris@82 482
Chris@82 483 \question 08may:imageaudio How do I FFT an image/audio file in \italic{foobar\} format?
Chris@82 484
Chris@82 485 FFTW performs an FFT on an array of floating-point values. You can
Chris@82 486 certainly use it to compute the transform of an image or audio stream,
Chris@82 487 but you are responsible for figuring out your data format and
Chris@82 488 converting it to the form FFTW requires.
Chris@82 489
Chris@82 490 \question 09apr:linkfails My program does not link (on Unix).
Chris@82 491
Chris@82 492 The libraries must be listed in the correct order (\courier{-lfftw3
Chris@82 493 -lm\} for FFTW 3.x) and \italic{after\} your program sources/objects.
Chris@82 494 (The general rule is that if \italic{A\} uses \italic{B\}, then
Chris@82 495 \italic{A\} must be listed before \italic{B\} in the link command.).
Chris@82 496
Chris@82 497 \question 15mar:linkheader I included your header, but linking still fails.
Chris@82 498
Chris@82 499 You're a C++ programmer, aren't you? You have to compile the FFTW
Chris@82 500 library and link it into your program, not just \courier{#include
Chris@82 501 <fftw3.h>\}. (Yes, this is really a FAQ.)
Chris@82 502
Chris@82 503 \question 22oct:nostack My program crashes, complaining about stack space.
Chris@82 504
Chris@82 505 You cannot declare large arrays with automatic storage (e.g. via
Chris@82 506 \courier{fftw_complex array[N]\}); you should use
Chris@82 507 \courier{fftw_malloc\} (or equivalent) to allocate the arrays you want
Chris@82 508 to transform if they are larger than a few hundred elements.
Chris@82 509
Chris@82 510 \question 13may:leaks FFTW seems to have a memory leak.
Chris@82 511
Chris@82 512 After you create a plan, FFTW caches the information required to
Chris@82 513 quickly recreate the plan. (See \qref savePlans) It also maintains a
Chris@82 514 small amount of other persistent memory. You can deallocate all of
Chris@82 515 FFTW's internally allocated memory, if you wish, by calling
Chris@82 516 \courier{fftw_cleanup()\}, as documented in the manual.
Chris@82 517
Chris@82 518 \question 16may:allzero The output of FFTW's transform is all zeros.
Chris@82 519
Chris@82 520 You should initialize your input array \italic{after\} creating the
Chris@82 521 plan, unless you use \courier{FFTW_ESTIMATE\}: planning with
Chris@82 522 \courier{FFTW_MEASURE\} or \courier{FFTW_PATIENT\} overwrites the
Chris@82 523 input/output arrays, as described in the manual.
Chris@82 524
Chris@82 525 \question 05sep:vbetalia How do I call FFTW from the Microsoft language du jour?
Chris@82 526
Chris@82 527 Please \italic{do not\} ask us Windows-specific questions. We do not
Chris@82 528 use Windows. We know nothing about Visual Basic, Visual C++, or .NET.
Chris@82 529 Please find the appropriate Usenet discussion group and ask your
Chris@82 530 question there. See also \qref runOnWindows.
Chris@82 531
Chris@82 532 \question 15oct:pruned Can I compute only a subset of the DFT outputs?
Chris@82 533
Chris@82 534 In general, no, an FFT intrinsically computes all outputs from all
Chris@82 535 inputs. In principle, there is something called a \italic{pruned
Chris@82 536 FFT\} that can do what you want, but to compute K outputs out of N the
Chris@82 537 complexity is in general O(N log K) instead of O(N log N), thus saving
Chris@82 538 only a small additive factor in the log. (The same argument holds if
Chris@82 539 you instead have only K nonzero inputs.)
Chris@82 540
Chris@82 541 There are some specific cases in which you can get the O(N log K)
Chris@82 542 performance benefits easily, however, by combining a few ordinary
Chris@82 543 FFTs. In particular, the case where you want the first K outputs,
Chris@82 544 where K divides N, can be handled by performing N/K transforms of size
Chris@82 545 K and then summing the outputs multiplied by appropriate phase
Chris@82 546 factors. For more details, see \docref{pruned FFTs with FFTW\}.
Chris@82 547
Chris@82 548 There are also some algorithms that compute pruned transforms
Chris@82 549 \italic{approximately\}, but they are beyond the scope of this FAQ.
Chris@82 550
Chris@82 551 \question 21jan:transpose Can I use FFTW's routines for in-place and out-of-place matrix transposition?
Chris@82 552
Chris@82 553 You can use the FFTW guru interface to create a rank-0 transform of
Chris@82 554 vector rank 2 where the vector strides are transposed. (A rank-0
Chris@82 555 transform is equivalent to a 1D transform of size 1, which. just
Chris@82 556 copies the input into the output.) Specifying the same location for
Chris@82 557 the input and output makes the transpose in-place.
Chris@82 558
Chris@82 559 For double-valued data stored in row-major format, plan creation looks like
Chris@82 560 this:
Chris@82 561
Chris@82 562 \verbatim
Chris@82 563 fftw_plan plan_transpose(int rows, int cols, double *in, double *out)
Chris@82 564 {
Chris@82 565 const unsigned flags = FFTW_ESTIMATE; /* other flags are possible */
Chris@82 566 fftw_iodim howmany_dims[2];
Chris@82 567
Chris@82 568 howmany_dims[0].n = rows;
Chris@82 569 howmany_dims[0].is = cols;
Chris@82 570 howmany_dims[0].os = 1;
Chris@82 571
Chris@82 572 howmany_dims[1].n = cols;
Chris@82 573 howmany_dims[1].is = 1;
Chris@82 574 howmany_dims[1].os = rows;
Chris@82 575
Chris@82 576 return fftw_plan_guru_r2r(/*rank=*/ 0, /*dims=*/ NULL,
Chris@82 577 /*howmany_rank=*/ 2, howmany_dims,
Chris@82 578 in, out, /*kind=*/ NULL, flags);
Chris@82 579 }
Chris@82 580 \endverbatim
Chris@82 581
Chris@82 582 (This entry was written by Rhys Ulerich.)
Chris@82 583
Chris@82 584 \comment ######################################################################
Chris@82 585
Chris@82 586 \section Internals of FFTW
Chris@82 587
Chris@82 588 \question 26aug:howworks How does FFTW work?
Chris@82 589
Chris@82 590 The innovation (if it can be so called) in FFTW consists in having a
Chris@82 591 variety of composable \italic{solvers\}, representing different FFT
Chris@82 592 algorithms and implementation strategies, whose combination into a
Chris@82 593 particular \italic{plan\} for a given size can be determined at
Chris@82 594 runtime according to the characteristics of your machine/compiler.
Chris@82 595 This peculiar software architecture allows FFTW to adapt itself to
Chris@82 596 almost any machine.
Chris@82 597
Chris@82 598 For more details (albeit somewhat outdated), see the paper "FFTW: An
Chris@82 599 Adaptive Software Architecture for the FFT", by M. Frigo and
Chris@82 600 S. G. Johnson, \italic{Proc. ICASSP\} 3, 1381 (1998), also
Chris@82 601 available at \docref{the FFTW web page\}.
Chris@82 602
Chris@82 603 \question 26aug:whyfast Why is FFTW so fast?
Chris@82 604
Chris@82 605 This is a complex question, and there is no simple answer. In fact,
Chris@82 606 the authors do not fully know the answer, either. In addition to many
Chris@82 607 small performance hacks throughout FFTW, there are three general
Chris@82 608 reasons for FFTW's speed.
Chris@82 609
Chris@82 610 \call startlist
Chris@82 611 \call item
Chris@82 612 FFTW uses a variety of FFT algorithms and implementation styles
Chris@82 613 that can be arbitrarily composed to adapt itself to
Chris@82 614 a machine. See \qref howworks.
Chris@82 615 \call item
Chris@82 616 FFTW uses a code generator to produce highly-optimized
Chris@82 617 routines for computing small transforms.
Chris@82 618 \call item
Chris@82 619 FFTW uses explicit divide-and-conquer to take advantage
Chris@82 620 of the memory hierarchy.
Chris@82 621 \call endlist
Chris@82 622
Chris@82 623 For more details (albeit somewhat outdated), see the paper "FFTW: An
Chris@82 624 Adaptive Software Architecture for the FFT", by M. Frigo and
Chris@82 625 S. G. Johnson, \italic{Proc. ICASSP\} 3, 1381 (1998),
Chris@82 626 available along with other references at \docref{the FFTW web page\}.
Chris@82 627
Chris@82 628 \comment ######################################################################
Chris@82 629
Chris@82 630 \section Known bugs
Chris@82 631
Chris@82 632 \question 27aug:rfftwndbug FFTW 1.1 crashes in rfftwnd on Linux.
Chris@82 633
Chris@82 634 This bug was fixed in FFTW 1.2. There was a bug in \courier{rfftwnd\}
Chris@82 635 causing an incorrect amount of memory to be allocated. The bug showed
Chris@82 636 up in Linux with libc-5.3.12 (and nowhere else that we know of).
Chris@82 637
Chris@82 638 \question 15oct:fftwmpibug The MPI transforms in FFTW 1.2 give incorrect results/leak memory.
Chris@82 639
Chris@82 640 These bugs were corrected in FFTW 1.2.1. The MPI transforms (really,
Chris@82 641 just the transpose routines) in FFTW 1.2 had bugs that could cause
Chris@82 642 errors in some situations.
Chris@82 643
Chris@82 644 \question 05nov:testsingbug The test programs in FFTW 1.2.1 fail when I change FFTW to use single precision.
Chris@82 645
Chris@82 646 This bug was fixed in FFTW 1.3. (Older versions of FFTW did
Chris@82 647 work in single precision, but the test programs didn't--the error
Chris@82 648 tolerances in the tests were set for double precision.)
Chris@82 649
Chris@82 650 \question 24mar:teststoobig The test program in FFTW 1.2.1 fails for n > 46340.
Chris@82 651
Chris@82 652 This bug was fixed in FFTW 1.3. FFTW 1.2.1 produced the right answer,
Chris@82 653 but the test program was wrong. For large n, n*n in the naive
Chris@82 654 transform that we used for comparison overflows 32 bit integer
Chris@82 655 precision, breaking the test.
Chris@82 656
Chris@82 657 \question 24aug:linuxthreads The threaded code fails on Linux Redhat 5.0
Chris@82 658
Chris@82 659 We had problems with glibc-2.0.5. The code should work with
Chris@82 660 glibc-2.0.7.
Chris@82 661
Chris@82 662 \question 26sep:bigrfftwnd FFTW 2.0's rfftwnd fails for rank > 1 transforms with a final dimension >= 65536.
Chris@82 663
Chris@82 664 This bug was fixed in FFTW 2.0.1. (There was a 32-bit integer overflow due
Chris@82 665 to a poorly-parenthesized expression.)
Chris@82 666
Chris@82 667 \question 26mar:primebug FFTW 2.0's complex transforms give the wrong results with prime factors 17 to 97.
Chris@82 668
Chris@82 669 There was a bug in the complex transforms that could cause incorrect
Chris@82 670 results under (hopefully rare) circumstances for lengths with
Chris@82 671 intermediate-size prime factors (17-97). This bug was fixed in FFTW
Chris@82 672 2.1.1.
Chris@82 673
Chris@82 674 \question 05apr:mpichbug FFTW 2.1.1's MPI test programs crash with MPICH.
Chris@82 675
Chris@82 676 This bug was fixed in FFTW 2.1.2. The 2.1/2.1.1 MPI test programs crashed
Chris@82 677 when using the MPICH implementation of MPI with the \courier{ch_p4\}
Chris@82 678 device (TCP/IP); the transforms themselves worked fine.
Chris@82 679
Chris@82 680 \question 25may:aixthreadbug FFTW 2.1.2's multi-threaded transforms don't work on AIX.
Chris@82 681
Chris@82 682 This bug was fixed in FFTW 2.1.3. The multi-threaded transforms in
Chris@82 683 previous versions didn't work with AIX's \courier{pthreads\}
Chris@82 684 implementation, which idiosyncratically creates threads in detached
Chris@82 685 (non-joinable) mode by default.
Chris@82 686
Chris@82 687 \question 27sep:bigprimebug FFTW 2.1.2's complex transforms give incorrect results for large prime sizes.
Chris@82 688
Chris@82 689 This bug was fixed in FFTW 2.1.3. FFTW's complex-transform algorithm
Chris@82 690 for prime sizes (in versions 2.0 to 2.1.2) had an integer overflow
Chris@82 691 problem that caused incorrect results for many primes greater than
Chris@82 692 32768 (on 32-bit machines). (Sizes without large prime factors are
Chris@82 693 not affected.)
Chris@82 694
Chris@82 695 \question 25may:solaristhreadbug FFTW 2.1.3's multi-threaded transforms don't give any speedup on Solaris.
Chris@82 696
Chris@82 697 This bug was fixed in FFTW 2.1.4. (By default, Solaris creates
Chris@82 698 threads that do not parallelize over multiple processors, so one has
Chris@82 699 to request the proper behavior specifically.)
Chris@82 700
Chris@82 701 \question 03may:aixflags FFTW 2.1.3 crashes on AIX.
Chris@82 702
Chris@82 703 The FFTW 2.1.3 \courier{configure\} script picked incorrect compiler
Chris@82 704 flags for the \courier{xlc\} compiler on newer IBM processors. This
Chris@82 705 is fixed in FFTW 2.1.4.
Chris@82 706
Chris@82 707 \comment Here it ends!
Chris@82 708