annotate src/fftw-3.3.3/doc/FAQ/fftw-faq.bfnn @ 83:ae30d91d2ffe

Replace these with versions built using an older toolset (so as to avoid ABI compatibilities when linking on Ubuntu 14.04 for packaging purposes)
author Chris Cannam
date Fri, 07 Feb 2020 11:51:13 +0000
parents 37bf6b4a2645
children
rev   line source
Chris@10 1 \comment This is the source for the FFTW FAQ list, in
Chris@10 2 \comment the Bizarre Format With No Name. It is turned into Lout
Chris@10 3 \comment input, HTML, plain ASCII and an Info document by a Perl script.
Chris@10 4 \comment
Chris@10 5 \comment The format and scripts come from the Linux FAQ, by
Chris@10 6 \comment Ian Jackson.
Chris@10 7 \set brieftitle FFTW FAQ
Chris@10 8 \set author <A href="http://www.fftw.org">Matteo Frigo and Steven G. Johnson</A> / <A href="mailto:fftw@fftw.org">fftw@fftw.org</A>
Chris@10 9 \set authormail fftw@fftw.org
Chris@10 10 \set title FFTW Frequently Asked Questions with Answers
Chris@10 11 \set copyholder Matteo Frigo and Massachusetts Institute of Technology
Chris@10 12 \call-html startup html.refs2
Chris@10 13 \copyto ASCII
Chris@10 14 FFTW FREQUENTLY ASKED QUESTIONS WITH ANSWERS
Chris@10 15 `date '+%d %h %Y'`
Chris@10 16 Matteo Frigo
Chris@10 17 Steven G. Johnson
Chris@10 18 <fftw@fftw.org>
Chris@10 19
Chris@10 20 \endcopy
Chris@10 21 \copyto INFO
Chris@10 22 START-INFO-DIR-ENTRY
Chris@10 23 * FFTW FAQ: (fftw-faq). FFTW Frequently Asked Questions with Answers.
Chris@10 24 END-INFO-DIR-ENTRY
Chris@10 25
Chris@10 26 
Chris@10 27 File: $prefix.info, Node: Top, Next: Question 1.1, Up: (dir)
Chris@10 28
Chris@10 29 FFTW FREQUENTLY ASKED QUESTIONS WITH ANSWERS
Chris@10 30 `date '+%d %h %Y'`
Chris@10 31 Matteo Frigo
Chris@10 32 Steven G. Johnson
Chris@10 33 <fftw@fftw.org>
Chris@10 34
Chris@10 35 \endcopy
Chris@10 36
Chris@10 37 This is the list of Frequently Asked Questions about FFTW, a
Chris@10 38 collection of fast C routines for computing the Discrete Fourier
Chris@10 39 Transform in one or more dimensions.
Chris@10 40
Chris@10 41 \section Index
Chris@10 42
Chris@10 43 \index
Chris@10 44
Chris@10 45 \comment ######################################################################
Chris@10 46
Chris@10 47 \section Introduction and General Information
Chris@10 48
Chris@10 49 \question 26aug:whatisfftw What is FFTW?
Chris@10 50
Chris@10 51 FFTW is a free collection of fast C routines for computing the
Chris@10 52 Discrete Fourier Transform in one or more dimensions. It includes
Chris@10 53 complex, real, symmetric, and parallel transforms, and can handle
Chris@10 54 arbitrary array sizes efficiently. FFTW is typically faster than
Chris@10 55 other publically-available FFT implementations, and is even
Chris@10 56 competitive with vendor-tuned libraries. (See our web page for
Chris@10 57 extensive benchmarks.) To achieve this performance, FFTW uses novel
Chris@10 58 code-generation and runtime self-optimization techniques (along with
Chris@10 59 many other tricks).
Chris@10 60
Chris@10 61 \question 26aug:whereisfftw How do I obtain FFTW?
Chris@10 62
Chris@10 63 FFTW can be found at \docref{the FFTW web page\}. You can also
Chris@10 64 retrieve it from \ftpon ftp.fftw.org in \ftpin /pub/fftw.
Chris@10 65
Chris@10 66 \question 26aug:isfftwfree Is FFTW free software?
Chris@10 67
Chris@10 68 Starting with version 1.3, FFTW is Free Software in the technical
Chris@10 69 sense defined by the Free Software Foundation (see \docref{Categories
Chris@10 70 of Free and Non-Free Software\}), and is distributed under the terms
Chris@10 71 of the GNU General Public License. Previous versions of FFTW were
Chris@10 72 distributed without fee for noncommercial use, but were not
Chris@10 73 technically ``free.''
Chris@10 74
Chris@10 75 Non-free licenses for FFTW are also available that permit different
Chris@10 76 terms of use than the GPL.
Chris@10 77
Chris@10 78 \question 10apr:nonfree What is this about non-free licenses?
Chris@10 79
Chris@10 80 The non-free licenses are for companies that wish to use FFTW in their
Chris@10 81 products but are unwilling to release their software under the GPL
Chris@10 82 (which would require them to release source code and allow free
Chris@10 83 redistribution). Such users can purchase an unlimited-use license
Chris@10 84 from MIT. Contact us for more details.
Chris@10 85
Chris@10 86 We could instead have released FFTW under the LGPL, or even disallowed
Chris@10 87 non-Free usage. Suffice it to say, however, that MIT owns the
Chris@10 88 copyright to FFTW and they only let us GPL it because we convinced
Chris@10 89 them that it would neither affect their licensing revenue nor irritate
Chris@10 90 existing licensees.
Chris@10 91
Chris@10 92 \question 24oct:west In the West? I thought MIT was in the East?
Chris@10 93
Chris@10 94 Not to an Italian. You could say that we're a Spaghetti Western
Chris@10 95 (with apologies to Sergio Leone).
Chris@10 96
Chris@10 97 \comment ######################################################################
Chris@10 98
Chris@10 99 \section Installing FFTW
Chris@10 100
Chris@10 101 \question 26aug:systems Which systems does FFTW run on?
Chris@10 102
Chris@10 103 FFTW is written in ANSI C, and should work on any system with a decent
Chris@10 104 C compiler. (See also \qref runOnWindows, \qref compilerCrashes.)
Chris@10 105 FFTW can also take advantage of certain hardware-specific features,
Chris@10 106 such as cycle counters and SIMD instructions, but this is optional.
Chris@10 107
Chris@10 108 \question 26aug:runOnWindows Does FFTW run on Windows?
Chris@10 109
Chris@10 110 Yes, many people have reported successfully using FFTW on Windows with
Chris@10 111 various compilers. FFTW was not developed on Windows, but the source
Chris@10 112 code is essentially straight ANSI C. See also the \docref{FFTW
Chris@10 113 Windows installation notes\}, \qref compilerCrashes, and \qref
Chris@10 114 vbetalia.
Chris@10 115
Chris@10 116 \question 26aug:compilerCrashes My compiler has trouble with FFTW.
Chris@10 117
Chris@10 118 Complain fiercely to the vendor of the compiler.
Chris@10 119
Chris@10 120 We have successfully used \courier{gcc\} 3.2.x on x86 and PPC, a
Chris@10 121 recent Compaq C compiler for Alpha, version 6 of IBM's \courier{xlc\}
Chris@10 122 compiler for AIX, Intel's \courier{icc\} versions 5-7, and Sun
Chris@10 123 WorkShop \courier{cc\} version 6.
Chris@10 124
Chris@10 125 FFTW is likely to push compilers to their limits, however, and several
Chris@10 126 compiler bugs have been exposed by FFTW. A partial list follows.
Chris@10 127
Chris@10 128 \courier{gcc\} 2.95.x for Solaris/SPARC produces incorrect code for
Chris@10 129 the test program (workaround: recompile the \courier{libbench2\}
Chris@10 130 directory with \courier{-O2\}).
Chris@10 131
Chris@10 132 NetBSD/macppc 1.6 comes with a \courier{gcc\} version that also
Chris@10 133 miscompiles the test program. (Please report a workaround if you know
Chris@10 134 one.)
Chris@10 135
Chris@10 136 \courier{gcc\} 3.2.3 for ARM reportedly crashes during compilation.
Chris@10 137 This bug is reportedly fixed in later versions of \courier{gcc\}.
Chris@10 138
Chris@10 139 Versions 8.0 and 8.1 of Intel's \courier{icc\} falsely claim to be
Chris@10 140 \courier{gcc\}, so you should specify \courier{CC="icc -no-gcc"\};
Chris@10 141 this is automatic in FFTW 3.1. \courier{icc-8.0.066\} reportely
Chris@10 142 produces incorrect code for FFTW 2.1.5, but is fixed in version 8.1.
Chris@10 143 \courier{icc-7.1\} compiler build 20030402Z appears to produce
Chris@10 144 incorrect dependencies, causing the compilation to fail.
Chris@10 145 \courier{icc-7.1\} build 20030307Z appears to work fine. (Use
Chris@10 146 \courier{icc -V\} to check which build you have.) As of 2003/04/18,
Chris@10 147 build 20030402Z appears not to be available any longer on Intel's
Chris@10 148 website, whereas the older build 20030307Z is available.
Chris@10 149
Chris@10 150 \courier{ranlib\} of GNU \courier{binutils\} 2.9.1 on Irix has been
Chris@10 151 observed to corrupt the FFTW libraries, causing a link failure when
Chris@10 152 FFTW is compiled. Since \courier{ranlib\} is completely superfluous
Chris@10 153 on Irix, we suggest deleting it from your system and replacing it with
Chris@10 154 a symbolic link to \courier{/bin/echo\}.
Chris@10 155
Chris@10 156 If support for SIMD instructions is enabled in FFTW, further compiler
Chris@10 157 problems may appear:
Chris@10 158
Chris@10 159 \courier{gcc\} 3.4.[0123] for x86 produces incorrect SSE2 code for
Chris@10 160 FFTW when \courier{-O2\} (the best choice for FFTW) is used, causing
Chris@10 161 FFTW to crash (\courier{make check\} crashes). This bug is fixed in
Chris@10 162 \courier{gcc\} 3.4.4. On x86_64 (amd64/em64t), \courier{gcc\} 3.4.4
Chris@10 163 reportedly still has a similar problem, but this is fixed as of
Chris@10 164 \courier{gcc\} 3.4.6.
Chris@10 165
Chris@10 166 \courier{gcc-3.2\} for x86 produces incorrect SIMD code if
Chris@10 167 \courier{-O3\} is used. The same compiler produces incorrect SIMD
Chris@10 168 code if no optimization is used, too. When using \courier{gcc-3.2\},
Chris@10 169 it is a good idea not to change the default \courier{CFLAGS\} selected
Chris@10 170 by the \courier{configure\} script.
Chris@10 171
Chris@10 172 Some 3.0.x and 3.1.x versions of \courier{gcc\} on \courier{x86\} may
Chris@10 173 crash. \courier{gcc\} so-called 2.96 shipping with RedHat 7.3 crashes
Chris@10 174 when compiling SIMD code. In both cases, please upgrade to
Chris@10 175 \courier{gcc-3.2\} or later.
Chris@10 176
Chris@10 177 Intel's \courier{icc\} 6.0 misaligns SSE constants, but FFTW has a
Chris@10 178 workaround. \courier{icc\} 8.x fails to compile FFTW 3.0.x because it
Chris@10 179 falsely claims to be \courier{gcc\}; we believe this to be a bug in
Chris@10 180 \courier{icc\}, but FFTW 3.1 has a workaround.
Chris@10 181
Chris@10 182 Visual C++ 2003 reportedly produces incorrect code for SSE/SSE2 when
Chris@10 183 compiling FFTW. This bug was reportedly fixed in VC++ 2005;
Chris@10 184 alternatively, you could switch to the Intel compiler. VC++ 6.0 also
Chris@10 185 reportedly produces incorrect code for the file
Chris@10 186 \courier{reodft11e-r2hc-odd.c\} unless optimizations are disabled for
Chris@10 187 that file.
Chris@10 188
Chris@10 189 \courier{gcc\} 2.95 on MacOS X miscompiles AltiVec code (fixed in
Chris@10 190 later versions). \courier{gcc\} 3.2.x miscompiles AltiVec
Chris@10 191 permutations, but FFTW has a workaround. \courier{gcc\} 4.0.1 on
Chris@10 192 MacOS for Intel crashes when compiling FFTW; a workaround is to
Chris@10 193 compile one file without optimization: \courier{cd kernel; make
Chris@10 194 CFLAGS=" " trig.lo\}.
Chris@10 195
Chris@10 196 \courier{gcc\} 4.1.1 reportedly crashes when compiling FFTW for MIPS;
Chris@10 197 the workaround is to compile the file it crashes on
Chris@10 198 (\courier{t2_64.c\}) with a lower optimization level.
Chris@10 199
Chris@10 200 \courier{gcc\} versions 4.1.2 to 4.2.0 for x86 reportedly miscompile
Chris@10 201 FFTW 3.1's test program, causing \courier{make check\} to crash
Chris@10 202 (\courier{gcc\} bug #26528). The bug was reportedly fixed in
Chris@10 203 \courier{gcc\} version 4.2.1 and later. A workaround is to compile
Chris@10 204 \courier{libbench2/verify-lib.c\} without optimization.
Chris@10 205
Chris@10 206 \question 26aug:solarisSucks FFTW does not compile on Solaris, complaining about \courier{const\}.
Chris@10 207
Chris@10 208 We know that at least on Solaris 2.5.x with Sun's compilers 4.2 you
Chris@10 209 might get error messages from \courier{make\} such as
Chris@10 210
Chris@10 211 \courier{"./fftw.h", line 88: warning: const is a keyword in ANSI C\}
Chris@10 212
Chris@10 213 This is the case when the \courier{configure\} script reports that
Chris@10 214 \courier{const\} does not work:
Chris@10 215
Chris@10 216 \courier{checking for working const... (cached) no\}
Chris@10 217
Chris@10 218 You should be aware that Solaris comes with two compilers, namely,
Chris@10 219 \courier{/opt/SUNWspro/SC4.2/bin/cc\} and \courier{/usr/ucb/cc\}. The
Chris@10 220 latter compiler is non-ANSI. Indeed, it is a perverse shell script
Chris@10 221 that calls the real compiler in non-ANSI mode. In order
Chris@10 222 to compile FFTW, change your path so that the right \courier{cc\}
Chris@10 223 is used.
Chris@10 224
Chris@10 225 To know whether your compiler is the right one, type
Chris@10 226 \courier{cc -V\}. If the compiler prints ``\courier{ucbcc\}'',
Chris@10 227 as in
Chris@10 228
Chris@10 229 \courier{ucbcc: WorkShop Compilers 4.2 30 Oct 1996 C 4.2\}
Chris@10 230
Chris@10 231 then the compiler is wrong. The right message is something like
Chris@10 232
Chris@10 233 \courier{cc: WorkShop Compilers 4.2 30 Oct 1996 C 4.2\}
Chris@10 234
Chris@10 235 \question 19mar:3dnow What's the difference between \courier{--enable-3dnow\} and \courier{--enable-k7\}?
Chris@10 236
Chris@10 237 \courier{--enable-k7\} enables 3DNow! instructions on K7 processors
Chris@10 238 (AMD Athlon and its variants). K7 support is provided by assembly
Chris@10 239 routines generated by a special purpose compiler.
Chris@10 240 As of fftw-3.2, --enable-k7 is no longer supported.
Chris@10 241
Chris@10 242 \courier{--enable-3dnow\} enables generic 3DNow! support using
Chris@10 243 \courier{gcc\} builtin functions. This works on earlier AMD
Chris@10 244 processors, but it is not as fast as our special assembly routines.
Chris@10 245 As of fftw-3.1, --enable-3dnow is no longer supported.
Chris@10 246
Chris@10 247 \question 18apr:fma What's the difference between the fma and the non-fma versions?
Chris@10 248
Chris@10 249 The fma version tries to exploit the fused multiply-add instructions
Chris@10 250 implemented in many processors such as PowerPC, ia-64, and MIPS. The
Chris@10 251 two FFTW packages are otherwise identical. In FFTW 3.1, the fma and
Chris@10 252 non-fma versions were merged together into a single package, and the
Chris@10 253 \courier{configure\} script attempts to automatically guess which
Chris@10 254 version to use.
Chris@10 255
Chris@10 256 The FFTW 3.1 \courier{configure\} script enables fma by default on
Chris@10 257 PowerPC, Itanium, and PA-RISC, and disables it otherwise. You can
Chris@10 258 force one or the other by using the \courier{--enable-fma\} or
Chris@10 259 \courier{--disable-fma\} flag for \courier{configure\}.
Chris@10 260
Chris@10 261 Definitely use fma if you have a PowerPC-based system with
Chris@10 262 \courier{gcc\} (or IBM \courier{xlc\}). This includes all GNU/Linux
Chris@10 263 systems for PowerPC and the older PowerPC-based MacOS systems. Also
Chris@10 264 use it on PA-RISC and Itanium with the HP/UX compiler.
Chris@10 265
Chris@10 266 Definitely do not use the fma version if you have an ia-32 processor
Chris@10 267 (Intel, AMD, MacOS on Intel, etcetera).
Chris@10 268
Chris@10 269 For other architectures/compilers, the situation is not so clear. For
Chris@10 270 example, ia-64 has the fma instruction, but \courier{gcc-3.2\} appears
Chris@10 271 not to exploit it correctly. Other compilers may do the right thing,
Chris@10 272 but we have not tried them. Please send us your feedback so that we
Chris@10 273 can update this FAQ entry.
Chris@10 274
Chris@10 275 \question 26aug:languages Which language is FFTW written in?
Chris@10 276
Chris@10 277 FFTW is written in ANSI C. Most of the code, however, was
Chris@10 278 automatically generated by a program called \courier{genfft\}, written
Chris@10 279 in the Objective Caml dialect of ML. You do not need to know ML or to
Chris@10 280 have an Objective Caml compiler in order to use FFTW.
Chris@10 281
Chris@10 282 \courier{genfft\} is provided with the FFTW sources, which means that
Chris@10 283 you can play with the code generator if you want. In this case, you
Chris@10 284 need a working Objective Caml system. Objective Caml is available
Chris@10 285 from \docref{the Caml web page\}.
Chris@10 286
Chris@10 287 \question 26aug:fortran Can I call FFTW from Fortran?
Chris@10 288
Chris@10 289 Yes, FFTW (versions 1.3 and higher) contains a Fortran-callable
Chris@10 290 interface, documented in the FFTW manual.
Chris@10 291
Chris@10 292 By default, FFTW configures its Fortran interface to work with the
Chris@10 293 first compiler it finds, e.g. \courier{g77\}. To configure for a
Chris@10 294 different, incompatible Fortran compiler \courier{foobar\}, use
Chris@10 295 \courier{./configure F77=foobar\} when installing FFTW. (In the case
Chris@10 296 of \courier{g77\}, however, FFTW 3.x also includes an extra set of
Chris@10 297 Fortran-callable routines with one less underscore at the end of
Chris@10 298 identifiers, which should cover most other Fortran compilers on Linux
Chris@10 299 at least.)
Chris@10 300
Chris@10 301 \question 26aug:cplusplus Can I call FFTW from C++?
Chris@10 302
Chris@10 303 Most definitely. FFTW should compile and/or link under any C++
Chris@10 304 compiler. Moreover, it is likely that the C++ \courier{<complex>\}
Chris@10 305 template class is bit-compatible with FFTW's complex-number format
Chris@10 306 (see the FFTW manual for more details).
Chris@10 307
Chris@10 308 \question 26aug:whynotfortran Why isn't FFTW written in Fortran/C++?
Chris@10 309
Chris@10 310 Because we don't like those languages, and neither approaches the
Chris@10 311 portability of C.
Chris@10 312
Chris@10 313 \question 29mar:singleprec How do I compile FFTW to run in single precision?
Chris@10 314
Chris@10 315 On a Unix system: \courier{configure --enable-float\}. On a non-Unix
Chris@10 316 system: edit \courier{config.h\} to \courier{#define\} the symbol
Chris@10 317 \courier{FFTW_SINGLE\} (for FFTW 3.x). In both cases, you must then
Chris@10 318 recompile FFTW. In FFTW 3, all FFTW identifiers will then begin with
Chris@10 319 \courier{fftwf_\} instead of \courier{fftw_\}.
Chris@10 320
Chris@10 321 \question 28mar:64bitk7 --enable-k7 does not work on x86-64
Chris@10 322
Chris@10 323 Support for --enable-k7 was discontinued in fftw-3.2.
Chris@10 324
Chris@10 325 The fftw-3.1 release supports --enable-k7. This option only works on
Chris@10 326 32-bit x86 machines that implement 3DNow!, including the AMD Athlon
Chris@10 327 and the AMD Opteron in 32-bit mode. --enable-k7 does not work on AMD
Chris@10 328 Opteron in 64-bit mode. Use --enable-sse for x86-64 machines.
Chris@10 329
Chris@10 330 FFTW supports 3DNow! by means of assembly code generated by a
Chris@10 331 special-purpose compiler. It is hard to produce assembly code that
Chris@10 332 works in both 32-bit and 64-bit mode.
Chris@10 333
Chris@10 334 \comment ######################################################################
Chris@10 335
Chris@10 336 \section Using FFTW
Chris@10 337
Chris@10 338 \question 15mar:fftw2to3 Why not support the FFTW 2 interface in FFTW 3?
Chris@10 339
Chris@10 340 FFTW 3 has semantics incompatible with earlier versions: its plans can
Chris@10 341 only be used for a given stride, multiplicity, and other
Chris@10 342 characteristics of the input and output arrays; these stronger
Chris@10 343 semantics are necessary for performance reasons. Thus, it is
Chris@10 344 impossible to efficiently emulate the older interface (whose plans can
Chris@10 345 be used for any transform of the same size). We believe that it
Chris@10 346 should be possible to upgrade most programs without any difficulty,
Chris@10 347 however.
Chris@10 348
Chris@10 349 \question 20mar:planperarray Why do FFTW 3 plans encapsulate the input/output arrays and not just the algorithm?
Chris@10 350
Chris@10 351 There are several reasons:
Chris@10 352
Chris@10 353 \call startlist
Chris@10 354 \call item
Chris@10 355 It was important for performance reasons that the plan be specific to
Chris@10 356 array characteristics like the stride (and alignment, for SIMD), and
Chris@10 357 requiring that the user maintain these invariants is error prone.
Chris@10 358 \call item
Chris@10 359 In most high-performance applications, as far as we can tell, you are
Chris@10 360 usually transforming the same array over and over, so FFTW's semantics
Chris@10 361 should not be a burden.
Chris@10 362 \call item
Chris@10 363 If you need to transform another array of the same size, creating a
Chris@10 364 new plan once the first exists is a cheap operation.
Chris@10 365 \call item
Chris@10 366 If you need to transform many arrays of the same size at once, you
Chris@10 367 should really use the \courier{plan_many\} routines in FFTW's "advanced"
Chris@10 368 interface.
Chris@10 369 \call item
Chris@10 370 If the abovementioned array characteristics are the same, you are
Chris@10 371 willing to pay close attention to the documentation, and you really
Chris@10 372 need to, we provide a "new-array execution" interface to apply a plan
Chris@10 373 to a new array.
Chris@10 374 \call endlist
Chris@10 375
Chris@10 376 \question 25may:slow FFTW seems really slow.
Chris@10 377
Chris@10 378 You are probably recreating the plan before every transform, rather
Chris@10 379 than creating it once and reusing it for all transforms of the same
Chris@10 380 size. FFTW is designed to be used in the following way:
Chris@10 381
Chris@10 382 \call startlist
Chris@10 383 \call item
Chris@10 384 First, you create a plan. This will take several seconds.
Chris@10 385 \call item
Chris@10 386 Then, you reuse the plan many times to perform FFTs. These are fast.
Chris@10 387 \call endlist
Chris@10 388
Chris@10 389 If you don't need to compute many transforms and the time for the
Chris@10 390 planner is significant, you have two options. First, you can use the
Chris@10 391 \courier{FFTW_ESTIMATE\} option in the planner, which uses heuristics
Chris@10 392 instead of runtime measurements and produces a good plan in a short
Chris@10 393 time. Second, you can use the wisdom feature to precompute the plan;
Chris@10 394 see \qref savePlans
Chris@10 395
Chris@10 396 \question 22oct:slows FFTW slows down after repeated calls.
Chris@10 397
Chris@10 398 Probably, NaNs or similar are creeping into your data, and the
Chris@10 399 slowdown is due to the resulting floating-point exceptions. For
Chris@10 400 example, be aware that repeatedly FFTing the same array is a diverging
Chris@10 401 process (because FFTW computes the unnormalized transform).
Chris@10 402
Chris@10 403 \question 22oct:segfault An FFTW routine is crashing when I call it.
Chris@10 404
Chris@10 405 Did the FFTW test programs pass (\courier{make check\}, or \courier{cd
Chris@10 406 tests; make bigcheck\} if you want to be paranoid)? If so, you almost
Chris@10 407 certainly have a bug in your own code. For example, you could be
Chris@10 408 passing invalid arguments (such as wrongly-sized arrays) to FFTW, or
Chris@10 409 you could simply have memory corruption elsewhere in your program that
Chris@10 410 causes random crashes later on. Please don't complain to us unless
Chris@10 411 you can come up with a minimal self-contained program (preferably
Chris@10 412 under 30 lines) that illustrates the problem.
Chris@10 413
Chris@10 414 \question 22oct:fortran64 My Fortran program crashes when calling FFTW.
Chris@10 415
Chris@10 416 As described in the manual, on 64-bit machines you must store the
Chris@10 417 plans in variables large enough to hold a pointer, for example
Chris@10 418 \courier{integer*8\}. We recommend using \courier{integer*8\} on
Chris@10 419 32-bit machines as well, to simplify porting.
Chris@10 420
Chris@10 421 \question 24mar:conventions FFTW gives results different from my old FFT.
Chris@10 422
Chris@10 423 People follow many different conventions for the DFT, and you should
Chris@10 424 be sure to know the ones that we use (described in the FFTW manual).
Chris@10 425 In particular, you should be aware that the
Chris@10 426 \courier{FFTW_FORWARD\}/\courier{FFTW_BACKWARD\} directions correspond
Chris@10 427 to signs of -1/+1 in the exponent of the DFT definition.
Chris@10 428 (\italic{Numerical Recipes\} uses the opposite convention.)
Chris@10 429
Chris@10 430 You should also know that we compute an unnormalized transform. In
Chris@10 431 contrast, Matlab is an example of program that computes a normalized
Chris@10 432 transform. See \qref whyscaled.
Chris@10 433
Chris@10 434 Finally, note that floating-point arithmetic is not exact, so
Chris@10 435 different FFT algorithms will give slightly different results (on the
Chris@10 436 order of the numerical accuracy; typically a fractional difference of
Chris@10 437 1e-15 or so in double precision).
Chris@10 438
Chris@10 439 \question 31aug:nondeterministic FFTW gives different results between runs
Chris@10 440
Chris@10 441 If you use \courier{FFTW_MEASURE\} or \courier{FFTW_PATIENT\} mode,
Chris@10 442 then the algorithm FFTW employs is not deterministic: it depends on
Chris@10 443 runtime performance measurements. This will cause the results to vary
Chris@10 444 slightly from run to run. However, the differences should be slight,
Chris@10 445 on the order of the floating-point precision, and therefore should
Chris@10 446 have no practical impact on most applications.
Chris@10 447
Chris@10 448 If you use saved plans (wisdom) or \courier{FFTW_ESTIMATE\} mode,
Chris@10 449 however, then the algorithm is deterministic and the results should be
Chris@10 450 identical between runs.
Chris@10 451
Chris@10 452 \question 26aug:savePlans Can I save FFTW's plans?
Chris@10 453
Chris@10 454 Yes. Starting with version 1.2, FFTW provides the \courier{wisdom\}
Chris@10 455 mechanism for saving plans; see the FFTW manual.
Chris@10 456
Chris@10 457 \question 14sep:whyscaled Why does your inverse transform return a scaled result?
Chris@10 458
Chris@10 459 Computing the forward transform followed by the backward transform (or
Chris@10 460 vice versa) yields the original array scaled by the size of the array.
Chris@10 461 (For multi-dimensional transforms, the size of the array is the
Chris@10 462 product of the dimensions.) We could, instead, have chosen a
Chris@10 463 normalization that would have returned the unscaled array. Or, to
Chris@10 464 accomodate the many conventions in this matter, the transform routines
Chris@10 465 could have accepted a "scale factor" parameter. We did not do this,
Chris@10 466 however, for two reasons. First, we didn't want to sacrifice
Chris@10 467 performance in the common case where the scale factor is 1. Second, in
Chris@10 468 real applications the FFT is followed or preceded by some computation
Chris@10 469 on the data, into which the scale factor can typically be absorbed at
Chris@10 470 little or no cost.
Chris@10 471
Chris@10 472 \question 02dec:centerorigin How can I make FFTW put the origin (zero frequency) at the center of its output?
Chris@10 473
Chris@10 474 For human viewing of a spectrum, it is often convenient to put the
Chris@10 475 origin in frequency space at the center of the output array, rather
Chris@10 476 than in the zero-th element (the default in FFTW). If all of the
Chris@10 477 dimensions of your array are even, you can accomplish this by simply
Chris@10 478 multiplying each element of the input array by (-1)^(i + j + ...),
Chris@10 479 where i, j, etcetera are the indices of the element. (This trick is a
Chris@10 480 general property of the DFT, and is not specific to FFTW.)
Chris@10 481
Chris@10 482 \question 08may:imageaudio How do I FFT an image/audio file in \italic{foobar\} format?
Chris@10 483
Chris@10 484 FFTW performs an FFT on an array of floating-point values. You can
Chris@10 485 certainly use it to compute the transform of an image or audio stream,
Chris@10 486 but you are responsible for figuring out your data format and
Chris@10 487 converting it to the form FFTW requires.
Chris@10 488
Chris@10 489 \question 09apr:linkfails My program does not link (on Unix).
Chris@10 490
Chris@10 491 The libraries must be listed in the correct order (\courier{-lfftw3
Chris@10 492 -lm\} for FFTW 3.x) and \italic{after\} your program sources/objects.
Chris@10 493 (The general rule is that if \italic{A\} uses \italic{B\}, then
Chris@10 494 \italic{A\} must be listed before \italic{B\} in the link command.).
Chris@10 495
Chris@10 496 \question 15mar:linkheader I included your header, but linking still fails.
Chris@10 497
Chris@10 498 You're a C++ programmer, aren't you? You have to compile the FFTW
Chris@10 499 library and link it into your program, not just \courier{#include
Chris@10 500 <fftw3.h>\}. (Yes, this is really a FAQ.)
Chris@10 501
Chris@10 502 \question 22oct:nostack My program crashes, complaining about stack space.
Chris@10 503
Chris@10 504 You cannot declare large arrays with automatic storage (e.g. via
Chris@10 505 \courier{fftw_complex array[N]\}); you should use
Chris@10 506 \courier{fftw_malloc\} (or equivalent) to allocate the arrays you want
Chris@10 507 to transform if they are larger than a few hundred elements.
Chris@10 508
Chris@10 509 \question 13may:leaks FFTW seems to have a memory leak.
Chris@10 510
Chris@10 511 After you create a plan, FFTW caches the information required to
Chris@10 512 quickly recreate the plan. (See \qref savePlans) It also maintains a
Chris@10 513 small amount of other persistent memory. You can deallocate all of
Chris@10 514 FFTW's internally allocated memory, if you wish, by calling
Chris@10 515 \courier{fftw_cleanup()\}, as documented in the manual.
Chris@10 516
Chris@10 517 \question 16may:allzero The output of FFTW's transform is all zeros.
Chris@10 518
Chris@10 519 You should initialize your input array \italic{after\} creating the
Chris@10 520 plan, unless you use \courier{FFTW_ESTIMATE\}: planning with
Chris@10 521 \courier{FFTW_MEASURE\} or \courier{FFTW_PATIENT\} overwrites the
Chris@10 522 input/output arrays, as described in the manual.
Chris@10 523
Chris@10 524 \question 05sep:vbetalia How do I call FFTW from the Microsoft language du jour?
Chris@10 525
Chris@10 526 Please \italic{do not\} ask us Windows-specific questions. We do not
Chris@10 527 use Windows. We know nothing about Visual Basic, Visual C++, or .NET.
Chris@10 528 Please find the appropriate Usenet discussion group and ask your
Chris@10 529 question there. See also \qref runOnWindows.
Chris@10 530
Chris@10 531 \question 15oct:pruned Can I compute only a subset of the DFT outputs?
Chris@10 532
Chris@10 533 In general, no, an FFT intrinsically computes all outputs from all
Chris@10 534 inputs. In principle, there is something called a \italic{pruned
Chris@10 535 FFT\} that can do what you want, but to compute K outputs out of N the
Chris@10 536 complexity is in general O(N log K) instead of O(N log N), thus saving
Chris@10 537 only a small additive factor in the log. (The same argument holds if
Chris@10 538 you instead have only K nonzero inputs.)
Chris@10 539
Chris@10 540 There are some specific cases in which you can get the O(N log K)
Chris@10 541 performance benefits easily, however, by combining a few ordinary
Chris@10 542 FFTs. In particular, the case where you want the first K outputs,
Chris@10 543 where K divides N, can be handled by performing N/K transforms of size
Chris@10 544 K and then summing the outputs multiplied by appropriate phase
Chris@10 545 factors. For more details, see \docref{pruned FFTs with FFTW\}.
Chris@10 546
Chris@10 547 There are also some algorithms that compute pruned transforms
Chris@10 548 \italic{approximately\}, but they are beyond the scope of this FAQ.
Chris@10 549
Chris@10 550 \question 21jan:transpose Can I use FFTW's routines for in-place and out-of-place matrix transposition?
Chris@10 551
Chris@10 552 You can use the FFTW guru interface to create a rank-0 transform of
Chris@10 553 vector rank 2 where the vector strides are transposed. (A rank-0
Chris@10 554 transform is equivalent to a 1D transform of size 1, which. just
Chris@10 555 copies the input into the output.) Specifying the same location for
Chris@10 556 the input and output makes the transpose in-place.
Chris@10 557
Chris@10 558 For double-valued data stored in row-major format, plan creation looks like
Chris@10 559 this:
Chris@10 560
Chris@10 561 \verbatim
Chris@10 562 fftw_plan plan_transpose(int rows, int cols, double *in, double *out)
Chris@10 563 {
Chris@10 564 const unsigned flags = FFTW_ESTIMATE; /* other flags are possible */
Chris@10 565 fftw_iodim howmany_dims[2];
Chris@10 566
Chris@10 567 howmany_dims[0].n = rows;
Chris@10 568 howmany_dims[0].is = cols;
Chris@10 569 howmany_dims[0].os = 1;
Chris@10 570
Chris@10 571 howmany_dims[1].n = cols;
Chris@10 572 howmany_dims[1].is = 1;
Chris@10 573 howmany_dims[1].os = rows;
Chris@10 574
Chris@10 575 return fftw_plan_guru_r2r(/*rank=*/ 0, /*dims=*/ NULL,
Chris@10 576 /*howmany_rank=*/ 2, howmany_dims,
Chris@10 577 in, out, /*kind=*/ NULL, flags);
Chris@10 578 }
Chris@10 579 \endverbatim
Chris@10 580
Chris@10 581 (This entry was written by Rhys Ulerich.)
Chris@10 582
Chris@10 583 \comment ######################################################################
Chris@10 584
Chris@10 585 \section Internals of FFTW
Chris@10 586
Chris@10 587 \question 26aug:howworks How does FFTW work?
Chris@10 588
Chris@10 589 The innovation (if it can be so called) in FFTW consists in having a
Chris@10 590 variety of composable \italic{solvers\}, representing different FFT
Chris@10 591 algorithms and implementation strategies, whose combination into a
Chris@10 592 particular \italic{plan\} for a given size can be determined at
Chris@10 593 runtime according to the characteristics of your machine/compiler.
Chris@10 594 This peculiar software architecture allows FFTW to adapt itself to
Chris@10 595 almost any machine.
Chris@10 596
Chris@10 597 For more details (albeit somewhat outdated), see the paper "FFTW: An
Chris@10 598 Adaptive Software Architecture for the FFT", by M. Frigo and
Chris@10 599 S. G. Johnson, \italic{Proc. ICASSP\} 3, 1381 (1998), also
Chris@10 600 available at \docref{the FFTW web page\}.
Chris@10 601
Chris@10 602 \question 26aug:whyfast Why is FFTW so fast?
Chris@10 603
Chris@10 604 This is a complex question, and there is no simple answer. In fact,
Chris@10 605 the authors do not fully know the answer, either. In addition to many
Chris@10 606 small performance hacks throughout FFTW, there are three general
Chris@10 607 reasons for FFTW's speed.
Chris@10 608
Chris@10 609 \call startlist
Chris@10 610 \call item
Chris@10 611 FFTW uses a variety of FFT algorithms and implementation styles
Chris@10 612 that can be arbitrarily composed to adapt itself to
Chris@10 613 a machine. See \qref howworks.
Chris@10 614 \call item
Chris@10 615 FFTW uses a code generator to produce highly-optimized
Chris@10 616 routines for computing small transforms.
Chris@10 617 \call item
Chris@10 618 FFTW uses explicit divide-and-conquer to take advantage
Chris@10 619 of the memory hierarchy.
Chris@10 620 \call endlist
Chris@10 621
Chris@10 622 For more details (albeit somewhat outdated), see the paper "FFTW: An
Chris@10 623 Adaptive Software Architecture for the FFT", by M. Frigo and
Chris@10 624 S. G. Johnson, \italic{Proc. ICASSP\} 3, 1381 (1998),
Chris@10 625 available along with other references at \docref{the FFTW web page\}.
Chris@10 626
Chris@10 627 \comment ######################################################################
Chris@10 628
Chris@10 629 \section Known bugs
Chris@10 630
Chris@10 631 \question 27aug:rfftwndbug FFTW 1.1 crashes in rfftwnd on Linux.
Chris@10 632
Chris@10 633 This bug was fixed in FFTW 1.2. There was a bug in \courier{rfftwnd\}
Chris@10 634 causing an incorrect amount of memory to be allocated. The bug showed
Chris@10 635 up in Linux with libc-5.3.12 (and nowhere else that we know of).
Chris@10 636
Chris@10 637 \question 15oct:fftwmpibug The MPI transforms in FFTW 1.2 give incorrect results/leak memory.
Chris@10 638
Chris@10 639 These bugs were corrected in FFTW 1.2.1. The MPI transforms (really,
Chris@10 640 just the transpose routines) in FFTW 1.2 had bugs that could cause
Chris@10 641 errors in some situations.
Chris@10 642
Chris@10 643 \question 05nov:testsingbug The test programs in FFTW 1.2.1 fail when I change FFTW to use single precision.
Chris@10 644
Chris@10 645 This bug was fixed in FFTW 1.3. (Older versions of FFTW did
Chris@10 646 work in single precision, but the test programs didn't--the error
Chris@10 647 tolerances in the tests were set for double precision.)
Chris@10 648
Chris@10 649 \question 24mar:teststoobig The test program in FFTW 1.2.1 fails for n > 46340.
Chris@10 650
Chris@10 651 This bug was fixed in FFTW 1.3. FFTW 1.2.1 produced the right answer,
Chris@10 652 but the test program was wrong. For large n, n*n in the naive
Chris@10 653 transform that we used for comparison overflows 32 bit integer
Chris@10 654 precision, breaking the test.
Chris@10 655
Chris@10 656 \question 24aug:linuxthreads The threaded code fails on Linux Redhat 5.0
Chris@10 657
Chris@10 658 We had problems with glibc-2.0.5. The code should work with
Chris@10 659 glibc-2.0.7.
Chris@10 660
Chris@10 661 \question 26sep:bigrfftwnd FFTW 2.0's rfftwnd fails for rank > 1 transforms with a final dimension >= 65536.
Chris@10 662
Chris@10 663 This bug was fixed in FFTW 2.0.1. (There was a 32-bit integer overflow due
Chris@10 664 to a poorly-parenthesized expression.)
Chris@10 665
Chris@10 666 \question 26mar:primebug FFTW 2.0's complex transforms give the wrong results with prime factors 17 to 97.
Chris@10 667
Chris@10 668 There was a bug in the complex transforms that could cause incorrect
Chris@10 669 results under (hopefully rare) circumstances for lengths with
Chris@10 670 intermediate-size prime factors (17-97). This bug was fixed in FFTW
Chris@10 671 2.1.1.
Chris@10 672
Chris@10 673 \question 05apr:mpichbug FFTW 2.1.1's MPI test programs crash with MPICH.
Chris@10 674
Chris@10 675 This bug was fixed in FFTW 2.1.2. The 2.1/2.1.1 MPI test programs crashed
Chris@10 676 when using the MPICH implementation of MPI with the \courier{ch_p4\}
Chris@10 677 device (TCP/IP); the transforms themselves worked fine.
Chris@10 678
Chris@10 679 \question 25may:aixthreadbug FFTW 2.1.2's multi-threaded transforms don't work on AIX.
Chris@10 680
Chris@10 681 This bug was fixed in FFTW 2.1.3. The multi-threaded transforms in
Chris@10 682 previous versions didn't work with AIX's \courier{pthreads\}
Chris@10 683 implementation, which idiosyncratically creates threads in detached
Chris@10 684 (non-joinable) mode by default.
Chris@10 685
Chris@10 686 \question 27sep:bigprimebug FFTW 2.1.2's complex transforms give incorrect results for large prime sizes.
Chris@10 687
Chris@10 688 This bug was fixed in FFTW 2.1.3. FFTW's complex-transform algorithm
Chris@10 689 for prime sizes (in versions 2.0 to 2.1.2) had an integer overflow
Chris@10 690 problem that caused incorrect results for many primes greater than
Chris@10 691 32768 (on 32-bit machines). (Sizes without large prime factors are
Chris@10 692 not affected.)
Chris@10 693
Chris@10 694 \question 25may:solaristhreadbug FFTW 2.1.3's multi-threaded transforms don't give any speedup on Solaris.
Chris@10 695
Chris@10 696 This bug was fixed in FFTW 2.1.4. (By default, Solaris creates
Chris@10 697 threads that do not parallelize over multiple processors, so one has
Chris@10 698 to request the proper behavior specifically.)
Chris@10 699
Chris@10 700 \question 03may:aixflags FFTW 2.1.3 crashes on AIX.
Chris@10 701
Chris@10 702 The FFTW 2.1.3 \courier{configure\} script picked incorrect compiler
Chris@10 703 flags for the \courier{xlc\} compiler on newer IBM processors. This
Chris@10 704 is fixed in FFTW 2.1.4.
Chris@10 705
Chris@10 706 \comment Here it ends!
Chris@10 707