annotate src/fftw-3.3.5/doc/FAQ/fftw-faq.ascii @ 169:223a55898ab9 tip default

Add null config files
author Chris Cannam <cannam@all-day-breakfast.com>
date Mon, 02 Mar 2020 14:03:47 +0000
cannam@127 1 FFTW FREQUENTLY ASKED QUESTIONS WITH ANSWERS
cannam@127 2 30 Jul 2016
cannam@127 3 Matteo Frigo
cannam@127 4 Steven G. Johnson
cannam@127 5 <fftw@fftw.org>
cannam@127 6
cannam@127 7 This is the list of Frequently Asked Questions about FFTW, a collection of
cannam@127 8 fast C routines for computing the Discrete Fourier Transform in one or
cannam@127 9 more dimensions.
cannam@127 10
cannam@127 11 ===============================================================================
cannam@127 12
cannam@127 13 Index
cannam@127 14
cannam@127 15 Section 1. Introduction and General Information
cannam@127 16 Q1.1 What is FFTW?
cannam@127 17 Q1.2 How do I obtain FFTW?
cannam@127 18 Q1.3 Is FFTW free software?
cannam@127 19 Q1.4 What is this about non-free licenses?
cannam@127 20 Q1.5 In the West? I thought MIT was in the East?
cannam@127 21
cannam@127 22 Section 2. Installing FFTW
cannam@127 23 Q2.1 Which systems does FFTW run on?
cannam@127 24 Q2.2 Does FFTW run on Windows?
cannam@127 25 Q2.3 My compiler has trouble with FFTW.
cannam@127 26 Q2.4 FFTW does not compile on Solaris, complaining about const.
cannam@127 27 Q2.5 What's the difference between --enable-3dnow and --enable-k7?
cannam@127 28 Q2.6 What's the difference between the fma and the non-fma versions?
cannam@127 29 Q2.7 Which language is FFTW written in?
cannam@127 30 Q2.8 Can I call FFTW from Fortran?
cannam@127 31 Q2.9 Can I call FFTW from C++?
cannam@127 32 Q2.10 Why isn't FFTW written in Fortran/C++?
cannam@127 33 Q2.11 How do I compile FFTW to run in single precision?
cannam@127 34 Q2.12 --enable-k7 does not work on x86-64
cannam@127 35
cannam@127 36 Section 3. Using FFTW
cannam@127 37 Q3.1 Why not support the FFTW 2 interface in FFTW 3?
cannam@127 38 Q3.2 Why do FFTW 3 plans encapsulate the input/output arrays and not just the algorithm?
cannam@127 39 Q3.3 FFTW seems really slow.
cannam@127 40 Q3.4 FFTW slows down after repeated calls.
cannam@127 41 Q3.5 An FFTW routine is crashing when I call it.
cannam@127 42 Q3.6 My Fortran program crashes when calling FFTW.
cannam@127 43 Q3.7 FFTW gives results different from my old FFT.
cannam@127 44 Q3.8 FFTW gives different results between runs
cannam@127 45 Q3.9 Can I save FFTW's plans?
cannam@127 46 Q3.10 Why does your inverse transform return a scaled result?
cannam@127 47 Q3.11 How can I make FFTW put the origin (zero frequency) at the center of its output?
cannam@127 48 Q3.12 How do I FFT an image/audio file in *foobar* format?
cannam@127 49 Q3.13 My program does not link (on Unix).
cannam@127 50 Q3.14 I included your header, but linking still fails.
cannam@127 51 Q3.15 My program crashes, complaining about stack space.
cannam@127 52 Q3.16 FFTW seems to have a memory leak.
cannam@127 53 Q3.17 The output of FFTW's transform is all zeros.
cannam@127 54 Q3.18 How do I call FFTW from the Microsoft language du jour?
cannam@127 55 Q3.19 Can I compute only a subset of the DFT outputs?
cannam@127 56 Q3.20 Can I use FFTW's routines for in-place and out-of-place matrix tra
cannam@127 57
cannam@127 58 Section 4. Internals of FFTW
cannam@127 59 Q4.1 How does FFTW work?
cannam@127 60 Q4.2 Why is FFTW so fast?
cannam@127 61
cannam@127 62 Section 5. Known bugs
cannam@127 63 Q5.1 FFTW 1.1 crashes in rfftwnd on Linux.
cannam@127 64 Q5.2 The MPI transforms in FFTW 1.2 give incorrect results/leak memory.
cannam@127 65 Q5.3 The test programs in FFTW 1.2.1 fail when I change FFTW to use sin
cannam@127 66 Q5.4 The test program in FFTW 1.2.1 fails for n > 46340.
cannam@127 67 Q5.5 The threaded code fails on Linux Redhat 5.0
cannam@127 68 Q5.6 FFTW 2.0's rfftwnd fails for rank > 1 transforms with a final dime
cannam@127 69 Q5.7 FFTW 2.0's complex transforms give the wrong results with prime fa
cannam@127 70 Q5.8 FFTW 2.1.1's MPI test programs crash with MPICH.
cannam@127 71 Q5.9 FFTW 2.1.2's multi-threaded transforms don't work on AIX.
cannam@127 72 Q5.10 FFTW 2.1.2's complex transforms give incorrect results for large p
cannam@127 73 Q5.11 FFTW 2.1.3's multi-threaded transforms don't give any speedup on S
cannam@127 74 Q5.12 FFTW 2.1.3 crashes on AIX.
cannam@127 75
cannam@127 76 ===============================================================================
cannam@127 77
cannam@127 78 Section 1. Introduction and General Information
cannam@127 79
cannam@127 80 Q1.1 What is FFTW?
cannam@127 81 Q1.2 How do I obtain FFTW?
cannam@127 82 Q1.3 Is FFTW free software?
cannam@127 83 Q1.4 What is this about non-free licenses?
cannam@127 84 Q1.5 In the West? I thought MIT was in the East?
cannam@127 85
cannam@127 86 -------------------------------------------------------------------------------
cannam@127 87
cannam@127 88 Question 1.1. What is FFTW?
cannam@127 89
cannam@127 90 FFTW is a free collection of fast C routines for computing the Discrete
cannam@127 91 Fourier Transform in one or more dimensions. It includes complex, real,
cannam@127 92 symmetric, and parallel transforms, and can handle arbitrary array sizes
cannam@127 93 efficiently. FFTW is typically faster than other publicly available FFT
cannam@127 94 implementations, and is even competitive with vendor-tuned libraries.
cannam@127 95 (See our web page for extensive benchmarks.) To achieve this performance,
cannam@127 96 FFTW uses novel code-generation and runtime self-optimization techniques
cannam@127 97 (along with many other tricks).
cannam@127 98
cannam@127 99 -------------------------------------------------------------------------------
cannam@127 100
cannam@127 101 Question 1.2. How do I obtain FFTW?
cannam@127 102
cannam@127 103 FFTW can be found at the FFTW web page. You can also retrieve it from
cannam@127 104 ftp.fftw.org in /pub/fftw.
cannam@127 105
cannam@127 106 -------------------------------------------------------------------------------
cannam@127 107
cannam@127 108 Question 1.3. Is FFTW free software?
cannam@127 109
cannam@127 110 Starting with version 1.3, FFTW is Free Software in the technical sense
cannam@127 111 defined by the Free Software Foundation (see Categories of Free and
cannam@127 112 Non-Free Software), and is distributed under the terms of the GNU General
cannam@127 113 Public License. Previous versions of FFTW were distributed without fee
cannam@127 114 for noncommercial use, but were not technically ``free.''
cannam@127 115
cannam@127 116 Non-free licenses for FFTW are also available that permit different terms
cannam@127 117 of use than the GPL.
cannam@127 118
cannam@127 119 -------------------------------------------------------------------------------
cannam@127 120
cannam@127 121 Question 1.4. What is this about non-free licenses?
cannam@127 122
cannam@127 123 The non-free licenses are for companies that wish to use FFTW in their
cannam@127 124 products but are unwilling to release their software under the GPL (which
cannam@127 125 would require them to release source code and allow free redistribution).
cannam@127 126 Such users can purchase an unlimited-use license from MIT. Contact us for
cannam@127 127 more details.
cannam@127 128
cannam@127 129 We could instead have released FFTW under the LGPL, or even disallowed
cannam@127 130 non-Free usage. Suffice it to say, however, that MIT owns the copyright
cannam@127 131 to FFTW and they only let us GPL it because we convinced them that it
cannam@127 132 would neither affect their licensing revenue nor irritate existing
cannam@127 133 licensees.
cannam@127 134
cannam@127 135 -------------------------------------------------------------------------------
cannam@127 136
cannam@127 137 Question 1.5. In the West? I thought MIT was in the East?
cannam@127 138
cannam@127 139 Not to an Italian. You could say that we're a Spaghetti Western (with
cannam@127 140 apologies to Sergio Leone).
cannam@127 141
cannam@127 142 ===============================================================================
cannam@127 143
cannam@127 144 Section 2. Installing FFTW
cannam@127 145
cannam@127 146 Q2.1 Which systems does FFTW run on?
cannam@127 147 Q2.2 Does FFTW run on Windows?
cannam@127 148 Q2.3 My compiler has trouble with FFTW.
cannam@127 149 Q2.4 FFTW does not compile on Solaris, complaining about const.
cannam@127 150 Q2.5 What's the difference between --enable-3dnow and --enable-k7?
cannam@127 151 Q2.6 What's the difference between the fma and the non-fma versions?
cannam@127 152 Q2.7 Which language is FFTW written in?
cannam@127 153 Q2.8 Can I call FFTW from Fortran?
cannam@127 154 Q2.9 Can I call FFTW from C++?
cannam@127 155 Q2.10 Why isn't FFTW written in Fortran/C++?
cannam@127 156 Q2.11 How do I compile FFTW to run in single precision?
cannam@127 157 Q2.12 --enable-k7 does not work on x86-64
cannam@127 158
cannam@127 159 -------------------------------------------------------------------------------
cannam@127 160
cannam@127 161 Question 2.1. Which systems does FFTW run on?
cannam@127 162
cannam@127 163 FFTW is written in ANSI C, and should work on any system with a decent C
cannam@127 164 compiler. (See also Q2.2 `Does FFTW run on Windows?', Q2.3 `My compiler
cannam@127 165 has trouble with FFTW.'.) FFTW can also take advantage of certain
cannam@127 166 hardware-specific features, such as cycle counters and SIMD instructions,
cannam@127 167 but this is optional.
cannam@127 168
cannam@127 169 -------------------------------------------------------------------------------
cannam@127 170
cannam@127 171 Question 2.2. Does FFTW run on Windows?
cannam@127 172
cannam@127 173 Yes, many people have reported successfully using FFTW on Windows with
cannam@127 174 various compilers. FFTW was not developed on Windows, but the source code
cannam@127 175 is essentially straight ANSI C. See also the FFTW Windows installation
cannam@127 176 notes, Q2.3 `My compiler has trouble with FFTW.', and Q3.18 `How do I call
cannam@127 177 FFTW from the Microsoft language du jour?'.
cannam@127 178
cannam@127 179 -------------------------------------------------------------------------------
cannam@127 180
cannam@127 181 Question 2.3. My compiler has trouble with FFTW.
cannam@127 182
cannam@127 183 Complain fiercely to the vendor of the compiler.
cannam@127 184
cannam@127 185 We have successfully used gcc 3.2.x on x86 and PPC, a recent Compaq C
cannam@127 186 compiler for Alpha, version 6 of IBM's xlc compiler for AIX, Intel's icc
cannam@127 187 versions 5-7, and Sun WorkShop cc version 6.
cannam@127 188
cannam@127 189 FFTW is likely to push compilers to their limits, however, and several
cannam@127 190 compiler bugs have been exposed by FFTW. A partial list follows.
cannam@127 191
cannam@127 192 gcc 2.95.x for Solaris/SPARC produces incorrect code for the test program
cannam@127 193 (workaround: recompile the libbench2 directory with -O2).
cannam@127 194
cannam@127 195 NetBSD/macppc 1.6 comes with a gcc version that also miscompiles the test
cannam@127 196 program. (Please report a workaround if you know one.)
cannam@127 197
cannam@127 198 gcc 3.2.3 for ARM reportedly crashes during compilation. This bug is
cannam@127 199 reportedly fixed in later versions of gcc.
cannam@127 200
cannam@127 201 Versions 8.0 and 8.1 of Intel's icc falsely claim to be gcc, so you should
cannam@127 202 specify CC="icc -no-gcc"; this is automatic in FFTW 3.1. icc-8.0.066
cannam@127 203 reportedly produces incorrect code for FFTW 2.1.5, but is fixed in version
cannam@127 204 8.1. icc-7.1 compiler build 20030402Z appears to produce incorrect
cannam@127 205 dependencies, causing the compilation to fail. icc-7.1 build 20030307Z
cannam@127 206 appears to work fine. (Use icc -V to check which build you have.) As of
cannam@127 207 2003/04/18, build 20030402Z appears not to be available any longer on
cannam@127 208 Intel's website, whereas the older build 20030307Z is available.
cannam@127 209
cannam@127 210 ranlib of GNU binutils 2.9.1 on Irix has been observed to corrupt the FFTW
cannam@127 211 libraries, causing a link failure when FFTW is compiled. Since ranlib is
cannam@127 212 completely superfluous on Irix, we suggest deleting it from your system
cannam@127 213 and replacing it with a symbolic link to /bin/echo.
cannam@127 214
cannam@127 215 If support for SIMD instructions is enabled in FFTW, further compiler
cannam@127 216 problems may appear:
cannam@127 217
cannam@127 218 gcc 3.4.[0123] for x86 produces incorrect SSE2 code for FFTW when -O2 (the
cannam@127 219 best choice for FFTW) is used, causing FFTW to crash (make check crashes).
cannam@127 220 This bug is fixed in gcc 3.4.4. On x86_64 (amd64/em64t), gcc 3.4.4
cannam@127 221 reportedly still has a similar problem, but this is fixed as of gcc 3.4.6.
cannam@127 222
cannam@127 223 gcc-3.2 for x86 produces incorrect SIMD code if -O3 is used. The same
cannam@127 224 compiler produces incorrect SIMD code if no optimization is used, too.
cannam@127 225 When using gcc-3.2, it is a good idea not to change the default CFLAGS
cannam@127 226 selected by the configure script.
cannam@127 227
cannam@127 228 Some 3.0.x and 3.1.x versions of gcc on x86 may crash. gcc so-called 2.96
cannam@127 229 shipping with RedHat 7.3 crashes when compiling SIMD code. In both cases,
cannam@127 230 please upgrade to gcc-3.2 or later.
cannam@127 231
cannam@127 232 Intel's icc 6.0 misaligns SSE constants, but FFTW has a workaround. icc
cannam@127 233 8.x fails to compile FFTW 3.0.x because it falsely claims to be gcc; we
cannam@127 234 believe this to be a bug in icc, but FFTW 3.1 has a workaround.
cannam@127 235
cannam@127 236 Visual C++ 2003 reportedly produces incorrect code for SSE/SSE2 when
cannam@127 237 compiling FFTW. This bug was reportedly fixed in VC++ 2005;
cannam@127 238 alternatively, you could switch to the Intel compiler. VC++ 6.0 also
cannam@127 239 reportedly produces incorrect code for the file reodft11e-r2hc-odd.c
cannam@127 240 unless optimizations are disabled for that file.
cannam@127 241
cannam@127 242 gcc 2.95 on MacOS X miscompiles AltiVec code (fixed in later versions).
cannam@127 243 gcc 3.2.x miscompiles AltiVec permutations, but FFTW has a workaround.
cannam@127 244 gcc 4.0.1 on MacOS for Intel crashes when compiling FFTW; a workaround is
cannam@127 245 to compile one file without optimization: cd kernel; make CFLAGS=" "
cannam@127 246 trig.lo.
cannam@127 247
cannam@127 248 gcc 4.1.1 reportedly crashes when compiling FFTW for MIPS; the workaround
cannam@127 249 is to compile the file it crashes on (t2_64.c) with a lower optimization
cannam@127 250 level.
cannam@127 251
cannam@127 252 gcc versions 4.1.2 to 4.2.0 for x86 reportedly miscompile FFTW 3.1's test
cannam@127 253 program, causing make check to crash (gcc bug #26528). The bug was
cannam@127 254 reportedly fixed in gcc version 4.2.1 and later. A workaround is to
cannam@127 255 compile libbench2/verify-lib.c without optimization.
cannam@127 256
cannam@127 257 -------------------------------------------------------------------------------
cannam@127 258
cannam@127 259 Question 2.4. FFTW does not compile on Solaris, complaining about const.
cannam@127 260
cannam@127 261 We know that at least on Solaris 2.5.x with Sun's compilers 4.2 you might
cannam@127 262 get error messages from make such as
cannam@127 263
cannam@127 264 "./fftw.h", line 88: warning: const is a keyword in ANSI C
cannam@127 265
cannam@127 266 This is the case when the configure script reports that const does not
cannam@127 267 work:
cannam@127 268
cannam@127 269 checking for working const... (cached) no
cannam@127 270
cannam@127 271 You should be aware that Solaris comes with two compilers, namely,
cannam@127 272 /opt/SUNWspro/SC4.2/bin/cc and /usr/ucb/cc. The latter compiler is
cannam@127 273 non-ANSI. Indeed, it is a perverse shell script that calls the real
cannam@127 274 compiler in non-ANSI mode. In order to compile FFTW, change your path so
cannam@127 275 that the right cc is used.
cannam@127 276
cannam@127 277 To know whether your compiler is the right one, type cc -V. If the
cannam@127 278 compiler prints ``ucbcc'', as in
cannam@127 279
cannam@127 280 ucbcc: WorkShop Compilers 4.2 30 Oct 1996 C 4.2
cannam@127 281
cannam@127 282 then the compiler is wrong. The right message is something like
cannam@127 283
cannam@127 284 cc: WorkShop Compilers 4.2 30 Oct 1996 C 4.2
cannam@127 285
cannam@127 286 -------------------------------------------------------------------------------
cannam@127 287
cannam@127 288 Question 2.5. What's the difference between --enable-3dnow and --enable-k7?
cannam@127 289
cannam@127 290 --enable-k7 enables 3DNow! instructions on K7 processors (AMD Athlon and
cannam@127 291 its variants). K7 support is provided by assembly routines generated by a
cannam@127 292 special purpose compiler. As of fftw-3.2, --enable-k7 is no longer
cannam@127 293 supported.
cannam@127 294
cannam@127 295 --enable-3dnow enables generic 3DNow! support using gcc builtin functions.
cannam@127 296 This works on earlier AMD processors, but it is not as fast as our special
cannam@127 297 assembly routines. As of fftw-3.1, --enable-3dnow is no longer supported.
cannam@127 298
cannam@127 299 -------------------------------------------------------------------------------
cannam@127 300
cannam@127 301 Question 2.6. What's the difference between the fma and the non-fma versions?
cannam@127 302
cannam@127 303 The fma version tries to exploit the fused multiply-add instructions
cannam@127 304 implemented in many processors such as PowerPC, ia-64, and MIPS. The two
cannam@127 305 FFTW packages are otherwise identical. In FFTW 3.1, the fma and non-fma
cannam@127 306 versions were merged together into a single package, and the configure
cannam@127 307 script attempts to automatically guess which version to use.
cannam@127 308
cannam@127 309 The FFTW 3.1 configure script enables fma by default on PowerPC, Itanium,
cannam@127 310 and PA-RISC, and disables it otherwise. You can force one or the other by
cannam@127 311 using the --enable-fma or --disable-fma flag for configure.
cannam@127 312
cannam@127 313 Definitely use fma if you have a PowerPC-based system with gcc (or IBM
cannam@127 314 xlc). This includes all GNU/Linux systems for PowerPC and the older
cannam@127 315 PowerPC-based MacOS systems. Also use it on PA-RISC and Itanium with the
cannam@127 316 HP/UX compiler.
cannam@127 317
cannam@127 318 Definitely do not use the fma version if you have an ia-32 processor
cannam@127 319 (Intel, AMD, MacOS on Intel, etcetera).
cannam@127 320
cannam@127 321 For other architectures/compilers, the situation is not so clear. For
cannam@127 322 example, ia-64 has the fma instruction, but gcc-3.2 appears not to exploit
cannam@127 323 it correctly. Other compilers may do the right thing, but we have not
cannam@127 324 tried them. Please send us your feedback so that we can update this FAQ
cannam@127 325 entry.
cannam@127 326
cannam@127 327 -------------------------------------------------------------------------------
cannam@127 328
cannam@127 329 Question 2.7. Which language is FFTW written in?
cannam@127 330
cannam@127 331 FFTW is written in ANSI C. Most of the code, however, was automatically
cannam@127 332 generated by a program called genfft, written in the Objective Caml
cannam@127 333 dialect of ML. You do not need to know ML or to have an Objective Caml
cannam@127 334 compiler in order to use FFTW.
cannam@127 335
cannam@127 336 genfft is provided with the FFTW sources, which means that you can play
cannam@127 337 with the code generator if you want. In this case, you need a working
cannam@127 338 Objective Caml system. Objective Caml is available from the Caml web
cannam@127 339 page.
cannam@127 340
cannam@127 341 -------------------------------------------------------------------------------
cannam@127 342
cannam@127 343 Question 2.8. Can I call FFTW from Fortran?
cannam@127 344
cannam@127 345 Yes, FFTW (versions 1.3 and higher) contains a Fortran-callable interface,
cannam@127 346 documented in the FFTW manual.
cannam@127 347
cannam@127 348 By default, FFTW configures its Fortran interface to work with the first
cannam@127 349 compiler it finds, e.g. g77. To configure for a different, incompatible
cannam@127 350 Fortran compiler foobar, use ./configure F77=foobar when installing FFTW.
cannam@127 351 (In the case of g77, however, FFTW 3.x also includes an extra set of
cannam@127 352 Fortran-callable routines with one less underscore at the end of
cannam@127 353 identifiers, which should cover most other Fortran compilers on Linux at
cannam@127 354 least.)
cannam@127 355
cannam@127 356 -------------------------------------------------------------------------------
cannam@127 357
cannam@127 358 Question 2.9. Can I call FFTW from C++?
cannam@127 359
cannam@127 360 Most definitely. FFTW should compile and/or link under any C++ compiler.
cannam@127 361 Moreover, it is likely that the C++ <complex> template class is
cannam@127 362 bit-compatible with FFTW's complex-number format (see the FFTW manual for
cannam@127 363 more details).
cannam@127 364
cannam@127 365 -------------------------------------------------------------------------------
cannam@127 366
cannam@127 367 Question 2.10. Why isn't FFTW written in Fortran/C++?
cannam@127 368
cannam@127 369 Because we don't like those languages, and neither approaches the
cannam@127 370 portability of C.
cannam@127 371
cannam@127 372 -------------------------------------------------------------------------------
cannam@127 373
cannam@127 374 Question 2.11. How do I compile FFTW to run in single precision?
cannam@127 375
cannam@127 376 On a Unix system: configure --enable-float. On a non-Unix system: edit
cannam@127 377 config.h to #define the symbol FFTW_SINGLE (for FFTW 3.x). In both cases,
cannam@127 378 you must then recompile FFTW. In FFTW 3, all FFTW identifiers will then
cannam@127 379 begin with fftwf_ instead of fftw_.
cannam@127 380
cannam@127 381 -------------------------------------------------------------------------------
cannam@127 382
cannam@127 383 Question 2.12. --enable-k7 does not work on x86-64
cannam@127 384
cannam@127 385 Support for --enable-k7 was discontinued in fftw-3.2.
cannam@127 386
cannam@127 387 The fftw-3.1 release supports --enable-k7. This option only works on
cannam@127 388 32-bit x86 machines that implement 3DNow!, including the AMD Athlon and
cannam@127 389 the AMD Opteron in 32-bit mode. --enable-k7 does not work on AMD Opteron
cannam@127 390 in 64-bit mode. Use --enable-sse for x86-64 machines.
cannam@127 391
cannam@127 392 FFTW supports 3DNow! by means of assembly code generated by a
cannam@127 393 special-purpose compiler. It is hard to produce assembly code that works
cannam@127 394 in both 32-bit and 64-bit mode.
cannam@127 395
cannam@127 396 ===============================================================================
cannam@127 397
cannam@127 398 Section 3. Using FFTW
cannam@127 399
cannam@127 400 Q3.1 Why not support the FFTW 2 interface in FFTW 3?
cannam@127 401 Q3.2 Why do FFTW 3 plans encapsulate the input/output arrays and not just the algorithm?
cannam@127 402 Q3.3 FFTW seems really slow.
cannam@127 403 Q3.4 FFTW slows down after repeated calls.
cannam@127 404 Q3.5 An FFTW routine is crashing when I call it.
cannam@127 405 Q3.6 My Fortran program crashes when calling FFTW.
cannam@127 406 Q3.7 FFTW gives results different from my old FFT.
cannam@127 407 Q3.8 FFTW gives different results between runs
cannam@127 408 Q3.9 Can I save FFTW's plans?
cannam@127 409 Q3.10 Why does your inverse transform return a scaled result?
cannam@127 410 Q3.11 How can I make FFTW put the origin (zero frequency) at the center of its output?
cannam@127 411 Q3.12 How do I FFT an image/audio file in *foobar* format?
cannam@127 412 Q3.13 My program does not link (on Unix).
cannam@127 413 Q3.14 I included your header, but linking still fails.
cannam@127 414 Q3.15 My program crashes, complaining about stack space.
cannam@127 415 Q3.16 FFTW seems to have a memory leak.
cannam@127 416 Q3.17 The output of FFTW's transform is all zeros.
cannam@127 417 Q3.18 How do I call FFTW from the Microsoft language du jour?
cannam@127 418 Q3.19 Can I compute only a subset of the DFT outputs?
cannam@127 419 Q3.20 Can I use FFTW's routines for in-place and out-of-place matrix tra
cannam@127 420
cannam@127 421 -------------------------------------------------------------------------------
cannam@127 422
cannam@127 423 Question 3.1. Why not support the FFTW 2 interface in FFTW 3?
cannam@127 424
cannam@127 425 FFTW 3 has semantics incompatible with earlier versions: its plans can
cannam@127 426 only be used for a given stride, multiplicity, and other characteristics
cannam@127 427 of the input and output arrays; these stronger semantics are necessary for
cannam@127 428 performance reasons. Thus, it is impossible to efficiently emulate the
cannam@127 429 older interface (whose plans can be used for any transform of the same
cannam@127 430 size). We believe that it should be possible to upgrade most programs
cannam@127 431 without any difficulty, however.
cannam@127 432
cannam@127 433 -------------------------------------------------------------------------------
cannam@127 434
cannam@127 435 Question 3.2. Why do FFTW 3 plans encapsulate the input/output arrays and not just the algorithm?
cannam@127 436
cannam@127 437 There are several reasons:
cannam@127 438
cannam@127 439 * It was important for performance reasons that the plan be specific to
cannam@127 440 array characteristics like the stride (and alignment, for SIMD), and
cannam@127 441 requiring that the user maintain these invariants is error prone.
cannam@127 442 * In most high-performance applications, as far as we can tell, you are
cannam@127 443 usually transforming the same array over and over, so FFTW's semantics
cannam@127 444 should not be a burden.
cannam@127 445 * If you need to transform another array of the same size, creating a new
cannam@127 446 plan once the first exists is a cheap operation.
cannam@127 447 * If you need to transform many arrays of the same size at once, you
cannam@127 448 should really use the plan_many routines in FFTW's "advanced" interface.
cannam@127 449 * If the above-mentioned array characteristics are the same, you are
cannam@127 450 willing to pay close attention to the documentation, and you really need
cannam@127 451 to, we provide a "new-array execution" interface to apply a plan to a
cannam@127 452 new array.
cannam@127 453
cannam@127 454 -------------------------------------------------------------------------------
cannam@127 455
cannam@127 456 Question 3.3. FFTW seems really slow.
cannam@127 457
cannam@127 458 You are probably recreating the plan before every transform, rather than
cannam@127 459 creating it once and reusing it for all transforms of the same size. FFTW
cannam@127 460 is designed to be used in the following way:
cannam@127 461
cannam@127 462 * First, you create a plan. This will take several seconds.
cannam@127 463 * Then, you reuse the plan many times to perform FFTs. These are fast.
cannam@127 464
cannam@127 465 If you don't need to compute many transforms and the time for the planner
cannam@127 466 is significant, you have two options. First, you can use the
cannam@127 467 FFTW_ESTIMATE option in the planner, which uses heuristics instead of
cannam@127 468 runtime measurements and produces a good plan in a short time. Second,
cannam@127 469 you can use the wisdom feature to precompute the plan; see Q3.9 `Can I
cannam@127 470 save FFTW's plans?'
cannam@127 471
cannam@127 472 -------------------------------------------------------------------------------
cannam@127 473
cannam@127 474 Question 3.4. FFTW slows down after repeated calls.
cannam@127 475
cannam@127 476 Probably, NaNs or similar are creeping into your data, and the slowdown is
cannam@127 477 due to the resulting floating-point exceptions. For example, be aware
cannam@127 478 that repeatedly FFTing the same array is a diverging process (because FFTW
cannam@127 479 computes the unnormalized transform).
cannam@127 480
cannam@127 481 -------------------------------------------------------------------------------
cannam@127 482
cannam@127 483 Question 3.5. An FFTW routine is crashing when I call it.
cannam@127 484
cannam@127 485 Did the FFTW test programs pass (make check, or cd tests; make bigcheck if
cannam@127 486 you want to be paranoid)? If so, you almost certainly have a bug in your
cannam@127 487 own code. For example, you could be passing invalid arguments (such as
cannam@127 488 wrongly-sized arrays) to FFTW, or you could simply have memory corruption
cannam@127 489 elsewhere in your program that causes random crashes later on. Please
cannam@127 490 don't complain to us unless you can come up with a minimal self-contained
cannam@127 491 program (preferably under 30 lines) that illustrates the problem.
cannam@127 492
cannam@127 493 -------------------------------------------------------------------------------
cannam@127 494
cannam@127 495 Question 3.6. My Fortran program crashes when calling FFTW.
cannam@127 496
cannam@127 497 As described in the manual, on 64-bit machines you must store the plans in
cannam@127 498 variables large enough to hold a pointer, for example integer*8. We
cannam@127 499 recommend using integer*8 on 32-bit machines as well, to simplify porting.
cannam@127 500
cannam@127 501 -------------------------------------------------------------------------------
cannam@127 502
cannam@127 503 Question 3.7. FFTW gives results different from my old FFT.
cannam@127 504
cannam@127 505 People follow many different conventions for the DFT, and you should be
cannam@127 506 sure to know the ones that we use (described in the FFTW manual). In
cannam@127 507 particular, you should be aware that the FFTW_FORWARD/FFTW_BACKWARD
cannam@127 508 directions correspond to signs of -1/+1 in the exponent of the DFT
cannam@127 509 definition. (*Numerical Recipes* uses the opposite convention.)
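
Written out for the one-dimensional case, the sign convention stated above
reads as follows. The symbols X (input of length n) and Y (output) are
chosen here for illustration; the FFTW manual remains the authoritative
statement of what is computed.

```latex
\begin{align*}
\text{FFTW\_FORWARD:}  \quad Y_k &= \sum_{j=0}^{n-1} X_j \, e^{-2\pi i \, jk/n}, \\
\text{FFTW\_BACKWARD:} \quad Y_k &= \sum_{j=0}^{n-1} X_j \, e^{+2\pi i \, jk/n}.
\end{align*}
```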
cannam@127 510
cannam@127 511 You should also know that we compute an unnormalized transform. In
cannam@127 512 contrast, Matlab is an example of a program whose inverse transform
cannam@127 513 (ifft) is normalized. See Q3.10 `Why does your inverse transform return a
cannam@127 514 scaled result?'.
cannam@127 515
cannam@127 516 Finally, note that floating-point arithmetic is not exact, so different
cannam@127 517 FFT algorithms will give slightly different results (on the order of the
cannam@127 518 numerical accuracy; typically a fractional difference of 1e-15 or so in
cannam@127 519 double precision).
cannam@127 520
cannam@127 521 -------------------------------------------------------------------------------
cannam@127 522
cannam@127 523 Question 3.8. FFTW gives different results between runs
cannam@127 524
cannam@127 525 If you use FFTW_MEASURE or FFTW_PATIENT mode, then the algorithm FFTW
cannam@127 526 employs is not deterministic: it depends on runtime performance
cannam@127 527 measurements. This will cause the results to vary slightly from run to
cannam@127 528 run. However, the differences should be slight, on the order of the
cannam@127 529 floating-point precision, and therefore should have no practical impact on
cannam@127 530 most applications.
cannam@127 531
cannam@127 532 If you use saved plans (wisdom) or FFTW_ESTIMATE mode, however, then the
cannam@127 533 algorithm is deterministic and the results should be identical between
cannam@127 534 runs.
cannam@127 535
cannam@127 536 -------------------------------------------------------------------------------
cannam@127 537
cannam@127 538 Question 3.9. Can I save FFTW's plans?
cannam@127 539
cannam@127 540 Yes. Starting with version 1.2, FFTW provides the wisdom mechanism for
cannam@127 541 saving plans; see the FFTW manual.
cannam@127 542
cannam@127 543 -------------------------------------------------------------------------------
cannam@127 544
cannam@127 545 Question 3.10. Why does your inverse transform return a scaled result?
cannam@127 546
cannam@127 547 Computing the forward transform followed by the backward transform (or
cannam@127 548 vice versa) yields the original array scaled by the size of the array.
cannam@127 549 (For multi-dimensional transforms, the size of the array is the product of
cannam@127 550 the dimensions.) We could, instead, have chosen a normalization that
cannam@127 551 would have returned the unscaled array. Or, to accommodate the many
cannam@127 552 conventions in this matter, the transform routines could have accepted a
cannam@127 553 "scale factor" parameter. We did not do this, however, for two reasons.
cannam@127 554 First, we didn't want to sacrifice performance in the common case where
cannam@127 555 the scale factor is 1. Second, in real applications the FFT is followed or
cannam@127 556 preceded by some computation on the data, into which the scale factor can
cannam@127 557 typically be absorbed at little or no cost.
cannam@127 558
cannam@127 559 -------------------------------------------------------------------------------
cannam@127 560
cannam@127 561 Question 3.11. How can I make FFTW put the origin (zero frequency) at the center of its output?
cannam@127 562
cannam@127 563 For human viewing of a spectrum, it is often convenient to put the origin
cannam@127 564 in frequency space at the center of the output array, rather than in the
cannam@127 565 zero-th element (the default in FFTW). If all of the dimensions of your
cannam@127 566 array are even, you can accomplish this by simply multiplying each element
cannam@127 567 of the input array by (-1)^(i + j + ...), where i, j, etcetera are the
cannam@127 568 indices of the element. (This trick is a general property of the DFT, and
cannam@127 569 is not specific to FFTW.)
cannam@127 570
cannam@127 571 -------------------------------------------------------------------------------
cannam@127 572
cannam@127 573 Question 3.12. How do I FFT an image/audio file in *foobar* format?
cannam@127 574
cannam@127 575 FFTW performs an FFT on an array of floating-point values. You can
cannam@127 576 certainly use it to compute the transform of an image or audio stream, but
cannam@127 577 you are responsible for figuring out your data format and converting it to
cannam@127 578 the form FFTW requires.
cannam@127 579
cannam@127 580 -------------------------------------------------------------------------------
cannam@127 581
cannam@127 582 Question 3.13. My program does not link (on Unix).
cannam@127 583
cannam@127 584 The libraries must be listed in the correct order (-lfftw3 -lm for FFTW
cannam@127 585 3.x) and *after* your program sources/objects. (The general rule is that
cannam@127 586 if *A* uses *B*, then *A* must be listed before *B* in the link command.)
cannam@127 587
cannam@127 588 -------------------------------------------------------------------------------
cannam@127 589
cannam@127 590 Question 3.14. I included your header, but linking still fails.
cannam@127 591
cannam@127 592 You're a C++ programmer, aren't you? You have to compile the FFTW library
cannam@127 593 and link it into your program, not just #include <fftw3.h>. (Yes, this is
cannam@127 594 really a FAQ.)
cannam@127 595
cannam@127 596 -------------------------------------------------------------------------------
cannam@127 597
cannam@127 598 Question 3.15. My program crashes, complaining about stack space.
cannam@127 599
cannam@127 600 You cannot declare large arrays with automatic storage (e.g. via
cannam@127 601 fftw_complex array[N]); you should use fftw_malloc (or equivalent) to
cannam@127 602 allocate the arrays you want to transform if they are larger than a few
cannam@127 603 hundred elements.
cannam@127 604
cannam@127 605 -------------------------------------------------------------------------------
cannam@127 606
cannam@127 607 Question 3.16. FFTW seems to have a memory leak.
cannam@127 608
cannam@127 609 After you create a plan, FFTW caches the information required to quickly
cannam@127 610 recreate the plan. (See Q3.9 `Can I save FFTW's plans?') It also
cannam@127 611 maintains a small amount of other persistent memory. You can deallocate
cannam@127 612 all of FFTW's internally allocated memory, if you wish, by calling
cannam@127 613 fftw_cleanup(), as documented in the manual.
cannam@127 614
cannam@127 615 -------------------------------------------------------------------------------
cannam@127 616
cannam@127 617 Question 3.17. The output of FFTW's transform is all zeros.
cannam@127 618
cannam@127 619 You should initialize your input array *after* creating the plan, unless
cannam@127 620 you use FFTW_ESTIMATE: planning with FFTW_MEASURE or FFTW_PATIENT
cannam@127 621 overwrites the input/output arrays, as described in the manual.
cannam@127 622
cannam@127 623 -------------------------------------------------------------------------------
cannam@127 624
cannam@127 625 Question 3.18. How do I call FFTW from the Microsoft language du jour?
cannam@127 626
cannam@127 627 Please *do not* ask us Windows-specific questions. We do not use Windows.
cannam@127 628 We know nothing about Visual Basic, Visual C++, or .NET. Please find the
cannam@127 629 appropriate Usenet discussion group and ask your question there. See also
cannam@127 630 Q2.2 `Does FFTW run on Windows?'.
cannam@127 631
cannam@127 632 -------------------------------------------------------------------------------
cannam@127 633
cannam@127 634 Question 3.19. Can I compute only a subset of the DFT outputs?
cannam@127 635
cannam@127 636 In general, no, an FFT intrinsically computes all outputs from all inputs.
cannam@127 637 In principle, there is something called a *pruned FFT* that can do what
cannam@127 638 you want, but to compute K outputs out of N the complexity is in general
cannam@127 639 O(N log K) instead of O(N log N), thus saving only a small additive factor
cannam@127 640 in the log. (The same argument holds if you instead have only K nonzero
cannam@127 641 inputs.)
cannam@127 642
cannam@127 643 There are some specific cases in which you can get the O(N log K)
cannam@127 644 performance benefits easily, however, by combining a few ordinary FFTs.
cannam@127 645 In particular, the case where you want the first K outputs, where K
cannam@127 646 divides N, can be handled by performing N/K transforms of size K and then
cannam@127 647 summing the outputs multiplied by appropriate phase factors. For more
cannam@127 648 details, see "Pruned FFTs with FFTW" on the FFTW web page.
cannam@127 649
cannam@127 650 There are also some algorithms that compute pruned transforms
cannam@127 651 *approximately*, but they are beyond the scope of this FAQ.
cannam@127 652
cannam@127 653 -------------------------------------------------------------------------------
cannam@127 654
cannam@127 655 Question 3.20. Can I use FFTW's routines for in-place and out-of-place matrix transposition?
cannam@127 656
cannam@127 657 You can use the FFTW guru interface to create a rank-0 transform of vector
cannam@127 658 rank 2 where the vector strides are transposed. (A rank-0 transform is
cannam@127 659 equivalent to a 1D transform of size 1, which just copies the input into
cannam@127 660 the output.) Specifying the same location for the input and output makes
cannam@127 661 the transpose in-place.
cannam@127 662
cannam@127 663 For double-valued data stored in row-major format, plan creation looks
cannam@127 664 like this:
cannam@127 665
cannam@127 666     fftw_plan plan_transpose(int rows, int cols, double *in, double *out)
cannam@127 667     {
cannam@127 668         const unsigned flags = FFTW_ESTIMATE; /* other flags are possible */
cannam@127 669         fftw_iodim howmany_dims[2];
cannam@127 670
cannam@127 671         howmany_dims[0].n  = rows;
cannam@127 672         howmany_dims[0].is = cols;
cannam@127 673         howmany_dims[0].os = 1;
cannam@127 674
cannam@127 675         howmany_dims[1].n  = cols;
cannam@127 676         howmany_dims[1].is = 1;
cannam@127 677         howmany_dims[1].os = rows;
cannam@127 678
cannam@127 679         return fftw_plan_guru_r2r(/*rank=*/ 0, /*dims=*/ NULL,
cannam@127 680                                   /*howmany_rank=*/ 2, howmany_dims,
cannam@127 681                                   in, out, /*kind=*/ NULL, flags);
cannam@127 682     }
cannam@127 683 (This entry was written by Rhys Ulerich.)
cannam@127 684
cannam@127 685 ===============================================================================
cannam@127 686
cannam@127 687 Section 4. Internals of FFTW
cannam@127 688
cannam@127 689 Q4.1 How does FFTW work?
cannam@127 690 Q4.2 Why is FFTW so fast?
cannam@127 691
cannam@127 692 -------------------------------------------------------------------------------
cannam@127 693
cannam@127 694 Question 4.1. How does FFTW work?
cannam@127 695
cannam@127 696 The innovation (if it can be so called) in FFTW consists in having a
cannam@127 697 variety of composable *solvers*, representing different FFT algorithms and
cannam@127 698 implementation strategies, whose combination into a particular *plan* for
cannam@127 699 a given size can be determined at runtime according to the characteristics
cannam@127 700 of your machine/compiler. This peculiar software architecture allows FFTW
cannam@127 701 to adapt itself to almost any machine.
cannam@127 702
cannam@127 703 For more details (albeit somewhat outdated), see the paper "FFTW: An
cannam@127 704 Adaptive Software Architecture for the FFT", by M. Frigo and S. G.
cannam@127 705 Johnson, *Proc. ICASSP* 3, 1381 (1998), also available at the FFTW web
cannam@127 706 page.
cannam@127 707
cannam@127 708 -------------------------------------------------------------------------------
cannam@127 709
cannam@127 710 Question 4.2. Why is FFTW so fast?
cannam@127 711
cannam@127 712 This is a complex question, and there is no simple answer. In fact, the
cannam@127 713 authors do not fully know the answer, either. In addition to many small
cannam@127 714 performance hacks throughout FFTW, there are three general reasons for
cannam@127 715 FFTW's speed.
cannam@127 716
cannam@127 717 * FFTW uses a variety of FFT algorithms and implementation styles that
cannam@127 718 can be arbitrarily composed to adapt itself to a machine. See Q4.1 `How
cannam@127 719 does FFTW work?'.
cannam@127 720 * FFTW uses a code generator to produce highly-optimized routines for
cannam@127 721 computing small transforms.
cannam@127 722 * FFTW uses explicit divide-and-conquer to take advantage of the memory
cannam@127 723 hierarchy.
cannam@127 724
cannam@127 725 For more details (albeit somewhat outdated), see the paper "FFTW: An
cannam@127 726 Adaptive Software Architecture for the FFT", by M. Frigo and S. G.
cannam@127 727 Johnson, *Proc. ICASSP* 3, 1381 (1998), available along with other
cannam@127 728 references at the FFTW web page.
cannam@127 729
cannam@127 730 ===============================================================================
cannam@127 731
cannam@127 732 Section 5. Known bugs
cannam@127 733
cannam@127 734 Q5.1 FFTW 1.1 crashes in rfftwnd on Linux.
cannam@127 735 Q5.2 The MPI transforms in FFTW 1.2 give incorrect results/leak memory.
cannam@127 736 Q5.3 The test programs in FFTW 1.2.1 fail when I change FFTW to use sin
cannam@127 737 Q5.4 The test program in FFTW 1.2.1 fails for n > 46340.
cannam@127 738 Q5.5 The threaded code fails on Linux Redhat 5.0
cannam@127 739 Q5.6 FFTW 2.0's rfftwnd fails for rank > 1 transforms with a final dime
cannam@127 740 Q5.7 FFTW 2.0's complex transforms give the wrong results with prime fa
cannam@127 741 Q5.8 FFTW 2.1.1's MPI test programs crash with MPICH.
cannam@127 742 Q5.9 FFTW 2.1.2's multi-threaded transforms don't work on AIX.
cannam@127 743 Q5.10 FFTW 2.1.2's complex transforms give incorrect results for large p
cannam@127 744 Q5.11 FFTW 2.1.3's multi-threaded transforms don't give any speedup on S
cannam@127 745 Q5.12 FFTW 2.1.3 crashes on AIX.
cannam@127 746
cannam@127 747 -------------------------------------------------------------------------------
cannam@127 748
cannam@127 749 Question 5.1. FFTW 1.1 crashes in rfftwnd on Linux.
cannam@127 750
cannam@127 751 This bug was fixed in FFTW 1.2. There was a bug in rfftwnd causing an
cannam@127 752 incorrect amount of memory to be allocated. The bug showed up in Linux
cannam@127 753 with libc-5.3.12 (and nowhere else that we know of).
cannam@127 754
cannam@127 755 -------------------------------------------------------------------------------
cannam@127 756
cannam@127 757 Question 5.2. The MPI transforms in FFTW 1.2 give incorrect results/leak memory.
cannam@127 758
cannam@127 759 These bugs were corrected in FFTW 1.2.1. The MPI transforms (really, just
cannam@127 760 the transpose routines) in FFTW 1.2 had bugs that could cause errors in
cannam@127 761 some situations.
cannam@127 762
cannam@127 763 -------------------------------------------------------------------------------
cannam@127 764
cannam@127 765 Question 5.3. The test programs in FFTW 1.2.1 fail when I change FFTW to use single precision.
cannam@127 766
cannam@127 767 This bug was fixed in FFTW 1.3. (Older versions of FFTW did work in
cannam@127 768 single precision, but the test programs didn't--the error tolerances in
cannam@127 769 the tests were set for double precision.)
cannam@127 770
cannam@127 771 -------------------------------------------------------------------------------
cannam@127 772
cannam@127 773 Question 5.4. The test program in FFTW 1.2.1 fails for n > 46340.
cannam@127 774
cannam@127 775 This bug was fixed in FFTW 1.3. FFTW 1.2.1 produced the right answer, but
cannam@127 776 the test program was wrong. For large n, n*n in the naive transform that
cannam@127 777 we used for comparison overflows 32 bit integer precision, breaking the
cannam@127 778 test.
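
The threshold 46340 follows from the range of a signed 32-bit integer,
assuming that is the type in which n*n was computed. The small check
below (square_fits_int32 is a name of ours) does the arithmetic in 64
bits: 46340*46340 = 2147395600 still fits below 2^31 - 1 = 2147483647,
whereas 46341*46341 = 2147488281 does not.

```c
#include <stdint.h>

/* Returns 1 if n*n is representable as a signed 32-bit integer and 0
   otherwise. The product is formed in 64 bits so that it cannot
   overflow for any 32-bit n. */
int square_fits_int32(int32_t n)
{
    int64_t sq = (int64_t)n * (int64_t)n;
    return sq <= INT32_MAX;
}
```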
cannam@127 779
cannam@127 780 -------------------------------------------------------------------------------
cannam@127 781
cannam@127 782 Question 5.5. The threaded code fails on Linux Redhat 5.0
cannam@127 783
cannam@127 784 We had problems with glibc-2.0.5. The code should work with glibc-2.0.7.
cannam@127 785
cannam@127 786 -------------------------------------------------------------------------------
cannam@127 787
cannam@127 788 Question 5.6. FFTW 2.0's rfftwnd fails for rank > 1 transforms with a final dimension >= 65536.
cannam@127 789
cannam@127 790 This bug was fixed in FFTW 2.0.1. (There was a 32-bit integer overflow
cannam@127 791 due to a poorly-parenthesized expression.)
cannam@127 792
cannam@127 793 -------------------------------------------------------------------------------
cannam@127 794
cannam@127 795 Question 5.7. FFTW 2.0's complex transforms give the wrong results with prime factors 17 to 97.
cannam@127 796
cannam@127 797 There was a bug in the complex transforms that could cause incorrect
cannam@127 798 results under (hopefully rare) circumstances for lengths with
cannam@127 799 intermediate-size prime factors (17-97). This bug was fixed in FFTW
cannam@127 800 2.1.1.
cannam@127 801
cannam@127 802 -------------------------------------------------------------------------------
cannam@127 803
cannam@127 804 Question 5.8. FFTW 2.1.1's MPI test programs crash with MPICH.
cannam@127 805
cannam@127 806 This bug was fixed in FFTW 2.1.2. The 2.1/2.1.1 MPI test programs crashed
cannam@127 807 when using the MPICH implementation of MPI with the ch_p4 device (TCP/IP);
cannam@127 808 the transforms themselves worked fine.
cannam@127 809
cannam@127 810 -------------------------------------------------------------------------------
cannam@127 811
cannam@127 812 Question 5.9. FFTW 2.1.2's multi-threaded transforms don't work on AIX.
cannam@127 813
cannam@127 814 This bug was fixed in FFTW 2.1.3. The multi-threaded transforms in
cannam@127 815 previous versions didn't work with AIX's pthreads implementation, which
cannam@127 816 idiosyncratically creates threads in detached (non-joinable) mode by
cannam@127 817 default.
cannam@127 818
cannam@127 819 -------------------------------------------------------------------------------
cannam@127 820
cannam@127 821 Question 5.10. FFTW 2.1.2's complex transforms give incorrect results for large prime sizes.
cannam@127 822
cannam@127 823 This bug was fixed in FFTW 2.1.3. FFTW's complex-transform algorithm for
cannam@127 824 prime sizes (in versions 2.0 to 2.1.2) had an integer overflow problem
cannam@127 825 that caused incorrect results for many primes greater than 32768 (on
cannam@127 826 32-bit machines). (Sizes without large prime factors are not affected.)
cannam@127 827
cannam@127 828 -------------------------------------------------------------------------------
cannam@127 829
cannam@127 830 Question 5.11. FFTW 2.1.3's multi-threaded transforms don't give any speedup on Solaris.
cannam@127 831
cannam@127 832 This bug was fixed in FFTW 2.1.4. (By default, Solaris creates threads
cannam@127 833 that do not parallelize over multiple processors, so one has to request
cannam@127 834 the proper behavior specifically.)
cannam@127 835
cannam@127 836 -------------------------------------------------------------------------------
cannam@127 837
cannam@127 838 Question 5.12. FFTW 2.1.3 crashes on AIX.
cannam@127 839
cannam@127 840 The FFTW 2.1.3 configure script picked incorrect compiler flags for the
cannam@127 841 xlc compiler on newer IBM processors. This is fixed in FFTW 2.1.4.
cannam@127 842