annotate src/fftw-3.3.8/doc/reference.texi @ 83:ae30d91d2ffe

Replace these with versions built using an older toolset (so as to avoid ABI compatibilities when linking on Ubuntu 14.04 for packaging purposes)
author Chris Cannam
date Fri, 07 Feb 2020 11:51:13 +0000
parents d0c2a83c1364
children
rev   line source
Chris@82 1 @node FFTW Reference, Multi-threaded FFTW, Other Important Topics, Top
Chris@82 2 @chapter FFTW Reference
Chris@82 3
Chris@82 4 This chapter provides a complete reference for all sequential (i.e.,
Chris@82 5 one-processor) FFTW functions. Parallel transforms are described in
Chris@82 6 later chapters.
Chris@82 7
Chris@82 8 @menu
Chris@82 9 * Data Types and Files::
Chris@82 10 * Using Plans::
Chris@82 11 * Basic Interface::
Chris@82 12 * Advanced Interface::
Chris@82 13 * Guru Interface::
Chris@82 14 * New-array Execute Functions::
Chris@82 15 * Wisdom::
Chris@82 16 * What FFTW Really Computes::
Chris@82 17 @end menu
Chris@82 18
Chris@82 19 @c ------------------------------------------------------------
Chris@82 20 @node Data Types and Files, Using Plans, FFTW Reference, FFTW Reference
Chris@82 21 @section Data Types and Files
Chris@82 22
Chris@82 23 All programs using FFTW should include its header file:
Chris@82 24
Chris@82 25 @example
Chris@82 26 #include <fftw3.h>
Chris@82 27 @end example
Chris@82 28
Chris@82 29 You must also link to the FFTW library. On Unix, this
Chris@82 30 means adding @code{-lfftw3 -lm} at the @emph{end} of the link command.
Chris@82 31
Chris@82 32 @menu
Chris@82 33 * Complex numbers::
Chris@82 34 * Precision::
Chris@82 35 * Memory Allocation::
Chris@82 36 @end menu
Chris@82 37
Chris@82 38 @c =========>
Chris@82 39 @node Complex numbers, Precision, Data Types and Files, Data Types and Files
Chris@82 40 @subsection Complex numbers
Chris@82 41
Chris@82 42 The default FFTW interface uses @code{double} precision for all
Chris@82 43 floating-point numbers, and defines a @code{fftw_complex} type to hold
Chris@82 44 complex numbers as:
Chris@82 45
Chris@82 46 @example
Chris@82 47 typedef double fftw_complex[2];
Chris@82 48 @end example
Chris@82 49 @tindex fftw_complex
Chris@82 50
Chris@82 51 Here, the @code{[0]} element holds the real part and the @code{[1]}
Chris@82 52 element holds the imaginary part.
Chris@82 53
Chris@82 54 Alternatively, if you have a C compiler (such as @code{gcc}) that
Chris@82 55 supports the C99 revision of the ANSI C standard, you can use C's new
Chris@82 56 native complex type (which is binary-compatible with the typedef above).
Chris@82 57 In particular, if you @code{#include <complex.h>} @emph{before}
Chris@82 58 @code{<fftw3.h>}, then @code{fftw_complex} is defined to be the native
Chris@82 59 complex type and you can manipulate it with ordinary arithmetic
Chris@82 60 (e.g. @code{x = y * (3+4*I)}, where @code{x} and @code{y} are
Chris@82 61 @code{fftw_complex} and @code{I} is the standard symbol for the
Chris@82 62 imaginary unit);
Chris@82 63 @cindex C99
Chris@82 64
Chris@82 65
Chris@82 66 C++ has its own @code{complex<T>} template class, defined in the
Chris@82 67 standard @code{<complex>} header file. Reportedly, the C++ standards
Chris@82 68 committee has recently agreed to mandate that the storage format used
Chris@82 69 for this type be binary-compatible with the C99 type, i.e. an array
Chris@82 70 @code{T[2]} with consecutive real @code{[0]} and imaginary @code{[1]}
Chris@82 71 parts. (See report
Chris@82 72 @uref{http://www.open-std.org/jtc1/sc22/WG21/docs/papers/2002/n1388.pdf
Chris@82 73 WG21/N1388}.) Although not part of the official standard as of this
Chris@82 74 writing, the proposal stated that: ``This solution has been tested with
Chris@82 75 all current major implementations of the standard library and shown to
Chris@82 76 be working.'' To the extent that this is true, if you have a variable
Chris@82 77 @code{complex<double> *x}, you can pass it directly to FFTW via
Chris@82 78 @code{reinterpret_cast<fftw_complex*>(x)}.
Chris@82 79 @cindex C++
Chris@82 80 @cindex portability
Chris@82 81
Chris@82 82 @c =========>
Chris@82 83 @node Precision, Memory Allocation, Complex numbers, Data Types and Files
Chris@82 84 @subsection Precision
Chris@82 85 @cindex precision
Chris@82 86
Chris@82 87 You can install single and long-double precision versions of FFTW,
Chris@82 88 which replace @code{double} with @code{float} and @code{long double},
Chris@82 89 respectively (@pxref{Installation and Customization}). To use these
Chris@82 90 interfaces, you:
Chris@82 91
Chris@82 92 @itemize @bullet
Chris@82 93
Chris@82 94 @item
Chris@82 95 Link to the single/long-double libraries; on Unix, @code{-lfftw3f} or
Chris@82 96 @code{-lfftw3l} instead of (or in addition to) @code{-lfftw3}. (You
Chris@82 97 can link to the different-precision libraries simultaneously.)
Chris@82 98
Chris@82 99 @item
Chris@82 100 Include the @emph{same} @code{<fftw3.h>} header file.
Chris@82 101
Chris@82 102 @item
Chris@82 103 Replace all lowercase instances of @samp{fftw_} with @samp{fftwf_} or
Chris@82 104 @samp{fftwl_} for single or long-double precision, respectively.
Chris@82 105 (@code{fftw_complex} becomes @code{fftwf_complex}, @code{fftw_execute}
Chris@82 106 becomes @code{fftwf_execute}, etcetera.)
Chris@82 107
Chris@82 108 @item
Chris@82 109 Uppercase names, i.e. names beginning with @samp{FFTW_}, remain the
Chris@82 110 same.
Chris@82 111
Chris@82 112 @item
Chris@82 113 Replace @code{double} with @code{float} or @code{long double} for
Chris@82 114 subroutine parameters.
Chris@82 115
Chris@82 116 @end itemize
Chris@82 117
Chris@82 118 Depending upon your compiler and/or hardware, @code{long double} may not
Chris@82 119 be any more precise than @code{double} (or may not be supported at all,
Chris@82 120 although it is standard in C99).
Chris@82 121 @cindex C99
Chris@82 122
Chris@82 123
Chris@82 124 We also support using the nonstandard @code{__float128}
Chris@82 125 quadruple-precision type provided by recent versions of @code{gcc} on
Chris@82 126 32- and 64-bit x86 hardware (@pxref{Installation and Customization}).
Chris@82 127 To use this type, link with @code{-lfftw3q -lquadmath -lm} (the
Chris@82 128 @code{libquadmath} library provided by @code{gcc} is needed for
Chris@82 129 quadruple-precision trigonometric functions) and use @samp{fftwq_}
Chris@82 130 identifiers.
Chris@82 131
Chris@82 132 @c =========>
Chris@82 133 @node Memory Allocation, , Precision, Data Types and Files
Chris@82 134 @subsection Memory Allocation
Chris@82 135
Chris@82 136 @example
Chris@82 137 void *fftw_malloc(size_t n);
Chris@82 138 void fftw_free(void *p);
Chris@82 139 @end example
Chris@82 140 @findex fftw_malloc
Chris@82 141 @findex fftw_free
Chris@82 142
Chris@82 143 These are functions that behave identically to @code{malloc} and
Chris@82 144 @code{free}, except that they guarantee that the returned pointer obeys
Chris@82 145 any special alignment restrictions imposed by any algorithm in FFTW
Chris@82 146 (e.g. for SIMD acceleration). @xref{SIMD alignment and fftw_malloc}.
Chris@82 147 @cindex alignment
Chris@82 148
Chris@82 149
Chris@82 150 Data allocated by @code{fftw_malloc} @emph{must} be deallocated by
Chris@82 151 @code{fftw_free} and not by the ordinary @code{free}.
Chris@82 152
Chris@82 153 These routines simply call through to your operating system's
Chris@82 154 @code{malloc} or, if necessary, its aligned equivalent
Chris@82 155 (e.g. @code{memalign}), so you normally need not worry about any
Chris@82 156 significant time or space overhead. You are @emph{not required} to use
Chris@82 157 them to allocate your data, but we strongly recommend it.
Chris@82 158
Chris@82 159 Note: in C++, just as with ordinary @code{malloc}, you must typecast
Chris@82 160 the output of @code{fftw_malloc} to whatever pointer type you are
Chris@82 161 allocating.
Chris@82 162 @cindex C++
Chris@82 163
Chris@82 164
Chris@82 165 We also provide the following two convenience functions to allocate
Chris@82 166 real and complex arrays with @code{n} elements, which are equivalent
Chris@82 167 to @code{(double *) fftw_malloc(sizeof(double) * n)} and
Chris@82 168 @code{(fftw_complex *) fftw_malloc(sizeof(fftw_complex) * n)},
Chris@82 169 respectively:
Chris@82 170
Chris@82 171 @example
Chris@82 172 double *fftw_alloc_real(size_t n);
Chris@82 173 fftw_complex *fftw_alloc_complex(size_t n);
Chris@82 174 @end example
Chris@82 175 @findex fftw_alloc_real
Chris@82 176 @findex fftw_alloc_complex
Chris@82 177
Chris@82 178 The equivalent functions in other precisions allocate arrays of @code{n}
Chris@82 179 elements in that precision. e.g. @code{fftwf_alloc_real(n)} is
Chris@82 180 equivalent to @code{(float *) fftwf_malloc(sizeof(float) * n)}.
Chris@82 181 @cindex precision
Chris@82 182
Chris@82 183 @c ------------------------------------------------------------
Chris@82 184 @node Using Plans, Basic Interface, Data Types and Files, FFTW Reference
Chris@82 185 @section Using Plans
Chris@82 186
Chris@82 187 Plans for all transform types in FFTW are stored as type
Chris@82 188 @code{fftw_plan} (an opaque pointer type), and are created by one of the
Chris@82 189 various planning routines described in the following sections.
Chris@82 190 @tindex fftw_plan
Chris@82 191 An @code{fftw_plan} contains all information necessary to compute the
Chris@82 192 transform, including the pointers to the input and output arrays.
Chris@82 193
Chris@82 194 @example
Chris@82 195 void fftw_execute(const fftw_plan plan);
Chris@82 196 @end example
Chris@82 197 @findex fftw_execute
Chris@82 198
Chris@82 199 This executes the @code{plan}, to compute the corresponding transform on
Chris@82 200 the arrays for which it was planned (which must still exist). The plan
Chris@82 201 is not modified, and @code{fftw_execute} can be called as many times as
Chris@82 202 desired.
Chris@82 203
Chris@82 204 To apply a given plan to a different array, you can use the new-array execute
Chris@82 205 interface. @xref{New-array Execute Functions}.
Chris@82 206
Chris@82 207 @code{fftw_execute} (and equivalents) is the only function in FFTW
Chris@82 208 guaranteed to be thread-safe; see @ref{Thread safety}.
Chris@82 209
Chris@82 210 This function:
Chris@82 211 @example
Chris@82 212 void fftw_destroy_plan(fftw_plan plan);
Chris@82 213 @end example
Chris@82 214 @findex fftw_destroy_plan
Chris@82 215 deallocates the @code{plan} and all its associated data.
Chris@82 216
Chris@82 217 FFTW's planner saves some other persistent data, such as the
Chris@82 218 accumulated wisdom and a list of algorithms available in the current
Chris@82 219 configuration. If you want to deallocate all of that and reset FFTW
Chris@82 220 to the pristine state it was in when you started your program, you can
Chris@82 221 call:
Chris@82 222
Chris@82 223 @example
Chris@82 224 void fftw_cleanup(void);
Chris@82 225 @end example
Chris@82 226 @findex fftw_cleanup
Chris@82 227
Chris@82 228 After calling @code{fftw_cleanup}, all existing plans become undefined,
Chris@82 229 and you should not attempt to execute them nor to destroy them. You can
Chris@82 230 however create and execute/destroy new plans, in which case FFTW starts
Chris@82 231 accumulating wisdom information again.
Chris@82 232
Chris@82 233 @code{fftw_cleanup} does not deallocate your plans, however. To prevent
Chris@82 234 memory leaks, you must still call @code{fftw_destroy_plan} before
Chris@82 235 executing @code{fftw_cleanup}.
Chris@82 236
Chris@82 237 Occasionally, it may useful to know FFTW's internal ``cost'' metric
Chris@82 238 that it uses to compare plans to one another; this cost is
Chris@82 239 proportional to an execution time of the plan, in undocumented units,
Chris@82 240 if the plan was created with the @code{FFTW_MEASURE} or other
Chris@82 241 timing-based options, or alternatively is a heuristic cost function
Chris@82 242 for @code{FFTW_ESTIMATE} plans. (The cost values of measured and
Chris@82 243 estimated plans are not comparable, being in different units. Also,
Chris@82 244 costs from different FFTW versions or the same version compiled
Chris@82 245 differently may not be in the same units. Plans created from wisdom
Chris@82 246 have a cost of 0 since no timing measurement is performed for them.
Chris@82 247 Finally, certain problems for which only one top-level algorithm was
Chris@82 248 possible may have required no measurements of the cost of the whole
Chris@82 249 plan, in which case @code{fftw_cost} will also return 0.) The cost
Chris@82 250 metric for a given plan is returned by:
Chris@82 251
Chris@82 252 @example
Chris@82 253 double fftw_cost(const fftw_plan plan);
Chris@82 254 @end example
Chris@82 255 @findex fftw_cost
Chris@82 256
Chris@82 257 The following two routines are provided purely for academic purposes
Chris@82 258 (that is, for entertainment).
Chris@82 259
Chris@82 260 @example
Chris@82 261 void fftw_flops(const fftw_plan plan,
Chris@82 262 double *add, double *mul, double *fma);
Chris@82 263 @end example
Chris@82 264 @findex fftw_flops
Chris@82 265
Chris@82 266 Given a @code{plan}, set @code{add}, @code{mul}, and @code{fma} to an
Chris@82 267 exact count of the number of floating-point additions, multiplications,
Chris@82 268 and fused multiply-add operations involved in the plan's execution. The
Chris@82 269 total number of floating-point operations (flops) is @code{add + mul +
Chris@82 270 2*fma}, or @code{add + mul + fma} if the hardware supports fused
Chris@82 271 multiply-add instructions (although the number of FMA operations is only
Chris@82 272 approximate because of compiler voodoo). (The number of operations
Chris@82 273 should be an integer, but we use @code{double} to avoid overflowing
Chris@82 274 @code{int} for large transforms; the arguments are of type @code{double}
Chris@82 275 even for single and long-double precision versions of FFTW.)
Chris@82 276
Chris@82 277 @example
Chris@82 278 void fftw_fprint_plan(const fftw_plan plan, FILE *output_file);
Chris@82 279 void fftw_print_plan(const fftw_plan plan);
Chris@82 280 char *fftw_sprint_plan(const fftw_plan plan);
Chris@82 281 @end example
Chris@82 282 @findex fftw_fprint_plan
Chris@82 283 @findex fftw_print_plan
Chris@82 284
Chris@82 285 This outputs a ``nerd-readable'' representation of the @code{plan} to
Chris@82 286 the given file, to @code{stdout}, or two a newly allocated
Chris@82 287 NUL-terminated string (which the caller is responsible for deallocating
Chris@82 288 with @code{free}), respectively.
Chris@82 289
Chris@82 290 @c ------------------------------------------------------------
Chris@82 291 @node Basic Interface, Advanced Interface, Using Plans, FFTW Reference
Chris@82 292 @section Basic Interface
Chris@82 293 @cindex basic interface
Chris@82 294
Chris@82 295 Recall that the FFTW API is divided into three parts@footnote{@i{Gallia est
Chris@82 296 omnis divisa in partes tres} (Julius Caesar).}: the @dfn{basic interface}
Chris@82 297 computes a single transform of contiguous data, the @dfn{advanced
Chris@82 298 interface} computes transforms of multiple or strided arrays, and the
Chris@82 299 @dfn{guru interface} supports the most general data layouts,
Chris@82 300 multiplicities, and strides. This section describes the the basic
Chris@82 301 interface, which we expect to satisfy the needs of most users.
Chris@82 302
Chris@82 303 @menu
Chris@82 304 * Complex DFTs::
Chris@82 305 * Planner Flags::
Chris@82 306 * Real-data DFTs::
Chris@82 307 * Real-data DFT Array Format::
Chris@82 308 * Real-to-Real Transforms::
Chris@82 309 * Real-to-Real Transform Kinds::
Chris@82 310 @end menu
Chris@82 311
Chris@82 312 @c =========>
Chris@82 313 @node Complex DFTs, Planner Flags, Basic Interface, Basic Interface
Chris@82 314 @subsection Complex DFTs
Chris@82 315
Chris@82 316 @example
Chris@82 317 fftw_plan fftw_plan_dft_1d(int n0,
Chris@82 318 fftw_complex *in, fftw_complex *out,
Chris@82 319 int sign, unsigned flags);
Chris@82 320 fftw_plan fftw_plan_dft_2d(int n0, int n1,
Chris@82 321 fftw_complex *in, fftw_complex *out,
Chris@82 322 int sign, unsigned flags);
Chris@82 323 fftw_plan fftw_plan_dft_3d(int n0, int n1, int n2,
Chris@82 324 fftw_complex *in, fftw_complex *out,
Chris@82 325 int sign, unsigned flags);
Chris@82 326 fftw_plan fftw_plan_dft(int rank, const int *n,
Chris@82 327 fftw_complex *in, fftw_complex *out,
Chris@82 328 int sign, unsigned flags);
Chris@82 329 @end example
Chris@82 330 @findex fftw_plan_dft_1d
Chris@82 331 @findex fftw_plan_dft_2d
Chris@82 332 @findex fftw_plan_dft_3d
Chris@82 333 @findex fftw_plan_dft
Chris@82 334
Chris@82 335 Plan a complex input/output discrete Fourier transform (DFT) in zero or
Chris@82 336 more dimensions, returning an @code{fftw_plan} (@pxref{Using Plans}).
Chris@82 337
Chris@82 338 Once you have created a plan for a certain transform type and
Chris@82 339 parameters, then creating another plan of the same type and parameters,
Chris@82 340 but for different arrays, is fast and shares constant data with the
Chris@82 341 first plan (if it still exists).
Chris@82 342
Chris@82 343 The planner returns @code{NULL} if the plan cannot be created. In the
Chris@82 344 standard FFTW distribution, the basic interface is guaranteed to return
Chris@82 345 a non-@code{NULL} plan. A plan may be @code{NULL}, however, if you are
Chris@82 346 using a customized FFTW configuration supporting a restricted set of
Chris@82 347 transforms.
Chris@82 348
Chris@82 349 @subsubheading Arguments
Chris@82 350 @itemize @bullet
Chris@82 351
Chris@82 352 @item
Chris@82 353 @code{rank} is the rank of the transform (it should be the size of the
Chris@82 354 array @code{*n}), and can be any non-negative integer. (@xref{Complex
Chris@82 355 Multi-Dimensional DFTs}, for the definition of ``rank''.) The
Chris@82 356 @samp{_1d}, @samp{_2d}, and @samp{_3d} planners correspond to a
Chris@82 357 @code{rank} of @code{1}, @code{2}, and @code{3}, respectively. The rank
Chris@82 358 may be zero, which is equivalent to a rank-1 transform of size 1, i.e. a
Chris@82 359 copy of one number from input to output.
Chris@82 360
Chris@82 361 @item
Chris@82 362 @code{n0}, @code{n1}, @code{n2}, or @code{n[0..rank-1]} (as appropriate
Chris@82 363 for each routine) specify the size of the transform dimensions. They
Chris@82 364 can be any positive integer.
Chris@82 365
Chris@82 366 @itemize @minus
Chris@82 367 @item
Chris@82 368 @cindex row-major
Chris@82 369 Multi-dimensional arrays are stored in row-major order with dimensions:
Chris@82 370 @code{n0} x @code{n1}; or @code{n0} x @code{n1} x @code{n2}; or
Chris@82 371 @code{n[0]} x @code{n[1]} x ... x @code{n[rank-1]}.
Chris@82 372 @xref{Multi-dimensional Array Format}.
Chris@82 373 @item
Chris@82 374 FFTW is best at handling sizes of the form
Chris@82 375 @ifinfo
Chris@82 376 @math{2^a 3^b 5^c 7^d 11^e 13^f},
Chris@82 377 @end ifinfo
Chris@82 378 @tex
Chris@82 379 $2^a 3^b 5^c 7^d 11^e 13^f$,
Chris@82 380 @end tex
Chris@82 381 @html
Chris@82 382 2<sup>a</sup> 3<sup>b</sup> 5<sup>c</sup> 7<sup>d</sup>
Chris@82 383 11<sup>e</sup> 13<sup>f</sup>,
Chris@82 384 @end html
Chris@82 385 where @math{e+f} is either @math{0} or @math{1}, and the other exponents
Chris@82 386 are arbitrary. Other sizes are computed by means of a slow,
Chris@82 387 general-purpose algorithm (which nevertheless retains @Onlogn{} performance even for prime sizes). It is possible to customize FFTW
Chris@82 388 for different array sizes; see @ref{Installation and Customization}.
Chris@82 389 Transforms whose sizes are powers of @math{2} are especially fast.
Chris@82 390 @end itemize
Chris@82 391
Chris@82 392 @item
Chris@82 393 @code{in} and @code{out} point to the input and output arrays of the
Chris@82 394 transform, which may be the same (yielding an in-place transform).
Chris@82 395 @cindex in-place
Chris@82 396 These arrays are overwritten during planning, unless
Chris@82 397 @code{FFTW_ESTIMATE} is used in the flags. (The arrays need not be
Chris@82 398 initialized, but they must be allocated.)
Chris@82 399
Chris@82 400 If @code{in == out}, the transform is @dfn{in-place} and the input
Chris@82 401 array is overwritten. If @code{in != out}, the two arrays must
Chris@82 402 not overlap (but FFTW does not check for this condition).
Chris@82 403
Chris@82 404 @item
Chris@82 405 @ctindex FFTW_FORWARD
Chris@82 406 @ctindex FFTW_BACKWARD
Chris@82 407 @code{sign} is the sign of the exponent in the formula that defines the
Chris@82 408 Fourier transform. It can be @math{-1} (= @code{FFTW_FORWARD}) or
Chris@82 409 @math{+1} (= @code{FFTW_BACKWARD}).
Chris@82 410
Chris@82 411 @item
Chris@82 412 @cindex flags
Chris@82 413 @code{flags} is a bitwise OR (@samp{|}) of zero or more planner flags,
Chris@82 414 as defined in @ref{Planner Flags}.
Chris@82 415
Chris@82 416 @end itemize
Chris@82 417
Chris@82 418 FFTW computes an unnormalized transform: computing a forward followed by
Chris@82 419 a backward transform (or vice versa) will result in the original data
Chris@82 420 multiplied by the size of the transform (the product of the dimensions).
Chris@82 421 @cindex normalization
Chris@82 422 For more information, see @ref{What FFTW Really Computes}.
Chris@82 423
Chris@82 424 @c =========>
Chris@82 425 @node Planner Flags, Real-data DFTs, Complex DFTs, Basic Interface
Chris@82 426 @subsection Planner Flags
Chris@82 427
Chris@82 428 All of the planner routines in FFTW accept an integer @code{flags}
Chris@82 429 argument, which is a bitwise OR (@samp{|}) of zero or more of the flag
Chris@82 430 constants defined below. These flags control the rigor (and time) of
Chris@82 431 the planning process, and can also impose (or lift) restrictions on the
Chris@82 432 type of transform algorithm that is employed.
Chris@82 433
Chris@82 434 @emph{Important:} the planner overwrites the input array during
Chris@82 435 planning unless a saved plan (@pxref{Wisdom}) is available for that
Chris@82 436 problem, so you should initialize your input data after creating the
Chris@82 437 plan. The only exceptions to this are the @code{FFTW_ESTIMATE} and
Chris@82 438 @code{FFTW_WISDOM_ONLY} flags, as mentioned below.
Chris@82 439
Chris@82 440 In all cases, if wisdom is available for the given problem that was
Chris@82 441 created with equal-or-greater planning rigor, then the more rigorous
Chris@82 442 wisdom is used. For example, in @code{FFTW_ESTIMATE} mode any available
Chris@82 443 wisdom is used, whereas in @code{FFTW_PATIENT} mode only wisdom created
Chris@82 444 in patient or exhaustive mode can be used. @xref{Words of Wisdom-Saving
Chris@82 445 Plans}.
Chris@82 446
Chris@82 447 @subsubheading Planning-rigor flags
Chris@82 448 @itemize @bullet
Chris@82 449
Chris@82 450 @item
Chris@82 451 @ctindex FFTW_ESTIMATE
Chris@82 452 @code{FFTW_ESTIMATE} specifies that, instead of actual measurements of
Chris@82 453 different algorithms, a simple heuristic is used to pick a (probably
Chris@82 454 sub-optimal) plan quickly. With this flag, the input/output arrays are
Chris@82 455 not overwritten during planning.
Chris@82 456
Chris@82 457 @item
Chris@82 458 @ctindex FFTW_MEASURE
Chris@82 459 @code{FFTW_MEASURE} tells FFTW to find an optimized plan by actually
Chris@82 460 @emph{computing} several FFTs and measuring their execution time.
Chris@82 461 Depending on your machine, this can take some time (often a few
Chris@82 462 seconds). @code{FFTW_MEASURE} is the default planning option.
Chris@82 463
Chris@82 464 @item
Chris@82 465 @ctindex FFTW_PATIENT
Chris@82 466 @code{FFTW_PATIENT} is like @code{FFTW_MEASURE}, but considers a wider
Chris@82 467 range of algorithms and often produces a ``more optimal'' plan
Chris@82 468 (especially for large transforms), but at the expense of several times
Chris@82 469 longer planning time (especially for large transforms).
Chris@82 470
Chris@82 471 @item
Chris@82 472 @ctindex FFTW_EXHAUSTIVE
Chris@82 473 @code{FFTW_EXHAUSTIVE} is like @code{FFTW_PATIENT}, but considers an
Chris@82 474 even wider range of algorithms, including many that we think are
Chris@82 475 unlikely to be fast, to produce the most optimal plan but with a
Chris@82 476 substantially increased planning time.
Chris@82 477
Chris@82 478 @item
Chris@82 479 @ctindex FFTW_WISDOM_ONLY
Chris@82 480 @code{FFTW_WISDOM_ONLY} is a special planning mode in which the plan
Chris@82 481 is only created if wisdom is available for the given problem, and
Chris@82 482 otherwise a @code{NULL} plan is returned. This can be combined with
Chris@82 483 other flags, e.g. @samp{FFTW_WISDOM_ONLY | FFTW_PATIENT} creates a
Chris@82 484 plan only if wisdom is available that was created in
Chris@82 485 @code{FFTW_PATIENT} or @code{FFTW_EXHAUSTIVE} mode. The
Chris@82 486 @code{FFTW_WISDOM_ONLY} flag is intended for users who need to detect
Chris@82 487 whether wisdom is available; for example, if wisdom is not available
Chris@82 488 one may wish to allocate new arrays for planning so that user data is
Chris@82 489 not overwritten.
Chris@82 490
Chris@82 491 @end itemize
Chris@82 492
Chris@82 493 @subsubheading Algorithm-restriction flags
Chris@82 494 @itemize @bullet
Chris@82 495
Chris@82 496 @item
Chris@82 497 @ctindex FFTW_DESTROY_INPUT
Chris@82 498 @code{FFTW_DESTROY_INPUT} specifies that an out-of-place transform is
Chris@82 499 allowed to @emph{overwrite its input} array with arbitrary data; this
Chris@82 500 can sometimes allow more efficient algorithms to be employed.
Chris@82 501 @cindex out-of-place
Chris@82 502
Chris@82 503 @item
Chris@82 504 @ctindex FFTW_PRESERVE_INPUT
Chris@82 505 @code{FFTW_PRESERVE_INPUT} specifies that an out-of-place transform must
Chris@82 506 @emph{not change its input} array. This is ordinarily the
Chris@82 507 @emph{default}, except for c2r and hc2r (i.e. complex-to-real)
Chris@82 508 transforms for which @code{FFTW_DESTROY_INPUT} is the default. In the
Chris@82 509 latter cases, passing @code{FFTW_PRESERVE_INPUT} will attempt to use
Chris@82 510 algorithms that do not destroy the input, at the expense of worse
Chris@82 511 performance; for multi-dimensional c2r transforms, however, no
Chris@82 512 input-preserving algorithms are implemented and the planner will return
Chris@82 513 @code{NULL} if one is requested.
Chris@82 514 @cindex c2r
Chris@82 515 @cindex hc2r
Chris@82 516
Chris@82 517 @item
Chris@82 518 @ctindex FFTW_UNALIGNED
Chris@82 519 @cindex alignment
Chris@82 520 @findex fftw_malloc
Chris@82 521 @findex fftw_alignment_of
Chris@82 522 @code{FFTW_UNALIGNED} specifies that the algorithm may not impose any
Chris@82 523 unusual alignment requirements on the input/output arrays (i.e. no
Chris@82 524 SIMD may be used). This flag is normally @emph{not necessary}, since
Chris@82 525 the planner automatically detects misaligned arrays. The only use for
Chris@82 526 this flag is if you want to use the new-array execute interface to
Chris@82 527 execute a given plan on a different array that may not be aligned like
Chris@82 528 the original. (Using @code{fftw_malloc} makes this flag unnecessary
Chris@82 529 even then. You can also use @code{fftw_alignment_of} to detect
Chris@82 530 whether two arrays are equivalently aligned.)
Chris@82 531
Chris@82 532 @end itemize
Chris@82 533
Chris@82 534 @subsubheading Limiting planning time
Chris@82 535
Chris@82 536 @example
Chris@82 537 extern void fftw_set_timelimit(double seconds);
Chris@82 538 @end example
Chris@82 539 @findex fftw_set_timelimit
Chris@82 540
Chris@82 541 This function instructs FFTW to spend at most @code{seconds} seconds
Chris@82 542 (approximately) in the planner. If @code{seconds ==
Chris@82 543 FFTW_NO_TIMELIMIT} (the default value, which is negative), then
Chris@82 544 planning time is unbounded. Otherwise, FFTW plans with a
Chris@82 545 progressively wider range of algorithms until the the given time limit
Chris@82 546 is reached or the given range of algorithms is explored, returning the
Chris@82 547 best available plan.
Chris@82 548 @ctindex FFTW_NO_TIMELIMIT
Chris@82 549
Chris@82 550
Chris@82 551 For example, specifying @code{FFTW_PATIENT} first plans in
Chris@82 552 @code{FFTW_ESTIMATE} mode, then in @code{FFTW_MEASURE} mode, then
Chris@82 553 finally (time permitting) in @code{FFTW_PATIENT}. If
Chris@82 554 @code{FFTW_EXHAUSTIVE} is specified instead, the planner will further
Chris@82 555 progress to @code{FFTW_EXHAUSTIVE} mode.
Chris@82 556
Chris@82 557 Note that the @code{seconds} argument specifies only a rough limit; in
Chris@82 558 practice, the planner may use somewhat more time if the time limit is
Chris@82 559 reached when the planner is in the middle of an operation that cannot
Chris@82 560 be interrupted. At the very least, the planner will complete planning
Chris@82 561 in @code{FFTW_ESTIMATE} mode (which is thus equivalent to a time limit
Chris@82 562 of 0).
Chris@82 563
Chris@82 564
Chris@82 565 @c =========>
Chris@82 566 @node Real-data DFTs, Real-data DFT Array Format, Planner Flags, Basic Interface
Chris@82 567 @subsection Real-data DFTs
Chris@82 568
Chris@82 569 @example
Chris@82 570 fftw_plan fftw_plan_dft_r2c_1d(int n0,
Chris@82 571 double *in, fftw_complex *out,
Chris@82 572 unsigned flags);
Chris@82 573 fftw_plan fftw_plan_dft_r2c_2d(int n0, int n1,
Chris@82 574 double *in, fftw_complex *out,
Chris@82 575 unsigned flags);
Chris@82 576 fftw_plan fftw_plan_dft_r2c_3d(int n0, int n1, int n2,
Chris@82 577 double *in, fftw_complex *out,
Chris@82 578 unsigned flags);
Chris@82 579 fftw_plan fftw_plan_dft_r2c(int rank, const int *n,
Chris@82 580 double *in, fftw_complex *out,
Chris@82 581 unsigned flags);
Chris@82 582 @end example
Chris@82 583 @findex fftw_plan_dft_r2c_1d
Chris@82 584 @findex fftw_plan_dft_r2c_2d
Chris@82 585 @findex fftw_plan_dft_r2c_3d
Chris@82 586 @findex fftw_plan_dft_r2c
Chris@82 587 @cindex r2c
Chris@82 588
Chris@82 589 Plan a real-input/complex-output discrete Fourier transform (DFT) in
Chris@82 590 zero or more dimensions, returning an @code{fftw_plan} (@pxref{Using
Chris@82 591 Plans}).
Chris@82 592
Chris@82 593 Once you have created a plan for a certain transform type and
Chris@82 594 parameters, then creating another plan of the same type and parameters,
Chris@82 595 but for different arrays, is fast and shares constant data with the
Chris@82 596 first plan (if it still exists).
Chris@82 597
Chris@82 598 The planner returns @code{NULL} if the plan cannot be created. A
Chris@82 599 non-@code{NULL} plan is always returned by the basic interface unless
Chris@82 600 you are using a customized FFTW configuration supporting a restricted
Chris@82 601 set of transforms, or if you use the @code{FFTW_PRESERVE_INPUT} flag
Chris@82 602 with a multi-dimensional out-of-place c2r transform (see below).
Chris@82 603
Chris@82 604 @subsubheading Arguments
Chris@82 605 @itemize @bullet
Chris@82 606
Chris@82 607 @item
Chris@82 608 @code{rank} is the rank of the transform (it should be the size of the
Chris@82 609 array @code{*n}), and can be any non-negative integer. (@xref{Complex
Chris@82 610 Multi-Dimensional DFTs}, for the definition of ``rank''.) The
Chris@82 611 @samp{_1d}, @samp{_2d}, and @samp{_3d} planners correspond to a
Chris@82 612 @code{rank} of @code{1}, @code{2}, and @code{3}, respectively. The rank
Chris@82 613 may be zero, which is equivalent to a rank-1 transform of size 1, i.e. a
Chris@82 614 copy of one real number (with zero imaginary part) from input to output.
Chris@82 615
Chris@82 616 @item
Chris@82 617 @code{n0}, @code{n1}, @code{n2}, or @code{n[0..rank-1]}, (as appropriate
Chris@82 618 for each routine) specify the size of the transform dimensions. They
Chris@82 619 can be any positive integer. This is different in general from the
Chris@82 620 @emph{physical} array dimensions, which are described in @ref{Real-data
Chris@82 621 DFT Array Format}.
Chris@82 622
Chris@82 623 @itemize @minus
Chris@82 624 @item
Chris@82 625 FFTW is best at handling sizes of the form
Chris@82 626 @ifinfo
Chris@82 627 @math{2^a 3^b 5^c 7^d 11^e 13^f},
Chris@82 628 @end ifinfo
Chris@82 629 @tex
Chris@82 630 $2^a 3^b 5^c 7^d 11^e 13^f$,
Chris@82 631 @end tex
Chris@82 632 @html
Chris@82 633 2<sup>a</sup> 3<sup>b</sup> 5<sup>c</sup> 7<sup>d</sup>
Chris@82 634 11<sup>e</sup> 13<sup>f</sup>,
Chris@82 635 @end html
Chris@82 636 where @math{e+f} is either @math{0} or @math{1}, and the other exponents
Chris@82 637 are arbitrary. Other sizes are computed by means of a slow,
Chris@82 638 general-purpose algorithm (which nevertheless retains @Onlogn{} performance even for prime sizes). (It is possible to customize FFTW
Chris@82 639 for different array sizes; see @ref{Installation and Customization}.)
Chris@82 640 Transforms whose sizes are powers of @math{2} are especially fast, and
Chris@82 641 it is generally beneficial for the @emph{last} dimension of an r2c/c2r
Chris@82 642 transform to be @emph{even}.
Chris@82 643 @end itemize
Chris@82 644
Chris@82 645 @item
Chris@82 646 @code{in} and @code{out} point to the input and output arrays of the
Chris@82 647 transform, which may be the same (yielding an in-place transform).
Chris@82 648 @cindex in-place
Chris@82 649 These arrays are overwritten during planning, unless
Chris@82 650 @code{FFTW_ESTIMATE} is used in the flags. (The arrays need not be
Chris@82 651 initialized, but they must be allocated.) For an in-place transform, it
Chris@82 652 is important to remember that the real array will require padding,
Chris@82 653 described in @ref{Real-data DFT Array Format}.
Chris@82 654 @cindex padding
Chris@82 655
Chris@82 656 @item
Chris@82 657 @cindex flags
Chris@82 658 @code{flags} is a bitwise OR (@samp{|}) of zero or more planner flags,
Chris@82 659 as defined in @ref{Planner Flags}.
Chris@82 660
Chris@82 661 @end itemize
Chris@82 662
Chris@82 663 The inverse transforms, taking complex input (storing the non-redundant
Chris@82 664 half of a logically Hermitian array) to real output, are given by:
Chris@82 665
Chris@82 666 @example
Chris@82 667 fftw_plan fftw_plan_dft_c2r_1d(int n0,
Chris@82 668 fftw_complex *in, double *out,
Chris@82 669 unsigned flags);
Chris@82 670 fftw_plan fftw_plan_dft_c2r_2d(int n0, int n1,
Chris@82 671 fftw_complex *in, double *out,
Chris@82 672 unsigned flags);
Chris@82 673 fftw_plan fftw_plan_dft_c2r_3d(int n0, int n1, int n2,
Chris@82 674 fftw_complex *in, double *out,
Chris@82 675 unsigned flags);
Chris@82 676 fftw_plan fftw_plan_dft_c2r(int rank, const int *n,
Chris@82 677 fftw_complex *in, double *out,
Chris@82 678 unsigned flags);
Chris@82 679 @end example
Chris@82 680 @findex fftw_plan_dft_c2r_1d
Chris@82 681 @findex fftw_plan_dft_c2r_2d
Chris@82 682 @findex fftw_plan_dft_c2r_3d
Chris@82 683 @findex fftw_plan_dft_c2r
Chris@82 684 @cindex c2r
Chris@82 685
Chris@82 686 The arguments are the same as for the r2c transforms, except that the
Chris@82 687 input and output data formats are reversed.
Chris@82 688
Chris@82 689 FFTW computes an unnormalized transform: computing an r2c followed by a
Chris@82 690 c2r transform (or vice versa) will result in the original data
Chris@82 691 multiplied by the size of the transform (the product of the logical
Chris@82 692 dimensions).
Chris@82 693 @cindex normalization
Chris@82 694 An r2c transform produces the same output as a @code{FFTW_FORWARD}
Chris@82 695 complex DFT of the same input, and a c2r transform is correspondingly
Chris@82 696 equivalent to @code{FFTW_BACKWARD}. For more information, see @ref{What
Chris@82 697 FFTW Really Computes}.
Chris@82 698
Chris@82 699 @c =========>
Chris@82 700 @node Real-data DFT Array Format, Real-to-Real Transforms, Real-data DFTs, Basic Interface
Chris@82 701 @subsection Real-data DFT Array Format
Chris@82 702 @cindex r2c/c2r multi-dimensional array format
Chris@82 703
Chris@82 704 The output of a DFT of real data (r2c) contains symmetries that, in
Chris@82 705 principle, make half of the outputs redundant (@pxref{What FFTW Really
Chris@82 706 Computes}). (Similarly for the input of an inverse c2r transform.) In
Chris@82 707 practice, it is not possible to entirely realize these savings in an
Chris@82 708 efficient and understandable format that generalizes to
Chris@82 709 multi-dimensional transforms. Instead, the output of the r2c
Chris@82 710 transforms is @emph{slightly} over half of the output of the
Chris@82 711 corresponding complex transform. We do not ``pack'' the data in any
Chris@82 712 way, but store it as an ordinary array of @code{fftw_complex} values.
Chris@82 713 In fact, this data is simply a subsection of what would be the array in
Chris@82 714 the corresponding complex transform.
Chris@82 715
Chris@82 716 Specifically, for a real transform of @math{d} (= @code{rank})
Chris@82 717 dimensions @ndims{}, the complex data is an @ndimshalf array of
Chris@82 718 @code{fftw_complex} values in row-major order (with the division rounded
Chris@82 719 down). That is, we only store the @emph{lower} half (non-negative
Chris@82 720 frequencies), plus one element, of the last dimension of the data from
Chris@82 721 the ordinary complex transform. (We could have instead taken half of
Chris@82 722 any other dimension, but implementation turns out to be simpler if the
Chris@82 723 last, contiguous, dimension is used.)
Chris@82 724
Chris@82 725 @cindex out-of-place
Chris@82 726 For an out-of-place transform, the real data is simply an array with
Chris@82 727 physical dimensions @ndims in row-major order.
Chris@82 728
Chris@82 729 @cindex in-place
Chris@82 730 @cindex padding
Chris@82 731 For an in-place transform, some complications arise since the complex data
Chris@82 732 is slightly larger than the real data. In this case, the final
Chris@82 733 dimension of the real data must be @emph{padded} with extra values to
Chris@82 734 accommodate the size of the complex data---two extra if the last
Chris@82 735 dimension is even and one if it is odd. That is, the last dimension of
Chris@82 736 the real data must physically contain
Chris@82 737 @tex
Chris@82 738 $2 (n_{d-1}/2+1)$
Chris@82 739 @end tex
Chris@82 740 @ifinfo
Chris@82 741 2 * (n[d-1]/2+1)
Chris@82 742 @end ifinfo
Chris@82 743 @html
Chris@82 744 2 * (n<sub>d-1</sub>/2+1)
Chris@82 745 @end html
Chris@82 746 @code{double} values (exactly enough to hold the complex data). This
Chris@82 747 physical array size does not, however, change the @emph{logical} array
Chris@82 748 size---only
Chris@82 749 @tex
Chris@82 750 $n_{d-1}$
Chris@82 751 @end tex
Chris@82 752 @ifinfo
Chris@82 753 n[d-1]
Chris@82 754 @end ifinfo
Chris@82 755 @html
Chris@82 756 n<sub>d-1</sub>
Chris@82 757 @end html
Chris@82 758 values are actually stored in the last dimension, and
Chris@82 759 @tex
Chris@82 760 $n_{d-1}$
Chris@82 761 @end tex
Chris@82 762 @ifinfo
Chris@82 763 n[d-1]
Chris@82 764 @end ifinfo
Chris@82 765 @html
Chris@82 766 n<sub>d-1</sub>
Chris@82 767 @end html
Chris@82 768 is the last dimension passed to the planner.
Chris@82 769
Chris@82 770 @c =========>
Chris@82 771 @node Real-to-Real Transforms, Real-to-Real Transform Kinds, Real-data DFT Array Format, Basic Interface
Chris@82 772 @subsection Real-to-Real Transforms
Chris@82 773 @cindex r2r
Chris@82 774
Chris@82 775 @example
Chris@82 776 fftw_plan fftw_plan_r2r_1d(int n, double *in, double *out,
Chris@82 777 fftw_r2r_kind kind, unsigned flags);
Chris@82 778 fftw_plan fftw_plan_r2r_2d(int n0, int n1, double *in, double *out,
Chris@82 779 fftw_r2r_kind kind0, fftw_r2r_kind kind1,
Chris@82 780 unsigned flags);
Chris@82 781 fftw_plan fftw_plan_r2r_3d(int n0, int n1, int n2,
Chris@82 782 double *in, double *out,
Chris@82 783 fftw_r2r_kind kind0,
Chris@82 784 fftw_r2r_kind kind1,
Chris@82 785 fftw_r2r_kind kind2,
Chris@82 786 unsigned flags);
Chris@82 787 fftw_plan fftw_plan_r2r(int rank, const int *n, double *in, double *out,
Chris@82 788 const fftw_r2r_kind *kind, unsigned flags);
Chris@82 789 @end example
Chris@82 790 @findex fftw_plan_r2r_1d
Chris@82 791 @findex fftw_plan_r2r_2d
Chris@82 792 @findex fftw_plan_r2r_3d
Chris@82 793 @findex fftw_plan_r2r
Chris@82 794
Chris@82 795 Plan a real input/output (r2r) transform of various kinds in zero or
Chris@82 796 more dimensions, returning an @code{fftw_plan} (@pxref{Using Plans}).
Chris@82 797
Chris@82 798 Once you have created a plan for a certain transform type and
Chris@82 799 parameters, then creating another plan of the same type and parameters,
Chris@82 800 but for different arrays, is fast and shares constant data with the
Chris@82 801 first plan (if it still exists).
Chris@82 802
Chris@82 803 The planner returns @code{NULL} if the plan cannot be created. A
Chris@82 804 non-@code{NULL} plan is always returned by the basic interface unless
Chris@82 805 you are using a customized FFTW configuration supporting a restricted
Chris@82 806 set of transforms, or for size-1 @code{FFTW_REDFT00} kinds (which are
Chris@82 807 not defined).
Chris@82 808 @ctindex FFTW_REDFT00
Chris@82 809
Chris@82 810 @subsubheading Arguments
Chris@82 811 @itemize @bullet
Chris@82 812
Chris@82 813 @item
Chris@82 814 @code{rank} is the dimensionality of the transform (it should be the
Chris@82 815 size of the arrays @code{*n} and @code{*kind}), and can be any
Chris@82 816 non-negative integer. The @samp{_1d}, @samp{_2d}, and @samp{_3d}
Chris@82 817 planners correspond to a @code{rank} of @code{1}, @code{2}, and
Chris@82 818 @code{3}, respectively. A @code{rank} of zero is equivalent to a copy
Chris@82 819 of one number from input to output.
Chris@82 820
Chris@82 821 @item
Chris@82 822 @code{n}, or @code{n0}/@code{n1}/@code{n2}, or @code{n[rank]},
Chris@82 823 respectively, gives the (physical) size of the transform dimensions.
Chris@82 824 They can be any positive integer.
Chris@82 825
Chris@82 826 @itemize @minus
Chris@82 827 @item
Chris@82 828 @cindex row-major
Chris@82 829 Multi-dimensional arrays are stored in row-major order with dimensions:
Chris@82 830 @code{n0} x @code{n1}; or @code{n0} x @code{n1} x @code{n2}; or
Chris@82 831 @code{n[0]} x @code{n[1]} x ... x @code{n[rank-1]}.
Chris@82 832 @xref{Multi-dimensional Array Format}.
Chris@82 833 @item
Chris@82 834 FFTW is generally best at handling sizes of the form
Chris@82 835 @ifinfo
Chris@82 836 @math{2^a 3^b 5^c 7^d 11^e 13^f},
Chris@82 837 @end ifinfo
Chris@82 838 @tex
Chris@82 839 $2^a 3^b 5^c 7^d 11^e 13^f$,
Chris@82 840 @end tex
Chris@82 841 @html
Chris@82 842 2<sup>a</sup> 3<sup>b</sup> 5<sup>c</sup> 7<sup>d</sup>
Chris@82 843 11<sup>e</sup> 13<sup>f</sup>,
Chris@82 844 @end html
Chris@82 845 where @math{e+f} is either @math{0} or @math{1}, and the other exponents
Chris@82 846 are arbitrary. Other sizes are computed by means of a slow,
Chris@82 847 general-purpose algorithm (which nevertheless retains @Onlogn{} performance even for prime sizes). (It is possible to customize FFTW
Chris@82 848 for different array sizes; see @ref{Installation and Customization}.)
Chris@82 849 Transforms whose sizes are powers of @math{2} are especially fast.
Chris@82 850 @item
Chris@82 851 For a @code{REDFT00} or @code{RODFT00} transform kind in a dimension of
Chris@82 852 size @math{n}, it is @math{n-1} or @math{n+1}, respectively, that
Chris@82 853 should be factorizable in the above form.
Chris@82 854 @end itemize
Chris@82 855
Chris@82 856 @item
Chris@82 857 @code{in} and @code{out} point to the input and output arrays of the
Chris@82 858 transform, which may be the same (yielding an in-place transform).
Chris@82 859 @cindex in-place
Chris@82 860 These arrays are overwritten during planning, unless
Chris@82 861 @code{FFTW_ESTIMATE} is used in the flags. (The arrays need not be
Chris@82 862 initialized, but they must be allocated.)
Chris@82 863
Chris@82 864 @item
Chris@82 865 @code{kind}, or @code{kind0}/@code{kind1}/@code{kind2}, or
Chris@82 866 @code{kind[rank]}, is the kind of r2r transform used for the
Chris@82 867 corresponding dimension. The valid kind constants are described in
Chris@82 868 @ref{Real-to-Real Transform Kinds}. In a multi-dimensional transform,
Chris@82 869 what is computed is the separable product formed by taking each
Chris@82 870 transform kind along the corresponding dimension, one dimension after
Chris@82 871 another.
Chris@82 872
Chris@82 873 @item
Chris@82 874 @cindex flags
Chris@82 875 @code{flags} is a bitwise OR (@samp{|}) of zero or more planner flags,
Chris@82 876 as defined in @ref{Planner Flags}.
Chris@82 877
Chris@82 878 @end itemize
Chris@82 879
Chris@82 880 @c =========>
Chris@82 881 @node Real-to-Real Transform Kinds, , Real-to-Real Transforms, Basic Interface
Chris@82 882 @subsection Real-to-Real Transform Kinds
Chris@82 883 @cindex kind (r2r)
Chris@82 884
Chris@82 885 FFTW currently supports 11 different r2r transform kinds, specified by
Chris@82 886 one of the constants below. For the precise definitions of these
Chris@82 887 transforms, see @ref{What FFTW Really Computes}. For a more colloquial
Chris@82 888 introduction to these transform kinds, see @ref{More DFTs of Real Data}.
Chris@82 889
Chris@82 890 For dimension of size @code{n}, there is a corresponding ``logical''
Chris@82 891 dimension @code{N} that determines the normalization (and the optimal
Chris@82 892 factorization); the formula for @code{N} is given for each kind below.
Chris@82 893 Also, with each transform kind is listed its corrsponding inverse
Chris@82 894 transform. FFTW computes unnormalized transforms: a transform followed
Chris@82 895 by its inverse will result in the original data multiplied by @code{N}
Chris@82 896 (or the product of the @code{N}'s for each dimension, in
Chris@82 897 multi-dimensions).
Chris@82 898 @cindex normalization
Chris@82 899
Chris@82 900 @itemize @bullet
Chris@82 901
Chris@82 902 @item
Chris@82 903 @ctindex FFTW_R2HC
Chris@82 904 @code{FFTW_R2HC} computes a real-input DFT with output in
Chris@82 905 ``halfcomplex'' format, i.e. real and imaginary parts for a transform of
Chris@82 906 size @code{n} stored as:
Chris@82 907 @tex
Chris@82 908 $$
Chris@82 909 r_0, r_1, r_2, \ldots, r_{n/2}, i_{(n+1)/2-1}, \ldots, i_2, i_1
Chris@82 910 $$
Chris@82 911 @end tex
Chris@82 912 @ifinfo
Chris@82 913 r0, r1, r2, r(n/2), i((n+1)/2-1), ..., i2, i1
Chris@82 914 @end ifinfo
Chris@82 915 @html
Chris@82 916 <p align=center>
Chris@82 917 r<sub>0</sub>, r<sub>1</sub>, r<sub>2</sub>, ..., r<sub>n/2</sub>, i<sub>(n+1)/2-1</sub>, ..., i<sub>2</sub>, i<sub>1</sub>
Chris@82 918 </p>
Chris@82 919 @end html
Chris@82 920 (Logical @code{N=n}, inverse is @code{FFTW_HC2R}.)
Chris@82 921
Chris@82 922 @item
Chris@82 923 @ctindex FFTW_HC2R
Chris@82 924 @code{FFTW_HC2R} computes the reverse of @code{FFTW_R2HC}, above.
Chris@82 925 (Logical @code{N=n}, inverse is @code{FFTW_R2HC}.)
Chris@82 926
Chris@82 927 @item
Chris@82 928 @ctindex FFTW_DHT
Chris@82 929 @code{FFTW_DHT} computes a discrete Hartley transform.
Chris@82 930 (Logical @code{N=n}, inverse is @code{FFTW_DHT}.)
Chris@82 931 @cindex discrete Hartley transform
Chris@82 932
Chris@82 933 @item
Chris@82 934 @ctindex FFTW_REDFT00
Chris@82 935 @code{FFTW_REDFT00} computes an REDFT00 transform, i.e. a DCT-I.
Chris@82 936 (Logical @code{N=2*(n-1)}, inverse is @code{FFTW_REDFT00}.)
Chris@82 937 @cindex discrete cosine transform
Chris@82 938 @cindex DCT
Chris@82 939
Chris@82 940 @item
Chris@82 941 @ctindex FFTW_REDFT10
Chris@82 942 @code{FFTW_REDFT10} computes an REDFT10 transform, i.e. a DCT-II (sometimes called ``the'' DCT).
Chris@82 943 (Logical @code{N=2*n}, inverse is @code{FFTW_REDFT01}.)
Chris@82 944
Chris@82 945 @item
Chris@82 946 @ctindex FFTW_REDFT01
Chris@82 947 @code{FFTW_REDFT01} computes an REDFT01 transform, i.e. a DCT-III (sometimes called ``the'' IDCT, being the inverse of DCT-II).
Chris@82 948 (Logical @code{N=2*n}, inverse is @code{FFTW_REDFT=10}.)
Chris@82 949 @cindex IDCT
Chris@82 950
Chris@82 951 @item
Chris@82 952 @ctindex FFTW_REDFT11
Chris@82 953 @code{FFTW_REDFT11} computes an REDFT11 transform, i.e. a DCT-IV.
Chris@82 954 (Logical @code{N=2*n}, inverse is @code{FFTW_REDFT11}.)
Chris@82 955
Chris@82 956 @item
Chris@82 957 @ctindex FFTW_RODFT00
Chris@82 958 @code{FFTW_RODFT00} computes an RODFT00 transform, i.e. a DST-I.
Chris@82 959 (Logical @code{N=2*(n+1)}, inverse is @code{FFTW_RODFT00}.)
Chris@82 960 @cindex discrete sine transform
Chris@82 961 @cindex DST
Chris@82 962
Chris@82 963 @item
Chris@82 964 @ctindex FFTW_RODFT10
Chris@82 965 @code{FFTW_RODFT10} computes an RODFT10 transform, i.e. a DST-II.
Chris@82 966 (Logical @code{N=2*n}, inverse is @code{FFTW_RODFT01}.)
Chris@82 967
Chris@82 968 @item
Chris@82 969 @ctindex FFTW_RODFT01
Chris@82 970 @code{FFTW_RODFT01} computes an RODFT01 transform, i.e. a DST-III.
Chris@82 971 (Logical @code{N=2*n}, inverse is @code{FFTW_RODFT=10}.)
Chris@82 972
Chris@82 973 @item
Chris@82 974 @ctindex FFTW_RODFT11
Chris@82 975 @code{FFTW_RODFT11} computes an RODFT11 transform, i.e. a DST-IV.
Chris@82 976 (Logical @code{N=2*n}, inverse is @code{FFTW_RODFT11}.)
Chris@82 977
Chris@82 978 @end itemize
Chris@82 979
Chris@82 980 @c ------------------------------------------------------------
Chris@82 981 @node Advanced Interface, Guru Interface, Basic Interface, FFTW Reference
Chris@82 982 @section Advanced Interface
Chris@82 983 @cindex advanced interface
Chris@82 984
Chris@82 985 FFTW's ``advanced'' interface supplements the basic interface with four
Chris@82 986 new planner routines, providing a new level of flexibility: you can plan
Chris@82 987 a transform of multiple arrays simultaneously, operate on non-contiguous
Chris@82 988 (strided) data, and transform a subset of a larger multi-dimensional
Chris@82 989 array. Other than these additional features, the planner operates in
Chris@82 990 the same fashion as in the basic interface, and the resulting
Chris@82 991 @code{fftw_plan} is used in the same way (@pxref{Using Plans}).
Chris@82 992
Chris@82 993 @menu
Chris@82 994 * Advanced Complex DFTs::
Chris@82 995 * Advanced Real-data DFTs::
Chris@82 996 * Advanced Real-to-real Transforms::
Chris@82 997 @end menu
Chris@82 998
Chris@82 999 @c =========>
Chris@82 1000 @node Advanced Complex DFTs, Advanced Real-data DFTs, Advanced Interface, Advanced Interface
Chris@82 1001 @subsection Advanced Complex DFTs
Chris@82 1002
Chris@82 1003 @example
Chris@82 1004 fftw_plan fftw_plan_many_dft(int rank, const int *n, int howmany,
Chris@82 1005 fftw_complex *in, const int *inembed,
Chris@82 1006 int istride, int idist,
Chris@82 1007 fftw_complex *out, const int *onembed,
Chris@82 1008 int ostride, int odist,
Chris@82 1009 int sign, unsigned flags);
Chris@82 1010 @end example
Chris@82 1011 @findex fftw_plan_many_dft
Chris@82 1012
Chris@82 1013 This routine plans multiple multidimensional complex DFTs, and it
Chris@82 1014 extends the @code{fftw_plan_dft} routine (@pxref{Complex DFTs}) to
Chris@82 1015 compute @code{howmany} transforms, each having rank @code{rank} and size
Chris@82 1016 @code{n}. In addition, the transform data need not be contiguous, but
Chris@82 1017 it may be laid out in memory with an arbitrary stride. To account for
Chris@82 1018 these possibilities, @code{fftw_plan_many_dft} adds the new parameters
Chris@82 1019 @code{howmany}, @{@code{i},@code{o}@}@code{nembed},
Chris@82 1020 @{@code{i},@code{o}@}@code{stride}, and
Chris@82 1021 @{@code{i},@code{o}@}@code{dist}. The FFTW basic interface
Chris@82 1022 (@pxref{Complex DFTs}) provides routines specialized for ranks 1, 2,
Chris@82 1023 and@tie{}3, but the advanced interface handles only the general-rank
Chris@82 1024 case.
Chris@82 1025
Chris@82 1026 @code{howmany} is the (nonnegative) number of transforms to compute. The resulting
Chris@82 1027 plan computes @code{howmany} transforms, where the input of the
Chris@82 1028 @code{k}-th transform is at location @code{in+k*idist} (in C pointer
Chris@82 1029 arithmetic), and its output is at location @code{out+k*odist}. Plans
Chris@82 1030 obtained in this way can often be faster than calling FFTW multiple
Chris@82 1031 times for the individual transforms. The basic @code{fftw_plan_dft}
Chris@82 1032 interface corresponds to @code{howmany=1} (in which case the @code{dist}
Chris@82 1033 parameters are ignored).
Chris@82 1034 @cindex howmany parameter
Chris@82 1035 @cindex dist
Chris@82 1036
Chris@82 1037
Chris@82 1038 Each of the @code{howmany} transforms has rank @code{rank} and size
Chris@82 1039 @code{n}, as in the basic interface. In addition, the advanced
Chris@82 1040 interface allows the input and output arrays of each transform to be
Chris@82 1041 row-major subarrays of larger rank-@code{rank} arrays, described by
Chris@82 1042 @code{inembed} and @code{onembed} parameters, respectively.
Chris@82 1043 @{@code{i},@code{o}@}@code{nembed} must be arrays of length @code{rank},
Chris@82 1044 and @code{n} should be elementwise less than or equal to
Chris@82 1045 @{@code{i},@code{o}@}@code{nembed}. Passing @code{NULL} for an
Chris@82 1046 @code{nembed} parameter is equivalent to passing @code{n} (i.e. same
Chris@82 1047 physical and logical dimensions, as in the basic interface.)
Chris@82 1048
Chris@82 1049 The @code{stride} parameters indicate that the @code{j}-th element of
Chris@82 1050 the input or output arrays is located at @code{j*istride} or
Chris@82 1051 @code{j*ostride}, respectively. (For a multi-dimensional array,
Chris@82 1052 @code{j} is the ordinary row-major index.) When combined with the
Chris@82 1053 @code{k}-th transform in a @code{howmany} loop, from above, this means
Chris@82 1054 that the (@code{j},@code{k})-th element is at @code{j*stride+k*dist}.
Chris@82 1055 (The basic @code{fftw_plan_dft} interface corresponds to a stride of 1.)
Chris@82 1056 @cindex stride
Chris@82 1057
Chris@82 1058
Chris@82 1059 For in-place transforms, the input and output @code{stride} and
Chris@82 1060 @code{dist} parameters should be the same; otherwise, the planner may
Chris@82 1061 return @code{NULL}.
Chris@82 1062
Chris@82 1063 Arrays @code{n}, @code{inembed}, and @code{onembed} are not used after
Chris@82 1064 this function returns. You can safely free or reuse them.
Chris@82 1065
Chris@82 1066 @strong{Examples}:
Chris@82 1067 One transform of one 5 by 6 array contiguous in memory:
Chris@82 1068 @example
Chris@82 1069 int rank = 2;
Chris@82 1070 int n[] = @{5, 6@};
Chris@82 1071 int howmany = 1;
Chris@82 1072 int idist = odist = 0; /* unused because howmany = 1 */
Chris@82 1073 int istride = ostride = 1; /* array is contiguous in memory */
Chris@82 1074 int *inembed = n, *onembed = n;
Chris@82 1075 @end example
Chris@82 1076
Chris@82 1077 Transform of three 5 by 6 arrays, each contiguous in memory,
Chris@82 1078 stored in memory one after another:
Chris@82 1079 @example
Chris@82 1080 int rank = 2;
Chris@82 1081 int n[] = @{5, 6@};
Chris@82 1082 int howmany = 3;
Chris@82 1083 int idist = odist = n[0]*n[1]; /* = 30, the distance in memory
Chris@82 1084 between the first element
Chris@82 1085 of the first array and the
Chris@82 1086 first element of the second array */
Chris@82 1087 int istride = ostride = 1; /* array is contiguous in memory */
Chris@82 1088 int *inembed = n, *onembed = n;
Chris@82 1089 @end example
Chris@82 1090
Chris@82 1091 Transform each column of a 2d array with 10 rows and 3 columns:
Chris@82 1092 @example
Chris@82 1093 int rank = 1; /* not 2: we are computing 1d transforms */
Chris@82 1094 int n[] = @{10@}; /* 1d transforms of length 10 */
Chris@82 1095 int howmany = 3;
Chris@82 1096 int idist = odist = 1;
Chris@82 1097 int istride = ostride = 3; /* distance between two elements in
Chris@82 1098 the same column */
Chris@82 1099 int *inembed = n, *onembed = n;
Chris@82 1100 @end example
Chris@82 1101
Chris@82 1102 @c =========>
Chris@82 1103 @node Advanced Real-data DFTs, Advanced Real-to-real Transforms, Advanced Complex DFTs, Advanced Interface
Chris@82 1104 @subsection Advanced Real-data DFTs
Chris@82 1105
Chris@82 1106 @example
Chris@82 1107 fftw_plan fftw_plan_many_dft_r2c(int rank, const int *n, int howmany,
Chris@82 1108 double *in, const int *inembed,
Chris@82 1109 int istride, int idist,
Chris@82 1110 fftw_complex *out, const int *onembed,
Chris@82 1111 int ostride, int odist,
Chris@82 1112 unsigned flags);
Chris@82 1113 fftw_plan fftw_plan_many_dft_c2r(int rank, const int *n, int howmany,
Chris@82 1114 fftw_complex *in, const int *inembed,
Chris@82 1115 int istride, int idist,
Chris@82 1116 double *out, const int *onembed,
Chris@82 1117 int ostride, int odist,
Chris@82 1118 unsigned flags);
Chris@82 1119 @end example
Chris@82 1120 @findex fftw_plan_many_dft_r2c
Chris@82 1121 @findex fftw_plan_many_dft_c2r
Chris@82 1122
Chris@82 1123 Like @code{fftw_plan_many_dft}, these two functions add @code{howmany},
Chris@82 1124 @code{nembed}, @code{stride}, and @code{dist} parameters to the
Chris@82 1125 @code{fftw_plan_dft_r2c} and @code{fftw_plan_dft_c2r} functions, but
Chris@82 1126 otherwise behave the same as the basic interface.
Chris@82 1127
Chris@82 1128 The interpretation of @code{howmany}, @code{stride}, and @code{dist} are
Chris@82 1129 the same as for @code{fftw_plan_many_dft}, above. Note that the
Chris@82 1130 @code{stride} and @code{dist} for the real array are in units of
Chris@82 1131 @code{double}, and for the complex array are in units of
Chris@82 1132 @code{fftw_complex}.
Chris@82 1133
Chris@82 1134 If an @code{nembed} parameter is @code{NULL}, it is interpreted as what
Chris@82 1135 it would be in the basic interface, as described in @ref{Real-data DFT
Chris@82 1136 Array Format}. That is, for the complex array the size is assumed to be
Chris@82 1137 the same as @code{n}, but with the last dimension cut roughly in half.
Chris@82 1138 For the real array, the size is assumed to be @code{n} if the transform
Chris@82 1139 is out-of-place, or @code{n} with the last dimension ``padded'' if the
Chris@82 1140 transform is in-place.
Chris@82 1141
Chris@82 1142 If an @code{nembed} parameter is non-@code{NULL}, it is interpreted as
Chris@82 1143 the physical size of the corresponding array, in row-major order, just
Chris@82 1144 as for @code{fftw_plan_many_dft}. In this case, each dimension of
Chris@82 1145 @code{nembed} should be @code{>=} what it would be in the basic
Chris@82 1146 interface (e.g. the halved or padded @code{n}).
Chris@82 1147
Chris@82 1148 Arrays @code{n}, @code{inembed}, and @code{onembed} are not used after
Chris@82 1149 this function returns. You can safely free or reuse them.
Chris@82 1150
Chris@82 1151 @c =========>
Chris@82 1152 @node Advanced Real-to-real Transforms, , Advanced Real-data DFTs, Advanced Interface
Chris@82 1153 @subsection Advanced Real-to-real Transforms
Chris@82 1154
Chris@82 1155 @example
Chris@82 1156 fftw_plan fftw_plan_many_r2r(int rank, const int *n, int howmany,
Chris@82 1157 double *in, const int *inembed,
Chris@82 1158 int istride, int idist,
Chris@82 1159 double *out, const int *onembed,
Chris@82 1160 int ostride, int odist,
Chris@82 1161 const fftw_r2r_kind *kind, unsigned flags);
Chris@82 1162 @end example
Chris@82 1163 @findex fftw_plan_many_r2r
Chris@82 1164
Chris@82 1165 Like @code{fftw_plan_many_dft}, this functions adds @code{howmany},
Chris@82 1166 @code{nembed}, @code{stride}, and @code{dist} parameters to the
Chris@82 1167 @code{fftw_plan_r2r} function, but otherwise behave the same as the
Chris@82 1168 basic interface. The interpretation of those additional parameters are
Chris@82 1169 the same as for @code{fftw_plan_many_dft}. (Of course, the
Chris@82 1170 @code{stride} and @code{dist} parameters are now in units of
Chris@82 1171 @code{double}, not @code{fftw_complex}.)
Chris@82 1172
Chris@82 1173 Arrays @code{n}, @code{inembed}, @code{onembed}, and @code{kind} are not
Chris@82 1174 used after this function returns. You can safely free or reuse them.
Chris@82 1175
Chris@82 1176 @c ------------------------------------------------------------
Chris@82 1177 @node Guru Interface, New-array Execute Functions, Advanced Interface, FFTW Reference
Chris@82 1178 @section Guru Interface
Chris@82 1179 @cindex guru interface
Chris@82 1180
Chris@82 1181 The ``guru'' interface to FFTW is intended to expose as much as possible
Chris@82 1182 of the flexibility in the underlying FFTW architecture. It allows one
Chris@82 1183 to compute multi-dimensional ``vectors'' (loops) of multi-dimensional
Chris@82 1184 transforms, where each vector/transform dimension has an independent
Chris@82 1185 size and stride.
Chris@82 1186 @cindex vector
Chris@82 1187 One can also use more general complex-number formats, e.g. separate real
Chris@82 1188 and imaginary arrays.
Chris@82 1189
Chris@82 1190 For those users who require the flexibility of the guru interface, it is
Chris@82 1191 important that they pay special attention to the documentation lest they
Chris@82 1192 shoot themselves in the foot.
Chris@82 1193
Chris@82 1194 @menu
Chris@82 1195 * Interleaved and split arrays::
Chris@82 1196 * Guru vector and transform sizes::
Chris@82 1197 * Guru Complex DFTs::
Chris@82 1198 * Guru Real-data DFTs::
Chris@82 1199 * Guru Real-to-real Transforms::
Chris@82 1200 * 64-bit Guru Interface::
Chris@82 1201 @end menu
Chris@82 1202
Chris@82 1203 @c =========>
Chris@82 1204 @node Interleaved and split arrays, Guru vector and transform sizes, Guru Interface, Guru Interface
Chris@82 1205 @subsection Interleaved and split arrays
Chris@82 1206
Chris@82 1207 The guru interface supports two representations of complex numbers,
Chris@82 1208 which we call the interleaved and the split format.
Chris@82 1209
Chris@82 1210 The @dfn{interleaved} format is the same one used by the basic and
Chris@82 1211 advanced interfaces, and it is documented in @ref{Complex numbers}.
Chris@82 1212 In the interleaved format, you provide pointers to the real part of a
Chris@82 1213 complex number, and the imaginary part understood to be stored in the
Chris@82 1214 next memory location.
Chris@82 1215 @cindex interleaved format
Chris@82 1216
Chris@82 1217
Chris@82 1218 The @dfn{split} format allows separate pointers to the real and
Chris@82 1219 imaginary parts of a complex array.
Chris@82 1220 @cindex split format
Chris@82 1221
Chris@82 1222
Chris@82 1223 Technically, the interleaved format is redundant, because you can
Chris@82 1224 always express an interleaved array in terms of a split array with
Chris@82 1225 appropriate pointers and strides. On the other hand, the interleaved
Chris@82 1226 format is simpler to use, and it is common in practice. Hence, FFTW
Chris@82 1227 supports it as a special case.
Chris@82 1228
Chris@82 1229 @c =========>
Chris@82 1230 @node Guru vector and transform sizes, Guru Complex DFTs, Interleaved and split arrays, Guru Interface
Chris@82 1231 @subsection Guru vector and transform sizes
Chris@82 1232
Chris@82 1233 The guru interface introduces one basic new data structure,
Chris@82 1234 @code{fftw_iodim}, that is used to specify sizes and strides for
Chris@82 1235 multi-dimensional transforms and vectors:
Chris@82 1236
Chris@82 1237 @example
Chris@82 1238 typedef struct @{
Chris@82 1239 int n;
Chris@82 1240 int is;
Chris@82 1241 int os;
Chris@82 1242 @} fftw_iodim;
Chris@82 1243 @end example
Chris@82 1244 @tindex fftw_iodim
Chris@82 1245
Chris@82 1246 Here, @code{n} is the size of the dimension, and @code{is} and @code{os}
Chris@82 1247 are the strides of that dimension for the input and output arrays. (The
Chris@82 1248 stride is the separation of consecutive elements along this dimension.)
Chris@82 1249
Chris@82 1250 The meaning of the stride parameter depends on the type of the array
Chris@82 1251 that the stride refers to. @emph{If the array is interleaved complex,
Chris@82 1252 strides are expressed in units of complex numbers
Chris@82 1253 (@code{fftw_complex}). If the array is split complex or real, strides
Chris@82 1254 are expressed in units of real numbers (@code{double}).} This
Chris@82 1255 convention is consistent with the usual pointer arithmetic in the C
Chris@82 1256 language. An interleaved array is denoted by a pointer @code{p} to
Chris@82 1257 @code{fftw_complex}, so that @code{p+1} points to the next complex
Chris@82 1258 number. Split arrays are denoted by pointers to @code{double}, in
Chris@82 1259 which case pointer arithmetic operates in units of
Chris@82 1260 @code{sizeof(double)}.
Chris@82 1261 @cindex stride
Chris@82 1262
Chris@82 1263
Chris@82 1264 The guru planner interfaces all take a (@code{rank}, @code{dims[rank]})
Chris@82 1265 pair describing the transform size, and a (@code{howmany_rank},
Chris@82 1266 @code{howmany_dims[howmany_rank]}) pair describing the ``vector'' size (a
Chris@82 1267 multi-dimensional loop of transforms to perform), where @code{dims} and
Chris@82 1268 @code{howmany_dims} are arrays of @code{fftw_iodim}. Each @code{n} field must
Chris@82 1269 be positive for @code{dims} and nonnegative for @code{howmany_dims}, while both
Chris@82 1270 @code{rank} and @code{howmany_rank} must be nonnegative.
Chris@82 1271
Chris@82 1272 For example, the @code{howmany} parameter in the advanced complex-DFT
Chris@82 1273 interface corresponds to @code{howmany_rank} = 1,
Chris@82 1274 @code{howmany_dims[0].n} = @code{howmany}, @code{howmany_dims[0].is} =
Chris@82 1275 @code{idist}, and @code{howmany_dims[0].os} = @code{odist}.
Chris@82 1276 @cindex howmany loop
Chris@82 1277 @cindex dist
Chris@82 1278 (To compute a single transform, you can just use @code{howmany_rank} = 0.)
Chris@82 1279
Chris@82 1280
Chris@82 1281 A row-major multidimensional array with dimensions @code{n[rank]}
Chris@82 1282 (@pxref{Row-major Format}) corresponds to @code{dims[i].n} =
Chris@82 1283 @code{n[i]} and the recurrence @code{dims[i].is} = @code{n[i+1] *
Chris@82 1284 dims[i+1].is} (similarly for @code{os}). The stride of the last
Chris@82 1285 (@code{i=rank-1}) dimension is the overall stride of the array.
Chris@82 1286 e.g. to be equivalent to the advanced complex-DFT interface, you would
Chris@82 1287 have @code{dims[rank-1].is} = @code{istride} and
Chris@82 1288 @code{dims[rank-1].os} = @code{ostride}.
Chris@82 1289 @cindex row-major
Chris@82 1290
Chris@82 1291
Chris@82 1292 In general, we only guarantee FFTW to return a non-@code{NULL} plan if
Chris@82 1293 the vector and transform dimensions correspond to a set of distinct
Chris@82 1294 indices, and for in-place transforms the input/output strides should
Chris@82 1295 be the same.
Chris@82 1296
Chris@82 1297 @c =========>
Chris@82 1298 @node Guru Complex DFTs, Guru Real-data DFTs, Guru vector and transform sizes, Guru Interface
Chris@82 1299 @subsection Guru Complex DFTs
Chris@82 1300
Chris@82 1301 @example
Chris@82 1302 fftw_plan fftw_plan_guru_dft(
Chris@82 1303 int rank, const fftw_iodim *dims,
Chris@82 1304 int howmany_rank, const fftw_iodim *howmany_dims,
Chris@82 1305 fftw_complex *in, fftw_complex *out,
Chris@82 1306 int sign, unsigned flags);
Chris@82 1307
Chris@82 1308 fftw_plan fftw_plan_guru_split_dft(
Chris@82 1309 int rank, const fftw_iodim *dims,
Chris@82 1310 int howmany_rank, const fftw_iodim *howmany_dims,
Chris@82 1311 double *ri, double *ii, double *ro, double *io,
Chris@82 1312 unsigned flags);
Chris@82 1313 @end example
Chris@82 1314 @findex fftw_plan_guru_dft
Chris@82 1315 @findex fftw_plan_guru_split_dft
Chris@82 1316
Chris@82 1317 These two functions plan a complex-data, multi-dimensional DFT
Chris@82 1318 for the interleaved and split format, respectively.
Chris@82 1319 Transform dimensions are given by (@code{rank}, @code{dims}) over a
Chris@82 1320 multi-dimensional vector (loop) of dimensions (@code{howmany_rank},
Chris@82 1321 @code{howmany_dims}). @code{dims} and @code{howmany_dims} should point
Chris@82 1322 to @code{fftw_iodim} arrays of length @code{rank} and
Chris@82 1323 @code{howmany_rank}, respectively.
Chris@82 1324
Chris@82 1325 @cindex flags
Chris@82 1326 @code{flags} is a bitwise OR (@samp{|}) of zero or more planner flags,
Chris@82 1327 as defined in @ref{Planner Flags}.
Chris@82 1328
Chris@82 1329 In the @code{fftw_plan_guru_dft} function, the pointers @code{in} and
Chris@82 1330 @code{out} point to the interleaved input and output arrays,
Chris@82 1331 respectively. The sign can be either @math{-1} (=
Chris@82 1332 @code{FFTW_FORWARD}) or @math{+1} (= @code{FFTW_BACKWARD}). If the
Chris@82 1333 pointers are equal, the transform is in-place.
Chris@82 1334
Chris@82 1335 In the @code{fftw_plan_guru_split_dft} function,
Chris@82 1336 @code{ri} and @code{ii} point to the real and imaginary input arrays,
Chris@82 1337 and @code{ro} and @code{io} point to the real and imaginary output
Chris@82 1338 arrays. The input and output pointers may be the same, indicating an
Chris@82 1339 in-place transform. For example, for @code{fftw_complex} pointers
Chris@82 1340 @code{in} and @code{out}, the corresponding parameters are:
Chris@82 1341
Chris@82 1342 @example
Chris@82 1343 ri = (double *) in;
Chris@82 1344 ii = (double *) in + 1;
Chris@82 1345 ro = (double *) out;
Chris@82 1346 io = (double *) out + 1;
Chris@82 1347 @end example
Chris@82 1348
Chris@82 1349 Because @code{fftw_plan_guru_split_dft} accepts split arrays, strides
Chris@82 1350 are expressed in units of @code{double}. For a contiguous
Chris@82 1351 @code{fftw_complex} array, the overall stride of the transform should
Chris@82 1352 be 2, the distance between consecutive real parts or between
Chris@82 1353 consecutive imaginary parts; see @ref{Guru vector and transform
Chris@82 1354 sizes}. Note that the dimension strides are applied equally to the
Chris@82 1355 real and imaginary parts; real and imaginary arrays with different
Chris@82 1356 strides are not supported.
Chris@82 1357
Chris@82 1358 There is no @code{sign} parameter in @code{fftw_plan_guru_split_dft}.
Chris@82 1359 This function always plans for an @code{FFTW_FORWARD} transform. To
Chris@82 1360 plan for an @code{FFTW_BACKWARD} transform, you can exploit the
Chris@82 1361 identity that the backwards DFT is equal to the forwards DFT with the
Chris@82 1362 real and imaginary parts swapped. For example, in the case of the
Chris@82 1363 @code{fftw_complex} arrays above, the @code{FFTW_BACKWARD} transform
Chris@82 1364 is computed by the parameters:
Chris@82 1365
Chris@82 1366 @example
Chris@82 1367 ri = (double *) in + 1;
Chris@82 1368 ii = (double *) in;
Chris@82 1369 ro = (double *) out + 1;
Chris@82 1370 io = (double *) out;
Chris@82 1371 @end example
Chris@82 1372
Chris@82 1373 @c =========>
Chris@82 1374 @node Guru Real-data DFTs, Guru Real-to-real Transforms, Guru Complex DFTs, Guru Interface
Chris@82 1375 @subsection Guru Real-data DFTs
Chris@82 1376
Chris@82 1377 @example
Chris@82 1378 fftw_plan fftw_plan_guru_dft_r2c(
Chris@82 1379 int rank, const fftw_iodim *dims,
Chris@82 1380 int howmany_rank, const fftw_iodim *howmany_dims,
Chris@82 1381 double *in, fftw_complex *out,
Chris@82 1382 unsigned flags);
Chris@82 1383
Chris@82 1384 fftw_plan fftw_plan_guru_split_dft_r2c(
Chris@82 1385 int rank, const fftw_iodim *dims,
Chris@82 1386 int howmany_rank, const fftw_iodim *howmany_dims,
Chris@82 1387 double *in, double *ro, double *io,
Chris@82 1388 unsigned flags);
Chris@82 1389
Chris@82 1390 fftw_plan fftw_plan_guru_dft_c2r(
Chris@82 1391 int rank, const fftw_iodim *dims,
Chris@82 1392 int howmany_rank, const fftw_iodim *howmany_dims,
Chris@82 1393 fftw_complex *in, double *out,
Chris@82 1394 unsigned flags);
Chris@82 1395
Chris@82 1396 fftw_plan fftw_plan_guru_split_dft_c2r(
Chris@82 1397 int rank, const fftw_iodim *dims,
Chris@82 1398 int howmany_rank, const fftw_iodim *howmany_dims,
Chris@82 1399 double *ri, double *ii, double *out,
Chris@82 1400 unsigned flags);
Chris@82 1401 @end example
Chris@82 1402 @findex fftw_plan_guru_dft_r2c
Chris@82 1403 @findex fftw_plan_guru_split_dft_r2c
Chris@82 1404 @findex fftw_plan_guru_dft_c2r
Chris@82 1405 @findex fftw_plan_guru_split_dft_c2r
Chris@82 1406
Chris@82 1407 Plan a real-input (r2c) or real-output (c2r), multi-dimensional DFT with
Chris@82 1408 transform dimensions given by (@code{rank}, @code{dims}) over a
Chris@82 1409 multi-dimensional vector (loop) of dimensions (@code{howmany_rank},
Chris@82 1410 @code{howmany_dims}). @code{dims} and @code{howmany_dims} should point
Chris@82 1411 to @code{fftw_iodim} arrays of length @code{rank} and
Chris@82 1412 @code{howmany_rank}, respectively. As for the basic and advanced
Chris@82 1413 interfaces, an r2c transform is @code{FFTW_FORWARD} and a c2r transform
Chris@82 1414 is @code{FFTW_BACKWARD}.
Chris@82 1415
Chris@82 1416 The @emph{last} dimension of @code{dims} is interpreted specially:
Chris@82 1417 that dimension of the real array has size @code{dims[rank-1].n}, but
Chris@82 1418 that dimension of the complex array has size @code{dims[rank-1].n/2+1}
Chris@82 1419 (division rounded down). The strides, on the other hand, are taken to
Chris@82 1420 be exactly as specified. It is up to the user to specify the strides
Chris@82 1421 appropriately for the peculiar dimensions of the data, and we do not
Chris@82 1422 guarantee that the planner will succeed (return non-@code{NULL}) for
Chris@82 1423 any dimensions other than those described in @ref{Real-data DFT Array
Chris@82 1424 Format} and generalized in @ref{Advanced Real-data DFTs}. (That is,
Chris@82 1425 for an in-place transform, each individual dimension should be able to
Chris@82 1426 operate in place.)
Chris@82 1427 @cindex in-place
Chris@82 1428
Chris@82 1429
Chris@82 1430 @code{in} and @code{out} point to the input and output arrays for r2c
Chris@82 1431 and c2r transforms, respectively. For split arrays, @code{ri} and
Chris@82 1432 @code{ii} point to the real and imaginary input arrays for a c2r
Chris@82 1433 transform, and @code{ro} and @code{io} point to the real and imaginary
Chris@82 1434 output arrays for an r2c transform. @code{in} and @code{ro} or
Chris@82 1435 @code{ri} and @code{out} may be the same, indicating an in-place
Chris@82 1436 transform. (In-place transforms where @code{in} and @code{io} or
Chris@82 1437 @code{ii} and @code{out} are the same are not currently supported.)
Chris@82 1438
Chris@82 1439 @cindex flags
Chris@82 1440 @code{flags} is a bitwise OR (@samp{|}) of zero or more planner flags,
Chris@82 1441 as defined in @ref{Planner Flags}.
Chris@82 1442
Chris@82 1443 In-place transforms of rank greater than 1 are currently only
Chris@82 1444 supported for interleaved arrays. For split arrays, the planner will
Chris@82 1445 return @code{NULL}.
Chris@82 1446 @cindex in-place
Chris@82 1447
Chris@82 1448 @c =========>
Chris@82 1449 @node Guru Real-to-real Transforms, 64-bit Guru Interface, Guru Real-data DFTs, Guru Interface
Chris@82 1450 @subsection Guru Real-to-real Transforms
Chris@82 1451
Chris@82 1452 @example
Chris@82 1453 fftw_plan fftw_plan_guru_r2r(int rank, const fftw_iodim *dims,
Chris@82 1454 int howmany_rank,
Chris@82 1455 const fftw_iodim *howmany_dims,
Chris@82 1456 double *in, double *out,
Chris@82 1457 const fftw_r2r_kind *kind,
Chris@82 1458 unsigned flags);
Chris@82 1459 @end example
Chris@82 1460 @findex fftw_plan_guru_r2r
Chris@82 1461
Chris@82 1462 Plan a real-to-real (r2r) multi-dimensional @code{FFTW_FORWARD}
Chris@82 1463 transform with transform dimensions given by (@code{rank}, @code{dims})
Chris@82 1464 over a multi-dimensional vector (loop) of dimensions
Chris@82 1465 (@code{howmany_rank}, @code{howmany_dims}). @code{dims} and
Chris@82 1466 @code{howmany_dims} should point to @code{fftw_iodim} arrays of length
Chris@82 1467 @code{rank} and @code{howmany_rank}, respectively.
Chris@82 1468
Chris@82 1469 The transform kind of each dimension is given by the @code{kind}
Chris@82 1470 parameter, which should point to an array of length @code{rank}. Valid
Chris@82 1471 @code{fftw_r2r_kind} constants are given in @ref{Real-to-Real Transform
Chris@82 1472 Kinds}.
Chris@82 1473
Chris@82 1474 @code{in} and @code{out} point to the real input and output arrays; they
Chris@82 1475 may be the same, indicating an in-place transform.
Chris@82 1476
Chris@82 1477 @cindex flags
Chris@82 1478 @code{flags} is a bitwise OR (@samp{|}) of zero or more planner flags,
Chris@82 1479 as defined in @ref{Planner Flags}.
Chris@82 1480
Chris@82 1481 @c =========>
Chris@82 1482 @node 64-bit Guru Interface, , Guru Real-to-real Transforms, Guru Interface
Chris@82 1483 @subsection 64-bit Guru Interface
Chris@82 1484 @cindex 64-bit architecture
Chris@82 1485
Chris@82 1486 When compiled in 64-bit mode on a 64-bit architecture (where addresses
Chris@82 1487 are 64 bits wide), FFTW uses 64-bit quantities internally for all
Chris@82 1488 transform sizes, strides, and so on---you don't have to do anything
Chris@82 1489 special to exploit this. However, in the ordinary FFTW interfaces,
Chris@82 1490 you specify the transform size by an @code{int} quantity, which is
Chris@82 1491 normally only 32 bits wide. This means that, even though FFTW is
Chris@82 1492 using 64-bit sizes internally, you cannot specify a single transform
Chris@82 1493 dimension larger than
Chris@82 1494 @ifinfo
Chris@82 1495 2^31-1
Chris@82 1496 @end ifinfo
Chris@82 1497 @html
Chris@82 1498 2<sup><small>31</small></sup>&minus;1
Chris@82 1499 @end html
Chris@82 1500 @tex
Chris@82 1501 $2^{31}-1$
Chris@82 1502 @end tex
Chris@82 1503 numbers.
Chris@82 1504
Chris@82 1505 We expect that few users will require transforms larger than this, but,
Chris@82 1506 for those who do, we provide a 64-bit version of the guru interface in
Chris@82 1507 which all sizes are specified as integers of type @code{ptrdiff_t}
Chris@82 1508 instead of @code{int}. (@code{ptrdiff_t} is a signed integer type
Chris@82 1509 defined by the C standard to be wide enough to represent address
Chris@82 1510 differences, and thus must be at least 64 bits wide on a 64-bit
Chris@82 1511 machine.) We stress that there is @emph{no performance advantage} to
Chris@82 1512 using this interface---the same internal FFTW code is employed
Chris@82 1513 regardless---and it is only necessary if you want to specify very
Chris@82 1514 large transform sizes.
Chris@82 1515 @tindex ptrdiff_t
Chris@82 1516
Chris@82 1517
Chris@82 1518 In particular, the 64-bit guru interface is a set of planner routines
Chris@82 1519 that are exactly the same as the guru planner routines, except that
Chris@82 1520 they are named with @samp{guru64} instead of @samp{guru} and they take
Chris@82 1521 arguments of type @code{fftw_iodim64} instead of @code{fftw_iodim}.
Chris@82 1522 For example, instead of @code{fftw_plan_guru_dft}, we have
Chris@82 1523 @code{fftw_plan_guru64_dft}.
Chris@82 1524
Chris@82 1525 @example
Chris@82 1526 fftw_plan fftw_plan_guru64_dft(
Chris@82 1527 int rank, const fftw_iodim64 *dims,
Chris@82 1528 int howmany_rank, const fftw_iodim64 *howmany_dims,
Chris@82 1529 fftw_complex *in, fftw_complex *out,
Chris@82 1530 int sign, unsigned flags);
Chris@82 1531 @end example
Chris@82 1532 @findex fftw_plan_guru64_dft
Chris@82 1533
Chris@82 1534 The @code{fftw_iodim64} type is similar to @code{fftw_iodim}, with the
Chris@82 1535 same interpretation, except that it uses type @code{ptrdiff_t} instead
Chris@82 1536 of type @code{int}.
Chris@82 1537
Chris@82 1538 @example
Chris@82 1539 typedef struct @{
Chris@82 1540 ptrdiff_t n;
Chris@82 1541 ptrdiff_t is;
Chris@82 1542 ptrdiff_t os;
Chris@82 1543 @} fftw_iodim64;
Chris@82 1544 @end example
Chris@82 1545 @tindex fftw_iodim64
Chris@82 1546
Chris@82 1547 Every other @samp{fftw_plan_guru} function also has a
Chris@82 1548 @samp{fftw_plan_guru64} equivalent, but we do not repeat their
Chris@82 1549 documentation here since they are identical to the 32-bit versions
Chris@82 1550 except as noted above.
Chris@82 1551
Chris@82 1552 @c -----------------------------------------------------------
Chris@82 1553 @node New-array Execute Functions, Wisdom, Guru Interface, FFTW Reference
Chris@82 1554 @section New-array Execute Functions
Chris@82 1555 @cindex execute
Chris@82 1556 @cindex new-array execution
Chris@82 1557
Chris@82 1558 Normally, one executes a plan for the arrays with which the plan was
Chris@82 1559 created, by calling @code{fftw_execute(plan)} as described in @ref{Using
Chris@82 1560 Plans}.
Chris@82 1561 @findex fftw_execute
Chris@82 1562 However, it is possible for sophisticated users to apply a given plan
Chris@82 1563 to a @emph{different} array using the ``new-array execute'' functions
Chris@82 1564 detailed below, provided that the following conditions are met:
Chris@82 1565
Chris@82 1566 @itemize @bullet
Chris@82 1567
Chris@82 1568 @item
Chris@82 1569 The array size, strides, etcetera are the same (since those are set by
Chris@82 1570 the plan).
Chris@82 1571
Chris@82 1572 @item
Chris@82 1573 The input and output arrays are the same (in-place) or different
Chris@82 1574 (out-of-place) if the plan was originally created to be in-place or
Chris@82 1575 out-of-place, respectively.
Chris@82 1576
Chris@82 1577 @item
Chris@82 1578 For split arrays, the separations between the real and imaginary
Chris@82 1579 parts, @code{ii-ri} and @code{io-ro}, are the same as they were for
Chris@82 1580 the input and output arrays when the plan was created. (This
Chris@82 1581 condition is automatically satisfied for interleaved arrays.)
Chris@82 1582
Chris@82 1583 @item
Chris@82 1584 The @dfn{alignment} of the new input/output arrays is the same as that
Chris@82 1585 of the input/output arrays when the plan was created, unless the plan
Chris@82 1586 was created with the @code{FFTW_UNALIGNED} flag.
Chris@82 1587 @ctindex FFTW_UNALIGNED
Chris@82 1588 Here, the alignment is a platform-dependent quantity (for example, it is
Chris@82 1589 the address modulo 16 if SSE SIMD instructions are used, but the address
Chris@82 1590 modulo 4 for non-SIMD single-precision FFTW on the same machine). In
Chris@82 1591 general, only arrays allocated with @code{fftw_malloc} are guaranteed to
Chris@82 1592 be equally aligned (@pxref{SIMD alignment and fftw_malloc}).
Chris@82 1593
Chris@82 1594 @end itemize
Chris@82 1595
Chris@82 1596 @cindex alignment
Chris@82 1597 The alignment issue is especially critical, because if you don't use
Chris@82 1598 @code{fftw_malloc} then you may have little control over the alignment
Chris@82 1599 of arrays in memory. For example, neither the C++ @code{new} function
Chris@82 1600 nor the Fortran @code{allocate} statement provide strong enough
Chris@82 1601 guarantees about data alignment. If you don't use @code{fftw_malloc},
Chris@82 1602 therefore, you probably have to use @code{FFTW_UNALIGNED} (which
Chris@82 1603 disables most SIMD support). If possible, it is probably better for
Chris@82 1604 you to simply create multiple plans (creating a new plan is quick once
Chris@82 1605 one exists for a given size), or better yet re-use the same array for
Chris@82 1606 your transforms.
Chris@82 1607
Chris@82 1608 @findex fftw_alignment_of
Chris@82 1609 For rare circumstances in which you cannot control the alignment of
Chris@82 1610 allocated memory, but wish to determine where a given array is
Chris@82 1611 aligned like the original array for which a plan was created, you can
Chris@82 1612 use the @code{fftw_alignment_of} function:
Chris@82 1613 @example
Chris@82 1614 int fftw_alignment_of(double *p);
Chris@82 1615 @end example
Chris@82 1616 Two arrays have equivalent alignment (for the purposes of applying a
Chris@82 1617 plan) if and only if @code{fftw_alignment_of} returns the same value
Chris@82 1618 for the corresponding pointers to their data (typecast to @code{double*}
Chris@82 1619 if necessary).
Chris@82 1620
Chris@82 1621 If you are tempted to use the new-array execute interface because you
Chris@82 1622 want to transform a known bunch of arrays of the same size, you should
Chris@82 1623 probably go use the advanced interface instead (@pxref{Advanced
Chris@82 1624 Interface})).
Chris@82 1625
Chris@82 1626 The new-array execute functions are:
Chris@82 1627
Chris@82 1628 @example
Chris@82 1629 void fftw_execute_dft(
Chris@82 1630 const fftw_plan p,
Chris@82 1631 fftw_complex *in, fftw_complex *out);
Chris@82 1632
Chris@82 1633 void fftw_execute_split_dft(
Chris@82 1634 const fftw_plan p,
Chris@82 1635 double *ri, double *ii, double *ro, double *io);
Chris@82 1636
Chris@82 1637 void fftw_execute_dft_r2c(
Chris@82 1638 const fftw_plan p,
Chris@82 1639 double *in, fftw_complex *out);
Chris@82 1640
Chris@82 1641 void fftw_execute_split_dft_r2c(
Chris@82 1642 const fftw_plan p,
Chris@82 1643 double *in, double *ro, double *io);
Chris@82 1644
Chris@82 1645 void fftw_execute_dft_c2r(
Chris@82 1646 const fftw_plan p,
Chris@82 1647 fftw_complex *in, double *out);
Chris@82 1648
Chris@82 1649 void fftw_execute_split_dft_c2r(
Chris@82 1650 const fftw_plan p,
Chris@82 1651 double *ri, double *ii, double *out);
Chris@82 1652
Chris@82 1653 void fftw_execute_r2r(
Chris@82 1654 const fftw_plan p,
Chris@82 1655 double *in, double *out);
Chris@82 1656 @end example
Chris@82 1657 @findex fftw_execute_dft
Chris@82 1658 @findex fftw_execute_split_dft
Chris@82 1659 @findex fftw_execute_dft_r2c
Chris@82 1660 @findex fftw_execute_split_dft_r2c
Chris@82 1661 @findex fftw_execute_dft_c2r
Chris@82 1662 @findex fftw_execute_split_dft_c2r
Chris@82 1663 @findex fftw_execute_r2r
Chris@82 1664
Chris@82 1665 These execute the @code{plan} to compute the corresponding transform on
Chris@82 1666 the input/output arrays specified by the subsequent arguments. The
Chris@82 1667 input/output array arguments have the same meanings as the ones passed
Chris@82 1668 to the guru planner routines in the preceding sections. The @code{plan}
Chris@82 1669 is not modified, and these routines can be called as many times as
Chris@82 1670 desired, or intermixed with calls to the ordinary @code{fftw_execute}.
Chris@82 1671
Chris@82 1672 The @code{plan} @emph{must} have been created for the transform type
Chris@82 1673 corresponding to the execute function, e.g. it must be a complex-DFT
Chris@82 1674 plan for @code{fftw_execute_dft}. Any of the planner routines for that
Chris@82 1675 transform type, from the basic to the guru interface, could have been
Chris@82 1676 used to create the plan, however.
Chris@82 1677
Chris@82 1678 @c ------------------------------------------------------------
Chris@82 1679 @node Wisdom, What FFTW Really Computes, New-array Execute Functions, FFTW Reference
Chris@82 1680 @section Wisdom
Chris@82 1681 @cindex wisdom
Chris@82 1682 @cindex saving plans to disk
Chris@82 1683
Chris@82 1684 This section documents the FFTW mechanism for saving and restoring
Chris@82 1685 plans from disk. This mechanism is called @dfn{wisdom}.
Chris@82 1686
Chris@82 1687 @menu
Chris@82 1688 * Wisdom Export::
Chris@82 1689 * Wisdom Import::
Chris@82 1690 * Forgetting Wisdom::
Chris@82 1691 * Wisdom Utilities::
Chris@82 1692 @end menu
Chris@82 1693
Chris@82 1694 @c =========>
Chris@82 1695 @node Wisdom Export, Wisdom Import, Wisdom, Wisdom
Chris@82 1696 @subsection Wisdom Export
Chris@82 1697
Chris@82 1698 @example
Chris@82 1699 int fftw_export_wisdom_to_filename(const char *filename);
Chris@82 1700 void fftw_export_wisdom_to_file(FILE *output_file);
Chris@82 1701 char *fftw_export_wisdom_to_string(void);
Chris@82 1702 void fftw_export_wisdom(void (*write_char)(char c, void *), void *data);
Chris@82 1703 @end example
Chris@82 1704 @findex fftw_export_wisdom
Chris@82 1705 @findex fftw_export_wisdom_to_filename
Chris@82 1706 @findex fftw_export_wisdom_to_file
Chris@82 1707 @findex fftw_export_wisdom_to_string
Chris@82 1708
Chris@82 1709 These functions allow you to export all currently accumulated wisdom
Chris@82 1710 in a form from which it can be later imported and restored, even
Chris@82 1711 during a separate run of the program. (@xref{Words of Wisdom-Saving
Chris@82 1712 Plans}.) The current store of wisdom is not affected by calling any
Chris@82 1713 of these routines.
Chris@82 1714
Chris@82 1715 @code{fftw_export_wisdom} exports the wisdom to any output
Chris@82 1716 medium, as specified by the callback function
Chris@82 1717 @code{write_char}. @code{write_char} is a @code{putc}-like function that
Chris@82 1718 writes the character @code{c} to some output; its second parameter is
Chris@82 1719 the @code{data} pointer passed to @code{fftw_export_wisdom}. For
Chris@82 1720 convenience, the following three ``wrapper'' routines are provided:
Chris@82 1721
Chris@82 1722 @code{fftw_export_wisdom_to_filename} writes wisdom to a file named
Chris@82 1723 @code{filename} (which is created or overwritten), returning @code{1}
Chris@82 1724 on success and @code{0} on failure. A lower-level function, which
Chris@82 1725 requires you to open and close the file yourself (e.g. if you want to
Chris@82 1726 write wisdom to a portion of a larger file) is
Chris@82 1727 @code{fftw_export_wisdom_to_file}. This writes the wisdom to the
Chris@82 1728 current position in @code{output_file}, which should be open with
Chris@82 1729 write permission; upon exit, the file remains open and is positioned
Chris@82 1730 at the end of the wisdom data.
Chris@82 1731
Chris@82 1732 @code{fftw_export_wisdom_to_string} returns a pointer to a
Chris@82 1733 @code{NULL}-terminated string holding the wisdom data. This string is
Chris@82 1734 dynamically allocated, and it is the responsibility of the caller to
Chris@82 1735 deallocate it with @code{free} when it is no longer needed.
Chris@82 1736
Chris@82 1737 All of these routines export the wisdom in the same format, which we
Chris@82 1738 will not document here except to say that it is LISP-like ASCII text
Chris@82 1739 that is insensitive to white space.
Chris@82 1740
Chris@82 1741 @c =========>
Chris@82 1742 @node Wisdom Import, Forgetting Wisdom, Wisdom Export, Wisdom
Chris@82 1743 @subsection Wisdom Import
Chris@82 1744
Chris@82 1745 @example
Chris@82 1746 int fftw_import_system_wisdom(void);
Chris@82 1747 int fftw_import_wisdom_from_filename(const char *filename);
Chris@82 1748 int fftw_import_wisdom_from_string(const char *input_string);
Chris@82 1749 int fftw_import_wisdom(int (*read_char)(void *), void *data);
Chris@82 1750 @end example
Chris@82 1751 @findex fftw_import_wisdom
Chris@82 1752 @findex fftw_import_system_wisdom
Chris@82 1753 @findex fftw_import_wisdom_from_filename
Chris@82 1754 @findex fftw_import_wisdom_from_file
Chris@82 1755 @findex fftw_import_wisdom_from_string
Chris@82 1756
Chris@82 1757 These functions import wisdom into a program from data stored by the
Chris@82 1758 @code{fftw_export_wisdom} functions above. (@xref{Words of
Chris@82 1759 Wisdom-Saving Plans}.) The imported wisdom replaces any wisdom
Chris@82 1760 already accumulated by the running program.
Chris@82 1761
Chris@82 1762 @code{fftw_import_wisdom} imports wisdom from any input medium, as
Chris@82 1763 specified by the callback function @code{read_char}. @code{read_char} is
Chris@82 1764 a @code{getc}-like function that returns the next character in the
Chris@82 1765 input; its parameter is the @code{data} pointer passed to
Chris@82 1766 @code{fftw_import_wisdom}. If the end of the input data is reached
Chris@82 1767 (which should never happen for valid data), @code{read_char} should
Chris@82 1768 return @code{EOF} (as defined in @code{<stdio.h>}). For convenience,
Chris@82 1769 the following three ``wrapper'' routines are provided:
Chris@82 1770
Chris@82 1771 @code{fftw_import_wisdom_from_filename} reads wisdom from a file named
Chris@82 1772 @code{filename}. A lower-level function, which requires you to open
Chris@82 1773 and close the file yourself (e.g. if you want to read wisdom from a
Chris@82 1774 portion of a larger file) is @code{fftw_import_wisdom_from_file}. This
Chris@82 1775 reads wisdom from the current position in @code{input_file} (which
Chris@82 1776 should be open with read permission); upon exit, the file remains
Chris@82 1777 open, but the position of the read pointer is unspecified.
Chris@82 1778
Chris@82 1779 @code{fftw_import_wisdom_from_string} reads wisdom from the
Chris@82 1780 @code{NULL}-terminated string @code{input_string}.
Chris@82 1781
Chris@82 1782 @code{fftw_import_system_wisdom} reads wisdom from an
Chris@82 1783 implementation-defined standard file (@code{/etc/fftw/wisdom} on Unix
Chris@82 1784 and GNU systems).
Chris@82 1785 @cindex wisdom, system-wide
Chris@82 1786
Chris@82 1787
Chris@82 1788 The return value of these import routines is @code{1} if the wisdom was
Chris@82 1789 read successfully and @code{0} otherwise. Note that, in all of these
Chris@82 1790 functions, any data in the input stream past the end of the wisdom data
Chris@82 1791 is simply ignored.
Chris@82 1792
Chris@82 1793 @c =========>
Chris@82 1794 @node Forgetting Wisdom, Wisdom Utilities, Wisdom Import, Wisdom
Chris@82 1795 @subsection Forgetting Wisdom
Chris@82 1796
Chris@82 1797 @example
Chris@82 1798 void fftw_forget_wisdom(void);
Chris@82 1799 @end example
Chris@82 1800 @findex fftw_forget_wisdom
Chris@82 1801
Chris@82 1802 Calling @code{fftw_forget_wisdom} causes all accumulated @code{wisdom}
Chris@82 1803 to be discarded and its associated memory to be freed. (New
Chris@82 1804 @code{wisdom} can still be gathered subsequently, however.)
Chris@82 1805
Chris@82 1806 @c =========>
Chris@82 1807 @node Wisdom Utilities, , Forgetting Wisdom, Wisdom
Chris@82 1808 @subsection Wisdom Utilities
Chris@82 1809
Chris@82 1810 FFTW includes two standalone utility programs that deal with wisdom. We
Chris@82 1811 merely summarize them here, since they come with their own @code{man}
Chris@82 1812 pages for Unix and GNU systems (with HTML versions on our web site).
Chris@82 1813
Chris@82 1814 The first program is @code{fftw-wisdom} (or @code{fftwf-wisdom} in
Chris@82 1815 single precision, etcetera), which can be used to create a wisdom file
Chris@82 1816 containing plans for any of the transform sizes and types supported by
Chris@82 1817 FFTW. It is preferable to create wisdom directly from your executable
Chris@82 1818 (@pxref{Caveats in Using Wisdom}), but this program is useful for
Chris@82 1819 creating global wisdom files for @code{fftw_import_system_wisdom}.
Chris@82 1820 @cindex fftw-wisdom utility
Chris@82 1821
Chris@82 1822
Chris@82 1823 The second program is @code{fftw-wisdom-to-conf}, which takes a wisdom
Chris@82 1824 file as input and produces a @dfn{configuration routine} as output. The
Chris@82 1825 latter is a C subroutine that you can compile and link into your
Chris@82 1826 program, replacing a routine of the same name in the FFTW library, that
Chris@82 1827 determines which parts of FFTW are callable by your program.
Chris@82 1828 @code{fftw-wisdom-to-conf} produces a configuration routine that links
Chris@82 1829 to only those parts of FFTW needed by the saved plans in the wisdom,
Chris@82 1830 greatly reducing the size of statically linked executables (which should
Chris@82 1831 only attempt to create plans corresponding to those in the wisdom,
Chris@82 1832 however).
Chris@82 1833 @cindex fftw-wisdom-to-conf utility
Chris@82 1834 @cindex configuration routines
Chris@82 1835
Chris@82 1836 @c ------------------------------------------------------------
Chris@82 1837 @node What FFTW Really Computes, , Wisdom, FFTW Reference
Chris@82 1838 @section What FFTW Really Computes
Chris@82 1839
Chris@82 1840 In this section, we provide precise mathematical definitions for the
Chris@82 1841 transforms that FFTW computes. These transform definitions are fairly
Chris@82 1842 standard, but some authors follow slightly different conventions for the
Chris@82 1843 normalization of the transform (the constant factor in front) and the
Chris@82 1844 sign of the complex exponent. We begin by presenting the
Chris@82 1845 one-dimensional (1d) transform definitions, and then give the
Chris@82 1846 straightforward extension to multi-dimensional transforms.
Chris@82 1847
Chris@82 1848 @menu
Chris@82 1849 * The 1d Discrete Fourier Transform (DFT)::
Chris@82 1850 * The 1d Real-data DFT::
Chris@82 1851 * 1d Real-even DFTs (DCTs)::
Chris@82 1852 * 1d Real-odd DFTs (DSTs)::
Chris@82 1853 * 1d Discrete Hartley Transforms (DHTs)::
Chris@82 1854 * Multi-dimensional Transforms::
Chris@82 1855 @end menu
Chris@82 1856
Chris@82 1857 @c =========>
Chris@82 1858 @node The 1d Discrete Fourier Transform (DFT), The 1d Real-data DFT, What FFTW Really Computes, What FFTW Really Computes
Chris@82 1859 @subsection The 1d Discrete Fourier Transform (DFT)
Chris@82 1860
Chris@82 1861 @cindex discrete Fourier transform
Chris@82 1862 @cindex DFT
Chris@82 1863 The forward (@code{FFTW_FORWARD}) discrete Fourier transform (DFT) of a
Chris@82 1864 1d complex array @math{X} of size @math{n} computes an array @math{Y},
Chris@82 1865 where:
Chris@82 1866 @tex
Chris@82 1867 $$
Chris@82 1868 Y_k = \sum_{j = 0}^{n - 1} X_j e^{-2\pi j k \sqrt{-1}/n} \ .
Chris@82 1869 $$
Chris@82 1870 @end tex
Chris@82 1871 @ifinfo
Chris@82 1872 @center Y[k] = sum for j = 0 to (n - 1) of X[j] * exp(-2 pi j k sqrt(-1)/n) .
Chris@82 1873 @end ifinfo
Chris@82 1874 @html
Chris@82 1875 <center><img src="equation-dft.png" align="top">.</center>
Chris@82 1876 @end html
Chris@82 1877 The backward (@code{FFTW_BACKWARD}) DFT computes:
Chris@82 1878 @tex
Chris@82 1879 $$
Chris@82 1880 Y_k = \sum_{j = 0}^{n - 1} X_j e^{2\pi j k \sqrt{-1}/n} \ .
Chris@82 1881 $$
Chris@82 1882 @end tex
Chris@82 1883 @ifinfo
Chris@82 1884 @center Y[k] = sum for j = 0 to (n - 1) of X[j] * exp(2 pi j k sqrt(-1)/n) .
Chris@82 1885 @end ifinfo
Chris@82 1886 @html
Chris@82 1887 <center><img src="equation-idft.png" align="top">.</center>
Chris@82 1888 @end html
Chris@82 1889
Chris@82 1890 @cindex normalization
Chris@82 1891 FFTW computes an unnormalized transform, in that there is no coefficient
Chris@82 1892 in front of the summation in the DFT. In other words, applying the
Chris@82 1893 forward and then the backward transform will multiply the input by
Chris@82 1894 @math{n}.
Chris@82 1895
Chris@82 1896 @cindex frequency
Chris@82 1897 From above, an @code{FFTW_FORWARD} transform corresponds to a sign of
Chris@82 1898 @math{-1} in the exponent of the DFT. Note also that we use the
Chris@82 1899 standard ``in-order'' output ordering---the @math{k}-th output
Chris@82 1900 corresponds to the frequency @math{k/n} (or @math{k/T}, where @math{T}
Chris@82 1901 is your total sampling period). For those who like to think in terms of
Chris@82 1902 positive and negative frequencies, this means that the positive
Chris@82 1903 frequencies are stored in the first half of the output and the negative
Chris@82 1904 frequencies are stored in backwards order in the second half of the
Chris@82 1905 output. (The frequency @math{-k/n} is the same as the frequency
Chris@82 1906 @math{(n-k)/n}.)
Chris@82 1907
Chris@82 1908 @c =========>
Chris@82 1909 @node The 1d Real-data DFT, 1d Real-even DFTs (DCTs), The 1d Discrete Fourier Transform (DFT), What FFTW Really Computes
Chris@82 1910 @subsection The 1d Real-data DFT
Chris@82 1911
Chris@82 1912 The real-input (r2c) DFT in FFTW computes the @emph{forward} transform
Chris@82 1913 @math{Y} of the size @code{n} real array @math{X}, exactly as defined
Chris@82 1914 above, i.e.
Chris@82 1915 @tex
Chris@82 1916 $$
Chris@82 1917 Y_k = \sum_{j = 0}^{n - 1} X_j e^{-2\pi j k \sqrt{-1}/n} \ .
Chris@82 1918 $$
Chris@82 1919 @end tex
Chris@82 1920 @ifinfo
Chris@82 1921 @center Y[k] = sum for j = 0 to (n - 1) of X[j] * exp(-2 pi j k sqrt(-1)/n) .
Chris@82 1922 @end ifinfo
Chris@82 1923 @html
Chris@82 1924 <center><img src="equation-dft.png" align="top">.</center>
Chris@82 1925 @end html
Chris@82 1926 This output array @math{Y} can easily be shown to possess the
Chris@82 1927 ``Hermitian'' symmetry
Chris@82 1928 @cindex Hermitian
Chris@82 1929 @tex
Chris@82 1930 $Y_k = Y_{n-k}^*$,
Chris@82 1931 @end tex
Chris@82 1932 @ifinfo
Chris@82 1933 Y[k] = Y[n-k]*,
Chris@82 1934 @end ifinfo
Chris@82 1935 @html
Chris@82 1936 <i>Y<sub>k</sub> = Y<sub>n-k</sub></i><sup>*</sup>,
Chris@82 1937 @end html
Chris@82 1938 where we take @math{Y} to be periodic so that
Chris@82 1939 @tex
Chris@82 1940 $Y_n = Y_0$.
Chris@82 1941 @end tex
Chris@82 1942 @ifinfo
Chris@82 1943 Y[n] = Y[0].
Chris@82 1944 @end ifinfo
Chris@82 1945 @html
Chris@82 1946 <i>Y<sub>n</sub> = Y</i><sub>0</sub>.
Chris@82 1947 @end html
Chris@82 1948
Chris@82 1949 As a result of this symmetry, half of the output @math{Y} is redundant
Chris@82 1950 (being the complex conjugate of the other half), and so the 1d r2c
Chris@82 1951 transforms only output elements @math{0}@dots{}@math{n/2} of @math{Y}
Chris@82 1952 (@math{n/2+1} complex numbers), where the division by @math{2} is
Chris@82 1953 rounded down.
Chris@82 1954
Chris@82 1955 Moreover, the Hermitian symmetry implies that
Chris@82 1956 @tex
Chris@82 1957 $Y_0$
Chris@82 1958 @end tex
Chris@82 1959 @ifinfo
Chris@82 1960 Y[0]
Chris@82 1961 @end ifinfo
Chris@82 1962 @html
Chris@82 1963 <i>Y</i><sub>0</sub>
Chris@82 1964 @end html
Chris@82 1965 and, if @math{n} is even, the
Chris@82 1966 @tex
Chris@82 1967 $Y_{n/2}$
Chris@82 1968 @end tex
Chris@82 1969 @ifinfo
Chris@82 1970 Y[n/2]
Chris@82 1971 @end ifinfo
Chris@82 1972 @html
Chris@82 1973 <i>Y</i><sub><i>n</i>/2</sub>
Chris@82 1974 @end html
Chris@82 1975 element, are purely real. So, for the @code{R2HC} r2r transform, the
Chris@82 1976 halfcomplex format does not store the imaginary parts of these elements.
Chris@82 1977 @cindex r2r
Chris@82 1978 @ctindex R2HC
Chris@82 1979 @cindex halfcomplex format
Chris@82 1980
Chris@82 1981
Chris@82 1982 The c2r and @code{H2RC} r2r transforms compute the backward DFT of the
Chris@82 1983 @emph{complex} array @math{X} with Hermitian symmetry, stored in the
Chris@82 1984 r2c/@code{R2HC} output formats, respectively, where the backward
Chris@82 1985 transform is defined exactly as for the complex case:
Chris@82 1986 @tex
Chris@82 1987 $$
Chris@82 1988 Y_k = \sum_{j = 0}^{n - 1} X_j e^{2\pi j k \sqrt{-1}/n} \ .
Chris@82 1989 $$
Chris@82 1990 @end tex
Chris@82 1991 @ifinfo
Chris@82 1992 @center Y[k] = sum for j = 0 to (n - 1) of X[j] * exp(2 pi j k sqrt(-1)/n) .
Chris@82 1993 @end ifinfo
Chris@82 1994 @html
Chris@82 1995 <center><img src="equation-idft.png" align="top">.</center>
Chris@82 1996 @end html
Chris@82 1997 The outputs @code{Y} of this transform can easily be seen to be purely
Chris@82 1998 real, and are stored as an array of real numbers.
Chris@82 1999
Chris@82 2000 @cindex normalization
Chris@82 2001 Like FFTW's complex DFT, these transforms are unnormalized. In other
Chris@82 2002 words, applying the real-to-complex (forward) and then the
Chris@82 2003 complex-to-real (backward) transform will multiply the input by
Chris@82 2004 @math{n}.
Chris@82 2005
Chris@82 2006 @c =========>
Chris@82 2007 @node 1d Real-even DFTs (DCTs), 1d Real-odd DFTs (DSTs), The 1d Real-data DFT, What FFTW Really Computes
Chris@82 2008 @subsection 1d Real-even DFTs (DCTs)
Chris@82 2009
Chris@82 2010 The Real-even symmetry DFTs in FFTW are exactly equivalent to the unnormalized
Chris@82 2011 forward (and backward) DFTs as defined above, where the input array
Chris@82 2012 @math{X} of length @math{N} is purely real and is also @dfn{even} symmetry. In
Chris@82 2013 this case, the output array is likewise real and even symmetry.
Chris@82 2014 @cindex real-even DFT
Chris@82 2015 @cindex REDFT
Chris@82 2016
Chris@82 2017
Chris@82 2018 @ctindex REDFT00
Chris@82 2019 For the case of @code{REDFT00}, this even symmetry means that
Chris@82 2020 @tex
Chris@82 2021 $X_j = X_{N-j}$,
Chris@82 2022 @end tex
Chris@82 2023 @ifinfo
Chris@82 2024 X[j] = X[N-j],
Chris@82 2025 @end ifinfo
Chris@82 2026 @html
Chris@82 2027 <i>X<sub>j</sub> = X<sub>N-j</sub></i>,
Chris@82 2028 @end html
Chris@82 2029 where we take @math{X} to be periodic so that
Chris@82 2030 @tex
Chris@82 2031 $X_N = X_0$.
Chris@82 2032 @end tex
Chris@82 2033 @ifinfo
Chris@82 2034 X[N] = X[0].
Chris@82 2035 @end ifinfo
Chris@82 2036 @html
Chris@82 2037 <i>X<sub>N</sub> = X</i><sub>0</sub>.
Chris@82 2038 @end html
Chris@82 2039 Because of this redundancy, only the first @math{n} real numbers are
Chris@82 2040 actually stored, where @math{N = 2(n-1)}.
Chris@82 2041
Chris@82 2042 The proper definition of even symmetry for @code{REDFT10},
Chris@82 2043 @code{REDFT01}, and @code{REDFT11} transforms is somewhat more intricate
Chris@82 2044 because of the shifts by @math{1/2} of the input and/or output, although
Chris@82 2045 the corresponding boundary conditions are given in @ref{Real even/odd
Chris@82 2046 DFTs (cosine/sine transforms)}. Because of the even symmetry, however,
Chris@82 2047 the sine terms in the DFT all cancel and the remaining cosine terms are
Chris@82 2048 written explicitly below. This formulation often leads people to call
Chris@82 2049 such a transform a @dfn{discrete cosine transform} (DCT), although it is
Chris@82 2050 really just a special case of the DFT.
Chris@82 2051 @cindex discrete cosine transform
Chris@82 2052 @cindex DCT
Chris@82 2053
Chris@82 2054
Chris@82 2055 In each of the definitions below, we transform a real array @math{X} of
Chris@82 2056 length @math{n} to a real array @math{Y} of length @math{n}:
Chris@82 2057
Chris@82 2058 @subsubheading REDFT00 (DCT-I)
Chris@82 2059 @ctindex REDFT00
Chris@82 2060 An @code{REDFT00} transform (type-I DCT) in FFTW is defined by:
Chris@82 2061 @tex
Chris@82 2062 $$
Chris@82 2063 Y_k = X_0 + (-1)^k X_{n-1}
Chris@82 2064 + 2 \sum_{j=1}^{n-2} X_j \cos [ \pi j k / (n-1)].
Chris@82 2065 $$
Chris@82 2066 @end tex
Chris@82 2067 @ifinfo
Chris@82 2068 Y[k] = X[0] + (-1)^k X[n-1] + 2 (sum for j = 1 to n-2 of X[j] cos(pi jk /(n-1))).
Chris@82 2069 @end ifinfo
Chris@82 2070 @html
Chris@82 2071 <center><img src="equation-redft00.png" align="top">.</center>
Chris@82 2072 @end html
Chris@82 2073 Note that this transform is not defined for @math{n=1}. For @math{n=2},
Chris@82 2074 the summation term above is dropped as you might expect.
Chris@82 2075
Chris@82 2076 @subsubheading REDFT10 (DCT-II)
Chris@82 2077 @ctindex REDFT10
Chris@82 2078 An @code{REDFT10} transform (type-II DCT, sometimes called ``the'' DCT) in FFTW is defined by:
Chris@82 2079 @tex
Chris@82 2080 $$
Chris@82 2081 Y_k = 2 \sum_{j=0}^{n-1} X_j \cos [\pi (j+1/2) k / n].
Chris@82 2082 $$
Chris@82 2083 @end tex
Chris@82 2084 @ifinfo
Chris@82 2085 Y[k] = 2 (sum for j = 0 to n-1 of X[j] cos(pi (j+1/2) k / n)).
Chris@82 2086 @end ifinfo
Chris@82 2087 @html
Chris@82 2088 <center><img src="equation-redft10.png" align="top">.</center>
Chris@82 2089 @end html
Chris@82 2090
Chris@82 2091 @subsubheading REDFT01 (DCT-III)
Chris@82 2092 @ctindex REDFT01
Chris@82 2093 An @code{REDFT01} transform (type-III DCT) in FFTW is defined by:
Chris@82 2094 @tex
Chris@82 2095 $$
Chris@82 2096 Y_k = X_0 + 2 \sum_{j=1}^{n-1} X_j \cos [\pi j (k+1/2) / n].
Chris@82 2097 $$
Chris@82 2098 @end tex
Chris@82 2099 @ifinfo
Chris@82 2100 Y[k] = X[0] + 2 (sum for j = 1 to n-1 of X[j] cos(pi j (k+1/2) / n)).
Chris@82 2101 @end ifinfo
Chris@82 2102 @html
Chris@82 2103 <center><img src="equation-redft01.png" align="top">.</center>
Chris@82 2104 @end html
Chris@82 2105 In the case of @math{n=1}, this reduces to
Chris@82 2106 @tex
Chris@82 2107 $Y_0 = X_0$.
Chris@82 2108 @end tex
Chris@82 2109 @ifinfo
Chris@82 2110 Y[0] = X[0].
Chris@82 2111 @end ifinfo
Chris@82 2112 @html
Chris@82 2113 <i>Y</i><sub>0</sub> = <i>X</i><sub>0</sub>.
Chris@82 2114 @end html
Chris@82 2115 Up to a scale factor (see below), this is the inverse of @code{REDFT10} (``the'' DCT), and so the @code{REDFT01} (DCT-III) is sometimes called the ``IDCT''.
Chris@82 2116 @cindex IDCT
Chris@82 2117
Chris@82 2118 @subsubheading REDFT11 (DCT-IV)
Chris@82 2119 @ctindex REDFT11
Chris@82 2120 An @code{REDFT11} transform (type-IV DCT) in FFTW is defined by:
Chris@82 2121 @tex
Chris@82 2122 $$
Chris@82 2123 Y_k = 2 \sum_{j=0}^{n-1} X_j \cos [\pi (j+1/2) (k+1/2) / n].
Chris@82 2124 $$
Chris@82 2125 @end tex
Chris@82 2126 @ifinfo
Chris@82 2127 Y[k] = 2 (sum for j = 0 to n-1 of X[j] cos(pi (j+1/2) (k+1/2) / n)).
Chris@82 2128 @end ifinfo
Chris@82 2129 @html
Chris@82 2130 <center><img src="equation-redft11.png" align="top">.</center>
Chris@82 2131 @end html
Chris@82 2132
Chris@82 2133 @subsubheading Inverses and Normalization
Chris@82 2134
Chris@82 2135 These definitions correspond directly to the unnormalized DFTs used
Chris@82 2136 elsewhere in FFTW (hence the factors of @math{2} in front of the
Chris@82 2137 summations). The unnormalized inverse of @code{REDFT00} is
Chris@82 2138 @code{REDFT00}, of @code{REDFT10} is @code{REDFT01} and vice versa, and
Chris@82 2139 of @code{REDFT11} is @code{REDFT11}. Each unnormalized inverse results
Chris@82 2140 in the original array multiplied by @math{N}, where @math{N} is the
Chris@82 2141 @emph{logical} DFT size. For @code{REDFT00}, @math{N=2(n-1)} (note that
Chris@82 2142 @math{n=1} is not defined); otherwise, @math{N=2n}.
Chris@82 2143 @cindex normalization
Chris@82 2144
Chris@82 2145
Chris@82 2146 In defining the discrete cosine transform, some authors also include
Chris@82 2147 additional factors of
Chris@82 2148 @ifinfo
Chris@82 2149 sqrt(2)
Chris@82 2150 @end ifinfo
Chris@82 2151 @html
Chris@82 2152 &radic;2
Chris@82 2153 @end html
Chris@82 2154 @tex
Chris@82 2155 $\sqrt{2}$
Chris@82 2156 @end tex
Chris@82 2157 (or its inverse) multiplying selected inputs and/or outputs. This is a
Chris@82 2158 mostly cosmetic change that makes the transform orthogonal, but
Chris@82 2159 sacrifices the direct equivalence to a symmetric DFT.
Chris@82 2160
Chris@82 2161 @c =========>
Chris@82 2162 @node 1d Real-odd DFTs (DSTs), 1d Discrete Hartley Transforms (DHTs), 1d Real-even DFTs (DCTs), What FFTW Really Computes
Chris@82 2163 @subsection 1d Real-odd DFTs (DSTs)
Chris@82 2164
Chris@82 2165 The Real-odd symmetry DFTs in FFTW are exactly equivalent to the unnormalized
Chris@82 2166 forward (and backward) DFTs as defined above, where the input array
Chris@82 2167 @math{X} of length @math{N} is purely real and is also @dfn{odd} symmetry. In
Chris@82 2168 this case, the output is odd symmetry and purely imaginary.
Chris@82 2169 @cindex real-odd DFT
Chris@82 2170 @cindex RODFT
Chris@82 2171
Chris@82 2172
Chris@82 2173 @ctindex RODFT00
Chris@82 2174 For the case of @code{RODFT00}, this odd symmetry means that
Chris@82 2175 @tex
Chris@82 2176 $X_j = -X_{N-j}$,
Chris@82 2177 @end tex
Chris@82 2178 @ifinfo
Chris@82 2179 X[j] = -X[N-j],
Chris@82 2180 @end ifinfo
Chris@82 2181 @html
Chris@82 2182 <i>X<sub>j</sub> = -X<sub>N-j</sub></i>,
Chris@82 2183 @end html
Chris@82 2184 where we take @math{X} to be periodic so that
Chris@82 2185 @tex
Chris@82 2186 $X_N = X_0$.
Chris@82 2187 @end tex
Chris@82 2188 @ifinfo
Chris@82 2189 X[N] = X[0].
Chris@82 2190 @end ifinfo
Chris@82 2191 @html
Chris@82 2192 <i>X<sub>N</sub> = X</i><sub>0</sub>.
Chris@82 2193 @end html
Chris@82 2194 Because of this redundancy, only the first @math{n} real numbers
Chris@82 2195 starting at @math{j=1} are actually stored (the @math{j=0} element is
Chris@82 2196 zero), where @math{N = 2(n+1)}.
Chris@82 2197
Chris@82 2198 The proper definition of odd symmetry for @code{RODFT10},
Chris@82 2199 @code{RODFT01}, and @code{RODFT11} transforms is somewhat more intricate
Chris@82 2200 because of the shifts by @math{1/2} of the input and/or output, although
Chris@82 2201 the corresponding boundary conditions are given in @ref{Real even/odd
Chris@82 2202 DFTs (cosine/sine transforms)}. Because of the odd symmetry, however,
Chris@82 2203 the cosine terms in the DFT all cancel and the remaining sine terms are
Chris@82 2204 written explicitly below. This formulation often leads people to call
Chris@82 2205 such a transform a @dfn{discrete sine transform} (DST), although it is
Chris@82 2206 really just a special case of the DFT.
Chris@82 2207 @cindex discrete sine transform
Chris@82 2208 @cindex DST
Chris@82 2209
Chris@82 2210
Chris@82 2211 In each of the definitions below, we transform a real array @math{X} of
Chris@82 2212 length @math{n} to a real array @math{Y} of length @math{n}:
Chris@82 2213
Chris@82 2214 @subsubheading RODFT00 (DST-I)
Chris@82 2215 @ctindex RODFT00
Chris@82 2216 An @code{RODFT00} transform (type-I DST) in FFTW is defined by:
Chris@82 2217 @tex
Chris@82 2218 $$
Chris@82 2219 Y_k = 2 \sum_{j=0}^{n-1} X_j \sin [ \pi (j+1) (k+1) / (n+1)].
Chris@82 2220 $$
Chris@82 2221 @end tex
Chris@82 2222 @ifinfo
Chris@82 2223 Y[k] = 2 (sum for j = 0 to n-1 of X[j] sin(pi (j+1)(k+1) / (n+1))).
Chris@82 2224 @end ifinfo
Chris@82 2225 @html
Chris@82 2226 <center><img src="equation-rodft00.png" align="top">.</center>
Chris@82 2227 @end html
Chris@82 2228
Chris@82 2229 @subsubheading RODFT10 (DST-II)
Chris@82 2230 @ctindex RODFT10
Chris@82 2231 An @code{RODFT10} transform (type-II DST) in FFTW is defined by:
Chris@82 2232 @tex
Chris@82 2233 $$
Chris@82 2234 Y_k = 2 \sum_{j=0}^{n-1} X_j \sin [\pi (j+1/2) (k+1) / n].
Chris@82 2235 $$
Chris@82 2236 @end tex
Chris@82 2237 @ifinfo
Chris@82 2238 Y[k] = 2 (sum for j = 0 to n-1 of X[j] sin(pi (j+1/2) (k+1) / n)).
Chris@82 2239 @end ifinfo
Chris@82 2240 @html
Chris@82 2241 <center><img src="equation-rodft10.png" align="top">.</center>
Chris@82 2242 @end html
Chris@82 2243
Chris@82 2244 @subsubheading RODFT01 (DST-III)
Chris@82 2245 @ctindex RODFT01
Chris@82 2246 An @code{RODFT01} transform (type-III DST) in FFTW is defined by:
Chris@82 2247 @tex
Chris@82 2248 $$
Chris@82 2249 Y_k = (-1)^k X_{n-1} + 2 \sum_{j=0}^{n-2} X_j \sin [\pi (j+1) (k+1/2) / n].
Chris@82 2250 $$
Chris@82 2251 @end tex
Chris@82 2252 @ifinfo
Chris@82 2253 Y[k] = (-1)^k X[n-1] + 2 (sum for j = 0 to n-2 of X[j] sin(pi (j+1) (k+1/2) / n)).
Chris@82 2254 @end ifinfo
Chris@82 2255 @html
Chris@82 2256 <center><img src="equation-rodft01.png" align="top">.</center>
Chris@82 2257 @end html
Chris@82 2258 In the case of @math{n=1}, this reduces to
Chris@82 2259 @tex
Chris@82 2260 $Y_0 = X_0$.
Chris@82 2261 @end tex
Chris@82 2262 @ifinfo
Chris@82 2263 Y[0] = X[0].
Chris@82 2264 @end ifinfo
Chris@82 2265 @html
Chris@82 2266 <i>Y</i><sub>0</sub> = <i>X</i><sub>0</sub>.
Chris@82 2267 @end html
Chris@82 2268
Chris@82 2269 @subsubheading RODFT11 (DST-IV)
Chris@82 2270 @ctindex RODFT11
Chris@82 2271 An @code{RODFT11} transform (type-IV DST) in FFTW is defined by:
Chris@82 2272 @tex
Chris@82 2273 $$
Chris@82 2274 Y_k = 2 \sum_{j=0}^{n-1} X_j \sin [\pi (j+1/2) (k+1/2) / n].
Chris@82 2275 $$
Chris@82 2276 @end tex
Chris@82 2277 @ifinfo
Chris@82 2278 Y[k] = 2 (sum for j = 0 to n-1 of X[j] sin(pi (j+1/2) (k+1/2) / n)).
Chris@82 2279 @end ifinfo
Chris@82 2280 @html
Chris@82 2281 <center><img src="equation-rodft11.png" align="top">.</center>
Chris@82 2282 @end html
Chris@82 2283
Chris@82 2284 @subsubheading Inverses and Normalization
Chris@82 2285
Chris@82 2286 These definitions correspond directly to the unnormalized DFTs used
Chris@82 2287 elsewhere in FFTW (hence the factors of @math{2} in front of the
Chris@82 2288 summations). The unnormalized inverse of @code{RODFT00} is
Chris@82 2289 @code{RODFT00}, of @code{RODFT10} is @code{RODFT01} and vice versa, and
Chris@82 2290 of @code{RODFT11} is @code{RODFT11}. Each unnormalized inverse results
Chris@82 2291 in the original array multiplied by @math{N}, where @math{N} is the
Chris@82 2292 @emph{logical} DFT size. For @code{RODFT00}, @math{N=2(n+1)};
Chris@82 2293 otherwise, @math{N=2n}.
Chris@82 2294 @cindex normalization
Chris@82 2295
Chris@82 2296
Chris@82 2297 In defining the discrete sine transform, some authors also include
Chris@82 2298 additional factors of
Chris@82 2299 @ifinfo
Chris@82 2300 sqrt(2)
Chris@82 2301 @end ifinfo
Chris@82 2302 @html
Chris@82 2303 &radic;2
Chris@82 2304 @end html
Chris@82 2305 @tex
Chris@82 2306 $\sqrt{2}$
Chris@82 2307 @end tex
Chris@82 2308 (or its inverse) multiplying selected inputs and/or outputs. This is a
Chris@82 2309 mostly cosmetic change that makes the transform orthogonal, but
Chris@82 2310 sacrifices the direct equivalence to an antisymmetric DFT.
Chris@82 2311
Chris@82 2312 @c =========>
Chris@82 2313 @node 1d Discrete Hartley Transforms (DHTs), Multi-dimensional Transforms, 1d Real-odd DFTs (DSTs), What FFTW Really Computes
Chris@82 2314 @subsection 1d Discrete Hartley Transforms (DHTs)
Chris@82 2315
Chris@82 2316 @cindex discrete Hartley transform
Chris@82 2317 @cindex DHT
Chris@82 2318 The discrete Hartley transform (DHT) of a 1d real array @math{X} of size
Chris@82 2319 @math{n} computes a real array @math{Y} of the same size, where:
Chris@82 2320 @tex
Chris@82 2321 $$
Chris@82 2322 Y_k = \sum_{j = 0}^{n - 1} X_j [ \cos(2\pi j k / n) + \sin(2\pi j k / n)].
Chris@82 2323 $$
Chris@82 2324 @end tex
Chris@82 2325 @ifinfo
Chris@82 2326 @center Y[k] = sum for j = 0 to (n - 1) of X[j] * [cos(2 pi j k / n) + sin(2 pi j k / n)].
Chris@82 2327 @end ifinfo
Chris@82 2328 @html
Chris@82 2329 <center><img src="equation-dht.png" align="top">.</center>
Chris@82 2330 @end html
Chris@82 2331
Chris@82 2332 @cindex normalization
Chris@82 2333 FFTW computes an unnormalized transform, in that there is no coefficient
Chris@82 2334 in front of the summation in the DHT. In other words, applying the
Chris@82 2335 transform twice (the DHT is its own inverse) will multiply the input by
Chris@82 2336 @math{n}.
Chris@82 2337
Chris@82 2338 @c =========>
Chris@82 2339 @node Multi-dimensional Transforms, , 1d Discrete Hartley Transforms (DHTs), What FFTW Really Computes
Chris@82 2340 @subsection Multi-dimensional Transforms
Chris@82 2341
Chris@82 2342 The multi-dimensional transforms of FFTW, in general, compute simply the
Chris@82 2343 separable product of the given 1d transform along each dimension of the
Chris@82 2344 array. Since each of these transforms is unnormalized, computing the
Chris@82 2345 forward followed by the backward/inverse multi-dimensional transform
Chris@82 2346 will result in the original array scaled by the product of the
Chris@82 2347 normalization factors for each dimension (e.g. the product of the
Chris@82 2348 dimension sizes, for a multi-dimensional DFT).
Chris@82 2349
Chris@82 2350 @tex
Chris@82 2351 As an explicit example, consider the following exact mathematical
Chris@82 2352 definition of our multi-dimensional DFT. Let $X$ be a $d$-dimensional
Chris@82 2353 complex array whose elements are $X[j_1, j_2, \ldots, j_d]$, where $0
Chris@82 2354 \leq j_s < n_s$ for all~$s \in \{ 1, 2, \ldots, d \}$. Let also
Chris@82 2355 $\omega_s = e^{2\pi \sqrt{-1}/n_s}$, for all ~$s \in \{ 1, 2, \ldots, d
Chris@82 2356 \}$.
Chris@82 2357
Chris@82 2358 The forward transform computes a complex array~$Y$, whose
Chris@82 2359 structure is the same as that of~$X$, defined by
Chris@82 2360
Chris@82 2361 $$
Chris@82 2362 Y[k_1, k_2, \ldots, k_d] =
Chris@82 2363 \sum_{j_1 = 0}^{n_1 - 1}
Chris@82 2364 \sum_{j_2 = 0}^{n_2 - 1}
Chris@82 2365 \cdots
Chris@82 2366 \sum_{j_d = 0}^{n_d - 1}
Chris@82 2367 X[j_1, j_2, \ldots, j_d]
Chris@82 2368 \omega_1^{-j_1 k_1}
Chris@82 2369 \omega_2^{-j_2 k_2}
Chris@82 2370 \cdots
Chris@82 2371 \omega_d^{-j_d k_d} \ .
Chris@82 2372 $$
Chris@82 2373
Chris@82 2374 The backward transform computes
Chris@82 2375 $$
Chris@82 2376 Y[k_1, k_2, \ldots, k_d] =
Chris@82 2377 \sum_{j_1 = 0}^{n_1 - 1}
Chris@82 2378 \sum_{j_2 = 0}^{n_2 - 1}
Chris@82 2379 \cdots
Chris@82 2380 \sum_{j_d = 0}^{n_d - 1}
Chris@82 2381 X[j_1, j_2, \ldots, j_d]
Chris@82 2382 \omega_1^{j_1 k_1}
Chris@82 2383 \omega_2^{j_2 k_2}
Chris@82 2384 \cdots
Chris@82 2385 \omega_d^{j_d k_d} \ .
Chris@82 2386 $$
Chris@82 2387
Chris@82 2388 Computing the forward transform followed by the backward transform
Chris@82 2389 will multiply the array by $\prod_{s=1}^{d} n_d$.
Chris@82 2390 @end tex
Chris@82 2391
Chris@82 2392 @cindex r2c
Chris@82 2393 The definition of FFTW's multi-dimensional DFT of real data (r2c)
Chris@82 2394 deserves special attention. In this case, we logically compute the full
Chris@82 2395 multi-dimensional DFT of the input data; since the input data are purely
Chris@82 2396 real, the output data have the Hermitian symmetry and therefore only one
Chris@82 2397 non-redundant half need be stored. More specifically, for an @ndims multi-dimensional real-input DFT, the full (logical) complex output array
Chris@82 2398 @tex
Chris@82 2399 $Y[k_0, k_1, \ldots, k_{d-1}]$
Chris@82 2400 @end tex
Chris@82 2401 @html
Chris@82 2402 <i>Y</i>[<i>k</i><sub>0</sub>, <i>k</i><sub>1</sub>, ...,
Chris@82 2403 <i>k</i><sub><i>d-1</i></sub>]
Chris@82 2404 @end html
Chris@82 2405 @ifinfo
Chris@82 2406 Y[k[0], k[1], ..., k[d-1]]
Chris@82 2407 @end ifinfo
Chris@82 2408 has the symmetry:
Chris@82 2409 @tex
Chris@82 2410 $$
Chris@82 2411 Y[k_0, k_1, \ldots, k_{d-1}] = Y[n_0 - k_0, n_1 - k_1, \ldots, n_{d-1} - k_{d-1}]^*
Chris@82 2412 $$
Chris@82 2413 @end tex
Chris@82 2414 @html
Chris@82 2415 <i>Y</i>[<i>k</i><sub>0</sub>, <i>k</i><sub>1</sub>, ...,
Chris@82 2416 <i>k</i><sub><i>d-1</i></sub>] = <i>Y</i>[<i>n</i><sub>0</sub> -
Chris@82 2417 <i>k</i><sub>0</sub>, <i>n</i><sub>1</sub> - <i>k</i><sub>1</sub>, ...,
Chris@82 2418 <i>n</i><sub><i>d-1</i></sub> - <i>k</i><sub><i>d-1</i></sub>]<sup>*</sup>
Chris@82 2419 @end html
Chris@82 2420 @ifinfo
Chris@82 2421 Y[k[0], k[1], ..., k[d-1]] = Y[n[0] - k[0], n[1] - k[1], ..., n[d-1] - k[d-1]]*
Chris@82 2422 @end ifinfo
Chris@82 2423 (where each dimension is periodic). Because of this symmetry, we only
Chris@82 2424 store the
Chris@82 2425 @tex
Chris@82 2426 $k_{d-1} = 0 \cdots n_{d-1}/2$
Chris@82 2427 @end tex
Chris@82 2428 @html
Chris@82 2429 <i>k</i><sub><i>d-1</i></sub> = 0...<i>n</i><sub><i>d-1</i></sub>/2+1
Chris@82 2430 @end html
Chris@82 2431 @ifinfo
Chris@82 2432 k[d-1] = 0...n[d-1]/2
Chris@82 2433 @end ifinfo
Chris@82 2434 elements of the @emph{last} dimension (division by @math{2} is rounded
Chris@82 2435 down). (We could instead have cut any other dimension in half, but the
Chris@82 2436 last dimension proved computationally convenient.) This results in the
Chris@82 2437 peculiar array format described in more detail by @ref{Real-data DFT
Chris@82 2438 Array Format}.
Chris@82 2439
Chris@82 2440 The multi-dimensional c2r transform is simply the unnormalized inverse
Chris@82 2441 of the r2c transform. i.e. it is the same as FFTW's complex backward
Chris@82 2442 multi-dimensional DFT, operating on a Hermitian input array in the
Chris@82 2443 peculiar format mentioned above and outputting a real array (since the
Chris@82 2444 DFT output is purely real).
Chris@82 2445
Chris@82 2446 We should remind the user that the separable product of 1d transforms
Chris@82 2447 along each dimension, as computed by FFTW, is not always the same thing
Chris@82 2448 as the usual multi-dimensional transform. A multi-dimensional
Chris@82 2449 @code{R2HC} (or @code{HC2R}) transform is not identical to the
Chris@82 2450 multi-dimensional DFT, requiring some post-processing to combine the
Chris@82 2451 requisite real and imaginary parts, as was described in @ref{The
Chris@82 2452 Halfcomplex-format DFT}. Likewise, FFTW's multidimensional
Chris@82 2453 @code{FFTW_DHT} r2r transform is not the same thing as the logical
Chris@82 2454 multi-dimensional discrete Hartley transform defined in the literature,
Chris@82 2455 as discussed in @ref{The Discrete Hartley Transform}.
Chris@82 2456