Mercurial > hg > sv-dependency-builds
comparison src/fftw-3.3.3/doc/reference.texi @ 10:37bf6b4a2645
Add FFTW3
| author | Chris Cannam |
|---|---|
| date | Wed, 20 Mar 2013 15:35:50 +0000 |
| parents | |
| children |
comparison
equal
deleted
inserted
replaced
| 9:c0fb53affa76 | 10:37bf6b4a2645 |
|---|---|
| 1 @node FFTW Reference, Multi-threaded FFTW, Other Important Topics, Top | |
| 2 @chapter FFTW Reference | |
| 3 | |
| 4 This chapter provides a complete reference for all sequential (i.e., | |
| 5 one-processor) FFTW functions. Parallel transforms are described in | |
| 6 later chapters. | |
| 7 | |
| 8 @menu | |
| 9 * Data Types and Files:: | |
| 10 * Using Plans:: | |
| 11 * Basic Interface:: | |
| 12 * Advanced Interface:: | |
| 13 * Guru Interface:: | |
| 14 * New-array Execute Functions:: | |
| 15 * Wisdom:: | |
| 16 * What FFTW Really Computes:: | |
| 17 @end menu | |
| 18 | |
| 19 @c ------------------------------------------------------------ | |
| 20 @node Data Types and Files, Using Plans, FFTW Reference, FFTW Reference | |
| 21 @section Data Types and Files | |
| 22 | |
| 23 All programs using FFTW should include its header file: | |
| 24 | |
| 25 @example | |
| 26 #include <fftw3.h> | |
| 27 @end example | |
| 28 | |
| 29 You must also link to the FFTW library. On Unix, this | |
| 30 means adding @code{-lfftw3 -lm} at the @emph{end} of the link command. | |
| 31 | |
| 32 @menu | |
| 33 * Complex numbers:: | |
| 34 * Precision:: | |
| 35 * Memory Allocation:: | |
| 36 @end menu | |
| 37 | |
| 38 @c =========> | |
| 39 @node Complex numbers, Precision, Data Types and Files, Data Types and Files | |
| 40 @subsection Complex numbers | |
| 41 | |
| 42 The default FFTW interface uses @code{double} precision for all | |
| 43 floating-point numbers, and defines a @code{fftw_complex} type to hold | |
| 44 complex numbers as: | |
| 45 | |
| 46 @example | |
| 47 typedef double fftw_complex[2]; | |
| 48 @end example | |
| 49 @tindex fftw_complex | |
| 50 | |
| 51 Here, the @code{[0]} element holds the real part and the @code{[1]} | |
| 52 element holds the imaginary part. | |
| 53 | |
| 54 Alternatively, if you have a C compiler (such as @code{gcc}) that | |
| 55 supports the C99 revision of the ANSI C standard, you can use C's new | |
| 56 native complex type (which is binary-compatible with the typedef above). | |
| 57 In particular, if you @code{#include <complex.h>} @emph{before} | |
| 58 @code{<fftw3.h>}, then @code{fftw_complex} is defined to be the native | |
| 59 complex type and you can manipulate it with ordinary arithmetic | |
| 60 (e.g. @code{x = y * (3+4*I)}, where @code{x} and @code{y} are | |
| 61 @code{fftw_complex} and @code{I} is the standard symbol for the | |
| 62 imaginary unit); | |
| 63 @cindex C99 | |
| 64 | |
| 65 | |
| 66 C++ has its own @code{complex<T>} template class, defined in the | |
| 67 standard @code{<complex>} header file. Reportedly, the C++ standards | |
| 68 committee has recently agreed to mandate that the storage format used | |
| 69 for this type be binary-compatible with the C99 type, i.e. an array | |
| 70 @code{T[2]} with consecutive real @code{[0]} and imaginary @code{[1]} | |
| 71 parts. (See report | |
| 72 @uref{http://www.open-std.org/jtc1/sc22/WG21/docs/papers/2002/n1388.pdf | |
| 73 WG21/N1388}.) Although not part of the official standard as of this | |
| 74 writing, the proposal stated that: ``This solution has been tested with | |
| 75 all current major implementations of the standard library and shown to | |
| 76 be working.'' To the extent that this is true, if you have a variable | |
| 77 @code{complex<double> *x}, you can pass it directly to FFTW via | |
| 78 @code{reinterpret_cast<fftw_complex*>(x)}. | |
| 79 @cindex C++ | |
| 80 @cindex portability | |
| 81 | |
| 82 @c =========> | |
| 83 @node Precision, Memory Allocation, Complex numbers, Data Types and Files | |
| 84 @subsection Precision | |
| 85 @cindex precision | |
| 86 | |
| 87 You can install single and long-double precision versions of FFTW, | |
| 88 which replace @code{double} with @code{float} and @code{long double}, | |
| 89 respectively (@pxref{Installation and Customization}). To use these | |
| 90 interfaces, you: | |
| 91 | |
| 92 @itemize @bullet | |
| 93 | |
| 94 @item | |
| 95 Link to the single/long-double libraries; on Unix, @code{-lfftw3f} or | |
| 96 @code{-lfftw3l} instead of (or in addition to) @code{-lfftw3}. (You | |
| 97 can link to the different-precision libraries simultaneously.) | |
| 98 | |
| 99 @item | |
| 100 Include the @emph{same} @code{<fftw3.h>} header file. | |
| 101 | |
| 102 @item | |
| 103 Replace all lowercase instances of @samp{fftw_} with @samp{fftwf_} or | |
| 104 @samp{fftwl_} for single or long-double precision, respectively. | |
| 105 (@code{fftw_complex} becomes @code{fftwf_complex}, @code{fftw_execute} | |
| 106 becomes @code{fftwf_execute}, etcetera.) | |
| 107 | |
| 108 @item | |
| 109 Uppercase names, i.e. names beginning with @samp{FFTW_}, remain the | |
| 110 same. | |
| 111 | |
| 112 @item | |
| 113 Replace @code{double} with @code{float} or @code{long double} for | |
| 114 subroutine parameters. | |
| 115 | |
| 116 @end itemize | |
| 117 | |
| 118 Depending upon your compiler and/or hardware, @code{long double} may not | |
| 119 be any more precise than @code{double} (or may not be supported at all, | |
| 120 although it is standard in C99). | |
| 121 @cindex C99 | |
| 122 | |
| 123 | |
| 124 We also support using the nonstandard @code{__float128} | |
| 125 quadruple-precision type provided by recent versions of @code{gcc} on | |
| 126 32- and 64-bit x86 hardware (@pxref{Installation and Customization}). | |
| 127 To use this type, link with @code{-lfftw3q -lquadmath -lm} (the | |
| 128 @code{libquadmath} library provided by @code{gcc} is needed for | |
| 129 quadruple-precision trigonometric functions) and use @samp{fftwq_} | |
| 130 identifiers. | |
| 131 | |
| 132 @c =========> | |
| 133 @node Memory Allocation, , Precision, Data Types and Files | |
| 134 @subsection Memory Allocation | |
| 135 | |
| 136 @example | |
| 137 void *fftw_malloc(size_t n); | |
| 138 void fftw_free(void *p); | |
| 139 @end example | |
| 140 @findex fftw_malloc | |
| 141 @findex fftw_free | |
| 142 | |
| 143 These are functions that behave identically to @code{malloc} and | |
| 144 @code{free}, except that they guarantee that the returned pointer obeys | |
| 145 any special alignment restrictions imposed by any algorithm in FFTW | |
| 146 (e.g. for SIMD acceleration). @xref{SIMD alignment and fftw_malloc}. | |
| 147 @cindex alignment | |
| 148 | |
| 149 | |
| 150 Data allocated by @code{fftw_malloc} @emph{must} be deallocated by | |
| 151 @code{fftw_free} and not by the ordinary @code{free}. | |
| 152 | |
| 153 These routines simply call through to your operating system's | |
| 154 @code{malloc} or, if necessary, its aligned equivalent | |
| 155 (e.g. @code{memalign}), so you normally need not worry about any | |
| 156 significant time or space overhead. You are @emph{not required} to use | |
| 157 them to allocate your data, but we strongly recommend it. | |
| 158 | |
| 159 Note: in C++, just as with ordinary @code{malloc}, you must typecast | |
| 160 the output of @code{fftw_malloc} to whatever pointer type you are | |
| 161 allocating. | |
| 162 @cindex C++ | |
| 163 | |
| 164 | |
| 165 We also provide the following two convenience functions to allocate | |
| 166 real and complex arrays with @code{n} elements, which are equivalent | |
| 167 to @code{(double *) fftw_malloc(sizeof(double) * n)} and | |
| 168 @code{(fftw_complex *) fftw_malloc(sizeof(fftw_complex) * n)}, | |
| 169 respectively: | |
| 170 | |
| 171 @example | |
| 172 double *fftw_alloc_real(size_t n); | |
| 173 fftw_complex *fftw_alloc_complex(size_t n); | |
| 174 @end example | |
| 175 @findex fftw_alloc_real | |
| 176 @findex fftw_alloc_complex | |
| 177 | |
| 178 The equivalent functions in other precisions allocate arrays of @code{n} | |
| 179 elements in that precision. e.g. @code{fftwf_alloc_real(n)} is | |
| 180 equivalent to @code{(float *) fftwf_malloc(sizeof(float) * n)}. | |
| 181 @cindex precision | |
| 182 | |
| 183 @c ------------------------------------------------------------ | |
| 184 @node Using Plans, Basic Interface, Data Types and Files, FFTW Reference | |
| 185 @section Using Plans | |
| 186 | |
| 187 Plans for all transform types in FFTW are stored as type | |
| 188 @code{fftw_plan} (an opaque pointer type), and are created by one of the | |
| 189 various planning routines described in the following sections. | |
| 190 @tindex fftw_plan | |
| 191 An @code{fftw_plan} contains all information necessary to compute the | |
| 192 transform, including the pointers to the input and output arrays. | |
| 193 | |
| 194 @example | |
| 195 void fftw_execute(const fftw_plan plan); | |
| 196 @end example | |
| 197 @findex fftw_execute | |
| 198 | |
| 199 This executes the @code{plan}, to compute the corresponding transform on | |
| 200 the arrays for which it was planned (which must still exist). The plan | |
| 201 is not modified, and @code{fftw_execute} can be called as many times as | |
| 202 desired. | |
| 203 | |
| 204 To apply a given plan to a different array, you can use the new-array execute | |
| 205 interface. @xref{New-array Execute Functions}. | |
| 206 | |
| 207 @code{fftw_execute} (and equivalents) is the only function in FFTW | |
| 208 guaranteed to be thread-safe; see @ref{Thread safety}. | |
| 209 | |
| 210 This function: | |
| 211 @example | |
| 212 void fftw_destroy_plan(fftw_plan plan); | |
| 213 @end example | |
| 214 @findex fftw_destroy_plan | |
| 215 deallocates the @code{plan} and all its associated data. | |
| 216 | |
| 217 FFTW's planner saves some other persistent data, such as the | |
| 218 accumulated wisdom and a list of algorithms available in the current | |
| 219 configuration. If you want to deallocate all of that and reset FFTW | |
| 220 to the pristine state it was in when you started your program, you can | |
| 221 call: | |
| 222 | |
| 223 @example | |
| 224 void fftw_cleanup(void); | |
| 225 @end example | |
| 226 @findex fftw_cleanup | |
| 227 | |
| 228 After calling @code{fftw_cleanup}, all existing plans become undefined, | |
| 229 and you should not attempt to execute them nor to destroy them. You can | |
| 230 however create and execute/destroy new plans, in which case FFTW starts | |
| 231 accumulating wisdom information again. | |
| 232 | |
| 233 @code{fftw_cleanup} does not deallocate your plans, however. To prevent | |
| 234 memory leaks, you must still call @code{fftw_destroy_plan} before | |
| 235 executing @code{fftw_cleanup}. | |
| 236 | |
| 237 Occasionally, it may useful to know FFTW's internal ``cost'' metric | |
| 238 that it uses to compare plans to one another; this cost is | |
| 239 proportional to an execution time of the plan, in undocumented units, | |
| 240 if the plan was created with the @code{FFTW_MEASURE} or other | |
| 241 timing-based options, or alternatively is a heuristic cost function | |
| 242 for @code{FFTW_ESTIMATE} plans. (The cost values of measured and | |
| 243 estimated plans are not comparable, being in different units. Also, | |
| 244 costs from different FFTW versions or the same version compiled | |
| 245 differently may not be in the same units. Plans created from wisdom | |
| 246 have a cost of 0 since no timing measurement is performed for them. | |
| 247 Finally, certain problems for which only one top-level algorithm was | |
| 248 possible may have required no measurements of the cost of the whole | |
| 249 plan, in which case @code{fftw_cost} will also return 0.) The cost | |
| 250 metric for a given plan is returned by: | |
| 251 | |
| 252 @example | |
| 253 double fftw_cost(const fftw_plan plan); | |
| 254 @end example | |
| 255 @findex fftw_cost | |
| 256 | |
| 257 The following two routines are provided purely for academic purposes | |
| 258 (that is, for entertainment). | |
| 259 | |
| 260 @example | |
| 261 void fftw_flops(const fftw_plan plan, | |
| 262 double *add, double *mul, double *fma); | |
| 263 @end example | |
| 264 @findex fftw_flops | |
| 265 | |
| 266 Given a @code{plan}, set @code{add}, @code{mul}, and @code{fma} to an | |
| 267 exact count of the number of floating-point additions, multiplications, | |
| 268 and fused multiply-add operations involved in the plan's execution. The | |
| 269 total number of floating-point operations (flops) is @code{add + mul + | |
| 270 2*fma}, or @code{add + mul + fma} if the hardware supports fused | |
| 271 multiply-add instructions (although the number of FMA operations is only | |
| 272 approximate because of compiler voodoo). (The number of operations | |
| 273 should be an integer, but we use @code{double} to avoid overflowing | |
| 274 @code{int} for large transforms; the arguments are of type @code{double} | |
| 275 even for single and long-double precision versions of FFTW.) | |
| 276 | |
| 277 @example | |
| 278 void fftw_fprint_plan(const fftw_plan plan, FILE *output_file); | |
| 279 void fftw_print_plan(const fftw_plan plan); | |
| 280 @end example | |
| 281 @findex fftw_fprint_plan | |
| 282 @findex fftw_print_plan | |
| 283 | |
| 284 This outputs a ``nerd-readable'' representation of the @code{plan} to | |
| 285 the given file or to @code{stdout}, respectively. | |
| 286 | |
| 287 @c ------------------------------------------------------------ | |
| 288 @node Basic Interface, Advanced Interface, Using Plans, FFTW Reference | |
| 289 @section Basic Interface | |
| 290 @cindex basic interface | |
| 291 | |
| 292 Recall that the FFTW API is divided into three parts@footnote{@i{Gallia est | |
| 293 omnis divisa in partes tres} (Julius Caesar).}: the @dfn{basic interface} | |
| 294 computes a single transform of contiguous data, the @dfn{advanced | |
| 295 interface} computes transforms of multiple or strided arrays, and the | |
| 296 @dfn{guru interface} supports the most general data layouts, | |
| 297 multiplicities, and strides. This section describes the the basic | |
| 298 interface, which we expect to satisfy the needs of most users. | |
| 299 | |
| 300 @menu | |
| 301 * Complex DFTs:: | |
| 302 * Planner Flags:: | |
| 303 * Real-data DFTs:: | |
| 304 * Real-data DFT Array Format:: | |
| 305 * Real-to-Real Transforms:: | |
| 306 * Real-to-Real Transform Kinds:: | |
| 307 @end menu | |
| 308 | |
| 309 @c =========> | |
| 310 @node Complex DFTs, Planner Flags, Basic Interface, Basic Interface | |
| 311 @subsection Complex DFTs | |
| 312 | |
| 313 @example | |
| 314 fftw_plan fftw_plan_dft_1d(int n0, | |
| 315 fftw_complex *in, fftw_complex *out, | |
| 316 int sign, unsigned flags); | |
| 317 fftw_plan fftw_plan_dft_2d(int n0, int n1, | |
| 318 fftw_complex *in, fftw_complex *out, | |
| 319 int sign, unsigned flags); | |
| 320 fftw_plan fftw_plan_dft_3d(int n0, int n1, int n2, | |
| 321 fftw_complex *in, fftw_complex *out, | |
| 322 int sign, unsigned flags); | |
| 323 fftw_plan fftw_plan_dft(int rank, const int *n, | |
| 324 fftw_complex *in, fftw_complex *out, | |
| 325 int sign, unsigned flags); | |
| 326 @end example | |
| 327 @findex fftw_plan_dft_1d | |
| 328 @findex fftw_plan_dft_2d | |
| 329 @findex fftw_plan_dft_3d | |
| 330 @findex fftw_plan_dft | |
| 331 | |
| 332 Plan a complex input/output discrete Fourier transform (DFT) in zero or | |
| 333 more dimensions, returning an @code{fftw_plan} (@pxref{Using Plans}). | |
| 334 | |
| 335 Once you have created a plan for a certain transform type and | |
| 336 parameters, then creating another plan of the same type and parameters, | |
| 337 but for different arrays, is fast and shares constant data with the | |
| 338 first plan (if it still exists). | |
| 339 | |
| 340 The planner returns @code{NULL} if the plan cannot be created. In the | |
| 341 standard FFTW distribution, the basic interface is guaranteed to return | |
| 342 a non-@code{NULL} plan. A plan may be @code{NULL}, however, if you are | |
| 343 using a customized FFTW configuration supporting a restricted set of | |
| 344 transforms. | |
| 345 | |
| 346 @subsubheading Arguments | |
| 347 @itemize @bullet | |
| 348 | |
| 349 @item | |
| 350 @code{rank} is the rank of the transform (it should be the size of the | |
| 351 array @code{*n}), and can be any non-negative integer. (@xref{Complex | |
| 352 Multi-Dimensional DFTs}, for the definition of ``rank''.) The | |
| 353 @samp{_1d}, @samp{_2d}, and @samp{_3d} planners correspond to a | |
| 354 @code{rank} of @code{1}, @code{2}, and @code{3}, respectively. The rank | |
| 355 may be zero, which is equivalent to a rank-1 transform of size 1, i.e. a | |
| 356 copy of one number from input to output. | |
| 357 | |
| 358 @item | |
| 359 @code{n0}, @code{n1}, @code{n2}, or @code{n[0..rank-1]} (as appropriate | |
| 360 for each routine) specify the size of the transform dimensions. They | |
| 361 can be any positive integer. | |
| 362 | |
| 363 @itemize @minus | |
| 364 @item | |
| 365 @cindex row-major | |
| 366 Multi-dimensional arrays are stored in row-major order with dimensions: | |
| 367 @code{n0} x @code{n1}; or @code{n0} x @code{n1} x @code{n2}; or | |
| 368 @code{n[0]} x @code{n[1]} x ... x @code{n[rank-1]}. | |
| 369 @xref{Multi-dimensional Array Format}. | |
| 370 @item | |
| 371 FFTW is best at handling sizes of the form | |
| 372 @ifinfo | |
| 373 @math{2^a 3^b 5^c 7^d 11^e 13^f}, | |
| 374 @end ifinfo | |
| 375 @tex | |
| 376 $2^a 3^b 5^c 7^d 11^e 13^f$, | |
| 377 @end tex | |
| 378 @html | |
| 379 2<sup>a</sup> 3<sup>b</sup> 5<sup>c</sup> 7<sup>d</sup> | |
| 380 11<sup>e</sup> 13<sup>f</sup>, | |
| 381 @end html | |
| 382 where @math{e+f} is either @math{0} or @math{1}, and the other exponents | |
| 383 are arbitrary. Other sizes are computed by means of a slow, | |
| 384 general-purpose algorithm (which nevertheless retains @Onlogn{} performance even for prime sizes). It is possible to customize FFTW | |
| 385 for different array sizes; see @ref{Installation and Customization}. | |
| 386 Transforms whose sizes are powers of @math{2} are especially fast. | |
| 387 @end itemize | |
| 388 | |
| 389 @item | |
| 390 @code{in} and @code{out} point to the input and output arrays of the | |
| 391 transform, which may be the same (yielding an in-place transform). | |
| 392 @cindex in-place | |
| 393 These arrays are overwritten during planning, unless | |
| 394 @code{FFTW_ESTIMATE} is used in the flags. (The arrays need not be | |
| 395 initialized, but they must be allocated.) | |
| 396 | |
| 397 If @code{in == out}, the transform is @dfn{in-place} and the input | |
| 398 array is overwritten. If @code{in != out}, the two arrays must | |
| 399 not overlap (but FFTW does not check for this condition). | |
| 400 | |
| 401 @item | |
| 402 @ctindex FFTW_FORWARD | |
| 403 @ctindex FFTW_BACKWARD | |
| 404 @code{sign} is the sign of the exponent in the formula that defines the | |
| 405 Fourier transform. It can be @math{-1} (= @code{FFTW_FORWARD}) or | |
| 406 @math{+1} (= @code{FFTW_BACKWARD}). | |
| 407 | |
| 408 @item | |
| 409 @cindex flags | |
| 410 @code{flags} is a bitwise OR (@samp{|}) of zero or more planner flags, | |
| 411 as defined in @ref{Planner Flags}. | |
| 412 | |
| 413 @end itemize | |
| 414 | |
| 415 FFTW computes an unnormalized transform: computing a forward followed by | |
| 416 a backward transform (or vice versa) will result in the original data | |
| 417 multiplied by the size of the transform (the product of the dimensions). | |
| 418 @cindex normalization | |
| 419 For more information, see @ref{What FFTW Really Computes}. | |
| 420 | |
| 421 @c =========> | |
| 422 @node Planner Flags, Real-data DFTs, Complex DFTs, Basic Interface | |
| 423 @subsection Planner Flags | |
| 424 | |
| 425 All of the planner routines in FFTW accept an integer @code{flags} | |
| 426 argument, which is a bitwise OR (@samp{|}) of zero or more of the flag | |
| 427 constants defined below. These flags control the rigor (and time) of | |
| 428 the planning process, and can also impose (or lift) restrictions on the | |
| 429 type of transform algorithm that is employed. | |
| 430 | |
| 431 @emph{Important:} the planner overwrites the input array during | |
| 432 planning unless a saved plan (@pxref{Wisdom}) is available for that | |
| 433 problem, so you should initialize your input data after creating the | |
| 434 plan. The only exceptions to this are the @code{FFTW_ESTIMATE} and | |
| 435 @code{FFTW_WISDOM_ONLY} flags, as mentioned below. | |
| 436 | |
| 437 In all cases, if wisdom is available for the given problem that was | |
| 438 created with equal-or-greater planning rigor, then the more rigorous | |
| 439 wisdom is used. For example, in @code{FFTW_ESTIMATE} mode any available | |
| 440 wisdom is used, whereas in @code{FFTW_PATIENT} mode only wisdom created | |
| 441 in patient or exhaustive mode can be used. @xref{Words of Wisdom-Saving | |
| 442 Plans}. | |
| 443 | |
| 444 @subsubheading Planning-rigor flags | |
| 445 @itemize @bullet | |
| 446 | |
| 447 @item | |
| 448 @ctindex FFTW_ESTIMATE | |
| 449 @code{FFTW_ESTIMATE} specifies that, instead of actual measurements of | |
| 450 different algorithms, a simple heuristic is used to pick a (probably | |
| 451 sub-optimal) plan quickly. With this flag, the input/output arrays are | |
| 452 not overwritten during planning. | |
| 453 | |
| 454 @item | |
| 455 @ctindex FFTW_MEASURE | |
| 456 @code{FFTW_MEASURE} tells FFTW to find an optimized plan by actually | |
| 457 @emph{computing} several FFTs and measuring their execution time. | |
| 458 Depending on your machine, this can take some time (often a few | |
| 459 seconds). @code{FFTW_MEASURE} is the default planning option. | |
| 460 | |
| 461 @item | |
| 462 @ctindex FFTW_PATIENT | |
| 463 @code{FFTW_PATIENT} is like @code{FFTW_MEASURE}, but considers a wider | |
| 464 range of algorithms and often produces a ``more optimal'' plan | |
| 465 (especially for large transforms), but at the expense of several times | |
| 466 longer planning time (especially for large transforms). | |
| 467 | |
| 468 @item | |
| 469 @ctindex FFTW_EXHAUSTIVE | |
| 470 @code{FFTW_EXHAUSTIVE} is like @code{FFTW_PATIENT}, but considers an | |
| 471 even wider range of algorithms, including many that we think are | |
| 472 unlikely to be fast, to produce the most optimal plan but with a | |
| 473 substantially increased planning time. | |
| 474 | |
| 475 @item | |
| 476 @ctindex FFTW_WISDOM_ONLY | |
| 477 @code{FFTW_WISDOM_ONLY} is a special planning mode in which the plan | |
| 478 is only created if wisdom is available for the given problem, and | |
| 479 otherwise a @code{NULL} plan is returned. This can be combined with | |
| 480 other flags, e.g. @samp{FFTW_WISDOM_ONLY | FFTW_PATIENT} creates a | |
| 481 plan only if wisdom is available that was created in | |
| 482 @code{FFTW_PATIENT} or @code{FFTW_EXHAUSTIVE} mode. The | |
| 483 @code{FFTW_WISDOM_ONLY} flag is intended for users who need to detect | |
| 484 whether wisdom is available; for example, if wisdom is not available | |
| 485 one may wish to allocate new arrays for planning so that user data is | |
| 486 not overwritten. | |
| 487 | |
| 488 @end itemize | |
| 489 | |
| 490 @subsubheading Algorithm-restriction flags | |
| 491 @itemize @bullet | |
| 492 | |
| 493 @item | |
| 494 @ctindex FFTW_DESTROY_INPUT | |
| 495 @code{FFTW_DESTROY_INPUT} specifies that an out-of-place transform is | |
| 496 allowed to @emph{overwrite its input} array with arbitrary data; this | |
| 497 can sometimes allow more efficient algorithms to be employed. | |
| 498 @cindex out-of-place | |
| 499 | |
| 500 @item | |
| 501 @ctindex FFTW_PRESERVE_INPUT | |
| 502 @code{FFTW_PRESERVE_INPUT} specifies that an out-of-place transform must | |
| 503 @emph{not change its input} array. This is ordinarily the | |
| 504 @emph{default}, except for c2r and hc2r (i.e. complex-to-real) | |
| 505 transforms for which @code{FFTW_DESTROY_INPUT} is the default. In the | |
| 506 latter cases, passing @code{FFTW_PRESERVE_INPUT} will attempt to use | |
| 507 algorithms that do not destroy the input, at the expense of worse | |
| 508 performance; for multi-dimensional c2r transforms, however, no | |
| 509 input-preserving algorithms are implemented and the planner will return | |
| 510 @code{NULL} if one is requested. | |
| 511 @cindex c2r | |
| 512 @cindex hc2r | |
| 513 | |
| 514 @item | |
| 515 @ctindex FFTW_UNALIGNED | |
| 516 @cindex alignment | |
| 517 @code{FFTW_UNALIGNED} specifies that the algorithm may not impose any | |
| 518 unusual alignment requirements on the input/output arrays (i.e. no | |
| 519 SIMD may be used). This flag is normally @emph{not necessary}, since | |
| 520 the planner automatically detects misaligned arrays. The only use for | |
| 521 this flag is if you want to use the new-array execute interface to | |
| 522 execute a given plan on a different array that may not be aligned like | |
| 523 the original. (Using @code{fftw_malloc} makes this flag unnecessary | |
| 524 even then.) | |
| 525 | |
| 526 @end itemize | |
| 527 | |
| 528 @subsubheading Limiting planning time | |
| 529 | |
| 530 @example | |
| 531 extern void fftw_set_timelimit(double seconds); | |
| 532 @end example | |
| 533 @findex fftw_set_timelimit | |
| 534 | |
| 535 This function instructs FFTW to spend at most @code{seconds} seconds | |
| 536 (approximately) in the planner. If @code{seconds == | |
| 537 FFTW_NO_TIMELIMIT} (the default value, which is negative), then | |
| 538 planning time is unbounded. Otherwise, FFTW plans with a | |
| 539 progressively wider range of algorithms until the the given time limit | |
| 540 is reached or the given range of algorithms is explored, returning the | |
| 541 best available plan. | |
| 542 @ctindex FFTW_NO_TIMELIMIT | |
| 543 | |
| 544 | |
| 545 For example, specifying @code{FFTW_PATIENT} first plans in | |
| 546 @code{FFTW_ESTIMATE} mode, then in @code{FFTW_MEASURE} mode, then | |
| 547 finally (time permitting) in @code{FFTW_PATIENT}. If | |
| 548 @code{FFTW_EXHAUSTIVE} is specified instead, the planner will further | |
| 549 progress to @code{FFTW_EXHAUSTIVE} mode. | |
| 550 | |
| 551 Note that the @code{seconds} argument specifies only a rough limit; in | |
| 552 practice, the planner may use somewhat more time if the time limit is | |
| 553 reached when the planner is in the middle of an operation that cannot | |
| 554 be interrupted. At the very least, the planner will complete planning | |
| 555 in @code{FFTW_ESTIMATE} mode (which is thus equivalent to a time limit | |
| 556 of 0). | |
| 557 | |
| 558 | |
| 559 @c =========> | |
| 560 @node Real-data DFTs, Real-data DFT Array Format, Planner Flags, Basic Interface | |
| 561 @subsection Real-data DFTs | |
| 562 | |
| 563 @example | |
| 564 fftw_plan fftw_plan_dft_r2c_1d(int n0, | |
| 565 double *in, fftw_complex *out, | |
| 566 unsigned flags); | |
| 567 fftw_plan fftw_plan_dft_r2c_2d(int n0, int n1, | |
| 568 double *in, fftw_complex *out, | |
| 569 unsigned flags); | |
| 570 fftw_plan fftw_plan_dft_r2c_3d(int n0, int n1, int n2, | |
| 571 double *in, fftw_complex *out, | |
| 572 unsigned flags); | |
| 573 fftw_plan fftw_plan_dft_r2c(int rank, const int *n, | |
| 574 double *in, fftw_complex *out, | |
| 575 unsigned flags); | |
| 576 @end example | |
| 577 @findex fftw_plan_dft_r2c_1d | |
| 578 @findex fftw_plan_dft_r2c_2d | |
| 579 @findex fftw_plan_dft_r2c_3d | |
| 580 @findex fftw_plan_dft_r2c | |
| 581 @cindex r2c | |
| 582 | |
| 583 Plan a real-input/complex-output discrete Fourier transform (DFT) in | |
| 584 zero or more dimensions, returning an @code{fftw_plan} (@pxref{Using | |
| 585 Plans}). | |
| 586 | |
| 587 Once you have created a plan for a certain transform type and | |
| 588 parameters, then creating another plan of the same type and parameters, | |
| 589 but for different arrays, is fast and shares constant data with the | |
| 590 first plan (if it still exists). | |
| 591 | |
| 592 The planner returns @code{NULL} if the plan cannot be created. A | |
| 593 non-@code{NULL} plan is always returned by the basic interface unless | |
| 594 you are using a customized FFTW configuration supporting a restricted | |
| 595 set of transforms, or if you use the @code{FFTW_PRESERVE_INPUT} flag | |
| 596 with a multi-dimensional out-of-place c2r transform (see below). | |
| 597 | |
| 598 @subsubheading Arguments | |
| 599 @itemize @bullet | |
| 600 | |
| 601 @item | |
| 602 @code{rank} is the rank of the transform (it should be the size of the | |
| 603 array @code{*n}), and can be any non-negative integer. (@xref{Complex | |
| 604 Multi-Dimensional DFTs}, for the definition of ``rank''.) The | |
| 605 @samp{_1d}, @samp{_2d}, and @samp{_3d} planners correspond to a | |
| 606 @code{rank} of @code{1}, @code{2}, and @code{3}, respectively. The rank | |
| 607 may be zero, which is equivalent to a rank-1 transform of size 1, i.e. a | |
| 608 copy of one real number (with zero imaginary part) from input to output. | |
| 609 | |
| 610 @item | |
| 611 @code{n0}, @code{n1}, @code{n2}, or @code{n[0..rank-1]}, (as appropriate | |
| 612 for each routine) specify the size of the transform dimensions. They | |
| 613 can be any positive integer. This is different in general from the | |
| 614 @emph{physical} array dimensions, which are described in @ref{Real-data | |
| 615 DFT Array Format}. | |
| 616 | |
| 617 @itemize @minus | |
| 618 @item | |
| 619 FFTW is best at handling sizes of the form | |
| 620 @ifinfo | |
| 621 @math{2^a 3^b 5^c 7^d 11^e 13^f}, | |
| 622 @end ifinfo | |
| 623 @tex | |
| 624 $2^a 3^b 5^c 7^d 11^e 13^f$, | |
| 625 @end tex | |
| 626 @html | |
| 627 2<sup>a</sup> 3<sup>b</sup> 5<sup>c</sup> 7<sup>d</sup> | |
| 628 11<sup>e</sup> 13<sup>f</sup>, | |
| 629 @end html | |
| 630 where @math{e+f} is either @math{0} or @math{1}, and the other exponents | |
| 631 are arbitrary. Other sizes are computed by means of a slow, | |
| 632 general-purpose algorithm (which nevertheless retains @Onlogn{} performance even for prime sizes). (It is possible to customize FFTW | |
| 633 for different array sizes; see @ref{Installation and Customization}.) | |
| 634 Transforms whose sizes are powers of @math{2} are especially fast, and | |
| 635 it is generally beneficial for the @emph{last} dimension of an r2c/c2r | |
| 636 transform to be @emph{even}. | |
| 637 @end itemize | |
| 638 | |
| 639 @item | |
| 640 @code{in} and @code{out} point to the input and output arrays of the | |
| 641 transform, which may be the same (yielding an in-place transform). | |
| 642 @cindex in-place | |
| 643 These arrays are overwritten during planning, unless | |
| 644 @code{FFTW_ESTIMATE} is used in the flags. (The arrays need not be | |
| 645 initialized, but they must be allocated.) For an in-place transform, it | |
| 646 is important to remember that the real array will require padding, | |
| 647 described in @ref{Real-data DFT Array Format}. | |
| 648 @cindex padding | |
| 649 | |
| 650 @item | |
| 651 @cindex flags | |
| 652 @code{flags} is a bitwise OR (@samp{|}) of zero or more planner flags, | |
| 653 as defined in @ref{Planner Flags}. | |
| 654 | |
| 655 @end itemize | |
| 656 | |
| 657 The inverse transforms, taking complex input (storing the non-redundant | |
| 658 half of a logically Hermitian array) to real output, are given by: | |
| 659 | |
| 660 @example | |
| 661 fftw_plan fftw_plan_dft_c2r_1d(int n0, | |
| 662 fftw_complex *in, double *out, | |
| 663 unsigned flags); | |
| 664 fftw_plan fftw_plan_dft_c2r_2d(int n0, int n1, | |
| 665 fftw_complex *in, double *out, | |
| 666 unsigned flags); | |
| 667 fftw_plan fftw_plan_dft_c2r_3d(int n0, int n1, int n2, | |
| 668 fftw_complex *in, double *out, | |
| 669 unsigned flags); | |
| 670 fftw_plan fftw_plan_dft_c2r(int rank, const int *n, | |
| 671 fftw_complex *in, double *out, | |
| 672 unsigned flags); | |
| 673 @end example | |
| 674 @findex fftw_plan_dft_c2r_1d | |
| 675 @findex fftw_plan_dft_c2r_2d | |
| 676 @findex fftw_plan_dft_c2r_3d | |
| 677 @findex fftw_plan_dft_c2r | |
| 678 @cindex c2r | |
| 679 | |
| 680 The arguments are the same as for the r2c transforms, except that the | |
| 681 input and output data formats are reversed. | |
| 682 | |
| 683 FFTW computes an unnormalized transform: computing an r2c followed by a | |
| 684 c2r transform (or vice versa) will result in the original data | |
| 685 multiplied by the size of the transform (the product of the logical | |
| 686 dimensions). | |
| 687 @cindex normalization | |
| 688 An r2c transform produces the same output as a @code{FFTW_FORWARD} | |
| 689 complex DFT of the same input, and a c2r transform is correspondingly | |
| 690 equivalent to @code{FFTW_BACKWARD}. For more information, see @ref{What | |
| 691 FFTW Really Computes}. | |
| 692 | |
| 693 @c =========> | |
| 694 @node Real-data DFT Array Format, Real-to-Real Transforms, Real-data DFTs, Basic Interface | |
| 695 @subsection Real-data DFT Array Format | |
| 696 @cindex r2c/c2r multi-dimensional array format | |
| 697 | |
| 698 The output of a DFT of real data (r2c) contains symmetries that, in | |
| 699 principle, make half of the outputs redundant (@pxref{What FFTW Really | |
| 700 Computes}). (Similarly for the input of an inverse c2r transform.) In | |
| 701 practice, it is not possible to entirely realize these savings in an | |
| 702 efficient and understandable format that generalizes to | |
| 703 multi-dimensional transforms. Instead, the output of the r2c | |
| 704 transforms is @emph{slightly} over half of the output of the | |
| 705 corresponding complex transform. We do not ``pack'' the data in any | |
| 706 way, but store it as an ordinary array of @code{fftw_complex} values. | |
| 707 In fact, this data is simply a subsection of what would be the array in | |
| 708 the corresponding complex transform. | |
| 709 | |
| 710 Specifically, for a real transform of @math{d} (= @code{rank}) | |
| 711 dimensions @ndims{}, the complex data is an @ndimshalf array of | |
| 712 @code{fftw_complex} values in row-major order (with the division rounded | |
| 713 down). That is, we only store the @emph{lower} half (non-negative | |
| 714 frequencies), plus one element, of the last dimension of the data from | |
| 715 the ordinary complex transform. (We could have instead taken half of | |
| 716 any other dimension, but implementation turns out to be simpler if the | |
| 717 last, contiguous, dimension is used.) | |
| 718 | |
| 719 @cindex out-of-place | |
| 720 For an out-of-place transform, the real data is simply an array with | |
| 721 physical dimensions @ndims in row-major order. | |
| 722 | |
| 723 @cindex in-place | |
| 724 @cindex padding | |
| 725 For an in-place transform, some complications arise since the complex data | |
| 726 is slightly larger than the real data. In this case, the final | |
| 727 dimension of the real data must be @emph{padded} with extra values to | |
| 728 accommodate the size of the complex data---two extra if the last | |
| 729 dimension is even and one if it is odd. That is, the last dimension of | |
| 730 the real data must physically contain | |
| 731 @tex | |
| 732 $2 (n_{d-1}/2+1)$ | |
| 733 @end tex | |
| 734 @ifinfo | |
| 735 2 * (n[d-1]/2+1) | |
| 736 @end ifinfo | |
| 737 @html | |
| 738 2 * (n<sub>d-1</sub>/2+1) | |
| 739 @end html | |
| 740 @code{double} values (exactly enough to hold the complex data). This | |
| 741 physical array size does not, however, change the @emph{logical} array | |
| 742 size---only | |
| 743 @tex | |
| 744 $n_{d-1}$ | |
| 745 @end tex | |
| 746 @ifinfo | |
| 747 n[d-1] | |
| 748 @end ifinfo | |
| 749 @html | |
| 750 n<sub>d-1</sub> | |
| 751 @end html | |
| 752 values are actually stored in the last dimension, and | |
| 753 @tex | |
| 754 $n_{d-1}$ | |
| 755 @end tex | |
| 756 @ifinfo | |
| 757 n[d-1] | |
| 758 @end ifinfo | |
| 759 @html | |
| 760 n<sub>d-1</sub> | |
| 761 @end html | |
| 762 is the last dimension passed to the planner. | |
| 763 | |
| 764 @c =========> | |
| 765 @node Real-to-Real Transforms, Real-to-Real Transform Kinds, Real-data DFT Array Format, Basic Interface | |
| 766 @subsection Real-to-Real Transforms | |
| 767 @cindex r2r | |
| 768 | |
| 769 @example | |
| 770 fftw_plan fftw_plan_r2r_1d(int n, double *in, double *out, | |
| 771 fftw_r2r_kind kind, unsigned flags); | |
| 772 fftw_plan fftw_plan_r2r_2d(int n0, int n1, double *in, double *out, | |
| 773 fftw_r2r_kind kind0, fftw_r2r_kind kind1, | |
| 774 unsigned flags); | |
| 775 fftw_plan fftw_plan_r2r_3d(int n0, int n1, int n2, | |
| 776 double *in, double *out, | |
| 777 fftw_r2r_kind kind0, | |
| 778 fftw_r2r_kind kind1, | |
| 779 fftw_r2r_kind kind2, | |
| 780 unsigned flags); | |
| 781 fftw_plan fftw_plan_r2r(int rank, const int *n, double *in, double *out, | |
| 782 const fftw_r2r_kind *kind, unsigned flags); | |
| 783 @end example | |
| 784 @findex fftw_plan_r2r_1d | |
| 785 @findex fftw_plan_r2r_2d | |
| 786 @findex fftw_plan_r2r_3d | |
| 787 @findex fftw_plan_r2r | |
| 788 | |
| 789 Plan a real input/output (r2r) transform of various kinds in zero or | |
| 790 more dimensions, returning an @code{fftw_plan} (@pxref{Using Plans}). | |
| 791 | |
| 792 Once you have created a plan for a certain transform type and | |
| 793 parameters, then creating another plan of the same type and parameters, | |
| 794 but for different arrays, is fast and shares constant data with the | |
| 795 first plan (if it still exists). | |
| 796 | |
| 797 The planner returns @code{NULL} if the plan cannot be created. A | |
| 798 non-@code{NULL} plan is always returned by the basic interface unless | |
| 799 you are using a customized FFTW configuration supporting a restricted | |
| 800 set of transforms, or for size-1 @code{FFTW_REDFT00} kinds (which are | |
| 801 not defined). | |
| 802 @ctindex FFTW_REDFT00 | |
| 803 | |
| 804 @subsubheading Arguments | |
| 805 @itemize @bullet | |
| 806 | |
| 807 @item | |
| 808 @code{rank} is the dimensionality of the transform (it should be the | |
| 809 size of the arrays @code{*n} and @code{*kind}), and can be any | |
| 810 non-negative integer. The @samp{_1d}, @samp{_2d}, and @samp{_3d} | |
| 811 planners correspond to a @code{rank} of @code{1}, @code{2}, and | |
| 812 @code{3}, respectively. A @code{rank} of zero is equivalent to a copy | |
| 813 of one number from input to output. | |
| 814 | |
| 815 @item | |
| 816 @code{n}, or @code{n0}/@code{n1}/@code{n2}, or @code{n[rank]}, | |
| 817 respectively, gives the (physical) size of the transform dimensions. | |
| 818 They can be any positive integer. | |
| 819 | |
| 820 @itemize @minus | |
| 821 @item | |
| 822 @cindex row-major | |
| 823 Multi-dimensional arrays are stored in row-major order with dimensions: | |
| 824 @code{n0} x @code{n1}; or @code{n0} x @code{n1} x @code{n2}; or | |
| 825 @code{n[0]} x @code{n[1]} x ... x @code{n[rank-1]}. | |
| 826 @xref{Multi-dimensional Array Format}. | |
| 827 @item | |
| 828 FFTW is generally best at handling sizes of the form | |
| 829 @ifinfo | |
| 830 @math{2^a 3^b 5^c 7^d 11^e 13^f}, | |
| 831 @end ifinfo | |
| 832 @tex | |
| 833 $2^a 3^b 5^c 7^d 11^e 13^f$, | |
| 834 @end tex | |
| 835 @html | |
| 836 2<sup>a</sup> 3<sup>b</sup> 5<sup>c</sup> 7<sup>d</sup> | |
| 837 11<sup>e</sup> 13<sup>f</sup>, | |
| 838 @end html | |
| 839 where @math{e+f} is either @math{0} or @math{1}, and the other exponents | |
| 840 are arbitrary. Other sizes are computed by means of a slow, | |
| 841 general-purpose algorithm (which nevertheless retains @Onlogn{} performance even for prime sizes). (It is possible to customize FFTW | |
| 842 for different array sizes; see @ref{Installation and Customization}.) | |
| 843 Transforms whose sizes are powers of @math{2} are especially fast. | |
| 844 @item | |
| 845 For a @code{REDFT00} or @code{RODFT00} transform kind in a dimension of | |
| 846 size @math{n}, it is @math{n-1} or @math{n+1}, respectively, that | |
| 847 should be factorizable in the above form. | |
| 848 @end itemize | |
| 849 | |
| 850 @item | |
| 851 @code{in} and @code{out} point to the input and output arrays of the | |
| 852 transform, which may be the same (yielding an in-place transform). | |
| 853 @cindex in-place | |
| 854 These arrays are overwritten during planning, unless | |
| 855 @code{FFTW_ESTIMATE} is used in the flags. (The arrays need not be | |
| 856 initialized, but they must be allocated.) | |
| 857 | |
| 858 @item | |
| 859 @code{kind}, or @code{kind0}/@code{kind1}/@code{kind2}, or | |
| 860 @code{kind[rank]}, is the kind of r2r transform used for the | |
| 861 corresponding dimension. The valid kind constants are described in | |
| 862 @ref{Real-to-Real Transform Kinds}. In a multi-dimensional transform, | |
| 863 what is computed is the separable product formed by taking each | |
| 864 transform kind along the corresponding dimension, one dimension after | |
| 865 another. | |
| 866 | |
| 867 @item | |
| 868 @cindex flags | |
| 869 @code{flags} is a bitwise OR (@samp{|}) of zero or more planner flags, | |
| 870 as defined in @ref{Planner Flags}. | |
| 871 | |
| 872 @end itemize | |
| 873 | |
| 874 @c =========> | |
| 875 @node Real-to-Real Transform Kinds, , Real-to-Real Transforms, Basic Interface | |
| 876 @subsection Real-to-Real Transform Kinds | |
| 877 @cindex kind (r2r) | |
| 878 | |
| 879 FFTW currently supports 11 different r2r transform kinds, specified by | |
| 880 one of the constants below. For the precise definitions of these | |
| 881 transforms, see @ref{What FFTW Really Computes}. For a more colloquial | |
| 882 introduction to these transform kinds, see @ref{More DFTs of Real Data}. | |
| 883 | |
| 884 For dimension of size @code{n}, there is a corresponding ``logical'' | |
| 885 dimension @code{N} that determines the normalization (and the optimal | |
| 886 factorization); the formula for @code{N} is given for each kind below. | |
| 887 Also, with each transform kind is listed its corrsponding inverse | |
| 888 transform. FFTW computes unnormalized transforms: a transform followed | |
| 889 by its inverse will result in the original data multiplied by @code{N} | |
| 890 (or the product of the @code{N}'s for each dimension, in | |
| 891 multi-dimensions). | |
| 892 @cindex normalization | |
| 893 | |
| 894 @itemize @bullet | |
| 895 | |
| 896 @item | |
| 897 @ctindex FFTW_R2HC | |
| 898 @code{FFTW_R2HC} computes a real-input DFT with output in | |
| 899 ``halfcomplex'' format, i.e. real and imaginary parts for a transform of | |
| 900 size @code{n} stored as: | |
| 901 @tex | |
| 902 $$ | |
| 903 r_0, r_1, r_2, \ldots, r_{n/2}, i_{(n+1)/2-1}, \ldots, i_2, i_1 | |
| 904 $$ | |
| 905 @end tex | |
| 906 @ifinfo | |
| 907 r0, r1, r2, r(n/2), i((n+1)/2-1), ..., i2, i1 | |
| 908 @end ifinfo | |
| 909 @html | |
| 910 <p align=center> | |
| 911 r<sub>0</sub>, r<sub>1</sub>, r<sub>2</sub>, ..., r<sub>n/2</sub>, i<sub>(n+1)/2-1</sub>, ..., i<sub>2</sub>, i<sub>1</sub> | |
| 912 </p> | |
| 913 @end html | |
| 914 (Logical @code{N=n}, inverse is @code{FFTW_HC2R}.) | |
| 915 | |
| 916 @item | |
| 917 @ctindex FFTW_HC2R | |
| 918 @code{FFTW_HC2R} computes the reverse of @code{FFTW_R2HC}, above. | |
| 919 (Logical @code{N=n}, inverse is @code{FFTW_R2HC}.) | |
| 920 | |
| 921 @item | |
| 922 @ctindex FFTW_DHT | |
| 923 @code{FFTW_DHT} computes a discrete Hartley transform. | |
| 924 (Logical @code{N=n}, inverse is @code{FFTW_DHT}.) | |
| 925 @cindex discrete Hartley transform | |
| 926 | |
| 927 @item | |
| 928 @ctindex FFTW_REDFT00 | |
| 929 @code{FFTW_REDFT00} computes an REDFT00 transform, i.e. a DCT-I. | |
| 930 (Logical @code{N=2*(n-1)}, inverse is @code{FFTW_REDFT00}.) | |
| 931 @cindex discrete cosine transform | |
| 932 @cindex DCT | |
| 933 | |
| 934 @item | |
| 935 @ctindex FFTW_REDFT10 | |
| 936 @code{FFTW_REDFT10} computes an REDFT10 transform, i.e. a DCT-II (sometimes called ``the'' DCT). | |
| 937 (Logical @code{N=2*n}, inverse is @code{FFTW_REDFT01}.) | |
| 938 | |
| 939 @item | |
| 940 @ctindex FFTW_REDFT01 | |
| 941 @code{FFTW_REDFT01} computes an REDFT01 transform, i.e. a DCT-III (sometimes called ``the'' IDCT, being the inverse of DCT-II). | |
| 942 (Logical @code{N=2*n}, inverse is @code{FFTW_REDFT=10}.) | |
| 943 @cindex IDCT | |
| 944 | |
| 945 @item | |
| 946 @ctindex FFTW_REDFT11 | |
| 947 @code{FFTW_REDFT11} computes an REDFT11 transform, i.e. a DCT-IV. | |
| 948 (Logical @code{N=2*n}, inverse is @code{FFTW_REDFT11}.) | |
| 949 | |
| 950 @item | |
| 951 @ctindex FFTW_RODFT00 | |
| 952 @code{FFTW_RODFT00} computes an RODFT00 transform, i.e. a DST-I. | |
| 953 (Logical @code{N=2*(n+1)}, inverse is @code{FFTW_RODFT00}.) | |
| 954 @cindex discrete sine transform | |
| 955 @cindex DST | |
| 956 | |
| 957 @item | |
| 958 @ctindex FFTW_RODFT10 | |
| 959 @code{FFTW_RODFT10} computes an RODFT10 transform, i.e. a DST-II. | |
| 960 (Logical @code{N=2*n}, inverse is @code{FFTW_RODFT01}.) | |
| 961 | |
| 962 @item | |
| 963 @ctindex FFTW_RODFT01 | |
| 964 @code{FFTW_RODFT01} computes an RODFT01 transform, i.e. a DST-III. | |
| 965 (Logical @code{N=2*n}, inverse is @code{FFTW_RODFT=10}.) | |
| 966 | |
| 967 @item | |
| 968 @ctindex FFTW_RODFT11 | |
| 969 @code{FFTW_RODFT11} computes an RODFT11 transform, i.e. a DST-IV. | |
| 970 (Logical @code{N=2*n}, inverse is @code{FFTW_RODFT11}.) | |
| 971 | |
| 972 @end itemize | |
| 973 | |
| 974 @c ------------------------------------------------------------ | |
| 975 @node Advanced Interface, Guru Interface, Basic Interface, FFTW Reference | |
| 976 @section Advanced Interface | |
| 977 @cindex advanced interface | |
| 978 | |
| 979 FFTW's ``advanced'' interface supplements the basic interface with four | |
| 980 new planner routines, providing a new level of flexibility: you can plan | |
| 981 a transform of multiple arrays simultaneously, operate on non-contiguous | |
| 982 (strided) data, and transform a subset of a larger multi-dimensional | |
| 983 array. Other than these additional features, the planner operates in | |
| 984 the same fashion as in the basic interface, and the resulting | |
| 985 @code{fftw_plan} is used in the same way (@pxref{Using Plans}). | |
| 986 | |
| 987 @menu | |
| 988 * Advanced Complex DFTs:: | |
| 989 * Advanced Real-data DFTs:: | |
| 990 * Advanced Real-to-real Transforms:: | |
| 991 @end menu | |
| 992 | |
| 993 @c =========> | |
| 994 @node Advanced Complex DFTs, Advanced Real-data DFTs, Advanced Interface, Advanced Interface | |
| 995 @subsection Advanced Complex DFTs | |
| 996 | |
| 997 @example | |
| 998 fftw_plan fftw_plan_many_dft(int rank, const int *n, int howmany, | |
| 999 fftw_complex *in, const int *inembed, | |
| 1000 int istride, int idist, | |
| 1001 fftw_complex *out, const int *onembed, | |
| 1002 int ostride, int odist, | |
| 1003 int sign, unsigned flags); | |
| 1004 @end example | |
| 1005 @findex fftw_plan_many_dft | |
| 1006 | |
| 1007 This routine plans multiple multidimensional complex DFTs, and it | |
| 1008 extends the @code{fftw_plan_dft} routine (@pxref{Complex DFTs}) to | |
| 1009 compute @code{howmany} transforms, each having rank @code{rank} and size | |
| 1010 @code{n}. In addition, the transform data need not be contiguous, but | |
| 1011 it may be laid out in memory with an arbitrary stride. To account for | |
| 1012 these possibilities, @code{fftw_plan_many_dft} adds the new parameters | |
| 1013 @code{howmany}, @{@code{i},@code{o}@}@code{nembed}, | |
| 1014 @{@code{i},@code{o}@}@code{stride}, and | |
| 1015 @{@code{i},@code{o}@}@code{dist}. The FFTW basic interface | |
| 1016 (@pxref{Complex DFTs}) provides routines specialized for ranks 1, 2, | |
| 1017 and@tie{}3, but the advanced interface handles only the general-rank | |
| 1018 case. | |
| 1019 | |
| 1020 @code{howmany} is the number of transforms to compute. The resulting | |
| 1021 plan computes @code{howmany} transforms, where the input of the | |
| 1022 @code{k}-th transform is at location @code{in+k*idist} (in C pointer | |
| 1023 arithmetic), and its output is at location @code{out+k*odist}. Plans | |
| 1024 obtained in this way can often be faster than calling FFTW multiple | |
| 1025 times for the individual transforms. The basic @code{fftw_plan_dft} | |
| 1026 interface corresponds to @code{howmany=1} (in which case the @code{dist} | |
| 1027 parameters are ignored). | |
| 1028 @cindex howmany parameter | |
| 1029 @cindex dist | |
| 1030 | |
| 1031 | |
| 1032 Each of the @code{howmany} transforms has rank @code{rank} and size | |
| 1033 @code{n}, as in the basic interface. In addition, the advanced | |
| 1034 interface allows the input and output arrays of each transform to be | |
| 1035 row-major subarrays of larger rank-@code{rank} arrays, described by | |
| 1036 @code{inembed} and @code{onembed} parameters, respectively. | |
| 1037 @{@code{i},@code{o}@}@code{nembed} must be arrays of length @code{rank}, | |
| 1038 and @code{n} should be elementwise less than or equal to | |
| 1039 @{@code{i},@code{o}@}@code{nembed}. Passing @code{NULL} for an | |
| 1040 @code{nembed} parameter is equivalent to passing @code{n} (i.e. same | |
| 1041 physical and logical dimensions, as in the basic interface.) | |
| 1042 | |
| 1043 The @code{stride} parameters indicate that the @code{j}-th element of | |
| 1044 the input or output arrays is located at @code{j*istride} or | |
| 1045 @code{j*ostride}, respectively. (For a multi-dimensional array, | |
| 1046 @code{j} is the ordinary row-major index.) When combined with the | |
| 1047 @code{k}-th transform in a @code{howmany} loop, from above, this means | |
| 1048 that the (@code{j},@code{k})-th element is at @code{j*stride+k*dist}. | |
| 1049 (The basic @code{fftw_plan_dft} interface corresponds to a stride of 1.) | |
| 1050 @cindex stride | |
| 1051 | |
| 1052 | |
| 1053 For in-place transforms, the input and output @code{stride} and | |
| 1054 @code{dist} parameters should be the same; otherwise, the planner may | |
| 1055 return @code{NULL}. | |
| 1056 | |
| 1057 Arrays @code{n}, @code{inembed}, and @code{onembed} are not used after | |
| 1058 this function returns. You can safely free or reuse them. | |
| 1059 | |
| 1060 @strong{Examples}: | |
| 1061 One transform of one 5 by 6 array contiguous in memory: | |
| 1062 @example | |
| 1063 int rank = 2; | |
| 1064 int n[] = @{5, 6@}; | |
| 1065 int howmany = 1; | |
| 1066 int idist = odist = 0; /* unused because howmany = 1 */ | |
| 1067 int istride = ostride = 1; /* array is contiguous in memory */ | |
| 1068 int *inembed = n, *onembed = n; | |
| 1069 @end example | |
| 1070 | |
| 1071 Transform of three 5 by 6 arrays, each contiguous in memory, | |
| 1072 stored in memory one after another: | |
| 1073 @example | |
| 1074 int rank = 2; | |
| 1075 int n[] = @{5, 6@}; | |
| 1076 int howmany = 3; | |
| 1077 int idist = odist = n[0]*n[1]; /* = 30, the distance in memory | |
| 1078 between the first element | |
| 1079 of the first array and the | |
| 1080 first element of the second array */ | |
| 1081 int istride = ostride = 1; /* array is contiguous in memory */ | |
| 1082 int *inembed = n, *onembed = n; | |
| 1083 @end example | |
| 1084 | |
| 1085 Transform each column of a 2d array with 10 rows and 3 columns: | |
| 1086 @example | |
| 1087 int rank = 1; /* not 2: we are computing 1d transforms */ | |
| 1088 int n[] = @{10@}; /* 1d transforms of length 10 */ | |
| 1089 int howmany = 3; | |
| 1090 int idist = odist = 1; | |
| 1091 int istride = ostride = 3; /* distance between two elements in | |
| 1092 the same column */ | |
| 1093 int *inembed = n, *onembed = n; | |
| 1094 @end example | |
| 1095 | |
| 1096 @c =========> | |
| 1097 @node Advanced Real-data DFTs, Advanced Real-to-real Transforms, Advanced Complex DFTs, Advanced Interface | |
| 1098 @subsection Advanced Real-data DFTs | |
| 1099 | |
| 1100 @example | |
| 1101 fftw_plan fftw_plan_many_dft_r2c(int rank, const int *n, int howmany, | |
| 1102 double *in, const int *inembed, | |
| 1103 int istride, int idist, | |
| 1104 fftw_complex *out, const int *onembed, | |
| 1105 int ostride, int odist, | |
| 1106 unsigned flags); | |
| 1107 fftw_plan fftw_plan_many_dft_c2r(int rank, const int *n, int howmany, | |
| 1108 fftw_complex *in, const int *inembed, | |
| 1109 int istride, int idist, | |
| 1110 double *out, const int *onembed, | |
| 1111 int ostride, int odist, | |
| 1112 unsigned flags); | |
| 1113 @end example | |
| 1114 @findex fftw_plan_many_dft_r2c | |
| 1115 @findex fftw_plan_many_dft_c2r | |
| 1116 | |
| 1117 Like @code{fftw_plan_many_dft}, these two functions add @code{howmany}, | |
| 1118 @code{nembed}, @code{stride}, and @code{dist} parameters to the | |
| 1119 @code{fftw_plan_dft_r2c} and @code{fftw_plan_dft_c2r} functions, but | |
| 1120 otherwise behave the same as the basic interface. | |
| 1121 | |
| 1122 The interpretation of @code{howmany}, @code{stride}, and @code{dist} are | |
| 1123 the same as for @code{fftw_plan_many_dft}, above. Note that the | |
| 1124 @code{stride} and @code{dist} for the real array are in units of | |
| 1125 @code{double}, and for the complex array are in units of | |
| 1126 @code{fftw_complex}. | |
| 1127 | |
| 1128 If an @code{nembed} parameter is @code{NULL}, it is interpreted as what | |
| 1129 it would be in the basic interface, as described in @ref{Real-data DFT | |
| 1130 Array Format}. That is, for the complex array the size is assumed to be | |
| 1131 the same as @code{n}, but with the last dimension cut roughly in half. | |
| 1132 For the real array, the size is assumed to be @code{n} if the transform | |
| 1133 is out-of-place, or @code{n} with the last dimension ``padded'' if the | |
| 1134 transform is in-place. | |
| 1135 | |
| 1136 If an @code{nembed} parameter is non-@code{NULL}, it is interpreted as | |
| 1137 the physical size of the corresponding array, in row-major order, just | |
| 1138 as for @code{fftw_plan_many_dft}. In this case, each dimension of | |
| 1139 @code{nembed} should be @code{>=} what it would be in the basic | |
| 1140 interface (e.g. the halved or padded @code{n}). | |
| 1141 | |
| 1142 Arrays @code{n}, @code{inembed}, and @code{onembed} are not used after | |
| 1143 this function returns. You can safely free or reuse them. | |
| 1144 | |
| 1145 @c =========> | |
| 1146 @node Advanced Real-to-real Transforms, , Advanced Real-data DFTs, Advanced Interface | |
| 1147 @subsection Advanced Real-to-real Transforms | |
| 1148 | |
| 1149 @example | |
| 1150 fftw_plan fftw_plan_many_r2r(int rank, const int *n, int howmany, | |
| 1151 double *in, const int *inembed, | |
| 1152 int istride, int idist, | |
| 1153 double *out, const int *onembed, | |
| 1154 int ostride, int odist, | |
| 1155 const fftw_r2r_kind *kind, unsigned flags); | |
| 1156 @end example | |
| 1157 @findex fftw_plan_many_r2r | |
| 1158 | |
| 1159 Like @code{fftw_plan_many_dft}, this functions adds @code{howmany}, | |
| 1160 @code{nembed}, @code{stride}, and @code{dist} parameters to the | |
| 1161 @code{fftw_plan_r2r} function, but otherwise behave the same as the | |
| 1162 basic interface. The interpretation of those additional parameters are | |
| 1163 the same as for @code{fftw_plan_many_dft}. (Of course, the | |
| 1164 @code{stride} and @code{dist} parameters are now in units of | |
| 1165 @code{double}, not @code{fftw_complex}.) | |
| 1166 | |
| 1167 Arrays @code{n}, @code{inembed}, @code{onembed}, and @code{kind} are not | |
| 1168 used after this function returns. You can safely free or reuse them. | |
| 1169 | |
| 1170 @c ------------------------------------------------------------ | |
| 1171 @node Guru Interface, New-array Execute Functions, Advanced Interface, FFTW Reference | |
| 1172 @section Guru Interface | |
| 1173 @cindex guru interface | |
| 1174 | |
| 1175 The ``guru'' interface to FFTW is intended to expose as much as possible | |
| 1176 of the flexibility in the underlying FFTW architecture. It allows one | |
| 1177 to compute multi-dimensional ``vectors'' (loops) of multi-dimensional | |
| 1178 transforms, where each vector/transform dimension has an independent | |
| 1179 size and stride. | |
| 1180 @cindex vector | |
| 1181 One can also use more general complex-number formats, e.g. separate real | |
| 1182 and imaginary arrays. | |
| 1183 | |
| 1184 For those users who require the flexibility of the guru interface, it is | |
| 1185 important that they pay special attention to the documentation lest they | |
| 1186 shoot themselves in the foot. | |
| 1187 | |
| 1188 @menu | |
| 1189 * Interleaved and split arrays:: | |
| 1190 * Guru vector and transform sizes:: | |
| 1191 * Guru Complex DFTs:: | |
| 1192 * Guru Real-data DFTs:: | |
| 1193 * Guru Real-to-real Transforms:: | |
| 1194 * 64-bit Guru Interface:: | |
| 1195 @end menu | |
| 1196 | |
| 1197 @c =========> | |
| 1198 @node Interleaved and split arrays, Guru vector and transform sizes, Guru Interface, Guru Interface | |
| 1199 @subsection Interleaved and split arrays | |
| 1200 | |
| 1201 The guru interface supports two representations of complex numbers, | |
| 1202 which we call the interleaved and the split format. | |
| 1203 | |
| 1204 The @dfn{interleaved} format is the same one used by the basic and | |
| 1205 advanced interfaces, and it is documented in @ref{Complex numbers}. | |
| 1206 In the interleaved format, you provide pointers to the real part of a | |
| 1207 complex number, and the imaginary part understood to be stored in the | |
| 1208 next memory location. | |
| 1209 @cindex interleaved format | |
| 1210 | |
| 1211 | |
| 1212 The @dfn{split} format allows separate pointers to the real and | |
| 1213 imaginary parts of a complex array. | |
| 1214 @cindex split format | |
| 1215 | |
| 1216 | |
| 1217 Technically, the interleaved format is redundant, because you can | |
| 1218 always express an interleaved array in terms of a split array with | |
| 1219 appropriate pointers and strides. On the other hand, the interleaved | |
| 1220 format is simpler to use, and it is common in practice. Hence, FFTW | |
| 1221 supports it as a special case. | |
| 1222 | |
| 1223 @c =========> | |
| 1224 @node Guru vector and transform sizes, Guru Complex DFTs, Interleaved and split arrays, Guru Interface | |
| 1225 @subsection Guru vector and transform sizes | |
| 1226 | |
| 1227 The guru interface introduces one basic new data structure, | |
| 1228 @code{fftw_iodim}, that is used to specify sizes and strides for | |
| 1229 multi-dimensional transforms and vectors: | |
| 1230 | |
| 1231 @example | |
| 1232 typedef struct @{ | |
| 1233 int n; | |
| 1234 int is; | |
| 1235 int os; | |
| 1236 @} fftw_iodim; | |
| 1237 @end example | |
| 1238 @tindex fftw_iodim | |
| 1239 | |
| 1240 Here, @code{n} is the size of the dimension, and @code{is} and @code{os} | |
| 1241 are the strides of that dimension for the input and output arrays. (The | |
| 1242 stride is the separation of consecutive elements along this dimension.) | |
| 1243 | |
| 1244 The meaning of the stride parameter depends on the type of the array | |
| 1245 that the stride refers to. @emph{If the array is interleaved complex, | |
| 1246 strides are expressed in units of complex numbers | |
| 1247 (@code{fftw_complex}). If the array is split complex or real, strides | |
| 1248 are expressed in units of real numbers (@code{double}).} This | |
| 1249 convention is consistent with the usual pointer arithmetic in the C | |
| 1250 language. An interleaved array is denoted by a pointer @code{p} to | |
| 1251 @code{fftw_complex}, so that @code{p+1} points to the next complex | |
| 1252 number. Split arrays are denoted by pointers to @code{double}, in | |
| 1253 which case pointer arithmetic operates in units of | |
| 1254 @code{sizeof(double)}. | |
| 1255 @cindex stride | |
| 1256 | |
| 1257 | |
| 1258 The guru planner interfaces all take a (@code{rank}, @code{dims[rank]}) | |
| 1259 pair describing the transform size, and a (@code{howmany_rank}, | |
| 1260 @code{howmany_dims[howmany_rank]}) pair describing the ``vector'' size (a | |
| 1261 multi-dimensional loop of transforms to perform), where @code{dims} and | |
| 1262 @code{howmany_dims} are arrays of @code{fftw_iodim}. | |
| 1263 | |
| 1264 For example, the @code{howmany} parameter in the advanced complex-DFT | |
| 1265 interface corresponds to @code{howmany_rank} = 1, | |
| 1266 @code{howmany_dims[0].n} = @code{howmany}, @code{howmany_dims[0].is} = | |
| 1267 @code{idist}, and @code{howmany_dims[0].os} = @code{odist}. | |
| 1268 @cindex howmany loop | |
| 1269 @cindex dist | |
| 1270 (To compute a single transform, you can just use @code{howmany_rank} = 0.) | |
| 1271 | |
| 1272 | |
| 1273 A row-major multidimensional array with dimensions @code{n[rank]} | |
| 1274 (@pxref{Row-major Format}) corresponds to @code{dims[i].n} = | |
| 1275 @code{n[i]} and the recurrence @code{dims[i].is} = @code{n[i+1] * | |
| 1276 dims[i+1].is} (similarly for @code{os}). The stride of the last | |
| 1277 (@code{i=rank-1}) dimension is the overall stride of the array. | |
| 1278 e.g. to be equivalent to the advanced complex-DFT interface, you would | |
| 1279 have @code{dims[rank-1].is} = @code{istride} and | |
| 1280 @code{dims[rank-1].os} = @code{ostride}. | |
| 1281 @cindex row-major | |
| 1282 | |
| 1283 | |
| 1284 In general, we only guarantee FFTW to return a non-@code{NULL} plan if | |
| 1285 the vector and transform dimensions correspond to a set of distinct | |
| 1286 indices, and for in-place transforms the input/output strides should | |
| 1287 be the same. | |
| 1288 | |
| 1289 @c =========> | |
| 1290 @node Guru Complex DFTs, Guru Real-data DFTs, Guru vector and transform sizes, Guru Interface | |
| 1291 @subsection Guru Complex DFTs | |
| 1292 | |
| 1293 @example | |
| 1294 fftw_plan fftw_plan_guru_dft( | |
| 1295 int rank, const fftw_iodim *dims, | |
| 1296 int howmany_rank, const fftw_iodim *howmany_dims, | |
| 1297 fftw_complex *in, fftw_complex *out, | |
| 1298 int sign, unsigned flags); | |
| 1299 | |
| 1300 fftw_plan fftw_plan_guru_split_dft( | |
| 1301 int rank, const fftw_iodim *dims, | |
| 1302 int howmany_rank, const fftw_iodim *howmany_dims, | |
| 1303 double *ri, double *ii, double *ro, double *io, | |
| 1304 unsigned flags); | |
| 1305 @end example | |
| 1306 @findex fftw_plan_guru_dft | |
| 1307 @findex fftw_plan_guru_split_dft | |
| 1308 | |
| 1309 These two functions plan a complex-data, multi-dimensional DFT | |
| 1310 for the interleaved and split format, respectively. | |
| 1311 Transform dimensions are given by (@code{rank}, @code{dims}) over a | |
| 1312 multi-dimensional vector (loop) of dimensions (@code{howmany_rank}, | |
| 1313 @code{howmany_dims}). @code{dims} and @code{howmany_dims} should point | |
| 1314 to @code{fftw_iodim} arrays of length @code{rank} and | |
| 1315 @code{howmany_rank}, respectively. | |
| 1316 | |
| 1317 @cindex flags | |
| 1318 @code{flags} is a bitwise OR (@samp{|}) of zero or more planner flags, | |
| 1319 as defined in @ref{Planner Flags}. | |
| 1320 | |
| 1321 In the @code{fftw_plan_guru_dft} function, the pointers @code{in} and | |
| 1322 @code{out} point to the interleaved input and output arrays, | |
| 1323 respectively. The sign can be either @math{-1} (= | |
| 1324 @code{FFTW_FORWARD}) or @math{+1} (= @code{FFTW_BACKWARD}). If the | |
| 1325 pointers are equal, the transform is in-place. | |
| 1326 | |
| 1327 In the @code{fftw_plan_guru_split_dft} function, | |
| 1328 @code{ri} and @code{ii} point to the real and imaginary input arrays, | |
| 1329 and @code{ro} and @code{io} point to the real and imaginary output | |
| 1330 arrays. The input and output pointers may be the same, indicating an | |
| 1331 in-place transform. For example, for @code{fftw_complex} pointers | |
| 1332 @code{in} and @code{out}, the corresponding parameters are: | |
| 1333 | |
| 1334 @example | |
| 1335 ri = (double *) in; | |
| 1336 ii = (double *) in + 1; | |
| 1337 ro = (double *) out; | |
| 1338 io = (double *) out + 1; | |
| 1339 @end example | |
| 1340 | |
| 1341 Because @code{fftw_plan_guru_split_dft} accepts split arrays, strides | |
| 1342 are expressed in units of @code{double}. For a contiguous | |
| 1343 @code{fftw_complex} array, the overall stride of the transform should | |
| 1344 be 2, the distance between consecutive real parts or between | |
| 1345 consecutive imaginary parts; see @ref{Guru vector and transform | |
| 1346 sizes}. Note that the dimension strides are applied equally to the | |
| 1347 real and imaginary parts; real and imaginary arrays with different | |
| 1348 strides are not supported. | |
| 1349 | |
| 1350 There is no @code{sign} parameter in @code{fftw_plan_guru_split_dft}. | |
| 1351 This function always plans for an @code{FFTW_FORWARD} transform. To | |
| 1352 plan for an @code{FFTW_BACKWARD} transform, you can exploit the | |
| 1353 identity that the backwards DFT is equal to the forwards DFT with the | |
| 1354 real and imaginary parts swapped. For example, in the case of the | |
| 1355 @code{fftw_complex} arrays above, the @code{FFTW_BACKWARD} transform | |
| 1356 is computed by the parameters: | |
| 1357 | |
| 1358 @example | |
| 1359 ri = (double *) in + 1; | |
| 1360 ii = (double *) in; | |
| 1361 ro = (double *) out + 1; | |
| 1362 io = (double *) out; | |
| 1363 @end example | |
| 1364 | |
| 1365 @c =========> | |
| 1366 @node Guru Real-data DFTs, Guru Real-to-real Transforms, Guru Complex DFTs, Guru Interface | |
| 1367 @subsection Guru Real-data DFTs | |
| 1368 | |
| 1369 @example | |
| 1370 fftw_plan fftw_plan_guru_dft_r2c( | |
| 1371 int rank, const fftw_iodim *dims, | |
| 1372 int howmany_rank, const fftw_iodim *howmany_dims, | |
| 1373 double *in, fftw_complex *out, | |
| 1374 unsigned flags); | |
| 1375 | |
| 1376 fftw_plan fftw_plan_guru_split_dft_r2c( | |
| 1377 int rank, const fftw_iodim *dims, | |
| 1378 int howmany_rank, const fftw_iodim *howmany_dims, | |
| 1379 double *in, double *ro, double *io, | |
| 1380 unsigned flags); | |
| 1381 | |
| 1382 fftw_plan fftw_plan_guru_dft_c2r( | |
| 1383 int rank, const fftw_iodim *dims, | |
| 1384 int howmany_rank, const fftw_iodim *howmany_dims, | |
| 1385 fftw_complex *in, double *out, | |
| 1386 unsigned flags); | |
| 1387 | |
| 1388 fftw_plan fftw_plan_guru_split_dft_c2r( | |
| 1389 int rank, const fftw_iodim *dims, | |
| 1390 int howmany_rank, const fftw_iodim *howmany_dims, | |
| 1391 double *ri, double *ii, double *out, | |
| 1392 unsigned flags); | |
| 1393 @end example | |
| 1394 @findex fftw_plan_guru_dft_r2c | |
| 1395 @findex fftw_plan_guru_split_dft_r2c | |
| 1396 @findex fftw_plan_guru_dft_c2r | |
| 1397 @findex fftw_plan_guru_split_dft_c2r | |
| 1398 | |
| 1399 Plan a real-input (r2c) or real-output (c2r), multi-dimensional DFT with | |
| 1400 transform dimensions given by (@code{rank}, @code{dims}) over a | |
| 1401 multi-dimensional vector (loop) of dimensions (@code{howmany_rank}, | |
| 1402 @code{howmany_dims}). @code{dims} and @code{howmany_dims} should point | |
| 1403 to @code{fftw_iodim} arrays of length @code{rank} and | |
| 1404 @code{howmany_rank}, respectively. As for the basic and advanced | |
| 1405 interfaces, an r2c transform is @code{FFTW_FORWARD} and a c2r transform | |
| 1406 is @code{FFTW_BACKWARD}. | |
| 1407 | |
| 1408 The @emph{last} dimension of @code{dims} is interpreted specially: | |
| 1409 that dimension of the real array has size @code{dims[rank-1].n}, but | |
| 1410 that dimension of the complex array has size @code{dims[rank-1].n/2+1} | |
| 1411 (division rounded down). The strides, on the other hand, are taken to | |
| 1412 be exactly as specified. It is up to the user to specify the strides | |
| 1413 appropriately for the peculiar dimensions of the data, and we do not | |
| 1414 guarantee that the planner will succeed (return non-@code{NULL}) for | |
| 1415 any dimensions other than those described in @ref{Real-data DFT Array | |
| 1416 Format} and generalized in @ref{Advanced Real-data DFTs}. (That is, | |
| 1417 for an in-place transform, each individual dimension should be able to | |
| 1418 operate in place.) | |
| 1419 @cindex in-place | |
| 1420 | |
| 1421 | |
| 1422 @code{in} and @code{out} point to the input and output arrays for r2c | |
| 1423 and c2r transforms, respectively. For split arrays, @code{ri} and | |
| 1424 @code{ii} point to the real and imaginary input arrays for a c2r | |
| 1425 transform, and @code{ro} and @code{io} point to the real and imaginary | |
| 1426 output arrays for an r2c transform. @code{in} and @code{ro} or | |
| 1427 @code{ri} and @code{out} may be the same, indicating an in-place | |
| 1428 transform. (In-place transforms where @code{in} and @code{io} or | |
| 1429 @code{ii} and @code{out} are the same are not currently supported.) | |
| 1430 | |
| 1431 @cindex flags | |
| 1432 @code{flags} is a bitwise OR (@samp{|}) of zero or more planner flags, | |
| 1433 as defined in @ref{Planner Flags}. | |
| 1434 | |
| 1435 In-place transforms of rank greater than 1 are currently only | |
| 1436 supported for interleaved arrays. For split arrays, the planner will | |
| 1437 return @code{NULL}. | |
| 1438 @cindex in-place | |
| 1439 | |
| 1440 @c =========> | |
| 1441 @node Guru Real-to-real Transforms, 64-bit Guru Interface, Guru Real-data DFTs, Guru Interface | |
| 1442 @subsection Guru Real-to-real Transforms | |
| 1443 | |
| 1444 @example | |
| 1445 fftw_plan fftw_plan_guru_r2r(int rank, const fftw_iodim *dims, | |
| 1446 int howmany_rank, | |
| 1447 const fftw_iodim *howmany_dims, | |
| 1448 double *in, double *out, | |
| 1449 const fftw_r2r_kind *kind, | |
| 1450 unsigned flags); | |
| 1451 @end example | |
| 1452 @findex fftw_plan_guru_r2r | |
| 1453 | |
| 1454 Plan a real-to-real (r2r) multi-dimensional @code{FFTW_FORWARD} | |
| 1455 transform with transform dimensions given by (@code{rank}, @code{dims}) | |
| 1456 over a multi-dimensional vector (loop) of dimensions | |
| 1457 (@code{howmany_rank}, @code{howmany_dims}). @code{dims} and | |
| 1458 @code{howmany_dims} should point to @code{fftw_iodim} arrays of length | |
| 1459 @code{rank} and @code{howmany_rank}, respectively. | |
| 1460 | |
| 1461 The transform kind of each dimension is given by the @code{kind} | |
| 1462 parameter, which should point to an array of length @code{rank}. Valid | |
| 1463 @code{fftw_r2r_kind} constants are given in @ref{Real-to-Real Transform | |
| 1464 Kinds}. | |
| 1465 | |
| 1466 @code{in} and @code{out} point to the real input and output arrays; they | |
| 1467 may be the same, indicating an in-place transform. | |
| 1468 | |
| 1469 @cindex flags | |
| 1470 @code{flags} is a bitwise OR (@samp{|}) of zero or more planner flags, | |
| 1471 as defined in @ref{Planner Flags}. | |
| 1472 | |
| 1473 @c =========> | |
| 1474 @node 64-bit Guru Interface, , Guru Real-to-real Transforms, Guru Interface | |
| 1475 @subsection 64-bit Guru Interface | |
| 1476 @cindex 64-bit architecture | |
| 1477 | |
| 1478 When compiled in 64-bit mode on a 64-bit architecture (where addresses | |
| 1479 are 64 bits wide), FFTW uses 64-bit quantities internally for all | |
| 1480 transform sizes, strides, and so on---you don't have to do anything | |
| 1481 special to exploit this. However, in the ordinary FFTW interfaces, | |
| 1482 you specify the transform size by an @code{int} quantity, which is | |
| 1483 normally only 32 bits wide. This means that, even though FFTW is | |
| 1484 using 64-bit sizes internally, you cannot specify a single transform | |
| 1485 dimension larger than | |
| 1486 @ifinfo | |
| 1487 2^31-1 | |
| 1488 @end ifinfo | |
| 1489 @html | |
| 1490 2<sup><small>31</small></sup>−1 | |
| 1491 @end html | |
| 1492 @tex | |
| 1493 $2^31-1$ | |
| 1494 @end tex | |
| 1495 numbers. | |
| 1496 | |
| 1497 We expect that few users will require transforms larger than this, but, | |
| 1498 for those who do, we provide a 64-bit version of the guru interface in | |
| 1499 which all sizes are specified as integers of type @code{ptrdiff_t} | |
| 1500 instead of @code{int}. (@code{ptrdiff_t} is a signed integer type | |
| 1501 defined by the C standard to be wide enough to represent address | |
| 1502 differences, and thus must be at least 64 bits wide on a 64-bit | |
| 1503 machine.) We stress that there is @emph{no performance advantage} to | |
| 1504 using this interface---the same internal FFTW code is employed | |
| 1505 regardless---and it is only necessary if you want to specify very | |
| 1506 large transform sizes. | |
| 1507 @tindex ptrdiff_t | |
| 1508 | |
| 1509 | |
| 1510 In particular, the 64-bit guru interface is a set of planner routines | |
| 1511 that are exactly the same as the guru planner routines, except that | |
| 1512 they are named with @samp{guru64} instead of @samp{guru} and they take | |
| 1513 arguments of type @code{fftw_iodim64} instead of @code{fftw_iodim}. | |
| 1514 For example, instead of @code{fftw_plan_guru_dft}, we have | |
| 1515 @code{fftw_plan_guru64_dft}. | |
| 1516 | |
| 1517 @example | |
| 1518 fftw_plan fftw_plan_guru64_dft( | |
| 1519 int rank, const fftw_iodim64 *dims, | |
| 1520 int howmany_rank, const fftw_iodim64 *howmany_dims, | |
| 1521 fftw_complex *in, fftw_complex *out, | |
| 1522 int sign, unsigned flags); | |
| 1523 @end example | |
| 1524 @findex fftw_plan_guru64_dft | |
| 1525 | |
| 1526 The @code{fftw_iodim64} type is similar to @code{fftw_iodim}, with the | |
| 1527 same interpretation, except that it uses type @code{ptrdiff_t} instead | |
| 1528 of type @code{int}. | |
| 1529 | |
| 1530 @example | |
| 1531 typedef struct @{ | |
| 1532 ptrdiff_t n; | |
| 1533 ptrdiff_t is; | |
| 1534 ptrdiff_t os; | |
| 1535 @} fftw_iodim64; | |
| 1536 @end example | |
| 1537 @tindex fftw_iodim64 | |
| 1538 | |
| 1539 Every other @samp{fftw_plan_guru} function also has a | |
| 1540 @samp{fftw_plan_guru64} equivalent, but we do not repeat their | |
| 1541 documentation here since they are identical to the 32-bit versions | |
| 1542 except as noted above. | |
| 1543 | |
| 1544 @c ----------------------------------------------------------- | |
| 1545 @node New-array Execute Functions, Wisdom, Guru Interface, FFTW Reference | |
| 1546 @section New-array Execute Functions | |
| 1547 @cindex execute | |
| 1548 @cindex new-array execution | |
| 1549 | |
| 1550 Normally, one executes a plan for the arrays with which the plan was | |
| 1551 created, by calling @code{fftw_execute(plan)} as described in @ref{Using | |
| 1552 Plans}. | |
| 1553 @findex fftw_execute | |
| 1554 However, it is possible for sophisticated users to apply a given plan | |
| 1555 to a @emph{different} array using the ``new-array execute'' functions | |
| 1556 detailed below, provided that the following conditions are met: | |
| 1557 | |
| 1558 @itemize @bullet | |
| 1559 | |
| 1560 @item | |
| 1561 The array size, strides, etcetera are the same (since those are set by | |
| 1562 the plan). | |
| 1563 | |
| 1564 @item | |
| 1565 The input and output arrays are the same (in-place) or different | |
| 1566 (out-of-place) if the plan was originally created to be in-place or | |
| 1567 out-of-place, respectively. | |
| 1568 | |
| 1569 @item | |
| 1570 For split arrays, the separations between the real and imaginary | |
| 1571 parts, @code{ii-ri} and @code{io-ro}, are the same as they were for | |
| 1572 the input and output arrays when the plan was created. (This | |
| 1573 condition is automatically satisfied for interleaved arrays.) | |
| 1574 | |
| 1575 @item | |
| 1576 The @dfn{alignment} of the new input/output arrays is the same as that | |
| 1577 of the input/output arrays when the plan was created, unless the plan | |
| 1578 was created with the @code{FFTW_UNALIGNED} flag. | |
| 1579 @ctindex FFTW_UNALIGNED | |
| 1580 Here, the alignment is a platform-dependent quantity (for example, it is | |
| 1581 the address modulo 16 if SSE SIMD instructions are used, but the address | |
| 1582 modulo 4 for non-SIMD single-precision FFTW on the same machine). In | |
| 1583 general, only arrays allocated with @code{fftw_malloc} are guaranteed to | |
| 1584 be equally aligned (@pxref{SIMD alignment and fftw_malloc}). | |
| 1585 | |
| 1586 @end itemize | |
| 1587 | |
| 1588 @cindex alignment | |
| 1589 The alignment issue is especially critical, because if you don't use | |
| 1590 @code{fftw_malloc} then you may have little control over the alignment | |
| 1591 of arrays in memory. For example, neither the C++ @code{new} function | |
| 1592 nor the Fortran @code{allocate} statement provide strong enough | |
| 1593 guarantees about data alignment. If you don't use @code{fftw_malloc}, | |
| 1594 therefore, you probably have to use @code{FFTW_UNALIGNED} (which | |
| 1595 disables most SIMD support). If possible, it is probably better for | |
| 1596 you to simply create multiple plans (creating a new plan is quick once | |
| 1597 one exists for a given size), or better yet re-use the same array for | |
| 1598 your transforms. | |
| 1599 | |
| 1600 If you are tempted to use the new-array execute interface because you | |
| 1601 want to transform a known bunch of arrays of the same size, you should | |
| 1602 probably go use the advanced interface instead (@pxref{Advanced | |
| 1603 Interface})). | |
| 1604 | |
| 1605 The new-array execute functions are: | |
| 1606 | |
| 1607 @example | |
| 1608 void fftw_execute_dft( | |
| 1609 const fftw_plan p, | |
| 1610 fftw_complex *in, fftw_complex *out); | |
| 1611 | |
| 1612 void fftw_execute_split_dft( | |
| 1613 const fftw_plan p, | |
| 1614 double *ri, double *ii, double *ro, double *io); | |
| 1615 | |
| 1616 void fftw_execute_dft_r2c( | |
| 1617 const fftw_plan p, | |
| 1618 double *in, fftw_complex *out); | |
| 1619 | |
| 1620 void fftw_execute_split_dft_r2c( | |
| 1621 const fftw_plan p, | |
| 1622 double *in, double *ro, double *io); | |
| 1623 | |
| 1624 void fftw_execute_dft_c2r( | |
| 1625 const fftw_plan p, | |
| 1626 fftw_complex *in, double *out); | |
| 1627 | |
| 1628 void fftw_execute_split_dft_c2r( | |
| 1629 const fftw_plan p, | |
| 1630 double *ri, double *ii, double *out); | |
| 1631 | |
| 1632 void fftw_execute_r2r( | |
| 1633 const fftw_plan p, | |
| 1634 double *in, double *out); | |
| 1635 @end example | |
| 1636 @findex fftw_execute_dft | |
| 1637 @findex fftw_execute_split_dft | |
| 1638 @findex fftw_execute_dft_r2c | |
| 1639 @findex fftw_execute_split_dft_r2c | |
| 1640 @findex fftw_execute_dft_c2r | |
| 1641 @findex fftw_execute_split_dft_c2r | |
| 1642 @findex fftw_execute_r2r | |
| 1643 | |
| 1644 These execute the @code{plan} to compute the corresponding transform on | |
| 1645 the input/output arrays specified by the subsequent arguments. The | |
| 1646 input/output array arguments have the same meanings as the ones passed | |
| 1647 to the guru planner routines in the preceding sections. The @code{plan} | |
| 1648 is not modified, and these routines can be called as many times as | |
| 1649 desired, or intermixed with calls to the ordinary @code{fftw_execute}. | |
| 1650 | |
| 1651 The @code{plan} @emph{must} have been created for the transform type | |
| 1652 corresponding to the execute function, e.g. it must be a complex-DFT | |
| 1653 plan for @code{fftw_execute_dft}. Any of the planner routines for that | |
| 1654 transform type, from the basic to the guru interface, could have been | |
| 1655 used to create the plan, however. | |
| 1656 | |
| 1657 @c ------------------------------------------------------------ | |
| 1658 @node Wisdom, What FFTW Really Computes, New-array Execute Functions, FFTW Reference | |
| 1659 @section Wisdom | |
| 1660 @cindex wisdom | |
| 1661 @cindex saving plans to disk | |
| 1662 | |
| 1663 This section documents the FFTW mechanism for saving and restoring | |
| 1664 plans from disk. This mechanism is called @dfn{wisdom}. | |
| 1665 | |
| 1666 @menu | |
| 1667 * Wisdom Export:: | |
| 1668 * Wisdom Import:: | |
| 1669 * Forgetting Wisdom:: | |
| 1670 * Wisdom Utilities:: | |
| 1671 @end menu | |
| 1672 | |
| 1673 @c =========> | |
| 1674 @node Wisdom Export, Wisdom Import, Wisdom, Wisdom | |
| 1675 @subsection Wisdom Export | |
| 1676 | |
| 1677 @example | |
| 1678 int fftw_export_wisdom_to_filename(const char *filename); | |
| 1679 void fftw_export_wisdom_to_file(FILE *output_file); | |
| 1680 char *fftw_export_wisdom_to_string(void); | |
| 1681 void fftw_export_wisdom(void (*write_char)(char c, void *), void *data); | |
| 1682 @end example | |
| 1683 @findex fftw_export_wisdom | |
| 1684 @findex fftw_export_wisdom_to_filename | |
| 1685 @findex fftw_export_wisdom_to_file | |
| 1686 @findex fftw_export_wisdom_to_string | |
| 1687 | |
| 1688 These functions allow you to export all currently accumulated wisdom | |
| 1689 in a form from which it can be later imported and restored, even | |
| 1690 during a separate run of the program. (@xref{Words of Wisdom-Saving | |
| 1691 Plans}.) The current store of wisdom is not affected by calling any | |
| 1692 of these routines. | |
| 1693 | |
| 1694 @code{fftw_export_wisdom} exports the wisdom to any output | |
| 1695 medium, as specified by the callback function | |
| 1696 @code{write_char}. @code{write_char} is a @code{putc}-like function that | |
| 1697 writes the character @code{c} to some output; its second parameter is | |
| 1698 the @code{data} pointer passed to @code{fftw_export_wisdom}. For | |
| 1699 convenience, the following three ``wrapper'' routines are provided: | |
| 1700 | |
| 1701 @code{fftw_export_wisdom_to_filename} writes wisdom to a file named | |
| 1702 @code{filename} (which is created or overwritten), returning @code{1} | |
| 1703 on success and @code{0} on failure. A lower-level function, which | |
| 1704 requires you to open and close the file yourself (e.g. if you want to | |
| 1705 write wisdom to a portion of a larger file) is | |
| 1706 @code{fftw_export_wisdom_to_file}. This writes the wisdom to the | |
| 1707 current position in @code{output_file}, which should be open with | |
| 1708 write permission; upon exit, the file remains open and is positioned | |
| 1709 at the end of the wisdom data. | |
| 1710 | |
| 1711 @code{fftw_export_wisdom_to_string} returns a pointer to a | |
| 1712 @code{NULL}-terminated string holding the wisdom data. This string is | |
| 1713 dynamically allocated, and it is the responsibility of the caller to | |
| 1714 deallocate it with @code{free} when it is no longer needed. | |
| 1715 | |
| 1716 All of these routines export the wisdom in the same format, which we | |
| 1717 will not document here except to say that it is LISP-like ASCII text | |
| 1718 that is insensitive to white space. | |
| 1719 | |
| 1720 @c =========> | |
| 1721 @node Wisdom Import, Forgetting Wisdom, Wisdom Export, Wisdom | |
| 1722 @subsection Wisdom Import | |
| 1723 | |
| 1724 @example | |
| 1725 int fftw_import_system_wisdom(void); | |
| 1726 int fftw_import_wisdom_from_filename(const char *filename); | |
| 1727 int fftw_import_wisdom_from_string(const char *input_string); | |
| 1728 int fftw_import_wisdom(int (*read_char)(void *), void *data); | |
| 1729 @end example | |
| 1730 @findex fftw_import_wisdom | |
| 1731 @findex fftw_import_system_wisdom | |
| 1732 @findex fftw_import_wisdom_from_filename | |
| 1733 @findex fftw_import_wisdom_from_file | |
| 1734 @findex fftw_import_wisdom_from_string | |
| 1735 | |
| 1736 These functions import wisdom into a program from data stored by the | |
| 1737 @code{fftw_export_wisdom} functions above. (@xref{Words of | |
| 1738 Wisdom-Saving Plans}.) The imported wisdom replaces any wisdom | |
| 1739 already accumulated by the running program. | |
| 1740 | |
| 1741 @code{fftw_import_wisdom} imports wisdom from any input medium, as | |
| 1742 specified by the callback function @code{read_char}. @code{read_char} is | |
| 1743 a @code{getc}-like function that returns the next character in the | |
| 1744 input; its parameter is the @code{data} pointer passed to | |
| 1745 @code{fftw_import_wisdom}. If the end of the input data is reached | |
| 1746 (which should never happen for valid data), @code{read_char} should | |
| 1747 return @code{EOF} (as defined in @code{<stdio.h>}). For convenience, | |
| 1748 the following three ``wrapper'' routines are provided: | |
| 1749 | |
| 1750 @code{fftw_import_wisdom_from_filename} reads wisdom from a file named | |
| 1751 @code{filename}. A lower-level function, which requires you to open | |
| 1752 and close the file yourself (e.g. if you want to read wisdom from a | |
| 1753 portion of a larger file) is @code{fftw_import_wisdom_from_file}. This | |
| 1754 reads wisdom from the current position in @code{input_file} (which | |
| 1755 should be open with read permission); upon exit, the file remains | |
| 1756 open, but the position of the read pointer is unspecified. | |
| 1757 | |
| 1758 @code{fftw_import_wisdom_from_string} reads wisdom from the | |
| 1759 @code{NULL}-terminated string @code{input_string}. | |
| 1760 | |
| 1761 @code{fftw_import_system_wisdom} reads wisdom from an | |
| 1762 implementation-defined standard file (@code{/etc/fftw/wisdom} on Unix | |
| 1763 and GNU systems). | |
| 1764 @cindex wisdom, system-wide | |
| 1765 | |
| 1766 | |
| 1767 The return value of these import routines is @code{1} if the wisdom was | |
| 1768 read successfully and @code{0} otherwise. Note that, in all of these | |
| 1769 functions, any data in the input stream past the end of the wisdom data | |
| 1770 is simply ignored. | |
| 1771 | |
| 1772 @c =========> | |
| 1773 @node Forgetting Wisdom, Wisdom Utilities, Wisdom Import, Wisdom | |
| 1774 @subsection Forgetting Wisdom | |
| 1775 | |
| 1776 @example | |
| 1777 void fftw_forget_wisdom(void); | |
| 1778 @end example | |
| 1779 @findex fftw_forget_wisdom | |
| 1780 | |
| 1781 Calling @code{fftw_forget_wisdom} causes all accumulated @code{wisdom} | |
| 1782 to be discarded and its associated memory to be freed. (New | |
| 1783 @code{wisdom} can still be gathered subsequently, however.) | |
| 1784 | |
| 1785 @c =========> | |
| 1786 @node Wisdom Utilities, , Forgetting Wisdom, Wisdom | |
| 1787 @subsection Wisdom Utilities | |
| 1788 | |
| 1789 FFTW includes two standalone utility programs that deal with wisdom. We | |
| 1790 merely summarize them here, since they come with their own @code{man} | |
| 1791 pages for Unix and GNU systems (with HTML versions on our web site). | |
| 1792 | |
| 1793 The first program is @code{fftw-wisdom} (or @code{fftwf-wisdom} in | |
| 1794 single precision, etcetera), which can be used to create a wisdom file | |
| 1795 containing plans for any of the transform sizes and types supported by | |
| 1796 FFTW. It is preferable to create wisdom directly from your executable | |
| 1797 (@pxref{Caveats in Using Wisdom}), but this program is useful for | |
| 1798 creating global wisdom files for @code{fftw_import_system_wisdom}. | |
| 1799 @cindex fftw-wisdom utility | |
| 1800 | |
| 1801 | |
| 1802 The second program is @code{fftw-wisdom-to-conf}, which takes a wisdom | |
| 1803 file as input and produces a @dfn{configuration routine} as output. The | |
| 1804 latter is a C subroutine that you can compile and link into your | |
| 1805 program, replacing a routine of the same name in the FFTW library, that | |
| 1806 determines which parts of FFTW are callable by your program. | |
| 1807 @code{fftw-wisdom-to-conf} produces a configuration routine that links | |
| 1808 to only those parts of FFTW needed by the saved plans in the wisdom, | |
| 1809 greatly reducing the size of statically linked executables (which should | |
| 1810 only attempt to create plans corresponding to those in the wisdom, | |
| 1811 however). | |
| 1812 @cindex fftw-wisdom-to-conf utility | |
| 1813 @cindex configuration routines | |
| 1814 | |
| 1815 @c ------------------------------------------------------------ | |
| 1816 @node What FFTW Really Computes, , Wisdom, FFTW Reference | |
| 1817 @section What FFTW Really Computes | |
| 1818 | |
| 1819 In this section, we provide precise mathematical definitions for the | |
| 1820 transforms that FFTW computes. These transform definitions are fairly | |
| 1821 standard, but some authors follow slightly different conventions for the | |
| 1822 normalization of the transform (the constant factor in front) and the | |
| 1823 sign of the complex exponent. We begin by presenting the | |
| 1824 one-dimensional (1d) transform definitions, and then give the | |
| 1825 straightforward extension to multi-dimensional transforms. | |
| 1826 | |
| 1827 @menu | |
| 1828 * The 1d Discrete Fourier Transform (DFT):: | |
| 1829 * The 1d Real-data DFT:: | |
| 1830 * 1d Real-even DFTs (DCTs):: | |
| 1831 * 1d Real-odd DFTs (DSTs):: | |
| 1832 * 1d Discrete Hartley Transforms (DHTs):: | |
| 1833 * Multi-dimensional Transforms:: | |
| 1834 @end menu | |
| 1835 | |
| 1836 @c =========> | |
| 1837 @node The 1d Discrete Fourier Transform (DFT), The 1d Real-data DFT, What FFTW Really Computes, What FFTW Really Computes | |
| 1838 @subsection The 1d Discrete Fourier Transform (DFT) | |
| 1839 | |
| 1840 @cindex discrete Fourier transform | |
| 1841 @cindex DFT | |
| 1842 The forward (@code{FFTW_FORWARD}) discrete Fourier transform (DFT) of a | |
| 1843 1d complex array @math{X} of size @math{n} computes an array @math{Y}, | |
| 1844 where: | |
| 1845 @tex | |
| 1846 $$ | |
| 1847 Y_k = \sum_{j = 0}^{n - 1} X_j e^{-2\pi j k \sqrt{-1}/n} \ . | |
| 1848 $$ | |
| 1849 @end tex | |
| 1850 @ifinfo | |
| 1851 @center Y[k] = sum for j = 0 to (n - 1) of X[j] * exp(-2 pi j k sqrt(-1)/n) . | |
| 1852 @end ifinfo | |
| 1853 @html | |
| 1854 <center><img src="equation-dft.png" align="top">.</center> | |
| 1855 @end html | |
| 1856 The backward (@code{FFTW_BACKWARD}) DFT computes: | |
| 1857 @tex | |
| 1858 $$ | |
| 1859 Y_k = \sum_{j = 0}^{n - 1} X_j e^{2\pi j k \sqrt{-1}/n} \ . | |
| 1860 $$ | |
| 1861 @end tex | |
| 1862 @ifinfo | |
| 1863 @center Y[k] = sum for j = 0 to (n - 1) of X[j] * exp(2 pi j k sqrt(-1)/n) . | |
| 1864 @end ifinfo | |
| 1865 @html | |
| 1866 <center><img src="equation-idft.png" align="top">.</center> | |
| 1867 @end html | |
| 1868 | |
| 1869 @cindex normalization | |
| 1870 FFTW computes an unnormalized transform, in that there is no coefficient | |
| 1871 in front of the summation in the DFT. In other words, applying the | |
| 1872 forward and then the backward transform will multiply the input by | |
| 1873 @math{n}. | |
| 1874 | |
| 1875 @cindex frequency | |
| 1876 From above, an @code{FFTW_FORWARD} transform corresponds to a sign of | |
| 1877 @math{-1} in the exponent of the DFT. Note also that we use the | |
| 1878 standard ``in-order'' output ordering---the @math{k}-th output | |
| 1879 corresponds to the frequency @math{k/n} (or @math{k/T}, where @math{T} | |
| 1880 is your total sampling period). For those who like to think in terms of | |
| 1881 positive and negative frequencies, this means that the positive | |
| 1882 frequencies are stored in the first half of the output and the negative | |
| 1883 frequencies are stored in backwards order in the second half of the | |
| 1884 output. (The frequency @math{-k/n} is the same as the frequency | |
| 1885 @math{(n-k)/n}.) | |
| 1886 | |
| 1887 @c =========> | |
| 1888 @node The 1d Real-data DFT, 1d Real-even DFTs (DCTs), The 1d Discrete Fourier Transform (DFT), What FFTW Really Computes | |
| 1889 @subsection The 1d Real-data DFT | |
| 1890 | |
| 1891 The real-input (r2c) DFT in FFTW computes the @emph{forward} transform | |
| 1892 @math{Y} of the size @code{n} real array @math{X}, exactly as defined | |
| 1893 above, i.e. | |
| 1894 @tex | |
| 1895 $$ | |
| 1896 Y_k = \sum_{j = 0}^{n - 1} X_j e^{-2\pi j k \sqrt{-1}/n} \ . | |
| 1897 $$ | |
| 1898 @end tex | |
| 1899 @ifinfo | |
| 1900 @center Y[k] = sum for j = 0 to (n - 1) of X[j] * exp(-2 pi j k sqrt(-1)/n) . | |
| 1901 @end ifinfo | |
| 1902 @html | |
| 1903 <center><img src="equation-dft.png" align="top">.</center> | |
| 1904 @end html | |
| 1905 This output array @math{Y} can easily be shown to possess the | |
| 1906 ``Hermitian'' symmetry | |
| 1907 @cindex Hermitian | |
| 1908 @tex | |
| 1909 $Y_k = Y_{n-k}^*$, | |
| 1910 @end tex | |
| 1911 @ifinfo | |
| 1912 Y[k] = Y[n-k]*, | |
| 1913 @end ifinfo | |
| 1914 @html | |
| 1915 <i>Y<sub>k</sub> = Y<sub>n-k</sub></i><sup>*</sup>, | |
| 1916 @end html | |
| 1917 where we take @math{Y} to be periodic so that | |
| 1918 @tex | |
| 1919 $Y_n = Y_0$. | |
| 1920 @end tex | |
| 1921 @ifinfo | |
| 1922 Y[n] = Y[0]. | |
| 1923 @end ifinfo | |
| 1924 @html | |
| 1925 <i>Y<sub>n</sub> = Y</i><sub>0</sub>. | |
| 1926 @end html | |
| 1927 | |
| 1928 As a result of this symmetry, half of the output @math{Y} is redundant | |
| 1929 (being the complex conjugate of the other half), and so the 1d r2c | |
| 1930 transforms only output elements @math{0}@dots{}@math{n/2} of @math{Y} | |
| 1931 (@math{n/2+1} complex numbers), where the division by @math{2} is | |
| 1932 rounded down. | |
| 1933 | |
| 1934 Moreover, the Hermitian symmetry implies that | |
| 1935 @tex | |
| 1936 $Y_0$ | |
| 1937 @end tex | |
| 1938 @ifinfo | |
| 1939 Y[0] | |
| 1940 @end ifinfo | |
| 1941 @html | |
| 1942 <i>Y</i><sub>0</sub> | |
| 1943 @end html | |
| 1944 and, if @math{n} is even, the | |
| 1945 @tex | |
| 1946 $Y_{n/2}$ | |
| 1947 @end tex | |
| 1948 @ifinfo | |
| 1949 Y[n/2] | |
| 1950 @end ifinfo | |
| 1951 @html | |
| 1952 <i>Y</i><sub><i>n</i>/2</sub> | |
| 1953 @end html | |
| 1954 element, are purely real. So, for the @code{R2HC} r2r transform, these | |
| 1955 elements are not stored in the halfcomplex output format. | |
| 1956 @cindex r2r | |
| 1957 @ctindex R2HC | |
| 1958 @cindex halfcomplex format | |
| 1959 | |
| 1960 | |
| 1961 The c2r and @code{H2RC} r2r transforms compute the backward DFT of the | |
| 1962 @emph{complex} array @math{X} with Hermitian symmetry, stored in the | |
| 1963 r2c/@code{R2HC} output formats, respectively, where the backward | |
| 1964 transform is defined exactly as for the complex case: | |
| 1965 @tex | |
| 1966 $$ | |
| 1967 Y_k = \sum_{j = 0}^{n - 1} X_j e^{2\pi j k \sqrt{-1}/n} \ . | |
| 1968 $$ | |
| 1969 @end tex | |
| 1970 @ifinfo | |
| 1971 @center Y[k] = sum for j = 0 to (n - 1) of X[j] * exp(2 pi j k sqrt(-1)/n) . | |
| 1972 @end ifinfo | |
| 1973 @html | |
| 1974 <center><img src="equation-idft.png" align="top">.</center> | |
| 1975 @end html | |
| 1976 The outputs @code{Y} of this transform can easily be seen to be purely | |
| 1977 real, and are stored as an array of real numbers. | |
| 1978 | |
| 1979 @cindex normalization | |
| 1980 Like FFTW's complex DFT, these transforms are unnormalized. In other | |
| 1981 words, applying the real-to-complex (forward) and then the | |
| 1982 complex-to-real (backward) transform will multiply the input by | |
| 1983 @math{n}. | |
| 1984 | |
| 1985 @c =========> | |
| 1986 @node 1d Real-even DFTs (DCTs), 1d Real-odd DFTs (DSTs), The 1d Real-data DFT, What FFTW Really Computes | |
| 1987 @subsection 1d Real-even DFTs (DCTs) | |
| 1988 | |
| 1989 The Real-even symmetry DFTs in FFTW are exactly equivalent to the unnormalized | |
| 1990 forward (and backward) DFTs as defined above, where the input array | |
| 1991 @math{X} of length @math{N} is purely real and is also @dfn{even} symmetry. In | |
| 1992 this case, the output array is likewise real and even symmetry. | |
| 1993 @cindex real-even DFT | |
| 1994 @cindex REDFT | |
| 1995 | |
| 1996 | |
| 1997 @ctindex REDFT00 | |
| 1998 For the case of @code{REDFT00}, this even symmetry means that | |
| 1999 @tex | |
| 2000 $X_j = X_{N-j}$, | |
| 2001 @end tex | |
| 2002 @ifinfo | |
| 2003 X[j] = X[N-j], | |
| 2004 @end ifinfo | |
| 2005 @html | |
| 2006 <i>X<sub>j</sub> = X<sub>N-j</sub></i>, | |
| 2007 @end html | |
| 2008 where we take @math{X} to be periodic so that | |
| 2009 @tex | |
| 2010 $X_N = X_0$. | |
| 2011 @end tex | |
| 2012 @ifinfo | |
| 2013 X[N] = X[0]. | |
| 2014 @end ifinfo | |
| 2015 @html | |
| 2016 <i>X<sub>N</sub> = X</i><sub>0</sub>. | |
| 2017 @end html | |
| 2018 Because of this redundancy, only the first @math{n} real numbers are | |
| 2019 actually stored, where @math{N = 2(n-1)}. | |
| 2020 | |
| 2021 The proper definition of even symmetry for @code{REDFT10}, | |
| 2022 @code{REDFT01}, and @code{REDFT11} transforms is somewhat more intricate | |
| 2023 because of the shifts by @math{1/2} of the input and/or output, although | |
| 2024 the corresponding boundary conditions are given in @ref{Real even/odd | |
| 2025 DFTs (cosine/sine transforms)}. Because of the even symmetry, however, | |
| 2026 the sine terms in the DFT all cancel and the remaining cosine terms are | |
| 2027 written explicitly below. This formulation often leads people to call | |
| 2028 such a transform a @dfn{discrete cosine transform} (DCT), although it is | |
| 2029 really just a special case of the DFT. | |
| 2030 @cindex discrete cosine transform | |
| 2031 @cindex DCT | |
| 2032 | |
| 2033 | |
| 2034 In each of the definitions below, we transform a real array @math{X} of | |
| 2035 length @math{n} to a real array @math{Y} of length @math{n}: | |
| 2036 | |
| 2037 @subsubheading REDFT00 (DCT-I) | |
| 2038 @ctindex REDFT00 | |
| 2039 An @code{REDFT00} transform (type-I DCT) in FFTW is defined by: | |
| 2040 @tex | |
| 2041 $$ | |
| 2042 Y_k = X_0 + (-1)^k X_{n-1} | |
| 2043 + 2 \sum_{j=1}^{n-2} X_j \cos [ \pi j k / (n-1)]. | |
| 2044 $$ | |
| 2045 @end tex | |
| 2046 @ifinfo | |
| 2047 Y[k] = X[0] + (-1)^k X[n-1] + 2 (sum for j = 1 to n-2 of X[j] cos(pi jk /(n-1))). | |
| 2048 @end ifinfo | |
| 2049 @html | |
| 2050 <center><img src="equation-redft00.png" align="top">.</center> | |
| 2051 @end html | |
| 2052 Note that this transform is not defined for @math{n=1}. For @math{n=2}, | |
| 2053 the summation term above is dropped as you might expect. | |
| 2054 | |
| 2055 @subsubheading REDFT10 (DCT-II) | |
| 2056 @ctindex REDFT10 | |
| 2057 An @code{REDFT10} transform (type-II DCT, sometimes called ``the'' DCT) in FFTW is defined by: | |
| 2058 @tex | |
| 2059 $$ | |
| 2060 Y_k = 2 \sum_{j=0}^{n-1} X_j \cos [\pi (j+1/2) k / n]. | |
| 2061 $$ | |
| 2062 @end tex | |
| 2063 @ifinfo | |
| 2064 Y[k] = 2 (sum for j = 0 to n-1 of X[j] cos(pi (j+1/2) k / n)). | |
| 2065 @end ifinfo | |
| 2066 @html | |
| 2067 <center><img src="equation-redft10.png" align="top">.</center> | |
| 2068 @end html | |
| 2069 | |
| 2070 @subsubheading REDFT01 (DCT-III) | |
| 2071 @ctindex REDFT01 | |
| 2072 An @code{REDFT01} transform (type-III DCT) in FFTW is defined by: | |
| 2073 @tex | |
| 2074 $$ | |
| 2075 Y_k = X_0 + 2 \sum_{j=1}^{n-1} X_j \cos [\pi j (k+1/2) / n]. | |
| 2076 $$ | |
| 2077 @end tex | |
| 2078 @ifinfo | |
| 2079 Y[k] = X[0] + 2 (sum for j = 1 to n-1 of X[j] cos(pi j (k+1/2) / n)). | |
| 2080 @end ifinfo | |
| 2081 @html | |
| 2082 <center><img src="equation-redft01.png" align="top">.</center> | |
| 2083 @end html | |
| 2084 In the case of @math{n=1}, this reduces to | |
| 2085 @tex | |
| 2086 $Y_0 = X_0$. | |
| 2087 @end tex | |
| 2088 @ifinfo | |
| 2089 Y[0] = X[0]. | |
| 2090 @end ifinfo | |
| 2091 @html | |
| 2092 <i>Y</i><sub>0</sub> = <i>X</i><sub>0</sub>. | |
| 2093 @end html | |
| 2094 Up to a scale factor (see below), this is the inverse of @code{REDFT10} (``the'' DCT), and so the @code{REDFT01} (DCT-III) is sometimes called the ``IDCT''. | |
| 2095 @cindex IDCT | |
| 2096 | |
| 2097 @subsubheading REDFT11 (DCT-IV) | |
| 2098 @ctindex REDFT11 | |
| 2099 An @code{REDFT11} transform (type-IV DCT) in FFTW is defined by: | |
| 2100 @tex | |
| 2101 $$ | |
| 2102 Y_k = 2 \sum_{j=0}^{n-1} X_j \cos [\pi (j+1/2) (k+1/2) / n]. | |
| 2103 $$ | |
| 2104 @end tex | |
| 2105 @ifinfo | |
| 2106 Y[k] = 2 (sum for j = 0 to n-1 of X[j] cos(pi (j+1/2) (k+1/2) / n)). | |
| 2107 @end ifinfo | |
| 2108 @html | |
| 2109 <center><img src="equation-redft11.png" align="top">.</center> | |
| 2110 @end html | |
| 2111 | |
| 2112 @subsubheading Inverses and Normalization | |
| 2113 | |
| 2114 These definitions correspond directly to the unnormalized DFTs used | |
| 2115 elsewhere in FFTW (hence the factors of @math{2} in front of the | |
| 2116 summations). The unnormalized inverse of @code{REDFT00} is | |
| 2117 @code{REDFT00}, of @code{REDFT10} is @code{REDFT01} and vice versa, and | |
| 2118 of @code{REDFT11} is @code{REDFT11}. Each unnormalized inverse results | |
| 2119 in the original array multiplied by @math{N}, where @math{N} is the | |
| 2120 @emph{logical} DFT size. For @code{REDFT00}, @math{N=2(n-1)} (note that | |
| 2121 @math{n=1} is not defined); otherwise, @math{N=2n}. | |
| 2122 @cindex normalization | |
| 2123 | |
| 2124 | |
| 2125 In defining the discrete cosine transform, some authors also include | |
| 2126 additional factors of | |
| 2127 @ifinfo | |
| 2128 sqrt(2) | |
| 2129 @end ifinfo | |
| 2130 @html | |
| 2131 √2 | |
| 2132 @end html | |
| 2133 @tex | |
| 2134 $\sqrt{2}$ | |
| 2135 @end tex | |
| 2136 (or its inverse) multiplying selected inputs and/or outputs. This is a | |
| 2137 mostly cosmetic change that makes the transform orthogonal, but | |
| 2138 sacrifices the direct equivalence to a symmetric DFT. | |
| 2139 | |
| 2140 @c =========> | |
| 2141 @node 1d Real-odd DFTs (DSTs), 1d Discrete Hartley Transforms (DHTs), 1d Real-even DFTs (DCTs), What FFTW Really Computes | |
| 2142 @subsection 1d Real-odd DFTs (DSTs) | |
| 2143 | |
| 2144 The Real-odd symmetry DFTs in FFTW are exactly equivalent to the unnormalized | |
| 2145 forward (and backward) DFTs as defined above, where the input array | |
| 2146 @math{X} of length @math{N} is purely real and is also @dfn{odd} symmetry. In | |
| 2147 this case, the output is odd symmetry and purely imaginary. | |
| 2148 @cindex real-odd DFT | |
| 2149 @cindex RODFT | |
| 2150 | |
| 2151 | |
| 2152 @ctindex RODFT00 | |
| 2153 For the case of @code{RODFT00}, this odd symmetry means that | |
| 2154 @tex | |
| 2155 $X_j = -X_{N-j}$, | |
| 2156 @end tex | |
| 2157 @ifinfo | |
| 2158 X[j] = -X[N-j], | |
| 2159 @end ifinfo | |
| 2160 @html | |
| 2161 <i>X<sub>j</sub> = -X<sub>N-j</sub></i>, | |
| 2162 @end html | |
| 2163 where we take @math{X} to be periodic so that | |
| 2164 @tex | |
| 2165 $X_N = X_0$. | |
| 2166 @end tex | |
| 2167 @ifinfo | |
| 2168 X[N] = X[0]. | |
| 2169 @end ifinfo | |
| 2170 @html | |
| 2171 <i>X<sub>N</sub> = X</i><sub>0</sub>. | |
| 2172 @end html | |
| 2173 Because of this redundancy, only the first @math{n} real numbers | |
| 2174 starting at @math{j=1} are actually stored (the @math{j=0} element is | |
| 2175 zero), where @math{N = 2(n+1)}. | |
| 2176 | |
| 2177 The proper definition of odd symmetry for @code{RODFT10}, | |
| 2178 @code{RODFT01}, and @code{RODFT11} transforms is somewhat more intricate | |
| 2179 because of the shifts by @math{1/2} of the input and/or output, although | |
| 2180 the corresponding boundary conditions are given in @ref{Real even/odd | |
| 2181 DFTs (cosine/sine transforms)}. Because of the odd symmetry, however, | |
| 2182 the cosine terms in the DFT all cancel and the remaining sine terms are | |
| 2183 written explicitly below. This formulation often leads people to call | |
| 2184 such a transform a @dfn{discrete sine transform} (DST), although it is | |
| 2185 really just a special case of the DFT. | |
| 2186 @cindex discrete sine transform | |
| 2187 @cindex DST | |
| 2188 | |
| 2189 | |
| 2190 In each of the definitions below, we transform a real array @math{X} of | |
| 2191 length @math{n} to a real array @math{Y} of length @math{n}: | |
| 2192 | |
| 2193 @subsubheading RODFT00 (DST-I) | |
| 2194 @ctindex RODFT00 | |
| 2195 An @code{RODFT00} transform (type-I DST) in FFTW is defined by: | |
| 2196 @tex | |
| 2197 $$ | |
| 2198 Y_k = 2 \sum_{j=0}^{n-1} X_j \sin [ \pi (j+1) (k+1) / (n+1)]. | |
| 2199 $$ | |
| 2200 @end tex | |
| 2201 @ifinfo | |
| 2202 Y[k] = 2 (sum for j = 0 to n-1 of X[j] sin(pi (j+1)(k+1) / (n+1))). | |
| 2203 @end ifinfo | |
| 2204 @html | |
| 2205 <center><img src="equation-rodft00.png" align="top">.</center> | |
| 2206 @end html | |
| 2207 | |
| 2208 @subsubheading RODFT10 (DST-II) | |
| 2209 @ctindex RODFT10 | |
| 2210 An @code{RODFT10} transform (type-II DST) in FFTW is defined by: | |
| 2211 @tex | |
| 2212 $$ | |
| 2213 Y_k = 2 \sum_{j=0}^{n-1} X_j \sin [\pi (j+1/2) (k+1) / n]. | |
| 2214 $$ | |
| 2215 @end tex | |
| 2216 @ifinfo | |
| 2217 Y[k] = 2 (sum for j = 0 to n-1 of X[j] sin(pi (j+1/2) (k+1) / n)). | |
| 2218 @end ifinfo | |
| 2219 @html | |
| 2220 <center><img src="equation-rodft10.png" align="top">.</center> | |
| 2221 @end html | |
| 2222 | |
| 2223 @subsubheading RODFT01 (DST-III) | |
| 2224 @ctindex RODFT01 | |
| 2225 An @code{RODFT01} transform (type-III DST) in FFTW is defined by: | |
| 2226 @tex | |
| 2227 $$ | |
| 2228 Y_k = (-1)^k X_{n-1} + 2 \sum_{j=0}^{n-2} X_j \sin [\pi (j+1) (k+1/2) / n]. | |
| 2229 $$ | |
| 2230 @end tex | |
| 2231 @ifinfo | |
| 2232 Y[k] = (-1)^k X[n-1] + 2 (sum for j = 0 to n-2 of X[j] sin(pi (j+1) (k+1/2) / n)). | |
| 2233 @end ifinfo | |
| 2234 @html | |
| 2235 <center><img src="equation-rodft01.png" align="top">.</center> | |
| 2236 @end html | |
| 2237 In the case of @math{n=1}, this reduces to | |
| 2238 @tex | |
| 2239 $Y_0 = X_0$. | |
| 2240 @end tex | |
| 2241 @ifinfo | |
| 2242 Y[0] = X[0]. | |
| 2243 @end ifinfo | |
| 2244 @html | |
| 2245 <i>Y</i><sub>0</sub> = <i>X</i><sub>0</sub>. | |
| 2246 @end html | |
| 2247 | |
| 2248 @subsubheading RODFT11 (DST-IV) | |
| 2249 @ctindex RODFT11 | |
| 2250 An @code{RODFT11} transform (type-IV DST) in FFTW is defined by: | |
| 2251 @tex | |
| 2252 $$ | |
| 2253 Y_k = 2 \sum_{j=0}^{n-1} X_j \sin [\pi (j+1/2) (k+1/2) / n]. | |
| 2254 $$ | |
| 2255 @end tex | |
| 2256 @ifinfo | |
| 2257 Y[k] = 2 (sum for j = 0 to n-1 of X[j] sin(pi (j+1/2) (k+1/2) / n)). | |
| 2258 @end ifinfo | |
| 2259 @html | |
| 2260 <center><img src="equation-rodft11.png" align="top">.</center> | |
| 2261 @end html | |
| 2262 | |
| 2263 @subsubheading Inverses and Normalization | |
| 2264 | |
| 2265 These definitions correspond directly to the unnormalized DFTs used | |
| 2266 elsewhere in FFTW (hence the factors of @math{2} in front of the | |
| 2267 summations). The unnormalized inverse of @code{RODFT00} is | |
| 2268 @code{RODFT00}, of @code{RODFT10} is @code{RODFT01} and vice versa, and | |
| 2269 of @code{RODFT11} is @code{RODFT11}. Each unnormalized inverse results | |
| 2270 in the original array multiplied by @math{N}, where @math{N} is the | |
| 2271 @emph{logical} DFT size. For @code{RODFT00}, @math{N=2(n+1)}; | |
| 2272 otherwise, @math{N=2n}. | |
| 2273 @cindex normalization | |
| 2274 | |
| 2275 | |
| 2276 In defining the discrete sine transform, some authors also include | |
| 2277 additional factors of | |
| 2278 @ifinfo | |
| 2279 sqrt(2) | |
| 2280 @end ifinfo | |
| 2281 @html | |
| 2282 √2 | |
| 2283 @end html | |
| 2284 @tex | |
| 2285 $\sqrt{2}$ | |
| 2286 @end tex | |
| 2287 (or its inverse) multiplying selected inputs and/or outputs. This is a | |
| 2288 mostly cosmetic change that makes the transform orthogonal, but | |
| 2289 sacrifices the direct equivalence to an antisymmetric DFT. | |
| 2290 | |
| 2291 @c =========> | |
| 2292 @node 1d Discrete Hartley Transforms (DHTs), Multi-dimensional Transforms, 1d Real-odd DFTs (DSTs), What FFTW Really Computes | |
| 2293 @subsection 1d Discrete Hartley Transforms (DHTs) | |
| 2294 | |
| 2295 @cindex discrete Hartley transform | |
| 2296 @cindex DHT | |
| 2297 The discrete Hartley transform (DHT) of a 1d real array @math{X} of size | |
| 2298 @math{n} computes a real array @math{Y} of the same size, where: | |
| 2299 @tex | |
| 2300 $$ | |
| 2301 Y_k = \sum_{j = 0}^{n - 1} X_j [ \cos(2\pi j k / n) + \sin(2\pi j k / n)]. | |
| 2302 $$ | |
| 2303 @end tex | |
| 2304 @ifinfo | |
| 2305 @center Y[k] = sum for j = 0 to (n - 1) of X[j] * [cos(2 pi j k / n) + sin(2 pi j k / n)]. | |
| 2306 @end ifinfo | |
| 2307 @html | |
| 2308 <center><img src="equation-dht.png" align="top">.</center> | |
| 2309 @end html | |
| 2310 | |
| 2311 @cindex normalization | |
| 2312 FFTW computes an unnormalized transform, in that there is no coefficient | |
| 2313 in front of the summation in the DHT. In other words, applying the | |
| 2314 transform twice (the DHT is its own inverse) will multiply the input by | |
| 2315 @math{n}. | |
| 2316 | |
| 2317 @c =========> | |
| 2318 @node Multi-dimensional Transforms, , 1d Discrete Hartley Transforms (DHTs), What FFTW Really Computes | |
| 2319 @subsection Multi-dimensional Transforms | |
| 2320 | |
| 2321 The multi-dimensional transforms of FFTW, in general, compute simply the | |
| 2322 separable product of the given 1d transform along each dimension of the | |
| 2323 array. Since each of these transforms is unnormalized, computing the | |
| 2324 forward followed by the backward/inverse multi-dimensional transform | |
| 2325 will result in the original array scaled by the product of the | |
| 2326 normalization factors for each dimension (e.g. the product of the | |
| 2327 dimension sizes, for a multi-dimensional DFT). | |
| 2328 | |
| 2329 @tex | |
| 2330 As an explicit example, consider the following exact mathematical | |
| 2331 definition of our multi-dimensional DFT. Let $X$ be a $d$-dimensional | |
| 2332 complex array whose elements are $X[j_1, j_2, \ldots, j_d]$, where $0 | |
| 2333 \leq j_s < n_s$ for all~$s \in \{ 1, 2, \ldots, d \}$. Let also | |
| 2334 $\omega_s = e^{2\pi \sqrt{-1}/n_s}$, for all ~$s \in \{ 1, 2, \ldots, d | |
| 2335 \}$. | |
| 2336 | |
| 2337 The forward transform computes a complex array~$Y$, whose | |
| 2338 structure is the same as that of~$X$, defined by | |
| 2339 | |
| 2340 $$ | |
| 2341 Y[k_1, k_2, \ldots, k_d] = | |
| 2342 \sum_{j_1 = 0}^{n_1 - 1} | |
| 2343 \sum_{j_2 = 0}^{n_2 - 1} | |
| 2344 \cdots | |
| 2345 \sum_{j_d = 0}^{n_d - 1} | |
| 2346 X[j_1, j_2, \ldots, j_d] | |
| 2347 \omega_1^{-j_1 k_1} | |
| 2348 \omega_2^{-j_2 k_2} | |
| 2349 \cdots | |
| 2350 \omega_d^{-j_d k_d} \ . | |
| 2351 $$ | |
| 2352 | |
| 2353 The backward transform computes | |
| 2354 $$ | |
| 2355 Y[k_1, k_2, \ldots, k_d] = | |
| 2356 \sum_{j_1 = 0}^{n_1 - 1} | |
| 2357 \sum_{j_2 = 0}^{n_2 - 1} | |
| 2358 \cdots | |
| 2359 \sum_{j_d = 0}^{n_d - 1} | |
| 2360 X[j_1, j_2, \ldots, j_d] | |
| 2361 \omega_1^{j_1 k_1} | |
| 2362 \omega_2^{j_2 k_2} | |
| 2363 \cdots | |
| 2364 \omega_d^{j_d k_d} \ . | |
| 2365 $$ | |
| 2366 | |
| 2367 Computing the forward transform followed by the backward transform | |
| 2368 will multiply the array by $\prod_{s=1}^{d} n_d$. | |
| 2369 @end tex | |
| 2370 | |
| 2371 @cindex r2c | |
| 2372 The definition of FFTW's multi-dimensional DFT of real data (r2c) | |
| 2373 deserves special attention. In this case, we logically compute the full | |
| 2374 multi-dimensional DFT of the input data; since the input data are purely | |
| 2375 real, the output data have the Hermitian symmetry and therefore only one | |
| 2376 non-redundant half need be stored. More specifically, for an @ndims multi-dimensional real-input DFT, the full (logical) complex output array | |
| 2377 @tex | |
| 2378 $Y[k_0, k_1, \ldots, k_{d-1}]$ | |
| 2379 @end tex | |
| 2380 @html | |
| 2381 <i>Y</i>[<i>k</i><sub>0</sub>, <i>k</i><sub>1</sub>, ..., | |
| 2382 <i>k</i><sub><i>d-1</i></sub>] | |
| 2383 @end html | |
| 2384 @ifinfo | |
| 2385 Y[k[0], k[1], ..., k[d-1]] | |
| 2386 @end ifinfo | |
| 2387 has the symmetry: | |
| 2388 @tex | |
| 2389 $$ | |
| 2390 Y[k_0, k_1, \ldots, k_{d-1}] = Y[n_0 - k_0, n_1 - k_1, \ldots, n_{d-1} - k_{d-1}]^* | |
| 2391 $$ | |
| 2392 @end tex | |
| 2393 @html | |
| 2394 <i>Y</i>[<i>k</i><sub>0</sub>, <i>k</i><sub>1</sub>, ..., | |
| 2395 <i>k</i><sub><i>d-1</i></sub>] = <i>Y</i>[<i>n</i><sub>0</sub> - | |
| 2396 <i>k</i><sub>0</sub>, <i>n</i><sub>1</sub> - <i>k</i><sub>1</sub>, ..., | |
| 2397 <i>n</i><sub><i>d-1</i></sub> - <i>k</i><sub><i>d-1</i></sub>]<sup>*</sup> | |
| 2398 @end html | |
| 2399 @ifinfo | |
| 2400 Y[k[0], k[1], ..., k[d-1]] = Y[n[0] - k[0], n[1] - k[1], ..., n[d-1] - k[d-1]]* | |
| 2401 @end ifinfo | |
| 2402 (where each dimension is periodic). Because of this symmetry, we only | |
| 2403 store the | |
| 2404 @tex | |
| 2405 $k_{d-1} = 0 \cdots n_{d-1}/2$ | |
| 2406 @end tex | |
| 2407 @html | |
| 2408 <i>k</i><sub><i>d-1</i></sub> = 0...<i>n</i><sub><i>d-1</i></sub>/2+1 | |
| 2409 @end html | |
| 2410 @ifinfo | |
| 2411 k[d-1] = 0...n[d-1]/2 | |
| 2412 @end ifinfo | |
| 2413 elements of the @emph{last} dimension (division by @math{2} is rounded | |
| 2414 down). (We could instead have cut any other dimension in half, but the | |
| 2415 last dimension proved computationally convenient.) This results in the | |
| 2416 peculiar array format described in more detail by @ref{Real-data DFT | |
| 2417 Array Format}. | |
| 2418 | |
| 2419 The multi-dimensional c2r transform is simply the unnormalized inverse | |
| 2420 of the r2c transform. i.e. it is the same as FFTW's complex backward | |
| 2421 multi-dimensional DFT, operating on a Hermitian input array in the | |
| 2422 peculiar format mentioned above and outputting a real array (since the | |
| 2423 DFT output is purely real). | |
| 2424 | |
| 2425 We should remind the user that the separable product of 1d transforms | |
| 2426 along each dimension, as computed by FFTW, is not always the same thing | |
| 2427 as the usual multi-dimensional transform. A multi-dimensional | |
| 2428 @code{R2HC} (or @code{HC2R}) transform is not identical to the | |
| 2429 multi-dimensional DFT, requiring some post-processing to combine the | |
| 2430 requisite real and imaginary parts, as was described in @ref{The | |
| 2431 Halfcomplex-format DFT}. Likewise, FFTW's multidimensional | |
| 2432 @code{FFTW_DHT} r2r transform is not the same thing as the logical | |
| 2433 multi-dimensional discrete Hartley transform defined in the literature, | |
| 2434 as discussed in @ref{The Discrete Hartley Transform}. | |
| 2435 |
