comparison src/fftw-3.3.5/doc/reference.texi @ 127:7867fa7e1b6b

Current fftw source
author Chris Cannam <cannam@all-day-breakfast.com>
date Tue, 18 Oct 2016 13:40:26 +0100
parents
children
comparison
equal deleted inserted replaced
126:4a7071416412 127:7867fa7e1b6b
1 @node FFTW Reference, Multi-threaded FFTW, Other Important Topics, Top
2 @chapter FFTW Reference
3
4 This chapter provides a complete reference for all sequential (i.e.,
5 one-processor) FFTW functions. Parallel transforms are described in
6 later chapters.
7
8 @menu
9 * Data Types and Files::
10 * Using Plans::
11 * Basic Interface::
12 * Advanced Interface::
13 * Guru Interface::
14 * New-array Execute Functions::
15 * Wisdom::
16 * What FFTW Really Computes::
17 @end menu
18
19 @c ------------------------------------------------------------
20 @node Data Types and Files, Using Plans, FFTW Reference, FFTW Reference
21 @section Data Types and Files
22
23 All programs using FFTW should include its header file:
24
25 @example
26 #include <fftw3.h>
27 @end example
28
29 You must also link to the FFTW library. On Unix, this
30 means adding @code{-lfftw3 -lm} at the @emph{end} of the link command.
31
32 @menu
33 * Complex numbers::
34 * Precision::
35 * Memory Allocation::
36 @end menu
37
38 @c =========>
39 @node Complex numbers, Precision, Data Types and Files, Data Types and Files
40 @subsection Complex numbers
41
42 The default FFTW interface uses @code{double} precision for all
43 floating-point numbers, and defines a @code{fftw_complex} type to hold
44 complex numbers as:
45
46 @example
47 typedef double fftw_complex[2];
48 @end example
49 @tindex fftw_complex
50
51 Here, the @code{[0]} element holds the real part and the @code{[1]}
52 element holds the imaginary part.
53
54 Alternatively, if you have a C compiler (such as @code{gcc}) that
55 supports the C99 revision of the ANSI C standard, you can use C's new
56 native complex type (which is binary-compatible with the typedef above).
57 In particular, if you @code{#include <complex.h>} @emph{before}
58 @code{<fftw3.h>}, then @code{fftw_complex} is defined to be the native
59 complex type and you can manipulate it with ordinary arithmetic
60 (e.g. @code{x = y * (3+4*I)}, where @code{x} and @code{y} are
61 @code{fftw_complex} and @code{I} is the standard symbol for the
62 imaginary unit);
63 @cindex C99
64
65
66 C++ has its own @code{complex<T>} template class, defined in the
67 standard @code{<complex>} header file. Reportedly, the C++ standards
68 committee has recently agreed to mandate that the storage format used
69 for this type be binary-compatible with the C99 type, i.e. an array
70 @code{T[2]} with consecutive real @code{[0]} and imaginary @code{[1]}
71 parts. (See report
72 @uref{http://www.open-std.org/jtc1/sc22/WG21/docs/papers/2002/n1388.pdf
73 WG21/N1388}.) Although not part of the official standard as of this
74 writing, the proposal stated that: ``This solution has been tested with
75 all current major implementations of the standard library and shown to
76 be working.'' To the extent that this is true, if you have a variable
77 @code{complex<double> *x}, you can pass it directly to FFTW via
78 @code{reinterpret_cast<fftw_complex*>(x)}.
79 @cindex C++
80 @cindex portability
81
82 @c =========>
83 @node Precision, Memory Allocation, Complex numbers, Data Types and Files
84 @subsection Precision
85 @cindex precision
86
87 You can install single and long-double precision versions of FFTW,
88 which replace @code{double} with @code{float} and @code{long double},
89 respectively (@pxref{Installation and Customization}). To use these
90 interfaces, you:
91
92 @itemize @bullet
93
94 @item
95 Link to the single/long-double libraries; on Unix, @code{-lfftw3f} or
96 @code{-lfftw3l} instead of (or in addition to) @code{-lfftw3}. (You
97 can link to the different-precision libraries simultaneously.)
98
99 @item
100 Include the @emph{same} @code{<fftw3.h>} header file.
101
102 @item
103 Replace all lowercase instances of @samp{fftw_} with @samp{fftwf_} or
104 @samp{fftwl_} for single or long-double precision, respectively.
105 (@code{fftw_complex} becomes @code{fftwf_complex}, @code{fftw_execute}
106 becomes @code{fftwf_execute}, etcetera.)
107
108 @item
109 Uppercase names, i.e. names beginning with @samp{FFTW_}, remain the
110 same.
111
112 @item
113 Replace @code{double} with @code{float} or @code{long double} for
114 subroutine parameters.
115
116 @end itemize
117
118 Depending upon your compiler and/or hardware, @code{long double} may not
119 be any more precise than @code{double} (or may not be supported at all,
120 although it is standard in C99).
121 @cindex C99
122
123
124 We also support using the nonstandard @code{__float128}
125 quadruple-precision type provided by recent versions of @code{gcc} on
126 32- and 64-bit x86 hardware (@pxref{Installation and Customization}).
127 To use this type, link with @code{-lfftw3q -lquadmath -lm} (the
128 @code{libquadmath} library provided by @code{gcc} is needed for
129 quadruple-precision trigonometric functions) and use @samp{fftwq_}
130 identifiers.
131
132 @c =========>
133 @node Memory Allocation, , Precision, Data Types and Files
134 @subsection Memory Allocation
135
136 @example
137 void *fftw_malloc(size_t n);
138 void fftw_free(void *p);
139 @end example
140 @findex fftw_malloc
141 @findex fftw_free
142
143 These are functions that behave identically to @code{malloc} and
144 @code{free}, except that they guarantee that the returned pointer obeys
145 any special alignment restrictions imposed by any algorithm in FFTW
146 (e.g. for SIMD acceleration). @xref{SIMD alignment and fftw_malloc}.
147 @cindex alignment
148
149
150 Data allocated by @code{fftw_malloc} @emph{must} be deallocated by
151 @code{fftw_free} and not by the ordinary @code{free}.
152
153 These routines simply call through to your operating system's
154 @code{malloc} or, if necessary, its aligned equivalent
155 (e.g. @code{memalign}), so you normally need not worry about any
156 significant time or space overhead. You are @emph{not required} to use
157 them to allocate your data, but we strongly recommend it.
158
159 Note: in C++, just as with ordinary @code{malloc}, you must typecast
160 the output of @code{fftw_malloc} to whatever pointer type you are
161 allocating.
162 @cindex C++
163
164
165 We also provide the following two convenience functions to allocate
166 real and complex arrays with @code{n} elements, which are equivalent
167 to @code{(double *) fftw_malloc(sizeof(double) * n)} and
168 @code{(fftw_complex *) fftw_malloc(sizeof(fftw_complex) * n)},
169 respectively:
170
171 @example
172 double *fftw_alloc_real(size_t n);
173 fftw_complex *fftw_alloc_complex(size_t n);
174 @end example
175 @findex fftw_alloc_real
176 @findex fftw_alloc_complex
177
178 The equivalent functions in other precisions allocate arrays of @code{n}
179 elements in that precision. e.g. @code{fftwf_alloc_real(n)} is
180 equivalent to @code{(float *) fftwf_malloc(sizeof(float) * n)}.
181 @cindex precision
182
183 @c ------------------------------------------------------------
184 @node Using Plans, Basic Interface, Data Types and Files, FFTW Reference
185 @section Using Plans
186
187 Plans for all transform types in FFTW are stored as type
188 @code{fftw_plan} (an opaque pointer type), and are created by one of the
189 various planning routines described in the following sections.
190 @tindex fftw_plan
191 An @code{fftw_plan} contains all information necessary to compute the
192 transform, including the pointers to the input and output arrays.
193
194 @example
195 void fftw_execute(const fftw_plan plan);
196 @end example
197 @findex fftw_execute
198
199 This executes the @code{plan}, to compute the corresponding transform on
200 the arrays for which it was planned (which must still exist). The plan
201 is not modified, and @code{fftw_execute} can be called as many times as
202 desired.
203
204 To apply a given plan to a different array, you can use the new-array execute
205 interface. @xref{New-array Execute Functions}.
206
207 @code{fftw_execute} (and equivalents) is the only function in FFTW
208 guaranteed to be thread-safe; see @ref{Thread safety}.
209
210 This function:
211 @example
212 void fftw_destroy_plan(fftw_plan plan);
213 @end example
214 @findex fftw_destroy_plan
215 deallocates the @code{plan} and all its associated data.
216
217 FFTW's planner saves some other persistent data, such as the
218 accumulated wisdom and a list of algorithms available in the current
219 configuration. If you want to deallocate all of that and reset FFTW
220 to the pristine state it was in when you started your program, you can
221 call:
222
223 @example
224 void fftw_cleanup(void);
225 @end example
226 @findex fftw_cleanup
227
228 After calling @code{fftw_cleanup}, all existing plans become undefined,
229 and you should not attempt to execute them nor to destroy them. You can
230 however create and execute/destroy new plans, in which case FFTW starts
231 accumulating wisdom information again.
232
233 @code{fftw_cleanup} does not deallocate your plans, however. To prevent
234 memory leaks, you must still call @code{fftw_destroy_plan} before
235 executing @code{fftw_cleanup}.
236
237 Occasionally, it may useful to know FFTW's internal ``cost'' metric
238 that it uses to compare plans to one another; this cost is
239 proportional to an execution time of the plan, in undocumented units,
240 if the plan was created with the @code{FFTW_MEASURE} or other
241 timing-based options, or alternatively is a heuristic cost function
242 for @code{FFTW_ESTIMATE} plans. (The cost values of measured and
243 estimated plans are not comparable, being in different units. Also,
244 costs from different FFTW versions or the same version compiled
245 differently may not be in the same units. Plans created from wisdom
246 have a cost of 0 since no timing measurement is performed for them.
247 Finally, certain problems for which only one top-level algorithm was
248 possible may have required no measurements of the cost of the whole
249 plan, in which case @code{fftw_cost} will also return 0.) The cost
250 metric for a given plan is returned by:
251
252 @example
253 double fftw_cost(const fftw_plan plan);
254 @end example
255 @findex fftw_cost
256
257 The following two routines are provided purely for academic purposes
258 (that is, for entertainment).
259
260 @example
261 void fftw_flops(const fftw_plan plan,
262 double *add, double *mul, double *fma);
263 @end example
264 @findex fftw_flops
265
266 Given a @code{plan}, set @code{add}, @code{mul}, and @code{fma} to an
267 exact count of the number of floating-point additions, multiplications,
268 and fused multiply-add operations involved in the plan's execution. The
269 total number of floating-point operations (flops) is @code{add + mul +
270 2*fma}, or @code{add + mul + fma} if the hardware supports fused
271 multiply-add instructions (although the number of FMA operations is only
272 approximate because of compiler voodoo). (The number of operations
273 should be an integer, but we use @code{double} to avoid overflowing
274 @code{int} for large transforms; the arguments are of type @code{double}
275 even for single and long-double precision versions of FFTW.)
276
277 @example
278 void fftw_fprint_plan(const fftw_plan plan, FILE *output_file);
279 void fftw_print_plan(const fftw_plan plan);
280 char *fftw_sprint_plan(const fftw_plan plan);
281 @end example
282 @findex fftw_fprint_plan
283 @findex fftw_print_plan
284
285 This outputs a ``nerd-readable'' representation of the @code{plan} to
286 the given file, to @code{stdout}, or two a newly allocated
287 NUL-terminated string (which the caller is responsible for deallocating
288 with @code{free}), respectively.
289
290 @c ------------------------------------------------------------
291 @node Basic Interface, Advanced Interface, Using Plans, FFTW Reference
292 @section Basic Interface
293 @cindex basic interface
294
295 Recall that the FFTW API is divided into three parts@footnote{@i{Gallia est
296 omnis divisa in partes tres} (Julius Caesar).}: the @dfn{basic interface}
297 computes a single transform of contiguous data, the @dfn{advanced
298 interface} computes transforms of multiple or strided arrays, and the
299 @dfn{guru interface} supports the most general data layouts,
300 multiplicities, and strides. This section describes the the basic
301 interface, which we expect to satisfy the needs of most users.
302
303 @menu
304 * Complex DFTs::
305 * Planner Flags::
306 * Real-data DFTs::
307 * Real-data DFT Array Format::
308 * Real-to-Real Transforms::
309 * Real-to-Real Transform Kinds::
310 @end menu
311
312 @c =========>
313 @node Complex DFTs, Planner Flags, Basic Interface, Basic Interface
314 @subsection Complex DFTs
315
316 @example
317 fftw_plan fftw_plan_dft_1d(int n0,
318 fftw_complex *in, fftw_complex *out,
319 int sign, unsigned flags);
320 fftw_plan fftw_plan_dft_2d(int n0, int n1,
321 fftw_complex *in, fftw_complex *out,
322 int sign, unsigned flags);
323 fftw_plan fftw_plan_dft_3d(int n0, int n1, int n2,
324 fftw_complex *in, fftw_complex *out,
325 int sign, unsigned flags);
326 fftw_plan fftw_plan_dft(int rank, const int *n,
327 fftw_complex *in, fftw_complex *out,
328 int sign, unsigned flags);
329 @end example
330 @findex fftw_plan_dft_1d
331 @findex fftw_plan_dft_2d
332 @findex fftw_plan_dft_3d
333 @findex fftw_plan_dft
334
335 Plan a complex input/output discrete Fourier transform (DFT) in zero or
336 more dimensions, returning an @code{fftw_plan} (@pxref{Using Plans}).
337
338 Once you have created a plan for a certain transform type and
339 parameters, then creating another plan of the same type and parameters,
340 but for different arrays, is fast and shares constant data with the
341 first plan (if it still exists).
342
343 The planner returns @code{NULL} if the plan cannot be created. In the
344 standard FFTW distribution, the basic interface is guaranteed to return
345 a non-@code{NULL} plan. A plan may be @code{NULL}, however, if you are
346 using a customized FFTW configuration supporting a restricted set of
347 transforms.
348
349 @subsubheading Arguments
350 @itemize @bullet
351
352 @item
353 @code{rank} is the rank of the transform (it should be the size of the
354 array @code{*n}), and can be any non-negative integer. (@xref{Complex
355 Multi-Dimensional DFTs}, for the definition of ``rank''.) The
356 @samp{_1d}, @samp{_2d}, and @samp{_3d} planners correspond to a
357 @code{rank} of @code{1}, @code{2}, and @code{3}, respectively. The rank
358 may be zero, which is equivalent to a rank-1 transform of size 1, i.e. a
359 copy of one number from input to output.
360
361 @item
362 @code{n0}, @code{n1}, @code{n2}, or @code{n[0..rank-1]} (as appropriate
363 for each routine) specify the size of the transform dimensions. They
364 can be any positive integer.
365
366 @itemize @minus
367 @item
368 @cindex row-major
369 Multi-dimensional arrays are stored in row-major order with dimensions:
370 @code{n0} x @code{n1}; or @code{n0} x @code{n1} x @code{n2}; or
371 @code{n[0]} x @code{n[1]} x ... x @code{n[rank-1]}.
372 @xref{Multi-dimensional Array Format}.
373 @item
374 FFTW is best at handling sizes of the form
375 @ifinfo
376 @math{2^a 3^b 5^c 7^d 11^e 13^f},
377 @end ifinfo
378 @tex
379 $2^a 3^b 5^c 7^d 11^e 13^f$,
380 @end tex
381 @html
382 2<sup>a</sup> 3<sup>b</sup> 5<sup>c</sup> 7<sup>d</sup>
383 11<sup>e</sup> 13<sup>f</sup>,
384 @end html
385 where @math{e+f} is either @math{0} or @math{1}, and the other exponents
386 are arbitrary. Other sizes are computed by means of a slow,
387 general-purpose algorithm (which nevertheless retains @Onlogn{} performance even for prime sizes). It is possible to customize FFTW
388 for different array sizes; see @ref{Installation and Customization}.
389 Transforms whose sizes are powers of @math{2} are especially fast.
390 @end itemize
391
392 @item
393 @code{in} and @code{out} point to the input and output arrays of the
394 transform, which may be the same (yielding an in-place transform).
395 @cindex in-place
396 These arrays are overwritten during planning, unless
397 @code{FFTW_ESTIMATE} is used in the flags. (The arrays need not be
398 initialized, but they must be allocated.)
399
400 If @code{in == out}, the transform is @dfn{in-place} and the input
401 array is overwritten. If @code{in != out}, the two arrays must
402 not overlap (but FFTW does not check for this condition).
403
404 @item
405 @ctindex FFTW_FORWARD
406 @ctindex FFTW_BACKWARD
407 @code{sign} is the sign of the exponent in the formula that defines the
408 Fourier transform. It can be @math{-1} (= @code{FFTW_FORWARD}) or
409 @math{+1} (= @code{FFTW_BACKWARD}).
410
411 @item
412 @cindex flags
413 @code{flags} is a bitwise OR (@samp{|}) of zero or more planner flags,
414 as defined in @ref{Planner Flags}.
415
416 @end itemize
417
418 FFTW computes an unnormalized transform: computing a forward followed by
419 a backward transform (or vice versa) will result in the original data
420 multiplied by the size of the transform (the product of the dimensions).
421 @cindex normalization
422 For more information, see @ref{What FFTW Really Computes}.
423
424 @c =========>
425 @node Planner Flags, Real-data DFTs, Complex DFTs, Basic Interface
426 @subsection Planner Flags
427
428 All of the planner routines in FFTW accept an integer @code{flags}
429 argument, which is a bitwise OR (@samp{|}) of zero or more of the flag
430 constants defined below. These flags control the rigor (and time) of
431 the planning process, and can also impose (or lift) restrictions on the
432 type of transform algorithm that is employed.
433
434 @emph{Important:} the planner overwrites the input array during
435 planning unless a saved plan (@pxref{Wisdom}) is available for that
436 problem, so you should initialize your input data after creating the
437 plan. The only exceptions to this are the @code{FFTW_ESTIMATE} and
438 @code{FFTW_WISDOM_ONLY} flags, as mentioned below.
439
440 In all cases, if wisdom is available for the given problem that was
441 created with equal-or-greater planning rigor, then the more rigorous
442 wisdom is used. For example, in @code{FFTW_ESTIMATE} mode any available
443 wisdom is used, whereas in @code{FFTW_PATIENT} mode only wisdom created
444 in patient or exhaustive mode can be used. @xref{Words of Wisdom-Saving
445 Plans}.
446
447 @subsubheading Planning-rigor flags
448 @itemize @bullet
449
450 @item
451 @ctindex FFTW_ESTIMATE
452 @code{FFTW_ESTIMATE} specifies that, instead of actual measurements of
453 different algorithms, a simple heuristic is used to pick a (probably
454 sub-optimal) plan quickly. With this flag, the input/output arrays are
455 not overwritten during planning.
456
457 @item
458 @ctindex FFTW_MEASURE
459 @code{FFTW_MEASURE} tells FFTW to find an optimized plan by actually
460 @emph{computing} several FFTs and measuring their execution time.
461 Depending on your machine, this can take some time (often a few
462 seconds). @code{FFTW_MEASURE} is the default planning option.
463
464 @item
465 @ctindex FFTW_PATIENT
466 @code{FFTW_PATIENT} is like @code{FFTW_MEASURE}, but considers a wider
467 range of algorithms and often produces a ``more optimal'' plan
468 (especially for large transforms), but at the expense of several times
469 longer planning time (especially for large transforms).
470
471 @item
472 @ctindex FFTW_EXHAUSTIVE
473 @code{FFTW_EXHAUSTIVE} is like @code{FFTW_PATIENT}, but considers an
474 even wider range of algorithms, including many that we think are
475 unlikely to be fast, to produce the most optimal plan but with a
476 substantially increased planning time.
477
478 @item
479 @ctindex FFTW_WISDOM_ONLY
480 @code{FFTW_WISDOM_ONLY} is a special planning mode in which the plan
481 is only created if wisdom is available for the given problem, and
482 otherwise a @code{NULL} plan is returned. This can be combined with
483 other flags, e.g. @samp{FFTW_WISDOM_ONLY | FFTW_PATIENT} creates a
484 plan only if wisdom is available that was created in
485 @code{FFTW_PATIENT} or @code{FFTW_EXHAUSTIVE} mode. The
486 @code{FFTW_WISDOM_ONLY} flag is intended for users who need to detect
487 whether wisdom is available; for example, if wisdom is not available
488 one may wish to allocate new arrays for planning so that user data is
489 not overwritten.
490
491 @end itemize
492
493 @subsubheading Algorithm-restriction flags
494 @itemize @bullet
495
496 @item
497 @ctindex FFTW_DESTROY_INPUT
498 @code{FFTW_DESTROY_INPUT} specifies that an out-of-place transform is
499 allowed to @emph{overwrite its input} array with arbitrary data; this
500 can sometimes allow more efficient algorithms to be employed.
501 @cindex out-of-place
502
503 @item
504 @ctindex FFTW_PRESERVE_INPUT
505 @code{FFTW_PRESERVE_INPUT} specifies that an out-of-place transform must
506 @emph{not change its input} array. This is ordinarily the
507 @emph{default}, except for c2r and hc2r (i.e. complex-to-real)
508 transforms for which @code{FFTW_DESTROY_INPUT} is the default. In the
509 latter cases, passing @code{FFTW_PRESERVE_INPUT} will attempt to use
510 algorithms that do not destroy the input, at the expense of worse
511 performance; for multi-dimensional c2r transforms, however, no
512 input-preserving algorithms are implemented and the planner will return
513 @code{NULL} if one is requested.
514 @cindex c2r
515 @cindex hc2r
516
517 @item
518 @ctindex FFTW_UNALIGNED
519 @cindex alignment
520 @findex fftw_malloc
521 @findex fftw_alignment_of
522 @code{FFTW_UNALIGNED} specifies that the algorithm may not impose any
523 unusual alignment requirements on the input/output arrays (i.e. no
524 SIMD may be used). This flag is normally @emph{not necessary}, since
525 the planner automatically detects misaligned arrays. The only use for
526 this flag is if you want to use the new-array execute interface to
527 execute a given plan on a different array that may not be aligned like
528 the original. (Using @code{fftw_malloc} makes this flag unnecessary
529 even then. You can also use @code{fftw_alignment_of} to detect
530 whether two arrays are equivalently aligned.)
531
532 @end itemize
533
534 @subsubheading Limiting planning time
535
536 @example
537 extern void fftw_set_timelimit(double seconds);
538 @end example
539 @findex fftw_set_timelimit
540
541 This function instructs FFTW to spend at most @code{seconds} seconds
542 (approximately) in the planner. If @code{seconds ==
543 FFTW_NO_TIMELIMIT} (the default value, which is negative), then
544 planning time is unbounded. Otherwise, FFTW plans with a
545 progressively wider range of algorithms until the the given time limit
546 is reached or the given range of algorithms is explored, returning the
547 best available plan.
548 @ctindex FFTW_NO_TIMELIMIT
549
550
551 For example, specifying @code{FFTW_PATIENT} first plans in
552 @code{FFTW_ESTIMATE} mode, then in @code{FFTW_MEASURE} mode, then
553 finally (time permitting) in @code{FFTW_PATIENT}. If
554 @code{FFTW_EXHAUSTIVE} is specified instead, the planner will further
555 progress to @code{FFTW_EXHAUSTIVE} mode.
556
557 Note that the @code{seconds} argument specifies only a rough limit; in
558 practice, the planner may use somewhat more time if the time limit is
559 reached when the planner is in the middle of an operation that cannot
560 be interrupted. At the very least, the planner will complete planning
561 in @code{FFTW_ESTIMATE} mode (which is thus equivalent to a time limit
562 of 0).
563
564
565 @c =========>
566 @node Real-data DFTs, Real-data DFT Array Format, Planner Flags, Basic Interface
567 @subsection Real-data DFTs
568
569 @example
570 fftw_plan fftw_plan_dft_r2c_1d(int n0,
571 double *in, fftw_complex *out,
572 unsigned flags);
573 fftw_plan fftw_plan_dft_r2c_2d(int n0, int n1,
574 double *in, fftw_complex *out,
575 unsigned flags);
576 fftw_plan fftw_plan_dft_r2c_3d(int n0, int n1, int n2,
577 double *in, fftw_complex *out,
578 unsigned flags);
579 fftw_plan fftw_plan_dft_r2c(int rank, const int *n,
580 double *in, fftw_complex *out,
581 unsigned flags);
582 @end example
583 @findex fftw_plan_dft_r2c_1d
584 @findex fftw_plan_dft_r2c_2d
585 @findex fftw_plan_dft_r2c_3d
586 @findex fftw_plan_dft_r2c
587 @cindex r2c
588
589 Plan a real-input/complex-output discrete Fourier transform (DFT) in
590 zero or more dimensions, returning an @code{fftw_plan} (@pxref{Using
591 Plans}).
592
593 Once you have created a plan for a certain transform type and
594 parameters, then creating another plan of the same type and parameters,
595 but for different arrays, is fast and shares constant data with the
596 first plan (if it still exists).
597
598 The planner returns @code{NULL} if the plan cannot be created. A
599 non-@code{NULL} plan is always returned by the basic interface unless
600 you are using a customized FFTW configuration supporting a restricted
601 set of transforms, or if you use the @code{FFTW_PRESERVE_INPUT} flag
602 with a multi-dimensional out-of-place c2r transform (see below).
603
604 @subsubheading Arguments
605 @itemize @bullet
606
607 @item
608 @code{rank} is the rank of the transform (it should be the size of the
609 array @code{*n}), and can be any non-negative integer. (@xref{Complex
610 Multi-Dimensional DFTs}, for the definition of ``rank''.) The
611 @samp{_1d}, @samp{_2d}, and @samp{_3d} planners correspond to a
612 @code{rank} of @code{1}, @code{2}, and @code{3}, respectively. The rank
613 may be zero, which is equivalent to a rank-1 transform of size 1, i.e. a
614 copy of one real number (with zero imaginary part) from input to output.
615
616 @item
617 @code{n0}, @code{n1}, @code{n2}, or @code{n[0..rank-1]}, (as appropriate
618 for each routine) specify the size of the transform dimensions. They
619 can be any positive integer. This is different in general from the
620 @emph{physical} array dimensions, which are described in @ref{Real-data
621 DFT Array Format}.
622
623 @itemize @minus
624 @item
625 FFTW is best at handling sizes of the form
626 @ifinfo
627 @math{2^a 3^b 5^c 7^d 11^e 13^f},
628 @end ifinfo
629 @tex
630 $2^a 3^b 5^c 7^d 11^e 13^f$,
631 @end tex
632 @html
633 2<sup>a</sup> 3<sup>b</sup> 5<sup>c</sup> 7<sup>d</sup>
634 11<sup>e</sup> 13<sup>f</sup>,
635 @end html
636 where @math{e+f} is either @math{0} or @math{1}, and the other exponents
637 are arbitrary. Other sizes are computed by means of a slow,
638 general-purpose algorithm (which nevertheless retains @Onlogn{} performance even for prime sizes). (It is possible to customize FFTW
639 for different array sizes; see @ref{Installation and Customization}.)
640 Transforms whose sizes are powers of @math{2} are especially fast, and
641 it is generally beneficial for the @emph{last} dimension of an r2c/c2r
642 transform to be @emph{even}.
643 @end itemize
644
645 @item
646 @code{in} and @code{out} point to the input and output arrays of the
647 transform, which may be the same (yielding an in-place transform).
648 @cindex in-place
649 These arrays are overwritten during planning, unless
650 @code{FFTW_ESTIMATE} is used in the flags. (The arrays need not be
651 initialized, but they must be allocated.) For an in-place transform, it
652 is important to remember that the real array will require padding,
653 described in @ref{Real-data DFT Array Format}.
654 @cindex padding
655
656 @item
657 @cindex flags
658 @code{flags} is a bitwise OR (@samp{|}) of zero or more planner flags,
659 as defined in @ref{Planner Flags}.
660
661 @end itemize
662
663 The inverse transforms, taking complex input (storing the non-redundant
664 half of a logically Hermitian array) to real output, are given by:
665
666 @example
667 fftw_plan fftw_plan_dft_c2r_1d(int n0,
668 fftw_complex *in, double *out,
669 unsigned flags);
670 fftw_plan fftw_plan_dft_c2r_2d(int n0, int n1,
671 fftw_complex *in, double *out,
672 unsigned flags);
673 fftw_plan fftw_plan_dft_c2r_3d(int n0, int n1, int n2,
674 fftw_complex *in, double *out,
675 unsigned flags);
676 fftw_plan fftw_plan_dft_c2r(int rank, const int *n,
677 fftw_complex *in, double *out,
678 unsigned flags);
679 @end example
680 @findex fftw_plan_dft_c2r_1d
681 @findex fftw_plan_dft_c2r_2d
682 @findex fftw_plan_dft_c2r_3d
683 @findex fftw_plan_dft_c2r
684 @cindex c2r
685
686 The arguments are the same as for the r2c transforms, except that the
687 input and output data formats are reversed.
688
689 FFTW computes an unnormalized transform: computing an r2c followed by a
690 c2r transform (or vice versa) will result in the original data
691 multiplied by the size of the transform (the product of the logical
692 dimensions).
693 @cindex normalization
694 An r2c transform produces the same output as a @code{FFTW_FORWARD}
695 complex DFT of the same input, and a c2r transform is correspondingly
696 equivalent to @code{FFTW_BACKWARD}. For more information, see @ref{What
697 FFTW Really Computes}.
698
699 @c =========>
700 @node Real-data DFT Array Format, Real-to-Real Transforms, Real-data DFTs, Basic Interface
701 @subsection Real-data DFT Array Format
702 @cindex r2c/c2r multi-dimensional array format
703
704 The output of a DFT of real data (r2c) contains symmetries that, in
705 principle, make half of the outputs redundant (@pxref{What FFTW Really
706 Computes}). (Similarly for the input of an inverse c2r transform.) In
707 practice, it is not possible to entirely realize these savings in an
708 efficient and understandable format that generalizes to
709 multi-dimensional transforms. Instead, the output of the r2c
710 transforms is @emph{slightly} over half of the output of the
711 corresponding complex transform. We do not ``pack'' the data in any
712 way, but store it as an ordinary array of @code{fftw_complex} values.
713 In fact, this data is simply a subsection of what would be the array in
714 the corresponding complex transform.
715
716 Specifically, for a real transform of @math{d} (= @code{rank})
717 dimensions @ndims{}, the complex data is an @ndimshalf array of
718 @code{fftw_complex} values in row-major order (with the division rounded
719 down). That is, we only store the @emph{lower} half (non-negative
720 frequencies), plus one element, of the last dimension of the data from
721 the ordinary complex transform. (We could have instead taken half of
722 any other dimension, but implementation turns out to be simpler if the
723 last, contiguous, dimension is used.)
724
725 @cindex out-of-place
726 For an out-of-place transform, the real data is simply an array with
727 physical dimensions @ndims in row-major order.
728
729 @cindex in-place
730 @cindex padding
731 For an in-place transform, some complications arise since the complex data
732 is slightly larger than the real data. In this case, the final
733 dimension of the real data must be @emph{padded} with extra values to
734 accommodate the size of the complex data---two extra if the last
735 dimension is even and one if it is odd. That is, the last dimension of
736 the real data must physically contain
737 @tex
738 $2 (n_{d-1}/2+1)$
739 @end tex
740 @ifinfo
741 2 * (n[d-1]/2+1)
742 @end ifinfo
743 @html
744 2 * (n<sub>d-1</sub>/2+1)
745 @end html
746 @code{double} values (exactly enough to hold the complex data). This
747 physical array size does not, however, change the @emph{logical} array
748 size---only
749 @tex
750 $n_{d-1}$
751 @end tex
752 @ifinfo
753 n[d-1]
754 @end ifinfo
755 @html
756 n<sub>d-1</sub>
757 @end html
758 values are actually stored in the last dimension, and
759 @tex
760 $n_{d-1}$
761 @end tex
762 @ifinfo
763 n[d-1]
764 @end ifinfo
765 @html
766 n<sub>d-1</sub>
767 @end html
768 is the last dimension passed to the planner.
769
770 @c =========>
771 @node Real-to-Real Transforms, Real-to-Real Transform Kinds, Real-data DFT Array Format, Basic Interface
772 @subsection Real-to-Real Transforms
773 @cindex r2r
774
775 @example
776 fftw_plan fftw_plan_r2r_1d(int n, double *in, double *out,
777 fftw_r2r_kind kind, unsigned flags);
778 fftw_plan fftw_plan_r2r_2d(int n0, int n1, double *in, double *out,
779 fftw_r2r_kind kind0, fftw_r2r_kind kind1,
780 unsigned flags);
781 fftw_plan fftw_plan_r2r_3d(int n0, int n1, int n2,
782 double *in, double *out,
783 fftw_r2r_kind kind0,
784 fftw_r2r_kind kind1,
785 fftw_r2r_kind kind2,
786 unsigned flags);
787 fftw_plan fftw_plan_r2r(int rank, const int *n, double *in, double *out,
788 const fftw_r2r_kind *kind, unsigned flags);
789 @end example
790 @findex fftw_plan_r2r_1d
791 @findex fftw_plan_r2r_2d
792 @findex fftw_plan_r2r_3d
793 @findex fftw_plan_r2r
794
795 Plan a real input/output (r2r) transform of various kinds in zero or
796 more dimensions, returning an @code{fftw_plan} (@pxref{Using Plans}).
797
798 Once you have created a plan for a certain transform type and
799 parameters, then creating another plan of the same type and parameters,
800 but for different arrays, is fast and shares constant data with the
801 first plan (if it still exists).
802
803 The planner returns @code{NULL} if the plan cannot be created. A
804 non-@code{NULL} plan is always returned by the basic interface unless
805 you are using a customized FFTW configuration supporting a restricted
806 set of transforms, or for size-1 @code{FFTW_REDFT00} kinds (which are
807 not defined).
808 @ctindex FFTW_REDFT00
809
810 @subsubheading Arguments
811 @itemize @bullet
812
813 @item
814 @code{rank} is the dimensionality of the transform (it should be the
815 size of the arrays @code{*n} and @code{*kind}), and can be any
816 non-negative integer. The @samp{_1d}, @samp{_2d}, and @samp{_3d}
817 planners correspond to a @code{rank} of @code{1}, @code{2}, and
818 @code{3}, respectively. A @code{rank} of zero is equivalent to a copy
819 of one number from input to output.
820
821 @item
822 @code{n}, or @code{n0}/@code{n1}/@code{n2}, or @code{n[rank]},
823 respectively, gives the (physical) size of the transform dimensions.
824 They can be any positive integer.
825
826 @itemize @minus
827 @item
828 @cindex row-major
829 Multi-dimensional arrays are stored in row-major order with dimensions:
830 @code{n0} x @code{n1}; or @code{n0} x @code{n1} x @code{n2}; or
831 @code{n[0]} x @code{n[1]} x ... x @code{n[rank-1]}.
832 @xref{Multi-dimensional Array Format}.
833 @item
834 FFTW is generally best at handling sizes of the form
835 @ifinfo
836 @math{2^a 3^b 5^c 7^d 11^e 13^f},
837 @end ifinfo
838 @tex
839 $2^a 3^b 5^c 7^d 11^e 13^f$,
840 @end tex
841 @html
842 2<sup>a</sup> 3<sup>b</sup> 5<sup>c</sup> 7<sup>d</sup>
843 11<sup>e</sup> 13<sup>f</sup>,
844 @end html
845 where @math{e+f} is either @math{0} or @math{1}, and the other exponents
846 are arbitrary. Other sizes are computed by means of a slow,
847 general-purpose algorithm (which nevertheless retains @Onlogn{} performance even for prime sizes). (It is possible to customize FFTW
848 for different array sizes; see @ref{Installation and Customization}.)
849 Transforms whose sizes are powers of @math{2} are especially fast.
850 @item
851 For a @code{REDFT00} or @code{RODFT00} transform kind in a dimension of
852 size @math{n}, it is @math{n-1} or @math{n+1}, respectively, that
853 should be factorizable in the above form.
854 @end itemize
855
856 @item
857 @code{in} and @code{out} point to the input and output arrays of the
858 transform, which may be the same (yielding an in-place transform).
859 @cindex in-place
860 These arrays are overwritten during planning, unless
861 @code{FFTW_ESTIMATE} is used in the flags. (The arrays need not be
862 initialized, but they must be allocated.)
863
864 @item
865 @code{kind}, or @code{kind0}/@code{kind1}/@code{kind2}, or
866 @code{kind[rank]}, is the kind of r2r transform used for the
867 corresponding dimension. The valid kind constants are described in
868 @ref{Real-to-Real Transform Kinds}. In a multi-dimensional transform,
869 what is computed is the separable product formed by taking each
870 transform kind along the corresponding dimension, one dimension after
871 another.
872
873 @item
874 @cindex flags
875 @code{flags} is a bitwise OR (@samp{|}) of zero or more planner flags,
876 as defined in @ref{Planner Flags}.
877
878 @end itemize
879
880 @c =========>
881 @node Real-to-Real Transform Kinds, , Real-to-Real Transforms, Basic Interface
882 @subsection Real-to-Real Transform Kinds
883 @cindex kind (r2r)
884
885 FFTW currently supports 11 different r2r transform kinds, specified by
886 one of the constants below. For the precise definitions of these
887 transforms, see @ref{What FFTW Really Computes}. For a more colloquial
888 introduction to these transform kinds, see @ref{More DFTs of Real Data}.
889
890 For dimension of size @code{n}, there is a corresponding ``logical''
891 dimension @code{N} that determines the normalization (and the optimal
892 factorization); the formula for @code{N} is given for each kind below.
893 Also, with each transform kind is listed its corrsponding inverse
894 transform. FFTW computes unnormalized transforms: a transform followed
895 by its inverse will result in the original data multiplied by @code{N}
896 (or the product of the @code{N}'s for each dimension, in
897 multi-dimensions).
898 @cindex normalization
899
900 @itemize @bullet
901
902 @item
903 @ctindex FFTW_R2HC
904 @code{FFTW_R2HC} computes a real-input DFT with output in
905 ``halfcomplex'' format, i.e. real and imaginary parts for a transform of
906 size @code{n} stored as:
907 @tex
908 $$
909 r_0, r_1, r_2, \ldots, r_{n/2}, i_{(n+1)/2-1}, \ldots, i_2, i_1
910 $$
911 @end tex
912 @ifinfo
913 r0, r1, r2, r(n/2), i((n+1)/2-1), ..., i2, i1
914 @end ifinfo
915 @html
916 <p align=center>
917 r<sub>0</sub>, r<sub>1</sub>, r<sub>2</sub>, ..., r<sub>n/2</sub>, i<sub>(n+1)/2-1</sub>, ..., i<sub>2</sub>, i<sub>1</sub>
918 </p>
919 @end html
920 (Logical @code{N=n}, inverse is @code{FFTW_HC2R}.)
921
922 @item
923 @ctindex FFTW_HC2R
924 @code{FFTW_HC2R} computes the reverse of @code{FFTW_R2HC}, above.
925 (Logical @code{N=n}, inverse is @code{FFTW_R2HC}.)
926
927 @item
928 @ctindex FFTW_DHT
929 @code{FFTW_DHT} computes a discrete Hartley transform.
930 (Logical @code{N=n}, inverse is @code{FFTW_DHT}.)
931 @cindex discrete Hartley transform
932
933 @item
934 @ctindex FFTW_REDFT00
935 @code{FFTW_REDFT00} computes an REDFT00 transform, i.e. a DCT-I.
936 (Logical @code{N=2*(n-1)}, inverse is @code{FFTW_REDFT00}.)
937 @cindex discrete cosine transform
938 @cindex DCT
939
940 @item
941 @ctindex FFTW_REDFT10
942 @code{FFTW_REDFT10} computes an REDFT10 transform, i.e. a DCT-II (sometimes called ``the'' DCT).
943 (Logical @code{N=2*n}, inverse is @code{FFTW_REDFT01}.)
944
945 @item
946 @ctindex FFTW_REDFT01
947 @code{FFTW_REDFT01} computes an REDFT01 transform, i.e. a DCT-III (sometimes called ``the'' IDCT, being the inverse of DCT-II).
948 (Logical @code{N=2*n}, inverse is @code{FFTW_REDFT=10}.)
949 @cindex IDCT
950
951 @item
952 @ctindex FFTW_REDFT11
953 @code{FFTW_REDFT11} computes an REDFT11 transform, i.e. a DCT-IV.
954 (Logical @code{N=2*n}, inverse is @code{FFTW_REDFT11}.)
955
956 @item
957 @ctindex FFTW_RODFT00
958 @code{FFTW_RODFT00} computes an RODFT00 transform, i.e. a DST-I.
959 (Logical @code{N=2*(n+1)}, inverse is @code{FFTW_RODFT00}.)
960 @cindex discrete sine transform
961 @cindex DST
962
963 @item
964 @ctindex FFTW_RODFT10
965 @code{FFTW_RODFT10} computes an RODFT10 transform, i.e. a DST-II.
966 (Logical @code{N=2*n}, inverse is @code{FFTW_RODFT01}.)
967
968 @item
969 @ctindex FFTW_RODFT01
970 @code{FFTW_RODFT01} computes an RODFT01 transform, i.e. a DST-III.
971 (Logical @code{N=2*n}, inverse is @code{FFTW_RODFT=10}.)
972
973 @item
974 @ctindex FFTW_RODFT11
975 @code{FFTW_RODFT11} computes an RODFT11 transform, i.e. a DST-IV.
976 (Logical @code{N=2*n}, inverse is @code{FFTW_RODFT11}.)
977
978 @end itemize
979
980 @c ------------------------------------------------------------
981 @node Advanced Interface, Guru Interface, Basic Interface, FFTW Reference
982 @section Advanced Interface
983 @cindex advanced interface
984
985 FFTW's ``advanced'' interface supplements the basic interface with four
986 new planner routines, providing a new level of flexibility: you can plan
987 a transform of multiple arrays simultaneously, operate on non-contiguous
988 (strided) data, and transform a subset of a larger multi-dimensional
989 array. Other than these additional features, the planner operates in
990 the same fashion as in the basic interface, and the resulting
991 @code{fftw_plan} is used in the same way (@pxref{Using Plans}).
992
993 @menu
994 * Advanced Complex DFTs::
995 * Advanced Real-data DFTs::
996 * Advanced Real-to-real Transforms::
997 @end menu
998
999 @c =========>
1000 @node Advanced Complex DFTs, Advanced Real-data DFTs, Advanced Interface, Advanced Interface
1001 @subsection Advanced Complex DFTs
1002
1003 @example
1004 fftw_plan fftw_plan_many_dft(int rank, const int *n, int howmany,
1005 fftw_complex *in, const int *inembed,
1006 int istride, int idist,
1007 fftw_complex *out, const int *onembed,
1008 int ostride, int odist,
1009 int sign, unsigned flags);
1010 @end example
1011 @findex fftw_plan_many_dft
1012
1013 This routine plans multiple multidimensional complex DFTs, and it
1014 extends the @code{fftw_plan_dft} routine (@pxref{Complex DFTs}) to
1015 compute @code{howmany} transforms, each having rank @code{rank} and size
1016 @code{n}. In addition, the transform data need not be contiguous, but
1017 it may be laid out in memory with an arbitrary stride. To account for
1018 these possibilities, @code{fftw_plan_many_dft} adds the new parameters
1019 @code{howmany}, @{@code{i},@code{o}@}@code{nembed},
1020 @{@code{i},@code{o}@}@code{stride}, and
1021 @{@code{i},@code{o}@}@code{dist}. The FFTW basic interface
1022 (@pxref{Complex DFTs}) provides routines specialized for ranks 1, 2,
1023 and@tie{}3, but the advanced interface handles only the general-rank
1024 case.
1025
1026 @code{howmany} is the number of transforms to compute. The resulting
1027 plan computes @code{howmany} transforms, where the input of the
1028 @code{k}-th transform is at location @code{in+k*idist} (in C pointer
1029 arithmetic), and its output is at location @code{out+k*odist}. Plans
1030 obtained in this way can often be faster than calling FFTW multiple
1031 times for the individual transforms. The basic @code{fftw_plan_dft}
1032 interface corresponds to @code{howmany=1} (in which case the @code{dist}
1033 parameters are ignored).
1034 @cindex howmany parameter
1035 @cindex dist
1036
1037
1038 Each of the @code{howmany} transforms has rank @code{rank} and size
1039 @code{n}, as in the basic interface. In addition, the advanced
1040 interface allows the input and output arrays of each transform to be
1041 row-major subarrays of larger rank-@code{rank} arrays, described by
1042 @code{inembed} and @code{onembed} parameters, respectively.
1043 @{@code{i},@code{o}@}@code{nembed} must be arrays of length @code{rank},
1044 and @code{n} should be elementwise less than or equal to
1045 @{@code{i},@code{o}@}@code{nembed}. Passing @code{NULL} for an
1046 @code{nembed} parameter is equivalent to passing @code{n} (i.e. same
1047 physical and logical dimensions, as in the basic interface.)
1048
1049 The @code{stride} parameters indicate that the @code{j}-th element of
1050 the input or output arrays is located at @code{j*istride} or
1051 @code{j*ostride}, respectively. (For a multi-dimensional array,
1052 @code{j} is the ordinary row-major index.) When combined with the
1053 @code{k}-th transform in a @code{howmany} loop, from above, this means
1054 that the (@code{j},@code{k})-th element is at @code{j*stride+k*dist}.
1055 (The basic @code{fftw_plan_dft} interface corresponds to a stride of 1.)
1056 @cindex stride
1057
1058
1059 For in-place transforms, the input and output @code{stride} and
1060 @code{dist} parameters should be the same; otherwise, the planner may
1061 return @code{NULL}.
1062
1063 Arrays @code{n}, @code{inembed}, and @code{onembed} are not used after
1064 this function returns. You can safely free or reuse them.
1065
1066 @strong{Examples}:
1067 One transform of one 5 by 6 array contiguous in memory:
1068 @example
1069 int rank = 2;
1070 int n[] = @{5, 6@};
1071 int howmany = 1;
1072 int idist = odist = 0; /* unused because howmany = 1 */
1073 int istride = ostride = 1; /* array is contiguous in memory */
1074 int *inembed = n, *onembed = n;
1075 @end example
1076
1077 Transform of three 5 by 6 arrays, each contiguous in memory,
1078 stored in memory one after another:
1079 @example
1080 int rank = 2;
1081 int n[] = @{5, 6@};
1082 int howmany = 3;
1083 int idist = odist = n[0]*n[1]; /* = 30, the distance in memory
1084 between the first element
1085 of the first array and the
1086 first element of the second array */
1087 int istride = ostride = 1; /* array is contiguous in memory */
1088 int *inembed = n, *onembed = n;
1089 @end example
1090
1091 Transform each column of a 2d array with 10 rows and 3 columns:
1092 @example
1093 int rank = 1; /* not 2: we are computing 1d transforms */
1094 int n[] = @{10@}; /* 1d transforms of length 10 */
1095 int howmany = 3;
1096 int idist = odist = 1;
1097 int istride = ostride = 3; /* distance between two elements in
1098 the same column */
1099 int *inembed = n, *onembed = n;
1100 @end example
1101
1102 @c =========>
1103 @node Advanced Real-data DFTs, Advanced Real-to-real Transforms, Advanced Complex DFTs, Advanced Interface
1104 @subsection Advanced Real-data DFTs
1105
1106 @example
1107 fftw_plan fftw_plan_many_dft_r2c(int rank, const int *n, int howmany,
1108 double *in, const int *inembed,
1109 int istride, int idist,
1110 fftw_complex *out, const int *onembed,
1111 int ostride, int odist,
1112 unsigned flags);
1113 fftw_plan fftw_plan_many_dft_c2r(int rank, const int *n, int howmany,
1114 fftw_complex *in, const int *inembed,
1115 int istride, int idist,
1116 double *out, const int *onembed,
1117 int ostride, int odist,
1118 unsigned flags);
1119 @end example
1120 @findex fftw_plan_many_dft_r2c
1121 @findex fftw_plan_many_dft_c2r
1122
1123 Like @code{fftw_plan_many_dft}, these two functions add @code{howmany},
1124 @code{nembed}, @code{stride}, and @code{dist} parameters to the
1125 @code{fftw_plan_dft_r2c} and @code{fftw_plan_dft_c2r} functions, but
1126 otherwise behave the same as the basic interface.
1127
1128 The interpretation of @code{howmany}, @code{stride}, and @code{dist} are
1129 the same as for @code{fftw_plan_many_dft}, above. Note that the
1130 @code{stride} and @code{dist} for the real array are in units of
1131 @code{double}, and for the complex array are in units of
1132 @code{fftw_complex}.
1133
1134 If an @code{nembed} parameter is @code{NULL}, it is interpreted as what
1135 it would be in the basic interface, as described in @ref{Real-data DFT
1136 Array Format}. That is, for the complex array the size is assumed to be
1137 the same as @code{n}, but with the last dimension cut roughly in half.
1138 For the real array, the size is assumed to be @code{n} if the transform
1139 is out-of-place, or @code{n} with the last dimension ``padded'' if the
1140 transform is in-place.
1141
1142 If an @code{nembed} parameter is non-@code{NULL}, it is interpreted as
1143 the physical size of the corresponding array, in row-major order, just
1144 as for @code{fftw_plan_many_dft}. In this case, each dimension of
1145 @code{nembed} should be @code{>=} what it would be in the basic
1146 interface (e.g. the halved or padded @code{n}).
1147
1148 Arrays @code{n}, @code{inembed}, and @code{onembed} are not used after
1149 this function returns. You can safely free or reuse them.
1150
1151 @c =========>
1152 @node Advanced Real-to-real Transforms, , Advanced Real-data DFTs, Advanced Interface
1153 @subsection Advanced Real-to-real Transforms
1154
1155 @example
1156 fftw_plan fftw_plan_many_r2r(int rank, const int *n, int howmany,
1157 double *in, const int *inembed,
1158 int istride, int idist,
1159 double *out, const int *onembed,
1160 int ostride, int odist,
1161 const fftw_r2r_kind *kind, unsigned flags);
1162 @end example
1163 @findex fftw_plan_many_r2r
1164
1165 Like @code{fftw_plan_many_dft}, this functions adds @code{howmany},
1166 @code{nembed}, @code{stride}, and @code{dist} parameters to the
1167 @code{fftw_plan_r2r} function, but otherwise behave the same as the
1168 basic interface. The interpretation of those additional parameters are
1169 the same as for @code{fftw_plan_many_dft}. (Of course, the
1170 @code{stride} and @code{dist} parameters are now in units of
1171 @code{double}, not @code{fftw_complex}.)
1172
1173 Arrays @code{n}, @code{inembed}, @code{onembed}, and @code{kind} are not
1174 used after this function returns. You can safely free or reuse them.
1175
1176 @c ------------------------------------------------------------
1177 @node Guru Interface, New-array Execute Functions, Advanced Interface, FFTW Reference
1178 @section Guru Interface
1179 @cindex guru interface
1180
1181 The ``guru'' interface to FFTW is intended to expose as much as possible
1182 of the flexibility in the underlying FFTW architecture. It allows one
1183 to compute multi-dimensional ``vectors'' (loops) of multi-dimensional
1184 transforms, where each vector/transform dimension has an independent
1185 size and stride.
1186 @cindex vector
1187 One can also use more general complex-number formats, e.g. separate real
1188 and imaginary arrays.
1189
1190 For those users who require the flexibility of the guru interface, it is
1191 important that they pay special attention to the documentation lest they
1192 shoot themselves in the foot.
1193
1194 @menu
1195 * Interleaved and split arrays::
1196 * Guru vector and transform sizes::
1197 * Guru Complex DFTs::
1198 * Guru Real-data DFTs::
1199 * Guru Real-to-real Transforms::
1200 * 64-bit Guru Interface::
1201 @end menu
1202
1203 @c =========>
1204 @node Interleaved and split arrays, Guru vector and transform sizes, Guru Interface, Guru Interface
1205 @subsection Interleaved and split arrays
1206
1207 The guru interface supports two representations of complex numbers,
1208 which we call the interleaved and the split format.
1209
1210 The @dfn{interleaved} format is the same one used by the basic and
1211 advanced interfaces, and it is documented in @ref{Complex numbers}.
1212 In the interleaved format, you provide pointers to the real part of a
1213 complex number, and the imaginary part understood to be stored in the
1214 next memory location.
1215 @cindex interleaved format
1216
1217
1218 The @dfn{split} format allows separate pointers to the real and
1219 imaginary parts of a complex array.
1220 @cindex split format
1221
1222
1223 Technically, the interleaved format is redundant, because you can
1224 always express an interleaved array in terms of a split array with
1225 appropriate pointers and strides. On the other hand, the interleaved
1226 format is simpler to use, and it is common in practice. Hence, FFTW
1227 supports it as a special case.
1228
1229 @c =========>
1230 @node Guru vector and transform sizes, Guru Complex DFTs, Interleaved and split arrays, Guru Interface
1231 @subsection Guru vector and transform sizes
1232
1233 The guru interface introduces one basic new data structure,
1234 @code{fftw_iodim}, that is used to specify sizes and strides for
1235 multi-dimensional transforms and vectors:
1236
1237 @example
1238 typedef struct @{
1239 int n;
1240 int is;
1241 int os;
1242 @} fftw_iodim;
1243 @end example
1244 @tindex fftw_iodim
1245
1246 Here, @code{n} is the size of the dimension, and @code{is} and @code{os}
1247 are the strides of that dimension for the input and output arrays. (The
1248 stride is the separation of consecutive elements along this dimension.)
1249
1250 The meaning of the stride parameter depends on the type of the array
1251 that the stride refers to. @emph{If the array is interleaved complex,
1252 strides are expressed in units of complex numbers
1253 (@code{fftw_complex}). If the array is split complex or real, strides
1254 are expressed in units of real numbers (@code{double}).} This
1255 convention is consistent with the usual pointer arithmetic in the C
1256 language. An interleaved array is denoted by a pointer @code{p} to
1257 @code{fftw_complex}, so that @code{p+1} points to the next complex
1258 number. Split arrays are denoted by pointers to @code{double}, in
1259 which case pointer arithmetic operates in units of
1260 @code{sizeof(double)}.
1261 @cindex stride
1262
1263
1264 The guru planner interfaces all take a (@code{rank}, @code{dims[rank]})
1265 pair describing the transform size, and a (@code{howmany_rank},
1266 @code{howmany_dims[howmany_rank]}) pair describing the ``vector'' size (a
1267 multi-dimensional loop of transforms to perform), where @code{dims} and
1268 @code{howmany_dims} are arrays of @code{fftw_iodim}.
1269
1270 For example, the @code{howmany} parameter in the advanced complex-DFT
1271 interface corresponds to @code{howmany_rank} = 1,
1272 @code{howmany_dims[0].n} = @code{howmany}, @code{howmany_dims[0].is} =
1273 @code{idist}, and @code{howmany_dims[0].os} = @code{odist}.
1274 @cindex howmany loop
1275 @cindex dist
1276 (To compute a single transform, you can just use @code{howmany_rank} = 0.)
1277
1278
1279 A row-major multidimensional array with dimensions @code{n[rank]}
1280 (@pxref{Row-major Format}) corresponds to @code{dims[i].n} =
1281 @code{n[i]} and the recurrence @code{dims[i].is} = @code{n[i+1] *
1282 dims[i+1].is} (similarly for @code{os}). The stride of the last
1283 (@code{i=rank-1}) dimension is the overall stride of the array.
1284 e.g. to be equivalent to the advanced complex-DFT interface, you would
1285 have @code{dims[rank-1].is} = @code{istride} and
1286 @code{dims[rank-1].os} = @code{ostride}.
1287 @cindex row-major
1288
1289
1290 In general, we only guarantee FFTW to return a non-@code{NULL} plan if
1291 the vector and transform dimensions correspond to a set of distinct
1292 indices, and for in-place transforms the input/output strides should
1293 be the same.
1294
1295 @c =========>
1296 @node Guru Complex DFTs, Guru Real-data DFTs, Guru vector and transform sizes, Guru Interface
1297 @subsection Guru Complex DFTs
1298
1299 @example
1300 fftw_plan fftw_plan_guru_dft(
1301 int rank, const fftw_iodim *dims,
1302 int howmany_rank, const fftw_iodim *howmany_dims,
1303 fftw_complex *in, fftw_complex *out,
1304 int sign, unsigned flags);
1305
1306 fftw_plan fftw_plan_guru_split_dft(
1307 int rank, const fftw_iodim *dims,
1308 int howmany_rank, const fftw_iodim *howmany_dims,
1309 double *ri, double *ii, double *ro, double *io,
1310 unsigned flags);
1311 @end example
1312 @findex fftw_plan_guru_dft
1313 @findex fftw_plan_guru_split_dft
1314
1315 These two functions plan a complex-data, multi-dimensional DFT
1316 for the interleaved and split format, respectively.
1317 Transform dimensions are given by (@code{rank}, @code{dims}) over a
1318 multi-dimensional vector (loop) of dimensions (@code{howmany_rank},
1319 @code{howmany_dims}). @code{dims} and @code{howmany_dims} should point
1320 to @code{fftw_iodim} arrays of length @code{rank} and
1321 @code{howmany_rank}, respectively.
1322
1323 @cindex flags
1324 @code{flags} is a bitwise OR (@samp{|}) of zero or more planner flags,
1325 as defined in @ref{Planner Flags}.
1326
1327 In the @code{fftw_plan_guru_dft} function, the pointers @code{in} and
1328 @code{out} point to the interleaved input and output arrays,
1329 respectively. The sign can be either @math{-1} (=
1330 @code{FFTW_FORWARD}) or @math{+1} (= @code{FFTW_BACKWARD}). If the
1331 pointers are equal, the transform is in-place.
1332
1333 In the @code{fftw_plan_guru_split_dft} function,
1334 @code{ri} and @code{ii} point to the real and imaginary input arrays,
1335 and @code{ro} and @code{io} point to the real and imaginary output
1336 arrays. The input and output pointers may be the same, indicating an
1337 in-place transform. For example, for @code{fftw_complex} pointers
1338 @code{in} and @code{out}, the corresponding parameters are:
1339
1340 @example
1341 ri = (double *) in;
1342 ii = (double *) in + 1;
1343 ro = (double *) out;
1344 io = (double *) out + 1;
1345 @end example
1346
1347 Because @code{fftw_plan_guru_split_dft} accepts split arrays, strides
1348 are expressed in units of @code{double}. For a contiguous
1349 @code{fftw_complex} array, the overall stride of the transform should
1350 be 2, the distance between consecutive real parts or between
1351 consecutive imaginary parts; see @ref{Guru vector and transform
1352 sizes}. Note that the dimension strides are applied equally to the
1353 real and imaginary parts; real and imaginary arrays with different
1354 strides are not supported.
1355
1356 There is no @code{sign} parameter in @code{fftw_plan_guru_split_dft}.
1357 This function always plans for an @code{FFTW_FORWARD} transform. To
1358 plan for an @code{FFTW_BACKWARD} transform, you can exploit the
1359 identity that the backwards DFT is equal to the forwards DFT with the
1360 real and imaginary parts swapped. For example, in the case of the
1361 @code{fftw_complex} arrays above, the @code{FFTW_BACKWARD} transform
1362 is computed by the parameters:
1363
1364 @example
1365 ri = (double *) in + 1;
1366 ii = (double *) in;
1367 ro = (double *) out + 1;
1368 io = (double *) out;
1369 @end example
1370
1371 @c =========>
1372 @node Guru Real-data DFTs, Guru Real-to-real Transforms, Guru Complex DFTs, Guru Interface
1373 @subsection Guru Real-data DFTs
1374
1375 @example
1376 fftw_plan fftw_plan_guru_dft_r2c(
1377 int rank, const fftw_iodim *dims,
1378 int howmany_rank, const fftw_iodim *howmany_dims,
1379 double *in, fftw_complex *out,
1380 unsigned flags);
1381
1382 fftw_plan fftw_plan_guru_split_dft_r2c(
1383 int rank, const fftw_iodim *dims,
1384 int howmany_rank, const fftw_iodim *howmany_dims,
1385 double *in, double *ro, double *io,
1386 unsigned flags);
1387
1388 fftw_plan fftw_plan_guru_dft_c2r(
1389 int rank, const fftw_iodim *dims,
1390 int howmany_rank, const fftw_iodim *howmany_dims,
1391 fftw_complex *in, double *out,
1392 unsigned flags);
1393
1394 fftw_plan fftw_plan_guru_split_dft_c2r(
1395 int rank, const fftw_iodim *dims,
1396 int howmany_rank, const fftw_iodim *howmany_dims,
1397 double *ri, double *ii, double *out,
1398 unsigned flags);
1399 @end example
1400 @findex fftw_plan_guru_dft_r2c
1401 @findex fftw_plan_guru_split_dft_r2c
1402 @findex fftw_plan_guru_dft_c2r
1403 @findex fftw_plan_guru_split_dft_c2r
1404
1405 Plan a real-input (r2c) or real-output (c2r), multi-dimensional DFT with
1406 transform dimensions given by (@code{rank}, @code{dims}) over a
1407 multi-dimensional vector (loop) of dimensions (@code{howmany_rank},
1408 @code{howmany_dims}). @code{dims} and @code{howmany_dims} should point
1409 to @code{fftw_iodim} arrays of length @code{rank} and
1410 @code{howmany_rank}, respectively. As for the basic and advanced
1411 interfaces, an r2c transform is @code{FFTW_FORWARD} and a c2r transform
1412 is @code{FFTW_BACKWARD}.
1413
1414 The @emph{last} dimension of @code{dims} is interpreted specially:
1415 that dimension of the real array has size @code{dims[rank-1].n}, but
1416 that dimension of the complex array has size @code{dims[rank-1].n/2+1}
1417 (division rounded down). The strides, on the other hand, are taken to
1418 be exactly as specified. It is up to the user to specify the strides
1419 appropriately for the peculiar dimensions of the data, and we do not
1420 guarantee that the planner will succeed (return non-@code{NULL}) for
1421 any dimensions other than those described in @ref{Real-data DFT Array
1422 Format} and generalized in @ref{Advanced Real-data DFTs}. (That is,
1423 for an in-place transform, each individual dimension should be able to
1424 operate in place.)
1425 @cindex in-place
1426
1427
1428 @code{in} and @code{out} point to the input and output arrays for r2c
1429 and c2r transforms, respectively. For split arrays, @code{ri} and
1430 @code{ii} point to the real and imaginary input arrays for a c2r
1431 transform, and @code{ro} and @code{io} point to the real and imaginary
1432 output arrays for an r2c transform. @code{in} and @code{ro} or
1433 @code{ri} and @code{out} may be the same, indicating an in-place
1434 transform. (In-place transforms where @code{in} and @code{io} or
1435 @code{ii} and @code{out} are the same are not currently supported.)
1436
1437 @cindex flags
1438 @code{flags} is a bitwise OR (@samp{|}) of zero or more planner flags,
1439 as defined in @ref{Planner Flags}.
1440
1441 In-place transforms of rank greater than 1 are currently only
1442 supported for interleaved arrays. For split arrays, the planner will
1443 return @code{NULL}.
1444 @cindex in-place
1445
1446 @c =========>
1447 @node Guru Real-to-real Transforms, 64-bit Guru Interface, Guru Real-data DFTs, Guru Interface
1448 @subsection Guru Real-to-real Transforms
1449
1450 @example
1451 fftw_plan fftw_plan_guru_r2r(int rank, const fftw_iodim *dims,
1452 int howmany_rank,
1453 const fftw_iodim *howmany_dims,
1454 double *in, double *out,
1455 const fftw_r2r_kind *kind,
1456 unsigned flags);
1457 @end example
1458 @findex fftw_plan_guru_r2r
1459
1460 Plan a real-to-real (r2r) multi-dimensional @code{FFTW_FORWARD}
1461 transform with transform dimensions given by (@code{rank}, @code{dims})
1462 over a multi-dimensional vector (loop) of dimensions
1463 (@code{howmany_rank}, @code{howmany_dims}). @code{dims} and
1464 @code{howmany_dims} should point to @code{fftw_iodim} arrays of length
1465 @code{rank} and @code{howmany_rank}, respectively.
1466
1467 The transform kind of each dimension is given by the @code{kind}
1468 parameter, which should point to an array of length @code{rank}. Valid
1469 @code{fftw_r2r_kind} constants are given in @ref{Real-to-Real Transform
1470 Kinds}.
1471
1472 @code{in} and @code{out} point to the real input and output arrays; they
1473 may be the same, indicating an in-place transform.
1474
1475 @cindex flags
1476 @code{flags} is a bitwise OR (@samp{|}) of zero or more planner flags,
1477 as defined in @ref{Planner Flags}.
1478
1479 @c =========>
1480 @node 64-bit Guru Interface, , Guru Real-to-real Transforms, Guru Interface
1481 @subsection 64-bit Guru Interface
1482 @cindex 64-bit architecture
1483
1484 When compiled in 64-bit mode on a 64-bit architecture (where addresses
1485 are 64 bits wide), FFTW uses 64-bit quantities internally for all
1486 transform sizes, strides, and so on---you don't have to do anything
1487 special to exploit this. However, in the ordinary FFTW interfaces,
1488 you specify the transform size by an @code{int} quantity, which is
1489 normally only 32 bits wide. This means that, even though FFTW is
1490 using 64-bit sizes internally, you cannot specify a single transform
1491 dimension larger than
1492 @ifinfo
1493 2^31-1
1494 @end ifinfo
1495 @html
1496 2<sup><small>31</small></sup>&minus;1
1497 @end html
1498 @tex
1499 $2^{31}-1$
1500 @end tex
1501 numbers.
1502
1503 We expect that few users will require transforms larger than this, but,
1504 for those who do, we provide a 64-bit version of the guru interface in
1505 which all sizes are specified as integers of type @code{ptrdiff_t}
1506 instead of @code{int}. (@code{ptrdiff_t} is a signed integer type
1507 defined by the C standard to be wide enough to represent address
1508 differences, and thus must be at least 64 bits wide on a 64-bit
1509 machine.) We stress that there is @emph{no performance advantage} to
1510 using this interface---the same internal FFTW code is employed
1511 regardless---and it is only necessary if you want to specify very
1512 large transform sizes.
1513 @tindex ptrdiff_t
1514
1515
1516 In particular, the 64-bit guru interface is a set of planner routines
1517 that are exactly the same as the guru planner routines, except that
1518 they are named with @samp{guru64} instead of @samp{guru} and they take
1519 arguments of type @code{fftw_iodim64} instead of @code{fftw_iodim}.
1520 For example, instead of @code{fftw_plan_guru_dft}, we have
1521 @code{fftw_plan_guru64_dft}.
1522
1523 @example
1524 fftw_plan fftw_plan_guru64_dft(
1525 int rank, const fftw_iodim64 *dims,
1526 int howmany_rank, const fftw_iodim64 *howmany_dims,
1527 fftw_complex *in, fftw_complex *out,
1528 int sign, unsigned flags);
1529 @end example
1530 @findex fftw_plan_guru64_dft
1531
1532 The @code{fftw_iodim64} type is similar to @code{fftw_iodim}, with the
1533 same interpretation, except that it uses type @code{ptrdiff_t} instead
1534 of type @code{int}.
1535
1536 @example
1537 typedef struct @{
1538 ptrdiff_t n;
1539 ptrdiff_t is;
1540 ptrdiff_t os;
1541 @} fftw_iodim64;
1542 @end example
1543 @tindex fftw_iodim64
1544
1545 Every other @samp{fftw_plan_guru} function also has a
1546 @samp{fftw_plan_guru64} equivalent, but we do not repeat their
1547 documentation here since they are identical to the 32-bit versions
1548 except as noted above.
1549
1550 @c -----------------------------------------------------------
1551 @node New-array Execute Functions, Wisdom, Guru Interface, FFTW Reference
1552 @section New-array Execute Functions
1553 @cindex execute
1554 @cindex new-array execution
1555
1556 Normally, one executes a plan for the arrays with which the plan was
1557 created, by calling @code{fftw_execute(plan)} as described in @ref{Using
1558 Plans}.
1559 @findex fftw_execute
1560 However, it is possible for sophisticated users to apply a given plan
1561 to a @emph{different} array using the ``new-array execute'' functions
1562 detailed below, provided that the following conditions are met:
1563
1564 @itemize @bullet
1565
1566 @item
1567 The array size, strides, etcetera are the same (since those are set by
1568 the plan).
1569
1570 @item
1571 The input and output arrays are the same (in-place) or different
1572 (out-of-place) if the plan was originally created to be in-place or
1573 out-of-place, respectively.
1574
1575 @item
1576 For split arrays, the separations between the real and imaginary
1577 parts, @code{ii-ri} and @code{io-ro}, are the same as they were for
1578 the input and output arrays when the plan was created. (This
1579 condition is automatically satisfied for interleaved arrays.)
1580
1581 @item
1582 The @dfn{alignment} of the new input/output arrays is the same as that
1583 of the input/output arrays when the plan was created, unless the plan
1584 was created with the @code{FFTW_UNALIGNED} flag.
1585 @ctindex FFTW_UNALIGNED
1586 Here, the alignment is a platform-dependent quantity (for example, it is
1587 the address modulo 16 if SSE SIMD instructions are used, but the address
1588 modulo 4 for non-SIMD single-precision FFTW on the same machine). In
1589 general, only arrays allocated with @code{fftw_malloc} are guaranteed to
1590 be equally aligned (@pxref{SIMD alignment and fftw_malloc}).
1591
1592 @end itemize
1593
1594 @cindex alignment
1595 The alignment issue is especially critical, because if you don't use
1596 @code{fftw_malloc} then you may have little control over the alignment
1597 of arrays in memory. For example, neither the C++ @code{new} function
1598 nor the Fortran @code{allocate} statement provide strong enough
1599 guarantees about data alignment. If you don't use @code{fftw_malloc},
1600 therefore, you probably have to use @code{FFTW_UNALIGNED} (which
1601 disables most SIMD support). If possible, it is probably better for
1602 you to simply create multiple plans (creating a new plan is quick once
1603 one exists for a given size), or better yet re-use the same array for
1604 your transforms.
1605
1606 @findex fftw_alignment_of
1607 For rare circumstances in which you cannot control the alignment of
1608 allocated memory, but wish to determine where a given array is
1609 aligned like the original array for which a plan was created, you can
1610 use the @code{fftw_alignment_of} function:
1611 @example
1612 int fftw_alignment_of(double *p);
1613 @end example
1614 Two arrays have equivalent alignment (for the purposes of applying a
1615 plan) if and only if @code{fftw_alignment_of} returns the same value
1616 for the corresponding pointers to their data (typecast to @code{double*}
1617 if necessary).
1618
1619 If you are tempted to use the new-array execute interface because you
1620 want to transform a known bunch of arrays of the same size, you should
1621 probably go use the advanced interface instead (@pxref{Advanced
1622 Interface})).
1623
1624 The new-array execute functions are:
1625
1626 @example
1627 void fftw_execute_dft(
1628 const fftw_plan p,
1629 fftw_complex *in, fftw_complex *out);
1630
1631 void fftw_execute_split_dft(
1632 const fftw_plan p,
1633 double *ri, double *ii, double *ro, double *io);
1634
1635 void fftw_execute_dft_r2c(
1636 const fftw_plan p,
1637 double *in, fftw_complex *out);
1638
1639 void fftw_execute_split_dft_r2c(
1640 const fftw_plan p,
1641 double *in, double *ro, double *io);
1642
1643 void fftw_execute_dft_c2r(
1644 const fftw_plan p,
1645 fftw_complex *in, double *out);
1646
1647 void fftw_execute_split_dft_c2r(
1648 const fftw_plan p,
1649 double *ri, double *ii, double *out);
1650
1651 void fftw_execute_r2r(
1652 const fftw_plan p,
1653 double *in, double *out);
1654 @end example
1655 @findex fftw_execute_dft
1656 @findex fftw_execute_split_dft
1657 @findex fftw_execute_dft_r2c
1658 @findex fftw_execute_split_dft_r2c
1659 @findex fftw_execute_dft_c2r
1660 @findex fftw_execute_split_dft_c2r
1661 @findex fftw_execute_r2r
1662
1663 These execute the @code{plan} to compute the corresponding transform on
1664 the input/output arrays specified by the subsequent arguments. The
1665 input/output array arguments have the same meanings as the ones passed
1666 to the guru planner routines in the preceding sections. The @code{plan}
1667 is not modified, and these routines can be called as many times as
1668 desired, or intermixed with calls to the ordinary @code{fftw_execute}.
1669
1670 The @code{plan} @emph{must} have been created for the transform type
1671 corresponding to the execute function, e.g. it must be a complex-DFT
1672 plan for @code{fftw_execute_dft}. Any of the planner routines for that
1673 transform type, from the basic to the guru interface, could have been
1674 used to create the plan, however.
1675
1676 @c ------------------------------------------------------------
1677 @node Wisdom, What FFTW Really Computes, New-array Execute Functions, FFTW Reference
1678 @section Wisdom
1679 @cindex wisdom
1680 @cindex saving plans to disk
1681
1682 This section documents the FFTW mechanism for saving and restoring
1683 plans from disk. This mechanism is called @dfn{wisdom}.
1684
1685 @menu
1686 * Wisdom Export::
1687 * Wisdom Import::
1688 * Forgetting Wisdom::
1689 * Wisdom Utilities::
1690 @end menu
1691
1692 @c =========>
1693 @node Wisdom Export, Wisdom Import, Wisdom, Wisdom
1694 @subsection Wisdom Export
1695
1696 @example
1697 int fftw_export_wisdom_to_filename(const char *filename);
1698 void fftw_export_wisdom_to_file(FILE *output_file);
1699 char *fftw_export_wisdom_to_string(void);
1700 void fftw_export_wisdom(void (*write_char)(char c, void *), void *data);
1701 @end example
1702 @findex fftw_export_wisdom
1703 @findex fftw_export_wisdom_to_filename
1704 @findex fftw_export_wisdom_to_file
1705 @findex fftw_export_wisdom_to_string
1706
1707 These functions allow you to export all currently accumulated wisdom
1708 in a form from which it can be later imported and restored, even
1709 during a separate run of the program. (@xref{Words of Wisdom-Saving
1710 Plans}.) The current store of wisdom is not affected by calling any
1711 of these routines.
1712
1713 @code{fftw_export_wisdom} exports the wisdom to any output
1714 medium, as specified by the callback function
1715 @code{write_char}. @code{write_char} is a @code{putc}-like function that
1716 writes the character @code{c} to some output; its second parameter is
1717 the @code{data} pointer passed to @code{fftw_export_wisdom}. For
1718 convenience, the following three ``wrapper'' routines are provided:
1719
1720 @code{fftw_export_wisdom_to_filename} writes wisdom to a file named
1721 @code{filename} (which is created or overwritten), returning @code{1}
1722 on success and @code{0} on failure. A lower-level function, which
1723 requires you to open and close the file yourself (e.g. if you want to
1724 write wisdom to a portion of a larger file) is
1725 @code{fftw_export_wisdom_to_file}. This writes the wisdom to the
1726 current position in @code{output_file}, which should be open with
1727 write permission; upon exit, the file remains open and is positioned
1728 at the end of the wisdom data.
1729
1730 @code{fftw_export_wisdom_to_string} returns a pointer to a
1731 @code{NULL}-terminated string holding the wisdom data. This string is
1732 dynamically allocated, and it is the responsibility of the caller to
1733 deallocate it with @code{free} when it is no longer needed.
1734
1735 All of these routines export the wisdom in the same format, which we
1736 will not document here except to say that it is LISP-like ASCII text
1737 that is insensitive to white space.
1738
1739 @c =========>
1740 @node Wisdom Import, Forgetting Wisdom, Wisdom Export, Wisdom
1741 @subsection Wisdom Import
1742
1743 @example
1744 int fftw_import_system_wisdom(void);
1745 int fftw_import_wisdom_from_filename(const char *filename);
1746 int fftw_import_wisdom_from_string(const char *input_string);
1747 int fftw_import_wisdom(int (*read_char)(void *), void *data);
1748 @end example
1749 @findex fftw_import_wisdom
1750 @findex fftw_import_system_wisdom
1751 @findex fftw_import_wisdom_from_filename
1752 @findex fftw_import_wisdom_from_file
1753 @findex fftw_import_wisdom_from_string
1754
1755 These functions import wisdom into a program from data stored by the
1756 @code{fftw_export_wisdom} functions above. (@xref{Words of
1757 Wisdom-Saving Plans}.) The imported wisdom replaces any wisdom
1758 already accumulated by the running program.
1759
1760 @code{fftw_import_wisdom} imports wisdom from any input medium, as
1761 specified by the callback function @code{read_char}. @code{read_char} is
1762 a @code{getc}-like function that returns the next character in the
1763 input; its parameter is the @code{data} pointer passed to
1764 @code{fftw_import_wisdom}. If the end of the input data is reached
1765 (which should never happen for valid data), @code{read_char} should
1766 return @code{EOF} (as defined in @code{<stdio.h>}). For convenience,
1767 the following three ``wrapper'' routines are provided:
1768
1769 @code{fftw_import_wisdom_from_filename} reads wisdom from a file named
1770 @code{filename}. A lower-level function, which requires you to open
1771 and close the file yourself (e.g. if you want to read wisdom from a
1772 portion of a larger file) is @code{fftw_import_wisdom_from_file}. This
1773 reads wisdom from the current position in @code{input_file} (which
1774 should be open with read permission); upon exit, the file remains
1775 open, but the position of the read pointer is unspecified.
1776
1777 @code{fftw_import_wisdom_from_string} reads wisdom from the
1778 @code{NULL}-terminated string @code{input_string}.
1779
1780 @code{fftw_import_system_wisdom} reads wisdom from an
1781 implementation-defined standard file (@code{/etc/fftw/wisdom} on Unix
1782 and GNU systems).
1783 @cindex wisdom, system-wide
1784
1785
1786 The return value of these import routines is @code{1} if the wisdom was
1787 read successfully and @code{0} otherwise. Note that, in all of these
1788 functions, any data in the input stream past the end of the wisdom data
1789 is simply ignored.
1790
1791 @c =========>
1792 @node Forgetting Wisdom, Wisdom Utilities, Wisdom Import, Wisdom
1793 @subsection Forgetting Wisdom
1794
1795 @example
1796 void fftw_forget_wisdom(void);
1797 @end example
1798 @findex fftw_forget_wisdom
1799
1800 Calling @code{fftw_forget_wisdom} causes all accumulated @code{wisdom}
1801 to be discarded and its associated memory to be freed. (New
1802 @code{wisdom} can still be gathered subsequently, however.)
1803
1804 @c =========>
1805 @node Wisdom Utilities, , Forgetting Wisdom, Wisdom
1806 @subsection Wisdom Utilities
1807
1808 FFTW includes two standalone utility programs that deal with wisdom. We
1809 merely summarize them here, since they come with their own @code{man}
1810 pages for Unix and GNU systems (with HTML versions on our web site).
1811
1812 The first program is @code{fftw-wisdom} (or @code{fftwf-wisdom} in
1813 single precision, etcetera), which can be used to create a wisdom file
1814 containing plans for any of the transform sizes and types supported by
1815 FFTW. It is preferable to create wisdom directly from your executable
1816 (@pxref{Caveats in Using Wisdom}), but this program is useful for
1817 creating global wisdom files for @code{fftw_import_system_wisdom}.
1818 @cindex fftw-wisdom utility
1819
1820
1821 The second program is @code{fftw-wisdom-to-conf}, which takes a wisdom
1822 file as input and produces a @dfn{configuration routine} as output. The
1823 latter is a C subroutine that you can compile and link into your
1824 program, replacing a routine of the same name in the FFTW library, that
1825 determines which parts of FFTW are callable by your program.
1826 @code{fftw-wisdom-to-conf} produces a configuration routine that links
1827 to only those parts of FFTW needed by the saved plans in the wisdom,
1828 greatly reducing the size of statically linked executables (which should
1829 only attempt to create plans corresponding to those in the wisdom,
1830 however).
1831 @cindex fftw-wisdom-to-conf utility
1832 @cindex configuration routines
1833
1834 @c ------------------------------------------------------------
1835 @node What FFTW Really Computes, , Wisdom, FFTW Reference
1836 @section What FFTW Really Computes
1837
1838 In this section, we provide precise mathematical definitions for the
1839 transforms that FFTW computes. These transform definitions are fairly
1840 standard, but some authors follow slightly different conventions for the
1841 normalization of the transform (the constant factor in front) and the
1842 sign of the complex exponent. We begin by presenting the
1843 one-dimensional (1d) transform definitions, and then give the
1844 straightforward extension to multi-dimensional transforms.
1845
1846 @menu
1847 * The 1d Discrete Fourier Transform (DFT)::
1848 * The 1d Real-data DFT::
1849 * 1d Real-even DFTs (DCTs)::
1850 * 1d Real-odd DFTs (DSTs)::
1851 * 1d Discrete Hartley Transforms (DHTs)::
1852 * Multi-dimensional Transforms::
1853 @end menu
1854
1855 @c =========>
1856 @node The 1d Discrete Fourier Transform (DFT), The 1d Real-data DFT, What FFTW Really Computes, What FFTW Really Computes
1857 @subsection The 1d Discrete Fourier Transform (DFT)
1858
1859 @cindex discrete Fourier transform
1860 @cindex DFT
1861 The forward (@code{FFTW_FORWARD}) discrete Fourier transform (DFT) of a
1862 1d complex array @math{X} of size @math{n} computes an array @math{Y},
1863 where:
1864 @tex
1865 $$
1866 Y_k = \sum_{j = 0}^{n - 1} X_j e^{-2\pi j k \sqrt{-1}/n} \ .
1867 $$
1868 @end tex
1869 @ifinfo
1870 @center Y[k] = sum for j = 0 to (n - 1) of X[j] * exp(-2 pi j k sqrt(-1)/n) .
1871 @end ifinfo
1872 @html
1873 <center><img src="equation-dft.png" align="top">.</center>
1874 @end html
1875 The backward (@code{FFTW_BACKWARD}) DFT computes:
1876 @tex
1877 $$
1878 Y_k = \sum_{j = 0}^{n - 1} X_j e^{2\pi j k \sqrt{-1}/n} \ .
1879 $$
1880 @end tex
1881 @ifinfo
1882 @center Y[k] = sum for j = 0 to (n - 1) of X[j] * exp(2 pi j k sqrt(-1)/n) .
1883 @end ifinfo
1884 @html
1885 <center><img src="equation-idft.png" align="top">.</center>
1886 @end html
1887
1888 @cindex normalization
1889 FFTW computes an unnormalized transform, in that there is no coefficient
1890 in front of the summation in the DFT. In other words, applying the
1891 forward and then the backward transform will multiply the input by
1892 @math{n}.
1893
1894 @cindex frequency
1895 From above, an @code{FFTW_FORWARD} transform corresponds to a sign of
1896 @math{-1} in the exponent of the DFT. Note also that we use the
1897 standard ``in-order'' output ordering---the @math{k}-th output
1898 corresponds to the frequency @math{k/n} (or @math{k/T}, where @math{T}
1899 is your total sampling period). For those who like to think in terms of
1900 positive and negative frequencies, this means that the positive
1901 frequencies are stored in the first half of the output and the negative
1902 frequencies are stored in backwards order in the second half of the
1903 output. (The frequency @math{-k/n} is the same as the frequency
1904 @math{(n-k)/n}.)
1905
1906 @c =========>
1907 @node The 1d Real-data DFT, 1d Real-even DFTs (DCTs), The 1d Discrete Fourier Transform (DFT), What FFTW Really Computes
1908 @subsection The 1d Real-data DFT
1909
1910 The real-input (r2c) DFT in FFTW computes the @emph{forward} transform
1911 @math{Y} of the size @code{n} real array @math{X}, exactly as defined
1912 above, i.e.
1913 @tex
1914 $$
1915 Y_k = \sum_{j = 0}^{n - 1} X_j e^{-2\pi j k \sqrt{-1}/n} \ .
1916 $$
1917 @end tex
1918 @ifinfo
1919 @center Y[k] = sum for j = 0 to (n - 1) of X[j] * exp(-2 pi j k sqrt(-1)/n) .
1920 @end ifinfo
1921 @html
1922 <center><img src="equation-dft.png" align="top">.</center>
1923 @end html
1924 This output array @math{Y} can easily be shown to possess the
1925 ``Hermitian'' symmetry
1926 @cindex Hermitian
1927 @tex
1928 $Y_k = Y_{n-k}^*$,
1929 @end tex
1930 @ifinfo
1931 Y[k] = Y[n-k]*,
1932 @end ifinfo
1933 @html
1934 <i>Y<sub>k</sub> = Y<sub>n-k</sub></i><sup>*</sup>,
1935 @end html
1936 where we take @math{Y} to be periodic so that
1937 @tex
1938 $Y_n = Y_0$.
1939 @end tex
1940 @ifinfo
1941 Y[n] = Y[0].
1942 @end ifinfo
1943 @html
1944 <i>Y<sub>n</sub> = Y</i><sub>0</sub>.
1945 @end html
1946
1947 As a result of this symmetry, half of the output @math{Y} is redundant
1948 (being the complex conjugate of the other half), and so the 1d r2c
1949 transforms only output elements @math{0}@dots{}@math{n/2} of @math{Y}
1950 (@math{n/2+1} complex numbers), where the division by @math{2} is
1951 rounded down.
1952
1953 Moreover, the Hermitian symmetry implies that
1954 @tex
1955 $Y_0$
1956 @end tex
1957 @ifinfo
1958 Y[0]
1959 @end ifinfo
1960 @html
1961 <i>Y</i><sub>0</sub>
1962 @end html
1963 and, if @math{n} is even, the
1964 @tex
1965 $Y_{n/2}$
1966 @end tex
1967 @ifinfo
1968 Y[n/2]
1969 @end ifinfo
1970 @html
1971 <i>Y</i><sub><i>n</i>/2</sub>
1972 @end html
1973 element, are purely real. So, for the @code{R2HC} r2r transform, the
1974 halfcomplex format does not store the imaginary parts of these elements.
1975 @cindex r2r
1976 @ctindex R2HC
1977 @cindex halfcomplex format
1978
1979
1980 The c2r and @code{H2RC} r2r transforms compute the backward DFT of the
1981 @emph{complex} array @math{X} with Hermitian symmetry, stored in the
1982 r2c/@code{R2HC} output formats, respectively, where the backward
1983 transform is defined exactly as for the complex case:
1984 @tex
1985 $$
1986 Y_k = \sum_{j = 0}^{n - 1} X_j e^{2\pi j k \sqrt{-1}/n} \ .
1987 $$
1988 @end tex
1989 @ifinfo
1990 @center Y[k] = sum for j = 0 to (n - 1) of X[j] * exp(2 pi j k sqrt(-1)/n) .
1991 @end ifinfo
1992 @html
1993 <center><img src="equation-idft.png" align="top">.</center>
1994 @end html
1995 The outputs @code{Y} of this transform can easily be seen to be purely
1996 real, and are stored as an array of real numbers.
1997
1998 @cindex normalization
1999 Like FFTW's complex DFT, these transforms are unnormalized. In other
2000 words, applying the real-to-complex (forward) and then the
2001 complex-to-real (backward) transform will multiply the input by
2002 @math{n}.
2003
2004 @c =========>
2005 @node 1d Real-even DFTs (DCTs), 1d Real-odd DFTs (DSTs), The 1d Real-data DFT, What FFTW Really Computes
2006 @subsection 1d Real-even DFTs (DCTs)
2007
2008 The Real-even symmetry DFTs in FFTW are exactly equivalent to the unnormalized
2009 forward (and backward) DFTs as defined above, where the input array
2010 @math{X} of length @math{N} is purely real and is also @dfn{even} symmetry. In
2011 this case, the output array is likewise real and even symmetry.
2012 @cindex real-even DFT
2013 @cindex REDFT
2014
2015
2016 @ctindex REDFT00
2017 For the case of @code{REDFT00}, this even symmetry means that
2018 @tex
2019 $X_j = X_{N-j}$,
2020 @end tex
2021 @ifinfo
2022 X[j] = X[N-j],
2023 @end ifinfo
2024 @html
2025 <i>X<sub>j</sub> = X<sub>N-j</sub></i>,
2026 @end html
2027 where we take @math{X} to be periodic so that
2028 @tex
2029 $X_N = X_0$.
2030 @end tex
2031 @ifinfo
2032 X[N] = X[0].
2033 @end ifinfo
2034 @html
2035 <i>X<sub>N</sub> = X</i><sub>0</sub>.
2036 @end html
2037 Because of this redundancy, only the first @math{n} real numbers are
2038 actually stored, where @math{N = 2(n-1)}.
2039
2040 The proper definition of even symmetry for @code{REDFT10},
2041 @code{REDFT01}, and @code{REDFT11} transforms is somewhat more intricate
2042 because of the shifts by @math{1/2} of the input and/or output, although
2043 the corresponding boundary conditions are given in @ref{Real even/odd
2044 DFTs (cosine/sine transforms)}. Because of the even symmetry, however,
2045 the sine terms in the DFT all cancel and the remaining cosine terms are
2046 written explicitly below. This formulation often leads people to call
2047 such a transform a @dfn{discrete cosine transform} (DCT), although it is
2048 really just a special case of the DFT.
2049 @cindex discrete cosine transform
2050 @cindex DCT
2051
2052
2053 In each of the definitions below, we transform a real array @math{X} of
2054 length @math{n} to a real array @math{Y} of length @math{n}:
2055
2056 @subsubheading REDFT00 (DCT-I)
2057 @ctindex REDFT00
2058 An @code{REDFT00} transform (type-I DCT) in FFTW is defined by:
2059 @tex
2060 $$
2061 Y_k = X_0 + (-1)^k X_{n-1}
2062 + 2 \sum_{j=1}^{n-2} X_j \cos [ \pi j k / (n-1)].
2063 $$
2064 @end tex
2065 @ifinfo
2066 Y[k] = X[0] + (-1)^k X[n-1] + 2 (sum for j = 1 to n-2 of X[j] cos(pi jk /(n-1))).
2067 @end ifinfo
2068 @html
2069 <center><img src="equation-redft00.png" align="top">.</center>
2070 @end html
2071 Note that this transform is not defined for @math{n=1}. For @math{n=2},
2072 the summation term above is dropped as you might expect.
2073
2074 @subsubheading REDFT10 (DCT-II)
2075 @ctindex REDFT10
2076 An @code{REDFT10} transform (type-II DCT, sometimes called ``the'' DCT) in FFTW is defined by:
2077 @tex
2078 $$
2079 Y_k = 2 \sum_{j=0}^{n-1} X_j \cos [\pi (j+1/2) k / n].
2080 $$
2081 @end tex
2082 @ifinfo
2083 Y[k] = 2 (sum for j = 0 to n-1 of X[j] cos(pi (j+1/2) k / n)).
2084 @end ifinfo
2085 @html
2086 <center><img src="equation-redft10.png" align="top">.</center>
2087 @end html
2088
2089 @subsubheading REDFT01 (DCT-III)
2090 @ctindex REDFT01
2091 An @code{REDFT01} transform (type-III DCT) in FFTW is defined by:
2092 @tex
2093 $$
2094 Y_k = X_0 + 2 \sum_{j=1}^{n-1} X_j \cos [\pi j (k+1/2) / n].
2095 $$
2096 @end tex
2097 @ifinfo
2098 Y[k] = X[0] + 2 (sum for j = 1 to n-1 of X[j] cos(pi j (k+1/2) / n)).
2099 @end ifinfo
2100 @html
2101 <center><img src="equation-redft01.png" align="top">.</center>
2102 @end html
2103 In the case of @math{n=1}, this reduces to
2104 @tex
2105 $Y_0 = X_0$.
2106 @end tex
2107 @ifinfo
2108 Y[0] = X[0].
2109 @end ifinfo
2110 @html
2111 <i>Y</i><sub>0</sub> = <i>X</i><sub>0</sub>.
2112 @end html
2113 Up to a scale factor (see below), this is the inverse of @code{REDFT10} (``the'' DCT), and so the @code{REDFT01} (DCT-III) is sometimes called the ``IDCT''.
2114 @cindex IDCT
2115
2116 @subsubheading REDFT11 (DCT-IV)
2117 @ctindex REDFT11
2118 An @code{REDFT11} transform (type-IV DCT) in FFTW is defined by:
2119 @tex
2120 $$
2121 Y_k = 2 \sum_{j=0}^{n-1} X_j \cos [\pi (j+1/2) (k+1/2) / n].
2122 $$
2123 @end tex
2124 @ifinfo
2125 Y[k] = 2 (sum for j = 0 to n-1 of X[j] cos(pi (j+1/2) (k+1/2) / n)).
2126 @end ifinfo
2127 @html
2128 <center><img src="equation-redft11.png" align="top">.</center>
2129 @end html
2130
2131 @subsubheading Inverses and Normalization
2132
2133 These definitions correspond directly to the unnormalized DFTs used
2134 elsewhere in FFTW (hence the factors of @math{2} in front of the
2135 summations). The unnormalized inverse of @code{REDFT00} is
2136 @code{REDFT00}, of @code{REDFT10} is @code{REDFT01} and vice versa, and
2137 of @code{REDFT11} is @code{REDFT11}. Each unnormalized inverse results
2138 in the original array multiplied by @math{N}, where @math{N} is the
2139 @emph{logical} DFT size. For @code{REDFT00}, @math{N=2(n-1)} (note that
2140 @math{n=1} is not defined); otherwise, @math{N=2n}.
2141 @cindex normalization
2142
2143
2144 In defining the discrete cosine transform, some authors also include
2145 additional factors of
2146 @ifinfo
2147 sqrt(2)
2148 @end ifinfo
2149 @html
2150 &radic;2
2151 @end html
2152 @tex
2153 $\sqrt{2}$
2154 @end tex
2155 (or its inverse) multiplying selected inputs and/or outputs. This is a
2156 mostly cosmetic change that makes the transform orthogonal, but
2157 sacrifices the direct equivalence to a symmetric DFT.
2158
2159 @c =========>
2160 @node 1d Real-odd DFTs (DSTs), 1d Discrete Hartley Transforms (DHTs), 1d Real-even DFTs (DCTs), What FFTW Really Computes
2161 @subsection 1d Real-odd DFTs (DSTs)
2162
2163 The Real-odd symmetry DFTs in FFTW are exactly equivalent to the unnormalized
2164 forward (and backward) DFTs as defined above, where the input array
2165 @math{X} of length @math{N} is purely real and is also @dfn{odd} symmetry. In
2166 this case, the output is odd symmetry and purely imaginary.
2167 @cindex real-odd DFT
2168 @cindex RODFT
2169
2170
2171 @ctindex RODFT00
2172 For the case of @code{RODFT00}, this odd symmetry means that
2173 @tex
2174 $X_j = -X_{N-j}$,
2175 @end tex
2176 @ifinfo
2177 X[j] = -X[N-j],
2178 @end ifinfo
2179 @html
2180 <i>X<sub>j</sub> = -X<sub>N-j</sub></i>,
2181 @end html
2182 where we take @math{X} to be periodic so that
2183 @tex
2184 $X_N = X_0$.
2185 @end tex
2186 @ifinfo
2187 X[N] = X[0].
2188 @end ifinfo
2189 @html
2190 <i>X<sub>N</sub> = X</i><sub>0</sub>.
2191 @end html
2192 Because of this redundancy, only the first @math{n} real numbers
2193 starting at @math{j=1} are actually stored (the @math{j=0} element is
2194 zero), where @math{N = 2(n+1)}.
2195
2196 The proper definition of odd symmetry for @code{RODFT10},
2197 @code{RODFT01}, and @code{RODFT11} transforms is somewhat more intricate
2198 because of the shifts by @math{1/2} of the input and/or output, although
2199 the corresponding boundary conditions are given in @ref{Real even/odd
2200 DFTs (cosine/sine transforms)}. Because of the odd symmetry, however,
2201 the cosine terms in the DFT all cancel and the remaining sine terms are
2202 written explicitly below. This formulation often leads people to call
2203 such a transform a @dfn{discrete sine transform} (DST), although it is
2204 really just a special case of the DFT.
2205 @cindex discrete sine transform
2206 @cindex DST
2207
2208
2209 In each of the definitions below, we transform a real array @math{X} of
2210 length @math{n} to a real array @math{Y} of length @math{n}:
2211
2212 @subsubheading RODFT00 (DST-I)
2213 @ctindex RODFT00
2214 An @code{RODFT00} transform (type-I DST) in FFTW is defined by:
2215 @tex
2216 $$
2217 Y_k = 2 \sum_{j=0}^{n-1} X_j \sin [ \pi (j+1) (k+1) / (n+1)].
2218 $$
2219 @end tex
2220 @ifinfo
2221 Y[k] = 2 (sum for j = 0 to n-1 of X[j] sin(pi (j+1)(k+1) / (n+1))).
2222 @end ifinfo
2223 @html
2224 <center><img src="equation-rodft00.png" align="top">.</center>
2225 @end html
2226
2227 @subsubheading RODFT10 (DST-II)
2228 @ctindex RODFT10
2229 An @code{RODFT10} transform (type-II DST) in FFTW is defined by:
2230 @tex
2231 $$
2232 Y_k = 2 \sum_{j=0}^{n-1} X_j \sin [\pi (j+1/2) (k+1) / n].
2233 $$
2234 @end tex
2235 @ifinfo
2236 Y[k] = 2 (sum for j = 0 to n-1 of X[j] sin(pi (j+1/2) (k+1) / n)).
2237 @end ifinfo
2238 @html
2239 <center><img src="equation-rodft10.png" align="top">.</center>
2240 @end html
2241
2242 @subsubheading RODFT01 (DST-III)
2243 @ctindex RODFT01
2244 An @code{RODFT01} transform (type-III DST) in FFTW is defined by:
2245 @tex
2246 $$
2247 Y_k = (-1)^k X_{n-1} + 2 \sum_{j=0}^{n-2} X_j \sin [\pi (j+1) (k+1/2) / n].
2248 $$
2249 @end tex
2250 @ifinfo
2251 Y[k] = (-1)^k X[n-1] + 2 (sum for j = 0 to n-2 of X[j] sin(pi (j+1) (k+1/2) / n)).
2252 @end ifinfo
2253 @html
2254 <center><img src="equation-rodft01.png" align="top">.</center>
2255 @end html
2256 In the case of @math{n=1}, this reduces to
2257 @tex
2258 $Y_0 = X_0$.
2259 @end tex
2260 @ifinfo
2261 Y[0] = X[0].
2262 @end ifinfo
2263 @html
2264 <i>Y</i><sub>0</sub> = <i>X</i><sub>0</sub>.
2265 @end html
2266
2267 @subsubheading RODFT11 (DST-IV)
2268 @ctindex RODFT11
2269 An @code{RODFT11} transform (type-IV DST) in FFTW is defined by:
2270 @tex
2271 $$
2272 Y_k = 2 \sum_{j=0}^{n-1} X_j \sin [\pi (j+1/2) (k+1/2) / n].
2273 $$
2274 @end tex
2275 @ifinfo
2276 Y[k] = 2 (sum for j = 0 to n-1 of X[j] sin(pi (j+1/2) (k+1/2) / n)).
2277 @end ifinfo
2278 @html
2279 <center><img src="equation-rodft11.png" align="top">.</center>
2280 @end html
2281
2282 @subsubheading Inverses and Normalization
2283
2284 These definitions correspond directly to the unnormalized DFTs used
2285 elsewhere in FFTW (hence the factors of @math{2} in front of the
2286 summations). The unnormalized inverse of @code{RODFT00} is
2287 @code{RODFT00}, of @code{RODFT10} is @code{RODFT01} and vice versa, and
2288 of @code{RODFT11} is @code{RODFT11}. Each unnormalized inverse results
2289 in the original array multiplied by @math{N}, where @math{N} is the
2290 @emph{logical} DFT size. For @code{RODFT00}, @math{N=2(n+1)};
2291 otherwise, @math{N=2n}.
2292 @cindex normalization
2293
2294
2295 In defining the discrete sine transform, some authors also include
2296 additional factors of
2297 @ifinfo
2298 sqrt(2)
2299 @end ifinfo
2300 @html
2301 &radic;2
2302 @end html
2303 @tex
2304 $\sqrt{2}$
2305 @end tex
2306 (or its inverse) multiplying selected inputs and/or outputs. This is a
2307 mostly cosmetic change that makes the transform orthogonal, but
2308 sacrifices the direct equivalence to an antisymmetric DFT.
2309
2310 @c =========>
2311 @node 1d Discrete Hartley Transforms (DHTs), Multi-dimensional Transforms, 1d Real-odd DFTs (DSTs), What FFTW Really Computes
2312 @subsection 1d Discrete Hartley Transforms (DHTs)
2313
2314 @cindex discrete Hartley transform
2315 @cindex DHT
2316 The discrete Hartley transform (DHT) of a 1d real array @math{X} of size
2317 @math{n} computes a real array @math{Y} of the same size, where:
2318 @tex
2319 $$
2320 Y_k = \sum_{j = 0}^{n - 1} X_j [ \cos(2\pi j k / n) + \sin(2\pi j k / n)].
2321 $$
2322 @end tex
2323 @ifinfo
2324 @center Y[k] = sum for j = 0 to (n - 1) of X[j] * [cos(2 pi j k / n) + sin(2 pi j k / n)].
2325 @end ifinfo
2326 @html
2327 <center><img src="equation-dht.png" align="top">.</center>
2328 @end html
2329
2330 @cindex normalization
2331 FFTW computes an unnormalized transform, in that there is no coefficient
2332 in front of the summation in the DHT. In other words, applying the
2333 transform twice (the DHT is its own inverse) will multiply the input by
2334 @math{n}.
2335
2336 @c =========>
2337 @node Multi-dimensional Transforms, , 1d Discrete Hartley Transforms (DHTs), What FFTW Really Computes
2338 @subsection Multi-dimensional Transforms
2339
2340 The multi-dimensional transforms of FFTW, in general, compute simply the
2341 separable product of the given 1d transform along each dimension of the
2342 array. Since each of these transforms is unnormalized, computing the
2343 forward followed by the backward/inverse multi-dimensional transform
2344 will result in the original array scaled by the product of the
2345 normalization factors for each dimension (e.g. the product of the
2346 dimension sizes, for a multi-dimensional DFT).
2347
2348 @tex
2349 As an explicit example, consider the following exact mathematical
2350 definition of our multi-dimensional DFT. Let $X$ be a $d$-dimensional
2351 complex array whose elements are $X[j_1, j_2, \ldots, j_d]$, where $0
2352 \leq j_s < n_s$ for all~$s \in \{ 1, 2, \ldots, d \}$. Let also
2353 $\omega_s = e^{2\pi \sqrt{-1}/n_s}$, for all ~$s \in \{ 1, 2, \ldots, d
2354 \}$.
2355
2356 The forward transform computes a complex array~$Y$, whose
2357 structure is the same as that of~$X$, defined by
2358
2359 $$
2360 Y[k_1, k_2, \ldots, k_d] =
2361 \sum_{j_1 = 0}^{n_1 - 1}
2362 \sum_{j_2 = 0}^{n_2 - 1}
2363 \cdots
2364 \sum_{j_d = 0}^{n_d - 1}
2365 X[j_1, j_2, \ldots, j_d]
2366 \omega_1^{-j_1 k_1}
2367 \omega_2^{-j_2 k_2}
2368 \cdots
2369 \omega_d^{-j_d k_d} \ .
2370 $$
2371
2372 The backward transform computes
2373 $$
2374 Y[k_1, k_2, \ldots, k_d] =
2375 \sum_{j_1 = 0}^{n_1 - 1}
2376 \sum_{j_2 = 0}^{n_2 - 1}
2377 \cdots
2378 \sum_{j_d = 0}^{n_d - 1}
2379 X[j_1, j_2, \ldots, j_d]
2380 \omega_1^{j_1 k_1}
2381 \omega_2^{j_2 k_2}
2382 \cdots
2383 \omega_d^{j_d k_d} \ .
2384 $$
2385
2386 Computing the forward transform followed by the backward transform
2387 will multiply the array by $\prod_{s=1}^{d} n_d$.
2388 @end tex
2389
2390 @cindex r2c
2391 The definition of FFTW's multi-dimensional DFT of real data (r2c)
2392 deserves special attention. In this case, we logically compute the full
2393 multi-dimensional DFT of the input data; since the input data are purely
2394 real, the output data have the Hermitian symmetry and therefore only one
2395 non-redundant half need be stored. More specifically, for an @ndims multi-dimensional real-input DFT, the full (logical) complex output array
2396 @tex
2397 $Y[k_0, k_1, \ldots, k_{d-1}]$
2398 @end tex
2399 @html
2400 <i>Y</i>[<i>k</i><sub>0</sub>, <i>k</i><sub>1</sub>, ...,
2401 <i>k</i><sub><i>d-1</i></sub>]
2402 @end html
2403 @ifinfo
2404 Y[k[0], k[1], ..., k[d-1]]
2405 @end ifinfo
2406 has the symmetry:
2407 @tex
2408 $$
2409 Y[k_0, k_1, \ldots, k_{d-1}] = Y[n_0 - k_0, n_1 - k_1, \ldots, n_{d-1} - k_{d-1}]^*
2410 $$
2411 @end tex
2412 @html
2413 <i>Y</i>[<i>k</i><sub>0</sub>, <i>k</i><sub>1</sub>, ...,
2414 <i>k</i><sub><i>d-1</i></sub>] = <i>Y</i>[<i>n</i><sub>0</sub> -
2415 <i>k</i><sub>0</sub>, <i>n</i><sub>1</sub> - <i>k</i><sub>1</sub>, ...,
2416 <i>n</i><sub><i>d-1</i></sub> - <i>k</i><sub><i>d-1</i></sub>]<sup>*</sup>
2417 @end html
2418 @ifinfo
2419 Y[k[0], k[1], ..., k[d-1]] = Y[n[0] - k[0], n[1] - k[1], ..., n[d-1] - k[d-1]]*
2420 @end ifinfo
2421 (where each dimension is periodic). Because of this symmetry, we only
2422 store the
2423 @tex
2424 $k_{d-1} = 0 \cdots n_{d-1}/2$
2425 @end tex
2426 @html
2427 <i>k</i><sub><i>d-1</i></sub> = 0...<i>n</i><sub><i>d-1</i></sub>/2+1
2428 @end html
2429 @ifinfo
2430 k[d-1] = 0...n[d-1]/2
2431 @end ifinfo
2432 elements of the @emph{last} dimension (division by @math{2} is rounded
2433 down). (We could instead have cut any other dimension in half, but the
2434 last dimension proved computationally convenient.) This results in the
2435 peculiar array format described in more detail by @ref{Real-data DFT
2436 Array Format}.
2437
2438 The multi-dimensional c2r transform is simply the unnormalized inverse
2439 of the r2c transform. i.e. it is the same as FFTW's complex backward
2440 multi-dimensional DFT, operating on a Hermitian input array in the
2441 peculiar format mentioned above and outputting a real array (since the
2442 DFT output is purely real).
2443
2444 We should remind the user that the separable product of 1d transforms
2445 along each dimension, as computed by FFTW, is not always the same thing
2446 as the usual multi-dimensional transform. A multi-dimensional
2447 @code{R2HC} (or @code{HC2R}) transform is not identical to the
2448 multi-dimensional DFT, requiring some post-processing to combine the
2449 requisite real and imaginary parts, as was described in @ref{The
2450 Halfcomplex-format DFT}. Likewise, FFTW's multidimensional
2451 @code{FFTW_DHT} r2r transform is not the same thing as the logical
2452 multi-dimensional discrete Hartley transform defined in the literature,
2453 as discussed in @ref{The Discrete Hartley Transform}.
2454