comparison src/fftw-3.3.3/doc/reference.texi @ 10:37bf6b4a2645

Add FFTW3
author Chris Cannam
date Wed, 20 Mar 2013 15:35:50 +0000
parents
children
comparison
equal deleted inserted replaced
9:c0fb53affa76 10:37bf6b4a2645
1 @node FFTW Reference, Multi-threaded FFTW, Other Important Topics, Top
2 @chapter FFTW Reference
3
4 This chapter provides a complete reference for all sequential (i.e.,
5 one-processor) FFTW functions. Parallel transforms are described in
6 later chapters.
7
8 @menu
9 * Data Types and Files::
10 * Using Plans::
11 * Basic Interface::
12 * Advanced Interface::
13 * Guru Interface::
14 * New-array Execute Functions::
15 * Wisdom::
16 * What FFTW Really Computes::
17 @end menu
18
19 @c ------------------------------------------------------------
20 @node Data Types and Files, Using Plans, FFTW Reference, FFTW Reference
21 @section Data Types and Files
22
23 All programs using FFTW should include its header file:
24
25 @example
26 #include <fftw3.h>
27 @end example
28
29 You must also link to the FFTW library. On Unix, this
30 means adding @code{-lfftw3 -lm} at the @emph{end} of the link command.
31
32 @menu
33 * Complex numbers::
34 * Precision::
35 * Memory Allocation::
36 @end menu
37
38 @c =========>
39 @node Complex numbers, Precision, Data Types and Files, Data Types and Files
40 @subsection Complex numbers
41
42 The default FFTW interface uses @code{double} precision for all
43 floating-point numbers, and defines a @code{fftw_complex} type to hold
44 complex numbers as:
45
46 @example
47 typedef double fftw_complex[2];
48 @end example
49 @tindex fftw_complex
50
51 Here, the @code{[0]} element holds the real part and the @code{[1]}
52 element holds the imaginary part.
53
54 Alternatively, if you have a C compiler (such as @code{gcc}) that
55 supports the C99 revision of the ANSI C standard, you can use C's new
56 native complex type (which is binary-compatible with the typedef above).
57 In particular, if you @code{#include <complex.h>} @emph{before}
58 @code{<fftw3.h>}, then @code{fftw_complex} is defined to be the native
59 complex type and you can manipulate it with ordinary arithmetic
60 (e.g. @code{x = y * (3+4*I)}, where @code{x} and @code{y} are
61 @code{fftw_complex} and @code{I} is the standard symbol for the
62 imaginary unit);
63 @cindex C99
64
65
66 C++ has its own @code{complex<T>} template class, defined in the
67 standard @code{<complex>} header file. Reportedly, the C++ standards
68 committee has recently agreed to mandate that the storage format used
69 for this type be binary-compatible with the C99 type, i.e. an array
70 @code{T[2]} with consecutive real @code{[0]} and imaginary @code{[1]}
71 parts. (See report
72 @uref{http://www.open-std.org/jtc1/sc22/WG21/docs/papers/2002/n1388.pdf
73 WG21/N1388}.) Although not part of the official standard as of this
74 writing, the proposal stated that: ``This solution has been tested with
75 all current major implementations of the standard library and shown to
76 be working.'' To the extent that this is true, if you have a variable
77 @code{complex<double> *x}, you can pass it directly to FFTW via
78 @code{reinterpret_cast<fftw_complex*>(x)}.
79 @cindex C++
80 @cindex portability
81
82 @c =========>
83 @node Precision, Memory Allocation, Complex numbers, Data Types and Files
84 @subsection Precision
85 @cindex precision
86
87 You can install single and long-double precision versions of FFTW,
88 which replace @code{double} with @code{float} and @code{long double},
89 respectively (@pxref{Installation and Customization}). To use these
90 interfaces, you:
91
92 @itemize @bullet
93
94 @item
95 Link to the single/long-double libraries; on Unix, @code{-lfftw3f} or
96 @code{-lfftw3l} instead of (or in addition to) @code{-lfftw3}. (You
97 can link to the different-precision libraries simultaneously.)
98
99 @item
100 Include the @emph{same} @code{<fftw3.h>} header file.
101
102 @item
103 Replace all lowercase instances of @samp{fftw_} with @samp{fftwf_} or
104 @samp{fftwl_} for single or long-double precision, respectively.
105 (@code{fftw_complex} becomes @code{fftwf_complex}, @code{fftw_execute}
106 becomes @code{fftwf_execute}, etcetera.)
107
108 @item
109 Uppercase names, i.e. names beginning with @samp{FFTW_}, remain the
110 same.
111
112 @item
113 Replace @code{double} with @code{float} or @code{long double} for
114 subroutine parameters.
115
116 @end itemize
117
118 Depending upon your compiler and/or hardware, @code{long double} may not
119 be any more precise than @code{double} (or may not be supported at all,
120 although it is standard in C99).
121 @cindex C99
122
123
124 We also support using the nonstandard @code{__float128}
125 quadruple-precision type provided by recent versions of @code{gcc} on
126 32- and 64-bit x86 hardware (@pxref{Installation and Customization}).
127 To use this type, link with @code{-lfftw3q -lquadmath -lm} (the
128 @code{libquadmath} library provided by @code{gcc} is needed for
129 quadruple-precision trigonometric functions) and use @samp{fftwq_}
130 identifiers.
131
132 @c =========>
133 @node Memory Allocation, , Precision, Data Types and Files
134 @subsection Memory Allocation
135
136 @example
137 void *fftw_malloc(size_t n);
138 void fftw_free(void *p);
139 @end example
140 @findex fftw_malloc
141 @findex fftw_free
142
143 These are functions that behave identically to @code{malloc} and
144 @code{free}, except that they guarantee that the returned pointer obeys
145 any special alignment restrictions imposed by any algorithm in FFTW
146 (e.g. for SIMD acceleration). @xref{SIMD alignment and fftw_malloc}.
147 @cindex alignment
148
149
150 Data allocated by @code{fftw_malloc} @emph{must} be deallocated by
151 @code{fftw_free} and not by the ordinary @code{free}.
152
153 These routines simply call through to your operating system's
154 @code{malloc} or, if necessary, its aligned equivalent
155 (e.g. @code{memalign}), so you normally need not worry about any
156 significant time or space overhead. You are @emph{not required} to use
157 them to allocate your data, but we strongly recommend it.
158
159 Note: in C++, just as with ordinary @code{malloc}, you must typecast
160 the output of @code{fftw_malloc} to whatever pointer type you are
161 allocating.
162 @cindex C++
163
164
165 We also provide the following two convenience functions to allocate
166 real and complex arrays with @code{n} elements, which are equivalent
167 to @code{(double *) fftw_malloc(sizeof(double) * n)} and
168 @code{(fftw_complex *) fftw_malloc(sizeof(fftw_complex) * n)},
169 respectively:
170
171 @example
172 double *fftw_alloc_real(size_t n);
173 fftw_complex *fftw_alloc_complex(size_t n);
174 @end example
175 @findex fftw_alloc_real
176 @findex fftw_alloc_complex
177
178 The equivalent functions in other precisions allocate arrays of @code{n}
179 elements in that precision. e.g. @code{fftwf_alloc_real(n)} is
180 equivalent to @code{(float *) fftwf_malloc(sizeof(float) * n)}.
181 @cindex precision
182
183 @c ------------------------------------------------------------
184 @node Using Plans, Basic Interface, Data Types and Files, FFTW Reference
185 @section Using Plans
186
187 Plans for all transform types in FFTW are stored as type
188 @code{fftw_plan} (an opaque pointer type), and are created by one of the
189 various planning routines described in the following sections.
190 @tindex fftw_plan
191 An @code{fftw_plan} contains all information necessary to compute the
192 transform, including the pointers to the input and output arrays.
193
194 @example
195 void fftw_execute(const fftw_plan plan);
196 @end example
197 @findex fftw_execute
198
199 This executes the @code{plan}, to compute the corresponding transform on
200 the arrays for which it was planned (which must still exist). The plan
201 is not modified, and @code{fftw_execute} can be called as many times as
202 desired.
203
204 To apply a given plan to a different array, you can use the new-array execute
205 interface. @xref{New-array Execute Functions}.
206
207 @code{fftw_execute} (and equivalents) is the only function in FFTW
208 guaranteed to be thread-safe; see @ref{Thread safety}.
209
210 This function:
211 @example
212 void fftw_destroy_plan(fftw_plan plan);
213 @end example
214 @findex fftw_destroy_plan
215 deallocates the @code{plan} and all its associated data.
216
217 FFTW's planner saves some other persistent data, such as the
218 accumulated wisdom and a list of algorithms available in the current
219 configuration. If you want to deallocate all of that and reset FFTW
220 to the pristine state it was in when you started your program, you can
221 call:
222
223 @example
224 void fftw_cleanup(void);
225 @end example
226 @findex fftw_cleanup
227
228 After calling @code{fftw_cleanup}, all existing plans become undefined,
229 and you should not attempt to execute them nor to destroy them. You can
230 however create and execute/destroy new plans, in which case FFTW starts
231 accumulating wisdom information again.
232
233 @code{fftw_cleanup} does not deallocate your plans, however. To prevent
234 memory leaks, you must still call @code{fftw_destroy_plan} before
235 executing @code{fftw_cleanup}.
236
237 Occasionally, it may useful to know FFTW's internal ``cost'' metric
238 that it uses to compare plans to one another; this cost is
239 proportional to an execution time of the plan, in undocumented units,
240 if the plan was created with the @code{FFTW_MEASURE} or other
241 timing-based options, or alternatively is a heuristic cost function
242 for @code{FFTW_ESTIMATE} plans. (The cost values of measured and
243 estimated plans are not comparable, being in different units. Also,
244 costs from different FFTW versions or the same version compiled
245 differently may not be in the same units. Plans created from wisdom
246 have a cost of 0 since no timing measurement is performed for them.
247 Finally, certain problems for which only one top-level algorithm was
248 possible may have required no measurements of the cost of the whole
249 plan, in which case @code{fftw_cost} will also return 0.) The cost
250 metric for a given plan is returned by:
251
252 @example
253 double fftw_cost(const fftw_plan plan);
254 @end example
255 @findex fftw_cost
256
257 The following two routines are provided purely for academic purposes
258 (that is, for entertainment).
259
260 @example
261 void fftw_flops(const fftw_plan plan,
262 double *add, double *mul, double *fma);
263 @end example
264 @findex fftw_flops
265
266 Given a @code{plan}, set @code{add}, @code{mul}, and @code{fma} to an
267 exact count of the number of floating-point additions, multiplications,
268 and fused multiply-add operations involved in the plan's execution. The
269 total number of floating-point operations (flops) is @code{add + mul +
270 2*fma}, or @code{add + mul + fma} if the hardware supports fused
271 multiply-add instructions (although the number of FMA operations is only
272 approximate because of compiler voodoo). (The number of operations
273 should be an integer, but we use @code{double} to avoid overflowing
274 @code{int} for large transforms; the arguments are of type @code{double}
275 even for single and long-double precision versions of FFTW.)
276
277 @example
278 void fftw_fprint_plan(const fftw_plan plan, FILE *output_file);
279 void fftw_print_plan(const fftw_plan plan);
280 @end example
281 @findex fftw_fprint_plan
282 @findex fftw_print_plan
283
284 This outputs a ``nerd-readable'' representation of the @code{plan} to
285 the given file or to @code{stdout}, respectively.
286
287 @c ------------------------------------------------------------
288 @node Basic Interface, Advanced Interface, Using Plans, FFTW Reference
289 @section Basic Interface
290 @cindex basic interface
291
292 Recall that the FFTW API is divided into three parts@footnote{@i{Gallia est
293 omnis divisa in partes tres} (Julius Caesar).}: the @dfn{basic interface}
294 computes a single transform of contiguous data, the @dfn{advanced
295 interface} computes transforms of multiple or strided arrays, and the
296 @dfn{guru interface} supports the most general data layouts,
297 multiplicities, and strides. This section describes the the basic
298 interface, which we expect to satisfy the needs of most users.
299
300 @menu
301 * Complex DFTs::
302 * Planner Flags::
303 * Real-data DFTs::
304 * Real-data DFT Array Format::
305 * Real-to-Real Transforms::
306 * Real-to-Real Transform Kinds::
307 @end menu
308
309 @c =========>
310 @node Complex DFTs, Planner Flags, Basic Interface, Basic Interface
311 @subsection Complex DFTs
312
313 @example
314 fftw_plan fftw_plan_dft_1d(int n0,
315 fftw_complex *in, fftw_complex *out,
316 int sign, unsigned flags);
317 fftw_plan fftw_plan_dft_2d(int n0, int n1,
318 fftw_complex *in, fftw_complex *out,
319 int sign, unsigned flags);
320 fftw_plan fftw_plan_dft_3d(int n0, int n1, int n2,
321 fftw_complex *in, fftw_complex *out,
322 int sign, unsigned flags);
323 fftw_plan fftw_plan_dft(int rank, const int *n,
324 fftw_complex *in, fftw_complex *out,
325 int sign, unsigned flags);
326 @end example
327 @findex fftw_plan_dft_1d
328 @findex fftw_plan_dft_2d
329 @findex fftw_plan_dft_3d
330 @findex fftw_plan_dft
331
332 Plan a complex input/output discrete Fourier transform (DFT) in zero or
333 more dimensions, returning an @code{fftw_plan} (@pxref{Using Plans}).
334
335 Once you have created a plan for a certain transform type and
336 parameters, then creating another plan of the same type and parameters,
337 but for different arrays, is fast and shares constant data with the
338 first plan (if it still exists).
339
340 The planner returns @code{NULL} if the plan cannot be created. In the
341 standard FFTW distribution, the basic interface is guaranteed to return
342 a non-@code{NULL} plan. A plan may be @code{NULL}, however, if you are
343 using a customized FFTW configuration supporting a restricted set of
344 transforms.
345
346 @subsubheading Arguments
347 @itemize @bullet
348
349 @item
350 @code{rank} is the rank of the transform (it should be the size of the
351 array @code{*n}), and can be any non-negative integer. (@xref{Complex
352 Multi-Dimensional DFTs}, for the definition of ``rank''.) The
353 @samp{_1d}, @samp{_2d}, and @samp{_3d} planners correspond to a
354 @code{rank} of @code{1}, @code{2}, and @code{3}, respectively. The rank
355 may be zero, which is equivalent to a rank-1 transform of size 1, i.e. a
356 copy of one number from input to output.
357
358 @item
359 @code{n0}, @code{n1}, @code{n2}, or @code{n[0..rank-1]} (as appropriate
360 for each routine) specify the size of the transform dimensions. They
361 can be any positive integer.
362
363 @itemize @minus
364 @item
365 @cindex row-major
366 Multi-dimensional arrays are stored in row-major order with dimensions:
367 @code{n0} x @code{n1}; or @code{n0} x @code{n1} x @code{n2}; or
368 @code{n[0]} x @code{n[1]} x ... x @code{n[rank-1]}.
369 @xref{Multi-dimensional Array Format}.
370 @item
371 FFTW is best at handling sizes of the form
372 @ifinfo
373 @math{2^a 3^b 5^c 7^d 11^e 13^f},
374 @end ifinfo
375 @tex
376 $2^a 3^b 5^c 7^d 11^e 13^f$,
377 @end tex
378 @html
379 2<sup>a</sup> 3<sup>b</sup> 5<sup>c</sup> 7<sup>d</sup>
380 11<sup>e</sup> 13<sup>f</sup>,
381 @end html
382 where @math{e+f} is either @math{0} or @math{1}, and the other exponents
383 are arbitrary. Other sizes are computed by means of a slow,
384 general-purpose algorithm (which nevertheless retains @Onlogn{} performance even for prime sizes). It is possible to customize FFTW
385 for different array sizes; see @ref{Installation and Customization}.
386 Transforms whose sizes are powers of @math{2} are especially fast.
387 @end itemize
388
389 @item
390 @code{in} and @code{out} point to the input and output arrays of the
391 transform, which may be the same (yielding an in-place transform).
392 @cindex in-place
393 These arrays are overwritten during planning, unless
394 @code{FFTW_ESTIMATE} is used in the flags. (The arrays need not be
395 initialized, but they must be allocated.)
396
397 If @code{in == out}, the transform is @dfn{in-place} and the input
398 array is overwritten. If @code{in != out}, the two arrays must
399 not overlap (but FFTW does not check for this condition).
400
401 @item
402 @ctindex FFTW_FORWARD
403 @ctindex FFTW_BACKWARD
404 @code{sign} is the sign of the exponent in the formula that defines the
405 Fourier transform. It can be @math{-1} (= @code{FFTW_FORWARD}) or
406 @math{+1} (= @code{FFTW_BACKWARD}).
407
408 @item
409 @cindex flags
410 @code{flags} is a bitwise OR (@samp{|}) of zero or more planner flags,
411 as defined in @ref{Planner Flags}.
412
413 @end itemize
414
415 FFTW computes an unnormalized transform: computing a forward followed by
416 a backward transform (or vice versa) will result in the original data
417 multiplied by the size of the transform (the product of the dimensions).
418 @cindex normalization
419 For more information, see @ref{What FFTW Really Computes}.
420
421 @c =========>
422 @node Planner Flags, Real-data DFTs, Complex DFTs, Basic Interface
423 @subsection Planner Flags
424
425 All of the planner routines in FFTW accept an integer @code{flags}
426 argument, which is a bitwise OR (@samp{|}) of zero or more of the flag
427 constants defined below. These flags control the rigor (and time) of
428 the planning process, and can also impose (or lift) restrictions on the
429 type of transform algorithm that is employed.
430
431 @emph{Important:} the planner overwrites the input array during
432 planning unless a saved plan (@pxref{Wisdom}) is available for that
433 problem, so you should initialize your input data after creating the
434 plan. The only exceptions to this are the @code{FFTW_ESTIMATE} and
435 @code{FFTW_WISDOM_ONLY} flags, as mentioned below.
436
437 In all cases, if wisdom is available for the given problem that was
438 created with equal-or-greater planning rigor, then the more rigorous
439 wisdom is used. For example, in @code{FFTW_ESTIMATE} mode any available
440 wisdom is used, whereas in @code{FFTW_PATIENT} mode only wisdom created
441 in patient or exhaustive mode can be used. @xref{Words of Wisdom-Saving
442 Plans}.
443
444 @subsubheading Planning-rigor flags
445 @itemize @bullet
446
447 @item
448 @ctindex FFTW_ESTIMATE
449 @code{FFTW_ESTIMATE} specifies that, instead of actual measurements of
450 different algorithms, a simple heuristic is used to pick a (probably
451 sub-optimal) plan quickly. With this flag, the input/output arrays are
452 not overwritten during planning.
453
454 @item
455 @ctindex FFTW_MEASURE
456 @code{FFTW_MEASURE} tells FFTW to find an optimized plan by actually
457 @emph{computing} several FFTs and measuring their execution time.
458 Depending on your machine, this can take some time (often a few
459 seconds). @code{FFTW_MEASURE} is the default planning option.
460
461 @item
462 @ctindex FFTW_PATIENT
463 @code{FFTW_PATIENT} is like @code{FFTW_MEASURE}, but considers a wider
464 range of algorithms and often produces a ``more optimal'' plan
465 (especially for large transforms), but at the expense of several times
466 longer planning time (especially for large transforms).
467
468 @item
469 @ctindex FFTW_EXHAUSTIVE
470 @code{FFTW_EXHAUSTIVE} is like @code{FFTW_PATIENT}, but considers an
471 even wider range of algorithms, including many that we think are
472 unlikely to be fast, to produce the most optimal plan but with a
473 substantially increased planning time.
474
475 @item
476 @ctindex FFTW_WISDOM_ONLY
477 @code{FFTW_WISDOM_ONLY} is a special planning mode in which the plan
478 is only created if wisdom is available for the given problem, and
479 otherwise a @code{NULL} plan is returned. This can be combined with
480 other flags, e.g. @samp{FFTW_WISDOM_ONLY | FFTW_PATIENT} creates a
481 plan only if wisdom is available that was created in
482 @code{FFTW_PATIENT} or @code{FFTW_EXHAUSTIVE} mode. The
483 @code{FFTW_WISDOM_ONLY} flag is intended for users who need to detect
484 whether wisdom is available; for example, if wisdom is not available
485 one may wish to allocate new arrays for planning so that user data is
486 not overwritten.
487
488 @end itemize
489
490 @subsubheading Algorithm-restriction flags
491 @itemize @bullet
492
493 @item
494 @ctindex FFTW_DESTROY_INPUT
495 @code{FFTW_DESTROY_INPUT} specifies that an out-of-place transform is
496 allowed to @emph{overwrite its input} array with arbitrary data; this
497 can sometimes allow more efficient algorithms to be employed.
498 @cindex out-of-place
499
500 @item
501 @ctindex FFTW_PRESERVE_INPUT
502 @code{FFTW_PRESERVE_INPUT} specifies that an out-of-place transform must
503 @emph{not change its input} array. This is ordinarily the
504 @emph{default}, except for c2r and hc2r (i.e. complex-to-real)
505 transforms for which @code{FFTW_DESTROY_INPUT} is the default. In the
506 latter cases, passing @code{FFTW_PRESERVE_INPUT} will attempt to use
507 algorithms that do not destroy the input, at the expense of worse
508 performance; for multi-dimensional c2r transforms, however, no
509 input-preserving algorithms are implemented and the planner will return
510 @code{NULL} if one is requested.
511 @cindex c2r
512 @cindex hc2r
513
514 @item
515 @ctindex FFTW_UNALIGNED
516 @cindex alignment
517 @code{FFTW_UNALIGNED} specifies that the algorithm may not impose any
518 unusual alignment requirements on the input/output arrays (i.e. no
519 SIMD may be used). This flag is normally @emph{not necessary}, since
520 the planner automatically detects misaligned arrays. The only use for
521 this flag is if you want to use the new-array execute interface to
522 execute a given plan on a different array that may not be aligned like
523 the original. (Using @code{fftw_malloc} makes this flag unnecessary
524 even then.)
525
526 @end itemize
527
528 @subsubheading Limiting planning time
529
530 @example
531 extern void fftw_set_timelimit(double seconds);
532 @end example
533 @findex fftw_set_timelimit
534
535 This function instructs FFTW to spend at most @code{seconds} seconds
536 (approximately) in the planner. If @code{seconds ==
537 FFTW_NO_TIMELIMIT} (the default value, which is negative), then
538 planning time is unbounded. Otherwise, FFTW plans with a
539 progressively wider range of algorithms until the the given time limit
540 is reached or the given range of algorithms is explored, returning the
541 best available plan.
542 @ctindex FFTW_NO_TIMELIMIT
543
544
545 For example, specifying @code{FFTW_PATIENT} first plans in
546 @code{FFTW_ESTIMATE} mode, then in @code{FFTW_MEASURE} mode, then
547 finally (time permitting) in @code{FFTW_PATIENT}. If
548 @code{FFTW_EXHAUSTIVE} is specified instead, the planner will further
549 progress to @code{FFTW_EXHAUSTIVE} mode.
550
551 Note that the @code{seconds} argument specifies only a rough limit; in
552 practice, the planner may use somewhat more time if the time limit is
553 reached when the planner is in the middle of an operation that cannot
554 be interrupted. At the very least, the planner will complete planning
555 in @code{FFTW_ESTIMATE} mode (which is thus equivalent to a time limit
556 of 0).
557
558
559 @c =========>
560 @node Real-data DFTs, Real-data DFT Array Format, Planner Flags, Basic Interface
561 @subsection Real-data DFTs
562
563 @example
564 fftw_plan fftw_plan_dft_r2c_1d(int n0,
565 double *in, fftw_complex *out,
566 unsigned flags);
567 fftw_plan fftw_plan_dft_r2c_2d(int n0, int n1,
568 double *in, fftw_complex *out,
569 unsigned flags);
570 fftw_plan fftw_plan_dft_r2c_3d(int n0, int n1, int n2,
571 double *in, fftw_complex *out,
572 unsigned flags);
573 fftw_plan fftw_plan_dft_r2c(int rank, const int *n,
574 double *in, fftw_complex *out,
575 unsigned flags);
576 @end example
577 @findex fftw_plan_dft_r2c_1d
578 @findex fftw_plan_dft_r2c_2d
579 @findex fftw_plan_dft_r2c_3d
580 @findex fftw_plan_dft_r2c
581 @cindex r2c
582
583 Plan a real-input/complex-output discrete Fourier transform (DFT) in
584 zero or more dimensions, returning an @code{fftw_plan} (@pxref{Using
585 Plans}).
586
587 Once you have created a plan for a certain transform type and
588 parameters, then creating another plan of the same type and parameters,
589 but for different arrays, is fast and shares constant data with the
590 first plan (if it still exists).
591
592 The planner returns @code{NULL} if the plan cannot be created. A
593 non-@code{NULL} plan is always returned by the basic interface unless
594 you are using a customized FFTW configuration supporting a restricted
595 set of transforms, or if you use the @code{FFTW_PRESERVE_INPUT} flag
596 with a multi-dimensional out-of-place c2r transform (see below).
597
598 @subsubheading Arguments
599 @itemize @bullet
600
601 @item
602 @code{rank} is the rank of the transform (it should be the size of the
603 array @code{*n}), and can be any non-negative integer. (@xref{Complex
604 Multi-Dimensional DFTs}, for the definition of ``rank''.) The
605 @samp{_1d}, @samp{_2d}, and @samp{_3d} planners correspond to a
606 @code{rank} of @code{1}, @code{2}, and @code{3}, respectively. The rank
607 may be zero, which is equivalent to a rank-1 transform of size 1, i.e. a
608 copy of one real number (with zero imaginary part) from input to output.
609
610 @item
611 @code{n0}, @code{n1}, @code{n2}, or @code{n[0..rank-1]}, (as appropriate
612 for each routine) specify the size of the transform dimensions. They
613 can be any positive integer. This is different in general from the
614 @emph{physical} array dimensions, which are described in @ref{Real-data
615 DFT Array Format}.
616
617 @itemize @minus
618 @item
619 FFTW is best at handling sizes of the form
620 @ifinfo
621 @math{2^a 3^b 5^c 7^d 11^e 13^f},
622 @end ifinfo
623 @tex
624 $2^a 3^b 5^c 7^d 11^e 13^f$,
625 @end tex
626 @html
627 2<sup>a</sup> 3<sup>b</sup> 5<sup>c</sup> 7<sup>d</sup>
628 11<sup>e</sup> 13<sup>f</sup>,
629 @end html
630 where @math{e+f} is either @math{0} or @math{1}, and the other exponents
631 are arbitrary. Other sizes are computed by means of a slow,
632 general-purpose algorithm (which nevertheless retains @Onlogn{} performance even for prime sizes). (It is possible to customize FFTW
633 for different array sizes; see @ref{Installation and Customization}.)
634 Transforms whose sizes are powers of @math{2} are especially fast, and
635 it is generally beneficial for the @emph{last} dimension of an r2c/c2r
636 transform to be @emph{even}.
637 @end itemize
638
639 @item
640 @code{in} and @code{out} point to the input and output arrays of the
641 transform, which may be the same (yielding an in-place transform).
642 @cindex in-place
643 These arrays are overwritten during planning, unless
644 @code{FFTW_ESTIMATE} is used in the flags. (The arrays need not be
645 initialized, but they must be allocated.) For an in-place transform, it
646 is important to remember that the real array will require padding,
647 described in @ref{Real-data DFT Array Format}.
648 @cindex padding
649
650 @item
651 @cindex flags
652 @code{flags} is a bitwise OR (@samp{|}) of zero or more planner flags,
653 as defined in @ref{Planner Flags}.
654
655 @end itemize
656
657 The inverse transforms, taking complex input (storing the non-redundant
658 half of a logically Hermitian array) to real output, are given by:
659
660 @example
661 fftw_plan fftw_plan_dft_c2r_1d(int n0,
662 fftw_complex *in, double *out,
663 unsigned flags);
664 fftw_plan fftw_plan_dft_c2r_2d(int n0, int n1,
665 fftw_complex *in, double *out,
666 unsigned flags);
667 fftw_plan fftw_plan_dft_c2r_3d(int n0, int n1, int n2,
668 fftw_complex *in, double *out,
669 unsigned flags);
670 fftw_plan fftw_plan_dft_c2r(int rank, const int *n,
671 fftw_complex *in, double *out,
672 unsigned flags);
673 @end example
674 @findex fftw_plan_dft_c2r_1d
675 @findex fftw_plan_dft_c2r_2d
676 @findex fftw_plan_dft_c2r_3d
677 @findex fftw_plan_dft_c2r
678 @cindex c2r
679
680 The arguments are the same as for the r2c transforms, except that the
681 input and output data formats are reversed.
682
683 FFTW computes an unnormalized transform: computing an r2c followed by a
684 c2r transform (or vice versa) will result in the original data
685 multiplied by the size of the transform (the product of the logical
686 dimensions).
687 @cindex normalization
688 An r2c transform produces the same output as a @code{FFTW_FORWARD}
689 complex DFT of the same input, and a c2r transform is correspondingly
690 equivalent to @code{FFTW_BACKWARD}. For more information, see @ref{What
691 FFTW Really Computes}.
692
693 @c =========>
694 @node Real-data DFT Array Format, Real-to-Real Transforms, Real-data DFTs, Basic Interface
695 @subsection Real-data DFT Array Format
696 @cindex r2c/c2r multi-dimensional array format
697
698 The output of a DFT of real data (r2c) contains symmetries that, in
699 principle, make half of the outputs redundant (@pxref{What FFTW Really
700 Computes}). (Similarly for the input of an inverse c2r transform.) In
701 practice, it is not possible to entirely realize these savings in an
702 efficient and understandable format that generalizes to
703 multi-dimensional transforms. Instead, the output of the r2c
704 transforms is @emph{slightly} over half of the output of the
705 corresponding complex transform. We do not ``pack'' the data in any
706 way, but store it as an ordinary array of @code{fftw_complex} values.
707 In fact, this data is simply a subsection of what would be the array in
708 the corresponding complex transform.
709
710 Specifically, for a real transform of @math{d} (= @code{rank})
711 dimensions @ndims{}, the complex data is an @ndimshalf array of
712 @code{fftw_complex} values in row-major order (with the division rounded
713 down). That is, we only store the @emph{lower} half (non-negative
714 frequencies), plus one element, of the last dimension of the data from
715 the ordinary complex transform. (We could have instead taken half of
716 any other dimension, but implementation turns out to be simpler if the
717 last, contiguous, dimension is used.)
718
719 @cindex out-of-place
720 For an out-of-place transform, the real data is simply an array with
721 physical dimensions @ndims in row-major order.
722
723 @cindex in-place
724 @cindex padding
725 For an in-place transform, some complications arise since the complex data
726 is slightly larger than the real data. In this case, the final
727 dimension of the real data must be @emph{padded} with extra values to
728 accommodate the size of the complex data---two extra if the last
729 dimension is even and one if it is odd. That is, the last dimension of
730 the real data must physically contain
731 @tex
732 $2 (n_{d-1}/2+1)$
733 @end tex
734 @ifinfo
735 2 * (n[d-1]/2+1)
736 @end ifinfo
737 @html
738 2 * (n<sub>d-1</sub>/2+1)
739 @end html
740 @code{double} values (exactly enough to hold the complex data). This
741 physical array size does not, however, change the @emph{logical} array
742 size---only
743 @tex
744 $n_{d-1}$
745 @end tex
746 @ifinfo
747 n[d-1]
748 @end ifinfo
749 @html
750 n<sub>d-1</sub>
751 @end html
752 values are actually stored in the last dimension, and
753 @tex
754 $n_{d-1}$
755 @end tex
756 @ifinfo
757 n[d-1]
758 @end ifinfo
759 @html
760 n<sub>d-1</sub>
761 @end html
762 is the last dimension passed to the planner.
763
764 @c =========>
765 @node Real-to-Real Transforms, Real-to-Real Transform Kinds, Real-data DFT Array Format, Basic Interface
766 @subsection Real-to-Real Transforms
767 @cindex r2r
768
769 @example
770 fftw_plan fftw_plan_r2r_1d(int n, double *in, double *out,
771 fftw_r2r_kind kind, unsigned flags);
772 fftw_plan fftw_plan_r2r_2d(int n0, int n1, double *in, double *out,
773 fftw_r2r_kind kind0, fftw_r2r_kind kind1,
774 unsigned flags);
775 fftw_plan fftw_plan_r2r_3d(int n0, int n1, int n2,
776 double *in, double *out,
777 fftw_r2r_kind kind0,
778 fftw_r2r_kind kind1,
779 fftw_r2r_kind kind2,
780 unsigned flags);
781 fftw_plan fftw_plan_r2r(int rank, const int *n, double *in, double *out,
782 const fftw_r2r_kind *kind, unsigned flags);
783 @end example
784 @findex fftw_plan_r2r_1d
785 @findex fftw_plan_r2r_2d
786 @findex fftw_plan_r2r_3d
787 @findex fftw_plan_r2r
788
789 Plan a real input/output (r2r) transform of various kinds in zero or
790 more dimensions, returning an @code{fftw_plan} (@pxref{Using Plans}).
791
792 Once you have created a plan for a certain transform type and
793 parameters, then creating another plan of the same type and parameters,
794 but for different arrays, is fast and shares constant data with the
795 first plan (if it still exists).
796
797 The planner returns @code{NULL} if the plan cannot be created. A
798 non-@code{NULL} plan is always returned by the basic interface unless
799 you are using a customized FFTW configuration supporting a restricted
800 set of transforms, or for size-1 @code{FFTW_REDFT00} kinds (which are
801 not defined).
802 @ctindex FFTW_REDFT00
803
804 @subsubheading Arguments
805 @itemize @bullet
806
807 @item
808 @code{rank} is the dimensionality of the transform (it should be the
809 size of the arrays @code{*n} and @code{*kind}), and can be any
810 non-negative integer. The @samp{_1d}, @samp{_2d}, and @samp{_3d}
811 planners correspond to a @code{rank} of @code{1}, @code{2}, and
812 @code{3}, respectively. A @code{rank} of zero is equivalent to a copy
813 of one number from input to output.
814
815 @item
816 @code{n}, or @code{n0}/@code{n1}/@code{n2}, or @code{n[rank]},
817 respectively, gives the (physical) size of the transform dimensions.
818 They can be any positive integer.
819
820 @itemize @minus
821 @item
822 @cindex row-major
823 Multi-dimensional arrays are stored in row-major order with dimensions:
824 @code{n0} x @code{n1}; or @code{n0} x @code{n1} x @code{n2}; or
825 @code{n[0]} x @code{n[1]} x ... x @code{n[rank-1]}.
826 @xref{Multi-dimensional Array Format}.
827 @item
828 FFTW is generally best at handling sizes of the form
829 @ifinfo
830 @math{2^a 3^b 5^c 7^d 11^e 13^f},
831 @end ifinfo
832 @tex
833 $2^a 3^b 5^c 7^d 11^e 13^f$,
834 @end tex
835 @html
836 2<sup>a</sup> 3<sup>b</sup> 5<sup>c</sup> 7<sup>d</sup>
837 11<sup>e</sup> 13<sup>f</sup>,
838 @end html
839 where @math{e+f} is either @math{0} or @math{1}, and the other exponents
840 are arbitrary. Other sizes are computed by means of a slow,
841 general-purpose algorithm (which nevertheless retains @Onlogn{} performance even for prime sizes). (It is possible to customize FFTW
842 for different array sizes; see @ref{Installation and Customization}.)
843 Transforms whose sizes are powers of @math{2} are especially fast.
844 @item
845 For a @code{REDFT00} or @code{RODFT00} transform kind in a dimension of
846 size @math{n}, it is @math{n-1} or @math{n+1}, respectively, that
847 should be factorizable in the above form.
848 @end itemize
849
850 @item
851 @code{in} and @code{out} point to the input and output arrays of the
852 transform, which may be the same (yielding an in-place transform).
853 @cindex in-place
854 These arrays are overwritten during planning, unless
855 @code{FFTW_ESTIMATE} is used in the flags. (The arrays need not be
856 initialized, but they must be allocated.)
857
858 @item
859 @code{kind}, or @code{kind0}/@code{kind1}/@code{kind2}, or
860 @code{kind[rank]}, is the kind of r2r transform used for the
861 corresponding dimension. The valid kind constants are described in
862 @ref{Real-to-Real Transform Kinds}. In a multi-dimensional transform,
863 what is computed is the separable product formed by taking each
864 transform kind along the corresponding dimension, one dimension after
865 another.
866
867 @item
868 @cindex flags
869 @code{flags} is a bitwise OR (@samp{|}) of zero or more planner flags,
870 as defined in @ref{Planner Flags}.
871
872 @end itemize
873
874 @c =========>
875 @node Real-to-Real Transform Kinds, , Real-to-Real Transforms, Basic Interface
876 @subsection Real-to-Real Transform Kinds
877 @cindex kind (r2r)
878
879 FFTW currently supports 11 different r2r transform kinds, specified by
880 one of the constants below. For the precise definitions of these
881 transforms, see @ref{What FFTW Really Computes}. For a more colloquial
882 introduction to these transform kinds, see @ref{More DFTs of Real Data}.
883
884 For dimension of size @code{n}, there is a corresponding ``logical''
885 dimension @code{N} that determines the normalization (and the optimal
886 factorization); the formula for @code{N} is given for each kind below.
887 Also, with each transform kind is listed its corrsponding inverse
888 transform. FFTW computes unnormalized transforms: a transform followed
889 by its inverse will result in the original data multiplied by @code{N}
890 (or the product of the @code{N}'s for each dimension, in
891 multi-dimensions).
892 @cindex normalization
893
894 @itemize @bullet
895
896 @item
897 @ctindex FFTW_R2HC
898 @code{FFTW_R2HC} computes a real-input DFT with output in
899 ``halfcomplex'' format, i.e. real and imaginary parts for a transform of
900 size @code{n} stored as:
901 @tex
902 $$
903 r_0, r_1, r_2, \ldots, r_{n/2}, i_{(n+1)/2-1}, \ldots, i_2, i_1
904 $$
905 @end tex
906 @ifinfo
907 r0, r1, r2, r(n/2), i((n+1)/2-1), ..., i2, i1
908 @end ifinfo
909 @html
910 <p align=center>
911 r<sub>0</sub>, r<sub>1</sub>, r<sub>2</sub>, ..., r<sub>n/2</sub>, i<sub>(n+1)/2-1</sub>, ..., i<sub>2</sub>, i<sub>1</sub>
912 </p>
913 @end html
914 (Logical @code{N=n}, inverse is @code{FFTW_HC2R}.)
915
916 @item
917 @ctindex FFTW_HC2R
918 @code{FFTW_HC2R} computes the reverse of @code{FFTW_R2HC}, above.
919 (Logical @code{N=n}, inverse is @code{FFTW_R2HC}.)
920
921 @item
922 @ctindex FFTW_DHT
923 @code{FFTW_DHT} computes a discrete Hartley transform.
924 (Logical @code{N=n}, inverse is @code{FFTW_DHT}.)
925 @cindex discrete Hartley transform
926
927 @item
928 @ctindex FFTW_REDFT00
929 @code{FFTW_REDFT00} computes an REDFT00 transform, i.e. a DCT-I.
930 (Logical @code{N=2*(n-1)}, inverse is @code{FFTW_REDFT00}.)
931 @cindex discrete cosine transform
932 @cindex DCT
933
934 @item
935 @ctindex FFTW_REDFT10
936 @code{FFTW_REDFT10} computes an REDFT10 transform, i.e. a DCT-II (sometimes called ``the'' DCT).
937 (Logical @code{N=2*n}, inverse is @code{FFTW_REDFT01}.)
938
939 @item
940 @ctindex FFTW_REDFT01
941 @code{FFTW_REDFT01} computes an REDFT01 transform, i.e. a DCT-III (sometimes called ``the'' IDCT, being the inverse of DCT-II).
942 (Logical @code{N=2*n}, inverse is @code{FFTW_REDFT=10}.)
943 @cindex IDCT
944
945 @item
946 @ctindex FFTW_REDFT11
947 @code{FFTW_REDFT11} computes an REDFT11 transform, i.e. a DCT-IV.
948 (Logical @code{N=2*n}, inverse is @code{FFTW_REDFT11}.)
949
950 @item
951 @ctindex FFTW_RODFT00
952 @code{FFTW_RODFT00} computes an RODFT00 transform, i.e. a DST-I.
953 (Logical @code{N=2*(n+1)}, inverse is @code{FFTW_RODFT00}.)
954 @cindex discrete sine transform
955 @cindex DST
956
957 @item
958 @ctindex FFTW_RODFT10
959 @code{FFTW_RODFT10} computes an RODFT10 transform, i.e. a DST-II.
960 (Logical @code{N=2*n}, inverse is @code{FFTW_RODFT01}.)
961
962 @item
963 @ctindex FFTW_RODFT01
964 @code{FFTW_RODFT01} computes an RODFT01 transform, i.e. a DST-III.
965 (Logical @code{N=2*n}, inverse is @code{FFTW_RODFT=10}.)
966
967 @item
968 @ctindex FFTW_RODFT11
969 @code{FFTW_RODFT11} computes an RODFT11 transform, i.e. a DST-IV.
970 (Logical @code{N=2*n}, inverse is @code{FFTW_RODFT11}.)
971
972 @end itemize
973
974 @c ------------------------------------------------------------
975 @node Advanced Interface, Guru Interface, Basic Interface, FFTW Reference
976 @section Advanced Interface
977 @cindex advanced interface
978
979 FFTW's ``advanced'' interface supplements the basic interface with four
980 new planner routines, providing a new level of flexibility: you can plan
981 a transform of multiple arrays simultaneously, operate on non-contiguous
982 (strided) data, and transform a subset of a larger multi-dimensional
983 array. Other than these additional features, the planner operates in
984 the same fashion as in the basic interface, and the resulting
985 @code{fftw_plan} is used in the same way (@pxref{Using Plans}).
986
987 @menu
988 * Advanced Complex DFTs::
989 * Advanced Real-data DFTs::
990 * Advanced Real-to-real Transforms::
991 @end menu
992
993 @c =========>
994 @node Advanced Complex DFTs, Advanced Real-data DFTs, Advanced Interface, Advanced Interface
995 @subsection Advanced Complex DFTs
996
997 @example
998 fftw_plan fftw_plan_many_dft(int rank, const int *n, int howmany,
999 fftw_complex *in, const int *inembed,
1000 int istride, int idist,
1001 fftw_complex *out, const int *onembed,
1002 int ostride, int odist,
1003 int sign, unsigned flags);
1004 @end example
1005 @findex fftw_plan_many_dft
1006
1007 This routine plans multiple multidimensional complex DFTs, and it
1008 extends the @code{fftw_plan_dft} routine (@pxref{Complex DFTs}) to
1009 compute @code{howmany} transforms, each having rank @code{rank} and size
1010 @code{n}. In addition, the transform data need not be contiguous, but
1011 it may be laid out in memory with an arbitrary stride. To account for
1012 these possibilities, @code{fftw_plan_many_dft} adds the new parameters
1013 @code{howmany}, @{@code{i},@code{o}@}@code{nembed},
1014 @{@code{i},@code{o}@}@code{stride}, and
1015 @{@code{i},@code{o}@}@code{dist}. The FFTW basic interface
1016 (@pxref{Complex DFTs}) provides routines specialized for ranks 1, 2,
1017 and@tie{}3, but the advanced interface handles only the general-rank
1018 case.
1019
1020 @code{howmany} is the number of transforms to compute. The resulting
1021 plan computes @code{howmany} transforms, where the input of the
1022 @code{k}-th transform is at location @code{in+k*idist} (in C pointer
1023 arithmetic), and its output is at location @code{out+k*odist}. Plans
1024 obtained in this way can often be faster than calling FFTW multiple
1025 times for the individual transforms. The basic @code{fftw_plan_dft}
1026 interface corresponds to @code{howmany=1} (in which case the @code{dist}
1027 parameters are ignored).
1028 @cindex howmany parameter
1029 @cindex dist
1030
1031
1032 Each of the @code{howmany} transforms has rank @code{rank} and size
1033 @code{n}, as in the basic interface. In addition, the advanced
1034 interface allows the input and output arrays of each transform to be
1035 row-major subarrays of larger rank-@code{rank} arrays, described by
1036 @code{inembed} and @code{onembed} parameters, respectively.
1037 @{@code{i},@code{o}@}@code{nembed} must be arrays of length @code{rank},
1038 and @code{n} should be elementwise less than or equal to
1039 @{@code{i},@code{o}@}@code{nembed}. Passing @code{NULL} for an
1040 @code{nembed} parameter is equivalent to passing @code{n} (i.e. same
1041 physical and logical dimensions, as in the basic interface.)
1042
1043 The @code{stride} parameters indicate that the @code{j}-th element of
1044 the input or output arrays is located at @code{j*istride} or
1045 @code{j*ostride}, respectively. (For a multi-dimensional array,
1046 @code{j} is the ordinary row-major index.) When combined with the
1047 @code{k}-th transform in a @code{howmany} loop, from above, this means
1048 that the (@code{j},@code{k})-th element is at @code{j*stride+k*dist}.
1049 (The basic @code{fftw_plan_dft} interface corresponds to a stride of 1.)
1050 @cindex stride
1051
1052
1053 For in-place transforms, the input and output @code{stride} and
1054 @code{dist} parameters should be the same; otherwise, the planner may
1055 return @code{NULL}.
1056
1057 Arrays @code{n}, @code{inembed}, and @code{onembed} are not used after
1058 this function returns. You can safely free or reuse them.
1059
1060 @strong{Examples}:
1061 One transform of one 5 by 6 array contiguous in memory:
1062 @example
1063 int rank = 2;
1064 int n[] = @{5, 6@};
1065 int howmany = 1;
1066 int idist = odist = 0; /* unused because howmany = 1 */
1067 int istride = ostride = 1; /* array is contiguous in memory */
1068 int *inembed = n, *onembed = n;
1069 @end example
1070
1071 Transform of three 5 by 6 arrays, each contiguous in memory,
1072 stored in memory one after another:
1073 @example
1074 int rank = 2;
1075 int n[] = @{5, 6@};
1076 int howmany = 3;
1077 int idist = odist = n[0]*n[1]; /* = 30, the distance in memory
1078 between the first element
1079 of the first array and the
1080 first element of the second array */
1081 int istride = ostride = 1; /* array is contiguous in memory */
1082 int *inembed = n, *onembed = n;
1083 @end example
1084
1085 Transform each column of a 2d array with 10 rows and 3 columns:
1086 @example
1087 int rank = 1; /* not 2: we are computing 1d transforms */
1088 int n[] = @{10@}; /* 1d transforms of length 10 */
1089 int howmany = 3;
1090 int idist = odist = 1;
1091 int istride = ostride = 3; /* distance between two elements in
1092 the same column */
1093 int *inembed = n, *onembed = n;
1094 @end example
1095
1096 @c =========>
1097 @node Advanced Real-data DFTs, Advanced Real-to-real Transforms, Advanced Complex DFTs, Advanced Interface
1098 @subsection Advanced Real-data DFTs
1099
1100 @example
1101 fftw_plan fftw_plan_many_dft_r2c(int rank, const int *n, int howmany,
1102 double *in, const int *inembed,
1103 int istride, int idist,
1104 fftw_complex *out, const int *onembed,
1105 int ostride, int odist,
1106 unsigned flags);
1107 fftw_plan fftw_plan_many_dft_c2r(int rank, const int *n, int howmany,
1108 fftw_complex *in, const int *inembed,
1109 int istride, int idist,
1110 double *out, const int *onembed,
1111 int ostride, int odist,
1112 unsigned flags);
1113 @end example
1114 @findex fftw_plan_many_dft_r2c
1115 @findex fftw_plan_many_dft_c2r
1116
1117 Like @code{fftw_plan_many_dft}, these two functions add @code{howmany},
1118 @code{nembed}, @code{stride}, and @code{dist} parameters to the
1119 @code{fftw_plan_dft_r2c} and @code{fftw_plan_dft_c2r} functions, but
1120 otherwise behave the same as the basic interface.
1121
1122 The interpretation of @code{howmany}, @code{stride}, and @code{dist} are
1123 the same as for @code{fftw_plan_many_dft}, above. Note that the
1124 @code{stride} and @code{dist} for the real array are in units of
1125 @code{double}, and for the complex array are in units of
1126 @code{fftw_complex}.
1127
1128 If an @code{nembed} parameter is @code{NULL}, it is interpreted as what
1129 it would be in the basic interface, as described in @ref{Real-data DFT
1130 Array Format}. That is, for the complex array the size is assumed to be
1131 the same as @code{n}, but with the last dimension cut roughly in half.
1132 For the real array, the size is assumed to be @code{n} if the transform
1133 is out-of-place, or @code{n} with the last dimension ``padded'' if the
1134 transform is in-place.
1135
1136 If an @code{nembed} parameter is non-@code{NULL}, it is interpreted as
1137 the physical size of the corresponding array, in row-major order, just
1138 as for @code{fftw_plan_many_dft}. In this case, each dimension of
1139 @code{nembed} should be @code{>=} what it would be in the basic
1140 interface (e.g. the halved or padded @code{n}).
1141
1142 Arrays @code{n}, @code{inembed}, and @code{onembed} are not used after
1143 this function returns. You can safely free or reuse them.
1144
1145 @c =========>
1146 @node Advanced Real-to-real Transforms, , Advanced Real-data DFTs, Advanced Interface
1147 @subsection Advanced Real-to-real Transforms
1148
1149 @example
1150 fftw_plan fftw_plan_many_r2r(int rank, const int *n, int howmany,
1151 double *in, const int *inembed,
1152 int istride, int idist,
1153 double *out, const int *onembed,
1154 int ostride, int odist,
1155 const fftw_r2r_kind *kind, unsigned flags);
1156 @end example
1157 @findex fftw_plan_many_r2r
1158
1159 Like @code{fftw_plan_many_dft}, this functions adds @code{howmany},
1160 @code{nembed}, @code{stride}, and @code{dist} parameters to the
1161 @code{fftw_plan_r2r} function, but otherwise behave the same as the
1162 basic interface. The interpretation of those additional parameters are
1163 the same as for @code{fftw_plan_many_dft}. (Of course, the
1164 @code{stride} and @code{dist} parameters are now in units of
1165 @code{double}, not @code{fftw_complex}.)
1166
1167 Arrays @code{n}, @code{inembed}, @code{onembed}, and @code{kind} are not
1168 used after this function returns. You can safely free or reuse them.
1169
1170 @c ------------------------------------------------------------
1171 @node Guru Interface, New-array Execute Functions, Advanced Interface, FFTW Reference
1172 @section Guru Interface
1173 @cindex guru interface
1174
1175 The ``guru'' interface to FFTW is intended to expose as much as possible
1176 of the flexibility in the underlying FFTW architecture. It allows one
1177 to compute multi-dimensional ``vectors'' (loops) of multi-dimensional
1178 transforms, where each vector/transform dimension has an independent
1179 size and stride.
1180 @cindex vector
1181 One can also use more general complex-number formats, e.g. separate real
1182 and imaginary arrays.
1183
1184 For those users who require the flexibility of the guru interface, it is
1185 important that they pay special attention to the documentation lest they
1186 shoot themselves in the foot.
1187
1188 @menu
1189 * Interleaved and split arrays::
1190 * Guru vector and transform sizes::
1191 * Guru Complex DFTs::
1192 * Guru Real-data DFTs::
1193 * Guru Real-to-real Transforms::
1194 * 64-bit Guru Interface::
1195 @end menu
1196
1197 @c =========>
1198 @node Interleaved and split arrays, Guru vector and transform sizes, Guru Interface, Guru Interface
1199 @subsection Interleaved and split arrays
1200
1201 The guru interface supports two representations of complex numbers,
1202 which we call the interleaved and the split format.
1203
1204 The @dfn{interleaved} format is the same one used by the basic and
1205 advanced interfaces, and it is documented in @ref{Complex numbers}.
1206 In the interleaved format, you provide pointers to the real part of a
1207 complex number, and the imaginary part understood to be stored in the
1208 next memory location.
1209 @cindex interleaved format
1210
1211
1212 The @dfn{split} format allows separate pointers to the real and
1213 imaginary parts of a complex array.
1214 @cindex split format
1215
1216
1217 Technically, the interleaved format is redundant, because you can
1218 always express an interleaved array in terms of a split array with
1219 appropriate pointers and strides. On the other hand, the interleaved
1220 format is simpler to use, and it is common in practice. Hence, FFTW
1221 supports it as a special case.
1222
1223 @c =========>
1224 @node Guru vector and transform sizes, Guru Complex DFTs, Interleaved and split arrays, Guru Interface
1225 @subsection Guru vector and transform sizes
1226
1227 The guru interface introduces one basic new data structure,
1228 @code{fftw_iodim}, that is used to specify sizes and strides for
1229 multi-dimensional transforms and vectors:
1230
1231 @example
1232 typedef struct @{
1233 int n;
1234 int is;
1235 int os;
1236 @} fftw_iodim;
1237 @end example
1238 @tindex fftw_iodim
1239
1240 Here, @code{n} is the size of the dimension, and @code{is} and @code{os}
1241 are the strides of that dimension for the input and output arrays. (The
1242 stride is the separation of consecutive elements along this dimension.)
1243
1244 The meaning of the stride parameter depends on the type of the array
1245 that the stride refers to. @emph{If the array is interleaved complex,
1246 strides are expressed in units of complex numbers
1247 (@code{fftw_complex}). If the array is split complex or real, strides
1248 are expressed in units of real numbers (@code{double}).} This
1249 convention is consistent with the usual pointer arithmetic in the C
1250 language. An interleaved array is denoted by a pointer @code{p} to
1251 @code{fftw_complex}, so that @code{p+1} points to the next complex
1252 number. Split arrays are denoted by pointers to @code{double}, in
1253 which case pointer arithmetic operates in units of
1254 @code{sizeof(double)}.
1255 @cindex stride
1256
1257
1258 The guru planner interfaces all take a (@code{rank}, @code{dims[rank]})
1259 pair describing the transform size, and a (@code{howmany_rank},
1260 @code{howmany_dims[howmany_rank]}) pair describing the ``vector'' size (a
1261 multi-dimensional loop of transforms to perform), where @code{dims} and
1262 @code{howmany_dims} are arrays of @code{fftw_iodim}.
1263
1264 For example, the @code{howmany} parameter in the advanced complex-DFT
1265 interface corresponds to @code{howmany_rank} = 1,
1266 @code{howmany_dims[0].n} = @code{howmany}, @code{howmany_dims[0].is} =
1267 @code{idist}, and @code{howmany_dims[0].os} = @code{odist}.
1268 @cindex howmany loop
1269 @cindex dist
1270 (To compute a single transform, you can just use @code{howmany_rank} = 0.)
1271
1272
1273 A row-major multidimensional array with dimensions @code{n[rank]}
1274 (@pxref{Row-major Format}) corresponds to @code{dims[i].n} =
1275 @code{n[i]} and the recurrence @code{dims[i].is} = @code{n[i+1] *
1276 dims[i+1].is} (similarly for @code{os}). The stride of the last
1277 (@code{i=rank-1}) dimension is the overall stride of the array.
1278 e.g. to be equivalent to the advanced complex-DFT interface, you would
1279 have @code{dims[rank-1].is} = @code{istride} and
1280 @code{dims[rank-1].os} = @code{ostride}.
1281 @cindex row-major
1282
1283
1284 In general, we only guarantee FFTW to return a non-@code{NULL} plan if
1285 the vector and transform dimensions correspond to a set of distinct
1286 indices, and for in-place transforms the input/output strides should
1287 be the same.
1288
1289 @c =========>
1290 @node Guru Complex DFTs, Guru Real-data DFTs, Guru vector and transform sizes, Guru Interface
1291 @subsection Guru Complex DFTs
1292
1293 @example
1294 fftw_plan fftw_plan_guru_dft(
1295 int rank, const fftw_iodim *dims,
1296 int howmany_rank, const fftw_iodim *howmany_dims,
1297 fftw_complex *in, fftw_complex *out,
1298 int sign, unsigned flags);
1299
1300 fftw_plan fftw_plan_guru_split_dft(
1301 int rank, const fftw_iodim *dims,
1302 int howmany_rank, const fftw_iodim *howmany_dims,
1303 double *ri, double *ii, double *ro, double *io,
1304 unsigned flags);
1305 @end example
1306 @findex fftw_plan_guru_dft
1307 @findex fftw_plan_guru_split_dft
1308
1309 These two functions plan a complex-data, multi-dimensional DFT
1310 for the interleaved and split format, respectively.
1311 Transform dimensions are given by (@code{rank}, @code{dims}) over a
1312 multi-dimensional vector (loop) of dimensions (@code{howmany_rank},
1313 @code{howmany_dims}). @code{dims} and @code{howmany_dims} should point
1314 to @code{fftw_iodim} arrays of length @code{rank} and
1315 @code{howmany_rank}, respectively.
1316
1317 @cindex flags
1318 @code{flags} is a bitwise OR (@samp{|}) of zero or more planner flags,
1319 as defined in @ref{Planner Flags}.
1320
1321 In the @code{fftw_plan_guru_dft} function, the pointers @code{in} and
1322 @code{out} point to the interleaved input and output arrays,
1323 respectively. The sign can be either @math{-1} (=
1324 @code{FFTW_FORWARD}) or @math{+1} (= @code{FFTW_BACKWARD}). If the
1325 pointers are equal, the transform is in-place.
1326
1327 In the @code{fftw_plan_guru_split_dft} function,
1328 @code{ri} and @code{ii} point to the real and imaginary input arrays,
1329 and @code{ro} and @code{io} point to the real and imaginary output
1330 arrays. The input and output pointers may be the same, indicating an
1331 in-place transform. For example, for @code{fftw_complex} pointers
1332 @code{in} and @code{out}, the corresponding parameters are:
1333
1334 @example
1335 ri = (double *) in;
1336 ii = (double *) in + 1;
1337 ro = (double *) out;
1338 io = (double *) out + 1;
1339 @end example
1340
1341 Because @code{fftw_plan_guru_split_dft} accepts split arrays, strides
1342 are expressed in units of @code{double}. For a contiguous
1343 @code{fftw_complex} array, the overall stride of the transform should
1344 be 2, the distance between consecutive real parts or between
1345 consecutive imaginary parts; see @ref{Guru vector and transform
1346 sizes}. Note that the dimension strides are applied equally to the
1347 real and imaginary parts; real and imaginary arrays with different
1348 strides are not supported.
1349
1350 There is no @code{sign} parameter in @code{fftw_plan_guru_split_dft}.
1351 This function always plans for an @code{FFTW_FORWARD} transform. To
1352 plan for an @code{FFTW_BACKWARD} transform, you can exploit the
1353 identity that the backwards DFT is equal to the forwards DFT with the
1354 real and imaginary parts swapped. For example, in the case of the
1355 @code{fftw_complex} arrays above, the @code{FFTW_BACKWARD} transform
1356 is computed by the parameters:
1357
1358 @example
1359 ri = (double *) in + 1;
1360 ii = (double *) in;
1361 ro = (double *) out + 1;
1362 io = (double *) out;
1363 @end example
1364
1365 @c =========>
1366 @node Guru Real-data DFTs, Guru Real-to-real Transforms, Guru Complex DFTs, Guru Interface
1367 @subsection Guru Real-data DFTs
1368
1369 @example
1370 fftw_plan fftw_plan_guru_dft_r2c(
1371 int rank, const fftw_iodim *dims,
1372 int howmany_rank, const fftw_iodim *howmany_dims,
1373 double *in, fftw_complex *out,
1374 unsigned flags);
1375
1376 fftw_plan fftw_plan_guru_split_dft_r2c(
1377 int rank, const fftw_iodim *dims,
1378 int howmany_rank, const fftw_iodim *howmany_dims,
1379 double *in, double *ro, double *io,
1380 unsigned flags);
1381
1382 fftw_plan fftw_plan_guru_dft_c2r(
1383 int rank, const fftw_iodim *dims,
1384 int howmany_rank, const fftw_iodim *howmany_dims,
1385 fftw_complex *in, double *out,
1386 unsigned flags);
1387
1388 fftw_plan fftw_plan_guru_split_dft_c2r(
1389 int rank, const fftw_iodim *dims,
1390 int howmany_rank, const fftw_iodim *howmany_dims,
1391 double *ri, double *ii, double *out,
1392 unsigned flags);
1393 @end example
1394 @findex fftw_plan_guru_dft_r2c
1395 @findex fftw_plan_guru_split_dft_r2c
1396 @findex fftw_plan_guru_dft_c2r
1397 @findex fftw_plan_guru_split_dft_c2r
1398
1399 Plan a real-input (r2c) or real-output (c2r), multi-dimensional DFT with
1400 transform dimensions given by (@code{rank}, @code{dims}) over a
1401 multi-dimensional vector (loop) of dimensions (@code{howmany_rank},
1402 @code{howmany_dims}). @code{dims} and @code{howmany_dims} should point
1403 to @code{fftw_iodim} arrays of length @code{rank} and
1404 @code{howmany_rank}, respectively. As for the basic and advanced
1405 interfaces, an r2c transform is @code{FFTW_FORWARD} and a c2r transform
1406 is @code{FFTW_BACKWARD}.
1407
1408 The @emph{last} dimension of @code{dims} is interpreted specially:
1409 that dimension of the real array has size @code{dims[rank-1].n}, but
1410 that dimension of the complex array has size @code{dims[rank-1].n/2+1}
1411 (division rounded down). The strides, on the other hand, are taken to
1412 be exactly as specified. It is up to the user to specify the strides
1413 appropriately for the peculiar dimensions of the data, and we do not
1414 guarantee that the planner will succeed (return non-@code{NULL}) for
1415 any dimensions other than those described in @ref{Real-data DFT Array
1416 Format} and generalized in @ref{Advanced Real-data DFTs}. (That is,
1417 for an in-place transform, each individual dimension should be able to
1418 operate in place.)
1419 @cindex in-place
1420
1421
1422 @code{in} and @code{out} point to the input and output arrays for r2c
1423 and c2r transforms, respectively. For split arrays, @code{ri} and
1424 @code{ii} point to the real and imaginary input arrays for a c2r
1425 transform, and @code{ro} and @code{io} point to the real and imaginary
1426 output arrays for an r2c transform. @code{in} and @code{ro} or
1427 @code{ri} and @code{out} may be the same, indicating an in-place
1428 transform. (In-place transforms where @code{in} and @code{io} or
1429 @code{ii} and @code{out} are the same are not currently supported.)
1430
1431 @cindex flags
1432 @code{flags} is a bitwise OR (@samp{|}) of zero or more planner flags,
1433 as defined in @ref{Planner Flags}.
1434
1435 In-place transforms of rank greater than 1 are currently only
1436 supported for interleaved arrays. For split arrays, the planner will
1437 return @code{NULL}.
1438 @cindex in-place
1439
1440 @c =========>
1441 @node Guru Real-to-real Transforms, 64-bit Guru Interface, Guru Real-data DFTs, Guru Interface
1442 @subsection Guru Real-to-real Transforms
1443
1444 @example
1445 fftw_plan fftw_plan_guru_r2r(int rank, const fftw_iodim *dims,
1446 int howmany_rank,
1447 const fftw_iodim *howmany_dims,
1448 double *in, double *out,
1449 const fftw_r2r_kind *kind,
1450 unsigned flags);
1451 @end example
1452 @findex fftw_plan_guru_r2r
1453
1454 Plan a real-to-real (r2r) multi-dimensional @code{FFTW_FORWARD}
1455 transform with transform dimensions given by (@code{rank}, @code{dims})
1456 over a multi-dimensional vector (loop) of dimensions
1457 (@code{howmany_rank}, @code{howmany_dims}). @code{dims} and
1458 @code{howmany_dims} should point to @code{fftw_iodim} arrays of length
1459 @code{rank} and @code{howmany_rank}, respectively.
1460
1461 The transform kind of each dimension is given by the @code{kind}
1462 parameter, which should point to an array of length @code{rank}. Valid
1463 @code{fftw_r2r_kind} constants are given in @ref{Real-to-Real Transform
1464 Kinds}.
1465
1466 @code{in} and @code{out} point to the real input and output arrays; they
1467 may be the same, indicating an in-place transform.
1468
1469 @cindex flags
1470 @code{flags} is a bitwise OR (@samp{|}) of zero or more planner flags,
1471 as defined in @ref{Planner Flags}.
1472
1473 @c =========>
1474 @node 64-bit Guru Interface, , Guru Real-to-real Transforms, Guru Interface
1475 @subsection 64-bit Guru Interface
1476 @cindex 64-bit architecture
1477
1478 When compiled in 64-bit mode on a 64-bit architecture (where addresses
1479 are 64 bits wide), FFTW uses 64-bit quantities internally for all
1480 transform sizes, strides, and so on---you don't have to do anything
1481 special to exploit this. However, in the ordinary FFTW interfaces,
1482 you specify the transform size by an @code{int} quantity, which is
1483 normally only 32 bits wide. This means that, even though FFTW is
1484 using 64-bit sizes internally, you cannot specify a single transform
1485 dimension larger than
1486 @ifinfo
1487 2^31-1
1488 @end ifinfo
1489 @html
1490 2<sup><small>31</small></sup>&minus;1
1491 @end html
1492 @tex
1493 $2^31-1$
1494 @end tex
1495 numbers.
1496
1497 We expect that few users will require transforms larger than this, but,
1498 for those who do, we provide a 64-bit version of the guru interface in
1499 which all sizes are specified as integers of type @code{ptrdiff_t}
1500 instead of @code{int}. (@code{ptrdiff_t} is a signed integer type
1501 defined by the C standard to be wide enough to represent address
1502 differences, and thus must be at least 64 bits wide on a 64-bit
1503 machine.) We stress that there is @emph{no performance advantage} to
1504 using this interface---the same internal FFTW code is employed
1505 regardless---and it is only necessary if you want to specify very
1506 large transform sizes.
1507 @tindex ptrdiff_t
1508
1509
1510 In particular, the 64-bit guru interface is a set of planner routines
1511 that are exactly the same as the guru planner routines, except that
1512 they are named with @samp{guru64} instead of @samp{guru} and they take
1513 arguments of type @code{fftw_iodim64} instead of @code{fftw_iodim}.
1514 For example, instead of @code{fftw_plan_guru_dft}, we have
1515 @code{fftw_plan_guru64_dft}.
1516
1517 @example
1518 fftw_plan fftw_plan_guru64_dft(
1519 int rank, const fftw_iodim64 *dims,
1520 int howmany_rank, const fftw_iodim64 *howmany_dims,
1521 fftw_complex *in, fftw_complex *out,
1522 int sign, unsigned flags);
1523 @end example
1524 @findex fftw_plan_guru64_dft
1525
1526 The @code{fftw_iodim64} type is similar to @code{fftw_iodim}, with the
1527 same interpretation, except that it uses type @code{ptrdiff_t} instead
1528 of type @code{int}.
1529
1530 @example
1531 typedef struct @{
1532 ptrdiff_t n;
1533 ptrdiff_t is;
1534 ptrdiff_t os;
1535 @} fftw_iodim64;
1536 @end example
1537 @tindex fftw_iodim64
1538
1539 Every other @samp{fftw_plan_guru} function also has a
1540 @samp{fftw_plan_guru64} equivalent, but we do not repeat their
1541 documentation here since they are identical to the 32-bit versions
1542 except as noted above.
1543
1544 @c -----------------------------------------------------------
1545 @node New-array Execute Functions, Wisdom, Guru Interface, FFTW Reference
1546 @section New-array Execute Functions
1547 @cindex execute
1548 @cindex new-array execution
1549
1550 Normally, one executes a plan for the arrays with which the plan was
1551 created, by calling @code{fftw_execute(plan)} as described in @ref{Using
1552 Plans}.
1553 @findex fftw_execute
1554 However, it is possible for sophisticated users to apply a given plan
1555 to a @emph{different} array using the ``new-array execute'' functions
1556 detailed below, provided that the following conditions are met:
1557
1558 @itemize @bullet
1559
1560 @item
1561 The array size, strides, etcetera are the same (since those are set by
1562 the plan).
1563
1564 @item
1565 The input and output arrays are the same (in-place) or different
1566 (out-of-place) if the plan was originally created to be in-place or
1567 out-of-place, respectively.
1568
1569 @item
1570 For split arrays, the separations between the real and imaginary
1571 parts, @code{ii-ri} and @code{io-ro}, are the same as they were for
1572 the input and output arrays when the plan was created. (This
1573 condition is automatically satisfied for interleaved arrays.)
1574
1575 @item
1576 The @dfn{alignment} of the new input/output arrays is the same as that
1577 of the input/output arrays when the plan was created, unless the plan
1578 was created with the @code{FFTW_UNALIGNED} flag.
1579 @ctindex FFTW_UNALIGNED
1580 Here, the alignment is a platform-dependent quantity (for example, it is
1581 the address modulo 16 if SSE SIMD instructions are used, but the address
1582 modulo 4 for non-SIMD single-precision FFTW on the same machine). In
1583 general, only arrays allocated with @code{fftw_malloc} are guaranteed to
1584 be equally aligned (@pxref{SIMD alignment and fftw_malloc}).
1585
1586 @end itemize
1587
1588 @cindex alignment
1589 The alignment issue is especially critical, because if you don't use
1590 @code{fftw_malloc} then you may have little control over the alignment
1591 of arrays in memory. For example, neither the C++ @code{new} function
1592 nor the Fortran @code{allocate} statement provide strong enough
1593 guarantees about data alignment. If you don't use @code{fftw_malloc},
1594 therefore, you probably have to use @code{FFTW_UNALIGNED} (which
1595 disables most SIMD support). If possible, it is probably better for
1596 you to simply create multiple plans (creating a new plan is quick once
1597 one exists for a given size), or better yet re-use the same array for
1598 your transforms.
1599
1600 If you are tempted to use the new-array execute interface because you
1601 want to transform a known bunch of arrays of the same size, you should
1602 probably go use the advanced interface instead (@pxref{Advanced
1603 Interface})).
1604
1605 The new-array execute functions are:
1606
1607 @example
1608 void fftw_execute_dft(
1609 const fftw_plan p,
1610 fftw_complex *in, fftw_complex *out);
1611
1612 void fftw_execute_split_dft(
1613 const fftw_plan p,
1614 double *ri, double *ii, double *ro, double *io);
1615
1616 void fftw_execute_dft_r2c(
1617 const fftw_plan p,
1618 double *in, fftw_complex *out);
1619
1620 void fftw_execute_split_dft_r2c(
1621 const fftw_plan p,
1622 double *in, double *ro, double *io);
1623
1624 void fftw_execute_dft_c2r(
1625 const fftw_plan p,
1626 fftw_complex *in, double *out);
1627
1628 void fftw_execute_split_dft_c2r(
1629 const fftw_plan p,
1630 double *ri, double *ii, double *out);
1631
1632 void fftw_execute_r2r(
1633 const fftw_plan p,
1634 double *in, double *out);
1635 @end example
1636 @findex fftw_execute_dft
1637 @findex fftw_execute_split_dft
1638 @findex fftw_execute_dft_r2c
1639 @findex fftw_execute_split_dft_r2c
1640 @findex fftw_execute_dft_c2r
1641 @findex fftw_execute_split_dft_c2r
1642 @findex fftw_execute_r2r
1643
1644 These execute the @code{plan} to compute the corresponding transform on
1645 the input/output arrays specified by the subsequent arguments. The
1646 input/output array arguments have the same meanings as the ones passed
1647 to the guru planner routines in the preceding sections. The @code{plan}
1648 is not modified, and these routines can be called as many times as
1649 desired, or intermixed with calls to the ordinary @code{fftw_execute}.
1650
1651 The @code{plan} @emph{must} have been created for the transform type
1652 corresponding to the execute function, e.g. it must be a complex-DFT
1653 plan for @code{fftw_execute_dft}. Any of the planner routines for that
1654 transform type, from the basic to the guru interface, could have been
1655 used to create the plan, however.
1656
1657 @c ------------------------------------------------------------
1658 @node Wisdom, What FFTW Really Computes, New-array Execute Functions, FFTW Reference
1659 @section Wisdom
1660 @cindex wisdom
1661 @cindex saving plans to disk
1662
1663 This section documents the FFTW mechanism for saving and restoring
1664 plans from disk. This mechanism is called @dfn{wisdom}.
1665
1666 @menu
1667 * Wisdom Export::
1668 * Wisdom Import::
1669 * Forgetting Wisdom::
1670 * Wisdom Utilities::
1671 @end menu
1672
1673 @c =========>
1674 @node Wisdom Export, Wisdom Import, Wisdom, Wisdom
1675 @subsection Wisdom Export
1676
1677 @example
1678 int fftw_export_wisdom_to_filename(const char *filename);
1679 void fftw_export_wisdom_to_file(FILE *output_file);
1680 char *fftw_export_wisdom_to_string(void);
1681 void fftw_export_wisdom(void (*write_char)(char c, void *), void *data);
1682 @end example
1683 @findex fftw_export_wisdom
1684 @findex fftw_export_wisdom_to_filename
1685 @findex fftw_export_wisdom_to_file
1686 @findex fftw_export_wisdom_to_string
1687
1688 These functions allow you to export all currently accumulated wisdom
1689 in a form from which it can be later imported and restored, even
1690 during a separate run of the program. (@xref{Words of Wisdom-Saving
1691 Plans}.) The current store of wisdom is not affected by calling any
1692 of these routines.
1693
1694 @code{fftw_export_wisdom} exports the wisdom to any output
1695 medium, as specified by the callback function
1696 @code{write_char}. @code{write_char} is a @code{putc}-like function that
1697 writes the character @code{c} to some output; its second parameter is
1698 the @code{data} pointer passed to @code{fftw_export_wisdom}. For
1699 convenience, the following three ``wrapper'' routines are provided:
1700
1701 @code{fftw_export_wisdom_to_filename} writes wisdom to a file named
1702 @code{filename} (which is created or overwritten), returning @code{1}
1703 on success and @code{0} on failure. A lower-level function, which
1704 requires you to open and close the file yourself (e.g. if you want to
1705 write wisdom to a portion of a larger file) is
1706 @code{fftw_export_wisdom_to_file}. This writes the wisdom to the
1707 current position in @code{output_file}, which should be open with
1708 write permission; upon exit, the file remains open and is positioned
1709 at the end of the wisdom data.
1710
1711 @code{fftw_export_wisdom_to_string} returns a pointer to a
1712 @code{NULL}-terminated string holding the wisdom data. This string is
1713 dynamically allocated, and it is the responsibility of the caller to
1714 deallocate it with @code{free} when it is no longer needed.
1715
1716 All of these routines export the wisdom in the same format, which we
1717 will not document here except to say that it is LISP-like ASCII text
1718 that is insensitive to white space.
1719
1720 @c =========>
1721 @node Wisdom Import, Forgetting Wisdom, Wisdom Export, Wisdom
1722 @subsection Wisdom Import
1723
1724 @example
1725 int fftw_import_system_wisdom(void);
1726 int fftw_import_wisdom_from_filename(const char *filename);
1727 int fftw_import_wisdom_from_string(const char *input_string);
1728 int fftw_import_wisdom(int (*read_char)(void *), void *data);
1729 @end example
1730 @findex fftw_import_wisdom
1731 @findex fftw_import_system_wisdom
1732 @findex fftw_import_wisdom_from_filename
1733 @findex fftw_import_wisdom_from_file
1734 @findex fftw_import_wisdom_from_string
1735
1736 These functions import wisdom into a program from data stored by the
1737 @code{fftw_export_wisdom} functions above. (@xref{Words of
1738 Wisdom-Saving Plans}.) The imported wisdom replaces any wisdom
1739 already accumulated by the running program.
1740
1741 @code{fftw_import_wisdom} imports wisdom from any input medium, as
1742 specified by the callback function @code{read_char}. @code{read_char} is
1743 a @code{getc}-like function that returns the next character in the
1744 input; its parameter is the @code{data} pointer passed to
1745 @code{fftw_import_wisdom}. If the end of the input data is reached
1746 (which should never happen for valid data), @code{read_char} should
1747 return @code{EOF} (as defined in @code{<stdio.h>}). For convenience,
1748 the following three ``wrapper'' routines are provided:
1749
1750 @code{fftw_import_wisdom_from_filename} reads wisdom from a file named
1751 @code{filename}. A lower-level function, which requires you to open
1752 and close the file yourself (e.g. if you want to read wisdom from a
1753 portion of a larger file) is @code{fftw_import_wisdom_from_file}. This
1754 reads wisdom from the current position in @code{input_file} (which
1755 should be open with read permission); upon exit, the file remains
1756 open, but the position of the read pointer is unspecified.
1757
1758 @code{fftw_import_wisdom_from_string} reads wisdom from the
1759 @code{NULL}-terminated string @code{input_string}.
1760
1761 @code{fftw_import_system_wisdom} reads wisdom from an
1762 implementation-defined standard file (@code{/etc/fftw/wisdom} on Unix
1763 and GNU systems).
1764 @cindex wisdom, system-wide
1765
1766
1767 The return value of these import routines is @code{1} if the wisdom was
1768 read successfully and @code{0} otherwise. Note that, in all of these
1769 functions, any data in the input stream past the end of the wisdom data
1770 is simply ignored.
1771
1772 @c =========>
1773 @node Forgetting Wisdom, Wisdom Utilities, Wisdom Import, Wisdom
1774 @subsection Forgetting Wisdom
1775
1776 @example
1777 void fftw_forget_wisdom(void);
1778 @end example
1779 @findex fftw_forget_wisdom
1780
1781 Calling @code{fftw_forget_wisdom} causes all accumulated @code{wisdom}
1782 to be discarded and its associated memory to be freed. (New
1783 @code{wisdom} can still be gathered subsequently, however.)
1784
1785 @c =========>
1786 @node Wisdom Utilities, , Forgetting Wisdom, Wisdom
1787 @subsection Wisdom Utilities
1788
1789 FFTW includes two standalone utility programs that deal with wisdom. We
1790 merely summarize them here, since they come with their own @code{man}
1791 pages for Unix and GNU systems (with HTML versions on our web site).
1792
1793 The first program is @code{fftw-wisdom} (or @code{fftwf-wisdom} in
1794 single precision, etcetera), which can be used to create a wisdom file
1795 containing plans for any of the transform sizes and types supported by
1796 FFTW. It is preferable to create wisdom directly from your executable
1797 (@pxref{Caveats in Using Wisdom}), but this program is useful for
1798 creating global wisdom files for @code{fftw_import_system_wisdom}.
1799 @cindex fftw-wisdom utility
1800
1801
1802 The second program is @code{fftw-wisdom-to-conf}, which takes a wisdom
1803 file as input and produces a @dfn{configuration routine} as output. The
1804 latter is a C subroutine that you can compile and link into your
1805 program, replacing a routine of the same name in the FFTW library, that
1806 determines which parts of FFTW are callable by your program.
1807 @code{fftw-wisdom-to-conf} produces a configuration routine that links
1808 to only those parts of FFTW needed by the saved plans in the wisdom,
1809 greatly reducing the size of statically linked executables (which should
1810 only attempt to create plans corresponding to those in the wisdom,
1811 however).
1812 @cindex fftw-wisdom-to-conf utility
1813 @cindex configuration routines
1814
1815 @c ------------------------------------------------------------
1816 @node What FFTW Really Computes, , Wisdom, FFTW Reference
1817 @section What FFTW Really Computes
1818
1819 In this section, we provide precise mathematical definitions for the
1820 transforms that FFTW computes. These transform definitions are fairly
1821 standard, but some authors follow slightly different conventions for the
1822 normalization of the transform (the constant factor in front) and the
1823 sign of the complex exponent. We begin by presenting the
1824 one-dimensional (1d) transform definitions, and then give the
1825 straightforward extension to multi-dimensional transforms.
1826
1827 @menu
1828 * The 1d Discrete Fourier Transform (DFT)::
1829 * The 1d Real-data DFT::
1830 * 1d Real-even DFTs (DCTs)::
1831 * 1d Real-odd DFTs (DSTs)::
1832 * 1d Discrete Hartley Transforms (DHTs)::
1833 * Multi-dimensional Transforms::
1834 @end menu
1835
1836 @c =========>
1837 @node The 1d Discrete Fourier Transform (DFT), The 1d Real-data DFT, What FFTW Really Computes, What FFTW Really Computes
1838 @subsection The 1d Discrete Fourier Transform (DFT)
1839
1840 @cindex discrete Fourier transform
1841 @cindex DFT
1842 The forward (@code{FFTW_FORWARD}) discrete Fourier transform (DFT) of a
1843 1d complex array @math{X} of size @math{n} computes an array @math{Y},
1844 where:
1845 @tex
1846 $$
1847 Y_k = \sum_{j = 0}^{n - 1} X_j e^{-2\pi j k \sqrt{-1}/n} \ .
1848 $$
1849 @end tex
1850 @ifinfo
1851 @center Y[k] = sum for j = 0 to (n - 1) of X[j] * exp(-2 pi j k sqrt(-1)/n) .
1852 @end ifinfo
1853 @html
1854 <center><img src="equation-dft.png" align="top">.</center>
1855 @end html
1856 The backward (@code{FFTW_BACKWARD}) DFT computes:
1857 @tex
1858 $$
1859 Y_k = \sum_{j = 0}^{n - 1} X_j e^{2\pi j k \sqrt{-1}/n} \ .
1860 $$
1861 @end tex
1862 @ifinfo
1863 @center Y[k] = sum for j = 0 to (n - 1) of X[j] * exp(2 pi j k sqrt(-1)/n) .
1864 @end ifinfo
1865 @html
1866 <center><img src="equation-idft.png" align="top">.</center>
1867 @end html
1868
1869 @cindex normalization
1870 FFTW computes an unnormalized transform, in that there is no coefficient
1871 in front of the summation in the DFT. In other words, applying the
1872 forward and then the backward transform will multiply the input by
1873 @math{n}.
1874
1875 @cindex frequency
1876 From above, an @code{FFTW_FORWARD} transform corresponds to a sign of
1877 @math{-1} in the exponent of the DFT. Note also that we use the
1878 standard ``in-order'' output ordering---the @math{k}-th output
1879 corresponds to the frequency @math{k/n} (or @math{k/T}, where @math{T}
1880 is your total sampling period). For those who like to think in terms of
1881 positive and negative frequencies, this means that the positive
1882 frequencies are stored in the first half of the output and the negative
1883 frequencies are stored in backwards order in the second half of the
1884 output. (The frequency @math{-k/n} is the same as the frequency
1885 @math{(n-k)/n}.)
1886
1887 @c =========>
1888 @node The 1d Real-data DFT, 1d Real-even DFTs (DCTs), The 1d Discrete Fourier Transform (DFT), What FFTW Really Computes
1889 @subsection The 1d Real-data DFT
1890
1891 The real-input (r2c) DFT in FFTW computes the @emph{forward} transform
1892 @math{Y} of the size @code{n} real array @math{X}, exactly as defined
1893 above, i.e.
1894 @tex
1895 $$
1896 Y_k = \sum_{j = 0}^{n - 1} X_j e^{-2\pi j k \sqrt{-1}/n} \ .
1897 $$
1898 @end tex
1899 @ifinfo
1900 @center Y[k] = sum for j = 0 to (n - 1) of X[j] * exp(-2 pi j k sqrt(-1)/n) .
1901 @end ifinfo
1902 @html
1903 <center><img src="equation-dft.png" align="top">.</center>
1904 @end html
1905 This output array @math{Y} can easily be shown to possess the
1906 ``Hermitian'' symmetry
1907 @cindex Hermitian
1908 @tex
1909 $Y_k = Y_{n-k}^*$,
1910 @end tex
1911 @ifinfo
1912 Y[k] = Y[n-k]*,
1913 @end ifinfo
1914 @html
1915 <i>Y<sub>k</sub> = Y<sub>n-k</sub></i><sup>*</sup>,
1916 @end html
1917 where we take @math{Y} to be periodic so that
1918 @tex
1919 $Y_n = Y_0$.
1920 @end tex
1921 @ifinfo
1922 Y[n] = Y[0].
1923 @end ifinfo
1924 @html
1925 <i>Y<sub>n</sub> = Y</i><sub>0</sub>.
1926 @end html
1927
1928 As a result of this symmetry, half of the output @math{Y} is redundant
1929 (being the complex conjugate of the other half), and so the 1d r2c
1930 transforms only output elements @math{0}@dots{}@math{n/2} of @math{Y}
1931 (@math{n/2+1} complex numbers), where the division by @math{2} is
1932 rounded down.
1933
1934 Moreover, the Hermitian symmetry implies that
1935 @tex
1936 $Y_0$
1937 @end tex
1938 @ifinfo
1939 Y[0]
1940 @end ifinfo
1941 @html
1942 <i>Y</i><sub>0</sub>
1943 @end html
1944 and, if @math{n} is even, the
1945 @tex
1946 $Y_{n/2}$
1947 @end tex
1948 @ifinfo
1949 Y[n/2]
1950 @end ifinfo
1951 @html
1952 <i>Y</i><sub><i>n</i>/2</sub>
1953 @end html
1954 element, are purely real. So, for the @code{R2HC} r2r transform, these
1955 elements are not stored in the halfcomplex output format.
1956 @cindex r2r
1957 @ctindex R2HC
1958 @cindex halfcomplex format
1959
1960
1961 The c2r and @code{H2RC} r2r transforms compute the backward DFT of the
1962 @emph{complex} array @math{X} with Hermitian symmetry, stored in the
1963 r2c/@code{R2HC} output formats, respectively, where the backward
1964 transform is defined exactly as for the complex case:
1965 @tex
1966 $$
1967 Y_k = \sum_{j = 0}^{n - 1} X_j e^{2\pi j k \sqrt{-1}/n} \ .
1968 $$
1969 @end tex
1970 @ifinfo
1971 @center Y[k] = sum for j = 0 to (n - 1) of X[j] * exp(2 pi j k sqrt(-1)/n) .
1972 @end ifinfo
1973 @html
1974 <center><img src="equation-idft.png" align="top">.</center>
1975 @end html
1976 The outputs @code{Y} of this transform can easily be seen to be purely
1977 real, and are stored as an array of real numbers.
1978
1979 @cindex normalization
1980 Like FFTW's complex DFT, these transforms are unnormalized. In other
1981 words, applying the real-to-complex (forward) and then the
1982 complex-to-real (backward) transform will multiply the input by
1983 @math{n}.
1984
1985 @c =========>
1986 @node 1d Real-even DFTs (DCTs), 1d Real-odd DFTs (DSTs), The 1d Real-data DFT, What FFTW Really Computes
1987 @subsection 1d Real-even DFTs (DCTs)
1988
1989 The Real-even symmetry DFTs in FFTW are exactly equivalent to the unnormalized
1990 forward (and backward) DFTs as defined above, where the input array
1991 @math{X} of length @math{N} is purely real and is also @dfn{even} symmetry. In
1992 this case, the output array is likewise real and even symmetry.
1993 @cindex real-even DFT
1994 @cindex REDFT
1995
1996
1997 @ctindex REDFT00
1998 For the case of @code{REDFT00}, this even symmetry means that
1999 @tex
2000 $X_j = X_{N-j}$,
2001 @end tex
2002 @ifinfo
2003 X[j] = X[N-j],
2004 @end ifinfo
2005 @html
2006 <i>X<sub>j</sub> = X<sub>N-j</sub></i>,
2007 @end html
2008 where we take @math{X} to be periodic so that
2009 @tex
2010 $X_N = X_0$.
2011 @end tex
2012 @ifinfo
2013 X[N] = X[0].
2014 @end ifinfo
2015 @html
2016 <i>X<sub>N</sub> = X</i><sub>0</sub>.
2017 @end html
2018 Because of this redundancy, only the first @math{n} real numbers are
2019 actually stored, where @math{N = 2(n-1)}.
2020
2021 The proper definition of even symmetry for @code{REDFT10},
2022 @code{REDFT01}, and @code{REDFT11} transforms is somewhat more intricate
2023 because of the shifts by @math{1/2} of the input and/or output, although
2024 the corresponding boundary conditions are given in @ref{Real even/odd
2025 DFTs (cosine/sine transforms)}. Because of the even symmetry, however,
2026 the sine terms in the DFT all cancel and the remaining cosine terms are
2027 written explicitly below. This formulation often leads people to call
2028 such a transform a @dfn{discrete cosine transform} (DCT), although it is
2029 really just a special case of the DFT.
2030 @cindex discrete cosine transform
2031 @cindex DCT
2032
2033
2034 In each of the definitions below, we transform a real array @math{X} of
2035 length @math{n} to a real array @math{Y} of length @math{n}:
2036
2037 @subsubheading REDFT00 (DCT-I)
2038 @ctindex REDFT00
2039 An @code{REDFT00} transform (type-I DCT) in FFTW is defined by:
2040 @tex
2041 $$
2042 Y_k = X_0 + (-1)^k X_{n-1}
2043 + 2 \sum_{j=1}^{n-2} X_j \cos [ \pi j k / (n-1)].
2044 $$
2045 @end tex
2046 @ifinfo
2047 Y[k] = X[0] + (-1)^k X[n-1] + 2 (sum for j = 1 to n-2 of X[j] cos(pi jk /(n-1))).
2048 @end ifinfo
2049 @html
2050 <center><img src="equation-redft00.png" align="top">.</center>
2051 @end html
2052 Note that this transform is not defined for @math{n=1}. For @math{n=2},
2053 the summation term above is dropped as you might expect.
2054
2055 @subsubheading REDFT10 (DCT-II)
2056 @ctindex REDFT10
2057 An @code{REDFT10} transform (type-II DCT, sometimes called ``the'' DCT) in FFTW is defined by:
2058 @tex
2059 $$
2060 Y_k = 2 \sum_{j=0}^{n-1} X_j \cos [\pi (j+1/2) k / n].
2061 $$
2062 @end tex
2063 @ifinfo
2064 Y[k] = 2 (sum for j = 0 to n-1 of X[j] cos(pi (j+1/2) k / n)).
2065 @end ifinfo
2066 @html
2067 <center><img src="equation-redft10.png" align="top">.</center>
2068 @end html
2069
2070 @subsubheading REDFT01 (DCT-III)
2071 @ctindex REDFT01
2072 An @code{REDFT01} transform (type-III DCT) in FFTW is defined by:
2073 @tex
2074 $$
2075 Y_k = X_0 + 2 \sum_{j=1}^{n-1} X_j \cos [\pi j (k+1/2) / n].
2076 $$
2077 @end tex
2078 @ifinfo
2079 Y[k] = X[0] + 2 (sum for j = 1 to n-1 of X[j] cos(pi j (k+1/2) / n)).
2080 @end ifinfo
2081 @html
2082 <center><img src="equation-redft01.png" align="top">.</center>
2083 @end html
2084 In the case of @math{n=1}, this reduces to
2085 @tex
2086 $Y_0 = X_0$.
2087 @end tex
2088 @ifinfo
2089 Y[0] = X[0].
2090 @end ifinfo
2091 @html
2092 <i>Y</i><sub>0</sub> = <i>X</i><sub>0</sub>.
2093 @end html
2094 Up to a scale factor (see below), this is the inverse of @code{REDFT10} (``the'' DCT), and so the @code{REDFT01} (DCT-III) is sometimes called the ``IDCT''.
2095 @cindex IDCT
2096
2097 @subsubheading REDFT11 (DCT-IV)
2098 @ctindex REDFT11
2099 An @code{REDFT11} transform (type-IV DCT) in FFTW is defined by:
2100 @tex
2101 $$
2102 Y_k = 2 \sum_{j=0}^{n-1} X_j \cos [\pi (j+1/2) (k+1/2) / n].
2103 $$
2104 @end tex
2105 @ifinfo
2106 Y[k] = 2 (sum for j = 0 to n-1 of X[j] cos(pi (j+1/2) (k+1/2) / n)).
2107 @end ifinfo
2108 @html
2109 <center><img src="equation-redft11.png" align="top">.</center>
2110 @end html
2111
2112 @subsubheading Inverses and Normalization
2113
2114 These definitions correspond directly to the unnormalized DFTs used
2115 elsewhere in FFTW (hence the factors of @math{2} in front of the
2116 summations). The unnormalized inverse of @code{REDFT00} is
2117 @code{REDFT00}, of @code{REDFT10} is @code{REDFT01} and vice versa, and
2118 of @code{REDFT11} is @code{REDFT11}. Each unnormalized inverse results
2119 in the original array multiplied by @math{N}, where @math{N} is the
2120 @emph{logical} DFT size. For @code{REDFT00}, @math{N=2(n-1)} (note that
2121 @math{n=1} is not defined); otherwise, @math{N=2n}.
2122 @cindex normalization
2123
2124
2125 In defining the discrete cosine transform, some authors also include
2126 additional factors of
2127 @ifinfo
2128 sqrt(2)
2129 @end ifinfo
2130 @html
2131 &radic;2
2132 @end html
2133 @tex
2134 $\sqrt{2}$
2135 @end tex
2136 (or its inverse) multiplying selected inputs and/or outputs. This is a
2137 mostly cosmetic change that makes the transform orthogonal, but
2138 sacrifices the direct equivalence to a symmetric DFT.
2139
2140 @c =========>
2141 @node 1d Real-odd DFTs (DSTs), 1d Discrete Hartley Transforms (DHTs), 1d Real-even DFTs (DCTs), What FFTW Really Computes
2142 @subsection 1d Real-odd DFTs (DSTs)
2143
2144 The Real-odd symmetry DFTs in FFTW are exactly equivalent to the unnormalized
2145 forward (and backward) DFTs as defined above, where the input array
2146 @math{X} of length @math{N} is purely real and is also @dfn{odd} symmetry. In
2147 this case, the output is odd symmetry and purely imaginary.
2148 @cindex real-odd DFT
2149 @cindex RODFT
2150
2151
2152 @ctindex RODFT00
2153 For the case of @code{RODFT00}, this odd symmetry means that
2154 @tex
2155 $X_j = -X_{N-j}$,
2156 @end tex
2157 @ifinfo
2158 X[j] = -X[N-j],
2159 @end ifinfo
2160 @html
2161 <i>X<sub>j</sub> = -X<sub>N-j</sub></i>,
2162 @end html
2163 where we take @math{X} to be periodic so that
2164 @tex
2165 $X_N = X_0$.
2166 @end tex
2167 @ifinfo
2168 X[N] = X[0].
2169 @end ifinfo
2170 @html
2171 <i>X<sub>N</sub> = X</i><sub>0</sub>.
2172 @end html
2173 Because of this redundancy, only the first @math{n} real numbers
2174 starting at @math{j=1} are actually stored (the @math{j=0} element is
2175 zero), where @math{N = 2(n+1)}.
2176
2177 The proper definition of odd symmetry for @code{RODFT10},
2178 @code{RODFT01}, and @code{RODFT11} transforms is somewhat more intricate
2179 because of the shifts by @math{1/2} of the input and/or output, although
2180 the corresponding boundary conditions are given in @ref{Real even/odd
2181 DFTs (cosine/sine transforms)}. Because of the odd symmetry, however,
2182 the cosine terms in the DFT all cancel and the remaining sine terms are
2183 written explicitly below. This formulation often leads people to call
2184 such a transform a @dfn{discrete sine transform} (DST), although it is
2185 really just a special case of the DFT.
2186 @cindex discrete sine transform
2187 @cindex DST
2188
2189
2190 In each of the definitions below, we transform a real array @math{X} of
2191 length @math{n} to a real array @math{Y} of length @math{n}:
2192
2193 @subsubheading RODFT00 (DST-I)
2194 @ctindex RODFT00
2195 An @code{RODFT00} transform (type-I DST) in FFTW is defined by:
2196 @tex
2197 $$
2198 Y_k = 2 \sum_{j=0}^{n-1} X_j \sin [ \pi (j+1) (k+1) / (n+1)].
2199 $$
2200 @end tex
2201 @ifinfo
2202 Y[k] = 2 (sum for j = 0 to n-1 of X[j] sin(pi (j+1)(k+1) / (n+1))).
2203 @end ifinfo
2204 @html
2205 <center><img src="equation-rodft00.png" align="top">.</center>
2206 @end html
2207
2208 @subsubheading RODFT10 (DST-II)
2209 @ctindex RODFT10
2210 An @code{RODFT10} transform (type-II DST) in FFTW is defined by:
2211 @tex
2212 $$
2213 Y_k = 2 \sum_{j=0}^{n-1} X_j \sin [\pi (j+1/2) (k+1) / n].
2214 $$
2215 @end tex
2216 @ifinfo
2217 Y[k] = 2 (sum for j = 0 to n-1 of X[j] sin(pi (j+1/2) (k+1) / n)).
2218 @end ifinfo
2219 @html
2220 <center><img src="equation-rodft10.png" align="top">.</center>
2221 @end html
2222
2223 @subsubheading RODFT01 (DST-III)
2224 @ctindex RODFT01
2225 An @code{RODFT01} transform (type-III DST) in FFTW is defined by:
2226 @tex
2227 $$
2228 Y_k = (-1)^k X_{n-1} + 2 \sum_{j=0}^{n-2} X_j \sin [\pi (j+1) (k+1/2) / n].
2229 $$
2230 @end tex
2231 @ifinfo
2232 Y[k] = (-1)^k X[n-1] + 2 (sum for j = 0 to n-2 of X[j] sin(pi (j+1) (k+1/2) / n)).
2233 @end ifinfo
2234 @html
2235 <center><img src="equation-rodft01.png" align="top">.</center>
2236 @end html
2237 In the case of @math{n=1}, this reduces to
2238 @tex
2239 $Y_0 = X_0$.
2240 @end tex
2241 @ifinfo
2242 Y[0] = X[0].
2243 @end ifinfo
2244 @html
2245 <i>Y</i><sub>0</sub> = <i>X</i><sub>0</sub>.
2246 @end html
2247
2248 @subsubheading RODFT11 (DST-IV)
2249 @ctindex RODFT11
2250 An @code{RODFT11} transform (type-IV DST) in FFTW is defined by:
2251 @tex
2252 $$
2253 Y_k = 2 \sum_{j=0}^{n-1} X_j \sin [\pi (j+1/2) (k+1/2) / n].
2254 $$
2255 @end tex
2256 @ifinfo
2257 Y[k] = 2 (sum for j = 0 to n-1 of X[j] sin(pi (j+1/2) (k+1/2) / n)).
2258 @end ifinfo
2259 @html
2260 <center><img src="equation-rodft11.png" align="top">.</center>
2261 @end html
2262
2263 @subsubheading Inverses and Normalization
2264
2265 These definitions correspond directly to the unnormalized DFTs used
2266 elsewhere in FFTW (hence the factors of @math{2} in front of the
2267 summations). The unnormalized inverse of @code{RODFT00} is
2268 @code{RODFT00}, of @code{RODFT10} is @code{RODFT01} and vice versa, and
2269 of @code{RODFT11} is @code{RODFT11}. Each unnormalized inverse results
2270 in the original array multiplied by @math{N}, where @math{N} is the
2271 @emph{logical} DFT size. For @code{RODFT00}, @math{N=2(n+1)};
2272 otherwise, @math{N=2n}.
2273 @cindex normalization
2274
2275
2276 In defining the discrete sine transform, some authors also include
2277 additional factors of
2278 @ifinfo
2279 sqrt(2)
2280 @end ifinfo
2281 @html
2282 &radic;2
2283 @end html
2284 @tex
2285 $\sqrt{2}$
2286 @end tex
2287 (or its inverse) multiplying selected inputs and/or outputs. This is a
2288 mostly cosmetic change that makes the transform orthogonal, but
2289 sacrifices the direct equivalence to an antisymmetric DFT.
2290
2291 @c =========>
2292 @node 1d Discrete Hartley Transforms (DHTs), Multi-dimensional Transforms, 1d Real-odd DFTs (DSTs), What FFTW Really Computes
2293 @subsection 1d Discrete Hartley Transforms (DHTs)
2294
2295 @cindex discrete Hartley transform
2296 @cindex DHT
2297 The discrete Hartley transform (DHT) of a 1d real array @math{X} of size
2298 @math{n} computes a real array @math{Y} of the same size, where:
2299 @tex
2300 $$
2301 Y_k = \sum_{j = 0}^{n - 1} X_j [ \cos(2\pi j k / n) + \sin(2\pi j k / n)].
2302 $$
2303 @end tex
2304 @ifinfo
2305 @center Y[k] = sum for j = 0 to (n - 1) of X[j] * [cos(2 pi j k / n) + sin(2 pi j k / n)].
2306 @end ifinfo
2307 @html
2308 <center><img src="equation-dht.png" align="top">.</center>
2309 @end html
2310
2311 @cindex normalization
2312 FFTW computes an unnormalized transform, in that there is no coefficient
2313 in front of the summation in the DHT. In other words, applying the
2314 transform twice (the DHT is its own inverse) will multiply the input by
2315 @math{n}.
2316
2317 @c =========>
2318 @node Multi-dimensional Transforms, , 1d Discrete Hartley Transforms (DHTs), What FFTW Really Computes
2319 @subsection Multi-dimensional Transforms
2320
2321 The multi-dimensional transforms of FFTW, in general, compute simply the
2322 separable product of the given 1d transform along each dimension of the
2323 array. Since each of these transforms is unnormalized, computing the
2324 forward followed by the backward/inverse multi-dimensional transform
2325 will result in the original array scaled by the product of the
2326 normalization factors for each dimension (e.g. the product of the
2327 dimension sizes, for a multi-dimensional DFT).
2328
2329 @tex
2330 As an explicit example, consider the following exact mathematical
2331 definition of our multi-dimensional DFT. Let $X$ be a $d$-dimensional
2332 complex array whose elements are $X[j_1, j_2, \ldots, j_d]$, where $0
2333 \leq j_s < n_s$ for all~$s \in \{ 1, 2, \ldots, d \}$. Let also
2334 $\omega_s = e^{2\pi \sqrt{-1}/n_s}$, for all ~$s \in \{ 1, 2, \ldots, d
2335 \}$.
2336
2337 The forward transform computes a complex array~$Y$, whose
2338 structure is the same as that of~$X$, defined by
2339
2340 $$
2341 Y[k_1, k_2, \ldots, k_d] =
2342 \sum_{j_1 = 0}^{n_1 - 1}
2343 \sum_{j_2 = 0}^{n_2 - 1}
2344 \cdots
2345 \sum_{j_d = 0}^{n_d - 1}
2346 X[j_1, j_2, \ldots, j_d]
2347 \omega_1^{-j_1 k_1}
2348 \omega_2^{-j_2 k_2}
2349 \cdots
2350 \omega_d^{-j_d k_d} \ .
2351 $$
2352
2353 The backward transform computes
2354 $$
2355 Y[k_1, k_2, \ldots, k_d] =
2356 \sum_{j_1 = 0}^{n_1 - 1}
2357 \sum_{j_2 = 0}^{n_2 - 1}
2358 \cdots
2359 \sum_{j_d = 0}^{n_d - 1}
2360 X[j_1, j_2, \ldots, j_d]
2361 \omega_1^{j_1 k_1}
2362 \omega_2^{j_2 k_2}
2363 \cdots
2364 \omega_d^{j_d k_d} \ .
2365 $$
2366
2367 Computing the forward transform followed by the backward transform
2368 will multiply the array by $\prod_{s=1}^{d} n_d$.
2369 @end tex
2370
2371 @cindex r2c
2372 The definition of FFTW's multi-dimensional DFT of real data (r2c)
2373 deserves special attention. In this case, we logically compute the full
2374 multi-dimensional DFT of the input data; since the input data are purely
2375 real, the output data have the Hermitian symmetry and therefore only one
2376 non-redundant half need be stored. More specifically, for an @ndims multi-dimensional real-input DFT, the full (logical) complex output array
2377 @tex
2378 $Y[k_0, k_1, \ldots, k_{d-1}]$
2379 @end tex
2380 @html
2381 <i>Y</i>[<i>k</i><sub>0</sub>, <i>k</i><sub>1</sub>, ...,
2382 <i>k</i><sub><i>d-1</i></sub>]
2383 @end html
2384 @ifinfo
2385 Y[k[0], k[1], ..., k[d-1]]
2386 @end ifinfo
2387 has the symmetry:
2388 @tex
2389 $$
2390 Y[k_0, k_1, \ldots, k_{d-1}] = Y[n_0 - k_0, n_1 - k_1, \ldots, n_{d-1} - k_{d-1}]^*
2391 $$
2392 @end tex
2393 @html
2394 <i>Y</i>[<i>k</i><sub>0</sub>, <i>k</i><sub>1</sub>, ...,
2395 <i>k</i><sub><i>d-1</i></sub>] = <i>Y</i>[<i>n</i><sub>0</sub> -
2396 <i>k</i><sub>0</sub>, <i>n</i><sub>1</sub> - <i>k</i><sub>1</sub>, ...,
2397 <i>n</i><sub><i>d-1</i></sub> - <i>k</i><sub><i>d-1</i></sub>]<sup>*</sup>
2398 @end html
2399 @ifinfo
2400 Y[k[0], k[1], ..., k[d-1]] = Y[n[0] - k[0], n[1] - k[1], ..., n[d-1] - k[d-1]]*
2401 @end ifinfo
2402 (where each dimension is periodic). Because of this symmetry, we only
2403 store the
2404 @tex
2405 $k_{d-1} = 0 \cdots n_{d-1}/2$
2406 @end tex
2407 @html
2408 <i>k</i><sub><i>d-1</i></sub> = 0...<i>n</i><sub><i>d-1</i></sub>/2+1
2409 @end html
2410 @ifinfo
2411 k[d-1] = 0...n[d-1]/2
2412 @end ifinfo
2413 elements of the @emph{last} dimension (division by @math{2} is rounded
2414 down). (We could instead have cut any other dimension in half, but the
2415 last dimension proved computationally convenient.) This results in the
2416 peculiar array format described in more detail by @ref{Real-data DFT
2417 Array Format}.
2418
2419 The multi-dimensional c2r transform is simply the unnormalized inverse
2420 of the r2c transform. i.e. it is the same as FFTW's complex backward
2421 multi-dimensional DFT, operating on a Hermitian input array in the
2422 peculiar format mentioned above and outputting a real array (since the
2423 DFT output is purely real).
2424
2425 We should remind the user that the separable product of 1d transforms
2426 along each dimension, as computed by FFTW, is not always the same thing
2427 as the usual multi-dimensional transform. A multi-dimensional
2428 @code{R2HC} (or @code{HC2R}) transform is not identical to the
2429 multi-dimensional DFT, requiring some post-processing to combine the
2430 requisite real and imaginary parts, as was described in @ref{The
2431 Halfcomplex-format DFT}. Likewise, FFTW's multidimensional
2432 @code{FFTW_DHT} r2r transform is not the same thing as the logical
2433 multi-dimensional discrete Hartley transform defined in the literature,
2434 as discussed in @ref{The Discrete Hartley Transform}.
2435