cannam@127: @node FFTW Reference, Multi-threaded FFTW, Other Important Topics, Top cannam@127: @chapter FFTW Reference cannam@127: cannam@127: This chapter provides a complete reference for all sequential (i.e., cannam@127: one-processor) FFTW functions. Parallel transforms are described in cannam@127: later chapters. cannam@127: cannam@127: @menu cannam@127: * Data Types and Files:: cannam@127: * Using Plans:: cannam@127: * Basic Interface:: cannam@127: * Advanced Interface:: cannam@127: * Guru Interface:: cannam@127: * New-array Execute Functions:: cannam@127: * Wisdom:: cannam@127: * What FFTW Really Computes:: cannam@127: @end menu cannam@127: cannam@127: @c ------------------------------------------------------------ cannam@127: @node Data Types and Files, Using Plans, FFTW Reference, FFTW Reference cannam@127: @section Data Types and Files cannam@127: cannam@127: All programs using FFTW should include its header file: cannam@127: cannam@127: @example cannam@127: #include cannam@127: @end example cannam@127: cannam@127: You must also link to the FFTW library. On Unix, this cannam@127: means adding @code{-lfftw3 -lm} at the @emph{end} of the link command. cannam@127: cannam@127: @menu cannam@127: * Complex numbers:: cannam@127: * Precision:: cannam@127: * Memory Allocation:: cannam@127: @end menu cannam@127: cannam@127: @c =========> cannam@127: @node Complex numbers, Precision, Data Types and Files, Data Types and Files cannam@127: @subsection Complex numbers cannam@127: cannam@127: The default FFTW interface uses @code{double} precision for all cannam@127: floating-point numbers, and defines a @code{fftw_complex} type to hold cannam@127: complex numbers as: cannam@127: cannam@127: @example cannam@127: typedef double fftw_complex[2]; cannam@127: @end example cannam@127: @tindex fftw_complex cannam@127: cannam@127: Here, the @code{[0]} element holds the real part and the @code{[1]} cannam@127: element holds the imaginary part. cannam@127: cannam@127: Alternatively, if you have a C compiler (such as @code{gcc}) that cannam@127: supports the C99 revision of the ANSI C standard, you can use C's new cannam@127: native complex type (which is binary-compatible with the typedef above). cannam@127: In particular, if you @code{#include } @emph{before} cannam@127: @code{}, then @code{fftw_complex} is defined to be the native cannam@127: complex type and you can manipulate it with ordinary arithmetic cannam@127: (e.g. @code{x = y * (3+4*I)}, where @code{x} and @code{y} are cannam@127: @code{fftw_complex} and @code{I} is the standard symbol for the cannam@127: imaginary unit); cannam@127: @cindex C99 cannam@127: cannam@127: cannam@127: C++ has its own @code{complex} template class, defined in the cannam@127: standard @code{} header file. Reportedly, the C++ standards cannam@127: committee has recently agreed to mandate that the storage format used cannam@127: for this type be binary-compatible with the C99 type, i.e. an array cannam@127: @code{T[2]} with consecutive real @code{[0]} and imaginary @code{[1]} cannam@127: parts. (See report cannam@127: @uref{http://www.open-std.org/jtc1/sc22/WG21/docs/papers/2002/n1388.pdf cannam@127: WG21/N1388}.) Although not part of the official standard as of this cannam@127: writing, the proposal stated that: ``This solution has been tested with cannam@127: all current major implementations of the standard library and shown to cannam@127: be working.'' To the extent that this is true, if you have a variable cannam@127: @code{complex *x}, you can pass it directly to FFTW via cannam@127: @code{reinterpret_cast(x)}. cannam@127: @cindex C++ cannam@127: @cindex portability cannam@127: cannam@127: @c =========> cannam@127: @node Precision, Memory Allocation, Complex numbers, Data Types and Files cannam@127: @subsection Precision cannam@127: @cindex precision cannam@127: cannam@127: You can install single and long-double precision versions of FFTW, cannam@127: which replace @code{double} with @code{float} and @code{long double}, cannam@127: respectively (@pxref{Installation and Customization}). To use these cannam@127: interfaces, you: cannam@127: cannam@127: @itemize @bullet cannam@127: cannam@127: @item cannam@127: Link to the single/long-double libraries; on Unix, @code{-lfftw3f} or cannam@127: @code{-lfftw3l} instead of (or in addition to) @code{-lfftw3}. (You cannam@127: can link to the different-precision libraries simultaneously.) cannam@127: cannam@127: @item cannam@127: Include the @emph{same} @code{} header file. cannam@127: cannam@127: @item cannam@127: Replace all lowercase instances of @samp{fftw_} with @samp{fftwf_} or cannam@127: @samp{fftwl_} for single or long-double precision, respectively. cannam@127: (@code{fftw_complex} becomes @code{fftwf_complex}, @code{fftw_execute} cannam@127: becomes @code{fftwf_execute}, etcetera.) cannam@127: cannam@127: @item cannam@127: Uppercase names, i.e. names beginning with @samp{FFTW_}, remain the cannam@127: same. cannam@127: cannam@127: @item cannam@127: Replace @code{double} with @code{float} or @code{long double} for cannam@127: subroutine parameters. cannam@127: cannam@127: @end itemize cannam@127: cannam@127: Depending upon your compiler and/or hardware, @code{long double} may not cannam@127: be any more precise than @code{double} (or may not be supported at all, cannam@127: although it is standard in C99). cannam@127: @cindex C99 cannam@127: cannam@127: cannam@127: We also support using the nonstandard @code{__float128} cannam@127: quadruple-precision type provided by recent versions of @code{gcc} on cannam@127: 32- and 64-bit x86 hardware (@pxref{Installation and Customization}). cannam@127: To use this type, link with @code{-lfftw3q -lquadmath -lm} (the cannam@127: @code{libquadmath} library provided by @code{gcc} is needed for cannam@127: quadruple-precision trigonometric functions) and use @samp{fftwq_} cannam@127: identifiers. cannam@127: cannam@127: @c =========> cannam@127: @node Memory Allocation, , Precision, Data Types and Files cannam@127: @subsection Memory Allocation cannam@127: cannam@127: @example cannam@127: void *fftw_malloc(size_t n); cannam@127: void fftw_free(void *p); cannam@127: @end example cannam@127: @findex fftw_malloc cannam@127: @findex fftw_free cannam@127: cannam@127: These are functions that behave identically to @code{malloc} and cannam@127: @code{free}, except that they guarantee that the returned pointer obeys cannam@127: any special alignment restrictions imposed by any algorithm in FFTW cannam@127: (e.g. for SIMD acceleration). @xref{SIMD alignment and fftw_malloc}. cannam@127: @cindex alignment cannam@127: cannam@127: cannam@127: Data allocated by @code{fftw_malloc} @emph{must} be deallocated by cannam@127: @code{fftw_free} and not by the ordinary @code{free}. cannam@127: cannam@127: These routines simply call through to your operating system's cannam@127: @code{malloc} or, if necessary, its aligned equivalent cannam@127: (e.g. @code{memalign}), so you normally need not worry about any cannam@127: significant time or space overhead. You are @emph{not required} to use cannam@127: them to allocate your data, but we strongly recommend it. cannam@127: cannam@127: Note: in C++, just as with ordinary @code{malloc}, you must typecast cannam@127: the output of @code{fftw_malloc} to whatever pointer type you are cannam@127: allocating. cannam@127: @cindex C++ cannam@127: cannam@127: cannam@127: We also provide the following two convenience functions to allocate cannam@127: real and complex arrays with @code{n} elements, which are equivalent cannam@127: to @code{(double *) fftw_malloc(sizeof(double) * n)} and cannam@127: @code{(fftw_complex *) fftw_malloc(sizeof(fftw_complex) * n)}, cannam@127: respectively: cannam@127: cannam@127: @example cannam@127: double *fftw_alloc_real(size_t n); cannam@127: fftw_complex *fftw_alloc_complex(size_t n); cannam@127: @end example cannam@127: @findex fftw_alloc_real cannam@127: @findex fftw_alloc_complex cannam@127: cannam@127: The equivalent functions in other precisions allocate arrays of @code{n} cannam@127: elements in that precision. e.g. @code{fftwf_alloc_real(n)} is cannam@127: equivalent to @code{(float *) fftwf_malloc(sizeof(float) * n)}. cannam@127: @cindex precision cannam@127: cannam@127: @c ------------------------------------------------------------ cannam@127: @node Using Plans, Basic Interface, Data Types and Files, FFTW Reference cannam@127: @section Using Plans cannam@127: cannam@127: Plans for all transform types in FFTW are stored as type cannam@127: @code{fftw_plan} (an opaque pointer type), and are created by one of the cannam@127: various planning routines described in the following sections. cannam@127: @tindex fftw_plan cannam@127: An @code{fftw_plan} contains all information necessary to compute the cannam@127: transform, including the pointers to the input and output arrays. cannam@127: cannam@127: @example cannam@127: void fftw_execute(const fftw_plan plan); cannam@127: @end example cannam@127: @findex fftw_execute cannam@127: cannam@127: This executes the @code{plan}, to compute the corresponding transform on cannam@127: the arrays for which it was planned (which must still exist). The plan cannam@127: is not modified, and @code{fftw_execute} can be called as many times as cannam@127: desired. cannam@127: cannam@127: To apply a given plan to a different array, you can use the new-array execute cannam@127: interface. @xref{New-array Execute Functions}. cannam@127: cannam@127: @code{fftw_execute} (and equivalents) is the only function in FFTW cannam@127: guaranteed to be thread-safe; see @ref{Thread safety}. cannam@127: cannam@127: This function: cannam@127: @example cannam@127: void fftw_destroy_plan(fftw_plan plan); cannam@127: @end example cannam@127: @findex fftw_destroy_plan cannam@127: deallocates the @code{plan} and all its associated data. cannam@127: cannam@127: FFTW's planner saves some other persistent data, such as the cannam@127: accumulated wisdom and a list of algorithms available in the current cannam@127: configuration. If you want to deallocate all of that and reset FFTW cannam@127: to the pristine state it was in when you started your program, you can cannam@127: call: cannam@127: cannam@127: @example cannam@127: void fftw_cleanup(void); cannam@127: @end example cannam@127: @findex fftw_cleanup cannam@127: cannam@127: After calling @code{fftw_cleanup}, all existing plans become undefined, cannam@127: and you should not attempt to execute them nor to destroy them. You can cannam@127: however create and execute/destroy new plans, in which case FFTW starts cannam@127: accumulating wisdom information again. cannam@127: cannam@127: @code{fftw_cleanup} does not deallocate your plans, however. To prevent cannam@127: memory leaks, you must still call @code{fftw_destroy_plan} before cannam@127: executing @code{fftw_cleanup}. cannam@127: cannam@127: Occasionally, it may useful to know FFTW's internal ``cost'' metric cannam@127: that it uses to compare plans to one another; this cost is cannam@127: proportional to an execution time of the plan, in undocumented units, cannam@127: if the plan was created with the @code{FFTW_MEASURE} or other cannam@127: timing-based options, or alternatively is a heuristic cost function cannam@127: for @code{FFTW_ESTIMATE} plans. (The cost values of measured and cannam@127: estimated plans are not comparable, being in different units. Also, cannam@127: costs from different FFTW versions or the same version compiled cannam@127: differently may not be in the same units. Plans created from wisdom cannam@127: have a cost of 0 since no timing measurement is performed for them. cannam@127: Finally, certain problems for which only one top-level algorithm was cannam@127: possible may have required no measurements of the cost of the whole cannam@127: plan, in which case @code{fftw_cost} will also return 0.) The cost cannam@127: metric for a given plan is returned by: cannam@127: cannam@127: @example cannam@127: double fftw_cost(const fftw_plan plan); cannam@127: @end example cannam@127: @findex fftw_cost cannam@127: cannam@127: The following two routines are provided purely for academic purposes cannam@127: (that is, for entertainment). cannam@127: cannam@127: @example cannam@127: void fftw_flops(const fftw_plan plan, cannam@127: double *add, double *mul, double *fma); cannam@127: @end example cannam@127: @findex fftw_flops cannam@127: cannam@127: Given a @code{plan}, set @code{add}, @code{mul}, and @code{fma} to an cannam@127: exact count of the number of floating-point additions, multiplications, cannam@127: and fused multiply-add operations involved in the plan's execution. The cannam@127: total number of floating-point operations (flops) is @code{add + mul + cannam@127: 2*fma}, or @code{add + mul + fma} if the hardware supports fused cannam@127: multiply-add instructions (although the number of FMA operations is only cannam@127: approximate because of compiler voodoo). (The number of operations cannam@127: should be an integer, but we use @code{double} to avoid overflowing cannam@127: @code{int} for large transforms; the arguments are of type @code{double} cannam@127: even for single and long-double precision versions of FFTW.) cannam@127: cannam@127: @example cannam@127: void fftw_fprint_plan(const fftw_plan plan, FILE *output_file); cannam@127: void fftw_print_plan(const fftw_plan plan); cannam@127: char *fftw_sprint_plan(const fftw_plan plan); cannam@127: @end example cannam@127: @findex fftw_fprint_plan cannam@127: @findex fftw_print_plan cannam@127: cannam@127: This outputs a ``nerd-readable'' representation of the @code{plan} to cannam@127: the given file, to @code{stdout}, or two a newly allocated cannam@127: NUL-terminated string (which the caller is responsible for deallocating cannam@127: with @code{free}), respectively. cannam@127: cannam@127: @c ------------------------------------------------------------ cannam@127: @node Basic Interface, Advanced Interface, Using Plans, FFTW Reference cannam@127: @section Basic Interface cannam@127: @cindex basic interface cannam@127: cannam@127: Recall that the FFTW API is divided into three parts@footnote{@i{Gallia est cannam@127: omnis divisa in partes tres} (Julius Caesar).}: the @dfn{basic interface} cannam@127: computes a single transform of contiguous data, the @dfn{advanced cannam@127: interface} computes transforms of multiple or strided arrays, and the cannam@127: @dfn{guru interface} supports the most general data layouts, cannam@127: multiplicities, and strides. This section describes the the basic cannam@127: interface, which we expect to satisfy the needs of most users. cannam@127: cannam@127: @menu cannam@127: * Complex DFTs:: cannam@127: * Planner Flags:: cannam@127: * Real-data DFTs:: cannam@127: * Real-data DFT Array Format:: cannam@127: * Real-to-Real Transforms:: cannam@127: * Real-to-Real Transform Kinds:: cannam@127: @end menu cannam@127: cannam@127: @c =========> cannam@127: @node Complex DFTs, Planner Flags, Basic Interface, Basic Interface cannam@127: @subsection Complex DFTs cannam@127: cannam@127: @example cannam@127: fftw_plan fftw_plan_dft_1d(int n0, cannam@127: fftw_complex *in, fftw_complex *out, cannam@127: int sign, unsigned flags); cannam@127: fftw_plan fftw_plan_dft_2d(int n0, int n1, cannam@127: fftw_complex *in, fftw_complex *out, cannam@127: int sign, unsigned flags); cannam@127: fftw_plan fftw_plan_dft_3d(int n0, int n1, int n2, cannam@127: fftw_complex *in, fftw_complex *out, cannam@127: int sign, unsigned flags); cannam@127: fftw_plan fftw_plan_dft(int rank, const int *n, cannam@127: fftw_complex *in, fftw_complex *out, cannam@127: int sign, unsigned flags); cannam@127: @end example cannam@127: @findex fftw_plan_dft_1d cannam@127: @findex fftw_plan_dft_2d cannam@127: @findex fftw_plan_dft_3d cannam@127: @findex fftw_plan_dft cannam@127: cannam@127: Plan a complex input/output discrete Fourier transform (DFT) in zero or cannam@127: more dimensions, returning an @code{fftw_plan} (@pxref{Using Plans}). cannam@127: cannam@127: Once you have created a plan for a certain transform type and cannam@127: parameters, then creating another plan of the same type and parameters, cannam@127: but for different arrays, is fast and shares constant data with the cannam@127: first plan (if it still exists). cannam@127: cannam@127: The planner returns @code{NULL} if the plan cannot be created. In the cannam@127: standard FFTW distribution, the basic interface is guaranteed to return cannam@127: a non-@code{NULL} plan. A plan may be @code{NULL}, however, if you are cannam@127: using a customized FFTW configuration supporting a restricted set of cannam@127: transforms. cannam@127: cannam@127: @subsubheading Arguments cannam@127: @itemize @bullet cannam@127: cannam@127: @item cannam@127: @code{rank} is the rank of the transform (it should be the size of the cannam@127: array @code{*n}), and can be any non-negative integer. (@xref{Complex cannam@127: Multi-Dimensional DFTs}, for the definition of ``rank''.) The cannam@127: @samp{_1d}, @samp{_2d}, and @samp{_3d} planners correspond to a cannam@127: @code{rank} of @code{1}, @code{2}, and @code{3}, respectively. The rank cannam@127: may be zero, which is equivalent to a rank-1 transform of size 1, i.e. a cannam@127: copy of one number from input to output. cannam@127: cannam@127: @item cannam@127: @code{n0}, @code{n1}, @code{n2}, or @code{n[0..rank-1]} (as appropriate cannam@127: for each routine) specify the size of the transform dimensions. They cannam@127: can be any positive integer. cannam@127: cannam@127: @itemize @minus cannam@127: @item cannam@127: @cindex row-major cannam@127: Multi-dimensional arrays are stored in row-major order with dimensions: cannam@127: @code{n0} x @code{n1}; or @code{n0} x @code{n1} x @code{n2}; or cannam@127: @code{n[0]} x @code{n[1]} x ... x @code{n[rank-1]}. cannam@127: @xref{Multi-dimensional Array Format}. cannam@127: @item cannam@127: FFTW is best at handling sizes of the form cannam@127: @ifinfo cannam@127: @math{2^a 3^b 5^c 7^d 11^e 13^f}, cannam@127: @end ifinfo cannam@127: @tex cannam@127: $2^a 3^b 5^c 7^d 11^e 13^f$, cannam@127: @end tex cannam@127: @html cannam@127: 2a 3b 5c 7d cannam@127: 11e 13f, cannam@127: @end html cannam@127: where @math{e+f} is either @math{0} or @math{1}, and the other exponents cannam@127: are arbitrary. Other sizes are computed by means of a slow, cannam@127: general-purpose algorithm (which nevertheless retains @Onlogn{} performance even for prime sizes). It is possible to customize FFTW cannam@127: for different array sizes; see @ref{Installation and Customization}. cannam@127: Transforms whose sizes are powers of @math{2} are especially fast. cannam@127: @end itemize cannam@127: cannam@127: @item cannam@127: @code{in} and @code{out} point to the input and output arrays of the cannam@127: transform, which may be the same (yielding an in-place transform). cannam@127: @cindex in-place cannam@127: These arrays are overwritten during planning, unless cannam@127: @code{FFTW_ESTIMATE} is used in the flags. (The arrays need not be cannam@127: initialized, but they must be allocated.) cannam@127: cannam@127: If @code{in == out}, the transform is @dfn{in-place} and the input cannam@127: array is overwritten. If @code{in != out}, the two arrays must cannam@127: not overlap (but FFTW does not check for this condition). cannam@127: cannam@127: @item cannam@127: @ctindex FFTW_FORWARD cannam@127: @ctindex FFTW_BACKWARD cannam@127: @code{sign} is the sign of the exponent in the formula that defines the cannam@127: Fourier transform. It can be @math{-1} (= @code{FFTW_FORWARD}) or cannam@127: @math{+1} (= @code{FFTW_BACKWARD}). cannam@127: cannam@127: @item cannam@127: @cindex flags cannam@127: @code{flags} is a bitwise OR (@samp{|}) of zero or more planner flags, cannam@127: as defined in @ref{Planner Flags}. cannam@127: cannam@127: @end itemize cannam@127: cannam@127: FFTW computes an unnormalized transform: computing a forward followed by cannam@127: a backward transform (or vice versa) will result in the original data cannam@127: multiplied by the size of the transform (the product of the dimensions). cannam@127: @cindex normalization cannam@127: For more information, see @ref{What FFTW Really Computes}. cannam@127: cannam@127: @c =========> cannam@127: @node Planner Flags, Real-data DFTs, Complex DFTs, Basic Interface cannam@127: @subsection Planner Flags cannam@127: cannam@127: All of the planner routines in FFTW accept an integer @code{flags} cannam@127: argument, which is a bitwise OR (@samp{|}) of zero or more of the flag cannam@127: constants defined below. These flags control the rigor (and time) of cannam@127: the planning process, and can also impose (or lift) restrictions on the cannam@127: type of transform algorithm that is employed. cannam@127: cannam@127: @emph{Important:} the planner overwrites the input array during cannam@127: planning unless a saved plan (@pxref{Wisdom}) is available for that cannam@127: problem, so you should initialize your input data after creating the cannam@127: plan. The only exceptions to this are the @code{FFTW_ESTIMATE} and cannam@127: @code{FFTW_WISDOM_ONLY} flags, as mentioned below. cannam@127: cannam@127: In all cases, if wisdom is available for the given problem that was cannam@127: created with equal-or-greater planning rigor, then the more rigorous cannam@127: wisdom is used. For example, in @code{FFTW_ESTIMATE} mode any available cannam@127: wisdom is used, whereas in @code{FFTW_PATIENT} mode only wisdom created cannam@127: in patient or exhaustive mode can be used. @xref{Words of Wisdom-Saving cannam@127: Plans}. cannam@127: cannam@127: @subsubheading Planning-rigor flags cannam@127: @itemize @bullet cannam@127: cannam@127: @item cannam@127: @ctindex FFTW_ESTIMATE cannam@127: @code{FFTW_ESTIMATE} specifies that, instead of actual measurements of cannam@127: different algorithms, a simple heuristic is used to pick a (probably cannam@127: sub-optimal) plan quickly. With this flag, the input/output arrays are cannam@127: not overwritten during planning. cannam@127: cannam@127: @item cannam@127: @ctindex FFTW_MEASURE cannam@127: @code{FFTW_MEASURE} tells FFTW to find an optimized plan by actually cannam@127: @emph{computing} several FFTs and measuring their execution time. cannam@127: Depending on your machine, this can take some time (often a few cannam@127: seconds). @code{FFTW_MEASURE} is the default planning option. cannam@127: cannam@127: @item cannam@127: @ctindex FFTW_PATIENT cannam@127: @code{FFTW_PATIENT} is like @code{FFTW_MEASURE}, but considers a wider cannam@127: range of algorithms and often produces a ``more optimal'' plan cannam@127: (especially for large transforms), but at the expense of several times cannam@127: longer planning time (especially for large transforms). cannam@127: cannam@127: @item cannam@127: @ctindex FFTW_EXHAUSTIVE cannam@127: @code{FFTW_EXHAUSTIVE} is like @code{FFTW_PATIENT}, but considers an cannam@127: even wider range of algorithms, including many that we think are cannam@127: unlikely to be fast, to produce the most optimal plan but with a cannam@127: substantially increased planning time. cannam@127: cannam@127: @item cannam@127: @ctindex FFTW_WISDOM_ONLY cannam@127: @code{FFTW_WISDOM_ONLY} is a special planning mode in which the plan cannam@127: is only created if wisdom is available for the given problem, and cannam@127: otherwise a @code{NULL} plan is returned. This can be combined with cannam@127: other flags, e.g. @samp{FFTW_WISDOM_ONLY | FFTW_PATIENT} creates a cannam@127: plan only if wisdom is available that was created in cannam@127: @code{FFTW_PATIENT} or @code{FFTW_EXHAUSTIVE} mode. The cannam@127: @code{FFTW_WISDOM_ONLY} flag is intended for users who need to detect cannam@127: whether wisdom is available; for example, if wisdom is not available cannam@127: one may wish to allocate new arrays for planning so that user data is cannam@127: not overwritten. cannam@127: cannam@127: @end itemize cannam@127: cannam@127: @subsubheading Algorithm-restriction flags cannam@127: @itemize @bullet cannam@127: cannam@127: @item cannam@127: @ctindex FFTW_DESTROY_INPUT cannam@127: @code{FFTW_DESTROY_INPUT} specifies that an out-of-place transform is cannam@127: allowed to @emph{overwrite its input} array with arbitrary data; this cannam@127: can sometimes allow more efficient algorithms to be employed. cannam@127: @cindex out-of-place cannam@127: cannam@127: @item cannam@127: @ctindex FFTW_PRESERVE_INPUT cannam@127: @code{FFTW_PRESERVE_INPUT} specifies that an out-of-place transform must cannam@127: @emph{not change its input} array. This is ordinarily the cannam@127: @emph{default}, except for c2r and hc2r (i.e. complex-to-real) cannam@127: transforms for which @code{FFTW_DESTROY_INPUT} is the default. In the cannam@127: latter cases, passing @code{FFTW_PRESERVE_INPUT} will attempt to use cannam@127: algorithms that do not destroy the input, at the expense of worse cannam@127: performance; for multi-dimensional c2r transforms, however, no cannam@127: input-preserving algorithms are implemented and the planner will return cannam@127: @code{NULL} if one is requested. cannam@127: @cindex c2r cannam@127: @cindex hc2r cannam@127: cannam@127: @item cannam@127: @ctindex FFTW_UNALIGNED cannam@127: @cindex alignment cannam@127: @findex fftw_malloc cannam@127: @findex fftw_alignment_of cannam@127: @code{FFTW_UNALIGNED} specifies that the algorithm may not impose any cannam@127: unusual alignment requirements on the input/output arrays (i.e. no cannam@127: SIMD may be used). This flag is normally @emph{not necessary}, since cannam@127: the planner automatically detects misaligned arrays. The only use for cannam@127: this flag is if you want to use the new-array execute interface to cannam@127: execute a given plan on a different array that may not be aligned like cannam@127: the original. (Using @code{fftw_malloc} makes this flag unnecessary cannam@127: even then. You can also use @code{fftw_alignment_of} to detect cannam@127: whether two arrays are equivalently aligned.) cannam@127: cannam@127: @end itemize cannam@127: cannam@127: @subsubheading Limiting planning time cannam@127: cannam@127: @example cannam@127: extern void fftw_set_timelimit(double seconds); cannam@127: @end example cannam@127: @findex fftw_set_timelimit cannam@127: cannam@127: This function instructs FFTW to spend at most @code{seconds} seconds cannam@127: (approximately) in the planner. If @code{seconds == cannam@127: FFTW_NO_TIMELIMIT} (the default value, which is negative), then cannam@127: planning time is unbounded. Otherwise, FFTW plans with a cannam@127: progressively wider range of algorithms until the the given time limit cannam@127: is reached or the given range of algorithms is explored, returning the cannam@127: best available plan. cannam@127: @ctindex FFTW_NO_TIMELIMIT cannam@127: cannam@127: cannam@127: For example, specifying @code{FFTW_PATIENT} first plans in cannam@127: @code{FFTW_ESTIMATE} mode, then in @code{FFTW_MEASURE} mode, then cannam@127: finally (time permitting) in @code{FFTW_PATIENT}. If cannam@127: @code{FFTW_EXHAUSTIVE} is specified instead, the planner will further cannam@127: progress to @code{FFTW_EXHAUSTIVE} mode. cannam@127: cannam@127: Note that the @code{seconds} argument specifies only a rough limit; in cannam@127: practice, the planner may use somewhat more time if the time limit is cannam@127: reached when the planner is in the middle of an operation that cannot cannam@127: be interrupted. At the very least, the planner will complete planning cannam@127: in @code{FFTW_ESTIMATE} mode (which is thus equivalent to a time limit cannam@127: of 0). cannam@127: cannam@127: cannam@127: @c =========> cannam@127: @node Real-data DFTs, Real-data DFT Array Format, Planner Flags, Basic Interface cannam@127: @subsection Real-data DFTs cannam@127: cannam@127: @example cannam@127: fftw_plan fftw_plan_dft_r2c_1d(int n0, cannam@127: double *in, fftw_complex *out, cannam@127: unsigned flags); cannam@127: fftw_plan fftw_plan_dft_r2c_2d(int n0, int n1, cannam@127: double *in, fftw_complex *out, cannam@127: unsigned flags); cannam@127: fftw_plan fftw_plan_dft_r2c_3d(int n0, int n1, int n2, cannam@127: double *in, fftw_complex *out, cannam@127: unsigned flags); cannam@127: fftw_plan fftw_plan_dft_r2c(int rank, const int *n, cannam@127: double *in, fftw_complex *out, cannam@127: unsigned flags); cannam@127: @end example cannam@127: @findex fftw_plan_dft_r2c_1d cannam@127: @findex fftw_plan_dft_r2c_2d cannam@127: @findex fftw_plan_dft_r2c_3d cannam@127: @findex fftw_plan_dft_r2c cannam@127: @cindex r2c cannam@127: cannam@127: Plan a real-input/complex-output discrete Fourier transform (DFT) in cannam@127: zero or more dimensions, returning an @code{fftw_plan} (@pxref{Using cannam@127: Plans}). cannam@127: cannam@127: Once you have created a plan for a certain transform type and cannam@127: parameters, then creating another plan of the same type and parameters, cannam@127: but for different arrays, is fast and shares constant data with the cannam@127: first plan (if it still exists). cannam@127: cannam@127: The planner returns @code{NULL} if the plan cannot be created. A cannam@127: non-@code{NULL} plan is always returned by the basic interface unless cannam@127: you are using a customized FFTW configuration supporting a restricted cannam@127: set of transforms, or if you use the @code{FFTW_PRESERVE_INPUT} flag cannam@127: with a multi-dimensional out-of-place c2r transform (see below). cannam@127: cannam@127: @subsubheading Arguments cannam@127: @itemize @bullet cannam@127: cannam@127: @item cannam@127: @code{rank} is the rank of the transform (it should be the size of the cannam@127: array @code{*n}), and can be any non-negative integer. (@xref{Complex cannam@127: Multi-Dimensional DFTs}, for the definition of ``rank''.) The cannam@127: @samp{_1d}, @samp{_2d}, and @samp{_3d} planners correspond to a cannam@127: @code{rank} of @code{1}, @code{2}, and @code{3}, respectively. The rank cannam@127: may be zero, which is equivalent to a rank-1 transform of size 1, i.e. a cannam@127: copy of one real number (with zero imaginary part) from input to output. cannam@127: cannam@127: @item cannam@127: @code{n0}, @code{n1}, @code{n2}, or @code{n[0..rank-1]}, (as appropriate cannam@127: for each routine) specify the size of the transform dimensions. They cannam@127: can be any positive integer. This is different in general from the cannam@127: @emph{physical} array dimensions, which are described in @ref{Real-data cannam@127: DFT Array Format}. cannam@127: cannam@127: @itemize @minus cannam@127: @item cannam@127: FFTW is best at handling sizes of the form cannam@127: @ifinfo cannam@127: @math{2^a 3^b 5^c 7^d 11^e 13^f}, cannam@127: @end ifinfo cannam@127: @tex cannam@127: $2^a 3^b 5^c 7^d 11^e 13^f$, cannam@127: @end tex cannam@127: @html cannam@127: 2a 3b 5c 7d cannam@127: 11e 13f, cannam@127: @end html cannam@127: where @math{e+f} is either @math{0} or @math{1}, and the other exponents cannam@127: are arbitrary. Other sizes are computed by means of a slow, cannam@127: general-purpose algorithm (which nevertheless retains @Onlogn{} performance even for prime sizes). (It is possible to customize FFTW cannam@127: for different array sizes; see @ref{Installation and Customization}.) cannam@127: Transforms whose sizes are powers of @math{2} are especially fast, and cannam@127: it is generally beneficial for the @emph{last} dimension of an r2c/c2r cannam@127: transform to be @emph{even}. cannam@127: @end itemize cannam@127: cannam@127: @item cannam@127: @code{in} and @code{out} point to the input and output arrays of the cannam@127: transform, which may be the same (yielding an in-place transform). cannam@127: @cindex in-place cannam@127: These arrays are overwritten during planning, unless cannam@127: @code{FFTW_ESTIMATE} is used in the flags. (The arrays need not be cannam@127: initialized, but they must be allocated.) For an in-place transform, it cannam@127: is important to remember that the real array will require padding, cannam@127: described in @ref{Real-data DFT Array Format}. cannam@127: @cindex padding cannam@127: cannam@127: @item cannam@127: @cindex flags cannam@127: @code{flags} is a bitwise OR (@samp{|}) of zero or more planner flags, cannam@127: as defined in @ref{Planner Flags}. cannam@127: cannam@127: @end itemize cannam@127: cannam@127: The inverse transforms, taking complex input (storing the non-redundant cannam@127: half of a logically Hermitian array) to real output, are given by: cannam@127: cannam@127: @example cannam@127: fftw_plan fftw_plan_dft_c2r_1d(int n0, cannam@127: fftw_complex *in, double *out, cannam@127: unsigned flags); cannam@127: fftw_plan fftw_plan_dft_c2r_2d(int n0, int n1, cannam@127: fftw_complex *in, double *out, cannam@127: unsigned flags); cannam@127: fftw_plan fftw_plan_dft_c2r_3d(int n0, int n1, int n2, cannam@127: fftw_complex *in, double *out, cannam@127: unsigned flags); cannam@127: fftw_plan fftw_plan_dft_c2r(int rank, const int *n, cannam@127: fftw_complex *in, double *out, cannam@127: unsigned flags); cannam@127: @end example cannam@127: @findex fftw_plan_dft_c2r_1d cannam@127: @findex fftw_plan_dft_c2r_2d cannam@127: @findex fftw_plan_dft_c2r_3d cannam@127: @findex fftw_plan_dft_c2r cannam@127: @cindex c2r cannam@127: cannam@127: The arguments are the same as for the r2c transforms, except that the cannam@127: input and output data formats are reversed. cannam@127: cannam@127: FFTW computes an unnormalized transform: computing an r2c followed by a cannam@127: c2r transform (or vice versa) will result in the original data cannam@127: multiplied by the size of the transform (the product of the logical cannam@127: dimensions). cannam@127: @cindex normalization cannam@127: An r2c transform produces the same output as a @code{FFTW_FORWARD} cannam@127: complex DFT of the same input, and a c2r transform is correspondingly cannam@127: equivalent to @code{FFTW_BACKWARD}. For more information, see @ref{What cannam@127: FFTW Really Computes}. cannam@127: cannam@127: @c =========> cannam@127: @node Real-data DFT Array Format, Real-to-Real Transforms, Real-data DFTs, Basic Interface cannam@127: @subsection Real-data DFT Array Format cannam@127: @cindex r2c/c2r multi-dimensional array format cannam@127: cannam@127: The output of a DFT of real data (r2c) contains symmetries that, in cannam@127: principle, make half of the outputs redundant (@pxref{What FFTW Really cannam@127: Computes}). (Similarly for the input of an inverse c2r transform.) In cannam@127: practice, it is not possible to entirely realize these savings in an cannam@127: efficient and understandable format that generalizes to cannam@127: multi-dimensional transforms. Instead, the output of the r2c cannam@127: transforms is @emph{slightly} over half of the output of the cannam@127: corresponding complex transform. We do not ``pack'' the data in any cannam@127: way, but store it as an ordinary array of @code{fftw_complex} values. cannam@127: In fact, this data is simply a subsection of what would be the array in cannam@127: the corresponding complex transform. cannam@127: cannam@127: Specifically, for a real transform of @math{d} (= @code{rank}) cannam@127: dimensions @ndims{}, the complex data is an @ndimshalf array of cannam@127: @code{fftw_complex} values in row-major order (with the division rounded cannam@127: down). That is, we only store the @emph{lower} half (non-negative cannam@127: frequencies), plus one element, of the last dimension of the data from cannam@127: the ordinary complex transform. (We could have instead taken half of cannam@127: any other dimension, but implementation turns out to be simpler if the cannam@127: last, contiguous, dimension is used.) cannam@127: cannam@127: @cindex out-of-place cannam@127: For an out-of-place transform, the real data is simply an array with cannam@127: physical dimensions @ndims in row-major order. cannam@127: cannam@127: @cindex in-place cannam@127: @cindex padding cannam@127: For an in-place transform, some complications arise since the complex data cannam@127: is slightly larger than the real data. In this case, the final cannam@127: dimension of the real data must be @emph{padded} with extra values to cannam@127: accommodate the size of the complex data---two extra if the last cannam@127: dimension is even and one if it is odd. That is, the last dimension of cannam@127: the real data must physically contain cannam@127: @tex cannam@127: $2 (n_{d-1}/2+1)$ cannam@127: @end tex cannam@127: @ifinfo cannam@127: 2 * (n[d-1]/2+1) cannam@127: @end ifinfo cannam@127: @html cannam@127: 2 * (nd-1/2+1) cannam@127: @end html cannam@127: @code{double} values (exactly enough to hold the complex data). This cannam@127: physical array size does not, however, change the @emph{logical} array cannam@127: size---only cannam@127: @tex cannam@127: $n_{d-1}$ cannam@127: @end tex cannam@127: @ifinfo cannam@127: n[d-1] cannam@127: @end ifinfo cannam@127: @html cannam@127: nd-1 cannam@127: @end html cannam@127: values are actually stored in the last dimension, and cannam@127: @tex cannam@127: $n_{d-1}$ cannam@127: @end tex cannam@127: @ifinfo cannam@127: n[d-1] cannam@127: @end ifinfo cannam@127: @html cannam@127: nd-1 cannam@127: @end html cannam@127: is the last dimension passed to the planner. cannam@127: cannam@127: @c =========> cannam@127: @node Real-to-Real Transforms, Real-to-Real Transform Kinds, Real-data DFT Array Format, Basic Interface cannam@127: @subsection Real-to-Real Transforms cannam@127: @cindex r2r cannam@127: cannam@127: @example cannam@127: fftw_plan fftw_plan_r2r_1d(int n, double *in, double *out, cannam@127: fftw_r2r_kind kind, unsigned flags); cannam@127: fftw_plan fftw_plan_r2r_2d(int n0, int n1, double *in, double *out, cannam@127: fftw_r2r_kind kind0, fftw_r2r_kind kind1, cannam@127: unsigned flags); cannam@127: fftw_plan fftw_plan_r2r_3d(int n0, int n1, int n2, cannam@127: double *in, double *out, cannam@127: fftw_r2r_kind kind0, cannam@127: fftw_r2r_kind kind1, cannam@127: fftw_r2r_kind kind2, cannam@127: unsigned flags); cannam@127: fftw_plan fftw_plan_r2r(int rank, const int *n, double *in, double *out, cannam@127: const fftw_r2r_kind *kind, unsigned flags); cannam@127: @end example cannam@127: @findex fftw_plan_r2r_1d cannam@127: @findex fftw_plan_r2r_2d cannam@127: @findex fftw_plan_r2r_3d cannam@127: @findex fftw_plan_r2r cannam@127: cannam@127: Plan a real input/output (r2r) transform of various kinds in zero or cannam@127: more dimensions, returning an @code{fftw_plan} (@pxref{Using Plans}). cannam@127: cannam@127: Once you have created a plan for a certain transform type and cannam@127: parameters, then creating another plan of the same type and parameters, cannam@127: but for different arrays, is fast and shares constant data with the cannam@127: first plan (if it still exists). cannam@127: cannam@127: The planner returns @code{NULL} if the plan cannot be created. A cannam@127: non-@code{NULL} plan is always returned by the basic interface unless cannam@127: you are using a customized FFTW configuration supporting a restricted cannam@127: set of transforms, or for size-1 @code{FFTW_REDFT00} kinds (which are cannam@127: not defined). cannam@127: @ctindex FFTW_REDFT00 cannam@127: cannam@127: @subsubheading Arguments cannam@127: @itemize @bullet cannam@127: cannam@127: @item cannam@127: @code{rank} is the dimensionality of the transform (it should be the cannam@127: size of the arrays @code{*n} and @code{*kind}), and can be any cannam@127: non-negative integer. The @samp{_1d}, @samp{_2d}, and @samp{_3d} cannam@127: planners correspond to a @code{rank} of @code{1}, @code{2}, and cannam@127: @code{3}, respectively. A @code{rank} of zero is equivalent to a copy cannam@127: of one number from input to output. cannam@127: cannam@127: @item cannam@127: @code{n}, or @code{n0}/@code{n1}/@code{n2}, or @code{n[rank]}, cannam@127: respectively, gives the (physical) size of the transform dimensions. cannam@127: They can be any positive integer. cannam@127: cannam@127: @itemize @minus cannam@127: @item cannam@127: @cindex row-major cannam@127: Multi-dimensional arrays are stored in row-major order with dimensions: cannam@127: @code{n0} x @code{n1}; or @code{n0} x @code{n1} x @code{n2}; or cannam@127: @code{n[0]} x @code{n[1]} x ... x @code{n[rank-1]}. cannam@127: @xref{Multi-dimensional Array Format}. cannam@127: @item cannam@127: FFTW is generally best at handling sizes of the form cannam@127: @ifinfo cannam@127: @math{2^a 3^b 5^c 7^d 11^e 13^f}, cannam@127: @end ifinfo cannam@127: @tex cannam@127: $2^a 3^b 5^c 7^d 11^e 13^f$, cannam@127: @end tex cannam@127: @html cannam@127: 2a 3b 5c 7d cannam@127: 11e 13f, cannam@127: @end html cannam@127: where @math{e+f} is either @math{0} or @math{1}, and the other exponents cannam@127: are arbitrary. Other sizes are computed by means of a slow, cannam@127: general-purpose algorithm (which nevertheless retains @Onlogn{} performance even for prime sizes). (It is possible to customize FFTW cannam@127: for different array sizes; see @ref{Installation and Customization}.) cannam@127: Transforms whose sizes are powers of @math{2} are especially fast. cannam@127: @item cannam@127: For a @code{REDFT00} or @code{RODFT00} transform kind in a dimension of cannam@127: size @math{n}, it is @math{n-1} or @math{n+1}, respectively, that cannam@127: should be factorizable in the above form. cannam@127: @end itemize cannam@127: cannam@127: @item cannam@127: @code{in} and @code{out} point to the input and output arrays of the cannam@127: transform, which may be the same (yielding an in-place transform). cannam@127: @cindex in-place cannam@127: These arrays are overwritten during planning, unless cannam@127: @code{FFTW_ESTIMATE} is used in the flags. (The arrays need not be cannam@127: initialized, but they must be allocated.) cannam@127: cannam@127: @item cannam@127: @code{kind}, or @code{kind0}/@code{kind1}/@code{kind2}, or cannam@127: @code{kind[rank]}, is the kind of r2r transform used for the cannam@127: corresponding dimension. The valid kind constants are described in cannam@127: @ref{Real-to-Real Transform Kinds}. In a multi-dimensional transform, cannam@127: what is computed is the separable product formed by taking each cannam@127: transform kind along the corresponding dimension, one dimension after cannam@127: another. cannam@127: cannam@127: @item cannam@127: @cindex flags cannam@127: @code{flags} is a bitwise OR (@samp{|}) of zero or more planner flags, cannam@127: as defined in @ref{Planner Flags}. cannam@127: cannam@127: @end itemize cannam@127: cannam@127: @c =========> cannam@127: @node Real-to-Real Transform Kinds, , Real-to-Real Transforms, Basic Interface cannam@127: @subsection Real-to-Real Transform Kinds cannam@127: @cindex kind (r2r) cannam@127: cannam@127: FFTW currently supports 11 different r2r transform kinds, specified by cannam@127: one of the constants below. For the precise definitions of these cannam@127: transforms, see @ref{What FFTW Really Computes}. For a more colloquial cannam@127: introduction to these transform kinds, see @ref{More DFTs of Real Data}. cannam@127: cannam@127: For dimension of size @code{n}, there is a corresponding ``logical'' cannam@127: dimension @code{N} that determines the normalization (and the optimal cannam@127: factorization); the formula for @code{N} is given for each kind below. cannam@127: Also, with each transform kind is listed its corrsponding inverse cannam@127: transform. FFTW computes unnormalized transforms: a transform followed cannam@127: by its inverse will result in the original data multiplied by @code{N} cannam@127: (or the product of the @code{N}'s for each dimension, in cannam@127: multi-dimensions). cannam@127: @cindex normalization cannam@127: cannam@127: @itemize @bullet cannam@127: cannam@127: @item cannam@127: @ctindex FFTW_R2HC cannam@127: @code{FFTW_R2HC} computes a real-input DFT with output in cannam@127: ``halfcomplex'' format, i.e. real and imaginary parts for a transform of cannam@127: size @code{n} stored as: cannam@127: @tex cannam@127: $$ cannam@127: r_0, r_1, r_2, \ldots, r_{n/2}, i_{(n+1)/2-1}, \ldots, i_2, i_1 cannam@127: $$ cannam@127: @end tex cannam@127: @ifinfo cannam@127: r0, r1, r2, r(n/2), i((n+1)/2-1), ..., i2, i1 cannam@127: @end ifinfo cannam@127: @html cannam@127:

cannam@127: r0, r1, r2, ..., rn/2, i(n+1)/2-1, ..., i2, i1 cannam@127:

cannam@127: @end html cannam@127: (Logical @code{N=n}, inverse is @code{FFTW_HC2R}.) cannam@127: cannam@127: @item cannam@127: @ctindex FFTW_HC2R cannam@127: @code{FFTW_HC2R} computes the reverse of @code{FFTW_R2HC}, above. cannam@127: (Logical @code{N=n}, inverse is @code{FFTW_R2HC}.) cannam@127: cannam@127: @item cannam@127: @ctindex FFTW_DHT cannam@127: @code{FFTW_DHT} computes a discrete Hartley transform. cannam@127: (Logical @code{N=n}, inverse is @code{FFTW_DHT}.) cannam@127: @cindex discrete Hartley transform cannam@127: cannam@127: @item cannam@127: @ctindex FFTW_REDFT00 cannam@127: @code{FFTW_REDFT00} computes an REDFT00 transform, i.e. a DCT-I. cannam@127: (Logical @code{N=2*(n-1)}, inverse is @code{FFTW_REDFT00}.) cannam@127: @cindex discrete cosine transform cannam@127: @cindex DCT cannam@127: cannam@127: @item cannam@127: @ctindex FFTW_REDFT10 cannam@127: @code{FFTW_REDFT10} computes an REDFT10 transform, i.e. a DCT-II (sometimes called ``the'' DCT). cannam@127: (Logical @code{N=2*n}, inverse is @code{FFTW_REDFT01}.) cannam@127: cannam@127: @item cannam@127: @ctindex FFTW_REDFT01 cannam@127: @code{FFTW_REDFT01} computes an REDFT01 transform, i.e. a DCT-III (sometimes called ``the'' IDCT, being the inverse of DCT-II). cannam@127: (Logical @code{N=2*n}, inverse is @code{FFTW_REDFT=10}.) cannam@127: @cindex IDCT cannam@127: cannam@127: @item cannam@127: @ctindex FFTW_REDFT11 cannam@127: @code{FFTW_REDFT11} computes an REDFT11 transform, i.e. a DCT-IV. cannam@127: (Logical @code{N=2*n}, inverse is @code{FFTW_REDFT11}.) cannam@127: cannam@127: @item cannam@127: @ctindex FFTW_RODFT00 cannam@127: @code{FFTW_RODFT00} computes an RODFT00 transform, i.e. a DST-I. cannam@127: (Logical @code{N=2*(n+1)}, inverse is @code{FFTW_RODFT00}.) cannam@127: @cindex discrete sine transform cannam@127: @cindex DST cannam@127: cannam@127: @item cannam@127: @ctindex FFTW_RODFT10 cannam@127: @code{FFTW_RODFT10} computes an RODFT10 transform, i.e. a DST-II. cannam@127: (Logical @code{N=2*n}, inverse is @code{FFTW_RODFT01}.) cannam@127: cannam@127: @item cannam@127: @ctindex FFTW_RODFT01 cannam@127: @code{FFTW_RODFT01} computes an RODFT01 transform, i.e. a DST-III. cannam@127: (Logical @code{N=2*n}, inverse is @code{FFTW_RODFT=10}.) cannam@127: cannam@127: @item cannam@127: @ctindex FFTW_RODFT11 cannam@127: @code{FFTW_RODFT11} computes an RODFT11 transform, i.e. a DST-IV. cannam@127: (Logical @code{N=2*n}, inverse is @code{FFTW_RODFT11}.) cannam@127: cannam@127: @end itemize cannam@127: cannam@127: @c ------------------------------------------------------------ cannam@127: @node Advanced Interface, Guru Interface, Basic Interface, FFTW Reference cannam@127: @section Advanced Interface cannam@127: @cindex advanced interface cannam@127: cannam@127: FFTW's ``advanced'' interface supplements the basic interface with four cannam@127: new planner routines, providing a new level of flexibility: you can plan cannam@127: a transform of multiple arrays simultaneously, operate on non-contiguous cannam@127: (strided) data, and transform a subset of a larger multi-dimensional cannam@127: array. Other than these additional features, the planner operates in cannam@127: the same fashion as in the basic interface, and the resulting cannam@127: @code{fftw_plan} is used in the same way (@pxref{Using Plans}). cannam@127: cannam@127: @menu cannam@127: * Advanced Complex DFTs:: cannam@127: * Advanced Real-data DFTs:: cannam@127: * Advanced Real-to-real Transforms:: cannam@127: @end menu cannam@127: cannam@127: @c =========> cannam@127: @node Advanced Complex DFTs, Advanced Real-data DFTs, Advanced Interface, Advanced Interface cannam@127: @subsection Advanced Complex DFTs cannam@127: cannam@127: @example cannam@127: fftw_plan fftw_plan_many_dft(int rank, const int *n, int howmany, cannam@127: fftw_complex *in, const int *inembed, cannam@127: int istride, int idist, cannam@127: fftw_complex *out, const int *onembed, cannam@127: int ostride, int odist, cannam@127: int sign, unsigned flags); cannam@127: @end example cannam@127: @findex fftw_plan_many_dft cannam@127: cannam@127: This routine plans multiple multidimensional complex DFTs, and it cannam@127: extends the @code{fftw_plan_dft} routine (@pxref{Complex DFTs}) to cannam@127: compute @code{howmany} transforms, each having rank @code{rank} and size cannam@127: @code{n}. In addition, the transform data need not be contiguous, but cannam@127: it may be laid out in memory with an arbitrary stride. To account for cannam@127: these possibilities, @code{fftw_plan_many_dft} adds the new parameters cannam@127: @code{howmany}, @{@code{i},@code{o}@}@code{nembed}, cannam@127: @{@code{i},@code{o}@}@code{stride}, and cannam@127: @{@code{i},@code{o}@}@code{dist}. The FFTW basic interface cannam@127: (@pxref{Complex DFTs}) provides routines specialized for ranks 1, 2, cannam@127: and@tie{}3, but the advanced interface handles only the general-rank cannam@127: case. cannam@127: cannam@127: @code{howmany} is the number of transforms to compute. The resulting cannam@127: plan computes @code{howmany} transforms, where the input of the cannam@127: @code{k}-th transform is at location @code{in+k*idist} (in C pointer cannam@127: arithmetic), and its output is at location @code{out+k*odist}. Plans cannam@127: obtained in this way can often be faster than calling FFTW multiple cannam@127: times for the individual transforms. The basic @code{fftw_plan_dft} cannam@127: interface corresponds to @code{howmany=1} (in which case the @code{dist} cannam@127: parameters are ignored). cannam@127: @cindex howmany parameter cannam@127: @cindex dist cannam@127: cannam@127: cannam@127: Each of the @code{howmany} transforms has rank @code{rank} and size cannam@127: @code{n}, as in the basic interface. In addition, the advanced cannam@127: interface allows the input and output arrays of each transform to be cannam@127: row-major subarrays of larger rank-@code{rank} arrays, described by cannam@127: @code{inembed} and @code{onembed} parameters, respectively. cannam@127: @{@code{i},@code{o}@}@code{nembed} must be arrays of length @code{rank}, cannam@127: and @code{n} should be elementwise less than or equal to cannam@127: @{@code{i},@code{o}@}@code{nembed}. Passing @code{NULL} for an cannam@127: @code{nembed} parameter is equivalent to passing @code{n} (i.e. same cannam@127: physical and logical dimensions, as in the basic interface.) cannam@127: cannam@127: The @code{stride} parameters indicate that the @code{j}-th element of cannam@127: the input or output arrays is located at @code{j*istride} or cannam@127: @code{j*ostride}, respectively. (For a multi-dimensional array, cannam@127: @code{j} is the ordinary row-major index.) When combined with the cannam@127: @code{k}-th transform in a @code{howmany} loop, from above, this means cannam@127: that the (@code{j},@code{k})-th element is at @code{j*stride+k*dist}. cannam@127: (The basic @code{fftw_plan_dft} interface corresponds to a stride of 1.) cannam@127: @cindex stride cannam@127: cannam@127: cannam@127: For in-place transforms, the input and output @code{stride} and cannam@127: @code{dist} parameters should be the same; otherwise, the planner may cannam@127: return @code{NULL}. cannam@127: cannam@127: Arrays @code{n}, @code{inembed}, and @code{onembed} are not used after cannam@127: this function returns. You can safely free or reuse them. cannam@127: cannam@127: @strong{Examples}: cannam@127: One transform of one 5 by 6 array contiguous in memory: cannam@127: @example cannam@127: int rank = 2; cannam@127: int n[] = @{5, 6@}; cannam@127: int howmany = 1; cannam@127: int idist = odist = 0; /* unused because howmany = 1 */ cannam@127: int istride = ostride = 1; /* array is contiguous in memory */ cannam@127: int *inembed = n, *onembed = n; cannam@127: @end example cannam@127: cannam@127: Transform of three 5 by 6 arrays, each contiguous in memory, cannam@127: stored in memory one after another: cannam@127: @example cannam@127: int rank = 2; cannam@127: int n[] = @{5, 6@}; cannam@127: int howmany = 3; cannam@127: int idist = odist = n[0]*n[1]; /* = 30, the distance in memory cannam@127: between the first element cannam@127: of the first array and the cannam@127: first element of the second array */ cannam@127: int istride = ostride = 1; /* array is contiguous in memory */ cannam@127: int *inembed = n, *onembed = n; cannam@127: @end example cannam@127: cannam@127: Transform each column of a 2d array with 10 rows and 3 columns: cannam@127: @example cannam@127: int rank = 1; /* not 2: we are computing 1d transforms */ cannam@127: int n[] = @{10@}; /* 1d transforms of length 10 */ cannam@127: int howmany = 3; cannam@127: int idist = odist = 1; cannam@127: int istride = ostride = 3; /* distance between two elements in cannam@127: the same column */ cannam@127: int *inembed = n, *onembed = n; cannam@127: @end example cannam@127: cannam@127: @c =========> cannam@127: @node Advanced Real-data DFTs, Advanced Real-to-real Transforms, Advanced Complex DFTs, Advanced Interface cannam@127: @subsection Advanced Real-data DFTs cannam@127: cannam@127: @example cannam@127: fftw_plan fftw_plan_many_dft_r2c(int rank, const int *n, int howmany, cannam@127: double *in, const int *inembed, cannam@127: int istride, int idist, cannam@127: fftw_complex *out, const int *onembed, cannam@127: int ostride, int odist, cannam@127: unsigned flags); cannam@127: fftw_plan fftw_plan_many_dft_c2r(int rank, const int *n, int howmany, cannam@127: fftw_complex *in, const int *inembed, cannam@127: int istride, int idist, cannam@127: double *out, const int *onembed, cannam@127: int ostride, int odist, cannam@127: unsigned flags); cannam@127: @end example cannam@127: @findex fftw_plan_many_dft_r2c cannam@127: @findex fftw_plan_many_dft_c2r cannam@127: cannam@127: Like @code{fftw_plan_many_dft}, these two functions add @code{howmany}, cannam@127: @code{nembed}, @code{stride}, and @code{dist} parameters to the cannam@127: @code{fftw_plan_dft_r2c} and @code{fftw_plan_dft_c2r} functions, but cannam@127: otherwise behave the same as the basic interface. cannam@127: cannam@127: The interpretation of @code{howmany}, @code{stride}, and @code{dist} are cannam@127: the same as for @code{fftw_plan_many_dft}, above. Note that the cannam@127: @code{stride} and @code{dist} for the real array are in units of cannam@127: @code{double}, and for the complex array are in units of cannam@127: @code{fftw_complex}. cannam@127: cannam@127: If an @code{nembed} parameter is @code{NULL}, it is interpreted as what cannam@127: it would be in the basic interface, as described in @ref{Real-data DFT cannam@127: Array Format}. That is, for the complex array the size is assumed to be cannam@127: the same as @code{n}, but with the last dimension cut roughly in half. cannam@127: For the real array, the size is assumed to be @code{n} if the transform cannam@127: is out-of-place, or @code{n} with the last dimension ``padded'' if the cannam@127: transform is in-place. cannam@127: cannam@127: If an @code{nembed} parameter is non-@code{NULL}, it is interpreted as cannam@127: the physical size of the corresponding array, in row-major order, just cannam@127: as for @code{fftw_plan_many_dft}. In this case, each dimension of cannam@127: @code{nembed} should be @code{>=} what it would be in the basic cannam@127: interface (e.g. the halved or padded @code{n}). cannam@127: cannam@127: Arrays @code{n}, @code{inembed}, and @code{onembed} are not used after cannam@127: this function returns. You can safely free or reuse them. cannam@127: cannam@127: @c =========> cannam@127: @node Advanced Real-to-real Transforms, , Advanced Real-data DFTs, Advanced Interface cannam@127: @subsection Advanced Real-to-real Transforms cannam@127: cannam@127: @example cannam@127: fftw_plan fftw_plan_many_r2r(int rank, const int *n, int howmany, cannam@127: double *in, const int *inembed, cannam@127: int istride, int idist, cannam@127: double *out, const int *onembed, cannam@127: int ostride, int odist, cannam@127: const fftw_r2r_kind *kind, unsigned flags); cannam@127: @end example cannam@127: @findex fftw_plan_many_r2r cannam@127: cannam@127: Like @code{fftw_plan_many_dft}, this functions adds @code{howmany}, cannam@127: @code{nembed}, @code{stride}, and @code{dist} parameters to the cannam@127: @code{fftw_plan_r2r} function, but otherwise behave the same as the cannam@127: basic interface. The interpretation of those additional parameters are cannam@127: the same as for @code{fftw_plan_many_dft}. (Of course, the cannam@127: @code{stride} and @code{dist} parameters are now in units of cannam@127: @code{double}, not @code{fftw_complex}.) cannam@127: cannam@127: Arrays @code{n}, @code{inembed}, @code{onembed}, and @code{kind} are not cannam@127: used after this function returns. You can safely free or reuse them. cannam@127: cannam@127: @c ------------------------------------------------------------ cannam@127: @node Guru Interface, New-array Execute Functions, Advanced Interface, FFTW Reference cannam@127: @section Guru Interface cannam@127: @cindex guru interface cannam@127: cannam@127: The ``guru'' interface to FFTW is intended to expose as much as possible cannam@127: of the flexibility in the underlying FFTW architecture. It allows one cannam@127: to compute multi-dimensional ``vectors'' (loops) of multi-dimensional cannam@127: transforms, where each vector/transform dimension has an independent cannam@127: size and stride. cannam@127: @cindex vector cannam@127: One can also use more general complex-number formats, e.g. separate real cannam@127: and imaginary arrays. cannam@127: cannam@127: For those users who require the flexibility of the guru interface, it is cannam@127: important that they pay special attention to the documentation lest they cannam@127: shoot themselves in the foot. cannam@127: cannam@127: @menu cannam@127: * Interleaved and split arrays:: cannam@127: * Guru vector and transform sizes:: cannam@127: * Guru Complex DFTs:: cannam@127: * Guru Real-data DFTs:: cannam@127: * Guru Real-to-real Transforms:: cannam@127: * 64-bit Guru Interface:: cannam@127: @end menu cannam@127: cannam@127: @c =========> cannam@127: @node Interleaved and split arrays, Guru vector and transform sizes, Guru Interface, Guru Interface cannam@127: @subsection Interleaved and split arrays cannam@127: cannam@127: The guru interface supports two representations of complex numbers, cannam@127: which we call the interleaved and the split format. cannam@127: cannam@127: The @dfn{interleaved} format is the same one used by the basic and cannam@127: advanced interfaces, and it is documented in @ref{Complex numbers}. cannam@127: In the interleaved format, you provide pointers to the real part of a cannam@127: complex number, and the imaginary part understood to be stored in the cannam@127: next memory location. cannam@127: @cindex interleaved format cannam@127: cannam@127: cannam@127: The @dfn{split} format allows separate pointers to the real and cannam@127: imaginary parts of a complex array. cannam@127: @cindex split format cannam@127: cannam@127: cannam@127: Technically, the interleaved format is redundant, because you can cannam@127: always express an interleaved array in terms of a split array with cannam@127: appropriate pointers and strides. On the other hand, the interleaved cannam@127: format is simpler to use, and it is common in practice. Hence, FFTW cannam@127: supports it as a special case. cannam@127: cannam@127: @c =========> cannam@127: @node Guru vector and transform sizes, Guru Complex DFTs, Interleaved and split arrays, Guru Interface cannam@127: @subsection Guru vector and transform sizes cannam@127: cannam@127: The guru interface introduces one basic new data structure, cannam@127: @code{fftw_iodim}, that is used to specify sizes and strides for cannam@127: multi-dimensional transforms and vectors: cannam@127: cannam@127: @example cannam@127: typedef struct @{ cannam@127: int n; cannam@127: int is; cannam@127: int os; cannam@127: @} fftw_iodim; cannam@127: @end example cannam@127: @tindex fftw_iodim cannam@127: cannam@127: Here, @code{n} is the size of the dimension, and @code{is} and @code{os} cannam@127: are the strides of that dimension for the input and output arrays. (The cannam@127: stride is the separation of consecutive elements along this dimension.) cannam@127: cannam@127: The meaning of the stride parameter depends on the type of the array cannam@127: that the stride refers to. @emph{If the array is interleaved complex, cannam@127: strides are expressed in units of complex numbers cannam@127: (@code{fftw_complex}). If the array is split complex or real, strides cannam@127: are expressed in units of real numbers (@code{double}).} This cannam@127: convention is consistent with the usual pointer arithmetic in the C cannam@127: language. An interleaved array is denoted by a pointer @code{p} to cannam@127: @code{fftw_complex}, so that @code{p+1} points to the next complex cannam@127: number. Split arrays are denoted by pointers to @code{double}, in cannam@127: which case pointer arithmetic operates in units of cannam@127: @code{sizeof(double)}. cannam@127: @cindex stride cannam@127: cannam@127: cannam@127: The guru planner interfaces all take a (@code{rank}, @code{dims[rank]}) cannam@127: pair describing the transform size, and a (@code{howmany_rank}, cannam@127: @code{howmany_dims[howmany_rank]}) pair describing the ``vector'' size (a cannam@127: multi-dimensional loop of transforms to perform), where @code{dims} and cannam@127: @code{howmany_dims} are arrays of @code{fftw_iodim}. cannam@127: cannam@127: For example, the @code{howmany} parameter in the advanced complex-DFT cannam@127: interface corresponds to @code{howmany_rank} = 1, cannam@127: @code{howmany_dims[0].n} = @code{howmany}, @code{howmany_dims[0].is} = cannam@127: @code{idist}, and @code{howmany_dims[0].os} = @code{odist}. cannam@127: @cindex howmany loop cannam@127: @cindex dist cannam@127: (To compute a single transform, you can just use @code{howmany_rank} = 0.) cannam@127: cannam@127: cannam@127: A row-major multidimensional array with dimensions @code{n[rank]} cannam@127: (@pxref{Row-major Format}) corresponds to @code{dims[i].n} = cannam@127: @code{n[i]} and the recurrence @code{dims[i].is} = @code{n[i+1] * cannam@127: dims[i+1].is} (similarly for @code{os}). The stride of the last cannam@127: (@code{i=rank-1}) dimension is the overall stride of the array. cannam@127: e.g. to be equivalent to the advanced complex-DFT interface, you would cannam@127: have @code{dims[rank-1].is} = @code{istride} and cannam@127: @code{dims[rank-1].os} = @code{ostride}. cannam@127: @cindex row-major cannam@127: cannam@127: cannam@127: In general, we only guarantee FFTW to return a non-@code{NULL} plan if cannam@127: the vector and transform dimensions correspond to a set of distinct cannam@127: indices, and for in-place transforms the input/output strides should cannam@127: be the same. cannam@127: cannam@127: @c =========> cannam@127: @node Guru Complex DFTs, Guru Real-data DFTs, Guru vector and transform sizes, Guru Interface cannam@127: @subsection Guru Complex DFTs cannam@127: cannam@127: @example cannam@127: fftw_plan fftw_plan_guru_dft( cannam@127: int rank, const fftw_iodim *dims, cannam@127: int howmany_rank, const fftw_iodim *howmany_dims, cannam@127: fftw_complex *in, fftw_complex *out, cannam@127: int sign, unsigned flags); cannam@127: cannam@127: fftw_plan fftw_plan_guru_split_dft( cannam@127: int rank, const fftw_iodim *dims, cannam@127: int howmany_rank, const fftw_iodim *howmany_dims, cannam@127: double *ri, double *ii, double *ro, double *io, cannam@127: unsigned flags); cannam@127: @end example cannam@127: @findex fftw_plan_guru_dft cannam@127: @findex fftw_plan_guru_split_dft cannam@127: cannam@127: These two functions plan a complex-data, multi-dimensional DFT cannam@127: for the interleaved and split format, respectively. cannam@127: Transform dimensions are given by (@code{rank}, @code{dims}) over a cannam@127: multi-dimensional vector (loop) of dimensions (@code{howmany_rank}, cannam@127: @code{howmany_dims}). @code{dims} and @code{howmany_dims} should point cannam@127: to @code{fftw_iodim} arrays of length @code{rank} and cannam@127: @code{howmany_rank}, respectively. cannam@127: cannam@127: @cindex flags cannam@127: @code{flags} is a bitwise OR (@samp{|}) of zero or more planner flags, cannam@127: as defined in @ref{Planner Flags}. cannam@127: cannam@127: In the @code{fftw_plan_guru_dft} function, the pointers @code{in} and cannam@127: @code{out} point to the interleaved input and output arrays, cannam@127: respectively. The sign can be either @math{-1} (= cannam@127: @code{FFTW_FORWARD}) or @math{+1} (= @code{FFTW_BACKWARD}). If the cannam@127: pointers are equal, the transform is in-place. cannam@127: cannam@127: In the @code{fftw_plan_guru_split_dft} function, cannam@127: @code{ri} and @code{ii} point to the real and imaginary input arrays, cannam@127: and @code{ro} and @code{io} point to the real and imaginary output cannam@127: arrays. The input and output pointers may be the same, indicating an cannam@127: in-place transform. For example, for @code{fftw_complex} pointers cannam@127: @code{in} and @code{out}, the corresponding parameters are: cannam@127: cannam@127: @example cannam@127: ri = (double *) in; cannam@127: ii = (double *) in + 1; cannam@127: ro = (double *) out; cannam@127: io = (double *) out + 1; cannam@127: @end example cannam@127: cannam@127: Because @code{fftw_plan_guru_split_dft} accepts split arrays, strides cannam@127: are expressed in units of @code{double}. For a contiguous cannam@127: @code{fftw_complex} array, the overall stride of the transform should cannam@127: be 2, the distance between consecutive real parts or between cannam@127: consecutive imaginary parts; see @ref{Guru vector and transform cannam@127: sizes}. Note that the dimension strides are applied equally to the cannam@127: real and imaginary parts; real and imaginary arrays with different cannam@127: strides are not supported. cannam@127: cannam@127: There is no @code{sign} parameter in @code{fftw_plan_guru_split_dft}. cannam@127: This function always plans for an @code{FFTW_FORWARD} transform. To cannam@127: plan for an @code{FFTW_BACKWARD} transform, you can exploit the cannam@127: identity that the backwards DFT is equal to the forwards DFT with the cannam@127: real and imaginary parts swapped. For example, in the case of the cannam@127: @code{fftw_complex} arrays above, the @code{FFTW_BACKWARD} transform cannam@127: is computed by the parameters: cannam@127: cannam@127: @example cannam@127: ri = (double *) in + 1; cannam@127: ii = (double *) in; cannam@127: ro = (double *) out + 1; cannam@127: io = (double *) out; cannam@127: @end example cannam@127: cannam@127: @c =========> cannam@127: @node Guru Real-data DFTs, Guru Real-to-real Transforms, Guru Complex DFTs, Guru Interface cannam@127: @subsection Guru Real-data DFTs cannam@127: cannam@127: @example cannam@127: fftw_plan fftw_plan_guru_dft_r2c( cannam@127: int rank, const fftw_iodim *dims, cannam@127: int howmany_rank, const fftw_iodim *howmany_dims, cannam@127: double *in, fftw_complex *out, cannam@127: unsigned flags); cannam@127: cannam@127: fftw_plan fftw_plan_guru_split_dft_r2c( cannam@127: int rank, const fftw_iodim *dims, cannam@127: int howmany_rank, const fftw_iodim *howmany_dims, cannam@127: double *in, double *ro, double *io, cannam@127: unsigned flags); cannam@127: cannam@127: fftw_plan fftw_plan_guru_dft_c2r( cannam@127: int rank, const fftw_iodim *dims, cannam@127: int howmany_rank, const fftw_iodim *howmany_dims, cannam@127: fftw_complex *in, double *out, cannam@127: unsigned flags); cannam@127: cannam@127: fftw_plan fftw_plan_guru_split_dft_c2r( cannam@127: int rank, const fftw_iodim *dims, cannam@127: int howmany_rank, const fftw_iodim *howmany_dims, cannam@127: double *ri, double *ii, double *out, cannam@127: unsigned flags); cannam@127: @end example cannam@127: @findex fftw_plan_guru_dft_r2c cannam@127: @findex fftw_plan_guru_split_dft_r2c cannam@127: @findex fftw_plan_guru_dft_c2r cannam@127: @findex fftw_plan_guru_split_dft_c2r cannam@127: cannam@127: Plan a real-input (r2c) or real-output (c2r), multi-dimensional DFT with cannam@127: transform dimensions given by (@code{rank}, @code{dims}) over a cannam@127: multi-dimensional vector (loop) of dimensions (@code{howmany_rank}, cannam@127: @code{howmany_dims}). @code{dims} and @code{howmany_dims} should point cannam@127: to @code{fftw_iodim} arrays of length @code{rank} and cannam@127: @code{howmany_rank}, respectively. As for the basic and advanced cannam@127: interfaces, an r2c transform is @code{FFTW_FORWARD} and a c2r transform cannam@127: is @code{FFTW_BACKWARD}. cannam@127: cannam@127: The @emph{last} dimension of @code{dims} is interpreted specially: cannam@127: that dimension of the real array has size @code{dims[rank-1].n}, but cannam@127: that dimension of the complex array has size @code{dims[rank-1].n/2+1} cannam@127: (division rounded down). The strides, on the other hand, are taken to cannam@127: be exactly as specified. It is up to the user to specify the strides cannam@127: appropriately for the peculiar dimensions of the data, and we do not cannam@127: guarantee that the planner will succeed (return non-@code{NULL}) for cannam@127: any dimensions other than those described in @ref{Real-data DFT Array cannam@127: Format} and generalized in @ref{Advanced Real-data DFTs}. (That is, cannam@127: for an in-place transform, each individual dimension should be able to cannam@127: operate in place.) cannam@127: @cindex in-place cannam@127: cannam@127: cannam@127: @code{in} and @code{out} point to the input and output arrays for r2c cannam@127: and c2r transforms, respectively. For split arrays, @code{ri} and cannam@127: @code{ii} point to the real and imaginary input arrays for a c2r cannam@127: transform, and @code{ro} and @code{io} point to the real and imaginary cannam@127: output arrays for an r2c transform. @code{in} and @code{ro} or cannam@127: @code{ri} and @code{out} may be the same, indicating an in-place cannam@127: transform. (In-place transforms where @code{in} and @code{io} or cannam@127: @code{ii} and @code{out} are the same are not currently supported.) cannam@127: cannam@127: @cindex flags cannam@127: @code{flags} is a bitwise OR (@samp{|}) of zero or more planner flags, cannam@127: as defined in @ref{Planner Flags}. cannam@127: cannam@127: In-place transforms of rank greater than 1 are currently only cannam@127: supported for interleaved arrays. For split arrays, the planner will cannam@127: return @code{NULL}. cannam@127: @cindex in-place cannam@127: cannam@127: @c =========> cannam@127: @node Guru Real-to-real Transforms, 64-bit Guru Interface, Guru Real-data DFTs, Guru Interface cannam@127: @subsection Guru Real-to-real Transforms cannam@127: cannam@127: @example cannam@127: fftw_plan fftw_plan_guru_r2r(int rank, const fftw_iodim *dims, cannam@127: int howmany_rank, cannam@127: const fftw_iodim *howmany_dims, cannam@127: double *in, double *out, cannam@127: const fftw_r2r_kind *kind, cannam@127: unsigned flags); cannam@127: @end example cannam@127: @findex fftw_plan_guru_r2r cannam@127: cannam@127: Plan a real-to-real (r2r) multi-dimensional @code{FFTW_FORWARD} cannam@127: transform with transform dimensions given by (@code{rank}, @code{dims}) cannam@127: over a multi-dimensional vector (loop) of dimensions cannam@127: (@code{howmany_rank}, @code{howmany_dims}). @code{dims} and cannam@127: @code{howmany_dims} should point to @code{fftw_iodim} arrays of length cannam@127: @code{rank} and @code{howmany_rank}, respectively. cannam@127: cannam@127: The transform kind of each dimension is given by the @code{kind} cannam@127: parameter, which should point to an array of length @code{rank}. Valid cannam@127: @code{fftw_r2r_kind} constants are given in @ref{Real-to-Real Transform cannam@127: Kinds}. cannam@127: cannam@127: @code{in} and @code{out} point to the real input and output arrays; they cannam@127: may be the same, indicating an in-place transform. cannam@127: cannam@127: @cindex flags cannam@127: @code{flags} is a bitwise OR (@samp{|}) of zero or more planner flags, cannam@127: as defined in @ref{Planner Flags}. cannam@127: cannam@127: @c =========> cannam@127: @node 64-bit Guru Interface, , Guru Real-to-real Transforms, Guru Interface cannam@127: @subsection 64-bit Guru Interface cannam@127: @cindex 64-bit architecture cannam@127: cannam@127: When compiled in 64-bit mode on a 64-bit architecture (where addresses cannam@127: are 64 bits wide), FFTW uses 64-bit quantities internally for all cannam@127: transform sizes, strides, and so on---you don't have to do anything cannam@127: special to exploit this. However, in the ordinary FFTW interfaces, cannam@127: you specify the transform size by an @code{int} quantity, which is cannam@127: normally only 32 bits wide. This means that, even though FFTW is cannam@127: using 64-bit sizes internally, you cannot specify a single transform cannam@127: dimension larger than cannam@127: @ifinfo cannam@127: 2^31-1 cannam@127: @end ifinfo cannam@127: @html cannam@127: 231−1 cannam@127: @end html cannam@127: @tex cannam@127: $2^{31}-1$ cannam@127: @end tex cannam@127: numbers. cannam@127: cannam@127: We expect that few users will require transforms larger than this, but, cannam@127: for those who do, we provide a 64-bit version of the guru interface in cannam@127: which all sizes are specified as integers of type @code{ptrdiff_t} cannam@127: instead of @code{int}. (@code{ptrdiff_t} is a signed integer type cannam@127: defined by the C standard to be wide enough to represent address cannam@127: differences, and thus must be at least 64 bits wide on a 64-bit cannam@127: machine.) We stress that there is @emph{no performance advantage} to cannam@127: using this interface---the same internal FFTW code is employed cannam@127: regardless---and it is only necessary if you want to specify very cannam@127: large transform sizes. cannam@127: @tindex ptrdiff_t cannam@127: cannam@127: cannam@127: In particular, the 64-bit guru interface is a set of planner routines cannam@127: that are exactly the same as the guru planner routines, except that cannam@127: they are named with @samp{guru64} instead of @samp{guru} and they take cannam@127: arguments of type @code{fftw_iodim64} instead of @code{fftw_iodim}. cannam@127: For example, instead of @code{fftw_plan_guru_dft}, we have cannam@127: @code{fftw_plan_guru64_dft}. cannam@127: cannam@127: @example cannam@127: fftw_plan fftw_plan_guru64_dft( cannam@127: int rank, const fftw_iodim64 *dims, cannam@127: int howmany_rank, const fftw_iodim64 *howmany_dims, cannam@127: fftw_complex *in, fftw_complex *out, cannam@127: int sign, unsigned flags); cannam@127: @end example cannam@127: @findex fftw_plan_guru64_dft cannam@127: cannam@127: The @code{fftw_iodim64} type is similar to @code{fftw_iodim}, with the cannam@127: same interpretation, except that it uses type @code{ptrdiff_t} instead cannam@127: of type @code{int}. cannam@127: cannam@127: @example cannam@127: typedef struct @{ cannam@127: ptrdiff_t n; cannam@127: ptrdiff_t is; cannam@127: ptrdiff_t os; cannam@127: @} fftw_iodim64; cannam@127: @end example cannam@127: @tindex fftw_iodim64 cannam@127: cannam@127: Every other @samp{fftw_plan_guru} function also has a cannam@127: @samp{fftw_plan_guru64} equivalent, but we do not repeat their cannam@127: documentation here since they are identical to the 32-bit versions cannam@127: except as noted above. cannam@127: cannam@127: @c ----------------------------------------------------------- cannam@127: @node New-array Execute Functions, Wisdom, Guru Interface, FFTW Reference cannam@127: @section New-array Execute Functions cannam@127: @cindex execute cannam@127: @cindex new-array execution cannam@127: cannam@127: Normally, one executes a plan for the arrays with which the plan was cannam@127: created, by calling @code{fftw_execute(plan)} as described in @ref{Using cannam@127: Plans}. cannam@127: @findex fftw_execute cannam@127: However, it is possible for sophisticated users to apply a given plan cannam@127: to a @emph{different} array using the ``new-array execute'' functions cannam@127: detailed below, provided that the following conditions are met: cannam@127: cannam@127: @itemize @bullet cannam@127: cannam@127: @item cannam@127: The array size, strides, etcetera are the same (since those are set by cannam@127: the plan). cannam@127: cannam@127: @item cannam@127: The input and output arrays are the same (in-place) or different cannam@127: (out-of-place) if the plan was originally created to be in-place or cannam@127: out-of-place, respectively. cannam@127: cannam@127: @item cannam@127: For split arrays, the separations between the real and imaginary cannam@127: parts, @code{ii-ri} and @code{io-ro}, are the same as they were for cannam@127: the input and output arrays when the plan was created. (This cannam@127: condition is automatically satisfied for interleaved arrays.) cannam@127: cannam@127: @item cannam@127: The @dfn{alignment} of the new input/output arrays is the same as that cannam@127: of the input/output arrays when the plan was created, unless the plan cannam@127: was created with the @code{FFTW_UNALIGNED} flag. cannam@127: @ctindex FFTW_UNALIGNED cannam@127: Here, the alignment is a platform-dependent quantity (for example, it is cannam@127: the address modulo 16 if SSE SIMD instructions are used, but the address cannam@127: modulo 4 for non-SIMD single-precision FFTW on the same machine). In cannam@127: general, only arrays allocated with @code{fftw_malloc} are guaranteed to cannam@127: be equally aligned (@pxref{SIMD alignment and fftw_malloc}). cannam@127: cannam@127: @end itemize cannam@127: cannam@127: @cindex alignment cannam@127: The alignment issue is especially critical, because if you don't use cannam@127: @code{fftw_malloc} then you may have little control over the alignment cannam@127: of arrays in memory. For example, neither the C++ @code{new} function cannam@127: nor the Fortran @code{allocate} statement provide strong enough cannam@127: guarantees about data alignment. If you don't use @code{fftw_malloc}, cannam@127: therefore, you probably have to use @code{FFTW_UNALIGNED} (which cannam@127: disables most SIMD support). If possible, it is probably better for cannam@127: you to simply create multiple plans (creating a new plan is quick once cannam@127: one exists for a given size), or better yet re-use the same array for cannam@127: your transforms. cannam@127: cannam@127: @findex fftw_alignment_of cannam@127: For rare circumstances in which you cannot control the alignment of cannam@127: allocated memory, but wish to determine where a given array is cannam@127: aligned like the original array for which a plan was created, you can cannam@127: use the @code{fftw_alignment_of} function: cannam@127: @example cannam@127: int fftw_alignment_of(double *p); cannam@127: @end example cannam@127: Two arrays have equivalent alignment (for the purposes of applying a cannam@127: plan) if and only if @code{fftw_alignment_of} returns the same value cannam@127: for the corresponding pointers to their data (typecast to @code{double*} cannam@127: if necessary). cannam@127: cannam@127: If you are tempted to use the new-array execute interface because you cannam@127: want to transform a known bunch of arrays of the same size, you should cannam@127: probably go use the advanced interface instead (@pxref{Advanced cannam@127: Interface})). cannam@127: cannam@127: The new-array execute functions are: cannam@127: cannam@127: @example cannam@127: void fftw_execute_dft( cannam@127: const fftw_plan p, cannam@127: fftw_complex *in, fftw_complex *out); cannam@127: cannam@127: void fftw_execute_split_dft( cannam@127: const fftw_plan p, cannam@127: double *ri, double *ii, double *ro, double *io); cannam@127: cannam@127: void fftw_execute_dft_r2c( cannam@127: const fftw_plan p, cannam@127: double *in, fftw_complex *out); cannam@127: cannam@127: void fftw_execute_split_dft_r2c( cannam@127: const fftw_plan p, cannam@127: double *in, double *ro, double *io); cannam@127: cannam@127: void fftw_execute_dft_c2r( cannam@127: const fftw_plan p, cannam@127: fftw_complex *in, double *out); cannam@127: cannam@127: void fftw_execute_split_dft_c2r( cannam@127: const fftw_plan p, cannam@127: double *ri, double *ii, double *out); cannam@127: cannam@127: void fftw_execute_r2r( cannam@127: const fftw_plan p, cannam@127: double *in, double *out); cannam@127: @end example cannam@127: @findex fftw_execute_dft cannam@127: @findex fftw_execute_split_dft cannam@127: @findex fftw_execute_dft_r2c cannam@127: @findex fftw_execute_split_dft_r2c cannam@127: @findex fftw_execute_dft_c2r cannam@127: @findex fftw_execute_split_dft_c2r cannam@127: @findex fftw_execute_r2r cannam@127: cannam@127: These execute the @code{plan} to compute the corresponding transform on cannam@127: the input/output arrays specified by the subsequent arguments. The cannam@127: input/output array arguments have the same meanings as the ones passed cannam@127: to the guru planner routines in the preceding sections. The @code{plan} cannam@127: is not modified, and these routines can be called as many times as cannam@127: desired, or intermixed with calls to the ordinary @code{fftw_execute}. cannam@127: cannam@127: The @code{plan} @emph{must} have been created for the transform type cannam@127: corresponding to the execute function, e.g. it must be a complex-DFT cannam@127: plan for @code{fftw_execute_dft}. Any of the planner routines for that cannam@127: transform type, from the basic to the guru interface, could have been cannam@127: used to create the plan, however. cannam@127: cannam@127: @c ------------------------------------------------------------ cannam@127: @node Wisdom, What FFTW Really Computes, New-array Execute Functions, FFTW Reference cannam@127: @section Wisdom cannam@127: @cindex wisdom cannam@127: @cindex saving plans to disk cannam@127: cannam@127: This section documents the FFTW mechanism for saving and restoring cannam@127: plans from disk. This mechanism is called @dfn{wisdom}. cannam@127: cannam@127: @menu cannam@127: * Wisdom Export:: cannam@127: * Wisdom Import:: cannam@127: * Forgetting Wisdom:: cannam@127: * Wisdom Utilities:: cannam@127: @end menu cannam@127: cannam@127: @c =========> cannam@127: @node Wisdom Export, Wisdom Import, Wisdom, Wisdom cannam@127: @subsection Wisdom Export cannam@127: cannam@127: @example cannam@127: int fftw_export_wisdom_to_filename(const char *filename); cannam@127: void fftw_export_wisdom_to_file(FILE *output_file); cannam@127: char *fftw_export_wisdom_to_string(void); cannam@127: void fftw_export_wisdom(void (*write_char)(char c, void *), void *data); cannam@127: @end example cannam@127: @findex fftw_export_wisdom cannam@127: @findex fftw_export_wisdom_to_filename cannam@127: @findex fftw_export_wisdom_to_file cannam@127: @findex fftw_export_wisdom_to_string cannam@127: cannam@127: These functions allow you to export all currently accumulated wisdom cannam@127: in a form from which it can be later imported and restored, even cannam@127: during a separate run of the program. (@xref{Words of Wisdom-Saving cannam@127: Plans}.) The current store of wisdom is not affected by calling any cannam@127: of these routines. cannam@127: cannam@127: @code{fftw_export_wisdom} exports the wisdom to any output cannam@127: medium, as specified by the callback function cannam@127: @code{write_char}. @code{write_char} is a @code{putc}-like function that cannam@127: writes the character @code{c} to some output; its second parameter is cannam@127: the @code{data} pointer passed to @code{fftw_export_wisdom}. For cannam@127: convenience, the following three ``wrapper'' routines are provided: cannam@127: cannam@127: @code{fftw_export_wisdom_to_filename} writes wisdom to a file named cannam@127: @code{filename} (which is created or overwritten), returning @code{1} cannam@127: on success and @code{0} on failure. A lower-level function, which cannam@127: requires you to open and close the file yourself (e.g. if you want to cannam@127: write wisdom to a portion of a larger file) is cannam@127: @code{fftw_export_wisdom_to_file}. This writes the wisdom to the cannam@127: current position in @code{output_file}, which should be open with cannam@127: write permission; upon exit, the file remains open and is positioned cannam@127: at the end of the wisdom data. cannam@127: cannam@127: @code{fftw_export_wisdom_to_string} returns a pointer to a cannam@127: @code{NULL}-terminated string holding the wisdom data. This string is cannam@127: dynamically allocated, and it is the responsibility of the caller to cannam@127: deallocate it with @code{free} when it is no longer needed. cannam@127: cannam@127: All of these routines export the wisdom in the same format, which we cannam@127: will not document here except to say that it is LISP-like ASCII text cannam@127: that is insensitive to white space. cannam@127: cannam@127: @c =========> cannam@127: @node Wisdom Import, Forgetting Wisdom, Wisdom Export, Wisdom cannam@127: @subsection Wisdom Import cannam@127: cannam@127: @example cannam@127: int fftw_import_system_wisdom(void); cannam@127: int fftw_import_wisdom_from_filename(const char *filename); cannam@127: int fftw_import_wisdom_from_string(const char *input_string); cannam@127: int fftw_import_wisdom(int (*read_char)(void *), void *data); cannam@127: @end example cannam@127: @findex fftw_import_wisdom cannam@127: @findex fftw_import_system_wisdom cannam@127: @findex fftw_import_wisdom_from_filename cannam@127: @findex fftw_import_wisdom_from_file cannam@127: @findex fftw_import_wisdom_from_string cannam@127: cannam@127: These functions import wisdom into a program from data stored by the cannam@127: @code{fftw_export_wisdom} functions above. (@xref{Words of cannam@127: Wisdom-Saving Plans}.) The imported wisdom replaces any wisdom cannam@127: already accumulated by the running program. cannam@127: cannam@127: @code{fftw_import_wisdom} imports wisdom from any input medium, as cannam@127: specified by the callback function @code{read_char}. @code{read_char} is cannam@127: a @code{getc}-like function that returns the next character in the cannam@127: input; its parameter is the @code{data} pointer passed to cannam@127: @code{fftw_import_wisdom}. If the end of the input data is reached cannam@127: (which should never happen for valid data), @code{read_char} should cannam@127: return @code{EOF} (as defined in @code{}). For convenience, cannam@127: the following three ``wrapper'' routines are provided: cannam@127: cannam@127: @code{fftw_import_wisdom_from_filename} reads wisdom from a file named cannam@127: @code{filename}. A lower-level function, which requires you to open cannam@127: and close the file yourself (e.g. if you want to read wisdom from a cannam@127: portion of a larger file) is @code{fftw_import_wisdom_from_file}. This cannam@127: reads wisdom from the current position in @code{input_file} (which cannam@127: should be open with read permission); upon exit, the file remains cannam@127: open, but the position of the read pointer is unspecified. cannam@127: cannam@127: @code{fftw_import_wisdom_from_string} reads wisdom from the cannam@127: @code{NULL}-terminated string @code{input_string}. cannam@127: cannam@127: @code{fftw_import_system_wisdom} reads wisdom from an cannam@127: implementation-defined standard file (@code{/etc/fftw/wisdom} on Unix cannam@127: and GNU systems). cannam@127: @cindex wisdom, system-wide cannam@127: cannam@127: cannam@127: The return value of these import routines is @code{1} if the wisdom was cannam@127: read successfully and @code{0} otherwise. Note that, in all of these cannam@127: functions, any data in the input stream past the end of the wisdom data cannam@127: is simply ignored. cannam@127: cannam@127: @c =========> cannam@127: @node Forgetting Wisdom, Wisdom Utilities, Wisdom Import, Wisdom cannam@127: @subsection Forgetting Wisdom cannam@127: cannam@127: @example cannam@127: void fftw_forget_wisdom(void); cannam@127: @end example cannam@127: @findex fftw_forget_wisdom cannam@127: cannam@127: Calling @code{fftw_forget_wisdom} causes all accumulated @code{wisdom} cannam@127: to be discarded and its associated memory to be freed. (New cannam@127: @code{wisdom} can still be gathered subsequently, however.) cannam@127: cannam@127: @c =========> cannam@127: @node Wisdom Utilities, , Forgetting Wisdom, Wisdom cannam@127: @subsection Wisdom Utilities cannam@127: cannam@127: FFTW includes two standalone utility programs that deal with wisdom. We cannam@127: merely summarize them here, since they come with their own @code{man} cannam@127: pages for Unix and GNU systems (with HTML versions on our web site). cannam@127: cannam@127: The first program is @code{fftw-wisdom} (or @code{fftwf-wisdom} in cannam@127: single precision, etcetera), which can be used to create a wisdom file cannam@127: containing plans for any of the transform sizes and types supported by cannam@127: FFTW. It is preferable to create wisdom directly from your executable cannam@127: (@pxref{Caveats in Using Wisdom}), but this program is useful for cannam@127: creating global wisdom files for @code{fftw_import_system_wisdom}. cannam@127: @cindex fftw-wisdom utility cannam@127: cannam@127: cannam@127: The second program is @code{fftw-wisdom-to-conf}, which takes a wisdom cannam@127: file as input and produces a @dfn{configuration routine} as output. The cannam@127: latter is a C subroutine that you can compile and link into your cannam@127: program, replacing a routine of the same name in the FFTW library, that cannam@127: determines which parts of FFTW are callable by your program. cannam@127: @code{fftw-wisdom-to-conf} produces a configuration routine that links cannam@127: to only those parts of FFTW needed by the saved plans in the wisdom, cannam@127: greatly reducing the size of statically linked executables (which should cannam@127: only attempt to create plans corresponding to those in the wisdom, cannam@127: however). cannam@127: @cindex fftw-wisdom-to-conf utility cannam@127: @cindex configuration routines cannam@127: cannam@127: @c ------------------------------------------------------------ cannam@127: @node What FFTW Really Computes, , Wisdom, FFTW Reference cannam@127: @section What FFTW Really Computes cannam@127: cannam@127: In this section, we provide precise mathematical definitions for the cannam@127: transforms that FFTW computes. These transform definitions are fairly cannam@127: standard, but some authors follow slightly different conventions for the cannam@127: normalization of the transform (the constant factor in front) and the cannam@127: sign of the complex exponent. We begin by presenting the cannam@127: one-dimensional (1d) transform definitions, and then give the cannam@127: straightforward extension to multi-dimensional transforms. cannam@127: cannam@127: @menu cannam@127: * The 1d Discrete Fourier Transform (DFT):: cannam@127: * The 1d Real-data DFT:: cannam@127: * 1d Real-even DFTs (DCTs):: cannam@127: * 1d Real-odd DFTs (DSTs):: cannam@127: * 1d Discrete Hartley Transforms (DHTs):: cannam@127: * Multi-dimensional Transforms:: cannam@127: @end menu cannam@127: cannam@127: @c =========> cannam@127: @node The 1d Discrete Fourier Transform (DFT), The 1d Real-data DFT, What FFTW Really Computes, What FFTW Really Computes cannam@127: @subsection The 1d Discrete Fourier Transform (DFT) cannam@127: cannam@127: @cindex discrete Fourier transform cannam@127: @cindex DFT cannam@127: The forward (@code{FFTW_FORWARD}) discrete Fourier transform (DFT) of a cannam@127: 1d complex array @math{X} of size @math{n} computes an array @math{Y}, cannam@127: where: cannam@127: @tex cannam@127: $$ cannam@127: Y_k = \sum_{j = 0}^{n - 1} X_j e^{-2\pi j k \sqrt{-1}/n} \ . cannam@127: $$ cannam@127: @end tex cannam@127: @ifinfo cannam@127: @center Y[k] = sum for j = 0 to (n - 1) of X[j] * exp(-2 pi j k sqrt(-1)/n) . cannam@127: @end ifinfo cannam@127: @html cannam@127:
.
cannam@127: @end html cannam@127: The backward (@code{FFTW_BACKWARD}) DFT computes: cannam@127: @tex cannam@127: $$ cannam@127: Y_k = \sum_{j = 0}^{n - 1} X_j e^{2\pi j k \sqrt{-1}/n} \ . cannam@127: $$ cannam@127: @end tex cannam@127: @ifinfo cannam@127: @center Y[k] = sum for j = 0 to (n - 1) of X[j] * exp(2 pi j k sqrt(-1)/n) . cannam@127: @end ifinfo cannam@127: @html cannam@127:
.
cannam@127: @end html cannam@127: cannam@127: @cindex normalization cannam@127: FFTW computes an unnormalized transform, in that there is no coefficient cannam@127: in front of the summation in the DFT. In other words, applying the cannam@127: forward and then the backward transform will multiply the input by cannam@127: @math{n}. cannam@127: cannam@127: @cindex frequency cannam@127: From above, an @code{FFTW_FORWARD} transform corresponds to a sign of cannam@127: @math{-1} in the exponent of the DFT. Note also that we use the cannam@127: standard ``in-order'' output ordering---the @math{k}-th output cannam@127: corresponds to the frequency @math{k/n} (or @math{k/T}, where @math{T} cannam@127: is your total sampling period). For those who like to think in terms of cannam@127: positive and negative frequencies, this means that the positive cannam@127: frequencies are stored in the first half of the output and the negative cannam@127: frequencies are stored in backwards order in the second half of the cannam@127: output. (The frequency @math{-k/n} is the same as the frequency cannam@127: @math{(n-k)/n}.) cannam@127: cannam@127: @c =========> cannam@127: @node The 1d Real-data DFT, 1d Real-even DFTs (DCTs), The 1d Discrete Fourier Transform (DFT), What FFTW Really Computes cannam@127: @subsection The 1d Real-data DFT cannam@127: cannam@127: The real-input (r2c) DFT in FFTW computes the @emph{forward} transform cannam@127: @math{Y} of the size @code{n} real array @math{X}, exactly as defined cannam@127: above, i.e. cannam@127: @tex cannam@127: $$ cannam@127: Y_k = \sum_{j = 0}^{n - 1} X_j e^{-2\pi j k \sqrt{-1}/n} \ . cannam@127: $$ cannam@127: @end tex cannam@127: @ifinfo cannam@127: @center Y[k] = sum for j = 0 to (n - 1) of X[j] * exp(-2 pi j k sqrt(-1)/n) . cannam@127: @end ifinfo cannam@127: @html cannam@127:
.
cannam@127: @end html cannam@127: This output array @math{Y} can easily be shown to possess the cannam@127: ``Hermitian'' symmetry cannam@127: @cindex Hermitian cannam@127: @tex cannam@127: $Y_k = Y_{n-k}^*$, cannam@127: @end tex cannam@127: @ifinfo cannam@127: Y[k] = Y[n-k]*, cannam@127: @end ifinfo cannam@127: @html cannam@127: Yk = Yn-k*, cannam@127: @end html cannam@127: where we take @math{Y} to be periodic so that cannam@127: @tex cannam@127: $Y_n = Y_0$. cannam@127: @end tex cannam@127: @ifinfo cannam@127: Y[n] = Y[0]. cannam@127: @end ifinfo cannam@127: @html cannam@127: Yn = Y0. cannam@127: @end html cannam@127: cannam@127: As a result of this symmetry, half of the output @math{Y} is redundant cannam@127: (being the complex conjugate of the other half), and so the 1d r2c cannam@127: transforms only output elements @math{0}@dots{}@math{n/2} of @math{Y} cannam@127: (@math{n/2+1} complex numbers), where the division by @math{2} is cannam@127: rounded down. cannam@127: cannam@127: Moreover, the Hermitian symmetry implies that cannam@127: @tex cannam@127: $Y_0$ cannam@127: @end tex cannam@127: @ifinfo cannam@127: Y[0] cannam@127: @end ifinfo cannam@127: @html cannam@127: Y0 cannam@127: @end html cannam@127: and, if @math{n} is even, the cannam@127: @tex cannam@127: $Y_{n/2}$ cannam@127: @end tex cannam@127: @ifinfo cannam@127: Y[n/2] cannam@127: @end ifinfo cannam@127: @html cannam@127: Yn/2 cannam@127: @end html cannam@127: element, are purely real. So, for the @code{R2HC} r2r transform, the cannam@127: halfcomplex format does not store the imaginary parts of these elements. cannam@127: @cindex r2r cannam@127: @ctindex R2HC cannam@127: @cindex halfcomplex format cannam@127: cannam@127: cannam@127: The c2r and @code{H2RC} r2r transforms compute the backward DFT of the cannam@127: @emph{complex} array @math{X} with Hermitian symmetry, stored in the cannam@127: r2c/@code{R2HC} output formats, respectively, where the backward cannam@127: transform is defined exactly as for the complex case: cannam@127: @tex cannam@127: $$ cannam@127: Y_k = \sum_{j = 0}^{n - 1} X_j e^{2\pi j k \sqrt{-1}/n} \ . cannam@127: $$ cannam@127: @end tex cannam@127: @ifinfo cannam@127: @center Y[k] = sum for j = 0 to (n - 1) of X[j] * exp(2 pi j k sqrt(-1)/n) . cannam@127: @end ifinfo cannam@127: @html cannam@127:
.
cannam@127: @end html cannam@127: The outputs @code{Y} of this transform can easily be seen to be purely cannam@127: real, and are stored as an array of real numbers. cannam@127: cannam@127: @cindex normalization cannam@127: Like FFTW's complex DFT, these transforms are unnormalized. In other cannam@127: words, applying the real-to-complex (forward) and then the cannam@127: complex-to-real (backward) transform will multiply the input by cannam@127: @math{n}. cannam@127: cannam@127: @c =========> cannam@127: @node 1d Real-even DFTs (DCTs), 1d Real-odd DFTs (DSTs), The 1d Real-data DFT, What FFTW Really Computes cannam@127: @subsection 1d Real-even DFTs (DCTs) cannam@127: cannam@127: The Real-even symmetry DFTs in FFTW are exactly equivalent to the unnormalized cannam@127: forward (and backward) DFTs as defined above, where the input array cannam@127: @math{X} of length @math{N} is purely real and is also @dfn{even} symmetry. In cannam@127: this case, the output array is likewise real and even symmetry. cannam@127: @cindex real-even DFT cannam@127: @cindex REDFT cannam@127: cannam@127: cannam@127: @ctindex REDFT00 cannam@127: For the case of @code{REDFT00}, this even symmetry means that cannam@127: @tex cannam@127: $X_j = X_{N-j}$, cannam@127: @end tex cannam@127: @ifinfo cannam@127: X[j] = X[N-j], cannam@127: @end ifinfo cannam@127: @html cannam@127: Xj = XN-j, cannam@127: @end html cannam@127: where we take @math{X} to be periodic so that cannam@127: @tex cannam@127: $X_N = X_0$. cannam@127: @end tex cannam@127: @ifinfo cannam@127: X[N] = X[0]. cannam@127: @end ifinfo cannam@127: @html cannam@127: XN = X0. cannam@127: @end html cannam@127: Because of this redundancy, only the first @math{n} real numbers are cannam@127: actually stored, where @math{N = 2(n-1)}. cannam@127: cannam@127: The proper definition of even symmetry for @code{REDFT10}, cannam@127: @code{REDFT01}, and @code{REDFT11} transforms is somewhat more intricate cannam@127: because of the shifts by @math{1/2} of the input and/or output, although cannam@127: the corresponding boundary conditions are given in @ref{Real even/odd cannam@127: DFTs (cosine/sine transforms)}. Because of the even symmetry, however, cannam@127: the sine terms in the DFT all cancel and the remaining cosine terms are cannam@127: written explicitly below. This formulation often leads people to call cannam@127: such a transform a @dfn{discrete cosine transform} (DCT), although it is cannam@127: really just a special case of the DFT. cannam@127: @cindex discrete cosine transform cannam@127: @cindex DCT cannam@127: cannam@127: cannam@127: In each of the definitions below, we transform a real array @math{X} of cannam@127: length @math{n} to a real array @math{Y} of length @math{n}: cannam@127: cannam@127: @subsubheading REDFT00 (DCT-I) cannam@127: @ctindex REDFT00 cannam@127: An @code{REDFT00} transform (type-I DCT) in FFTW is defined by: cannam@127: @tex cannam@127: $$ cannam@127: Y_k = X_0 + (-1)^k X_{n-1} cannam@127: + 2 \sum_{j=1}^{n-2} X_j \cos [ \pi j k / (n-1)]. cannam@127: $$ cannam@127: @end tex cannam@127: @ifinfo cannam@127: Y[k] = X[0] + (-1)^k X[n-1] + 2 (sum for j = 1 to n-2 of X[j] cos(pi jk /(n-1))). cannam@127: @end ifinfo cannam@127: @html cannam@127:
.
cannam@127: @end html cannam@127: Note that this transform is not defined for @math{n=1}. For @math{n=2}, cannam@127: the summation term above is dropped as you might expect. cannam@127: cannam@127: @subsubheading REDFT10 (DCT-II) cannam@127: @ctindex REDFT10 cannam@127: An @code{REDFT10} transform (type-II DCT, sometimes called ``the'' DCT) in FFTW is defined by: cannam@127: @tex cannam@127: $$ cannam@127: Y_k = 2 \sum_{j=0}^{n-1} X_j \cos [\pi (j+1/2) k / n]. cannam@127: $$ cannam@127: @end tex cannam@127: @ifinfo cannam@127: Y[k] = 2 (sum for j = 0 to n-1 of X[j] cos(pi (j+1/2) k / n)). cannam@127: @end ifinfo cannam@127: @html cannam@127:
.
cannam@127: @end html cannam@127: cannam@127: @subsubheading REDFT01 (DCT-III) cannam@127: @ctindex REDFT01 cannam@127: An @code{REDFT01} transform (type-III DCT) in FFTW is defined by: cannam@127: @tex cannam@127: $$ cannam@127: Y_k = X_0 + 2 \sum_{j=1}^{n-1} X_j \cos [\pi j (k+1/2) / n]. cannam@127: $$ cannam@127: @end tex cannam@127: @ifinfo cannam@127: Y[k] = X[0] + 2 (sum for j = 1 to n-1 of X[j] cos(pi j (k+1/2) / n)). cannam@127: @end ifinfo cannam@127: @html cannam@127:
.
cannam@127: @end html cannam@127: In the case of @math{n=1}, this reduces to cannam@127: @tex cannam@127: $Y_0 = X_0$. cannam@127: @end tex cannam@127: @ifinfo cannam@127: Y[0] = X[0]. cannam@127: @end ifinfo cannam@127: @html cannam@127: Y0 = X0. cannam@127: @end html cannam@127: Up to a scale factor (see below), this is the inverse of @code{REDFT10} (``the'' DCT), and so the @code{REDFT01} (DCT-III) is sometimes called the ``IDCT''. cannam@127: @cindex IDCT cannam@127: cannam@127: @subsubheading REDFT11 (DCT-IV) cannam@127: @ctindex REDFT11 cannam@127: An @code{REDFT11} transform (type-IV DCT) in FFTW is defined by: cannam@127: @tex cannam@127: $$ cannam@127: Y_k = 2 \sum_{j=0}^{n-1} X_j \cos [\pi (j+1/2) (k+1/2) / n]. cannam@127: $$ cannam@127: @end tex cannam@127: @ifinfo cannam@127: Y[k] = 2 (sum for j = 0 to n-1 of X[j] cos(pi (j+1/2) (k+1/2) / n)). cannam@127: @end ifinfo cannam@127: @html cannam@127:
.
cannam@127: @end html cannam@127: cannam@127: @subsubheading Inverses and Normalization cannam@127: cannam@127: These definitions correspond directly to the unnormalized DFTs used cannam@127: elsewhere in FFTW (hence the factors of @math{2} in front of the cannam@127: summations). The unnormalized inverse of @code{REDFT00} is cannam@127: @code{REDFT00}, of @code{REDFT10} is @code{REDFT01} and vice versa, and cannam@127: of @code{REDFT11} is @code{REDFT11}. Each unnormalized inverse results cannam@127: in the original array multiplied by @math{N}, where @math{N} is the cannam@127: @emph{logical} DFT size. For @code{REDFT00}, @math{N=2(n-1)} (note that cannam@127: @math{n=1} is not defined); otherwise, @math{N=2n}. cannam@127: @cindex normalization cannam@127: cannam@127: cannam@127: In defining the discrete cosine transform, some authors also include cannam@127: additional factors of cannam@127: @ifinfo cannam@127: sqrt(2) cannam@127: @end ifinfo cannam@127: @html cannam@127: √2 cannam@127: @end html cannam@127: @tex cannam@127: $\sqrt{2}$ cannam@127: @end tex cannam@127: (or its inverse) multiplying selected inputs and/or outputs. This is a cannam@127: mostly cosmetic change that makes the transform orthogonal, but cannam@127: sacrifices the direct equivalence to a symmetric DFT. cannam@127: cannam@127: @c =========> cannam@127: @node 1d Real-odd DFTs (DSTs), 1d Discrete Hartley Transforms (DHTs), 1d Real-even DFTs (DCTs), What FFTW Really Computes cannam@127: @subsection 1d Real-odd DFTs (DSTs) cannam@127: cannam@127: The Real-odd symmetry DFTs in FFTW are exactly equivalent to the unnormalized cannam@127: forward (and backward) DFTs as defined above, where the input array cannam@127: @math{X} of length @math{N} is purely real and is also @dfn{odd} symmetry. In cannam@127: this case, the output is odd symmetry and purely imaginary. cannam@127: @cindex real-odd DFT cannam@127: @cindex RODFT cannam@127: cannam@127: cannam@127: @ctindex RODFT00 cannam@127: For the case of @code{RODFT00}, this odd symmetry means that cannam@127: @tex cannam@127: $X_j = -X_{N-j}$, cannam@127: @end tex cannam@127: @ifinfo cannam@127: X[j] = -X[N-j], cannam@127: @end ifinfo cannam@127: @html cannam@127: Xj = -XN-j, cannam@127: @end html cannam@127: where we take @math{X} to be periodic so that cannam@127: @tex cannam@127: $X_N = X_0$. cannam@127: @end tex cannam@127: @ifinfo cannam@127: X[N] = X[0]. cannam@127: @end ifinfo cannam@127: @html cannam@127: XN = X0. cannam@127: @end html cannam@127: Because of this redundancy, only the first @math{n} real numbers cannam@127: starting at @math{j=1} are actually stored (the @math{j=0} element is cannam@127: zero), where @math{N = 2(n+1)}. cannam@127: cannam@127: The proper definition of odd symmetry for @code{RODFT10}, cannam@127: @code{RODFT01}, and @code{RODFT11} transforms is somewhat more intricate cannam@127: because of the shifts by @math{1/2} of the input and/or output, although cannam@127: the corresponding boundary conditions are given in @ref{Real even/odd cannam@127: DFTs (cosine/sine transforms)}. Because of the odd symmetry, however, cannam@127: the cosine terms in the DFT all cancel and the remaining sine terms are cannam@127: written explicitly below. This formulation often leads people to call cannam@127: such a transform a @dfn{discrete sine transform} (DST), although it is cannam@127: really just a special case of the DFT. cannam@127: @cindex discrete sine transform cannam@127: @cindex DST cannam@127: cannam@127: cannam@127: In each of the definitions below, we transform a real array @math{X} of cannam@127: length @math{n} to a real array @math{Y} of length @math{n}: cannam@127: cannam@127: @subsubheading RODFT00 (DST-I) cannam@127: @ctindex RODFT00 cannam@127: An @code{RODFT00} transform (type-I DST) in FFTW is defined by: cannam@127: @tex cannam@127: $$ cannam@127: Y_k = 2 \sum_{j=0}^{n-1} X_j \sin [ \pi (j+1) (k+1) / (n+1)]. cannam@127: $$ cannam@127: @end tex cannam@127: @ifinfo cannam@127: Y[k] = 2 (sum for j = 0 to n-1 of X[j] sin(pi (j+1)(k+1) / (n+1))). cannam@127: @end ifinfo cannam@127: @html cannam@127:
.
cannam@127: @end html cannam@127: cannam@127: @subsubheading RODFT10 (DST-II) cannam@127: @ctindex RODFT10 cannam@127: An @code{RODFT10} transform (type-II DST) in FFTW is defined by: cannam@127: @tex cannam@127: $$ cannam@127: Y_k = 2 \sum_{j=0}^{n-1} X_j \sin [\pi (j+1/2) (k+1) / n]. cannam@127: $$ cannam@127: @end tex cannam@127: @ifinfo cannam@127: Y[k] = 2 (sum for j = 0 to n-1 of X[j] sin(pi (j+1/2) (k+1) / n)). cannam@127: @end ifinfo cannam@127: @html cannam@127:
.
cannam@127: @end html cannam@127: cannam@127: @subsubheading RODFT01 (DST-III) cannam@127: @ctindex RODFT01 cannam@127: An @code{RODFT01} transform (type-III DST) in FFTW is defined by: cannam@127: @tex cannam@127: $$ cannam@127: Y_k = (-1)^k X_{n-1} + 2 \sum_{j=0}^{n-2} X_j \sin [\pi (j+1) (k+1/2) / n]. cannam@127: $$ cannam@127: @end tex cannam@127: @ifinfo cannam@127: Y[k] = (-1)^k X[n-1] + 2 (sum for j = 0 to n-2 of X[j] sin(pi (j+1) (k+1/2) / n)). cannam@127: @end ifinfo cannam@127: @html cannam@127:
.
cannam@127: @end html cannam@127: In the case of @math{n=1}, this reduces to cannam@127: @tex cannam@127: $Y_0 = X_0$. cannam@127: @end tex cannam@127: @ifinfo cannam@127: Y[0] = X[0]. cannam@127: @end ifinfo cannam@127: @html cannam@127: Y0 = X0. cannam@127: @end html cannam@127: cannam@127: @subsubheading RODFT11 (DST-IV) cannam@127: @ctindex RODFT11 cannam@127: An @code{RODFT11} transform (type-IV DST) in FFTW is defined by: cannam@127: @tex cannam@127: $$ cannam@127: Y_k = 2 \sum_{j=0}^{n-1} X_j \sin [\pi (j+1/2) (k+1/2) / n]. cannam@127: $$ cannam@127: @end tex cannam@127: @ifinfo cannam@127: Y[k] = 2 (sum for j = 0 to n-1 of X[j] sin(pi (j+1/2) (k+1/2) / n)). cannam@127: @end ifinfo cannam@127: @html cannam@127:
.
cannam@127: @end html cannam@127: cannam@127: @subsubheading Inverses and Normalization cannam@127: cannam@127: These definitions correspond directly to the unnormalized DFTs used cannam@127: elsewhere in FFTW (hence the factors of @math{2} in front of the cannam@127: summations). The unnormalized inverse of @code{RODFT00} is cannam@127: @code{RODFT00}, of @code{RODFT10} is @code{RODFT01} and vice versa, and cannam@127: of @code{RODFT11} is @code{RODFT11}. Each unnormalized inverse results cannam@127: in the original array multiplied by @math{N}, where @math{N} is the cannam@127: @emph{logical} DFT size. For @code{RODFT00}, @math{N=2(n+1)}; cannam@127: otherwise, @math{N=2n}. cannam@127: @cindex normalization cannam@127: cannam@127: cannam@127: In defining the discrete sine transform, some authors also include cannam@127: additional factors of cannam@127: @ifinfo cannam@127: sqrt(2) cannam@127: @end ifinfo cannam@127: @html cannam@127: √2 cannam@127: @end html cannam@127: @tex cannam@127: $\sqrt{2}$ cannam@127: @end tex cannam@127: (or its inverse) multiplying selected inputs and/or outputs. This is a cannam@127: mostly cosmetic change that makes the transform orthogonal, but cannam@127: sacrifices the direct equivalence to an antisymmetric DFT. cannam@127: cannam@127: @c =========> cannam@127: @node 1d Discrete Hartley Transforms (DHTs), Multi-dimensional Transforms, 1d Real-odd DFTs (DSTs), What FFTW Really Computes cannam@127: @subsection 1d Discrete Hartley Transforms (DHTs) cannam@127: cannam@127: @cindex discrete Hartley transform cannam@127: @cindex DHT cannam@127: The discrete Hartley transform (DHT) of a 1d real array @math{X} of size cannam@127: @math{n} computes a real array @math{Y} of the same size, where: cannam@127: @tex cannam@127: $$ cannam@127: Y_k = \sum_{j = 0}^{n - 1} X_j [ \cos(2\pi j k / n) + \sin(2\pi j k / n)]. cannam@127: $$ cannam@127: @end tex cannam@127: @ifinfo cannam@127: @center Y[k] = sum for j = 0 to (n - 1) of X[j] * [cos(2 pi j k / n) + sin(2 pi j k / n)]. cannam@127: @end ifinfo cannam@127: @html cannam@127:
.
cannam@127: @end html cannam@127: cannam@127: @cindex normalization cannam@127: FFTW computes an unnormalized transform, in that there is no coefficient cannam@127: in front of the summation in the DHT. In other words, applying the cannam@127: transform twice (the DHT is its own inverse) will multiply the input by cannam@127: @math{n}. cannam@127: cannam@127: @c =========> cannam@127: @node Multi-dimensional Transforms, , 1d Discrete Hartley Transforms (DHTs), What FFTW Really Computes cannam@127: @subsection Multi-dimensional Transforms cannam@127: cannam@127: The multi-dimensional transforms of FFTW, in general, compute simply the cannam@127: separable product of the given 1d transform along each dimension of the cannam@127: array. Since each of these transforms is unnormalized, computing the cannam@127: forward followed by the backward/inverse multi-dimensional transform cannam@127: will result in the original array scaled by the product of the cannam@127: normalization factors for each dimension (e.g. the product of the cannam@127: dimension sizes, for a multi-dimensional DFT). cannam@127: cannam@127: @tex cannam@127: As an explicit example, consider the following exact mathematical cannam@127: definition of our multi-dimensional DFT. Let $X$ be a $d$-dimensional cannam@127: complex array whose elements are $X[j_1, j_2, \ldots, j_d]$, where $0 cannam@127: \leq j_s < n_s$ for all~$s \in \{ 1, 2, \ldots, d \}$. Let also cannam@127: $\omega_s = e^{2\pi \sqrt{-1}/n_s}$, for all ~$s \in \{ 1, 2, \ldots, d cannam@127: \}$. cannam@127: cannam@127: The forward transform computes a complex array~$Y$, whose cannam@127: structure is the same as that of~$X$, defined by cannam@127: cannam@127: $$ cannam@127: Y[k_1, k_2, \ldots, k_d] = cannam@127: \sum_{j_1 = 0}^{n_1 - 1} cannam@127: \sum_{j_2 = 0}^{n_2 - 1} cannam@127: \cdots cannam@127: \sum_{j_d = 0}^{n_d - 1} cannam@127: X[j_1, j_2, \ldots, j_d] cannam@127: \omega_1^{-j_1 k_1} cannam@127: \omega_2^{-j_2 k_2} cannam@127: \cdots cannam@127: \omega_d^{-j_d k_d} \ . cannam@127: $$ cannam@127: cannam@127: The backward transform computes cannam@127: $$ cannam@127: Y[k_1, k_2, \ldots, k_d] = cannam@127: \sum_{j_1 = 0}^{n_1 - 1} cannam@127: \sum_{j_2 = 0}^{n_2 - 1} cannam@127: \cdots cannam@127: \sum_{j_d = 0}^{n_d - 1} cannam@127: X[j_1, j_2, \ldots, j_d] cannam@127: \omega_1^{j_1 k_1} cannam@127: \omega_2^{j_2 k_2} cannam@127: \cdots cannam@127: \omega_d^{j_d k_d} \ . cannam@127: $$ cannam@127: cannam@127: Computing the forward transform followed by the backward transform cannam@127: will multiply the array by $\prod_{s=1}^{d} n_d$. cannam@127: @end tex cannam@127: cannam@127: @cindex r2c cannam@127: The definition of FFTW's multi-dimensional DFT of real data (r2c) cannam@127: deserves special attention. In this case, we logically compute the full cannam@127: multi-dimensional DFT of the input data; since the input data are purely cannam@127: real, the output data have the Hermitian symmetry and therefore only one cannam@127: non-redundant half need be stored. More specifically, for an @ndims multi-dimensional real-input DFT, the full (logical) complex output array cannam@127: @tex cannam@127: $Y[k_0, k_1, \ldots, k_{d-1}]$ cannam@127: @end tex cannam@127: @html cannam@127: Y[k0, k1, ..., cannam@127: kd-1] cannam@127: @end html cannam@127: @ifinfo cannam@127: Y[k[0], k[1], ..., k[d-1]] cannam@127: @end ifinfo cannam@127: has the symmetry: cannam@127: @tex cannam@127: $$ cannam@127: Y[k_0, k_1, \ldots, k_{d-1}] = Y[n_0 - k_0, n_1 - k_1, \ldots, n_{d-1} - k_{d-1}]^* cannam@127: $$ cannam@127: @end tex cannam@127: @html cannam@127: Y[k0, k1, ..., cannam@127: kd-1] = Y[n0 - cannam@127: k0, n1 - k1, ..., cannam@127: nd-1 - kd-1]* cannam@127: @end html cannam@127: @ifinfo cannam@127: Y[k[0], k[1], ..., k[d-1]] = Y[n[0] - k[0], n[1] - k[1], ..., n[d-1] - k[d-1]]* cannam@127: @end ifinfo cannam@127: (where each dimension is periodic). Because of this symmetry, we only cannam@127: store the cannam@127: @tex cannam@127: $k_{d-1} = 0 \cdots n_{d-1}/2$ cannam@127: @end tex cannam@127: @html cannam@127: kd-1 = 0...nd-1/2+1 cannam@127: @end html cannam@127: @ifinfo cannam@127: k[d-1] = 0...n[d-1]/2 cannam@127: @end ifinfo cannam@127: elements of the @emph{last} dimension (division by @math{2} is rounded cannam@127: down). (We could instead have cut any other dimension in half, but the cannam@127: last dimension proved computationally convenient.) This results in the cannam@127: peculiar array format described in more detail by @ref{Real-data DFT cannam@127: Array Format}. cannam@127: cannam@127: The multi-dimensional c2r transform is simply the unnormalized inverse cannam@127: of the r2c transform. i.e. it is the same as FFTW's complex backward cannam@127: multi-dimensional DFT, operating on a Hermitian input array in the cannam@127: peculiar format mentioned above and outputting a real array (since the cannam@127: DFT output is purely real). cannam@127: cannam@127: We should remind the user that the separable product of 1d transforms cannam@127: along each dimension, as computed by FFTW, is not always the same thing cannam@127: as the usual multi-dimensional transform. A multi-dimensional cannam@127: @code{R2HC} (or @code{HC2R}) transform is not identical to the cannam@127: multi-dimensional DFT, requiring some post-processing to combine the cannam@127: requisite real and imaginary parts, as was described in @ref{The cannam@127: Halfcomplex-format DFT}. Likewise, FFTW's multidimensional cannam@127: @code{FFTW_DHT} r2r transform is not the same thing as the logical cannam@127: multi-dimensional discrete Hartley transform defined in the literature, cannam@127: as discussed in @ref{The Discrete Hartley Transform}. cannam@127: