js-dsp-test: fft/fftw/fftw-3.3.4/doc/reference.texi annotate

annotate fft/fftw/fftw-3.3.4/doc/reference.texi @ 40:223f770b5341 kissfft-double tip

Try a double-precision kissfft

author	Chris Cannam
date	Wed, 07 Sep 2016 10:40:32 +0100
parents	26056e866c29
children

rev	line source
Chris@19	1 @node FFTW Reference, Multi-threaded FFTW, Other Important Topics, Top
Chris@19	2 @chapter FFTW Reference
Chris@19	3
Chris@19	4 This chapter provides a complete reference for all sequential (i.e.,
Chris@19	5 one-processor) FFTW functions. Parallel transforms are described in
Chris@19	6 later chapters.
Chris@19	7
Chris@19	8 @menu
Chris@19	9 * Data Types and Files::
Chris@19	10 * Using Plans::
Chris@19	11 * Basic Interface::
Chris@19	12 * Advanced Interface::
Chris@19	13 * Guru Interface::
Chris@19	14 * New-array Execute Functions::
Chris@19	15 * Wisdom::
Chris@19	16 * What FFTW Really Computes::
Chris@19	17 @end menu
Chris@19	18
Chris@19	19 @c ------------------------------------------------------------
Chris@19	20 @node Data Types and Files, Using Plans, FFTW Reference, FFTW Reference
Chris@19	21 @section Data Types and Files
Chris@19	22
Chris@19	23 All programs using FFTW should include its header file:
Chris@19	24
Chris@19	25 @example
Chris@19	26 #include <fftw3.h>
Chris@19	27 @end example
Chris@19	28
Chris@19	29 You must also link to the FFTW library. On Unix, this
Chris@19	30 means adding @code{-lfftw3 -lm} at the @emph{end} of the link command.
Chris@19	31
Chris@19	32 @menu
Chris@19	33 * Complex numbers::
Chris@19	34 * Precision::
Chris@19	35 * Memory Allocation::
Chris@19	36 @end menu
Chris@19	37
Chris@19	38 @c =========>
Chris@19	39 @node Complex numbers, Precision, Data Types and Files, Data Types and Files
Chris@19	40 @subsection Complex numbers
Chris@19	41
Chris@19	42 The default FFTW interface uses @code{double} precision for all
Chris@19	43 floating-point numbers, and defines a @code{fftw_complex} type to hold
Chris@19	44 complex numbers as:
Chris@19	45
Chris@19	46 @example
Chris@19	47 typedef double fftw_complex[2];
Chris@19	48 @end example
Chris@19	49 @tindex fftw_complex
Chris@19	50
Chris@19	51 Here, the @code{[0]} element holds the real part and the @code{[1]}
Chris@19	52 element holds the imaginary part.
Chris@19	53
Chris@19	54 Alternatively, if you have a C compiler (such as @code{gcc}) that
Chris@19	55 supports the C99 revision of the ANSI C standard, you can use C's new
Chris@19	56 native complex type (which is binary-compatible with the typedef above).
Chris@19	57 In particular, if you @code{#include <complex.h>} @emph{before}
Chris@19	58 @code{<fftw3.h>}, then @code{fftw_complex} is defined to be the native
Chris@19	59 complex type and you can manipulate it with ordinary arithmetic
Chris@19	60 (e.g. @code{x = y * (3+4*I)}, where @code{x} and @code{y} are
Chris@19	61 @code{fftw_complex} and @code{I} is the standard symbol for the
Chris@19	62 imaginary unit);
Chris@19	63 @cindex C99
Chris@19	64
Chris@19	65
Chris@19	66 C++ has its own @code{complex<T>} template class, defined in the
Chris@19	67 standard @code{<complex>} header file. Reportedly, the C++ standards
Chris@19	68 committee has recently agreed to mandate that the storage format used
Chris@19	69 for this type be binary-compatible with the C99 type, i.e. an array
Chris@19	70 @code{T[2]} with consecutive real @code{[0]} and imaginary @code{[1]}
Chris@19	71 parts. (See report
Chris@19	72 @uref{http://www.open-std.org/jtc1/sc22/WG21/docs/papers/2002/n1388.pdf
Chris@19	73 WG21/N1388}.) Although not part of the official standard as of this
Chris@19	74 writing, the proposal stated that: ``This solution has been tested with
Chris@19	75 all current major implementations of the standard library and shown to
Chris@19	76 be working.'' To the extent that this is true, if you have a variable
Chris@19	77 @code{complex<double> *x}, you can pass it directly to FFTW via
Chris@19	78 @code{reinterpret_cast<fftw_complex*>(x)}.
Chris@19	79 @cindex C++
Chris@19	80 @cindex portability
Chris@19	81
Chris@19	82 @c =========>
Chris@19	83 @node Precision, Memory Allocation, Complex numbers, Data Types and Files
Chris@19	84 @subsection Precision
Chris@19	85 @cindex precision
Chris@19	86
Chris@19	87 You can install single and long-double precision versions of FFTW,
Chris@19	88 which replace @code{double} with @code{float} and @code{long double},
Chris@19	89 respectively (@pxref{Installation and Customization}). To use these
Chris@19	90 interfaces, you:
Chris@19	91
Chris@19	92 @itemize @bullet
Chris@19	93
Chris@19	94 @item
Chris@19	95 Link to the single/long-double libraries; on Unix, @code{-lfftw3f} or
Chris@19	96 @code{-lfftw3l} instead of (or in addition to) @code{-lfftw3}. (You
Chris@19	97 can link to the different-precision libraries simultaneously.)
Chris@19	98
Chris@19	99 @item
Chris@19	100 Include the @emph{same} @code{<fftw3.h>} header file.
Chris@19	101
Chris@19	102 @item
Chris@19	103 Replace all lowercase instances of @samp{fftw_} with @samp{fftwf_} or
Chris@19	104 @samp{fftwl_} for single or long-double precision, respectively.
Chris@19	105 (@code{fftw_complex} becomes @code{fftwf_complex}, @code{fftw_execute}
Chris@19	106 becomes @code{fftwf_execute}, etcetera.)
Chris@19	107
Chris@19	108 @item
Chris@19	109 Uppercase names, i.e. names beginning with @samp{FFTW_}, remain the
Chris@19	110 same.
Chris@19	111
Chris@19	112 @item
Chris@19	113 Replace @code{double} with @code{float} or @code{long double} for
Chris@19	114 subroutine parameters.
Chris@19	115
Chris@19	116 @end itemize
Chris@19	117
Chris@19	118 Depending upon your compiler and/or hardware, @code{long double} may not
Chris@19	119 be any more precise than @code{double} (or may not be supported at all,
Chris@19	120 although it is standard in C99).
Chris@19	121 @cindex C99
Chris@19	122
Chris@19	123
Chris@19	124 We also support using the nonstandard @code{__float128}
Chris@19	125 quadruple-precision type provided by recent versions of @code{gcc} on
Chris@19	126 32- and 64-bit x86 hardware (@pxref{Installation and Customization}).
Chris@19	127 To use this type, link with @code{-lfftw3q -lquadmath -lm} (the
Chris@19	128 @code{libquadmath} library provided by @code{gcc} is needed for
Chris@19	129 quadruple-precision trigonometric functions) and use @samp{fftwq_}
Chris@19	130 identifiers.
Chris@19	131
Chris@19	132 @c =========>
Chris@19	133 @node Memory Allocation, , Precision, Data Types and Files
Chris@19	134 @subsection Memory Allocation
Chris@19	135
Chris@19	136 @example
Chris@19	137 void *fftw_malloc(size_t n);
Chris@19	138 void fftw_free(void *p);
Chris@19	139 @end example
Chris@19	140 @findex fftw_malloc
Chris@19	141 @findex fftw_free
Chris@19	142
Chris@19	143 These are functions that behave identically to @code{malloc} and
Chris@19	144 @code{free}, except that they guarantee that the returned pointer obeys
Chris@19	145 any special alignment restrictions imposed by any algorithm in FFTW
Chris@19	146 (e.g. for SIMD acceleration). @xref{SIMD alignment and fftw_malloc}.
Chris@19	147 @cindex alignment
Chris@19	148
Chris@19	149
Chris@19	150 Data allocated by @code{fftw_malloc} @emph{must} be deallocated by
Chris@19	151 @code{fftw_free} and not by the ordinary @code{free}.
Chris@19	152
Chris@19	153 These routines simply call through to your operating system's
Chris@19	154 @code{malloc} or, if necessary, its aligned equivalent
Chris@19	155 (e.g. @code{memalign}), so you normally need not worry about any
Chris@19	156 significant time or space overhead. You are @emph{not required} to use
Chris@19	157 them to allocate your data, but we strongly recommend it.
Chris@19	158
Chris@19	159 Note: in C++, just as with ordinary @code{malloc}, you must typecast
Chris@19	160 the output of @code{fftw_malloc} to whatever pointer type you are
Chris@19	161 allocating.
Chris@19	162 @cindex C++
Chris@19	163
Chris@19	164
Chris@19	165 We also provide the following two convenience functions to allocate
Chris@19	166 real and complex arrays with @code{n} elements, which are equivalent
Chris@19	167 to @code{(double ) fftw_malloc(sizeof(double) n)} and
Chris@19	168 @code{(fftw_complex ) fftw_malloc(sizeof(fftw_complex) n)},
Chris@19	169 respectively:
Chris@19	170
Chris@19	171 @example
Chris@19	172 double *fftw_alloc_real(size_t n);
Chris@19	173 fftw_complex *fftw_alloc_complex(size_t n);
Chris@19	174 @end example
Chris@19	175 @findex fftw_alloc_real
Chris@19	176 @findex fftw_alloc_complex
Chris@19	177
Chris@19	178 The equivalent functions in other precisions allocate arrays of @code{n}
Chris@19	179 elements in that precision. e.g. @code{fftwf_alloc_real(n)} is
Chris@19	180 equivalent to @code{(float ) fftwf_malloc(sizeof(float) n)}.
Chris@19	181 @cindex precision
Chris@19	182
Chris@19	183 @c ------------------------------------------------------------
Chris@19	184 @node Using Plans, Basic Interface, Data Types and Files, FFTW Reference
Chris@19	185 @section Using Plans
Chris@19	186
Chris@19	187 Plans for all transform types in FFTW are stored as type
Chris@19	188 @code{fftw_plan} (an opaque pointer type), and are created by one of the
Chris@19	189 various planning routines described in the following sections.
Chris@19	190 @tindex fftw_plan
Chris@19	191 An @code{fftw_plan} contains all information necessary to compute the
Chris@19	192 transform, including the pointers to the input and output arrays.
Chris@19	193
Chris@19	194 @example
Chris@19	195 void fftw_execute(const fftw_plan plan);
Chris@19	196 @end example
Chris@19	197 @findex fftw_execute
Chris@19	198
Chris@19	199 This executes the @code{plan}, to compute the corresponding transform on
Chris@19	200 the arrays for which it was planned (which must still exist). The plan
Chris@19	201 is not modified, and @code{fftw_execute} can be called as many times as
Chris@19	202 desired.
Chris@19	203
Chris@19	204 To apply a given plan to a different array, you can use the new-array execute
Chris@19	205 interface. @xref{New-array Execute Functions}.
Chris@19	206
Chris@19	207 @code{fftw_execute} (and equivalents) is the only function in FFTW
Chris@19	208 guaranteed to be thread-safe; see @ref{Thread safety}.
Chris@19	209
Chris@19	210 This function:
Chris@19	211 @example
Chris@19	212 void fftw_destroy_plan(fftw_plan plan);
Chris@19	213 @end example
Chris@19	214 @findex fftw_destroy_plan
Chris@19	215 deallocates the @code{plan} and all its associated data.
Chris@19	216
Chris@19	217 FFTW's planner saves some other persistent data, such as the
Chris@19	218 accumulated wisdom and a list of algorithms available in the current
Chris@19	219 configuration. If you want to deallocate all of that and reset FFTW
Chris@19	220 to the pristine state it was in when you started your program, you can
Chris@19	221 call:
Chris@19	222
Chris@19	223 @example
Chris@19	224 void fftw_cleanup(void);
Chris@19	225 @end example
Chris@19	226 @findex fftw_cleanup
Chris@19	227
Chris@19	228 After calling @code{fftw_cleanup}, all existing plans become undefined,
Chris@19	229 and you should not attempt to execute them nor to destroy them. You can
Chris@19	230 however create and execute/destroy new plans, in which case FFTW starts
Chris@19	231 accumulating wisdom information again.
Chris@19	232
Chris@19	233 @code{fftw_cleanup} does not deallocate your plans, however. To prevent
Chris@19	234 memory leaks, you must still call @code{fftw_destroy_plan} before
Chris@19	235 executing @code{fftw_cleanup}.
Chris@19	236
Chris@19	237 Occasionally, it may useful to know FFTW's internal ``cost'' metric
Chris@19	238 that it uses to compare plans to one another; this cost is
Chris@19	239 proportional to an execution time of the plan, in undocumented units,
Chris@19	240 if the plan was created with the @code{FFTW_MEASURE} or other
Chris@19	241 timing-based options, or alternatively is a heuristic cost function
Chris@19	242 for @code{FFTW_ESTIMATE} plans. (The cost values of measured and
Chris@19	243 estimated plans are not comparable, being in different units. Also,
Chris@19	244 costs from different FFTW versions or the same version compiled
Chris@19	245 differently may not be in the same units. Plans created from wisdom
Chris@19	246 have a cost of 0 since no timing measurement is performed for them.
Chris@19	247 Finally, certain problems for which only one top-level algorithm was
Chris@19	248 possible may have required no measurements of the cost of the whole
Chris@19	249 plan, in which case @code{fftw_cost} will also return 0.) The cost
Chris@19	250 metric for a given plan is returned by:
Chris@19	251
Chris@19	252 @example
Chris@19	253 double fftw_cost(const fftw_plan plan);
Chris@19	254 @end example
Chris@19	255 @findex fftw_cost
Chris@19	256
Chris@19	257 The following two routines are provided purely for academic purposes
Chris@19	258 (that is, for entertainment).
Chris@19	259
Chris@19	260 @example
Chris@19	261 void fftw_flops(const fftw_plan plan,
Chris@19	262 double add, double mul, double *fma);
Chris@19	263 @end example
Chris@19	264 @findex fftw_flops
Chris@19	265
Chris@19	266 Given a @code{plan}, set @code{add}, @code{mul}, and @code{fma} to an
Chris@19	267 exact count of the number of floating-point additions, multiplications,
Chris@19	268 and fused multiply-add operations involved in the plan's execution. The
Chris@19	269 total number of floating-point operations (flops) is @code{add + mul +
Chris@19	270 2*fma}, or @code{add + mul + fma} if the hardware supports fused
Chris@19	271 multiply-add instructions (although the number of FMA operations is only
Chris@19	272 approximate because of compiler voodoo). (The number of operations
Chris@19	273 should be an integer, but we use @code{double} to avoid overflowing
Chris@19	274 @code{int} for large transforms; the arguments are of type @code{double}
Chris@19	275 even for single and long-double precision versions of FFTW.)
Chris@19	276
Chris@19	277 @example
Chris@19	278 void fftw_fprint_plan(const fftw_plan plan, FILE *output_file);
Chris@19	279 void fftw_print_plan(const fftw_plan plan);
Chris@19	280 char *fftw_sprint_plan(const fftw_plan plan);
Chris@19	281 @end example
Chris@19	282 @findex fftw_fprint_plan
Chris@19	283 @findex fftw_print_plan
Chris@19	284
Chris@19	285 This outputs a ``nerd-readable'' representation of the @code{plan} to
Chris@19	286 the given file, to @code{stdout}, or two a newly allocated
Chris@19	287 NUL-terminated string (which the caller is responsible for deallocating
Chris@19	288 with @code{free}), respectively.
Chris@19	289
Chris@19	290 @c ------------------------------------------------------------
Chris@19	291 @node Basic Interface, Advanced Interface, Using Plans, FFTW Reference
Chris@19	292 @section Basic Interface
Chris@19	293 @cindex basic interface
Chris@19	294
Chris@19	295 Recall that the FFTW API is divided into three parts@footnote{@i{Gallia est
Chris@19	296 omnis divisa in partes tres} (Julius Caesar).}: the @dfn{basic interface}
Chris@19	297 computes a single transform of contiguous data, the @dfn{advanced
Chris@19	298 interface} computes transforms of multiple or strided arrays, and the
Chris@19	299 @dfn{guru interface} supports the most general data layouts,
Chris@19	300 multiplicities, and strides. This section describes the the basic
Chris@19	301 interface, which we expect to satisfy the needs of most users.
Chris@19	302
Chris@19	303 @menu
Chris@19	304 * Complex DFTs::
Chris@19	305 * Planner Flags::
Chris@19	306 * Real-data DFTs::
Chris@19	307 * Real-data DFT Array Format::
Chris@19	308 * Real-to-Real Transforms::
Chris@19	309 * Real-to-Real Transform Kinds::
Chris@19	310 @end menu
Chris@19	311
Chris@19	312 @c =========>
Chris@19	313 @node Complex DFTs, Planner Flags, Basic Interface, Basic Interface
Chris@19	314 @subsection Complex DFTs
Chris@19	315
Chris@19	316 @example
Chris@19	317 fftw_plan fftw_plan_dft_1d(int n0,
Chris@19	318 fftw_complex in, fftw_complex out,
Chris@19	319 int sign, unsigned flags);
Chris@19	320 fftw_plan fftw_plan_dft_2d(int n0, int n1,
Chris@19	321 fftw_complex in, fftw_complex out,
Chris@19	322 int sign, unsigned flags);
Chris@19	323 fftw_plan fftw_plan_dft_3d(int n0, int n1, int n2,
Chris@19	324 fftw_complex in, fftw_complex out,
Chris@19	325 int sign, unsigned flags);
Chris@19	326 fftw_plan fftw_plan_dft(int rank, const int *n,
Chris@19	327 fftw_complex in, fftw_complex out,
Chris@19	328 int sign, unsigned flags);
Chris@19	329 @end example
Chris@19	330 @findex fftw_plan_dft_1d
Chris@19	331 @findex fftw_plan_dft_2d
Chris@19	332 @findex fftw_plan_dft_3d
Chris@19	333 @findex fftw_plan_dft
Chris@19	334
Chris@19	335 Plan a complex input/output discrete Fourier transform (DFT) in zero or
Chris@19	336 more dimensions, returning an @code{fftw_plan} (@pxref{Using Plans}).
Chris@19	337
Chris@19	338 Once you have created a plan for a certain transform type and
Chris@19	339 parameters, then creating another plan of the same type and parameters,
Chris@19	340 but for different arrays, is fast and shares constant data with the
Chris@19	341 first plan (if it still exists).
Chris@19	342
Chris@19	343 The planner returns @code{NULL} if the plan cannot be created. In the
Chris@19	344 standard FFTW distribution, the basic interface is guaranteed to return
Chris@19	345 a non-@code{NULL} plan. A plan may be @code{NULL}, however, if you are
Chris@19	346 using a customized FFTW configuration supporting a restricted set of
Chris@19	347 transforms.
Chris@19	348
Chris@19	349 @subsubheading Arguments
Chris@19	350 @itemize @bullet
Chris@19	351
Chris@19	352 @item
Chris@19	353 @code{rank} is the rank of the transform (it should be the size of the
Chris@19	354 array @code{*n}), and can be any non-negative integer. (@xref{Complex
Chris@19	355 Multi-Dimensional DFTs}, for the definition of ``rank''.) The
Chris@19	356 @samp{_1d}, @samp{_2d}, and @samp{_3d} planners correspond to a
Chris@19	357 @code{rank} of @code{1}, @code{2}, and @code{3}, respectively. The rank
Chris@19	358 may be zero, which is equivalent to a rank-1 transform of size 1, i.e. a
Chris@19	359 copy of one number from input to output.
Chris@19	360
Chris@19	361 @item
Chris@19	362 @code{n0}, @code{n1}, @code{n2}, or @code{n[0..rank-1]} (as appropriate
Chris@19	363 for each routine) specify the size of the transform dimensions. They
Chris@19	364 can be any positive integer.
Chris@19	365
Chris@19	366 @itemize @minus
Chris@19	367 @item
Chris@19	368 @cindex row-major
Chris@19	369 Multi-dimensional arrays are stored in row-major order with dimensions:
Chris@19	370 @code{n0} x @code{n1}; or @code{n0} x @code{n1} x @code{n2}; or
Chris@19	371 @code{n[0]} x @code{n[1]} x ... x @code{n[rank-1]}.
Chris@19	372 @xref{Multi-dimensional Array Format}.
Chris@19	373 @item
Chris@19	374 FFTW is best at handling sizes of the form
Chris@19	375 @ifinfo
Chris@19	376 @math{2^a 3^b 5^c 7^d 11^e 13^f},
Chris@19	377 @end ifinfo
Chris@19	378 @tex
Chris@19	379 $2^a 3^b 5^c 7^d 11^e 13^f$,
Chris@19	380 @end tex
Chris@19	381 @html
Chris@19	382 2<sup>a</sup> 3<sup>b</sup> 5<sup>c</sup> 7<sup>d</sup>
Chris@19	383 11<sup>e</sup> 13<sup>f</sup>,
Chris@19	384 @end html
Chris@19	385 where @math{e+f} is either @math{0} or @math{1}, and the other exponents
Chris@19	386 are arbitrary. Other sizes are computed by means of a slow,
Chris@19	387 general-purpose algorithm (which nevertheless retains @Onlogn{} performance even for prime sizes). It is possible to customize FFTW
Chris@19	388 for different array sizes; see @ref{Installation and Customization}.
Chris@19	389 Transforms whose sizes are powers of @math{2} are especially fast.
Chris@19	390 @end itemize
Chris@19	391
Chris@19	392 @item
Chris@19	393 @code{in} and @code{out} point to the input and output arrays of the
Chris@19	394 transform, which may be the same (yielding an in-place transform).
Chris@19	395 @cindex in-place
Chris@19	396 These arrays are overwritten during planning, unless
Chris@19	397 @code{FFTW_ESTIMATE} is used in the flags. (The arrays need not be
Chris@19	398 initialized, but they must be allocated.)
Chris@19	399
Chris@19	400 If @code{in == out}, the transform is @dfn{in-place} and the input
Chris@19	401 array is overwritten. If @code{in != out}, the two arrays must
Chris@19	402 not overlap (but FFTW does not check for this condition).
Chris@19	403
Chris@19	404 @item
Chris@19	405 @ctindex FFTW_FORWARD
Chris@19	406 @ctindex FFTW_BACKWARD
Chris@19	407 @code{sign} is the sign of the exponent in the formula that defines the
Chris@19	408 Fourier transform. It can be @math{-1} (= @code{FFTW_FORWARD}) or
Chris@19	409 @math{+1} (= @code{FFTW_BACKWARD}).
Chris@19	410
Chris@19	411 @item
Chris@19	412 @cindex flags
Chris@19	413 @code{flags} is a bitwise OR (@samp{\|}) of zero or more planner flags,
Chris@19	414 as defined in @ref{Planner Flags}.
Chris@19	415
Chris@19	416 @end itemize
Chris@19	417
Chris@19	418 FFTW computes an unnormalized transform: computing a forward followed by
Chris@19	419 a backward transform (or vice versa) will result in the original data
Chris@19	420 multiplied by the size of the transform (the product of the dimensions).
Chris@19	421 @cindex normalization
Chris@19	422 For more information, see @ref{What FFTW Really Computes}.
Chris@19	423
Chris@19	424 @c =========>
Chris@19	425 @node Planner Flags, Real-data DFTs, Complex DFTs, Basic Interface
Chris@19	426 @subsection Planner Flags
Chris@19	427
Chris@19	428 All of the planner routines in FFTW accept an integer @code{flags}
Chris@19	429 argument, which is a bitwise OR (@samp{\|}) of zero or more of the flag
Chris@19	430 constants defined below. These flags control the rigor (and time) of
Chris@19	431 the planning process, and can also impose (or lift) restrictions on the
Chris@19	432 type of transform algorithm that is employed.
Chris@19	433
Chris@19	434 @emph{Important:} the planner overwrites the input array during
Chris@19	435 planning unless a saved plan (@pxref{Wisdom}) is available for that
Chris@19	436 problem, so you should initialize your input data after creating the
Chris@19	437 plan. The only exceptions to this are the @code{FFTW_ESTIMATE} and
Chris@19	438 @code{FFTW_WISDOM_ONLY} flags, as mentioned below.
Chris@19	439
Chris@19	440 In all cases, if wisdom is available for the given problem that was
Chris@19	441 created with equal-or-greater planning rigor, then the more rigorous
Chris@19	442 wisdom is used. For example, in @code{FFTW_ESTIMATE} mode any available
Chris@19	443 wisdom is used, whereas in @code{FFTW_PATIENT} mode only wisdom created
Chris@19	444 in patient or exhaustive mode can be used. @xref{Words of Wisdom-Saving
Chris@19	445 Plans}.
Chris@19	446
Chris@19	447 @subsubheading Planning-rigor flags
Chris@19	448 @itemize @bullet
Chris@19	449
Chris@19	450 @item
Chris@19	451 @ctindex FFTW_ESTIMATE
Chris@19	452 @code{FFTW_ESTIMATE} specifies that, instead of actual measurements of
Chris@19	453 different algorithms, a simple heuristic is used to pick a (probably
Chris@19	454 sub-optimal) plan quickly. With this flag, the input/output arrays are
Chris@19	455 not overwritten during planning.
Chris@19	456
Chris@19	457 @item
Chris@19	458 @ctindex FFTW_MEASURE
Chris@19	459 @code{FFTW_MEASURE} tells FFTW to find an optimized plan by actually
Chris@19	460 @emph{computing} several FFTs and measuring their execution time.
Chris@19	461 Depending on your machine, this can take some time (often a few
Chris@19	462 seconds). @code{FFTW_MEASURE} is the default planning option.
Chris@19	463
Chris@19	464 @item
Chris@19	465 @ctindex FFTW_PATIENT
Chris@19	466 @code{FFTW_PATIENT} is like @code{FFTW_MEASURE}, but considers a wider
Chris@19	467 range of algorithms and often produces a ``more optimal'' plan
Chris@19	468 (especially for large transforms), but at the expense of several times
Chris@19	469 longer planning time (especially for large transforms).
Chris@19	470
Chris@19	471 @item
Chris@19	472 @ctindex FFTW_EXHAUSTIVE
Chris@19	473 @code{FFTW_EXHAUSTIVE} is like @code{FFTW_PATIENT}, but considers an
Chris@19	474 even wider range of algorithms, including many that we think are
Chris@19	475 unlikely to be fast, to produce the most optimal plan but with a
Chris@19	476 substantially increased planning time.
Chris@19	477
Chris@19	478 @item
Chris@19	479 @ctindex FFTW_WISDOM_ONLY
Chris@19	480 @code{FFTW_WISDOM_ONLY} is a special planning mode in which the plan
Chris@19	481 is only created if wisdom is available for the given problem, and
Chris@19	482 otherwise a @code{NULL} plan is returned. This can be combined with
Chris@19	483 other flags, e.g. @samp{FFTW_WISDOM_ONLY \| FFTW_PATIENT} creates a
Chris@19	484 plan only if wisdom is available that was created in
Chris@19	485 @code{FFTW_PATIENT} or @code{FFTW_EXHAUSTIVE} mode. The
Chris@19	486 @code{FFTW_WISDOM_ONLY} flag is intended for users who need to detect
Chris@19	487 whether wisdom is available; for example, if wisdom is not available
Chris@19	488 one may wish to allocate new arrays for planning so that user data is
Chris@19	489 not overwritten.
Chris@19	490
Chris@19	491 @end itemize
Chris@19	492
Chris@19	493 @subsubheading Algorithm-restriction flags
Chris@19	494 @itemize @bullet
Chris@19	495
Chris@19	496 @item
Chris@19	497 @ctindex FFTW_DESTROY_INPUT
Chris@19	498 @code{FFTW_DESTROY_INPUT} specifies that an out-of-place transform is
Chris@19	499 allowed to @emph{overwrite its input} array with arbitrary data; this
Chris@19	500 can sometimes allow more efficient algorithms to be employed.
Chris@19	501 @cindex out-of-place
Chris@19	502
Chris@19	503 @item
Chris@19	504 @ctindex FFTW_PRESERVE_INPUT
Chris@19	505 @code{FFTW_PRESERVE_INPUT} specifies that an out-of-place transform must
Chris@19	506 @emph{not change its input} array. This is ordinarily the
Chris@19	507 @emph{default}, except for c2r and hc2r (i.e. complex-to-real)
Chris@19	508 transforms for which @code{FFTW_DESTROY_INPUT} is the default. In the
Chris@19	509 latter cases, passing @code{FFTW_PRESERVE_INPUT} will attempt to use
Chris@19	510 algorithms that do not destroy the input, at the expense of worse
Chris@19	511 performance; for multi-dimensional c2r transforms, however, no
Chris@19	512 input-preserving algorithms are implemented and the planner will return
Chris@19	513 @code{NULL} if one is requested.
Chris@19	514 @cindex c2r
Chris@19	515 @cindex hc2r
Chris@19	516
Chris@19	517 @item
Chris@19	518 @ctindex FFTW_UNALIGNED
Chris@19	519 @cindex alignment
Chris@19	520 @findex fftw_malloc
Chris@19	521 @findex fftw_alignment_of
Chris@19	522 @code{FFTW_UNALIGNED} specifies that the algorithm may not impose any
Chris@19	523 unusual alignment requirements on the input/output arrays (i.e. no
Chris@19	524 SIMD may be used). This flag is normally @emph{not necessary}, since
Chris@19	525 the planner automatically detects misaligned arrays. The only use for
Chris@19	526 this flag is if you want to use the new-array execute interface to
Chris@19	527 execute a given plan on a different array that may not be aligned like
Chris@19	528 the original. (Using @code{fftw_malloc} makes this flag unnecessary
Chris@19	529 even then. You can also use @code{fftw_alignment_of} to detect
Chris@19	530 whether two arrays are equivalently aligned.)
Chris@19	531
Chris@19	532 @end itemize
Chris@19	533
Chris@19	534 @subsubheading Limiting planning time
Chris@19	535
Chris@19	536 @example
Chris@19	537 extern void fftw_set_timelimit(double seconds);
Chris@19	538 @end example
Chris@19	539 @findex fftw_set_timelimit
Chris@19	540
Chris@19	541 This function instructs FFTW to spend at most @code{seconds} seconds
Chris@19	542 (approximately) in the planner. If @code{seconds ==
Chris@19	543 FFTW_NO_TIMELIMIT} (the default value, which is negative), then
Chris@19	544 planning time is unbounded. Otherwise, FFTW plans with a
Chris@19	545 progressively wider range of algorithms until the the given time limit
Chris@19	546 is reached or the given range of algorithms is explored, returning the
Chris@19	547 best available plan.
Chris@19	548 @ctindex FFTW_NO_TIMELIMIT
Chris@19	549
Chris@19	550
Chris@19	551 For example, specifying @code{FFTW_PATIENT} first plans in
Chris@19	552 @code{FFTW_ESTIMATE} mode, then in @code{FFTW_MEASURE} mode, then
Chris@19	553 finally (time permitting) in @code{FFTW_PATIENT}. If
Chris@19	554 @code{FFTW_EXHAUSTIVE} is specified instead, the planner will further
Chris@19	555 progress to @code{FFTW_EXHAUSTIVE} mode.
Chris@19	556
Chris@19	557 Note that the @code{seconds} argument specifies only a rough limit; in
Chris@19	558 practice, the planner may use somewhat more time if the time limit is
Chris@19	559 reached when the planner is in the middle of an operation that cannot
Chris@19	560 be interrupted. At the very least, the planner will complete planning
Chris@19	561 in @code{FFTW_ESTIMATE} mode (which is thus equivalent to a time limit
Chris@19	562 of 0).
Chris@19	563
Chris@19	564
Chris@19	565 @c =========>
Chris@19	566 @node Real-data DFTs, Real-data DFT Array Format, Planner Flags, Basic Interface
Chris@19	567 @subsection Real-data DFTs
Chris@19	568
Chris@19	569 @example
Chris@19	570 fftw_plan fftw_plan_dft_r2c_1d(int n0,
Chris@19	571 double in, fftw_complex out,
Chris@19	572 unsigned flags);
Chris@19	573 fftw_plan fftw_plan_dft_r2c_2d(int n0, int n1,
Chris@19	574 double in, fftw_complex out,
Chris@19	575 unsigned flags);
Chris@19	576 fftw_plan fftw_plan_dft_r2c_3d(int n0, int n1, int n2,
Chris@19	577 double in, fftw_complex out,
Chris@19	578 unsigned flags);
Chris@19	579 fftw_plan fftw_plan_dft_r2c(int rank, const int *n,
Chris@19	580 double in, fftw_complex out,
Chris@19	581 unsigned flags);
Chris@19	582 @end example
Chris@19	583 @findex fftw_plan_dft_r2c_1d
Chris@19	584 @findex fftw_plan_dft_r2c_2d
Chris@19	585 @findex fftw_plan_dft_r2c_3d
Chris@19	586 @findex fftw_plan_dft_r2c
Chris@19	587 @cindex r2c
Chris@19	588
Chris@19	589 Plan a real-input/complex-output discrete Fourier transform (DFT) in
Chris@19	590 zero or more dimensions, returning an @code{fftw_plan} (@pxref{Using
Chris@19	591 Plans}).
Chris@19	592
Chris@19	593 Once you have created a plan for a certain transform type and
Chris@19	594 parameters, then creating another plan of the same type and parameters,
Chris@19	595 but for different arrays, is fast and shares constant data with the
Chris@19	596 first plan (if it still exists).
Chris@19	597
Chris@19	598 The planner returns @code{NULL} if the plan cannot be created. A
Chris@19	599 non-@code{NULL} plan is always returned by the basic interface unless
Chris@19	600 you are using a customized FFTW configuration supporting a restricted
Chris@19	601 set of transforms, or if you use the @code{FFTW_PRESERVE_INPUT} flag
Chris@19	602 with a multi-dimensional out-of-place c2r transform (see below).
Chris@19	603
Chris@19	604 @subsubheading Arguments
Chris@19	605 @itemize @bullet
Chris@19	606
Chris@19	607 @item
Chris@19	608 @code{rank} is the rank of the transform (it should be the size of the
Chris@19	609 array @code{*n}), and can be any non-negative integer. (@xref{Complex
Chris@19	610 Multi-Dimensional DFTs}, for the definition of ``rank''.) The
Chris@19	611 @samp{_1d}, @samp{_2d}, and @samp{_3d} planners correspond to a
Chris@19	612 @code{rank} of @code{1}, @code{2}, and @code{3}, respectively. The rank
Chris@19	613 may be zero, which is equivalent to a rank-1 transform of size 1, i.e. a
Chris@19	614 copy of one real number (with zero imaginary part) from input to output.
Chris@19	615
Chris@19	616 @item
Chris@19	617 @code{n0}, @code{n1}, @code{n2}, or @code{n[0..rank-1]}, (as appropriate
Chris@19	618 for each routine) specify the size of the transform dimensions. They
Chris@19	619 can be any positive integer. This is different in general from the
Chris@19	620 @emph{physical} array dimensions, which are described in @ref{Real-data
Chris@19	621 DFT Array Format}.
Chris@19	622
Chris@19	623 @itemize @minus
Chris@19	624 @item
Chris@19	625 FFTW is best at handling sizes of the form
Chris@19	626 @ifinfo
Chris@19	627 @math{2^a 3^b 5^c 7^d 11^e 13^f},
Chris@19	628 @end ifinfo
Chris@19	629 @tex
Chris@19	630 $2^a 3^b 5^c 7^d 11^e 13^f$,
Chris@19	631 @end tex
Chris@19	632 @html
Chris@19	633 2<sup>a</sup> 3<sup>b</sup> 5<sup>c</sup> 7<sup>d</sup>
Chris@19	634 11<sup>e</sup> 13<sup>f</sup>,
Chris@19	635 @end html
Chris@19	636 where @math{e+f} is either @math{0} or @math{1}, and the other exponents
Chris@19	637 are arbitrary. Other sizes are computed by means of a slow,
Chris@19	638 general-purpose algorithm (which nevertheless retains @Onlogn{} performance even for prime sizes). (It is possible to customize FFTW
Chris@19	639 for different array sizes; see @ref{Installation and Customization}.)
Chris@19	640 Transforms whose sizes are powers of @math{2} are especially fast, and
Chris@19	641 it is generally beneficial for the @emph{last} dimension of an r2c/c2r
Chris@19	642 transform to be @emph{even}.
Chris@19	643 @end itemize
Chris@19	644
Chris@19	645 @item
Chris@19	646 @code{in} and @code{out} point to the input and output arrays of the
Chris@19	647 transform, which may be the same (yielding an in-place transform).
Chris@19	648 @cindex in-place
Chris@19	649 These arrays are overwritten during planning, unless
Chris@19	650 @code{FFTW_ESTIMATE} is used in the flags. (The arrays need not be
Chris@19	651 initialized, but they must be allocated.) For an in-place transform, it
Chris@19	652 is important to remember that the real array will require padding,
Chris@19	653 described in @ref{Real-data DFT Array Format}.
Chris@19	654 @cindex padding
Chris@19	655
Chris@19	656 @item
Chris@19	657 @cindex flags
Chris@19	658 @code{flags} is a bitwise OR (@samp{\|}) of zero or more planner flags,
Chris@19	659 as defined in @ref{Planner Flags}.
Chris@19	660
Chris@19	661 @end itemize
Chris@19	662
Chris@19	663 The inverse transforms, taking complex input (storing the non-redundant
Chris@19	664 half of a logically Hermitian array) to real output, are given by:
Chris@19	665
Chris@19	666 @example
Chris@19	667 fftw_plan fftw_plan_dft_c2r_1d(int n0,
Chris@19	668 fftw_complex in, double out,
Chris@19	669 unsigned flags);
Chris@19	670 fftw_plan fftw_plan_dft_c2r_2d(int n0, int n1,
Chris@19	671 fftw_complex in, double out,
Chris@19	672 unsigned flags);
Chris@19	673 fftw_plan fftw_plan_dft_c2r_3d(int n0, int n1, int n2,
Chris@19	674 fftw_complex in, double out,
Chris@19	675 unsigned flags);
Chris@19	676 fftw_plan fftw_plan_dft_c2r(int rank, const int *n,
Chris@19	677 fftw_complex in, double out,
Chris@19	678 unsigned flags);
Chris@19	679 @end example
Chris@19	680 @findex fftw_plan_dft_c2r_1d
Chris@19	681 @findex fftw_plan_dft_c2r_2d
Chris@19	682 @findex fftw_plan_dft_c2r_3d
Chris@19	683 @findex fftw_plan_dft_c2r
Chris@19	684 @cindex c2r
Chris@19	685
Chris@19	686 The arguments are the same as for the r2c transforms, except that the
Chris@19	687 input and output data formats are reversed.
Chris@19	688
Chris@19	689 FFTW computes an unnormalized transform: computing an r2c followed by a
Chris@19	690 c2r transform (or vice versa) will result in the original data
Chris@19	691 multiplied by the size of the transform (the product of the logical
Chris@19	692 dimensions).
Chris@19	693 @cindex normalization
Chris@19	694 An r2c transform produces the same output as a @code{FFTW_FORWARD}
Chris@19	695 complex DFT of the same input, and a c2r transform is correspondingly
Chris@19	696 equivalent to @code{FFTW_BACKWARD}. For more information, see @ref{What
Chris@19	697 FFTW Really Computes}.
Chris@19	698
Chris@19	699 @c =========>
Chris@19	700 @node Real-data DFT Array Format, Real-to-Real Transforms, Real-data DFTs, Basic Interface
Chris@19	701 @subsection Real-data DFT Array Format
Chris@19	702 @cindex r2c/c2r multi-dimensional array format
Chris@19	703
Chris@19	704 The output of a DFT of real data (r2c) contains symmetries that, in
Chris@19	705 principle, make half of the outputs redundant (@pxref{What FFTW Really
Chris@19	706 Computes}). (Similarly for the input of an inverse c2r transform.) In
Chris@19	707 practice, it is not possible to entirely realize these savings in an
Chris@19	708 efficient and understandable format that generalizes to
Chris@19	709 multi-dimensional transforms. Instead, the output of the r2c
Chris@19	710 transforms is @emph{slightly} over half of the output of the
Chris@19	711 corresponding complex transform. We do not ``pack'' the data in any
Chris@19	712 way, but store it as an ordinary array of @code{fftw_complex} values.
Chris@19	713 In fact, this data is simply a subsection of what would be the array in
Chris@19	714 the corresponding complex transform.
Chris@19	715
Chris@19	716 Specifically, for a real transform of @math{d} (= @code{rank})
Chris@19	717 dimensions @ndims{}, the complex data is an @ndimshalf array of
Chris@19	718 @code{fftw_complex} values in row-major order (with the division rounded
Chris@19	719 down). That is, we only store the @emph{lower} half (non-negative
Chris@19	720 frequencies), plus one element, of the last dimension of the data from
Chris@19	721 the ordinary complex transform. (We could have instead taken half of
Chris@19	722 any other dimension, but implementation turns out to be simpler if the
Chris@19	723 last, contiguous, dimension is used.)
Chris@19	724
Chris@19	725 @cindex out-of-place
Chris@19	726 For an out-of-place transform, the real data is simply an array with
Chris@19	727 physical dimensions @ndims in row-major order.
Chris@19	728
Chris@19	729 @cindex in-place
Chris@19	730 @cindex padding
Chris@19	731 For an in-place transform, some complications arise since the complex data
Chris@19	732 is slightly larger than the real data. In this case, the final
Chris@19	733 dimension of the real data must be @emph{padded} with extra values to
Chris@19	734 accommodate the size of the complex data---two extra if the last
Chris@19	735 dimension is even and one if it is odd. That is, the last dimension of
Chris@19	736 the real data must physically contain
Chris@19	737 @tex
Chris@19	738 $2 (n_{d-1}/2+1)$
Chris@19	739 @end tex
Chris@19	740 @ifinfo
Chris@19	741 2 * (n[d-1]/2+1)
Chris@19	742 @end ifinfo
Chris@19	743 @html
Chris@19	744 2 * (n<sub>d-1</sub>/2+1)
Chris@19	745 @end html
Chris@19	746 @code{double} values (exactly enough to hold the complex data). This
Chris@19	747 physical array size does not, however, change the @emph{logical} array
Chris@19	748 size---only
Chris@19	749 @tex
Chris@19	750 $n_{d-1}$
Chris@19	751 @end tex
Chris@19	752 @ifinfo
Chris@19	753 n[d-1]
Chris@19	754 @end ifinfo
Chris@19	755 @html
Chris@19	756 n<sub>d-1</sub>
Chris@19	757 @end html
Chris@19	758 values are actually stored in the last dimension, and
Chris@19	759 @tex
Chris@19	760 $n_{d-1}$
Chris@19	761 @end tex
Chris@19	762 @ifinfo
Chris@19	763 n[d-1]
Chris@19	764 @end ifinfo
Chris@19	765 @html
Chris@19	766 n<sub>d-1</sub>
Chris@19	767 @end html
Chris@19	768 is the last dimension passed to the planner.
Chris@19	769
Chris@19	770 @c =========>
Chris@19	771 @node Real-to-Real Transforms, Real-to-Real Transform Kinds, Real-data DFT Array Format, Basic Interface
Chris@19	772 @subsection Real-to-Real Transforms
Chris@19	773 @cindex r2r
Chris@19	774
Chris@19	775 @example
Chris@19	776 fftw_plan fftw_plan_r2r_1d(int n, double in, double out,
Chris@19	777 fftw_r2r_kind kind, unsigned flags);
Chris@19	778 fftw_plan fftw_plan_r2r_2d(int n0, int n1, double in, double out,
Chris@19	779 fftw_r2r_kind kind0, fftw_r2r_kind kind1,
Chris@19	780 unsigned flags);
Chris@19	781 fftw_plan fftw_plan_r2r_3d(int n0, int n1, int n2,
Chris@19	782 double in, double out,
Chris@19	783 fftw_r2r_kind kind0,
Chris@19	784 fftw_r2r_kind kind1,
Chris@19	785 fftw_r2r_kind kind2,
Chris@19	786 unsigned flags);
Chris@19	787 fftw_plan fftw_plan_r2r(int rank, const int n, double in, double *out,
Chris@19	788 const fftw_r2r_kind *kind, unsigned flags);
Chris@19	789 @end example
Chris@19	790 @findex fftw_plan_r2r_1d
Chris@19	791 @findex fftw_plan_r2r_2d
Chris@19	792 @findex fftw_plan_r2r_3d
Chris@19	793 @findex fftw_plan_r2r
Chris@19	794
Chris@19	795 Plan a real input/output (r2r) transform of various kinds in zero or
Chris@19	796 more dimensions, returning an @code{fftw_plan} (@pxref{Using Plans}).
Chris@19	797
Chris@19	798 Once you have created a plan for a certain transform type and
Chris@19	799 parameters, then creating another plan of the same type and parameters,
Chris@19	800 but for different arrays, is fast and shares constant data with the
Chris@19	801 first plan (if it still exists).
Chris@19	802
Chris@19	803 The planner returns @code{NULL} if the plan cannot be created. A
Chris@19	804 non-@code{NULL} plan is always returned by the basic interface unless
Chris@19	805 you are using a customized FFTW configuration supporting a restricted
Chris@19	806 set of transforms, or for size-1 @code{FFTW_REDFT00} kinds (which are
Chris@19	807 not defined).
Chris@19	808 @ctindex FFTW_REDFT00
Chris@19	809
Chris@19	810 @subsubheading Arguments
Chris@19	811 @itemize @bullet
Chris@19	812
Chris@19	813 @item
Chris@19	814 @code{rank} is the dimensionality of the transform (it should be the
Chris@19	815 size of the arrays @code{n} and @code{kind}), and can be any
Chris@19	816 non-negative integer. The @samp{_1d}, @samp{_2d}, and @samp{_3d}
Chris@19	817 planners correspond to a @code{rank} of @code{1}, @code{2}, and
Chris@19	818 @code{3}, respectively. A @code{rank} of zero is equivalent to a copy
Chris@19	819 of one number from input to output.
Chris@19	820
Chris@19	821 @item
Chris@19	822 @code{n}, or @code{n0}/@code{n1}/@code{n2}, or @code{n[rank]},
Chris@19	823 respectively, gives the (physical) size of the transform dimensions.
Chris@19	824 They can be any positive integer.
Chris@19	825
Chris@19	826 @itemize @minus
Chris@19	827 @item
Chris@19	828 @cindex row-major
Chris@19	829 Multi-dimensional arrays are stored in row-major order with dimensions:
Chris@19	830 @code{n0} x @code{n1}; or @code{n0} x @code{n1} x @code{n2}; or
Chris@19	831 @code{n[0]} x @code{n[1]} x ... x @code{n[rank-1]}.
Chris@19	832 @xref{Multi-dimensional Array Format}.
Chris@19	833 @item
Chris@19	834 FFTW is generally best at handling sizes of the form
Chris@19	835 @ifinfo
Chris@19	836 @math{2^a 3^b 5^c 7^d 11^e 13^f},
Chris@19	837 @end ifinfo
Chris@19	838 @tex
Chris@19	839 $2^a 3^b 5^c 7^d 11^e 13^f$,
Chris@19	840 @end tex
Chris@19	841 @html
Chris@19	842 2<sup>a</sup> 3<sup>b</sup> 5<sup>c</sup> 7<sup>d</sup>
Chris@19	843 11<sup>e</sup> 13<sup>f</sup>,
Chris@19	844 @end html
Chris@19	845 where @math{e+f} is either @math{0} or @math{1}, and the other exponents
Chris@19	846 are arbitrary. Other sizes are computed by means of a slow,
Chris@19	847 general-purpose algorithm (which nevertheless retains @Onlogn{} performance even for prime sizes). (It is possible to customize FFTW
Chris@19	848 for different array sizes; see @ref{Installation and Customization}.)
Chris@19	849 Transforms whose sizes are powers of @math{2} are especially fast.
Chris@19	850 @item
Chris@19	851 For a @code{REDFT00} or @code{RODFT00} transform kind in a dimension of
Chris@19	852 size @math{n}, it is @math{n-1} or @math{n+1}, respectively, that
Chris@19	853 should be factorizable in the above form.
Chris@19	854 @end itemize
Chris@19	855
Chris@19	856 @item
Chris@19	857 @code{in} and @code{out} point to the input and output arrays of the
Chris@19	858 transform, which may be the same (yielding an in-place transform).
Chris@19	859 @cindex in-place
Chris@19	860 These arrays are overwritten during planning, unless
Chris@19	861 @code{FFTW_ESTIMATE} is used in the flags. (The arrays need not be
Chris@19	862 initialized, but they must be allocated.)
Chris@19	863
Chris@19	864 @item
Chris@19	865 @code{kind}, or @code{kind0}/@code{kind1}/@code{kind2}, or
Chris@19	866 @code{kind[rank]}, is the kind of r2r transform used for the
Chris@19	867 corresponding dimension. The valid kind constants are described in
Chris@19	868 @ref{Real-to-Real Transform Kinds}. In a multi-dimensional transform,
Chris@19	869 what is computed is the separable product formed by taking each
Chris@19	870 transform kind along the corresponding dimension, one dimension after
Chris@19	871 another.
Chris@19	872
Chris@19	873 @item
Chris@19	874 @cindex flags
Chris@19	875 @code{flags} is a bitwise OR (@samp{\|}) of zero or more planner flags,
Chris@19	876 as defined in @ref{Planner Flags}.
Chris@19	877
Chris@19	878 @end itemize
Chris@19	879
Chris@19	880 @c =========>
Chris@19	881 @node Real-to-Real Transform Kinds, , Real-to-Real Transforms, Basic Interface
Chris@19	882 @subsection Real-to-Real Transform Kinds
Chris@19	883 @cindex kind (r2r)
Chris@19	884
Chris@19	885 FFTW currently supports 11 different r2r transform kinds, specified by
Chris@19	886 one of the constants below. For the precise definitions of these
Chris@19	887 transforms, see @ref{What FFTW Really Computes}. For a more colloquial
Chris@19	888 introduction to these transform kinds, see @ref{More DFTs of Real Data}.
Chris@19	889
Chris@19	890 For dimension of size @code{n}, there is a corresponding ``logical''
Chris@19	891 dimension @code{N} that determines the normalization (and the optimal
Chris@19	892 factorization); the formula for @code{N} is given for each kind below.
Chris@19	893 Also, with each transform kind is listed its corrsponding inverse
Chris@19	894 transform. FFTW computes unnormalized transforms: a transform followed
Chris@19	895 by its inverse will result in the original data multiplied by @code{N}
Chris@19	896 (or the product of the @code{N}'s for each dimension, in
Chris@19	897 multi-dimensions).
Chris@19	898 @cindex normalization
Chris@19	899
Chris@19	900 @itemize @bullet
Chris@19	901
Chris@19	902 @item
Chris@19	903 @ctindex FFTW_R2HC
Chris@19	904 @code{FFTW_R2HC} computes a real-input DFT with output in
Chris@19	905 ``halfcomplex'' format, i.e. real and imaginary parts for a transform of
Chris@19	906 size @code{n} stored as:
Chris@19	907 @tex
Chris@19	908 $$
Chris@19	909 r_0, r_1, r_2, \ldots, r_{n/2}, i_{(n+1)/2-1}, \ldots, i_2, i_1
Chris@19	910 $$
Chris@19	911 @end tex
Chris@19	912 @ifinfo
Chris@19	913 r0, r1, r2, r(n/2), i((n+1)/2-1), ..., i2, i1
Chris@19	914 @end ifinfo
Chris@19	915 @html
Chris@19	916 <p align=center>
Chris@19	917 r<sub>0</sub>, r<sub>1</sub>, r<sub>2</sub>, ..., r<sub>n/2</sub>, i<sub>(n+1)/2-1</sub>, ..., i<sub>2</sub>, i<sub>1</sub>
Chris@19	918 </p>
Chris@19	919 @end html
Chris@19	920 (Logical @code{N=n}, inverse is @code{FFTW_HC2R}.)
Chris@19	921
Chris@19	922 @item
Chris@19	923 @ctindex FFTW_HC2R
Chris@19	924 @code{FFTW_HC2R} computes the reverse of @code{FFTW_R2HC}, above.
Chris@19	925 (Logical @code{N=n}, inverse is @code{FFTW_R2HC}.)
Chris@19	926
Chris@19	927 @item
Chris@19	928 @ctindex FFTW_DHT
Chris@19	929 @code{FFTW_DHT} computes a discrete Hartley transform.
Chris@19	930 (Logical @code{N=n}, inverse is @code{FFTW_DHT}.)
Chris@19	931 @cindex discrete Hartley transform
Chris@19	932
Chris@19	933 @item
Chris@19	934 @ctindex FFTW_REDFT00
Chris@19	935 @code{FFTW_REDFT00} computes an REDFT00 transform, i.e. a DCT-I.
Chris@19	936 (Logical @code{N=2*(n-1)}, inverse is @code{FFTW_REDFT00}.)
Chris@19	937 @cindex discrete cosine transform
Chris@19	938 @cindex DCT
Chris@19	939
Chris@19	940 @item
Chris@19	941 @ctindex FFTW_REDFT10
Chris@19	942 @code{FFTW_REDFT10} computes an REDFT10 transform, i.e. a DCT-II (sometimes called ``the'' DCT).
Chris@19	943 (Logical @code{N=2*n}, inverse is @code{FFTW_REDFT01}.)
Chris@19	944
Chris@19	945 @item
Chris@19	946 @ctindex FFTW_REDFT01
Chris@19	947 @code{FFTW_REDFT01} computes an REDFT01 transform, i.e. a DCT-III (sometimes called ``the'' IDCT, being the inverse of DCT-II).
Chris@19	948 (Logical @code{N=2*n}, inverse is @code{FFTW_REDFT=10}.)
Chris@19	949 @cindex IDCT
Chris@19	950
Chris@19	951 @item
Chris@19	952 @ctindex FFTW_REDFT11
Chris@19	953 @code{FFTW_REDFT11} computes an REDFT11 transform, i.e. a DCT-IV.
Chris@19	954 (Logical @code{N=2*n}, inverse is @code{FFTW_REDFT11}.)
Chris@19	955
Chris@19	956 @item
Chris@19	957 @ctindex FFTW_RODFT00
Chris@19	958 @code{FFTW_RODFT00} computes an RODFT00 transform, i.e. a DST-I.
Chris@19	959 (Logical @code{N=2*(n+1)}, inverse is @code{FFTW_RODFT00}.)
Chris@19	960 @cindex discrete sine transform
Chris@19	961 @cindex DST
Chris@19	962
Chris@19	963 @item
Chris@19	964 @ctindex FFTW_RODFT10
Chris@19	965 @code{FFTW_RODFT10} computes an RODFT10 transform, i.e. a DST-II.
Chris@19	966 (Logical @code{N=2*n}, inverse is @code{FFTW_RODFT01}.)
Chris@19	967
Chris@19	968 @item
Chris@19	969 @ctindex FFTW_RODFT01
Chris@19	970 @code{FFTW_RODFT01} computes an RODFT01 transform, i.e. a DST-III.
Chris@19	971 (Logical @code{N=2*n}, inverse is @code{FFTW_RODFT=10}.)
Chris@19	972
Chris@19	973 @item
Chris@19	974 @ctindex FFTW_RODFT11
Chris@19	975 @code{FFTW_RODFT11} computes an RODFT11 transform, i.e. a DST-IV.
Chris@19	976 (Logical @code{N=2*n}, inverse is @code{FFTW_RODFT11}.)
Chris@19	977
Chris@19	978 @end itemize
Chris@19	979
Chris@19	980 @c ------------------------------------------------------------
Chris@19	981 @node Advanced Interface, Guru Interface, Basic Interface, FFTW Reference
Chris@19	982 @section Advanced Interface
Chris@19	983 @cindex advanced interface
Chris@19	984
Chris@19	985 FFTW's ``advanced'' interface supplements the basic interface with four
Chris@19	986 new planner routines, providing a new level of flexibility: you can plan
Chris@19	987 a transform of multiple arrays simultaneously, operate on non-contiguous
Chris@19	988 (strided) data, and transform a subset of a larger multi-dimensional
Chris@19	989 array. Other than these additional features, the planner operates in
Chris@19	990 the same fashion as in the basic interface, and the resulting
Chris@19	991 @code{fftw_plan} is used in the same way (@pxref{Using Plans}).
Chris@19	992
Chris@19	993 @menu
Chris@19	994 * Advanced Complex DFTs::
Chris@19	995 * Advanced Real-data DFTs::
Chris@19	996 * Advanced Real-to-real Transforms::
Chris@19	997 @end menu
Chris@19	998
Chris@19	999 @c =========>
Chris@19	1000 @node Advanced Complex DFTs, Advanced Real-data DFTs, Advanced Interface, Advanced Interface
Chris@19	1001 @subsection Advanced Complex DFTs
Chris@19	1002
Chris@19	1003 @example
Chris@19	1004 fftw_plan fftw_plan_many_dft(int rank, const int *n, int howmany,
Chris@19	1005 fftw_complex in, const int inembed,
Chris@19	1006 int istride, int idist,
Chris@19	1007 fftw_complex out, const int onembed,
Chris@19	1008 int ostride, int odist,
Chris@19	1009 int sign, unsigned flags);
Chris@19	1010 @end example
Chris@19	1011 @findex fftw_plan_many_dft
Chris@19	1012
Chris@19	1013 This routine plans multiple multidimensional complex DFTs, and it
Chris@19	1014 extends the @code{fftw_plan_dft} routine (@pxref{Complex DFTs}) to
Chris@19	1015 compute @code{howmany} transforms, each having rank @code{rank} and size
Chris@19	1016 @code{n}. In addition, the transform data need not be contiguous, but
Chris@19	1017 it may be laid out in memory with an arbitrary stride. To account for
Chris@19	1018 these possibilities, @code{fftw_plan_many_dft} adds the new parameters
Chris@19	1019 @code{howmany}, @{@code{i},@code{o}@}@code{nembed},
Chris@19	1020 @{@code{i},@code{o}@}@code{stride}, and
Chris@19	1021 @{@code{i},@code{o}@}@code{dist}. The FFTW basic interface
Chris@19	1022 (@pxref{Complex DFTs}) provides routines specialized for ranks 1, 2,
Chris@19	1023 and@tie{}3, but the advanced interface handles only the general-rank
Chris@19	1024 case.
Chris@19	1025
Chris@19	1026 @code{howmany} is the number of transforms to compute. The resulting
Chris@19	1027 plan computes @code{howmany} transforms, where the input of the
Chris@19	1028 @code{k}-th transform is at location @code{in+k*idist} (in C pointer
Chris@19	1029 arithmetic), and its output is at location @code{out+k*odist}. Plans
Chris@19	1030 obtained in this way can often be faster than calling FFTW multiple
Chris@19	1031 times for the individual transforms. The basic @code{fftw_plan_dft}
Chris@19	1032 interface corresponds to @code{howmany=1} (in which case the @code{dist}
Chris@19	1033 parameters are ignored).
Chris@19	1034 @cindex howmany parameter
Chris@19	1035 @cindex dist
Chris@19	1036
Chris@19	1037
Chris@19	1038 Each of the @code{howmany} transforms has rank @code{rank} and size
Chris@19	1039 @code{n}, as in the basic interface. In addition, the advanced
Chris@19	1040 interface allows the input and output arrays of each transform to be
Chris@19	1041 row-major subarrays of larger rank-@code{rank} arrays, described by
Chris@19	1042 @code{inembed} and @code{onembed} parameters, respectively.
Chris@19	1043 @{@code{i},@code{o}@}@code{nembed} must be arrays of length @code{rank},
Chris@19	1044 and @code{n} should be elementwise less than or equal to
Chris@19	1045 @{@code{i},@code{o}@}@code{nembed}. Passing @code{NULL} for an
Chris@19	1046 @code{nembed} parameter is equivalent to passing @code{n} (i.e. same
Chris@19	1047 physical and logical dimensions, as in the basic interface.)
Chris@19	1048
Chris@19	1049 The @code{stride} parameters indicate that the @code{j}-th element of
Chris@19	1050 the input or output arrays is located at @code{j*istride} or
Chris@19	1051 @code{j*ostride}, respectively. (For a multi-dimensional array,
Chris@19	1052 @code{j} is the ordinary row-major index.) When combined with the
Chris@19	1053 @code{k}-th transform in a @code{howmany} loop, from above, this means
Chris@19	1054 that the (@code{j},@code{k})-th element is at @code{jstride+kdist}.
Chris@19	1055 (The basic @code{fftw_plan_dft} interface corresponds to a stride of 1.)
Chris@19	1056 @cindex stride
Chris@19	1057
Chris@19	1058
Chris@19	1059 For in-place transforms, the input and output @code{stride} and
Chris@19	1060 @code{dist} parameters should be the same; otherwise, the planner may
Chris@19	1061 return @code{NULL}.
Chris@19	1062
Chris@19	1063 Arrays @code{n}, @code{inembed}, and @code{onembed} are not used after
Chris@19	1064 this function returns. You can safely free or reuse them.
Chris@19	1065
Chris@19	1066 @strong{Examples}:
Chris@19	1067 One transform of one 5 by 6 array contiguous in memory:
Chris@19	1068 @example
Chris@19	1069 int rank = 2;
Chris@19	1070 int n[] = @{5, 6@};
Chris@19	1071 int howmany = 1;
Chris@19	1072 int idist = odist = 0; /* unused because howmany = 1 */
Chris@19	1073 int istride = ostride = 1; /* array is contiguous in memory */
Chris@19	1074 int inembed = n, onembed = n;
Chris@19	1075 @end example
Chris@19	1076
Chris@19	1077 Transform of three 5 by 6 arrays, each contiguous in memory,
Chris@19	1078 stored in memory one after another:
Chris@19	1079 @example
Chris@19	1080 int rank = 2;
Chris@19	1081 int n[] = @{5, 6@};
Chris@19	1082 int howmany = 3;
Chris@19	1083 int idist = odist = n[0]n[1]; / = 30, the distance in memory
Chris@19	1084 between the first element
Chris@19	1085 of the first array and the
Chris@19	1086 first element of the second array */
Chris@19	1087 int istride = ostride = 1; /* array is contiguous in memory */
Chris@19	1088 int inembed = n, onembed = n;
Chris@19	1089 @end example
Chris@19	1090
Chris@19	1091 Transform each column of a 2d array with 10 rows and 3 columns:
Chris@19	1092 @example
Chris@19	1093 int rank = 1; /* not 2: we are computing 1d transforms */
Chris@19	1094 int n[] = @{10@}; /* 1d transforms of length 10 */
Chris@19	1095 int howmany = 3;
Chris@19	1096 int idist = odist = 1;
Chris@19	1097 int istride = ostride = 3; /* distance between two elements in
Chris@19	1098 the same column */
Chris@19	1099 int inembed = n, onembed = n;
Chris@19	1100 @end example
Chris@19	1101
Chris@19	1102 @c =========>
Chris@19	1103 @node Advanced Real-data DFTs, Advanced Real-to-real Transforms, Advanced Complex DFTs, Advanced Interface
Chris@19	1104 @subsection Advanced Real-data DFTs
Chris@19	1105
Chris@19	1106 @example
Chris@19	1107 fftw_plan fftw_plan_many_dft_r2c(int rank, const int *n, int howmany,
Chris@19	1108 double in, const int inembed,
Chris@19	1109 int istride, int idist,
Chris@19	1110 fftw_complex out, const int onembed,
Chris@19	1111 int ostride, int odist,
Chris@19	1112 unsigned flags);
Chris@19	1113 fftw_plan fftw_plan_many_dft_c2r(int rank, const int *n, int howmany,
Chris@19	1114 fftw_complex in, const int inembed,
Chris@19	1115 int istride, int idist,
Chris@19	1116 double out, const int onembed,
Chris@19	1117 int ostride, int odist,
Chris@19	1118 unsigned flags);
Chris@19	1119 @end example
Chris@19	1120 @findex fftw_plan_many_dft_r2c
Chris@19	1121 @findex fftw_plan_many_dft_c2r
Chris@19	1122
Chris@19	1123 Like @code{fftw_plan_many_dft}, these two functions add @code{howmany},
Chris@19	1124 @code{nembed}, @code{stride}, and @code{dist} parameters to the
Chris@19	1125 @code{fftw_plan_dft_r2c} and @code{fftw_plan_dft_c2r} functions, but
Chris@19	1126 otherwise behave the same as the basic interface.
Chris@19	1127
Chris@19	1128 The interpretation of @code{howmany}, @code{stride}, and @code{dist} are
Chris@19	1129 the same as for @code{fftw_plan_many_dft}, above. Note that the
Chris@19	1130 @code{stride} and @code{dist} for the real array are in units of
Chris@19	1131 @code{double}, and for the complex array are in units of
Chris@19	1132 @code{fftw_complex}.
Chris@19	1133
Chris@19	1134 If an @code{nembed} parameter is @code{NULL}, it is interpreted as what
Chris@19	1135 it would be in the basic interface, as described in @ref{Real-data DFT
Chris@19	1136 Array Format}. That is, for the complex array the size is assumed to be
Chris@19	1137 the same as @code{n}, but with the last dimension cut roughly in half.
Chris@19	1138 For the real array, the size is assumed to be @code{n} if the transform
Chris@19	1139 is out-of-place, or @code{n} with the last dimension ``padded'' if the
Chris@19	1140 transform is in-place.
Chris@19	1141
Chris@19	1142 If an @code{nembed} parameter is non-@code{NULL}, it is interpreted as
Chris@19	1143 the physical size of the corresponding array, in row-major order, just
Chris@19	1144 as for @code{fftw_plan_many_dft}. In this case, each dimension of
Chris@19	1145 @code{nembed} should be @code{>=} what it would be in the basic
Chris@19	1146 interface (e.g. the halved or padded @code{n}).
Chris@19	1147
Chris@19	1148 Arrays @code{n}, @code{inembed}, and @code{onembed} are not used after
Chris@19	1149 this function returns. You can safely free or reuse them.
Chris@19	1150
Chris@19	1151 @c =========>
Chris@19	1152 @node Advanced Real-to-real Transforms, , Advanced Real-data DFTs, Advanced Interface
Chris@19	1153 @subsection Advanced Real-to-real Transforms
Chris@19	1154
Chris@19	1155 @example
Chris@19	1156 fftw_plan fftw_plan_many_r2r(int rank, const int *n, int howmany,
Chris@19	1157 double in, const int inembed,
Chris@19	1158 int istride, int idist,
Chris@19	1159 double out, const int onembed,
Chris@19	1160 int ostride, int odist,
Chris@19	1161 const fftw_r2r_kind *kind, unsigned flags);
Chris@19	1162 @end example
Chris@19	1163 @findex fftw_plan_many_r2r
Chris@19	1164
Chris@19	1165 Like @code{fftw_plan_many_dft}, this functions adds @code{howmany},
Chris@19	1166 @code{nembed}, @code{stride}, and @code{dist} parameters to the
Chris@19	1167 @code{fftw_plan_r2r} function, but otherwise behave the same as the
Chris@19	1168 basic interface. The interpretation of those additional parameters are
Chris@19	1169 the same as for @code{fftw_plan_many_dft}. (Of course, the
Chris@19	1170 @code{stride} and @code{dist} parameters are now in units of
Chris@19	1171 @code{double}, not @code{fftw_complex}.)
Chris@19	1172
Chris@19	1173 Arrays @code{n}, @code{inembed}, @code{onembed}, and @code{kind} are not
Chris@19	1174 used after this function returns. You can safely free or reuse them.
Chris@19	1175
Chris@19	1176 @c ------------------------------------------------------------
Chris@19	1177 @node Guru Interface, New-array Execute Functions, Advanced Interface, FFTW Reference
Chris@19	1178 @section Guru Interface
Chris@19	1179 @cindex guru interface
Chris@19	1180
Chris@19	1181 The ``guru'' interface to FFTW is intended to expose as much as possible
Chris@19	1182 of the flexibility in the underlying FFTW architecture. It allows one
Chris@19	1183 to compute multi-dimensional ``vectors'' (loops) of multi-dimensional
Chris@19	1184 transforms, where each vector/transform dimension has an independent
Chris@19	1185 size and stride.
Chris@19	1186 @cindex vector
Chris@19	1187 One can also use more general complex-number formats, e.g. separate real
Chris@19	1188 and imaginary arrays.
Chris@19	1189
Chris@19	1190 For those users who require the flexibility of the guru interface, it is
Chris@19	1191 important that they pay special attention to the documentation lest they
Chris@19	1192 shoot themselves in the foot.
Chris@19	1193
Chris@19	1194 @menu
Chris@19	1195 * Interleaved and split arrays::
Chris@19	1196 * Guru vector and transform sizes::
Chris@19	1197 * Guru Complex DFTs::
Chris@19	1198 * Guru Real-data DFTs::
Chris@19	1199 * Guru Real-to-real Transforms::
Chris@19	1200 * 64-bit Guru Interface::
Chris@19	1201 @end menu
Chris@19	1202
Chris@19	1203 @c =========>
Chris@19	1204 @node Interleaved and split arrays, Guru vector and transform sizes, Guru Interface, Guru Interface
Chris@19	1205 @subsection Interleaved and split arrays
Chris@19	1206
Chris@19	1207 The guru interface supports two representations of complex numbers,
Chris@19	1208 which we call the interleaved and the split format.
Chris@19	1209
Chris@19	1210 The @dfn{interleaved} format is the same one used by the basic and
Chris@19	1211 advanced interfaces, and it is documented in @ref{Complex numbers}.
Chris@19	1212 In the interleaved format, you provide pointers to the real part of a
Chris@19	1213 complex number, and the imaginary part understood to be stored in the
Chris@19	1214 next memory location.
Chris@19	1215 @cindex interleaved format
Chris@19	1216
Chris@19	1217
Chris@19	1218 The @dfn{split} format allows separate pointers to the real and
Chris@19	1219 imaginary parts of a complex array.
Chris@19	1220 @cindex split format
Chris@19	1221
Chris@19	1222
Chris@19	1223 Technically, the interleaved format is redundant, because you can
Chris@19	1224 always express an interleaved array in terms of a split array with
Chris@19	1225 appropriate pointers and strides. On the other hand, the interleaved
Chris@19	1226 format is simpler to use, and it is common in practice. Hence, FFTW
Chris@19	1227 supports it as a special case.
Chris@19	1228
Chris@19	1229 @c =========>
Chris@19	1230 @node Guru vector and transform sizes, Guru Complex DFTs, Interleaved and split arrays, Guru Interface
Chris@19	1231 @subsection Guru vector and transform sizes
Chris@19	1232
Chris@19	1233 The guru interface introduces one basic new data structure,
Chris@19	1234 @code{fftw_iodim}, that is used to specify sizes and strides for
Chris@19	1235 multi-dimensional transforms and vectors:
Chris@19	1236
Chris@19	1237 @example
Chris@19	1238 typedef struct @{
Chris@19	1239 int n;
Chris@19	1240 int is;
Chris@19	1241 int os;
Chris@19	1242 @} fftw_iodim;
Chris@19	1243 @end example
Chris@19	1244 @tindex fftw_iodim
Chris@19	1245
Chris@19	1246 Here, @code{n} is the size of the dimension, and @code{is} and @code{os}
Chris@19	1247 are the strides of that dimension for the input and output arrays. (The
Chris@19	1248 stride is the separation of consecutive elements along this dimension.)
Chris@19	1249
Chris@19	1250 The meaning of the stride parameter depends on the type of the array
Chris@19	1251 that the stride refers to. @emph{If the array is interleaved complex,
Chris@19	1252 strides are expressed in units of complex numbers
Chris@19	1253 (@code{fftw_complex}). If the array is split complex or real, strides
Chris@19	1254 are expressed in units of real numbers (@code{double}).} This
Chris@19	1255 convention is consistent with the usual pointer arithmetic in the C
Chris@19	1256 language. An interleaved array is denoted by a pointer @code{p} to
Chris@19	1257 @code{fftw_complex}, so that @code{p+1} points to the next complex
Chris@19	1258 number. Split arrays are denoted by pointers to @code{double}, in
Chris@19	1259 which case pointer arithmetic operates in units of
Chris@19	1260 @code{sizeof(double)}.
Chris@19	1261 @cindex stride
Chris@19	1262
Chris@19	1263
Chris@19	1264 The guru planner interfaces all take a (@code{rank}, @code{dims[rank]})
Chris@19	1265 pair describing the transform size, and a (@code{howmany_rank},
Chris@19	1266 @code{howmany_dims[howmany_rank]}) pair describing the ``vector'' size (a
Chris@19	1267 multi-dimensional loop of transforms to perform), where @code{dims} and
Chris@19	1268 @code{howmany_dims} are arrays of @code{fftw_iodim}.
Chris@19	1269
Chris@19	1270 For example, the @code{howmany} parameter in the advanced complex-DFT
Chris@19	1271 interface corresponds to @code{howmany_rank} = 1,
Chris@19	1272 @code{howmany_dims[0].n} = @code{howmany}, @code{howmany_dims[0].is} =
Chris@19	1273 @code{idist}, and @code{howmany_dims[0].os} = @code{odist}.
Chris@19	1274 @cindex howmany loop
Chris@19	1275 @cindex dist
Chris@19	1276 (To compute a single transform, you can just use @code{howmany_rank} = 0.)
Chris@19	1277
Chris@19	1278
Chris@19	1279 A row-major multidimensional array with dimensions @code{n[rank]}
Chris@19	1280 (@pxref{Row-major Format}) corresponds to @code{dims[i].n} =
Chris@19	1281 @code{n[i]} and the recurrence @code{dims[i].is} = @code{n[i+1] *
Chris@19	1282 dims[i+1].is} (similarly for @code{os}). The stride of the last
Chris@19	1283 (@code{i=rank-1}) dimension is the overall stride of the array.
Chris@19	1284 e.g. to be equivalent to the advanced complex-DFT interface, you would
Chris@19	1285 have @code{dims[rank-1].is} = @code{istride} and
Chris@19	1286 @code{dims[rank-1].os} = @code{ostride}.
Chris@19	1287 @cindex row-major
Chris@19	1288
Chris@19	1289
Chris@19	1290 In general, we only guarantee FFTW to return a non-@code{NULL} plan if
Chris@19	1291 the vector and transform dimensions correspond to a set of distinct
Chris@19	1292 indices, and for in-place transforms the input/output strides should
Chris@19	1293 be the same.
Chris@19	1294
Chris@19	1295 @c =========>
Chris@19	1296 @node Guru Complex DFTs, Guru Real-data DFTs, Guru vector and transform sizes, Guru Interface
Chris@19	1297 @subsection Guru Complex DFTs
Chris@19	1298
Chris@19	1299 @example
Chris@19	1300 fftw_plan fftw_plan_guru_dft(
Chris@19	1301 int rank, const fftw_iodim *dims,
Chris@19	1302 int howmany_rank, const fftw_iodim *howmany_dims,
Chris@19	1303 fftw_complex in, fftw_complex out,
Chris@19	1304 int sign, unsigned flags);
Chris@19	1305
Chris@19	1306 fftw_plan fftw_plan_guru_split_dft(
Chris@19	1307 int rank, const fftw_iodim *dims,
Chris@19	1308 int howmany_rank, const fftw_iodim *howmany_dims,
Chris@19	1309 double ri, double ii, double ro, double io,
Chris@19	1310 unsigned flags);
Chris@19	1311 @end example
Chris@19	1312 @findex fftw_plan_guru_dft
Chris@19	1313 @findex fftw_plan_guru_split_dft
Chris@19	1314
Chris@19	1315 These two functions plan a complex-data, multi-dimensional DFT
Chris@19	1316 for the interleaved and split format, respectively.
Chris@19	1317 Transform dimensions are given by (@code{rank}, @code{dims}) over a
Chris@19	1318 multi-dimensional vector (loop) of dimensions (@code{howmany_rank},
Chris@19	1319 @code{howmany_dims}). @code{dims} and @code{howmany_dims} should point
Chris@19	1320 to @code{fftw_iodim} arrays of length @code{rank} and
Chris@19	1321 @code{howmany_rank}, respectively.
Chris@19	1322
Chris@19	1323 @cindex flags
Chris@19	1324 @code{flags} is a bitwise OR (@samp{\|}) of zero or more planner flags,
Chris@19	1325 as defined in @ref{Planner Flags}.
Chris@19	1326
Chris@19	1327 In the @code{fftw_plan_guru_dft} function, the pointers @code{in} and
Chris@19	1328 @code{out} point to the interleaved input and output arrays,
Chris@19	1329 respectively. The sign can be either @math{-1} (=
Chris@19	1330 @code{FFTW_FORWARD}) or @math{+1} (= @code{FFTW_BACKWARD}). If the
Chris@19	1331 pointers are equal, the transform is in-place.
Chris@19	1332
Chris@19	1333 In the @code{fftw_plan_guru_split_dft} function,
Chris@19	1334 @code{ri} and @code{ii} point to the real and imaginary input arrays,
Chris@19	1335 and @code{ro} and @code{io} point to the real and imaginary output
Chris@19	1336 arrays. The input and output pointers may be the same, indicating an
Chris@19	1337 in-place transform. For example, for @code{fftw_complex} pointers
Chris@19	1338 @code{in} and @code{out}, the corresponding parameters are:
Chris@19	1339
Chris@19	1340 @example
Chris@19	1341 ri = (double *) in;
Chris@19	1342 ii = (double *) in + 1;
Chris@19	1343 ro = (double *) out;
Chris@19	1344 io = (double *) out + 1;
Chris@19	1345 @end example
Chris@19	1346
Chris@19	1347 Because @code{fftw_plan_guru_split_dft} accepts split arrays, strides
Chris@19	1348 are expressed in units of @code{double}. For a contiguous
Chris@19	1349 @code{fftw_complex} array, the overall stride of the transform should
Chris@19	1350 be 2, the distance between consecutive real parts or between
Chris@19	1351 consecutive imaginary parts; see @ref{Guru vector and transform
Chris@19	1352 sizes}. Note that the dimension strides are applied equally to the
Chris@19	1353 real and imaginary parts; real and imaginary arrays with different
Chris@19	1354 strides are not supported.
Chris@19	1355
Chris@19	1356 There is no @code{sign} parameter in @code{fftw_plan_guru_split_dft}.
Chris@19	1357 This function always plans for an @code{FFTW_FORWARD} transform. To
Chris@19	1358 plan for an @code{FFTW_BACKWARD} transform, you can exploit the
Chris@19	1359 identity that the backwards DFT is equal to the forwards DFT with the
Chris@19	1360 real and imaginary parts swapped. For example, in the case of the
Chris@19	1361 @code{fftw_complex} arrays above, the @code{FFTW_BACKWARD} transform
Chris@19	1362 is computed by the parameters:
Chris@19	1363
Chris@19	1364 @example
Chris@19	1365 ri = (double *) in + 1;
Chris@19	1366 ii = (double *) in;
Chris@19	1367 ro = (double *) out + 1;
Chris@19	1368 io = (double *) out;
Chris@19	1369 @end example
Chris@19	1370
Chris@19	1371 @c =========>
Chris@19	1372 @node Guru Real-data DFTs, Guru Real-to-real Transforms, Guru Complex DFTs, Guru Interface
Chris@19	1373 @subsection Guru Real-data DFTs
Chris@19	1374
Chris@19	1375 @example
Chris@19	1376 fftw_plan fftw_plan_guru_dft_r2c(
Chris@19	1377 int rank, const fftw_iodim *dims,
Chris@19	1378 int howmany_rank, const fftw_iodim *howmany_dims,
Chris@19	1379 double in, fftw_complex out,
Chris@19	1380 unsigned flags);
Chris@19	1381
Chris@19	1382 fftw_plan fftw_plan_guru_split_dft_r2c(
Chris@19	1383 int rank, const fftw_iodim *dims,
Chris@19	1384 int howmany_rank, const fftw_iodim *howmany_dims,
Chris@19	1385 double in, double ro, double *io,
Chris@19	1386 unsigned flags);
Chris@19	1387
Chris@19	1388 fftw_plan fftw_plan_guru_dft_c2r(
Chris@19	1389 int rank, const fftw_iodim *dims,
Chris@19	1390 int howmany_rank, const fftw_iodim *howmany_dims,
Chris@19	1391 fftw_complex in, double out,
Chris@19	1392 unsigned flags);
Chris@19	1393
Chris@19	1394 fftw_plan fftw_plan_guru_split_dft_c2r(
Chris@19	1395 int rank, const fftw_iodim *dims,
Chris@19	1396 int howmany_rank, const fftw_iodim *howmany_dims,
Chris@19	1397 double ri, double ii, double *out,
Chris@19	1398 unsigned flags);
Chris@19	1399 @end example
Chris@19	1400 @findex fftw_plan_guru_dft_r2c
Chris@19	1401 @findex fftw_plan_guru_split_dft_r2c
Chris@19	1402 @findex fftw_plan_guru_dft_c2r
Chris@19	1403 @findex fftw_plan_guru_split_dft_c2r
Chris@19	1404
Chris@19	1405 Plan a real-input (r2c) or real-output (c2r), multi-dimensional DFT with
Chris@19	1406 transform dimensions given by (@code{rank}, @code{dims}) over a
Chris@19	1407 multi-dimensional vector (loop) of dimensions (@code{howmany_rank},
Chris@19	1408 @code{howmany_dims}). @code{dims} and @code{howmany_dims} should point
Chris@19	1409 to @code{fftw_iodim} arrays of length @code{rank} and
Chris@19	1410 @code{howmany_rank}, respectively. As for the basic and advanced
Chris@19	1411 interfaces, an r2c transform is @code{FFTW_FORWARD} and a c2r transform
Chris@19	1412 is @code{FFTW_BACKWARD}.
Chris@19	1413
Chris@19	1414 The @emph{last} dimension of @code{dims} is interpreted specially:
Chris@19	1415 that dimension of the real array has size @code{dims[rank-1].n}, but
Chris@19	1416 that dimension of the complex array has size @code{dims[rank-1].n/2+1}
Chris@19	1417 (division rounded down). The strides, on the other hand, are taken to
Chris@19	1418 be exactly as specified. It is up to the user to specify the strides
Chris@19	1419 appropriately for the peculiar dimensions of the data, and we do not
Chris@19	1420 guarantee that the planner will succeed (return non-@code{NULL}) for
Chris@19	1421 any dimensions other than those described in @ref{Real-data DFT Array
Chris@19	1422 Format} and generalized in @ref{Advanced Real-data DFTs}. (That is,
Chris@19	1423 for an in-place transform, each individual dimension should be able to
Chris@19	1424 operate in place.)
Chris@19	1425 @cindex in-place
Chris@19	1426
Chris@19	1427
Chris@19	1428 @code{in} and @code{out} point to the input and output arrays for r2c
Chris@19	1429 and c2r transforms, respectively. For split arrays, @code{ri} and
Chris@19	1430 @code{ii} point to the real and imaginary input arrays for a c2r
Chris@19	1431 transform, and @code{ro} and @code{io} point to the real and imaginary
Chris@19	1432 output arrays for an r2c transform. @code{in} and @code{ro} or
Chris@19	1433 @code{ri} and @code{out} may be the same, indicating an in-place
Chris@19	1434 transform. (In-place transforms where @code{in} and @code{io} or
Chris@19	1435 @code{ii} and @code{out} are the same are not currently supported.)
Chris@19	1436
Chris@19	1437 @cindex flags
Chris@19	1438 @code{flags} is a bitwise OR (@samp{\|}) of zero or more planner flags,
Chris@19	1439 as defined in @ref{Planner Flags}.
Chris@19	1440
Chris@19	1441 In-place transforms of rank greater than 1 are currently only
Chris@19	1442 supported for interleaved arrays. For split arrays, the planner will
Chris@19	1443 return @code{NULL}.
Chris@19	1444 @cindex in-place
Chris@19	1445
Chris@19	1446 @c =========>
Chris@19	1447 @node Guru Real-to-real Transforms, 64-bit Guru Interface, Guru Real-data DFTs, Guru Interface
Chris@19	1448 @subsection Guru Real-to-real Transforms
Chris@19	1449
Chris@19	1450 @example
Chris@19	1451 fftw_plan fftw_plan_guru_r2r(int rank, const fftw_iodim *dims,
Chris@19	1452 int howmany_rank,
Chris@19	1453 const fftw_iodim *howmany_dims,
Chris@19	1454 double in, double out,
Chris@19	1455 const fftw_r2r_kind *kind,
Chris@19	1456 unsigned flags);
Chris@19	1457 @end example
Chris@19	1458 @findex fftw_plan_guru_r2r
Chris@19	1459
Chris@19	1460 Plan a real-to-real (r2r) multi-dimensional @code{FFTW_FORWARD}
Chris@19	1461 transform with transform dimensions given by (@code{rank}, @code{dims})
Chris@19	1462 over a multi-dimensional vector (loop) of dimensions
Chris@19	1463 (@code{howmany_rank}, @code{howmany_dims}). @code{dims} and
Chris@19	1464 @code{howmany_dims} should point to @code{fftw_iodim} arrays of length
Chris@19	1465 @code{rank} and @code{howmany_rank}, respectively.
Chris@19	1466
Chris@19	1467 The transform kind of each dimension is given by the @code{kind}
Chris@19	1468 parameter, which should point to an array of length @code{rank}. Valid
Chris@19	1469 @code{fftw_r2r_kind} constants are given in @ref{Real-to-Real Transform
Chris@19	1470 Kinds}.
Chris@19	1471
Chris@19	1472 @code{in} and @code{out} point to the real input and output arrays; they
Chris@19	1473 may be the same, indicating an in-place transform.
Chris@19	1474
Chris@19	1475 @cindex flags
Chris@19	1476 @code{flags} is a bitwise OR (@samp{\|}) of zero or more planner flags,
Chris@19	1477 as defined in @ref{Planner Flags}.
Chris@19	1478
Chris@19	1479 @c =========>
Chris@19	1480 @node 64-bit Guru Interface, , Guru Real-to-real Transforms, Guru Interface
Chris@19	1481 @subsection 64-bit Guru Interface
Chris@19	1482 @cindex 64-bit architecture
Chris@19	1483
Chris@19	1484 When compiled in 64-bit mode on a 64-bit architecture (where addresses
Chris@19	1485 are 64 bits wide), FFTW uses 64-bit quantities internally for all
Chris@19	1486 transform sizes, strides, and so on---you don't have to do anything
Chris@19	1487 special to exploit this. However, in the ordinary FFTW interfaces,
Chris@19	1488 you specify the transform size by an @code{int} quantity, which is
Chris@19	1489 normally only 32 bits wide. This means that, even though FFTW is
Chris@19	1490 using 64-bit sizes internally, you cannot specify a single transform
Chris@19	1491 dimension larger than
Chris@19	1492 @ifinfo
Chris@19	1493 2^31-1
Chris@19	1494 @end ifinfo
Chris@19	1495 @html
Chris@19	1496 2<sup><small>31</small></sup>−1
Chris@19	1497 @end html
Chris@19	1498 @tex
Chris@19	1499 $2^31-1$
Chris@19	1500 @end tex
Chris@19	1501 numbers.
Chris@19	1502
Chris@19	1503 We expect that few users will require transforms larger than this, but,
Chris@19	1504 for those who do, we provide a 64-bit version of the guru interface in
Chris@19	1505 which all sizes are specified as integers of type @code{ptrdiff_t}
Chris@19	1506 instead of @code{int}. (@code{ptrdiff_t} is a signed integer type
Chris@19	1507 defined by the C standard to be wide enough to represent address
Chris@19	1508 differences, and thus must be at least 64 bits wide on a 64-bit
Chris@19	1509 machine.) We stress that there is @emph{no performance advantage} to
Chris@19	1510 using this interface---the same internal FFTW code is employed
Chris@19	1511 regardless---and it is only necessary if you want to specify very
Chris@19	1512 large transform sizes.
Chris@19	1513 @tindex ptrdiff_t
Chris@19	1514
Chris@19	1515
Chris@19	1516 In particular, the 64-bit guru interface is a set of planner routines
Chris@19	1517 that are exactly the same as the guru planner routines, except that
Chris@19	1518 they are named with @samp{guru64} instead of @samp{guru} and they take
Chris@19	1519 arguments of type @code{fftw_iodim64} instead of @code{fftw_iodim}.
Chris@19	1520 For example, instead of @code{fftw_plan_guru_dft}, we have
Chris@19	1521 @code{fftw_plan_guru64_dft}.
Chris@19	1522
Chris@19	1523 @example
Chris@19	1524 fftw_plan fftw_plan_guru64_dft(
Chris@19	1525 int rank, const fftw_iodim64 *dims,
Chris@19	1526 int howmany_rank, const fftw_iodim64 *howmany_dims,
Chris@19	1527 fftw_complex in, fftw_complex out,
Chris@19	1528 int sign, unsigned flags);
Chris@19	1529 @end example
Chris@19	1530 @findex fftw_plan_guru64_dft
Chris@19	1531
Chris@19	1532 The @code{fftw_iodim64} type is similar to @code{fftw_iodim}, with the
Chris@19	1533 same interpretation, except that it uses type @code{ptrdiff_t} instead
Chris@19	1534 of type @code{int}.
Chris@19	1535
Chris@19	1536 @example
Chris@19	1537 typedef struct @{
Chris@19	1538 ptrdiff_t n;
Chris@19	1539 ptrdiff_t is;
Chris@19	1540 ptrdiff_t os;
Chris@19	1541 @} fftw_iodim64;
Chris@19	1542 @end example
Chris@19	1543 @tindex fftw_iodim64
Chris@19	1544
Chris@19	1545 Every other @samp{fftw_plan_guru} function also has a
Chris@19	1546 @samp{fftw_plan_guru64} equivalent, but we do not repeat their
Chris@19	1547 documentation here since they are identical to the 32-bit versions
Chris@19	1548 except as noted above.
Chris@19	1549
Chris@19	1550 @c -----------------------------------------------------------
Chris@19	1551 @node New-array Execute Functions, Wisdom, Guru Interface, FFTW Reference
Chris@19	1552 @section New-array Execute Functions
Chris@19	1553 @cindex execute
Chris@19	1554 @cindex new-array execution
Chris@19	1555
Chris@19	1556 Normally, one executes a plan for the arrays with which the plan was
Chris@19	1557 created, by calling @code{fftw_execute(plan)} as described in @ref{Using
Chris@19	1558 Plans}.
Chris@19	1559 @findex fftw_execute
Chris@19	1560 However, it is possible for sophisticated users to apply a given plan
Chris@19	1561 to a @emph{different} array using the ``new-array execute'' functions
Chris@19	1562 detailed below, provided that the following conditions are met:
Chris@19	1563
Chris@19	1564 @itemize @bullet
Chris@19	1565
Chris@19	1566 @item
Chris@19	1567 The array size, strides, etcetera are the same (since those are set by
Chris@19	1568 the plan).
Chris@19	1569
Chris@19	1570 @item
Chris@19	1571 The input and output arrays are the same (in-place) or different
Chris@19	1572 (out-of-place) if the plan was originally created to be in-place or
Chris@19	1573 out-of-place, respectively.
Chris@19	1574
Chris@19	1575 @item
Chris@19	1576 For split arrays, the separations between the real and imaginary
Chris@19	1577 parts, @code{ii-ri} and @code{io-ro}, are the same as they were for
Chris@19	1578 the input and output arrays when the plan was created. (This
Chris@19	1579 condition is automatically satisfied for interleaved arrays.)
Chris@19	1580
Chris@19	1581 @item
Chris@19	1582 The @dfn{alignment} of the new input/output arrays is the same as that
Chris@19	1583 of the input/output arrays when the plan was created, unless the plan
Chris@19	1584 was created with the @code{FFTW_UNALIGNED} flag.
Chris@19	1585 @ctindex FFTW_UNALIGNED
Chris@19	1586 Here, the alignment is a platform-dependent quantity (for example, it is
Chris@19	1587 the address modulo 16 if SSE SIMD instructions are used, but the address
Chris@19	1588 modulo 4 for non-SIMD single-precision FFTW on the same machine). In
Chris@19	1589 general, only arrays allocated with @code{fftw_malloc} are guaranteed to
Chris@19	1590 be equally aligned (@pxref{SIMD alignment and fftw_malloc}).
Chris@19	1591
Chris@19	1592 @end itemize
Chris@19	1593
Chris@19	1594 @cindex alignment
Chris@19	1595 The alignment issue is especially critical, because if you don't use
Chris@19	1596 @code{fftw_malloc} then you may have little control over the alignment
Chris@19	1597 of arrays in memory. For example, neither the C++ @code{new} function
Chris@19	1598 nor the Fortran @code{allocate} statement provide strong enough
Chris@19	1599 guarantees about data alignment. If you don't use @code{fftw_malloc},
Chris@19	1600 therefore, you probably have to use @code{FFTW_UNALIGNED} (which
Chris@19	1601 disables most SIMD support). If possible, it is probably better for
Chris@19	1602 you to simply create multiple plans (creating a new plan is quick once
Chris@19	1603 one exists for a given size), or better yet re-use the same array for
Chris@19	1604 your transforms.
Chris@19	1605
Chris@19	1606 @findex fftw_alignment_of
Chris@19	1607 For rare circumstances in which you cannot control the alignment of
Chris@19	1608 allocated memory, but wish to determine where a given array is
Chris@19	1609 aligned like the original array for which a plan was created, you can
Chris@19	1610 use the @code{fftw_alignment_of} function:
Chris@19	1611 @example
Chris@19	1612 int fftw_alignment_of(double *p);
Chris@19	1613 @end example
Chris@19	1614 Two arrays have equivalent alignment (for the purposes of applying a
Chris@19	1615 plan) if and only if @code{fftw_alignment_of} returns the same value
Chris@19	1616 for the corresponding pointers to their data (typecast to @code{double*}
Chris@19	1617 if necessary).
Chris@19	1618
Chris@19	1619 If you are tempted to use the new-array execute interface because you
Chris@19	1620 want to transform a known bunch of arrays of the same size, you should
Chris@19	1621 probably go use the advanced interface instead (@pxref{Advanced
Chris@19	1622 Interface})).
Chris@19	1623
Chris@19	1624 The new-array execute functions are:
Chris@19	1625
Chris@19	1626 @example
Chris@19	1627 void fftw_execute_dft(
Chris@19	1628 const fftw_plan p,
Chris@19	1629 fftw_complex in, fftw_complex out);
Chris@19	1630
Chris@19	1631 void fftw_execute_split_dft(
Chris@19	1632 const fftw_plan p,
Chris@19	1633 double ri, double ii, double ro, double io);
Chris@19	1634
Chris@19	1635 void fftw_execute_dft_r2c(
Chris@19	1636 const fftw_plan p,
Chris@19	1637 double in, fftw_complex out);
Chris@19	1638
Chris@19	1639 void fftw_execute_split_dft_r2c(
Chris@19	1640 const fftw_plan p,
Chris@19	1641 double in, double ro, double *io);
Chris@19	1642
Chris@19	1643 void fftw_execute_dft_c2r(
Chris@19	1644 const fftw_plan p,
Chris@19	1645 fftw_complex in, double out);
Chris@19	1646
Chris@19	1647 void fftw_execute_split_dft_c2r(
Chris@19	1648 const fftw_plan p,
Chris@19	1649 double ri, double ii, double *out);
Chris@19	1650
Chris@19	1651 void fftw_execute_r2r(
Chris@19	1652 const fftw_plan p,
Chris@19	1653 double in, double out);
Chris@19	1654 @end example
Chris@19	1655 @findex fftw_execute_dft
Chris@19	1656 @findex fftw_execute_split_dft
Chris@19	1657 @findex fftw_execute_dft_r2c
Chris@19	1658 @findex fftw_execute_split_dft_r2c
Chris@19	1659 @findex fftw_execute_dft_c2r
Chris@19	1660 @findex fftw_execute_split_dft_c2r
Chris@19	1661 @findex fftw_execute_r2r
Chris@19	1662
Chris@19	1663 These execute the @code{plan} to compute the corresponding transform on
Chris@19	1664 the input/output arrays specified by the subsequent arguments. The
Chris@19	1665 input/output array arguments have the same meanings as the ones passed
Chris@19	1666 to the guru planner routines in the preceding sections. The @code{plan}
Chris@19	1667 is not modified, and these routines can be called as many times as
Chris@19	1668 desired, or intermixed with calls to the ordinary @code{fftw_execute}.
Chris@19	1669
Chris@19	1670 The @code{plan} @emph{must} have been created for the transform type
Chris@19	1671 corresponding to the execute function, e.g. it must be a complex-DFT
Chris@19	1672 plan for @code{fftw_execute_dft}. Any of the planner routines for that
Chris@19	1673 transform type, from the basic to the guru interface, could have been
Chris@19	1674 used to create the plan, however.
Chris@19	1675
Chris@19	1676 @c ------------------------------------------------------------
Chris@19	1677 @node Wisdom, What FFTW Really Computes, New-array Execute Functions, FFTW Reference
Chris@19	1678 @section Wisdom
Chris@19	1679 @cindex wisdom
Chris@19	1680 @cindex saving plans to disk
Chris@19	1681
Chris@19	1682 This section documents the FFTW mechanism for saving and restoring
Chris@19	1683 plans from disk. This mechanism is called @dfn{wisdom}.
Chris@19	1684
Chris@19	1685 @menu
Chris@19	1686 * Wisdom Export::
Chris@19	1687 * Wisdom Import::
Chris@19	1688 * Forgetting Wisdom::
Chris@19	1689 * Wisdom Utilities::
Chris@19	1690 @end menu
Chris@19	1691
Chris@19	1692 @c =========>
Chris@19	1693 @node Wisdom Export, Wisdom Import, Wisdom, Wisdom
Chris@19	1694 @subsection Wisdom Export
Chris@19	1695
Chris@19	1696 @example
Chris@19	1697 int fftw_export_wisdom_to_filename(const char *filename);
Chris@19	1698 void fftw_export_wisdom_to_file(FILE *output_file);
Chris@19	1699 char *fftw_export_wisdom_to_string(void);
Chris@19	1700 void fftw_export_wisdom(void (write_char)(char c, void ), void *data);
Chris@19	1701 @end example
Chris@19	1702 @findex fftw_export_wisdom
Chris@19	1703 @findex fftw_export_wisdom_to_filename
Chris@19	1704 @findex fftw_export_wisdom_to_file
Chris@19	1705 @findex fftw_export_wisdom_to_string
Chris@19	1706
Chris@19	1707 These functions allow you to export all currently accumulated wisdom
Chris@19	1708 in a form from which it can be later imported and restored, even
Chris@19	1709 during a separate run of the program. (@xref{Words of Wisdom-Saving
Chris@19	1710 Plans}.) The current store of wisdom is not affected by calling any
Chris@19	1711 of these routines.
Chris@19	1712
Chris@19	1713 @code{fftw_export_wisdom} exports the wisdom to any output
Chris@19	1714 medium, as specified by the callback function
Chris@19	1715 @code{write_char}. @code{write_char} is a @code{putc}-like function that
Chris@19	1716 writes the character @code{c} to some output; its second parameter is
Chris@19	1717 the @code{data} pointer passed to @code{fftw_export_wisdom}. For
Chris@19	1718 convenience, the following three ``wrapper'' routines are provided:
Chris@19	1719
Chris@19	1720 @code{fftw_export_wisdom_to_filename} writes wisdom to a file named
Chris@19	1721 @code{filename} (which is created or overwritten), returning @code{1}
Chris@19	1722 on success and @code{0} on failure. A lower-level function, which
Chris@19	1723 requires you to open and close the file yourself (e.g. if you want to
Chris@19	1724 write wisdom to a portion of a larger file) is
Chris@19	1725 @code{fftw_export_wisdom_to_file}. This writes the wisdom to the
Chris@19	1726 current position in @code{output_file}, which should be open with
Chris@19	1727 write permission; upon exit, the file remains open and is positioned
Chris@19	1728 at the end of the wisdom data.
Chris@19	1729
Chris@19	1730 @code{fftw_export_wisdom_to_string} returns a pointer to a
Chris@19	1731 @code{NULL}-terminated string holding the wisdom data. This string is
Chris@19	1732 dynamically allocated, and it is the responsibility of the caller to
Chris@19	1733 deallocate it with @code{free} when it is no longer needed.
Chris@19	1734
Chris@19	1735 All of these routines export the wisdom in the same format, which we
Chris@19	1736 will not document here except to say that it is LISP-like ASCII text
Chris@19	1737 that is insensitive to white space.
Chris@19	1738
Chris@19	1739 @c =========>
Chris@19	1740 @node Wisdom Import, Forgetting Wisdom, Wisdom Export, Wisdom
Chris@19	1741 @subsection Wisdom Import
Chris@19	1742
Chris@19	1743 @example
Chris@19	1744 int fftw_import_system_wisdom(void);
Chris@19	1745 int fftw_import_wisdom_from_filename(const char *filename);
Chris@19	1746 int fftw_import_wisdom_from_string(const char *input_string);
Chris@19	1747 int fftw_import_wisdom(int (read_char)(void ), void *data);
Chris@19	1748 @end example
Chris@19	1749 @findex fftw_import_wisdom
Chris@19	1750 @findex fftw_import_system_wisdom
Chris@19	1751 @findex fftw_import_wisdom_from_filename
Chris@19	1752 @findex fftw_import_wisdom_from_file
Chris@19	1753 @findex fftw_import_wisdom_from_string
Chris@19	1754
Chris@19	1755 These functions import wisdom into a program from data stored by the
Chris@19	1756 @code{fftw_export_wisdom} functions above. (@xref{Words of
Chris@19	1757 Wisdom-Saving Plans}.) The imported wisdom replaces any wisdom
Chris@19	1758 already accumulated by the running program.
Chris@19	1759
Chris@19	1760 @code{fftw_import_wisdom} imports wisdom from any input medium, as
Chris@19	1761 specified by the callback function @code{read_char}. @code{read_char} is
Chris@19	1762 a @code{getc}-like function that returns the next character in the
Chris@19	1763 input; its parameter is the @code{data} pointer passed to
Chris@19	1764 @code{fftw_import_wisdom}. If the end of the input data is reached
Chris@19	1765 (which should never happen for valid data), @code{read_char} should
Chris@19	1766 return @code{EOF} (as defined in @code{<stdio.h>}). For convenience,
Chris@19	1767 the following three ``wrapper'' routines are provided:
Chris@19	1768
Chris@19	1769 @code{fftw_import_wisdom_from_filename} reads wisdom from a file named
Chris@19	1770 @code{filename}. A lower-level function, which requires you to open
Chris@19	1771 and close the file yourself (e.g. if you want to read wisdom from a
Chris@19	1772 portion of a larger file) is @code{fftw_import_wisdom_from_file}. This
Chris@19	1773 reads wisdom from the current position in @code{input_file} (which
Chris@19	1774 should be open with read permission); upon exit, the file remains
Chris@19	1775 open, but the position of the read pointer is unspecified.
Chris@19	1776
Chris@19	1777 @code{fftw_import_wisdom_from_string} reads wisdom from the
Chris@19	1778 @code{NULL}-terminated string @code{input_string}.
Chris@19	1779
Chris@19	1780 @code{fftw_import_system_wisdom} reads wisdom from an
Chris@19	1781 implementation-defined standard file (@code{/etc/fftw/wisdom} on Unix
Chris@19	1782 and GNU systems).
Chris@19	1783 @cindex wisdom, system-wide
Chris@19	1784
Chris@19	1785
Chris@19	1786 The return value of these import routines is @code{1} if the wisdom was
Chris@19	1787 read successfully and @code{0} otherwise. Note that, in all of these
Chris@19	1788 functions, any data in the input stream past the end of the wisdom data
Chris@19	1789 is simply ignored.
Chris@19	1790
Chris@19	1791 @c =========>
Chris@19	1792 @node Forgetting Wisdom, Wisdom Utilities, Wisdom Import, Wisdom
Chris@19	1793 @subsection Forgetting Wisdom
Chris@19	1794
Chris@19	1795 @example
Chris@19	1796 void fftw_forget_wisdom(void);
Chris@19	1797 @end example
Chris@19	1798 @findex fftw_forget_wisdom
Chris@19	1799
Chris@19	1800 Calling @code{fftw_forget_wisdom} causes all accumulated @code{wisdom}
Chris@19	1801 to be discarded and its associated memory to be freed. (New
Chris@19	1802 @code{wisdom} can still be gathered subsequently, however.)
Chris@19	1803
Chris@19	1804 @c =========>
Chris@19	1805 @node Wisdom Utilities, , Forgetting Wisdom, Wisdom
Chris@19	1806 @subsection Wisdom Utilities
Chris@19	1807
Chris@19	1808 FFTW includes two standalone utility programs that deal with wisdom. We
Chris@19	1809 merely summarize them here, since they come with their own @code{man}
Chris@19	1810 pages for Unix and GNU systems (with HTML versions on our web site).
Chris@19	1811
Chris@19	1812 The first program is @code{fftw-wisdom} (or @code{fftwf-wisdom} in
Chris@19	1813 single precision, etcetera), which can be used to create a wisdom file
Chris@19	1814 containing plans for any of the transform sizes and types supported by
Chris@19	1815 FFTW. It is preferable to create wisdom directly from your executable
Chris@19	1816 (@pxref{Caveats in Using Wisdom}), but this program is useful for
Chris@19	1817 creating global wisdom files for @code{fftw_import_system_wisdom}.
Chris@19	1818 @cindex fftw-wisdom utility
Chris@19	1819
Chris@19	1820
Chris@19	1821 The second program is @code{fftw-wisdom-to-conf}, which takes a wisdom
Chris@19	1822 file as input and produces a @dfn{configuration routine} as output. The
Chris@19	1823 latter is a C subroutine that you can compile and link into your
Chris@19	1824 program, replacing a routine of the same name in the FFTW library, that
Chris@19	1825 determines which parts of FFTW are callable by your program.
Chris@19	1826 @code{fftw-wisdom-to-conf} produces a configuration routine that links
Chris@19	1827 to only those parts of FFTW needed by the saved plans in the wisdom,
Chris@19	1828 greatly reducing the size of statically linked executables (which should
Chris@19	1829 only attempt to create plans corresponding to those in the wisdom,
Chris@19	1830 however).
Chris@19	1831 @cindex fftw-wisdom-to-conf utility
Chris@19	1832 @cindex configuration routines
Chris@19	1833
Chris@19	1834 @c ------------------------------------------------------------
Chris@19	1835 @node What FFTW Really Computes, , Wisdom, FFTW Reference
Chris@19	1836 @section What FFTW Really Computes
Chris@19	1837
Chris@19	1838 In this section, we provide precise mathematical definitions for the
Chris@19	1839 transforms that FFTW computes. These transform definitions are fairly
Chris@19	1840 standard, but some authors follow slightly different conventions for the
Chris@19	1841 normalization of the transform (the constant factor in front) and the
Chris@19	1842 sign of the complex exponent. We begin by presenting the
Chris@19	1843 one-dimensional (1d) transform definitions, and then give the
Chris@19	1844 straightforward extension to multi-dimensional transforms.
Chris@19	1845
Chris@19	1846 @menu
Chris@19	1847 * The 1d Discrete Fourier Transform (DFT)::
Chris@19	1848 * The 1d Real-data DFT::
Chris@19	1849 * 1d Real-even DFTs (DCTs)::
Chris@19	1850 * 1d Real-odd DFTs (DSTs)::
Chris@19	1851 * 1d Discrete Hartley Transforms (DHTs)::
Chris@19	1852 * Multi-dimensional Transforms::
Chris@19	1853 @end menu
Chris@19	1854
Chris@19	1855 @c =========>
Chris@19	1856 @node The 1d Discrete Fourier Transform (DFT), The 1d Real-data DFT, What FFTW Really Computes, What FFTW Really Computes
Chris@19	1857 @subsection The 1d Discrete Fourier Transform (DFT)
Chris@19	1858
Chris@19	1859 @cindex discrete Fourier transform
Chris@19	1860 @cindex DFT
Chris@19	1861 The forward (@code{FFTW_FORWARD}) discrete Fourier transform (DFT) of a
Chris@19	1862 1d complex array @math{X} of size @math{n} computes an array @math{Y},
Chris@19	1863 where:
Chris@19	1864 @tex
Chris@19	1865 $$
Chris@19	1866 Y_k = \sum_{j = 0}^{n - 1} X_j e^{-2\pi j k \sqrt{-1}/n} \ .
Chris@19	1867 $$
Chris@19	1868 @end tex
Chris@19	1869 @ifinfo
Chris@19	1870 @center Y[k] = sum for j = 0 to (n - 1) of X[j] * exp(-2 pi j k sqrt(-1)/n) .
Chris@19	1871 @end ifinfo
Chris@19	1872 @html
Chris@19	1873 <center><img src="equation-dft.png" align="top">.</center>
Chris@19	1874 @end html
Chris@19	1875 The backward (@code{FFTW_BACKWARD}) DFT computes:
Chris@19	1876 @tex
Chris@19	1877 $$
Chris@19	1878 Y_k = \sum_{j = 0}^{n - 1} X_j e^{2\pi j k \sqrt{-1}/n} \ .
Chris@19	1879 $$
Chris@19	1880 @end tex
Chris@19	1881 @ifinfo
Chris@19	1882 @center Y[k] = sum for j = 0 to (n - 1) of X[j] * exp(2 pi j k sqrt(-1)/n) .
Chris@19	1883 @end ifinfo
Chris@19	1884 @html
Chris@19	1885 <center><img src="equation-idft.png" align="top">.</center>
Chris@19	1886 @end html
Chris@19	1887
Chris@19	1888 @cindex normalization
Chris@19	1889 FFTW computes an unnormalized transform, in that there is no coefficient
Chris@19	1890 in front of the summation in the DFT. In other words, applying the
Chris@19	1891 forward and then the backward transform will multiply the input by
Chris@19	1892 @math{n}.
Chris@19	1893
Chris@19	1894 @cindex frequency
Chris@19	1895 From above, an @code{FFTW_FORWARD} transform corresponds to a sign of
Chris@19	1896 @math{-1} in the exponent of the DFT. Note also that we use the
Chris@19	1897 standard ``in-order'' output ordering---the @math{k}-th output
Chris@19	1898 corresponds to the frequency @math{k/n} (or @math{k/T}, where @math{T}
Chris@19	1899 is your total sampling period). For those who like to think in terms of
Chris@19	1900 positive and negative frequencies, this means that the positive
Chris@19	1901 frequencies are stored in the first half of the output and the negative
Chris@19	1902 frequencies are stored in backwards order in the second half of the
Chris@19	1903 output. (The frequency @math{-k/n} is the same as the frequency
Chris@19	1904 @math{(n-k)/n}.)
Chris@19	1905
Chris@19	1906 @c =========>
Chris@19	1907 @node The 1d Real-data DFT, 1d Real-even DFTs (DCTs), The 1d Discrete Fourier Transform (DFT), What FFTW Really Computes
Chris@19	1908 @subsection The 1d Real-data DFT
Chris@19	1909
Chris@19	1910 The real-input (r2c) DFT in FFTW computes the @emph{forward} transform
Chris@19	1911 @math{Y} of the size @code{n} real array @math{X}, exactly as defined
Chris@19	1912 above, i.e.
Chris@19	1913 @tex
Chris@19	1914 $$
Chris@19	1915 Y_k = \sum_{j = 0}^{n - 1} X_j e^{-2\pi j k \sqrt{-1}/n} \ .
Chris@19	1916 $$
Chris@19	1917 @end tex
Chris@19	1918 @ifinfo
Chris@19	1919 @center Y[k] = sum for j = 0 to (n - 1) of X[j] * exp(-2 pi j k sqrt(-1)/n) .
Chris@19	1920 @end ifinfo
Chris@19	1921 @html
Chris@19	1922 <center><img src="equation-dft.png" align="top">.</center>
Chris@19	1923 @end html
Chris@19	1924 This output array @math{Y} can easily be shown to possess the
Chris@19	1925 ``Hermitian'' symmetry
Chris@19	1926 @cindex Hermitian
Chris@19	1927 @tex
Chris@19	1928 $Y_k = Y_{n-k}^*$,
Chris@19	1929 @end tex
Chris@19	1930 @ifinfo
Chris@19	1931 Y[k] = Y[n-k]*,
Chris@19	1932 @end ifinfo
Chris@19	1933 @html
Chris@19	1934 <i>Y<sub>k</sub> = Y<sub>n-k</sub></i><sup>*</sup>,
Chris@19	1935 @end html
Chris@19	1936 where we take @math{Y} to be periodic so that
Chris@19	1937 @tex
Chris@19	1938 $Y_n = Y_0$.
Chris@19	1939 @end tex
Chris@19	1940 @ifinfo
Chris@19	1941 Y[n] = Y[0].
Chris@19	1942 @end ifinfo
Chris@19	1943 @html
Chris@19	1944 <i>Y<sub>n</sub> = Y</i><sub>0</sub>.
Chris@19	1945 @end html
Chris@19	1946
Chris@19	1947 As a result of this symmetry, half of the output @math{Y} is redundant
Chris@19	1948 (being the complex conjugate of the other half), and so the 1d r2c
Chris@19	1949 transforms only output elements @math{0}@dots{}@math{n/2} of @math{Y}
Chris@19	1950 (@math{n/2+1} complex numbers), where the division by @math{2} is
Chris@19	1951 rounded down.
Chris@19	1952
Chris@19	1953 Moreover, the Hermitian symmetry implies that
Chris@19	1954 @tex
Chris@19	1955 $Y_0$
Chris@19	1956 @end tex
Chris@19	1957 @ifinfo
Chris@19	1958 Y[0]
Chris@19	1959 @end ifinfo
Chris@19	1960 @html
Chris@19	1961 <i>Y</i><sub>0</sub>
Chris@19	1962 @end html
Chris@19	1963 and, if @math{n} is even, the
Chris@19	1964 @tex
Chris@19	1965 $Y_{n/2}$
Chris@19	1966 @end tex
Chris@19	1967 @ifinfo
Chris@19	1968 Y[n/2]
Chris@19	1969 @end ifinfo
Chris@19	1970 @html
Chris@19	1971 <i>Y</i><sub><i>n</i>/2</sub>
Chris@19	1972 @end html
Chris@19	1973 element, are purely real. So, for the @code{R2HC} r2r transform, these
Chris@19	1974 elements are not stored in the halfcomplex output format.
Chris@19	1975 @cindex r2r
Chris@19	1976 @ctindex R2HC
Chris@19	1977 @cindex halfcomplex format
Chris@19	1978
Chris@19	1979
Chris@19	1980 The c2r and @code{H2RC} r2r transforms compute the backward DFT of the
Chris@19	1981 @emph{complex} array @math{X} with Hermitian symmetry, stored in the
Chris@19	1982 r2c/@code{R2HC} output formats, respectively, where the backward
Chris@19	1983 transform is defined exactly as for the complex case:
Chris@19	1984 @tex
Chris@19	1985 $$
Chris@19	1986 Y_k = \sum_{j = 0}^{n - 1} X_j e^{2\pi j k \sqrt{-1}/n} \ .
Chris@19	1987 $$
Chris@19	1988 @end tex
Chris@19	1989 @ifinfo
Chris@19	1990 @center Y[k] = sum for j = 0 to (n - 1) of X[j] * exp(2 pi j k sqrt(-1)/n) .
Chris@19	1991 @end ifinfo
Chris@19	1992 @html
Chris@19	1993 <center><img src="equation-idft.png" align="top">.</center>
Chris@19	1994 @end html
Chris@19	1995 The outputs @code{Y} of this transform can easily be seen to be purely
Chris@19	1996 real, and are stored as an array of real numbers.
Chris@19	1997
Chris@19	1998 @cindex normalization
Chris@19	1999 Like FFTW's complex DFT, these transforms are unnormalized. In other
Chris@19	2000 words, applying the real-to-complex (forward) and then the
Chris@19	2001 complex-to-real (backward) transform will multiply the input by
Chris@19	2002 @math{n}.
Chris@19	2003
Chris@19	2004 @c =========>
Chris@19	2005 @node 1d Real-even DFTs (DCTs), 1d Real-odd DFTs (DSTs), The 1d Real-data DFT, What FFTW Really Computes
Chris@19	2006 @subsection 1d Real-even DFTs (DCTs)
Chris@19	2007
Chris@19	2008 The Real-even symmetry DFTs in FFTW are exactly equivalent to the unnormalized
Chris@19	2009 forward (and backward) DFTs as defined above, where the input array
Chris@19	2010 @math{X} of length @math{N} is purely real and is also @dfn{even} symmetry. In
Chris@19	2011 this case, the output array is likewise real and even symmetry.
Chris@19	2012 @cindex real-even DFT
Chris@19	2013 @cindex REDFT
Chris@19	2014
Chris@19	2015
Chris@19	2016 @ctindex REDFT00
Chris@19	2017 For the case of @code{REDFT00}, this even symmetry means that
Chris@19	2018 @tex
Chris@19	2019 $X_j = X_{N-j}$,
Chris@19	2020 @end tex
Chris@19	2021 @ifinfo
Chris@19	2022 X[j] = X[N-j],
Chris@19	2023 @end ifinfo
Chris@19	2024 @html
Chris@19	2025 <i>X<sub>j</sub> = X<sub>N-j</sub></i>,
Chris@19	2026 @end html
Chris@19	2027 where we take @math{X} to be periodic so that
Chris@19	2028 @tex
Chris@19	2029 $X_N = X_0$.
Chris@19	2030 @end tex
Chris@19	2031 @ifinfo
Chris@19	2032 X[N] = X[0].
Chris@19	2033 @end ifinfo
Chris@19	2034 @html
Chris@19	2035 <i>X<sub>N</sub> = X</i><sub>0</sub>.
Chris@19	2036 @end html
Chris@19	2037 Because of this redundancy, only the first @math{n} real numbers are
Chris@19	2038 actually stored, where @math{N = 2(n-1)}.
Chris@19	2039
Chris@19	2040 The proper definition of even symmetry for @code{REDFT10},
Chris@19	2041 @code{REDFT01}, and @code{REDFT11} transforms is somewhat more intricate
Chris@19	2042 because of the shifts by @math{1/2} of the input and/or output, although
Chris@19	2043 the corresponding boundary conditions are given in @ref{Real even/odd
Chris@19	2044 DFTs (cosine/sine transforms)}. Because of the even symmetry, however,
Chris@19	2045 the sine terms in the DFT all cancel and the remaining cosine terms are
Chris@19	2046 written explicitly below. This formulation often leads people to call
Chris@19	2047 such a transform a @dfn{discrete cosine transform} (DCT), although it is
Chris@19	2048 really just a special case of the DFT.
Chris@19	2049 @cindex discrete cosine transform
Chris@19	2050 @cindex DCT
Chris@19	2051
Chris@19	2052
Chris@19	2053 In each of the definitions below, we transform a real array @math{X} of
Chris@19	2054 length @math{n} to a real array @math{Y} of length @math{n}:
Chris@19	2055
Chris@19	2056 @subsubheading REDFT00 (DCT-I)
Chris@19	2057 @ctindex REDFT00
Chris@19	2058 An @code{REDFT00} transform (type-I DCT) in FFTW is defined by:
Chris@19	2059 @tex
Chris@19	2060 $$
Chris@19	2061 Y_k = X_0 + (-1)^k X_{n-1}
Chris@19	2062 + 2 \sum_{j=1}^{n-2} X_j \cos [ \pi j k / (n-1)].
Chris@19	2063 $$
Chris@19	2064 @end tex
Chris@19	2065 @ifinfo
Chris@19	2066 Y[k] = X[0] + (-1)^k X[n-1] + 2 (sum for j = 1 to n-2 of X[j] cos(pi jk /(n-1))).
Chris@19	2067 @end ifinfo
Chris@19	2068 @html
Chris@19	2069 <center><img src="equation-redft00.png" align="top">.</center>
Chris@19	2070 @end html
Chris@19	2071 Note that this transform is not defined for @math{n=1}. For @math{n=2},
Chris@19	2072 the summation term above is dropped as you might expect.
Chris@19	2073
Chris@19	2074 @subsubheading REDFT10 (DCT-II)
Chris@19	2075 @ctindex REDFT10
Chris@19	2076 An @code{REDFT10} transform (type-II DCT, sometimes called ``the'' DCT) in FFTW is defined by:
Chris@19	2077 @tex
Chris@19	2078 $$
Chris@19	2079 Y_k = 2 \sum_{j=0}^{n-1} X_j \cos [\pi (j+1/2) k / n].
Chris@19	2080 $$
Chris@19	2081 @end tex
Chris@19	2082 @ifinfo
Chris@19	2083 Y[k] = 2 (sum for j = 0 to n-1 of X[j] cos(pi (j+1/2) k / n)).
Chris@19	2084 @end ifinfo
Chris@19	2085 @html
Chris@19	2086 <center><img src="equation-redft10.png" align="top">.</center>
Chris@19	2087 @end html
Chris@19	2088
Chris@19	2089 @subsubheading REDFT01 (DCT-III)
Chris@19	2090 @ctindex REDFT01
Chris@19	2091 An @code{REDFT01} transform (type-III DCT) in FFTW is defined by:
Chris@19	2092 @tex
Chris@19	2093 $$
Chris@19	2094 Y_k = X_0 + 2 \sum_{j=1}^{n-1} X_j \cos [\pi j (k+1/2) / n].
Chris@19	2095 $$
Chris@19	2096 @end tex
Chris@19	2097 @ifinfo
Chris@19	2098 Y[k] = X[0] + 2 (sum for j = 1 to n-1 of X[j] cos(pi j (k+1/2) / n)).
Chris@19	2099 @end ifinfo
Chris@19	2100 @html
Chris@19	2101 <center><img src="equation-redft01.png" align="top">.</center>
Chris@19	2102 @end html
Chris@19	2103 In the case of @math{n=1}, this reduces to
Chris@19	2104 @tex
Chris@19	2105 $Y_0 = X_0$.
Chris@19	2106 @end tex
Chris@19	2107 @ifinfo
Chris@19	2108 Y[0] = X[0].
Chris@19	2109 @end ifinfo
Chris@19	2110 @html
Chris@19	2111 <i>Y</i><sub>0</sub> = <i>X</i><sub>0</sub>.
Chris@19	2112 @end html
Chris@19	2113 Up to a scale factor (see below), this is the inverse of @code{REDFT10} (``the'' DCT), and so the @code{REDFT01} (DCT-III) is sometimes called the ``IDCT''.
Chris@19	2114 @cindex IDCT
Chris@19	2115
Chris@19	2116 @subsubheading REDFT11 (DCT-IV)
Chris@19	2117 @ctindex REDFT11
Chris@19	2118 An @code{REDFT11} transform (type-IV DCT) in FFTW is defined by:
Chris@19	2119 @tex
Chris@19	2120 $$
Chris@19	2121 Y_k = 2 \sum_{j=0}^{n-1} X_j \cos [\pi (j+1/2) (k+1/2) / n].
Chris@19	2122 $$
Chris@19	2123 @end tex
Chris@19	2124 @ifinfo
Chris@19	2125 Y[k] = 2 (sum for j = 0 to n-1 of X[j] cos(pi (j+1/2) (k+1/2) / n)).
Chris@19	2126 @end ifinfo
Chris@19	2127 @html
Chris@19	2128 <center><img src="equation-redft11.png" align="top">.</center>
Chris@19	2129 @end html
Chris@19	2130
Chris@19	2131 @subsubheading Inverses and Normalization
Chris@19	2132
Chris@19	2133 These definitions correspond directly to the unnormalized DFTs used
Chris@19	2134 elsewhere in FFTW (hence the factors of @math{2} in front of the
Chris@19	2135 summations). The unnormalized inverse of @code{REDFT00} is
Chris@19	2136 @code{REDFT00}, of @code{REDFT10} is @code{REDFT01} and vice versa, and
Chris@19	2137 of @code{REDFT11} is @code{REDFT11}. Each unnormalized inverse results
Chris@19	2138 in the original array multiplied by @math{N}, where @math{N} is the
Chris@19	2139 @emph{logical} DFT size. For @code{REDFT00}, @math{N=2(n-1)} (note that
Chris@19	2140 @math{n=1} is not defined); otherwise, @math{N=2n}.
Chris@19	2141 @cindex normalization
Chris@19	2142
Chris@19	2143
Chris@19	2144 In defining the discrete cosine transform, some authors also include
Chris@19	2145 additional factors of
Chris@19	2146 @ifinfo
Chris@19	2147 sqrt(2)
Chris@19	2148 @end ifinfo
Chris@19	2149 @html
Chris@19	2150 √2
Chris@19	2151 @end html
Chris@19	2152 @tex
Chris@19	2153 $\sqrt{2}$
Chris@19	2154 @end tex
Chris@19	2155 (or its inverse) multiplying selected inputs and/or outputs. This is a
Chris@19	2156 mostly cosmetic change that makes the transform orthogonal, but
Chris@19	2157 sacrifices the direct equivalence to a symmetric DFT.
Chris@19	2158
Chris@19	2159 @c =========>
Chris@19	2160 @node 1d Real-odd DFTs (DSTs), 1d Discrete Hartley Transforms (DHTs), 1d Real-even DFTs (DCTs), What FFTW Really Computes
Chris@19	2161 @subsection 1d Real-odd DFTs (DSTs)
Chris@19	2162
Chris@19	2163 The Real-odd symmetry DFTs in FFTW are exactly equivalent to the unnormalized
Chris@19	2164 forward (and backward) DFTs as defined above, where the input array
Chris@19	2165 @math{X} of length @math{N} is purely real and is also @dfn{odd} symmetry. In
Chris@19	2166 this case, the output is odd symmetry and purely imaginary.
Chris@19	2167 @cindex real-odd DFT
Chris@19	2168 @cindex RODFT
Chris@19	2169
Chris@19	2170
Chris@19	2171 @ctindex RODFT00
Chris@19	2172 For the case of @code{RODFT00}, this odd symmetry means that
Chris@19	2173 @tex
Chris@19	2174 $X_j = -X_{N-j}$,
Chris@19	2175 @end tex
Chris@19	2176 @ifinfo
Chris@19	2177 X[j] = -X[N-j],
Chris@19	2178 @end ifinfo
Chris@19	2179 @html
Chris@19	2180 <i>X<sub>j</sub> = -X<sub>N-j</sub></i>,
Chris@19	2181 @end html
Chris@19	2182 where we take @math{X} to be periodic so that
Chris@19	2183 @tex
Chris@19	2184 $X_N = X_0$.
Chris@19	2185 @end tex
Chris@19	2186 @ifinfo
Chris@19	2187 X[N] = X[0].
Chris@19	2188 @end ifinfo
Chris@19	2189 @html
Chris@19	2190 <i>X<sub>N</sub> = X</i><sub>0</sub>.
Chris@19	2191 @end html
Chris@19	2192 Because of this redundancy, only the first @math{n} real numbers
Chris@19	2193 starting at @math{j=1} are actually stored (the @math{j=0} element is
Chris@19	2194 zero), where @math{N = 2(n+1)}.
Chris@19	2195
Chris@19	2196 The proper definition of odd symmetry for @code{RODFT10},
Chris@19	2197 @code{RODFT01}, and @code{RODFT11} transforms is somewhat more intricate
Chris@19	2198 because of the shifts by @math{1/2} of the input and/or output, although
Chris@19	2199 the corresponding boundary conditions are given in @ref{Real even/odd
Chris@19	2200 DFTs (cosine/sine transforms)}. Because of the odd symmetry, however,
Chris@19	2201 the cosine terms in the DFT all cancel and the remaining sine terms are
Chris@19	2202 written explicitly below. This formulation often leads people to call
Chris@19	2203 such a transform a @dfn{discrete sine transform} (DST), although it is
Chris@19	2204 really just a special case of the DFT.
Chris@19	2205 @cindex discrete sine transform
Chris@19	2206 @cindex DST
Chris@19	2207
Chris@19	2208
Chris@19	2209 In each of the definitions below, we transform a real array @math{X} of
Chris@19	2210 length @math{n} to a real array @math{Y} of length @math{n}:
Chris@19	2211
Chris@19	2212 @subsubheading RODFT00 (DST-I)
Chris@19	2213 @ctindex RODFT00
Chris@19	2214 An @code{RODFT00} transform (type-I DST) in FFTW is defined by:
Chris@19	2215 @tex
Chris@19	2216 $$
Chris@19	2217 Y_k = 2 \sum_{j=0}^{n-1} X_j \sin [ \pi (j+1) (k+1) / (n+1)].
Chris@19	2218 $$
Chris@19	2219 @end tex
Chris@19	2220 @ifinfo
Chris@19	2221 Y[k] = 2 (sum for j = 0 to n-1 of X[j] sin(pi (j+1)(k+1) / (n+1))).
Chris@19	2222 @end ifinfo
Chris@19	2223 @html
Chris@19	2224 <center><img src="equation-rodft00.png" align="top">.</center>
Chris@19	2225 @end html
Chris@19	2226
Chris@19	2227 @subsubheading RODFT10 (DST-II)
Chris@19	2228 @ctindex RODFT10
Chris@19	2229 An @code{RODFT10} transform (type-II DST) in FFTW is defined by:
Chris@19	2230 @tex
Chris@19	2231 $$
Chris@19	2232 Y_k = 2 \sum_{j=0}^{n-1} X_j \sin [\pi (j+1/2) (k+1) / n].
Chris@19	2233 $$
Chris@19	2234 @end tex
Chris@19	2235 @ifinfo
Chris@19	2236 Y[k] = 2 (sum for j = 0 to n-1 of X[j] sin(pi (j+1/2) (k+1) / n)).
Chris@19	2237 @end ifinfo
Chris@19	2238 @html
Chris@19	2239 <center><img src="equation-rodft10.png" align="top">.</center>
Chris@19	2240 @end html
Chris@19	2241
Chris@19	2242 @subsubheading RODFT01 (DST-III)
Chris@19	2243 @ctindex RODFT01
Chris@19	2244 An @code{RODFT01} transform (type-III DST) in FFTW is defined by:
Chris@19	2245 @tex
Chris@19	2246 $$
Chris@19	2247 Y_k = (-1)^k X_{n-1} + 2 \sum_{j=0}^{n-2} X_j \sin [\pi (j+1) (k+1/2) / n].
Chris@19	2248 $$
Chris@19	2249 @end tex
Chris@19	2250 @ifinfo
Chris@19	2251 Y[k] = (-1)^k X[n-1] + 2 (sum for j = 0 to n-2 of X[j] sin(pi (j+1) (k+1/2) / n)).
Chris@19	2252 @end ifinfo
Chris@19	2253 @html
Chris@19	2254 <center><img src="equation-rodft01.png" align="top">.</center>
Chris@19	2255 @end html
Chris@19	2256 In the case of @math{n=1}, this reduces to
Chris@19	2257 @tex
Chris@19	2258 $Y_0 = X_0$.
Chris@19	2259 @end tex
Chris@19	2260 @ifinfo
Chris@19	2261 Y[0] = X[0].
Chris@19	2262 @end ifinfo
Chris@19	2263 @html
Chris@19	2264 <i>Y</i><sub>0</sub> = <i>X</i><sub>0</sub>.
Chris@19	2265 @end html
Chris@19	2266
Chris@19	2267 @subsubheading RODFT11 (DST-IV)
Chris@19	2268 @ctindex RODFT11
Chris@19	2269 An @code{RODFT11} transform (type-IV DST) in FFTW is defined by:
Chris@19	2270 @tex
Chris@19	2271 $$
Chris@19	2272 Y_k = 2 \sum_{j=0}^{n-1} X_j \sin [\pi (j+1/2) (k+1/2) / n].
Chris@19	2273 $$
Chris@19	2274 @end tex
Chris@19	2275 @ifinfo
Chris@19	2276 Y[k] = 2 (sum for j = 0 to n-1 of X[j] sin(pi (j+1/2) (k+1/2) / n)).
Chris@19	2277 @end ifinfo
Chris@19	2278 @html
Chris@19	2279 <center><img src="equation-rodft11.png" align="top">.</center>
Chris@19	2280 @end html
Chris@19	2281
Chris@19	2282 @subsubheading Inverses and Normalization
Chris@19	2283
Chris@19	2284 These definitions correspond directly to the unnormalized DFTs used
Chris@19	2285 elsewhere in FFTW (hence the factors of @math{2} in front of the
Chris@19	2286 summations). The unnormalized inverse of @code{RODFT00} is
Chris@19	2287 @code{RODFT00}, of @code{RODFT10} is @code{RODFT01} and vice versa, and
Chris@19	2288 of @code{RODFT11} is @code{RODFT11}. Each unnormalized inverse results
Chris@19	2289 in the original array multiplied by @math{N}, where @math{N} is the
Chris@19	2290 @emph{logical} DFT size. For @code{RODFT00}, @math{N=2(n+1)};
Chris@19	2291 otherwise, @math{N=2n}.
Chris@19	2292 @cindex normalization
Chris@19	2293
Chris@19	2294
Chris@19	2295 In defining the discrete sine transform, some authors also include
Chris@19	2296 additional factors of
Chris@19	2297 @ifinfo
Chris@19	2298 sqrt(2)
Chris@19	2299 @end ifinfo
Chris@19	2300 @html
Chris@19	2301 √2
Chris@19	2302 @end html
Chris@19	2303 @tex
Chris@19	2304 $\sqrt{2}$
Chris@19	2305 @end tex
Chris@19	2306 (or its inverse) multiplying selected inputs and/or outputs. This is a
Chris@19	2307 mostly cosmetic change that makes the transform orthogonal, but
Chris@19	2308 sacrifices the direct equivalence to an antisymmetric DFT.
Chris@19	2309
Chris@19	2310 @c =========>
Chris@19	2311 @node 1d Discrete Hartley Transforms (DHTs), Multi-dimensional Transforms, 1d Real-odd DFTs (DSTs), What FFTW Really Computes
Chris@19	2312 @subsection 1d Discrete Hartley Transforms (DHTs)
Chris@19	2313
Chris@19	2314 @cindex discrete Hartley transform
Chris@19	2315 @cindex DHT
Chris@19	2316 The discrete Hartley transform (DHT) of a 1d real array @math{X} of size
Chris@19	2317 @math{n} computes a real array @math{Y} of the same size, where:
Chris@19	2318 @tex
Chris@19	2319 $$
Chris@19	2320 Y_k = \sum_{j = 0}^{n - 1} X_j [ \cos(2\pi j k / n) + \sin(2\pi j k / n)].
Chris@19	2321 $$
Chris@19	2322 @end tex
Chris@19	2323 @ifinfo
Chris@19	2324 @center Y[k] = sum for j = 0 to (n - 1) of X[j] * [cos(2 pi j k / n) + sin(2 pi j k / n)].
Chris@19	2325 @end ifinfo
Chris@19	2326 @html
Chris@19	2327 <center><img src="equation-dht.png" align="top">.</center>
Chris@19	2328 @end html
Chris@19	2329
Chris@19	2330 @cindex normalization
Chris@19	2331 FFTW computes an unnormalized transform, in that there is no coefficient
Chris@19	2332 in front of the summation in the DHT. In other words, applying the
Chris@19	2333 transform twice (the DHT is its own inverse) will multiply the input by
Chris@19	2334 @math{n}.
Chris@19	2335
Chris@19	2336 @c =========>
Chris@19	2337 @node Multi-dimensional Transforms, , 1d Discrete Hartley Transforms (DHTs), What FFTW Really Computes
Chris@19	2338 @subsection Multi-dimensional Transforms
Chris@19	2339
Chris@19	2340 The multi-dimensional transforms of FFTW, in general, compute simply the
Chris@19	2341 separable product of the given 1d transform along each dimension of the
Chris@19	2342 array. Since each of these transforms is unnormalized, computing the
Chris@19	2343 forward followed by the backward/inverse multi-dimensional transform
Chris@19	2344 will result in the original array scaled by the product of the
Chris@19	2345 normalization factors for each dimension (e.g. the product of the
Chris@19	2346 dimension sizes, for a multi-dimensional DFT).
Chris@19	2347
Chris@19	2348 @tex
Chris@19	2349 As an explicit example, consider the following exact mathematical
Chris@19	2350 definition of our multi-dimensional DFT. Let $X$ be a $d$-dimensional
Chris@19	2351 complex array whose elements are $X[j_1, j_2, \ldots, j_d]$, where $0
Chris@19	2352 \leq j_s < n_s$ for all~$s \in \{ 1, 2, \ldots, d \}$. Let also
Chris@19	2353 $\omega_s = e^{2\pi \sqrt{-1}/n_s}$, for all ~$s \in \{ 1, 2, \ldots, d
Chris@19	2354 \}$.
Chris@19	2355
Chris@19	2356 The forward transform computes a complex array~$Y$, whose
Chris@19	2357 structure is the same as that of~$X$, defined by
Chris@19	2358
Chris@19	2359 $$
Chris@19	2360 Y[k_1, k_2, \ldots, k_d] =
Chris@19	2361 \sum_{j_1 = 0}^{n_1 - 1}
Chris@19	2362 \sum_{j_2 = 0}^{n_2 - 1}
Chris@19	2363 \cdots
Chris@19	2364 \sum_{j_d = 0}^{n_d - 1}
Chris@19	2365 X[j_1, j_2, \ldots, j_d]
Chris@19	2366 \omega_1^{-j_1 k_1}
Chris@19	2367 \omega_2^{-j_2 k_2}
Chris@19	2368 \cdots
Chris@19	2369 \omega_d^{-j_d k_d} \ .
Chris@19	2370 $$
Chris@19	2371
Chris@19	2372 The backward transform computes
Chris@19	2373 $$
Chris@19	2374 Y[k_1, k_2, \ldots, k_d] =
Chris@19	2375 \sum_{j_1 = 0}^{n_1 - 1}
Chris@19	2376 \sum_{j_2 = 0}^{n_2 - 1}
Chris@19	2377 \cdots
Chris@19	2378 \sum_{j_d = 0}^{n_d - 1}
Chris@19	2379 X[j_1, j_2, \ldots, j_d]
Chris@19	2380 \omega_1^{j_1 k_1}
Chris@19	2381 \omega_2^{j_2 k_2}
Chris@19	2382 \cdots
Chris@19	2383 \omega_d^{j_d k_d} \ .
Chris@19	2384 $$
Chris@19	2385
Chris@19	2386 Computing the forward transform followed by the backward transform
Chris@19	2387 will multiply the array by $\prod_{s=1}^{d} n_d$.
Chris@19	2388 @end tex
Chris@19	2389
Chris@19	2390 @cindex r2c
Chris@19	2391 The definition of FFTW's multi-dimensional DFT of real data (r2c)
Chris@19	2392 deserves special attention. In this case, we logically compute the full
Chris@19	2393 multi-dimensional DFT of the input data; since the input data are purely
Chris@19	2394 real, the output data have the Hermitian symmetry and therefore only one
Chris@19	2395 non-redundant half need be stored. More specifically, for an @ndims multi-dimensional real-input DFT, the full (logical) complex output array
Chris@19	2396 @tex
Chris@19	2397 $Y[k_0, k_1, \ldots, k_{d-1}]$
Chris@19	2398 @end tex
Chris@19	2399 @html
Chris@19	2400 <i>Y</i>[<i>k</i><sub>0</sub>, <i>k</i><sub>1</sub>, ...,
Chris@19	2401 <i>k</i><sub><i>d-1</i></sub>]
Chris@19	2402 @end html
Chris@19	2403 @ifinfo
Chris@19	2404 Y[k[0], k[1], ..., k[d-1]]
Chris@19	2405 @end ifinfo
Chris@19	2406 has the symmetry:
Chris@19	2407 @tex
Chris@19	2408 $$
Chris@19	2409 Y[k_0, k_1, \ldots, k_{d-1}] = Y[n_0 - k_0, n_1 - k_1, \ldots, n_{d-1} - k_{d-1}]^*
Chris@19	2410 $$
Chris@19	2411 @end tex
Chris@19	2412 @html
Chris@19	2413 <i>Y</i>[<i>k</i><sub>0</sub>, <i>k</i><sub>1</sub>, ...,
Chris@19	2414 <i>k</i><sub><i>d-1</i></sub>] = <i>Y</i>[<i>n</i><sub>0</sub> -
Chris@19	2415 <i>k</i><sub>0</sub>, <i>n</i><sub>1</sub> - <i>k</i><sub>1</sub>, ...,
Chris@19	2416 <i>n</i><sub><i>d-1</i></sub> - <i>k</i><sub><i>d-1</i></sub>]<sup>*</sup>
Chris@19	2417 @end html
Chris@19	2418 @ifinfo
Chris@19	2419 Y[k[0], k[1], ..., k[d-1]] = Y[n[0] - k[0], n[1] - k[1], ..., n[d-1] - k[d-1]]*
Chris@19	2420 @end ifinfo
Chris@19	2421 (where each dimension is periodic). Because of this symmetry, we only
Chris@19	2422 store the
Chris@19	2423 @tex
Chris@19	2424 $k_{d-1} = 0 \cdots n_{d-1}/2$
Chris@19	2425 @end tex
Chris@19	2426 @html
Chris@19	2427 <i>k</i><sub><i>d-1</i></sub> = 0...<i>n</i><sub><i>d-1</i></sub>/2+1
Chris@19	2428 @end html
Chris@19	2429 @ifinfo
Chris@19	2430 k[d-1] = 0...n[d-1]/2
Chris@19	2431 @end ifinfo
Chris@19	2432 elements of the @emph{last} dimension (division by @math{2} is rounded
Chris@19	2433 down). (We could instead have cut any other dimension in half, but the
Chris@19	2434 last dimension proved computationally convenient.) This results in the
Chris@19	2435 peculiar array format described in more detail by @ref{Real-data DFT
Chris@19	2436 Array Format}.
Chris@19	2437
Chris@19	2438 The multi-dimensional c2r transform is simply the unnormalized inverse
Chris@19	2439 of the r2c transform. i.e. it is the same as FFTW's complex backward
Chris@19	2440 multi-dimensional DFT, operating on a Hermitian input array in the
Chris@19	2441 peculiar format mentioned above and outputting a real array (since the
Chris@19	2442 DFT output is purely real).
Chris@19	2443
Chris@19	2444 We should remind the user that the separable product of 1d transforms
Chris@19	2445 along each dimension, as computed by FFTW, is not always the same thing
Chris@19	2446 as the usual multi-dimensional transform. A multi-dimensional
Chris@19	2447 @code{R2HC} (or @code{HC2R}) transform is not identical to the
Chris@19	2448 multi-dimensional DFT, requiring some post-processing to combine the
Chris@19	2449 requisite real and imaginary parts, as was described in @ref{The
Chris@19	2450 Halfcomplex-format DFT}. Likewise, FFTW's multidimensional
Chris@19	2451 @code{FFTW_DHT} r2r transform is not the same thing as the logical
Chris@19	2452 multi-dimensional discrete Hartley transform defined in the literature,
Chris@19	2453 as discussed in @ref{The Discrete Hartley Transform}.
Chris@19	2454

Mercurial > hg > js-dsp-test

annotate fft/fftw/fftw-3.3.4/doc/reference.texi @ 40:223f770b5341 kissfft-double tip