sv-dependency-builds: src/fftw-3.3.5/doc/tutorial.texi annotate

annotate src/fftw-3.3.5/doc/tutorial.texi @ 169:223a55898ab9 tip default

Add null config files

author	Chris Cannam <cannam@all-day-breakfast.com>
date	Mon, 02 Mar 2020 14:03:47 +0000
parents	7867fa7e1b6b
children

rev	line source
cannam@127	1 @node Tutorial, Other Important Topics, Introduction, Top
cannam@127	2 @chapter Tutorial
cannam@127	3 @menu
cannam@127	4 * Complex One-Dimensional DFTs::
cannam@127	5 * Complex Multi-Dimensional DFTs::
cannam@127	6 * One-Dimensional DFTs of Real Data::
cannam@127	7 * Multi-Dimensional DFTs of Real Data::
cannam@127	8 * More DFTs of Real Data::
cannam@127	9 @end menu
cannam@127	10
cannam@127	11 This chapter describes the basic usage of FFTW, i.e., how to compute
cannam@127	12 @cindex basic interface
cannam@127	13 the Fourier transform of a single array. This chapter tells the
cannam@127	14 truth, but not the @emph{whole} truth. Specifically, FFTW implements
cannam@127	15 additional routines and flags that are not documented here, although
cannam@127	16 in many cases we try to indicate where added capabilities exist. For
cannam@127	17 more complete information, see @ref{FFTW Reference}. (Note that you
cannam@127	18 need to compile and install FFTW before you can use it in a program.
cannam@127	19 For the details of the installation, see @ref{Installation and
cannam@127	20 Customization}.)
cannam@127	21
cannam@127	22 We recommend that you read this tutorial in order.@footnote{You can
cannam@127	23 read the tutorial in bit-reversed order after computing your first
cannam@127	24 transform.} At the least, read the first section (@pxref{Complex
cannam@127	25 One-Dimensional DFTs}) before reading any of the others, even if your
cannam@127	26 main interest lies in one of the other transform types.
cannam@127	27
cannam@127	28 Users of FFTW version 2 and earlier may also want to read @ref{Upgrading
cannam@127	29 from FFTW version 2}.
cannam@127	30
cannam@127	31 @c ------------------------------------------------------------
cannam@127	32 @node Complex One-Dimensional DFTs, Complex Multi-Dimensional DFTs, Tutorial, Tutorial
cannam@127	33 @section Complex One-Dimensional DFTs
cannam@127	34
cannam@127	35 @quotation
cannam@127	36 Plan: To bother about the best method of accomplishing an accidental result.
cannam@127	37 [Ambrose Bierce, @cite{The Enlarged Devil's Dictionary}.]
cannam@127	38 @cindex Devil
cannam@127	39 @end quotation
cannam@127	40
cannam@127	41 @iftex
cannam@127	42 @medskip
cannam@127	43 @end iftex
cannam@127	44
cannam@127	45 The basic usage of FFTW to compute a one-dimensional DFT of size
cannam@127	46 @code{N} is simple, and it typically looks something like this code:
cannam@127	47
cannam@127	48 @example
cannam@127	49 #include <fftw3.h>
cannam@127	50 ...
cannam@127	51 @{
cannam@127	52 fftw_complex in, out;
cannam@127	53 fftw_plan p;
cannam@127	54 ...
cannam@127	55 in = (fftw_complex) fftw_malloc(sizeof(fftw_complex) N);
cannam@127	56 out = (fftw_complex) fftw_malloc(sizeof(fftw_complex) N);
cannam@127	57 p = fftw_plan_dft_1d(N, in, out, FFTW_FORWARD, FFTW_ESTIMATE);
cannam@127	58 ...
cannam@127	59 fftw_execute(p); /* @r{repeat as needed} */
cannam@127	60 ...
cannam@127	61 fftw_destroy_plan(p);
cannam@127	62 fftw_free(in); fftw_free(out);
cannam@127	63 @}
cannam@127	64 @end example
cannam@127	65
cannam@127	66 You must link this code with the @code{fftw3} library. On Unix systems,
cannam@127	67 link with @code{-lfftw3 -lm}.
cannam@127	68
cannam@127	69 The example code first allocates the input and output arrays. You can
cannam@127	70 allocate them in any way that you like, but we recommend using
cannam@127	71 @code{fftw_malloc}, which behaves like
cannam@127	72 @findex fftw_malloc
cannam@127	73 @code{malloc} except that it properly aligns the array when SIMD
cannam@127	74 instructions (such as SSE and Altivec) are available (@pxref{SIMD
cannam@127	75 alignment and fftw_malloc}). [Alternatively, we provide a convenient wrapper function @code{fftw_alloc_complex(N)} which has the same effect.]
cannam@127	76 @findex fftw_alloc_complex
cannam@127	77 @cindex SIMD
cannam@127	78
cannam@127	79
cannam@127	80 The data is an array of type @code{fftw_complex}, which is by default a
cannam@127	81 @code{double[2]} composed of the real (@code{in[i][0]}) and imaginary
cannam@127	82 (@code{in[i][1]}) parts of a complex number.
cannam@127	83 @tindex fftw_complex
cannam@127	84
cannam@127	85 The next step is to create a @dfn{plan}, which is an object
cannam@127	86 @cindex plan
cannam@127	87 that contains all the data that FFTW needs to compute the FFT.
cannam@127	88 This function creates the plan:
cannam@127	89
cannam@127	90 @example
cannam@127	91 fftw_plan fftw_plan_dft_1d(int n, fftw_complex in, fftw_complex out,
cannam@127	92 int sign, unsigned flags);
cannam@127	93 @end example
cannam@127	94 @findex fftw_plan_dft_1d
cannam@127	95 @tindex fftw_plan
cannam@127	96
cannam@127	97 The first argument, @code{n}, is the size of the transform you are
cannam@127	98 trying to compute. The size @code{n} can be any positive integer, but
cannam@127	99 sizes that are products of small factors are transformed most
cannam@127	100 efficiently (although prime sizes still use an @Onlogn{} algorithm).
cannam@127	101
cannam@127	102 The next two arguments are pointers to the input and output arrays of
cannam@127	103 the transform. These pointers can be equal, indicating an
cannam@127	104 @dfn{in-place} transform.
cannam@127	105 @cindex in-place
cannam@127	106
cannam@127	107
cannam@127	108 The fourth argument, @code{sign}, can be either @code{FFTW_FORWARD}
cannam@127	109 (@code{-1}) or @code{FFTW_BACKWARD} (@code{+1}),
cannam@127	110 @ctindex FFTW_FORWARD
cannam@127	111 @ctindex FFTW_BACKWARD
cannam@127	112 and indicates the direction of the transform you are interested in;
cannam@127	113 technically, it is the sign of the exponent in the transform.
cannam@127	114
cannam@127	115 The @code{flags} argument is usually either @code{FFTW_MEASURE} or
cannam@127	116 @cindex flags
cannam@127	117 @code{FFTW_ESTIMATE}. @code{FFTW_MEASURE} instructs FFTW to run
cannam@127	118 @ctindex FFTW_MEASURE
cannam@127	119 and measure the execution time of several FFTs in order to find the
cannam@127	120 best way to compute the transform of size @code{n}. This process takes
cannam@127	121 some time (usually a few seconds), depending on your machine and on
cannam@127	122 the size of the transform. @code{FFTW_ESTIMATE}, on the contrary,
cannam@127	123 does not run any computation and just builds a
cannam@127	124 @ctindex FFTW_ESTIMATE
cannam@127	125 reasonable plan that is probably sub-optimal. In short, if your
cannam@127	126 program performs many transforms of the same size and initialization
cannam@127	127 time is not important, use @code{FFTW_MEASURE}; otherwise use the
cannam@127	128 estimate.
cannam@127	129
cannam@127	130 @emph{You must create the plan before initializing the input}, because
cannam@127	131 @code{FFTW_MEASURE} overwrites the @code{in}/@code{out} arrays.
cannam@127	132 (Technically, @code{FFTW_ESTIMATE} does not touch your arrays, but you
cannam@127	133 should always create plans first just to be sure.)
cannam@127	134
cannam@127	135 Once the plan has been created, you can use it as many times as you
cannam@127	136 like for transforms on the specified @code{in}/@code{out} arrays,
cannam@127	137 computing the actual transforms via @code{fftw_execute(plan)}:
cannam@127	138 @example
cannam@127	139 void fftw_execute(const fftw_plan plan);
cannam@127	140 @end example
cannam@127	141 @findex fftw_execute
cannam@127	142
cannam@127	143 The DFT results are stored in-order in the array @code{out}, with the
cannam@127	144 zero-frequency (DC) component in @code{out[0]}.
cannam@127	145 @cindex frequency
cannam@127	146 If @code{in != out}, the transform is @dfn{out-of-place} and the input
cannam@127	147 array @code{in} is not modified. Otherwise, the input array is
cannam@127	148 overwritten with the transform.
cannam@127	149
cannam@127	150 @cindex execute
cannam@127	151 If you want to transform a @emph{different} array of the same size, you
cannam@127	152 can create a new plan with @code{fftw_plan_dft_1d} and FFTW
cannam@127	153 automatically reuses the information from the previous plan, if
cannam@127	154 possible. Alternatively, with the ``guru'' interface you can apply a
cannam@127	155 given plan to a different array, if you are careful.
cannam@127	156 @xref{FFTW Reference}.
cannam@127	157
cannam@127	158 When you are done with the plan, you deallocate it by calling
cannam@127	159 @code{fftw_destroy_plan(plan)}:
cannam@127	160 @example
cannam@127	161 void fftw_destroy_plan(fftw_plan plan);
cannam@127	162 @end example
cannam@127	163 @findex fftw_destroy_plan
cannam@127	164 If you allocate an array with @code{fftw_malloc()} you must deallocate
cannam@127	165 it with @code{fftw_free()}. Do not use @code{free()} or, heaven
cannam@127	166 forbid, @code{delete}.
cannam@127	167 @findex fftw_free
cannam@127	168
cannam@127	169 FFTW computes an @emph{unnormalized} DFT. Thus, computing a forward
cannam@127	170 followed by a backward transform (or vice versa) results in the original
cannam@127	171 array scaled by @code{n}. For the definition of the DFT, see @ref{What
cannam@127	172 FFTW Really Computes}.
cannam@127	173 @cindex DFT
cannam@127	174 @cindex normalization
cannam@127	175
cannam@127	176
cannam@127	177 If you have a C compiler, such as @code{gcc}, that supports the
cannam@127	178 C99 standard, and you @code{#include <complex.h>} @emph{before}
cannam@127	179 @code{<fftw3.h>}, then @code{fftw_complex} is the native
cannam@127	180 double-precision complex type and you can manipulate it with ordinary
cannam@127	181 arithmetic. Otherwise, FFTW defines its own complex type, which is
cannam@127	182 bit-compatible with the C99 complex type. @xref{Complex numbers}.
cannam@127	183 (The C++ @code{<complex>} template class may also be usable via a
cannam@127	184 typecast.)
cannam@127	185 @cindex C++
cannam@127	186
cannam@127	187 To use single or long-double precision versions of FFTW, replace the
cannam@127	188 @code{fftw_} prefix by @code{fftwf_} or @code{fftwl_} and link with
cannam@127	189 @code{-lfftw3f} or @code{-lfftw3l}, but use the @emph{same}
cannam@127	190 @code{<fftw3.h>} header file.
cannam@127	191 @cindex precision
cannam@127	192
cannam@127	193
cannam@127	194 Many more flags exist besides @code{FFTW_MEASURE} and
cannam@127	195 @code{FFTW_ESTIMATE}. For example, use @code{FFTW_PATIENT} if you're
cannam@127	196 willing to wait even longer for a possibly even faster plan (@pxref{FFTW
cannam@127	197 Reference}).
cannam@127	198 @ctindex FFTW_PATIENT
cannam@127	199 You can also save plans for future use, as described by @ref{Words of
cannam@127	200 Wisdom-Saving Plans}.
cannam@127	201
cannam@127	202 @c ------------------------------------------------------------
cannam@127	203 @node Complex Multi-Dimensional DFTs, One-Dimensional DFTs of Real Data, Complex One-Dimensional DFTs, Tutorial
cannam@127	204 @section Complex Multi-Dimensional DFTs
cannam@127	205
cannam@127	206 Multi-dimensional transforms work much the same way as one-dimensional
cannam@127	207 transforms: you allocate arrays of @code{fftw_complex} (preferably
cannam@127	208 using @code{fftw_malloc}), create an @code{fftw_plan}, execute it as
cannam@127	209 many times as you want with @code{fftw_execute(plan)}, and clean up
cannam@127	210 with @code{fftw_destroy_plan(plan)} (and @code{fftw_free}).
cannam@127	211
cannam@127	212 FFTW provides two routines for creating plans for 2d and 3d transforms,
cannam@127	213 and one routine for creating plans of arbitrary dimensionality.
cannam@127	214 The 2d and 3d routines have the following signature:
cannam@127	215 @example
cannam@127	216 fftw_plan fftw_plan_dft_2d(int n0, int n1,
cannam@127	217 fftw_complex in, fftw_complex out,
cannam@127	218 int sign, unsigned flags);
cannam@127	219 fftw_plan fftw_plan_dft_3d(int n0, int n1, int n2,
cannam@127	220 fftw_complex in, fftw_complex out,
cannam@127	221 int sign, unsigned flags);
cannam@127	222 @end example
cannam@127	223 @findex fftw_plan_dft_2d
cannam@127	224 @findex fftw_plan_dft_3d
cannam@127	225
cannam@127	226 These routines create plans for @code{n0} by @code{n1} two-dimensional
cannam@127	227 (2d) transforms and @code{n0} by @code{n1} by @code{n2} 3d transforms,
cannam@127	228 respectively. All of these transforms operate on contiguous arrays in
cannam@127	229 the C-standard @dfn{row-major} order, so that the last dimension has the
cannam@127	230 fastest-varying index in the array. This layout is described further in
cannam@127	231 @ref{Multi-dimensional Array Format}.
cannam@127	232
cannam@127	233 FFTW can also compute transforms of higher dimensionality. In order to
cannam@127	234 avoid confusion between the various meanings of the the word
cannam@127	235 ``dimension'', we use the term @emph{rank}
cannam@127	236 @cindex rank
cannam@127	237 to denote the number of independent indices in an array.@footnote{The
cannam@127	238 term ``rank'' is commonly used in the APL, FORTRAN, and Common Lisp
cannam@127	239 traditions, although it is not so common in the C@tie{}world.} For
cannam@127	240 example, we say that a 2d transform has rank@tie{}2, a 3d transform has
cannam@127	241 rank@tie{}3, and so on. You can plan transforms of arbitrary rank by
cannam@127	242 means of the following function:
cannam@127	243
cannam@127	244 @example
cannam@127	245 fftw_plan fftw_plan_dft(int rank, const int *n,
cannam@127	246 fftw_complex in, fftw_complex out,
cannam@127	247 int sign, unsigned flags);
cannam@127	248 @end example
cannam@127	249 @findex fftw_plan_dft
cannam@127	250
cannam@127	251 Here, @code{n} is a pointer to an array @code{n[rank]} denoting an
cannam@127	252 @code{n[0]} by @code{n[1]} by @dots{} by @code{n[rank-1]} transform.
cannam@127	253 Thus, for example, the call
cannam@127	254 @example
cannam@127	255 fftw_plan_dft_2d(n0, n1, in, out, sign, flags);
cannam@127	256 @end example
cannam@127	257 is equivalent to the following code fragment:
cannam@127	258 @example
cannam@127	259 int n[2];
cannam@127	260 n[0] = n0;
cannam@127	261 n[1] = n1;
cannam@127	262 fftw_plan_dft(2, n, in, out, sign, flags);
cannam@127	263 @end example
cannam@127	264 @code{fftw_plan_dft} is not restricted to 2d and 3d transforms,
cannam@127	265 however, but it can plan transforms of arbitrary rank.
cannam@127	266
cannam@127	267 You may have noticed that all the planner routines described so far
cannam@127	268 have overlapping functionality. For example, you can plan a 1d or 2d
cannam@127	269 transform by using @code{fftw_plan_dft} with a @code{rank} of @code{1}
cannam@127	270 or @code{2}, or even by calling @code{fftw_plan_dft_3d} with @code{n0}
cannam@127	271 and/or @code{n1} equal to @code{1} (with no loss in efficiency). This
cannam@127	272 pattern continues, and FFTW's planning routines in general form a
cannam@127	273 ``partial order,'' sequences of
cannam@127	274 @cindex partial order
cannam@127	275 interfaces with strictly increasing generality but correspondingly
cannam@127	276 greater complexity.
cannam@127	277
cannam@127	278 @code{fftw_plan_dft} is the most general complex-DFT routine that we
cannam@127	279 describe in this tutorial, but there are also the advanced and guru interfaces,
cannam@127	280 @cindex advanced interface
cannam@127	281 @cindex guru interface
cannam@127	282 which allow one to efficiently combine multiple/strided transforms
cannam@127	283 into a single FFTW plan, transform a subset of a larger
cannam@127	284 multi-dimensional array, and/or to handle more general complex-number
cannam@127	285 formats. For more information, see @ref{FFTW Reference}.
cannam@127	286
cannam@127	287 @c ------------------------------------------------------------
cannam@127	288 @node One-Dimensional DFTs of Real Data, Multi-Dimensional DFTs of Real Data, Complex Multi-Dimensional DFTs, Tutorial
cannam@127	289 @section One-Dimensional DFTs of Real Data
cannam@127	290
cannam@127	291 In many practical applications, the input data @code{in[i]} are purely
cannam@127	292 real numbers, in which case the DFT output satisfies the ``Hermitian''
cannam@127	293 @cindex Hermitian
cannam@127	294 redundancy: @code{out[i]} is the conjugate of @code{out[n-i]}. It is
cannam@127	295 possible to take advantage of these circumstances in order to achieve
cannam@127	296 roughly a factor of two improvement in both speed and memory usage.
cannam@127	297
cannam@127	298 In exchange for these speed and space advantages, the user sacrifices
cannam@127	299 some of the simplicity of FFTW's complex transforms. First of all, the
cannam@127	300 input and output arrays are of @emph{different sizes and types}: the
cannam@127	301 input is @code{n} real numbers, while the output is @code{n/2+1}
cannam@127	302 complex numbers (the non-redundant outputs); this also requires slight
cannam@127	303 ``padding'' of the input array for
cannam@127	304 @cindex padding
cannam@127	305 in-place transforms. Second, the inverse transform (complex to real)
cannam@127	306 has the side-effect of @emph{overwriting its input array}, by default.
cannam@127	307 Neither of these inconveniences should pose a serious problem for
cannam@127	308 users, but it is important to be aware of them.
cannam@127	309
cannam@127	310 The routines to perform real-data transforms are almost the same as
cannam@127	311 those for complex transforms: you allocate arrays of @code{double}
cannam@127	312 and/or @code{fftw_complex} (preferably using @code{fftw_malloc} or
cannam@127	313 @code{fftw_alloc_complex}), create an @code{fftw_plan}, execute it as
cannam@127	314 many times as you want with @code{fftw_execute(plan)}, and clean up
cannam@127	315 with @code{fftw_destroy_plan(plan)} (and @code{fftw_free}). The only
cannam@127	316 differences are that the input (or output) is of type @code{double}
cannam@127	317 and there are new routines to create the plan. In one dimension:
cannam@127	318
cannam@127	319 @example
cannam@127	320 fftw_plan fftw_plan_dft_r2c_1d(int n, double in, fftw_complex out,
cannam@127	321 unsigned flags);
cannam@127	322 fftw_plan fftw_plan_dft_c2r_1d(int n, fftw_complex in, double out,
cannam@127	323 unsigned flags);
cannam@127	324 @end example
cannam@127	325 @findex fftw_plan_dft_r2c_1d
cannam@127	326 @findex fftw_plan_dft_c2r_1d
cannam@127	327
cannam@127	328 for the real input to complex-Hermitian output (@dfn{r2c}) and
cannam@127	329 complex-Hermitian input to real output (@dfn{c2r}) transforms.
cannam@127	330 @cindex r2c
cannam@127	331 @cindex c2r
cannam@127	332 Unlike the complex DFT planner, there is no @code{sign} argument.
cannam@127	333 Instead, r2c DFTs are always @code{FFTW_FORWARD} and c2r DFTs are
cannam@127	334 always @code{FFTW_BACKWARD}.
cannam@127	335 @ctindex FFTW_FORWARD
cannam@127	336 @ctindex FFTW_BACKWARD
cannam@127	337 (For single/long-double precision
cannam@127	338 @code{fftwf} and @code{fftwl}, @code{double} should be replaced by
cannam@127	339 @code{float} and @code{long double}, respectively.)
cannam@127	340 @cindex precision
cannam@127	341
cannam@127	342
cannam@127	343 Here, @code{n} is the ``logical'' size of the DFT, not necessarily the
cannam@127	344 physical size of the array. In particular, the real (@code{double})
cannam@127	345 array has @code{n} elements, while the complex (@code{fftw_complex})
cannam@127	346 array has @code{n/2+1} elements (where the division is rounded down).
cannam@127	347 For an in-place transform,
cannam@127	348 @cindex in-place
cannam@127	349 @code{in} and @code{out} are aliased to the same array, which must be
cannam@127	350 big enough to hold both; so, the real array would actually have
cannam@127	351 @code{2*(n/2+1)} elements, where the elements beyond the first
cannam@127	352 @code{n} are unused padding. (Note that this is very different from
cannam@127	353 the concept of ``zero-padding'' a transform to a larger length, which
cannam@127	354 changes the logical size of the DFT by actually adding new input
cannam@127	355 data.) The @math{k}th element of the complex array is exactly the
cannam@127	356 same as the @math{k}th element of the corresponding complex DFT. All
cannam@127	357 positive @code{n} are supported; products of small factors are most
cannam@127	358 efficient, but an @Onlogn algorithm is used even for prime sizes.
cannam@127	359
cannam@127	360 As noted above, the c2r transform destroys its input array even for
cannam@127	361 out-of-place transforms. This can be prevented, if necessary, by
cannam@127	362 including @code{FFTW_PRESERVE_INPUT} in the @code{flags}, with
cannam@127	363 unfortunately some sacrifice in performance.
cannam@127	364 @cindex flags
cannam@127	365 @ctindex FFTW_PRESERVE_INPUT
cannam@127	366 This flag is also not currently supported for multi-dimensional real
cannam@127	367 DFTs (next section).
cannam@127	368
cannam@127	369 Readers familiar with DFTs of real data will recall that the 0th (the
cannam@127	370 ``DC'') and @code{n/2}-th (the ``Nyquist'' frequency, when @code{n} is
cannam@127	371 even) elements of the complex output are purely real. Some
cannam@127	372 implementations therefore store the Nyquist element where the DC
cannam@127	373 imaginary part would go, in order to make the input and output arrays
cannam@127	374 the same size. Such packing, however, does not generalize well to
cannam@127	375 multi-dimensional transforms, and the space savings are miniscule in
cannam@127	376 any case; FFTW does not support it.
cannam@127	377
cannam@127	378 An alternative interface for one-dimensional r2c and c2r DFTs can be
cannam@127	379 found in the @samp{r2r} interface (@pxref{The Halfcomplex-format
cannam@127	380 DFT}), with ``halfcomplex''-format output that @emph{is} the same size
cannam@127	381 (and type) as the input array.
cannam@127	382 @cindex halfcomplex format
cannam@127	383 That interface, although it is not very useful for multi-dimensional
cannam@127	384 transforms, may sometimes yield better performance.
cannam@127	385
cannam@127	386 @c ------------------------------------------------------------
cannam@127	387 @node Multi-Dimensional DFTs of Real Data, More DFTs of Real Data, One-Dimensional DFTs of Real Data, Tutorial
cannam@127	388 @section Multi-Dimensional DFTs of Real Data
cannam@127	389
cannam@127	390 Multi-dimensional DFTs of real data use the following planner routines:
cannam@127	391
cannam@127	392 @example
cannam@127	393 fftw_plan fftw_plan_dft_r2c_2d(int n0, int n1,
cannam@127	394 double in, fftw_complex out,
cannam@127	395 unsigned flags);
cannam@127	396 fftw_plan fftw_plan_dft_r2c_3d(int n0, int n1, int n2,
cannam@127	397 double in, fftw_complex out,
cannam@127	398 unsigned flags);
cannam@127	399 fftw_plan fftw_plan_dft_r2c(int rank, const int *n,
cannam@127	400 double in, fftw_complex out,
cannam@127	401 unsigned flags);
cannam@127	402 @end example
cannam@127	403 @findex fftw_plan_dft_r2c_2d
cannam@127	404 @findex fftw_plan_dft_r2c_3d
cannam@127	405 @findex fftw_plan_dft_r2c
cannam@127	406
cannam@127	407 as well as the corresponding @code{c2r} routines with the input/output
cannam@127	408 types swapped. These routines work similarly to their complex
cannam@127	409 analogues, except for the fact that here the complex output array is cut
cannam@127	410 roughly in half and the real array requires padding for in-place
cannam@127	411 transforms (as in 1d, above).
cannam@127	412
cannam@127	413 As before, @code{n} is the logical size of the array, and the
cannam@127	414 consequences of this on the the format of the complex arrays deserve
cannam@127	415 careful attention.
cannam@127	416 @cindex r2c/c2r multi-dimensional array format
cannam@127	417 Suppose that the real data has dimensions @ndims (in row-major order).
cannam@127	418 Then, after an r2c transform, the output is an @ndimshalf array of
cannam@127	419 @code{fftw_complex} values in row-major order, corresponding to slightly
cannam@127	420 over half of the output of the corresponding complex DFT. (The division
cannam@127	421 is rounded down.) The ordering of the data is otherwise exactly the
cannam@127	422 same as in the complex-DFT case.
cannam@127	423
cannam@127	424 For out-of-place transforms, this is the end of the story: the real
cannam@127	425 data is stored as a row-major array of size @ndims and the complex
cannam@127	426 data is stored as a row-major array of size @ndimshalf{}.
cannam@127	427
cannam@127	428 For in-place transforms, however, extra padding of the real-data array
cannam@127	429 is necessary because the complex array is larger than the real array,
cannam@127	430 and the two arrays share the same memory locations. Thus, for
cannam@127	431 in-place transforms, the final dimension of the real-data array must
cannam@127	432 be padded with extra values to accommodate the size of the complex
cannam@127	433 data---two values if the last dimension is even and one if it is odd.
cannam@127	434 @cindex padding
cannam@127	435 That is, the last dimension of the real data must physically contain
cannam@127	436 @tex
cannam@127	437 $2 (n_{d-1}/2+1)$
cannam@127	438 @end tex
cannam@127	439 @ifinfo
cannam@127	440 2 * (n[d-1]/2+1)
cannam@127	441 @end ifinfo
cannam@127	442 @html
cannam@127	443 2 * (n<sub>d-1</sub>/2+1)
cannam@127	444 @end html
cannam@127	445 @code{double} values (exactly enough to hold the complex data).
cannam@127	446 This physical array size does not, however, change the @emph{logical}
cannam@127	447 array size---only
cannam@127	448 @tex
cannam@127	449 $n_{d-1}$
cannam@127	450 @end tex
cannam@127	451 @ifinfo
cannam@127	452 n[d-1]
cannam@127	453 @end ifinfo
cannam@127	454 @html
cannam@127	455 n<sub>d-1</sub>
cannam@127	456 @end html
cannam@127	457 values are actually stored in the last dimension, and
cannam@127	458 @tex
cannam@127	459 $n_{d-1}$
cannam@127	460 @end tex
cannam@127	461 @ifinfo
cannam@127	462 n[d-1]
cannam@127	463 @end ifinfo
cannam@127	464 @html
cannam@127	465 n<sub>d-1</sub>
cannam@127	466 @end html
cannam@127	467 is the last dimension passed to the plan-creation routine.
cannam@127	468
cannam@127	469 For example, consider the transform of a two-dimensional real array of
cannam@127	470 size @code{n0} by @code{n1}. The output of the r2c transform is a
cannam@127	471 two-dimensional complex array of size @code{n0} by @code{n1/2+1}, where
cannam@127	472 the @code{y} dimension has been cut nearly in half because of
cannam@127	473 redundancies in the output. Because @code{fftw_complex} is twice the
cannam@127	474 size of @code{double}, the output array is slightly bigger than the
cannam@127	475 input array. Thus, if we want to compute the transform in place, we
cannam@127	476 must @emph{pad} the input array so that it is of size @code{n0} by
cannam@127	477 @code{2*(n1/2+1)}. If @code{n1} is even, then there are two padding
cannam@127	478 elements at the end of each row (which need not be initialized, as they
cannam@127	479 are only used for output).
cannam@127	480
cannam@127	481 @ifhtml
cannam@127	482 The following illustration depicts the input and output arrays just
cannam@127	483 described, for both the out-of-place and in-place transforms (with the
cannam@127	484 arrows indicating consecutive memory locations):
cannam@127	485 @image{rfftwnd-for-html}
cannam@127	486 @end ifhtml
cannam@127	487 @ifnotinfo
cannam@127	488 @ifnothtml
cannam@127	489 @float Figure,fig:rfftwnd
cannam@127	490 @center @image{rfftwnd}
cannam@127	491 @caption{Illustration of the data layout for a 2d @code{nx} by @code{ny}
cannam@127	492 real-to-complex transform.}
cannam@127	493 @end float
cannam@127	494 @ref{fig:rfftwnd} depicts the input and output arrays just
cannam@127	495 described, for both the out-of-place and in-place transforms (with the
cannam@127	496 arrows indicating consecutive memory locations):
cannam@127	497 @end ifnothtml
cannam@127	498 @end ifnotinfo
cannam@127	499
cannam@127	500 These transforms are unnormalized, so an r2c followed by a c2r
cannam@127	501 transform (or vice versa) will result in the original data scaled by
cannam@127	502 the number of real data elements---that is, the product of the
cannam@127	503 (logical) dimensions of the real data.
cannam@127	504 @cindex normalization
cannam@127	505
cannam@127	506
cannam@127	507 (Because the last dimension is treated specially, if it is equal to
cannam@127	508 @code{1} the transform is @emph{not} equivalent to a lower-dimensional
cannam@127	509 r2c/c2r transform. In that case, the last complex dimension also has
cannam@127	510 size @code{1} (@code{=1/2+1}), and no advantage is gained over the
cannam@127	511 complex transforms.)
cannam@127	512
cannam@127	513 @c ------------------------------------------------------------
cannam@127	514 @node More DFTs of Real Data, , Multi-Dimensional DFTs of Real Data, Tutorial
cannam@127	515 @section More DFTs of Real Data
cannam@127	516 @menu
cannam@127	517 * The Halfcomplex-format DFT::
cannam@127	518 * Real even/odd DFTs (cosine/sine transforms)::
cannam@127	519 * The Discrete Hartley Transform::
cannam@127	520 @end menu
cannam@127	521
cannam@127	522 FFTW supports several other transform types via a unified @dfn{r2r}
cannam@127	523 (real-to-real) interface,
cannam@127	524 @cindex r2r
cannam@127	525 so called because it takes a real (@code{double}) array and outputs a
cannam@127	526 real array of the same size. These r2r transforms currently fall into
cannam@127	527 three categories: DFTs of real input and complex-Hermitian output in
cannam@127	528 halfcomplex format, DFTs of real input with even/odd symmetry
cannam@127	529 (a.k.a. discrete cosine/sine transforms, DCTs/DSTs), and discrete
cannam@127	530 Hartley transforms (DHTs), all described in more detail by the
cannam@127	531 following sections.
cannam@127	532
cannam@127	533 The r2r transforms follow the by now familiar interface of creating an
cannam@127	534 @code{fftw_plan}, executing it with @code{fftw_execute(plan)}, and
cannam@127	535 destroying it with @code{fftw_destroy_plan(plan)}. Furthermore, all
cannam@127	536 r2r transforms share the same planner interface:
cannam@127	537
cannam@127	538 @example
cannam@127	539 fftw_plan fftw_plan_r2r_1d(int n, double in, double out,
cannam@127	540 fftw_r2r_kind kind, unsigned flags);
cannam@127	541 fftw_plan fftw_plan_r2r_2d(int n0, int n1, double in, double out,
cannam@127	542 fftw_r2r_kind kind0, fftw_r2r_kind kind1,
cannam@127	543 unsigned flags);
cannam@127	544 fftw_plan fftw_plan_r2r_3d(int n0, int n1, int n2,
cannam@127	545 double in, double out,
cannam@127	546 fftw_r2r_kind kind0,
cannam@127	547 fftw_r2r_kind kind1,
cannam@127	548 fftw_r2r_kind kind2,
cannam@127	549 unsigned flags);
cannam@127	550 fftw_plan fftw_plan_r2r(int rank, const int n, double in, double *out,
cannam@127	551 const fftw_r2r_kind *kind, unsigned flags);
cannam@127	552 @end example
cannam@127	553 @findex fftw_plan_r2r_1d
cannam@127	554 @findex fftw_plan_r2r_2d
cannam@127	555 @findex fftw_plan_r2r_3d
cannam@127	556 @findex fftw_plan_r2r
cannam@127	557
cannam@127	558 Just as for the complex DFT, these plan 1d/2d/3d/multi-dimensional
cannam@127	559 transforms for contiguous arrays in row-major order, transforming (real)
cannam@127	560 input to output of the same size, where @code{n} specifies the
cannam@127	561 @emph{physical} dimensions of the arrays. All positive @code{n} are
cannam@127	562 supported (with the exception of @code{n=1} for the @code{FFTW_REDFT00}
cannam@127	563 kind, noted in the real-even subsection below); products of small
cannam@127	564 factors are most efficient (factorizing @code{n-1} and @code{n+1} for
cannam@127	565 @code{FFTW_REDFT00} and @code{FFTW_RODFT00} kinds, described below), but
cannam@127	566 an @Onlogn algorithm is used even for prime sizes.
cannam@127	567
cannam@127	568 Each dimension has a @dfn{kind} parameter, of type
cannam@127	569 @code{fftw_r2r_kind}, specifying the kind of r2r transform to be used
cannam@127	570 for that dimension.
cannam@127	571 @cindex kind (r2r)
cannam@127	572 @tindex fftw_r2r_kind
cannam@127	573 (In the case of @code{fftw_plan_r2r}, this is an array @code{kind[rank]}
cannam@127	574 where @code{kind[i]} is the transform kind for the dimension
cannam@127	575 @code{n[i]}.) The kind can be one of a set of predefined constants,
cannam@127	576 defined in the following subsections.
cannam@127	577
cannam@127	578 In other words, FFTW computes the separable product of the specified
cannam@127	579 r2r transforms over each dimension, which can be used e.g. for partial
cannam@127	580 differential equations with mixed boundary conditions. (For some r2r
cannam@127	581 kinds, notably the halfcomplex DFT and the DHT, such a separable
cannam@127	582 product is somewhat problematic in more than one dimension, however,
cannam@127	583 as is described below.)
cannam@127	584
cannam@127	585 In the current version of FFTW, all r2r transforms except for the
cannam@127	586 halfcomplex type are computed via pre- or post-processing of
cannam@127	587 halfcomplex transforms, and they are therefore not as fast as they
cannam@127	588 could be. Since most other general DCT/DST codes employ a similar
cannam@127	589 algorithm, however, FFTW's implementation should provide at least
cannam@127	590 competitive performance.
cannam@127	591
cannam@127	592 @c =========>
cannam@127	593 @node The Halfcomplex-format DFT, Real even/odd DFTs (cosine/sine transforms), More DFTs of Real Data, More DFTs of Real Data
cannam@127	594 @subsection The Halfcomplex-format DFT
cannam@127	595
cannam@127	596 An r2r kind of @code{FFTW_R2HC} (@dfn{r2hc}) corresponds to an r2c DFT
cannam@127	597 @ctindex FFTW_R2HC
cannam@127	598 @cindex r2c
cannam@127	599 @cindex r2hc
cannam@127	600 (@pxref{One-Dimensional DFTs of Real Data}) but with ``halfcomplex''
cannam@127	601 format output, and may sometimes be faster and/or more convenient than
cannam@127	602 the latter.
cannam@127	603 @cindex halfcomplex format
cannam@127	604 The inverse @dfn{hc2r} transform is of kind @code{FFTW_HC2R}.
cannam@127	605 @ctindex FFTW_HC2R
cannam@127	606 @cindex hc2r
cannam@127	607 This consists of the non-redundant half of the complex output for a 1d
cannam@127	608 real-input DFT of size @code{n}, stored as a sequence of @code{n} real
cannam@127	609 numbers (@code{double}) in the format:
cannam@127	610
cannam@127	611 @tex
cannam@127	612 $$
cannam@127	613 r_0, r_1, r_2, \ldots, r_{n/2}, i_{(n+1)/2-1}, \ldots, i_2, i_1
cannam@127	614 $$
cannam@127	615 @end tex
cannam@127	616 @ifinfo
cannam@127	617 r0, r1, r2, r(n/2), i((n+1)/2-1), ..., i2, i1
cannam@127	618 @end ifinfo
cannam@127	619 @html
cannam@127	620 <p align=center>
cannam@127	621 r<sub>0</sub>, r<sub>1</sub>, r<sub>2</sub>, ..., r<sub>n/2</sub>, i<sub>(n+1)/2-1</sub>, ..., i<sub>2</sub>, i<sub>1</sub>
cannam@127	622 </p>
cannam@127	623 @end html
cannam@127	624
cannam@127	625 Here,
cannam@127	626 @ifinfo
cannam@127	627 rk
cannam@127	628 @end ifinfo
cannam@127	629 @tex
cannam@127	630 $r_k$
cannam@127	631 @end tex
cannam@127	632 @html
cannam@127	633 r<sub>k</sub>
cannam@127	634 @end html
cannam@127	635 is the real part of the @math{k}th output, and
cannam@127	636 @ifinfo
cannam@127	637 ik
cannam@127	638 @end ifinfo
cannam@127	639 @tex
cannam@127	640 $i_k$
cannam@127	641 @end tex
cannam@127	642 @html
cannam@127	643 i<sub>k</sub>
cannam@127	644 @end html
cannam@127	645 is the imaginary part. (Division by 2 is rounded down.) For a
cannam@127	646 halfcomplex array @code{hc[n]}, the @math{k}th component thus has its
cannam@127	647 real part in @code{hc[k]} and its imaginary part in @code{hc[n-k]}, with
cannam@127	648 the exception of @code{k} @code{==} @code{0} or @code{n/2} (the latter
cannam@127	649 only if @code{n} is even)---in these two cases, the imaginary part is
cannam@127	650 zero due to symmetries of the real-input DFT, and is not stored.
cannam@127	651 Thus, the r2hc transform of @code{n} real values is a halfcomplex array of
cannam@127	652 length @code{n}, and vice versa for hc2r.
cannam@127	653 @cindex normalization
cannam@127	654
cannam@127	655
cannam@127	656 Aside from the differing format, the output of
cannam@127	657 @code{FFTW_R2HC}/@code{FFTW_HC2R} is otherwise exactly the same as for
cannam@127	658 the corresponding 1d r2c/c2r transform
cannam@127	659 (i.e. @code{FFTW_FORWARD}/@code{FFTW_BACKWARD} transforms, respectively).
cannam@127	660 Recall that these transforms are unnormalized, so r2hc followed by hc2r
cannam@127	661 will result in the original data multiplied by @code{n}. Furthermore,
cannam@127	662 like the c2r transform, an out-of-place hc2r transform will
cannam@127	663 @emph{destroy its input} array.
cannam@127	664
cannam@127	665 Although these halfcomplex transforms can be used with the
cannam@127	666 multi-dimensional r2r interface, the interpretation of such a separable
cannam@127	667 product of transforms along each dimension is problematic. For example,
cannam@127	668 consider a two-dimensional @code{n0} by @code{n1}, r2hc by r2hc
cannam@127	669 transform planned by @code{fftw_plan_r2r_2d(n0, n1, in, out, FFTW_R2HC,
cannam@127	670 FFTW_R2HC, FFTW_MEASURE)}. Conceptually, FFTW first transforms the rows
cannam@127	671 (of size @code{n1}) to produce halfcomplex rows, and then transforms the
cannam@127	672 columns (of size @code{n0}). Half of these column transforms, however,
cannam@127	673 are of imaginary parts, and should therefore be multiplied by @math{i}
cannam@127	674 and combined with the r2hc transforms of the real columns to produce the
cannam@127	675 2d DFT amplitudes; FFTW's r2r transform does @emph{not} perform this
cannam@127	676 combination for you. Thus, if a multi-dimensional real-input/output DFT
cannam@127	677 is required, we recommend using the ordinary r2c/c2r
cannam@127	678 interface (@pxref{Multi-Dimensional DFTs of Real Data}).
cannam@127	679
cannam@127	680 @c =========>
cannam@127	681 @node Real even/odd DFTs (cosine/sine transforms), The Discrete Hartley Transform, The Halfcomplex-format DFT, More DFTs of Real Data
cannam@127	682 @subsection Real even/odd DFTs (cosine/sine transforms)
cannam@127	683
cannam@127	684 The Fourier transform of a real-even function @math{f(-x) = f(x)} is
cannam@127	685 real-even, and @math{i} times the Fourier transform of a real-odd
cannam@127	686 function @math{f(-x) = -f(x)} is real-odd. Similar results hold for a
cannam@127	687 discrete Fourier transform, and thus for these symmetries the need for
cannam@127	688 complex inputs/outputs is entirely eliminated. Moreover, one gains a
cannam@127	689 factor of two in speed/space from the fact that the data are real, and
cannam@127	690 an additional factor of two from the even/odd symmetry: only the
cannam@127	691 non-redundant (first) half of the array need be stored. The result is
cannam@127	692 the real-even DFT (@dfn{REDFT}) and the real-odd DFT (@dfn{RODFT}), also
cannam@127	693 known as the discrete cosine and sine transforms (@dfn{DCT} and
cannam@127	694 @dfn{DST}), respectively.
cannam@127	695 @cindex real-even DFT
cannam@127	696 @cindex REDFT
cannam@127	697 @cindex real-odd DFT
cannam@127	698 @cindex RODFT
cannam@127	699 @cindex discrete cosine transform
cannam@127	700 @cindex DCT
cannam@127	701 @cindex discrete sine transform
cannam@127	702 @cindex DST
cannam@127	703
cannam@127	704
cannam@127	705 (In this section, we describe the 1d transforms; multi-dimensional
cannam@127	706 transforms are just a separable product of these transforms operating
cannam@127	707 along each dimension.)
cannam@127	708
cannam@127	709 Because of the discrete sampling, one has an additional choice: is the
cannam@127	710 data even/odd around a sampling point, or around the point halfway
cannam@127	711 between two samples? The latter corresponds to @emph{shifting} the
cannam@127	712 samples by @emph{half} an interval, and gives rise to several transform
cannam@127	713 variants denoted by REDFT@math{ab} and RODFT@math{ab}: @math{a} and
cannam@127	714 @math{b} are @math{0} or @math{1}, and indicate whether the input
cannam@127	715 (@math{a}) and/or output (@math{b}) are shifted by half a sample
cannam@127	716 (@math{1} means it is shifted). These are also known as types I-IV of
cannam@127	717 the DCT and DST, and all four types are supported by FFTW's r2r
cannam@127	718 interface.@footnote{There are also type V-VIII transforms, which
cannam@127	719 correspond to a logical DFT of @emph{odd} size @math{N}, independent of
cannam@127	720 whether the physical size @code{n} is odd, but we do not support these
cannam@127	721 variants.}
cannam@127	722
cannam@127	723 The r2r kinds for the various REDFT and RODFT types supported by FFTW,
cannam@127	724 along with the boundary conditions at both ends of the @emph{input}
cannam@127	725 array (@code{n} real numbers @code{in[j=0..n-1]}), are:
cannam@127	726
cannam@127	727 @itemize @bullet
cannam@127	728
cannam@127	729 @item
cannam@127	730 @code{FFTW_REDFT00} (DCT-I): even around @math{j=0} and even around @math{j=n-1}.
cannam@127	731 @ctindex FFTW_REDFT00
cannam@127	732
cannam@127	733 @item
cannam@127	734 @code{FFTW_REDFT10} (DCT-II, ``the'' DCT): even around @math{j=-0.5} and even around @math{j=n-0.5}.
cannam@127	735 @ctindex FFTW_REDFT10
cannam@127	736
cannam@127	737 @item
cannam@127	738 @code{FFTW_REDFT01} (DCT-III, ``the'' IDCT): even around @math{j=0} and odd around @math{j=n}.
cannam@127	739 @ctindex FFTW_REDFT01
cannam@127	740 @cindex IDCT
cannam@127	741
cannam@127	742 @item
cannam@127	743 @code{FFTW_REDFT11} (DCT-IV): even around @math{j=-0.5} and odd around @math{j=n-0.5}.
cannam@127	744 @ctindex FFTW_REDFT11
cannam@127	745
cannam@127	746 @item
cannam@127	747 @code{FFTW_RODFT00} (DST-I): odd around @math{j=-1} and odd around @math{j=n}.
cannam@127	748 @ctindex FFTW_RODFT00
cannam@127	749
cannam@127	750 @item
cannam@127	751 @code{FFTW_RODFT10} (DST-II): odd around @math{j=-0.5} and odd around @math{j=n-0.5}.
cannam@127	752 @ctindex FFTW_RODFT10
cannam@127	753
cannam@127	754 @item
cannam@127	755 @code{FFTW_RODFT01} (DST-III): odd around @math{j=-1} and even around @math{j=n-1}.
cannam@127	756 @ctindex FFTW_RODFT01
cannam@127	757
cannam@127	758 @item
cannam@127	759 @code{FFTW_RODFT11} (DST-IV): odd around @math{j=-0.5} and even around @math{j=n-0.5}.
cannam@127	760 @ctindex FFTW_RODFT11
cannam@127	761
cannam@127	762 @end itemize
cannam@127	763
cannam@127	764 Note that these symmetries apply to the ``logical'' array being
cannam@127	765 transformed; @strong{there are no constraints on your physical input
cannam@127	766 data}. So, for example, if you specify a size-5 REDFT00 (DCT-I) of the
cannam@127	767 data @math{abcde}, it corresponds to the DFT of the logical even array
cannam@127	768 @math{abcdedcb} of size 8. A size-4 REDFT10 (DCT-II) of the data
cannam@127	769 @math{abcd} corresponds to the size-8 logical DFT of the even array
cannam@127	770 @math{abcddcba}, shifted by half a sample.
cannam@127	771
cannam@127	772 All of these transforms are invertible. The inverse of R*DFT00 is
cannam@127	773 RDFT00; of RDFT10 is R*DFT01 and vice versa (these are often called
cannam@127	774 simply ``the'' DCT and IDCT, respectively); and of RDFT11 is RDFT11.
cannam@127	775 However, the transforms computed by FFTW are unnormalized, exactly
cannam@127	776 like the corresponding real and complex DFTs, so computing a transform
cannam@127	777 followed by its inverse yields the original array scaled by @math{N},
cannam@127	778 where @math{N} is the @emph{logical} DFT size. For REDFT00,
cannam@127	779 @math{N=2(n-1)}; for RODFT00, @math{N=2(n+1)}; otherwise, @math{N=2n}.
cannam@127	780 @cindex normalization
cannam@127	781 @cindex IDCT
cannam@127	782
cannam@127	783
cannam@127	784 Note that the boundary conditions of the transform output array are
cannam@127	785 given by the input boundary conditions of the inverse transform.
cannam@127	786 Thus, the above transforms are all inequivalent in terms of
cannam@127	787 input/output boundary conditions, even neglecting the 0.5 shift
cannam@127	788 difference.
cannam@127	789
cannam@127	790 FFTW is most efficient when @math{N} is a product of small factors; note
cannam@127	791 that this @emph{differs} from the factorization of the physical size
cannam@127	792 @code{n} for REDFT00 and RODFT00! There is another oddity: @code{n=1}
cannam@127	793 REDFT00 transforms correspond to @math{N=0}, and so are @emph{not
cannam@127	794 defined} (the planner will return @code{NULL}). Otherwise, any positive
cannam@127	795 @code{n} is supported.
cannam@127	796
cannam@127	797 For the precise mathematical definitions of these transforms as used by
cannam@127	798 FFTW, see @ref{What FFTW Really Computes}. (For people accustomed to
cannam@127	799 the DCT/DST, FFTW's definitions have a coefficient of @math{2} in front
cannam@127	800 of the cos/sin functions so that they correspond precisely to an
cannam@127	801 even/odd DFT of size @math{N}. Some authors also include additional
cannam@127	802 multiplicative factors of
cannam@127	803 @ifinfo
cannam@127	804 sqrt(2)
cannam@127	805 @end ifinfo
cannam@127	806 @html
cannam@127	807 √2
cannam@127	808 @end html
cannam@127	809 @tex
cannam@127	810 $\sqrt{2}$
cannam@127	811 @end tex
cannam@127	812 for selected inputs and outputs; this makes
cannam@127	813 the transform orthogonal, but sacrifices the direct equivalence to a
cannam@127	814 symmetric DFT.)
cannam@127	815
cannam@127	816 @subsubheading Which type do you need?
cannam@127	817
cannam@127	818 Since the required flavor of even/odd DFT depends upon your problem,
cannam@127	819 you are the best judge of this choice, but we can make a few comments
cannam@127	820 on relative efficiency to help you in your selection. In particular,
cannam@127	821 RDFT01 and RDFT10 tend to be slightly faster than R*DFT11
cannam@127	822 (especially for odd sizes), while the R*DFT00 transforms are sometimes
cannam@127	823 significantly slower (especially for even sizes).@footnote{R*DFT00 is
cannam@127	824 sometimes slower in FFTW because we discovered that the standard
cannam@127	825 algorithm for computing this by a pre/post-processed real DFT---the
cannam@127	826 algorithm used in FFTPACK, Numerical Recipes, and other sources for
cannam@127	827 decades now---has serious numerical problems: it already loses several
cannam@127	828 decimal places of accuracy for 16k sizes. There seem to be only two
cannam@127	829 alternatives in the literature that do not suffer similarly: a
cannam@127	830 recursive decomposition into smaller DCTs, which would require a large
cannam@127	831 set of codelets for efficiency and generality, or sacrificing a factor of
cannam@127	832 @tex
cannam@127	833 $\sim 2$
cannam@127	834 @end tex
cannam@127	835 @ifnottex
cannam@127	836 2
cannam@127	837 @end ifnottex
cannam@127	838 in speed to use a real DFT of twice the size. We currently
cannam@127	839 employ the latter technique for general @math{n}, as well as a limited
cannam@127	840 form of the former method: a split-radix decomposition when @math{n}
cannam@127	841 is odd (@math{N} a multiple of 4). For @math{N} containing many
cannam@127	842 factors of 2, the split-radix method seems to recover most of the
cannam@127	843 speed of the standard algorithm without the accuracy tradeoff.}
cannam@127	844
cannam@127	845 Thus, if only the boundary conditions on the transform inputs are
cannam@127	846 specified, we generally recommend RDFT10 over RDFT00 and R*DFT01 over
cannam@127	847 R*DFT11 (unless the half-sample shift or the self-inverse property is
cannam@127	848 significant for your problem).
cannam@127	849
cannam@127	850 If performance is important to you and you are using only small sizes
cannam@127	851 (say @math{n<200}), e.g. for multi-dimensional transforms, then you
cannam@127	852 might consider generating hard-coded transforms of those sizes and types
cannam@127	853 that you are interested in (@pxref{Generating your own code}).
cannam@127	854
cannam@127	855 We are interested in hearing what types of symmetric transforms you find
cannam@127	856 most useful.
cannam@127	857
cannam@127	858 @c =========>
cannam@127	859 @node The Discrete Hartley Transform, , Real even/odd DFTs (cosine/sine transforms), More DFTs of Real Data
cannam@127	860 @subsection The Discrete Hartley Transform
cannam@127	861
cannam@127	862 If you are planning to use the DHT because you've heard that it is
cannam@127	863 ``faster'' than the DFT (FFT), @strong{stop here}. The DHT is not
cannam@127	864 faster than the DFT. That story is an old but enduring misconception
cannam@127	865 that was debunked in 1987.
cannam@127	866
cannam@127	867 The discrete Hartley transform (DHT) is an invertible linear transform
cannam@127	868 closely related to the DFT. In the DFT, one multiplies each input by
cannam@127	869 @math{cos - i * sin} (a complex exponential), whereas in the DHT each
cannam@127	870 input is multiplied by simply @math{cos + sin}. Thus, the DHT
cannam@127	871 transforms @code{n} real numbers to @code{n} real numbers, and has the
cannam@127	872 convenient property of being its own inverse. In FFTW, a DHT (of any
cannam@127	873 positive @code{n}) can be specified by an r2r kind of @code{FFTW_DHT}.
cannam@127	874 @ctindex FFTW_DHT
cannam@127	875 @cindex discrete Hartley transform
cannam@127	876 @cindex DHT
cannam@127	877
cannam@127	878 Like the DFT, in FFTW the DHT is unnormalized, so computing a DHT of
cannam@127	879 size @code{n} followed by another DHT of the same size will result in
cannam@127	880 the original array multiplied by @code{n}.
cannam@127	881 @cindex normalization
cannam@127	882
cannam@127	883 The DHT was originally proposed as a more efficient alternative to the
cannam@127	884 DFT for real data, but it was subsequently shown that a specialized DFT
cannam@127	885 (such as FFTW's r2hc or r2c transforms) could be just as fast. In FFTW,
cannam@127	886 the DHT is actually computed by post-processing an r2hc transform, so
cannam@127	887 there is ordinarily no reason to prefer it from a performance
cannam@127	888 perspective.@footnote{We provide the DHT mainly as a byproduct of some
cannam@127	889 internal algorithms. FFTW computes a real input/output DFT of
cannam@127	890 @emph{prime} size by re-expressing it as a DHT plus post/pre-processing
cannam@127	891 and then using Rader's prime-DFT algorithm adapted to the DHT.}
cannam@127	892 However, we have heard rumors that the DHT might be the most appropriate
cannam@127	893 transform in its own right for certain applications, and we would be
cannam@127	894 very interested to hear from anyone who finds it useful.
cannam@127	895
cannam@127	896 If @code{FFTW_DHT} is specified for multiple dimensions of a
cannam@127	897 multi-dimensional transform, FFTW computes the separable product of 1d
cannam@127	898 DHTs along each dimension. Unfortunately, this is not quite the same
cannam@127	899 thing as a true multi-dimensional DHT; you can compute the latter, if
cannam@127	900 necessary, with at most @code{rank-1} post-processing passes
cannam@127	901 [see e.g. H. Hao and R. N. Bracewell, @i{Proc. IEEE} @b{75}, 264--266 (1987)].
cannam@127	902
cannam@127	903 For the precise mathematical definition of the DHT as used by FFTW, see
cannam@127	904 @ref{What FFTW Really Computes}.
cannam@127	905

Mercurial > hg > sv-dependency-builds

annotate src/fftw-3.3.5/doc/tutorial.texi @ 169:223a55898ab9 tip default