sv-dependency-builds: src/fftw-3.3.5/doc/other.texi annotate

annotate src/fftw-3.3.5/doc/other.texi @ 169:223a55898ab9 tip default

Add null config files

author	Chris Cannam <cannam@all-day-breakfast.com>
date	Mon, 02 Mar 2020 14:03:47 +0000
parents	7867fa7e1b6b
children

rev	line source
cannam@127	1 @node Other Important Topics, FFTW Reference, Tutorial, Top
cannam@127	2 @chapter Other Important Topics
cannam@127	3 @menu
cannam@127	4 * SIMD alignment and fftw_malloc::
cannam@127	5 * Multi-dimensional Array Format::
cannam@127	6 * Words of Wisdom-Saving Plans::
cannam@127	7 * Caveats in Using Wisdom::
cannam@127	8 @end menu
cannam@127	9
cannam@127	10 @c ------------------------------------------------------------
cannam@127	11 @node SIMD alignment and fftw_malloc, Multi-dimensional Array Format, Other Important Topics, Other Important Topics
cannam@127	12 @section SIMD alignment and fftw_malloc
cannam@127	13
cannam@127	14 SIMD, which stands for ``Single Instruction Multiple Data,'' is a set of
cannam@127	15 special operations supported by some processors to perform a single
cannam@127	16 operation on several numbers (usually 2 or 4) simultaneously. SIMD
cannam@127	17 floating-point instructions are available on several popular CPUs:
cannam@127	18 SSE/SSE2/AVX/AVX2/AVX512/KCVI on some x86/x86-64 processors, AltiVec and
cannam@127	19 VSX on some POWER/PowerPCs, NEON on some ARM models. FFTW can be
cannam@127	20 compiled to support the SIMD instructions on any of these systems.
cannam@127	21 @cindex SIMD
cannam@127	22 @cindex SSE
cannam@127	23 @cindex SSE2
cannam@127	24 @cindex AVX
cannam@127	25 @cindex AVX2
cannam@127	26 @cindex AVX512
cannam@127	27 @cindex AltiVec
cannam@127	28 @cindex VSX
cannam@127	29 @cindex precision
cannam@127	30
cannam@127	31
cannam@127	32 A program linking to an FFTW library compiled with SIMD support can
cannam@127	33 obtain a nonnegligible speedup for most complex and r2c/c2r
cannam@127	34 transforms. In order to obtain this speedup, however, the arrays of
cannam@127	35 complex (or real) data passed to FFTW must be specially aligned in
cannam@127	36 memory (typically 16-byte aligned), and often this alignment is more
cannam@127	37 stringent than that provided by the usual @code{malloc} (etc.)
cannam@127	38 allocation routines.
cannam@127	39
cannam@127	40 @cindex portability
cannam@127	41 In order to guarantee proper alignment for SIMD, therefore, in case
cannam@127	42 your program is ever linked against a SIMD-using FFTW, we recommend
cannam@127	43 allocating your transform data with @code{fftw_malloc} and
cannam@127	44 de-allocating it with @code{fftw_free}.
cannam@127	45 @findex fftw_malloc
cannam@127	46 @findex fftw_free
cannam@127	47 These have exactly the same interface and behavior as
cannam@127	48 @code{malloc}/@code{free}, except that for a SIMD FFTW they ensure
cannam@127	49 that the returned pointer has the necessary alignment (by calling
cannam@127	50 @code{memalign} or its equivalent on your OS).
cannam@127	51
cannam@127	52 You are not @emph{required} to use @code{fftw_malloc}. You can
cannam@127	53 allocate your data in any way that you like, from @code{malloc} to
cannam@127	54 @code{new} (in C++) to a fixed-size array declaration. If the array
cannam@127	55 happens not to be properly aligned, FFTW will not use the SIMD
cannam@127	56 extensions.
cannam@127	57 @cindex C++
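As a rough illustration of the alignment requirement, the following sketch tests whether a pointer falls on a given byte boundary. The helper @code{is_aligned} is a hypothetical name, not part of FFTW, and the 16-byte figure is only the typical case mentioned above; a particular SIMD build may need a stricter alignment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>   /* aligned_alloc and free, used in the usage note below */

/* Hypothetical helper (not an FFTW function): returns 1 if the address p
   is a multiple of `alignment` bytes, which must be a power of two. */
static int is_aligned(const void *p, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return ((uintptr_t)p & (alignment - 1)) == 0;
}
```

A buffer obtained from the C11 routine @code{aligned_alloc(16, 64)} passes @code{is_aligned(p, 16)}, whereas the same pointer advanced by 8 bytes does not.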
cannam@127	58
cannam@127	59 @findex fftw_alloc_real
cannam@127	60 @findex fftw_alloc_complex
cannam@127	61 Since @code{fftw_malloc} only ever needs to be used for real and
cannam@127	62 complex arrays, we provide two convenient wrapper routines
cannam@127	63 @code{fftw_alloc_real(N)} and @code{fftw_alloc_complex(N)} that are
cannam@127	64 equivalent to @code{(double *) fftw_malloc(sizeof(double) * N)} and
cannam@127	65 @code{(fftw_complex *) fftw_malloc(sizeof(fftw_complex) * N)},
cannam@127	66 respectively (or their equivalents in other precisions).
cannam@127	67
cannam@127	68 @c ------------------------------------------------------------
cannam@127	69 @node Multi-dimensional Array Format, Words of Wisdom-Saving Plans, SIMD alignment and fftw_malloc, Other Important Topics
cannam@127	70 @section Multi-dimensional Array Format
cannam@127	71
cannam@127	72 This section describes the format in which multi-dimensional arrays
cannam@127	73 are stored in FFTW. We discuss this topic in detail because
cannam@127	74 several different formats are in common use, which is often
cannam@127	75 a source of confusion.
cannam@127	76
cannam@127	77 @menu
cannam@127	78 * Row-major Format::
cannam@127	79 * Column-major Format::
cannam@127	80 * Fixed-size Arrays in C::
cannam@127	81 * Dynamic Arrays in C::
cannam@127	82 * Dynamic Arrays in C-The Wrong Way::
cannam@127	83 @end menu
cannam@127	84
cannam@127	85 @c =========>
cannam@127	86 @node Row-major Format, Column-major Format, Multi-dimensional Array Format, Multi-dimensional Array Format
cannam@127	87 @subsection Row-major Format
cannam@127	88 @cindex row-major
cannam@127	89
cannam@127	90 The multi-dimensional arrays passed to @code{fftw_plan_dft} etcetera
cannam@127	91 are expected to be stored as a single contiguous block in
cannam@127	92 @dfn{row-major} order (sometimes called ``C order''). Basically, this
cannam@127	93 means that as you step through adjacent memory locations, the first
cannam@127	94 dimension's index varies most slowly and the last dimension's index
cannam@127	95 varies most quickly.
cannam@127	96
cannam@127	97 To be more explicit, let us consider an array of rank @math{d} whose
cannam@127	98 dimensions are @ndims{}. Now, we specify a location in the array by a
cannam@127	99 sequence of @math{d} (zero-based) indices, one for each dimension:
cannam@127	100 @tex
cannam@127	101 $(i_0, i_1, i_2, \ldots, i_{d-1})$.
cannam@127	102 @end tex
cannam@127	103 @ifinfo
cannam@127	104 (i[0], i[1], ..., i[d-1]).
cannam@127	105 @end ifinfo
cannam@127	106 @html
cannam@127	107 (i<sub>0</sub>, i<sub>1</sub>, i<sub>2</sub>,..., i<sub>d-1</sub>).
cannam@127	108 @end html
cannam@127	109 If the array is stored in row-major
cannam@127	110 order, then this element is located at the position
cannam@127	111 @tex
cannam@127	112 $i_{d-1} + n_{d-1} (i_{d-2} + n_{d-2} (\ldots + n_1 i_0))$.
cannam@127	113 @end tex
cannam@127	114 @ifinfo
cannam@127	115 i[d-1] + n[d-1] * (i[d-2] + n[d-2] * (... + n[1] * i[0])).
cannam@127	116 @end ifinfo
cannam@127	117 @html
cannam@127	118 i<sub>d-1</sub> + n<sub>d-1</sub> * (i<sub>d-2</sub> + n<sub>d-2</sub> * (... + n<sub>1</sub> * i<sub>0</sub>)).
cannam@127	119 @end html
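The nested expression above can be evaluated with a single loop over the dimensions (Horner's rule). The following is a minimal sketch; @code{row_major_offset} is an illustrative name, not an FFTW function.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative helper (not part of FFTW): position of the element
   (idx[0], ..., idx[d-1]) in a row-major array with dimensions
   n[0] x ... x n[d-1]. */
static size_t row_major_offset(int d, const int *n, const int *idx)
{
    size_t pos = 0;
    for (int k = 0; k < d; ++k) {
        assert(idx[k] >= 0 && idx[k] < n[k]);
        /* after this step, pos = idx[k] + n[k] * (offset over dims 0..k-1) */
        pos = pos * (size_t)n[k] + (size_t)idx[k];
    }
    return pos;
}
```

For a 5 x 12 x 27 array, the indices (1, 2, 3) give 3 + 27 * (2 + 12 * 1) = 381.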
cannam@127	120
cannam@127	121 Note that, for the ordinary complex DFT, each element of the array
cannam@127	122 must be of type @code{fftw_complex}; i.e. a (real, imaginary) pair of
cannam@127	123 (double-precision) numbers.
cannam@127	124
cannam@127	125 In the advanced FFTW interface, the physical dimensions @math{n} from
cannam@127	126 which the indices are computed can be different from (larger than)
cannam@127	127 the logical dimensions of the transform to be computed, in order to
cannam@127	128 transform a subset of a larger array.
cannam@127	129 @cindex advanced interface
cannam@127	130 Note also that, in the advanced interface, the expression above is
cannam@127	131 multiplied by a @dfn{stride} to get the actual array index---this is
cannam@127	132 useful in situations where each element of the multi-dimensional array
cannam@127	133 is actually a data structure (or another array), and you just want to
cannam@127	134 transform a single field. In the basic interface, however, the stride
cannam@127	135 is 1.
cannam@127	136 @cindex stride
cannam@127	137
cannam@127	138 @c =========>
cannam@127	139 @node Column-major Format, Fixed-size Arrays in C, Row-major Format, Multi-dimensional Array Format
cannam@127	140 @subsection Column-major Format
cannam@127	141 @cindex column-major
cannam@127	142
cannam@127	143 Readers from the Fortran world are used to arrays stored in
cannam@127	144 @dfn{column-major} order (sometimes called ``Fortran order''). This is
cannam@127	145 essentially the exact opposite of row-major order in that, here, the
cannam@127	146 @emph{first} dimension's index varies most quickly.
cannam@127	147
cannam@127	148 If you have an array stored in column-major order and wish to
cannam@127	149 transform it using FFTW, it is quite easy to do. When creating the
cannam@127	150 plan, simply pass the dimensions of the array to the planner in
cannam@127	151 @emph{reverse order}. For example, if your array is a rank three
cannam@127	152 @code{N x M x L} matrix in column-major order, you should pass the
cannam@127	153 dimensions of the array as if it were an @code{L x M x N} matrix
cannam@127	154 (which it is, from the perspective of FFTW). This is done for you
cannam@127	155 @emph{automatically} by the FFTW legacy-Fortran interface
cannam@127	156 (@pxref{Calling FFTW from Legacy Fortran}), but you must do it
cannam@127	157 manually with the modern Fortran interface (@pxref{Reversing array
cannam@127	158 dimensions}).
cannam@127	159 @cindex Fortran interface
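To see why reversing the dimensions works, one can compare the two offset formulas directly. In this sketch (all helper names are illustrative, not FFTW API, and the rank limit of 8 is an arbitrary assumption), the column-major offset of an element equals the row-major offset computed with both the dimensions and the indices reversed.

```c
#include <assert.h>
#include <stddef.h>

/* Row-major: the last index varies fastest. */
static size_t offset_row_major(int d, const int *n, const int *idx)
{
    size_t pos = 0;
    for (int k = 0; k < d; ++k)
        pos = pos * (size_t)n[k] + (size_t)idx[k];
    return pos;
}

/* Column-major: the first index varies fastest. */
static size_t offset_col_major(int d, const int *n, const int *idx)
{
    size_t pos = 0;
    for (int k = d - 1; k >= 0; --k)
        pos = pos * (size_t)n[k] + (size_t)idx[k];
    return pos;
}

/* Returns 1 if the column-major offset equals the row-major offset
   of the reversed dimensions and indices. */
static int reversal_matches(int d, const int *n, const int *idx)
{
    int rn[8], ridx[8];
    assert(d >= 1 && d <= 8);
    for (int k = 0; k < d; ++k) {
        rn[k] = n[d - 1 - k];
        ridx[k] = idx[d - 1 - k];
    }
    return offset_col_major(d, n, idx) == offset_row_major(d, rn, ridx);
}
```

For a 2 x 3 x 4 column-major array, element (1, 2, 3) lies at offset 1 + 2 * (2 + 3 * 3) = 23, which is also the row-major offset of element (3, 2, 1) in a 4 x 3 x 2 array.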
cannam@127	160
cannam@127	161 @c =========>
cannam@127	162 @node Fixed-size Arrays in C, Dynamic Arrays in C, Column-major Format, Multi-dimensional Array Format
cannam@127	163 @subsection Fixed-size Arrays in C
cannam@127	164 @cindex C multi-dimensional arrays
cannam@127	165
cannam@127	166 A multi-dimensional array whose size is declared at compile time in C
cannam@127	167 is @emph{already} in row-major order. You don't have to do anything
cannam@127	168 special to transform it. For example:
cannam@127	169
cannam@127	170 @example
cannam@127	171 @{
cannam@127	172      fftw_complex data[N0][N1][N2];
cannam@127	173      fftw_plan plan;
cannam@127	174      ...
cannam@127	175      plan = fftw_plan_dft_3d(N0, N1, N2, &data[0][0][0], &data[0][0][0],
cannam@127	176                              FFTW_FORWARD, FFTW_ESTIMATE);
cannam@127	177      ...
cannam@127	178 @}
cannam@127	179 @end example
cannam@127	180
cannam@127	181 This will plan a 3d in-place transform of size @code{N0 x N1 x N2}.
cannam@127	182 Notice how we took the address of the zero-th element to pass to the
cannam@127	183 planner (we could also have used a typecast).
cannam@127	184
cannam@127	185 However, we tend to @emph{discourage} users from declaring their
cannam@127	186 arrays in this way, for two reasons. First, this allocates the array
cannam@127	187 on the stack (``automatic'' storage), which has a very limited size on
cannam@127	188 most operating systems (declaring an array with more than a few
cannam@127	189 thousand elements will often cause a crash). (You can get around this
cannam@127	190 limitation on many systems by declaring the array as
cannam@127	191 @code{static} and/or global, but that has its own drawbacks.)
cannam@127	192 Second, it may not optimally align the array for use with a SIMD
cannam@127	193 FFTW (@pxref{SIMD alignment and fftw_malloc}). Instead, we recommend
cannam@127	194 using @code{fftw_malloc}, as described below.
cannam@127	195
cannam@127	196 @c =========>
cannam@127	197 @node Dynamic Arrays in C, Dynamic Arrays in C-The Wrong Way, Fixed-size Arrays in C, Multi-dimensional Array Format
cannam@127	198 @subsection Dynamic Arrays in C
cannam@127	199
cannam@127	200 We recommend allocating most arrays dynamically, with
cannam@127	201 @code{fftw_malloc}. This isn't too hard to do, although it is not as
cannam@127	202 straightforward for multi-dimensional arrays as it is for
cannam@127	203 one-dimensional arrays.
cannam@127	204
cannam@127	205 Creating the array is simple: using a dynamic-allocation routine like
cannam@127	206 @code{fftw_malloc}, allocate an array big enough to store N
cannam@127	207 @code{fftw_complex} values (for a complex DFT), where N is the product
cannam@127	208 of the sizes of the array dimensions (i.e. the total number of complex
cannam@127	209 values in the array). For example, here is code to allocate a
cannam@127	210 @threedims{5,12,27} rank-3 array:
cannam@127	211 @findex fftw_malloc
cannam@127	212
cannam@127	213 @example
cannam@127	214 fftw_complex *an_array;
cannam@127	215 an_array = (fftw_complex *) fftw_malloc(5 * 12 * 27 * sizeof(fftw_complex));
cannam@127	216 @end example
cannam@127	217
cannam@127	218 Accessing the array elements, however, is more tricky---you can't
cannam@127	219 simply use multiple applications of the @samp{[]} operator like you
cannam@127	220 could for fixed-size arrays. Instead, you have to explicitly compute
cannam@127	221 the offset into the array using the formula given earlier for
cannam@127	222 row-major arrays. For example, to reference the @math{(i,j,k)}-th
cannam@127	223 element of the array allocated above, you would use the expression
cannam@127	224 @code{an_array[k + 27 * (j + 12 * i)]}.
cannam@127	225
cannam@127	226 This pain can be alleviated somewhat by defining appropriate macros,
cannam@127	227 or, in C++, creating a class and overloading the @samp{()} operator.
cannam@127	228 The C99 standard provides a way to reinterpret the dynamic
cannam@127	229 array as a ``variable-length'' multi-dimensional array amenable to
cannam@127	230 @samp{[]}, but C11 made this feature optional, so check that your compiler supports it.
cannam@127	231 @cindex C99
cannam@127	232 @cindex C++
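Both approaches can be sketched as follows. To keep the example compilable without FFTW, it makes two assumptions: @code{cplx} is a stand-in typedef with the same layout as the default @code{fftw_complex} (two doubles), and plain @code{malloc} replaces @code{fftw_malloc}. The macro @code{IDX3} and the function @code{views_agree} are illustrative names. The function writes each element through a C99 variable-length array view and reads it back through the macro.

```c
#include <stddef.h>
#include <stdlib.h>

typedef double cplx[2];   /* stand-in for fftw_complex: (real, imaginary) */

/* Row-major offset of element (i, j, k) in an n0 x n1 x n2 array. */
#define IDX3(i, j, k, n1, n2) \
    ((size_t)(k) + (size_t)(n2) * ((size_t)(j) + (size_t)(n1) * (size_t)(i)))

/* Returns 1 if the C99 view a[i][j][k] and the macro address the same
   elements of one contiguous block, 0 otherwise (or on allocation failure). */
static int views_agree(int n0, int n1, int n2)
{
    cplx *flat = malloc(sizeof(cplx) * (size_t)n0 * (size_t)n1 * (size_t)n2);
    if (flat == NULL)
        return 0;

    /* C99: reinterpret the block as a pointer to n1 x n2 slabs. */
    cplx (*a)[n1][n2] = (cplx (*)[n1][n2]) flat;

    for (int i = 0; i < n0; ++i)
        for (int j = 0; j < n1; ++j)
            for (int k = 0; k < n2; ++k) {
                a[i][j][k][0] = i * 10000 + j * 100 + k;  /* real part */
                a[i][j][k][1] = 0.0;                      /* imaginary part */
            }

    int ok = 1;
    for (int i = 0; i < n0; ++i)
        for (int j = 0; j < n1; ++j)
            for (int k = 0; k < n2; ++k)
                if (flat[IDX3(i, j, k, n1, n2)][0] != i * 10000 + j * 100 + k)
                    ok = 0;

    free(flat);
    return ok;
}
```

In a program that uses FFTW, the pointer @code{flat} (allocated with @code{fftw_malloc}) is what would be passed to the planner; the view @code{a} is only a convenience for indexing.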
cannam@127	233
cannam@127	234 @c =========>
cannam@127	235 @node Dynamic Arrays in C-The Wrong Way, , Dynamic Arrays in C, Multi-dimensional Array Format
cannam@127	236 @subsection Dynamic Arrays in C---The Wrong Way
cannam@127	237
cannam@127	238 A different method for allocating multi-dimensional arrays in C is
cannam@127	239 often suggested that is incompatible with FFTW: @emph{using it will
cannam@127	240 cause FFTW to die a painful death}. We discuss the technique here,
cannam@127	241 however, because it is so commonly known and used. This method is to
cannam@127	242 create arrays of pointers to arrays of pointers to @dots{}etcetera.
cannam@127	243 For example, the analogue in this method to the example above is:
cannam@127	244
cannam@127	245 @example
cannam@127	246 int i,j;
cannam@127	247 fftw_complex ***a_bad_array;  /* @r{another way to make a 5x12x27 array} */
cannam@127	248
cannam@127	249 a_bad_array = (fftw_complex ***) malloc(5 * sizeof(fftw_complex **));
cannam@127	250 for (i = 0; i < 5; ++i) @{
cannam@127	251      a_bad_array[i] =
cannam@127	252         (fftw_complex **) malloc(12 * sizeof(fftw_complex *));
cannam@127	253      for (j = 0; j < 12; ++j)
cannam@127	254           a_bad_array[i][j] =
cannam@127	255                 (fftw_complex *) malloc(27 * sizeof(fftw_complex));
cannam@127	256 @}
cannam@127	257 @end example
cannam@127	258
cannam@127	259 As you can see, this sort of array is inconvenient to allocate (and
cannam@127	260 deallocate). On the other hand, it has the advantage that the
cannam@127	261 @math{(i,j,k)}-th element can be referenced simply by
cannam@127	262 @code{a_bad_array[i][j][k]}.
cannam@127	263
cannam@127	264 If you like this technique and want to maximize convenience in accessing
cannam@127	265 the array, but still want to pass the array to FFTW, you can use a
cannam@127	266 hybrid method. Allocate the array as one contiguous block, but also
cannam@127	267 declare an array of arrays of pointers that point to appropriate places
cannam@127	268 in the block. That sort of trick is beyond the scope of this
cannam@127	269 documentation; for more information on multi-dimensional arrays in C,
cannam@127	270 see the @code{comp.lang.c}
cannam@127	271 @uref{http://c-faq.com/aryptr/dynmuldimary.html, FAQ}.
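One possible shape for the hybrid method is sketched below. It is an illustration only, under stated assumptions: @code{cplx} stands in for @code{fftw_complex}, plain @code{malloc} stands in for @code{fftw_malloc} so that the sketch compiles without FFTW, and the names @code{hybrid3}, @code{hybrid3_alloc}, and @code{hybrid3_free} are hypothetical. The data live in one contiguous row-major block, and two pointer tables are layered on top so that @code{planes[i][j][k]} works.

```c
#include <stddef.h>
#include <stdlib.h>

typedef double cplx[2];   /* stand-in for fftw_complex */

typedef struct {
    cplx   *data;    /* contiguous block: this is what FFTW would be given */
    cplx  **rows;    /* n0 * n1 pointers, each to the start of one row */
    cplx ***planes;  /* n0 pointers into rows; use as planes[i][j][k] */
} hybrid3;

static void hybrid3_free(hybrid3 *h)
{
    free(h->planes);
    free(h->rows);
    free(h->data);
    h->planes = NULL;
    h->rows = NULL;
    h->data = NULL;
}

/* Returns 1 on success, 0 on allocation failure. */
static int hybrid3_alloc(hybrid3 *h, size_t n0, size_t n1, size_t n2)
{
    h->data   = malloc(n0 * n1 * n2 * sizeof(cplx));
    h->rows   = malloc(n0 * n1 * sizeof(cplx *));
    h->planes = malloc(n0 * sizeof(cplx **));
    if (h->data == NULL || h->rows == NULL || h->planes == NULL) {
        hybrid3_free(h);
        return 0;
    }
    for (size_t i = 0; i < n0; ++i) {
        h->planes[i] = h->rows + i * n1;
        for (size_t j = 0; j < n1; ++j)
            h->rows[i * n1 + j] = h->data + (i * n1 + j) * n2;
    }
    return 1;
}
```

With this layout, @code{h.planes[i][j][k]} and @code{h.data[k + n2 * (j + n1 * i)]} refer to the same element, so the convenient indexing costs only the two extra pointer tables.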
cannam@127	272
cannam@127	273 @c ------------------------------------------------------------
cannam@127	274 @node Words of Wisdom-Saving Plans, Caveats in Using Wisdom, Multi-dimensional Array Format, Other Important Topics
cannam@127	275 @section Words of Wisdom---Saving Plans
cannam@127	276 @cindex wisdom
cannam@127	277 @cindex saving plans to disk
cannam@127	278
cannam@127	279 FFTW implements a method for saving plans to disk and restoring them.
cannam@127	280 In fact, what FFTW does is more general than just saving and loading
cannam@127	281 plans. The mechanism is called @dfn{wisdom}. Here, we describe
cannam@127	282 this feature at a high level. @xref{FFTW Reference}, for a less casual
cannam@127	283 but more complete discussion of how to use wisdom in FFTW.
cannam@127	284
cannam@127	285 Plans created with the @code{FFTW_MEASURE}, @code{FFTW_PATIENT}, or
cannam@127	286 @code{FFTW_EXHAUSTIVE} options produce near-optimal FFT performance,
cannam@127	287 but may require a long time to compute because FFTW must measure the
cannam@127	288 runtime of many possible plans and select the best one. This setup is
cannam@127	289 designed for the situations where so many transforms of the same size
cannam@127	290 must be computed that the start-up time is irrelevant. For short
cannam@127	291 initialization times, but slower transforms, we have provided
cannam@127	292 @code{FFTW_ESTIMATE}. The @code{wisdom} mechanism is a way to get the
cannam@127	293 best of both worlds: you compute a good plan once, save it to
cannam@127	294 disk, and later reload it as many times as necessary. The wisdom
cannam@127	295 mechanism can actually save and reload many plans at once, not just
cannam@127	296 one.
cannam@127	297 @ctindex FFTW_MEASURE
cannam@127	298 @ctindex FFTW_PATIENT
cannam@127	299 @ctindex FFTW_EXHAUSTIVE
cannam@127	300 @ctindex FFTW_ESTIMATE
cannam@127	301
cannam@127	302
cannam@127	303 Whenever you create a plan, the FFTW planner accumulates wisdom, which
cannam@127	304 is information sufficient to reconstruct the plan. After planning,
cannam@127	305 you can save this information to disk by means of the function:
cannam@127	306 @example
cannam@127	307 int fftw_export_wisdom_to_filename(const char *filename);
cannam@127	308 @end example
cannam@127	309 @findex fftw_export_wisdom_to_filename
cannam@127	310 (This function returns non-zero on success.)
cannam@127	311
cannam@127	312 The next time you run the program, you can restore the wisdom with
cannam@127	313 @code{fftw_import_wisdom_from_filename} (which also returns non-zero on success),
cannam@127	314 and then recreate the plan using the same flags as before.
cannam@127	315 @example
cannam@127	316 int fftw_import_wisdom_from_filename(const char *filename);
cannam@127	317 @end example
cannam@127	318 @findex fftw_import_wisdom_from_filename
cannam@127	319
cannam@127	320 Wisdom is automatically used for any size to which it is applicable, as
cannam@127	321 long as the planner flags are not more ``patient'' than those with which
cannam@127	322 the wisdom was created. For example, wisdom created with
cannam@127	323 @code{FFTW_MEASURE} can be used if you later plan with
cannam@127	324 @code{FFTW_ESTIMATE} or @code{FFTW_MEASURE}, but not with
cannam@127	325 @code{FFTW_PATIENT}.
cannam@127	326
cannam@127	327 The @code{wisdom} is cumulative, and is stored in a global, private
cannam@127	328 data structure managed internally by FFTW. The storage space required
cannam@127	329 is minimal, proportional to the logarithm of the sizes the wisdom was
cannam@127	330 generated from. If memory usage is a concern, however, the wisdom can
cannam@127	331 be forgotten and its associated memory freed by calling:
cannam@127	332 @example
cannam@127	333 void fftw_forget_wisdom(void);
cannam@127	334 @end example
cannam@127	335 @findex fftw_forget_wisdom
cannam@127	336
cannam@127	337 Wisdom can be exported to a file, a string, or any other medium.
cannam@127	338 For details, see @ref{Wisdom}.
cannam@127	339
cannam@127	340 @node Caveats in Using Wisdom, , Words of Wisdom-Saving Plans, Other Important Topics
cannam@127	341 @section Caveats in Using Wisdom
cannam@127	342 @cindex wisdom, problems with
cannam@127	343
cannam@127	344 @quotation
cannam@127	345 @html
cannam@127	346 <i>
cannam@127	347 @end html
cannam@127	348 For in much wisdom is much grief, and he that increaseth knowledge
cannam@127	349 increaseth sorrow.
cannam@127	350 @html
cannam@127	351 </i>
cannam@127	352 @end html
cannam@127	353 [Ecclesiastes 1:18]
cannam@127	354 @cindex Ecclesiastes
cannam@127	355 @end quotation
cannam@127	356 @iftex
cannam@127	357 @medskip
cannam@127	358 @end iftex
cannam@127	359
cannam@127	360 @cindex portability
cannam@127	361 There are pitfalls to using wisdom, in that it can negate FFTW's
cannam@127	362 ability to adapt to changing hardware and other conditions. For
cannam@127	363 example, it would be perfectly possible to export wisdom from a
cannam@127	364 program running on one processor and import it into a program running
cannam@127	365 on another processor. Doing so, however, would mean that the second
cannam@127	366 program would use plans optimized for the first processor, instead of
cannam@127	367 the one it is running on.
cannam@127	368
cannam@127	369 It should be safe to reuse wisdom as long as the hardware and program
cannam@127	370 binaries remain unchanged. (Actually, the optimal plan may change even
cannam@127	371 between runs of the same binary on identical hardware, due to
cannam@127	372 differences in the virtual memory environment, etcetera. Users
cannam@127	373 seriously interested in performance should worry about this problem,
cannam@127	374 too.) If the same wisdom is used for two
cannam@127	375 different program binaries, even running on the same machine, the
cannam@127	376 plans may be sub-optimal because of differing code alignments. It is
cannam@127	377 therefore wise to recreate wisdom every time an application is
cannam@127	378 recompiled. The more the underlying hardware and software change
cannam@127	379 between the creation of wisdom and its use, the greater
cannam@127	380 the risk of sub-optimal plans.
cannam@127	381
cannam@127	382 Nevertheless, if the choice is between using @code{FFTW_ESTIMATE} or
cannam@127	383 using possibly-suboptimal wisdom (created on the same machine, but for a
cannam@127	384 different binary), the wisdom is likely to be better. For this reason,
cannam@127	385 we provide a function to import wisdom from a standard system-wide
cannam@127	386 location (@code{/etc/fftw/wisdom} on Unix):
cannam@127	387 @cindex wisdom, system-wide
cannam@127	388
cannam@127	389 @example
cannam@127	390 int fftw_import_system_wisdom(void);
cannam@127	391 @end example
cannam@127	392 @findex fftw_import_system_wisdom
cannam@127	393
cannam@127	394 FFTW also provides a standalone program, @code{fftw-wisdom} (described
cannam@127	395 by its own @code{man} page on Unix) with which users can create wisdom,
cannam@127	396 e.g. for a canonical set of sizes to store in the system wisdom file.
cannam@127	397 @xref{Wisdom Utilities}.
cannam@127	398 @cindex fftw-wisdom utility
cannam@127	399
