| Chris@10 | 1 @node Other Important Topics, FFTW Reference, Tutorial, Top | 
| Chris@10 | 2 @chapter Other Important Topics | 
| Chris@10 | 3 @menu | 
| Chris@10 | 4 * SIMD alignment and fftw_malloc:: | 
| Chris@10 | 5 * Multi-dimensional Array Format:: | 
| Chris@10 | 6 * Words of Wisdom-Saving Plans:: | 
| Chris@10 | 7 * Caveats in Using Wisdom:: | 
| Chris@10 | 8 @end menu | 
| Chris@10 | 9 | 
| Chris@10 | 10 @c ------------------------------------------------------------ | 
| Chris@10 | 11 @node SIMD alignment and fftw_malloc, Multi-dimensional Array Format, Other Important Topics, Other Important Topics | 
| Chris@10 | 12 @section SIMD alignment and fftw_malloc | 
| Chris@10 | 13 | 
| Chris@10 | 14 SIMD, which stands for ``Single Instruction Multiple Data,'' is a set of | 
| Chris@10 | 15 special operations supported by some processors to perform a single | 
| Chris@10 | 16 operation on several numbers (usually 2 or 4) simultaneously.  SIMD | 
| Chris@10 | 17 floating-point instructions are available on several popular CPUs: | 
| Chris@10 | 18 SSE/SSE2/AVX on recent x86/x86-64 processors, AltiVec (single precision) | 
| Chris@10 | 19 on some PowerPCs (Apple G4 and higher), NEON on some ARM models, and MIPS Paired Single | 
| Chris@10 | 20 (currently only in FFTW 3.2.x).  FFTW can be compiled to support the | 
| Chris@10 | 21 SIMD instructions on any of these systems. | 
| Chris@10 | 22 @cindex SIMD | 
| Chris@10 | 23 @cindex SSE | 
| Chris@10 | 24 @cindex SSE2 | 
| Chris@10 | 25 @cindex AVX | 
| Chris@10 | 26 @cindex AltiVec | 
| Chris@10 | 27 @cindex MIPS PS | 
| Chris@10 | 28 @cindex precision | 
| Chris@10 | 29 | 
| Chris@10 | 30 | 
| Chris@10 | 31 A program linking to an FFTW library compiled with SIMD support can | 
| Chris@10 | 32 obtain a nonnegligible speedup for most complex and r2c/c2r | 
| Chris@10 | 33 transforms.  In order to obtain this speedup, however, the arrays of | 
| Chris@10 | 34 complex (or real) data passed to FFTW must be specially aligned in | 
| Chris@10 | 35 memory (typically 16-byte aligned), and often this alignment is more | 
| Chris@10 | 36 stringent than that provided by the usual @code{malloc} (etc.) | 
| Chris@10 | 37 allocation routines. | 
| Chris@10 | 38 | 
| Chris@10 | 39 @cindex portability | 
| Chris@10 | 40 In order to guarantee proper alignment for SIMD, therefore, in case | 
| Chris@10 | 41 your program is ever linked against a SIMD-using FFTW, we recommend | 
| Chris@10 | 42 allocating your transform data with @code{fftw_malloc} and | 
| Chris@10 | 43 de-allocating it with @code{fftw_free}. | 
| Chris@10 | 44 @findex fftw_malloc | 
| Chris@10 | 45 @findex fftw_free | 
| Chris@10 | 46 These have exactly the same interface and behavior as | 
| Chris@10 | 47 @code{malloc}/@code{free}, except that for a SIMD FFTW they ensure | 
| Chris@10 | 48 that the returned pointer has the necessary alignment (by calling | 
| Chris@10 | 49 @code{memalign} or its equivalent on your OS). | 
| Chris@10 | 50 | 
| Chris@10 | 51 You are not @emph{required} to use @code{fftw_malloc}.  You can | 
| Chris@10 | 52 allocate your data in any way that you like, from @code{malloc} to | 
| Chris@10 | 53 @code{new} (in C++) to a fixed-size array declaration.  If the array | 
| Chris@10 | 54 happens not to be properly aligned, FFTW will not use the SIMD | 
| Chris@10 | 55 extensions. | 
| Chris@10 | 56 @cindex C++ | 
| Chris@10 | 57 | 
| Chris@10 | 58 @findex fftw_alloc_real | 
| Chris@10 | 59 @findex fftw_alloc_complex | 
| Chris@10 | 60 Since @code{fftw_malloc} only ever needs to be used for real and | 
| Chris@10 | 61 complex arrays, we provide two convenient wrapper routines | 
| Chris@10 | 62 @code{fftw_alloc_real(N)} and @code{fftw_alloc_complex(N)} that are | 
| Chris@10 | 63 equivalent to @code{(double*)fftw_malloc(sizeof(double) * N)} and | 
| Chris@10 | 64 @code{(fftw_complex*)fftw_malloc(sizeof(fftw_complex) * N)}, | 
| Chris@10 | 65 respectively (or their equivalents in other precisions). | 
| Chris@10 | 66 | 
| Chris@10 | 67 @c ------------------------------------------------------------ | 
| Chris@10 | 68 @node Multi-dimensional Array Format, Words of Wisdom-Saving Plans, SIMD alignment and fftw_malloc, Other Important Topics | 
| Chris@10 | 69 @section Multi-dimensional Array Format | 
| Chris@10 | 70 | 
| Chris@10 | 71 This section describes the format in which multi-dimensional arrays | 
| Chris@10 | 72 are stored in FFTW.  We felt that a detailed discussion of this topic | 
| Chris@10 | 73 was necessary.  Since several different formats are common, this topic | 
| Chris@10 | 74 is often a source of confusion. | 
| Chris@10 | 75 | 
| Chris@10 | 76 @menu | 
| Chris@10 | 77 * Row-major Format:: | 
| Chris@10 | 78 * Column-major Format:: | 
| Chris@10 | 79 * Fixed-size Arrays in C:: | 
| Chris@10 | 80 * Dynamic Arrays in C:: | 
| Chris@10 | 81 * Dynamic Arrays in C-The Wrong Way:: | 
| Chris@10 | 82 @end menu | 
| Chris@10 | 83 | 
| Chris@10 | 84 @c =========> | 
| Chris@10 | 85 @node Row-major Format, Column-major Format, Multi-dimensional Array Format, Multi-dimensional Array Format | 
| Chris@10 | 86 @subsection Row-major Format | 
| Chris@10 | 87 @cindex row-major | 
| Chris@10 | 88 | 
| Chris@10 | 89 The multi-dimensional arrays passed to @code{fftw_plan_dft} etcetera | 
| Chris@10 | 90 are expected to be stored as a single contiguous block in | 
| Chris@10 | 91 @dfn{row-major} order (sometimes called ``C order'').  Basically, this | 
| Chris@10 | 92 means that as you step through adjacent memory locations, the first | 
| Chris@10 | 93 dimension's index varies most slowly and the last dimension's index | 
| Chris@10 | 94 varies most quickly. | 
| Chris@10 | 95 | 
| Chris@10 | 96 To be more explicit, let us consider an array of rank @math{d} whose | 
| Chris@10 | 97 dimensions are @ndims{}. Now, we specify a location in the array by a | 
| Chris@10 | 98 sequence of @math{d} (zero-based) indices, one for each dimension: | 
| Chris@10 | 99 @tex | 
| Chris@10 | 100 $(i_0, i_1, i_2, \ldots, i_{d-1})$. | 
| Chris@10 | 101 @end tex | 
| Chris@10 | 102 @ifinfo | 
| Chris@10 | 103 (i[0], i[1], ..., i[d-1]). | 
| Chris@10 | 104 @end ifinfo | 
| Chris@10 | 105 @html | 
| Chris@10 | 106 (i<sub>0</sub>, i<sub>1</sub>, i<sub>2</sub>,..., i<sub>d-1</sub>). | 
| Chris@10 | 107 @end html | 
| Chris@10 | 108 If the array is stored in row-major | 
| Chris@10 | 109 order, then this element is located at the position | 
| Chris@10 | 110 @tex | 
| Chris@10 | 111 $i_{d-1} + n_{d-1} (i_{d-2} + n_{d-2} (\ldots + n_1 i_0))$. | 
| Chris@10 | 112 @end tex | 
| Chris@10 | 113 @ifinfo | 
| Chris@10 | 114 i[d-1] + n[d-1] * (i[d-2] + n[d-2] * (... + n[1] * i[0])). | 
| Chris@10 | 115 @end ifinfo | 
| Chris@10 | 116 @html | 
| Chris@10 | 117 i<sub>d-1</sub> + n<sub>d-1</sub> * (i<sub>d-2</sub> + n<sub>d-2</sub> * (... + n<sub>1</sub> * i<sub>0</sub>)). | 
| Chris@10 | 118 @end html | 
| Chris@10 | 119 | 
| Chris@10 | 120 Note that, for the ordinary complex DFT, each element of the array | 
| Chris@10 | 121 must be of type @code{fftw_complex}; i.e. a (real, imaginary) pair of | 
| Chris@10 | 122 (double-precision) numbers. | 
| Chris@10 | 123 | 
| Chris@10 | 124 In the advanced FFTW interface, the physical dimensions @math{n} from | 
| Chris@10 | 125 which the indices are computed can be different from (larger than) | 
| Chris@10 | 126 the logical dimensions of the transform to be computed, in order to | 
| Chris@10 | 127 transform a subset of a larger array. | 
| Chris@10 | 128 @cindex advanced interface | 
| Chris@10 | 129 Note also that, in the advanced interface, the expression above is | 
| Chris@10 | 130 multiplied by a @dfn{stride} to get the actual array index---this is | 
| Chris@10 | 131 useful in situations where each element of the multi-dimensional array | 
| Chris@10 | 132 is actually a data structure (or another array), and you just want to | 
| Chris@10 | 133 transform a single field. In the basic interface, however, the stride | 
| Chris@10 | 134 is 1. | 
| Chris@10 | 135 @cindex stride | 
| Chris@10 | 136 | 
| Chris@10 | 137 @c =========> | 
| Chris@10 | 138 @node Column-major Format, Fixed-size Arrays in C, Row-major Format, Multi-dimensional Array Format | 
| Chris@10 | 139 @subsection Column-major Format | 
| Chris@10 | 140 @cindex column-major | 
| Chris@10 | 141 | 
| Chris@10 | 142 Readers from the Fortran world are used to arrays stored in | 
| Chris@10 | 143 @dfn{column-major} order (sometimes called ``Fortran order'').  This is | 
| Chris@10 | 144 essentially the exact opposite of row-major order in that, here, the | 
| Chris@10 | 145 @emph{first} dimension's index varies most quickly. | 
| Chris@10 | 146 | 
| Chris@10 | 147 If you have an array stored in column-major order and wish to | 
| Chris@10 | 148 transform it using FFTW, it is quite easy to do.  When creating the | 
| Chris@10 | 149 plan, simply pass the dimensions of the array to the planner in | 
| Chris@10 | 150 @emph{reverse order}.  For example, if your array is a rank three | 
| Chris@10 | 151 @code{N x M x L} matrix in column-major order, you should pass the | 
| Chris@10 | 152 dimensions of the array as if it were an @code{L x M x N} matrix | 
| Chris@10 | 153 (which it is, from the perspective of FFTW).  This is done for you | 
| Chris@10 | 154 @emph{automatically} by the FFTW legacy-Fortran interface | 
| Chris@10 | 155 (@pxref{Calling FFTW from Legacy Fortran}), but you must do it | 
| Chris@10 | 156 manually with the modern Fortran interface (@pxref{Reversing array | 
| Chris@10 | 157 dimensions}). | 
| Chris@10 | 158 @cindex Fortran interface | 
| Chris@10 | 159 | 
| Chris@10 | 160 @c =========> | 
| Chris@10 | 161 @node Fixed-size Arrays in C, Dynamic Arrays in C, Column-major Format, Multi-dimensional Array Format | 
| Chris@10 | 162 @subsection Fixed-size Arrays in C | 
| Chris@10 | 163 @cindex C multi-dimensional arrays | 
| Chris@10 | 164 | 
| Chris@10 | 165 A multi-dimensional array whose size is declared at compile time in C | 
| Chris@10 | 166 is @emph{already} in row-major order.  You don't have to do anything | 
| Chris@10 | 167 special to transform it.  For example: | 
| Chris@10 | 168 | 
| Chris@10 | 169 @example | 
| Chris@10 | 170 @{ | 
| Chris@10 | 171      fftw_complex data[N0][N1][N2]; | 
| Chris@10 | 172      fftw_plan plan; | 
| Chris@10 | 173      ... | 
| Chris@10 | 174      plan = fftw_plan_dft_3d(N0, N1, N2, &data[0][0][0], &data[0][0][0], | 
| Chris@10 | 175                              FFTW_FORWARD, FFTW_ESTIMATE); | 
| Chris@10 | 176      ... | 
| Chris@10 | 177 @} | 
| Chris@10 | 178 @end example | 
| Chris@10 | 179 | 
| Chris@10 | 180 This will plan a 3d in-place transform of size @code{N0 x N1 x N2}. | 
| Chris@10 | 181 Notice how we took the address of the zero-th element to pass to the | 
| Chris@10 | 182 planner (we could also have used a typecast). | 
| Chris@10 | 183 | 
| Chris@10 | 184 However, we tend to @emph{discourage} users from declaring their | 
| Chris@10 | 185 arrays in this way, for two reasons.  First, this allocates the array | 
| Chris@10 | 186 on the stack (``automatic'' storage), which has a very limited size on | 
| Chris@10 | 187 most operating systems (declaring an array with more than a few | 
| Chris@10 | 188 thousand elements will often cause a crash).  (You can get around this | 
| Chris@10 | 189 limitation on many systems by declaring the array as | 
| Chris@10 | 190 @code{static} and/or global, but that has its own drawbacks.) | 
| Chris@10 | 191 Second, it may not optimally align the array for use with a SIMD | 
| Chris@10 | 192 FFTW (@pxref{SIMD alignment and fftw_malloc}).  Instead, we recommend | 
| Chris@10 | 193 using @code{fftw_malloc}, as described below. | 
| Chris@10 | 194 | 
| Chris@10 | 195 @c =========> | 
| Chris@10 | 196 @node Dynamic Arrays in C, Dynamic Arrays in C-The Wrong Way, Fixed-size Arrays in C, Multi-dimensional Array Format | 
| Chris@10 | 197 @subsection Dynamic Arrays in C | 
| Chris@10 | 198 | 
| Chris@10 | 199 We recommend allocating most arrays dynamically, with | 
| Chris@10 | 200 @code{fftw_malloc}.  This isn't too hard to do, although it is not as | 
| Chris@10 | 201 straightforward for multi-dimensional arrays as it is for | 
| Chris@10 | 202 one-dimensional arrays. | 
| Chris@10 | 203 | 
| Chris@10 | 204 Creating the array is simple: using a dynamic-allocation routine like | 
| Chris@10 | 205 @code{fftw_malloc}, allocate an array big enough to store N | 
| Chris@10 | 206 @code{fftw_complex} values (for a complex DFT), where N is the product | 
| Chris@10 | 207 of the sizes of the array dimensions (i.e. the total number of complex | 
| Chris@10 | 208 values in the array).  For example, here is code to allocate a | 
| Chris@10 | 209 @threedims{5,12,27} rank-3 array: | 
| Chris@10 | 210 @findex fftw_malloc | 
| Chris@10 | 211 | 
| Chris@10 | 212 @example | 
| Chris@10 | 213 fftw_complex *an_array; | 
| Chris@10 | 214 an_array = (fftw_complex*) fftw_malloc(5*12*27 * sizeof(fftw_complex)); | 
| Chris@10 | 215 @end example | 
| Chris@10 | 216 | 
| Chris@10 | 217 Accessing the array elements, however, is more tricky---you can't | 
| Chris@10 | 218 simply use multiple applications of the @samp{[]} operator like you | 
| Chris@10 | 219 could for fixed-size arrays.  Instead, you have to explicitly compute | 
| Chris@10 | 220 the offset into the array using the formula given earlier for | 
| Chris@10 | 221 row-major arrays.  For example, to reference the @math{(i,j,k)}-th | 
| Chris@10 | 222 element of the array allocated above, you would use the expression | 
| Chris@10 | 223 @code{an_array[k + 27 * (j + 12 * i)]}. | 
| Chris@10 | 224 | 
| Chris@10 | 225 This pain can be alleviated somewhat by defining appropriate macros, | 
| Chris@10 | 226 or, in C++, creating a class and overloading the @samp{()} operator. | 
| Chris@10 | 227 The recent C99 standard provides a way to reinterpret the dynamic | 
| Chris@10 | 228 array as a ``variable-length'' multi-dimensional array amenable to | 
| Chris@10 | 229 @samp{[]}, but this feature is not yet widely supported by compilers. | 
| Chris@10 | 230 @cindex C99 | 
| Chris@10 | 231 @cindex C++ | 
| Chris@10 | 232 | 
| Chris@10 | 233 @c =========> | 
| Chris@10 | 234 @node Dynamic Arrays in C-The Wrong Way,  , Dynamic Arrays in C, Multi-dimensional Array Format | 
| Chris@10 | 235 @subsection Dynamic Arrays in C---The Wrong Way | 
| Chris@10 | 236 | 
| Chris@10 | 237 A different method for allocating multi-dimensional arrays in C is | 
| Chris@10 | 238 often suggested that is incompatible with FFTW: @emph{using it will | 
| Chris@10 | 239 cause FFTW to die a painful death}.  We discuss the technique here, | 
| Chris@10 | 240 however, because it is so commonly known and used.  This method is to | 
| Chris@10 | 241 create arrays of pointers of arrays of pointers of @dots{}etcetera. | 
| Chris@10 | 242 For example, the analogue in this method to the example above is: | 
| Chris@10 | 243 | 
| Chris@10 | 244 @example | 
| Chris@10 | 245 int i,j; | 
| Chris@10 | 246 fftw_complex ***a_bad_array;  /* @r{another way to make a 5x12x27 array} */ | 
| Chris@10 | 247 | 
| Chris@10 | 248 a_bad_array = (fftw_complex ***) malloc(5 * sizeof(fftw_complex **)); | 
| Chris@10 | 249 for (i = 0; i < 5; ++i) @{ | 
| Chris@10 | 250      a_bad_array[i] = | 
| Chris@10 | 251         (fftw_complex **) malloc(12 * sizeof(fftw_complex *)); | 
| Chris@10 | 252      for (j = 0; j < 12; ++j) | 
| Chris@10 | 253           a_bad_array[i][j] = | 
| Chris@10 | 254                 (fftw_complex *) malloc(27 * sizeof(fftw_complex)); | 
| Chris@10 | 255 @} | 
| Chris@10 | 256 @end example | 
| Chris@10 | 257 | 
| Chris@10 | 258 As you can see, this sort of array is inconvenient to allocate (and | 
| Chris@10 | 259 deallocate).  On the other hand, it has the advantage that the | 
| Chris@10 | 260 @math{(i,j,k)}-th element can be referenced simply by | 
| Chris@10 | 261 @code{a_bad_array[i][j][k]}. | 
| Chris@10 | 262 | 
| Chris@10 | 263 If you like this technique and want to maximize convenience in accessing | 
| Chris@10 | 264 the array, but still want to pass the array to FFTW, you can use a | 
| Chris@10 | 265 hybrid method.  Allocate the array as one contiguous block, but also | 
| Chris@10 | 266 declare an array of arrays of pointers that point to appropriate places | 
| Chris@10 | 267 in the block.  That sort of trick is beyond the scope of this | 
| Chris@10 | 268 documentation; for more information on multi-dimensional arrays in C, | 
| Chris@10 | 269 see the @code{comp.lang.c} | 
| Chris@10 | 270 @uref{http://c-faq.com/aryptr/dynmuldimary.html, FAQ}. | 
| Chris@10 | 271 | 
| Chris@10 | 272 @c ------------------------------------------------------------ | 
| Chris@10 | 273 @node Words of Wisdom-Saving Plans, Caveats in Using Wisdom, Multi-dimensional Array Format, Other Important Topics | 
| Chris@10 | 274 @section Words of Wisdom---Saving Plans | 
| Chris@10 | 275 @cindex wisdom | 
| Chris@10 | 276 @cindex saving plans to disk | 
| Chris@10 | 277 | 
| Chris@10 | 278 FFTW implements a method for saving plans to disk and restoring them. | 
| Chris@10 | 279 In fact, what FFTW does is more general than just saving and loading | 
| Chris@10 | 280 plans.  The mechanism is called @dfn{wisdom}.  Here, we describe | 
| Chris@10 | 281 this feature at a high level. @xref{FFTW Reference}, for a less casual | 
| Chris@10 | 282 but more complete discussion of how to use wisdom in FFTW. | 
| Chris@10 | 283 | 
| Chris@10 | 284 Plans created with the @code{FFTW_MEASURE}, @code{FFTW_PATIENT}, or | 
| Chris@10 | 285 @code{FFTW_EXHAUSTIVE} options produce near-optimal FFT performance, | 
| Chris@10 | 286 but may require a long time to compute because FFTW must measure the | 
| Chris@10 | 287 runtime of many possible plans and select the best one.  This setup is | 
| Chris@10 | 288 designed for the situations where so many transforms of the same size | 
| Chris@10 | 289 must be computed that the start-up time is irrelevant.  For short | 
| Chris@10 | 290 initialization times, but slower transforms, we have provided | 
| Chris@10 | 291 @code{FFTW_ESTIMATE}.  The @code{wisdom} mechanism is a way to get the | 
| Chris@10 | 292 best of both worlds: you compute a good plan once, save it to | 
| Chris@10 | 293 disk, and later reload it as many times as necessary.  The wisdom | 
| Chris@10 | 294 mechanism can actually save and reload many plans at once, not just | 
| Chris@10 | 295 one. | 
| Chris@10 | 296 @ctindex FFTW_MEASURE | 
| Chris@10 | 297 @ctindex FFTW_PATIENT | 
| Chris@10 | 298 @ctindex FFTW_EXHAUSTIVE | 
| Chris@10 | 299 @ctindex FFTW_ESTIMATE | 
| Chris@10 | 300 | 
| Chris@10 | 301 | 
| Chris@10 | 302 Whenever you create a plan, the FFTW planner accumulates wisdom, which | 
| Chris@10 | 303 is information sufficient to reconstruct the plan.  After planning, | 
| Chris@10 | 304 you can save this information to disk by means of the function: | 
| Chris@10 | 305 @example | 
| Chris@10 | 306 int fftw_export_wisdom_to_filename(const char *filename); | 
| Chris@10 | 307 @end example | 
| Chris@10 | 308 @findex fftw_export_wisdom_to_filename | 
| Chris@10 | 309 (This function returns non-zero on success.) | 
| Chris@10 | 310 | 
| Chris@10 | 311 The next time you run the program, you can restore the wisdom with | 
| Chris@10 | 312 @code{fftw_import_wisdom_from_filename} (which also returns non-zero on success), | 
| Chris@10 | 313 and then recreate the plan using the same flags as before. | 
| Chris@10 | 314 @example | 
| Chris@10 | 315 int fftw_import_wisdom_from_filename(const char *filename); | 
| Chris@10 | 316 @end example | 
| Chris@10 | 317 @findex fftw_import_wisdom_from_filename | 
| Chris@10 | 318 | 
| Chris@10 | 319 Wisdom is automatically used for any size to which it is applicable, as | 
| Chris@10 | 320 long as the planner flags are not more ``patient'' than those with which | 
| Chris@10 | 321 the wisdom was created.  For example, wisdom created with | 
| Chris@10 | 322 @code{FFTW_MEASURE} can be used if you later plan with | 
| Chris@10 | 323 @code{FFTW_ESTIMATE} or @code{FFTW_MEASURE}, but not with | 
| Chris@10 | 324 @code{FFTW_PATIENT}. | 
| Chris@10 | 325 | 
| Chris@10 | 326 The @code{wisdom} is cumulative, and is stored in a global, private | 
| Chris@10 | 327 data structure managed internally by FFTW.  The storage space required | 
| Chris@10 | 328 is minimal, proportional to the logarithm of the sizes the wisdom was | 
| Chris@10 | 329 generated from.  If memory usage is a concern, however, the wisdom can | 
| Chris@10 | 330 be forgotten and its associated memory freed by calling: | 
| Chris@10 | 331 @example | 
| Chris@10 | 332 void fftw_forget_wisdom(void); | 
| Chris@10 | 333 @end example | 
| Chris@10 | 334 @findex fftw_forget_wisdom | 
| Chris@10 | 335 | 
| Chris@10 | 336 Wisdom can be exported to a file, a string, or any other medium. | 
| Chris@10 | 337 For details, see @ref{Wisdom}. | 
| Chris@10 | 338 | 
| Chris@10 | 339 @node Caveats in Using Wisdom,  , Words of Wisdom-Saving Plans, Other Important Topics | 
| Chris@10 | 340 @section Caveats in Using Wisdom | 
| Chris@10 | 341 @cindex wisdom, problems with | 
| Chris@10 | 342 | 
| Chris@10 | 343 @quotation | 
| Chris@10 | 344 @html | 
| Chris@10 | 345 <i> | 
| Chris@10 | 346 @end html | 
| Chris@10 | 347 For in much wisdom is much grief, and he that increaseth knowledge | 
| Chris@10 | 348 increaseth sorrow. | 
| Chris@10 | 349 @html | 
| Chris@10 | 350 </i> | 
| Chris@10 | 351 @end html | 
| Chris@10 | 352 [Ecclesiastes 1:18] | 
| Chris@10 | 353 @cindex Ecclesiastes | 
| Chris@10 | 354 @end quotation | 
| Chris@10 | 355 @iftex | 
| Chris@10 | 356 @medskip | 
| Chris@10 | 357 @end iftex | 
| Chris@10 | 358 | 
| Chris@10 | 359 @cindex portability | 
| Chris@10 | 360 There are pitfalls to using wisdom, in that it can negate FFTW's | 
| Chris@10 | 361 ability to adapt to changing hardware and other conditions. For | 
| Chris@10 | 362 example, it would be perfectly possible to export wisdom from a | 
| Chris@10 | 363 program running on one processor and import it into a program running | 
| Chris@10 | 364 on another processor.  Doing so, however, would mean that the second | 
| Chris@10 | 365 program would use plans optimized for the first processor, instead of | 
| Chris@10 | 366 the one it is running on. | 
| Chris@10 | 367 | 
| Chris@10 | 368 It should be safe to reuse wisdom as long as the hardware and program | 
| Chris@10 | 369 binaries remain unchanged. (Actually, the optimal plan may change even | 
| Chris@10 | 370 between runs of the same binary on identical hardware, due to | 
| Chris@10 | 371 differences in the virtual memory environment, etcetera.  Users | 
| Chris@10 | 372 seriously interested in performance should worry about this problem, | 
| Chris@10 | 373 too.)  It is likely that, if the same wisdom is used for two | 
| Chris@10 | 374 different program binaries, even running on the same machine, the | 
| Chris@10 | 375 plans may be sub-optimal because of differing code alignments.  It is | 
| Chris@10 | 376 therefore wise to recreate wisdom every time an application is | 
| Chris@10 | 377 recompiled.  The more the underlying hardware and software changes | 
| Chris@10 | 378 between the creation of wisdom and its use, the greater grows | 
| Chris@10 | 379 the risk of sub-optimal plans. | 
| Chris@10 | 380 | 
| Chris@10 | 381 Nevertheless, if the choice is between using @code{FFTW_ESTIMATE} or | 
| Chris@10 | 382 using possibly-suboptimal wisdom (created on the same machine, but for a | 
| Chris@10 | 383 different binary), the wisdom is likely to be better.  For this reason, | 
| Chris@10 | 384 we provide a function to import wisdom from a standard system-wide | 
| Chris@10 | 385 location (@code{/etc/fftw/wisdom} on Unix): | 
| Chris@10 | 386 @cindex wisdom, system-wide | 
| Chris@10 | 387 | 
| Chris@10 | 388 @example | 
| Chris@10 | 389 int fftw_import_system_wisdom(void); | 
| Chris@10 | 390 @end example | 
| Chris@10 | 391 @findex fftw_import_system_wisdom | 
| Chris@10 | 392 | 
| Chris@10 | 393 FFTW also provides a standalone program, @code{fftw-wisdom} (described | 
| Chris@10 | 394 by its own @code{man} page on Unix) with which users can create wisdom, | 
| Chris@10 | 395 e.g. for a canonical set of sizes to store in the system wisdom file. | 
| Chris@10 | 396 @xref{Wisdom Utilities}. | 
| Chris@10 | 397 @cindex fftw-wisdom utility | 
| Chris@10 | 398 |