@node Other Important Topics, FFTW Reference, Tutorial, Top
@chapter Other Important Topics
@menu
* SIMD alignment and fftw_malloc::
* Multi-dimensional Array Format::
* Words of Wisdom-Saving Plans::
* Caveats in Using Wisdom::
@end menu

@c ------------------------------------------------------------
@node SIMD alignment and fftw_malloc, Multi-dimensional Array Format, Other Important Topics, Other Important Topics
@section SIMD alignment and fftw_malloc

SIMD, which stands for ``Single Instruction Multiple Data,'' is a set of
special operations supported by some processors to perform a single
operation on several numbers (usually 2 or 4) simultaneously. SIMD
floating-point instructions are available on several popular CPUs:
SSE/SSE2/AVX/AVX2/AVX512/KCVI on some x86/x86-64 processors, AltiVec and
VSX on some POWER/PowerPCs, and NEON on some ARM models. FFTW can be
compiled to support the SIMD instructions on any of these systems.
@cindex SIMD
@cindex SSE
@cindex SSE2
@cindex AVX
@cindex AVX2
@cindex AVX512
@cindex AltiVec
@cindex VSX
@cindex precision


A program linking to an FFTW library compiled with SIMD support can
obtain a non-negligible speedup for most complex and r2c/c2r
transforms. To obtain this speedup, however, the arrays of
complex (or real) data passed to FFTW must be specially aligned in
memory (typically 16-byte aligned), and this alignment is often more
stringent than that provided by the usual @code{malloc} (etc.)
allocation routines.

@cindex portability
To guarantee proper alignment for SIMD in case your program is ever
linked against a SIMD-using FFTW, we therefore recommend allocating
your transform data with @code{fftw_malloc} and de-allocating it with
@code{fftw_free}.
@findex fftw_malloc
@findex fftw_free
These have exactly the same interface and behavior as
@code{malloc}/@code{free}, except that for a SIMD FFTW they ensure
that the returned pointer has the necessary alignment (by calling
@code{memalign} or its equivalent on your OS).

You are not @emph{required} to use @code{fftw_malloc}. You can
allocate your data in any way that you like, from @code{malloc} to
@code{new} (in C++) to a fixed-size array declaration. If the array
happens not to be properly aligned, FFTW will not use the SIMD
extensions.
@cindex C++
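
To make the alignment requirement concrete, here is a minimal sketch in
plain C11 that tests whether a pointer is 16-byte aligned. It is not
part of FFTW and does not show how FFTW itself checks alignment: the
names @code{is_aligned} and @code{alloc_aligned_doubles} are
hypothetical, the 16-byte figure is only the typical value quoted above,
and @code{aligned_alloc} merely stands in for whatever
@code{fftw_malloc} calls on your OS.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Typical SIMD alignment quoted in the text.  The real requirement
   depends on how FFTW was compiled (assumption for this sketch). */
#define SIMD_ALIGNMENT 16

/* Hypothetical helper: nonzero if p lies on a multiple of
   `alignment' bytes. */
int is_aligned(const void *p, size_t alignment)
{
    assert(alignment != 0);
    return ((uintptr_t) p % alignment) == 0;
}

/* Hypothetical helper: allocate room for n doubles on a SIMD_ALIGNMENT
   boundary using C11 aligned_alloc.  The byte count is rounded up to a
   multiple of the alignment, as C11 requires.  Release with free(). */
double *alloc_aligned_doubles(size_t n)
{
    size_t bytes = n * sizeof(double);
    size_t rounded =
        (bytes + SIMD_ALIGNMENT - 1) / SIMD_ALIGNMENT * SIMD_ALIGNMENT;
    if (rounded == 0)
        rounded = SIMD_ALIGNMENT;   /* avoid a zero-size request */
    return (double *) aligned_alloc(SIMD_ALIGNMENT, rounded);
}
```

An array for which @code{is_aligned} reports failure can still be
transformed; as noted above, FFTW simply will not use the SIMD
extensions for it.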

@findex fftw_alloc_real
@findex fftw_alloc_complex
Since @code{fftw_malloc} only ever needs to be used for real and
complex arrays, we provide two convenient wrapper routines
@code{fftw_alloc_real(N)} and @code{fftw_alloc_complex(N)} that are
equivalent to @code{(double*)fftw_malloc(sizeof(double) * N)} and
@code{(fftw_complex*)fftw_malloc(sizeof(fftw_complex) * N)},
respectively (or their equivalents in other precisions).

@c ------------------------------------------------------------
@node Multi-dimensional Array Format, Words of Wisdom-Saving Plans, SIMD alignment and fftw_malloc, Other Important Topics
@section Multi-dimensional Array Format

This section describes the format in which multi-dimensional arrays
are stored in FFTW. We felt that a detailed discussion of this topic
was necessary because several different formats are common, which
often makes this topic a source of confusion.

@menu
* Row-major Format::
* Column-major Format::
* Fixed-size Arrays in C::
* Dynamic Arrays in C::
* Dynamic Arrays in C-The Wrong Way::
@end menu

@c =========>
@node Row-major Format, Column-major Format, Multi-dimensional Array Format, Multi-dimensional Array Format
@subsection Row-major Format
@cindex row-major

The multi-dimensional arrays passed to @code{fftw_plan_dft} etcetera
are expected to be stored as a single contiguous block in
@dfn{row-major} order (sometimes called ``C order''). Basically, this
means that as you step through adjacent memory locations, the first
dimension's index varies most slowly and the last dimension's index
varies most quickly.

To be more explicit, let us consider an array of rank @math{d} whose
dimensions are @ndims{}. Now, we specify a location in the array by a
sequence of @math{d} (zero-based) indices, one for each dimension:
@tex
$(i_0, i_1, i_2, \ldots, i_{d-1})$.
@end tex
@ifinfo
(i[0], i[1], i[2], ..., i[d-1]).
@end ifinfo
@html
(i<sub>0</sub>, i<sub>1</sub>, i<sub>2</sub>,..., i<sub>d-1</sub>).
@end html
If the array is stored in row-major
order, then this element is located at the position
@tex
$i_{d-1} + n_{d-1} (i_{d-2} + n_{d-2} (\ldots + n_1 i_0))$.
@end tex
@ifinfo
i[d-1] + n[d-1] * (i[d-2] + n[d-2] * (... + n[1] * i[0])).
@end ifinfo
@html
i<sub>d-1</sub> + n<sub>d-1</sub> * (i<sub>d-2</sub> + n<sub>d-2</sub> * (... + n<sub>1</sub> * i<sub>0</sub>)).
@end html
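
The position formula can be evaluated with a short loop that works from
the innermost parenthesis outward. The following is only a sketch in
plain C; the function name @code{row_major_offset} and its argument
order are hypothetical and are not part of the FFTW API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: row-major position of the element with indices
   i[0..d-1] in an array of rank d with dimensions n[0..d-1].
   Evaluates i[d-1] + n[d-1] * (i[d-2] + n[d-2] * (... + n[1] * i[0]))
   starting from i[0], the innermost term. */
size_t row_major_offset(int d, const int *n, const int *i)
{
    size_t offset = 0;
    int k;
    for (k = 0; k < d; ++k) {
        assert(i[k] >= 0 && i[k] < n[k]);   /* zero-based and in range */
        offset = offset * (size_t) n[k] + (size_t) i[k];
    }
    return offset;
}
```

For example, in a 5 x 12 x 27 array the element with indices (1, 2, 3)
is located at position 3 + 27 * (2 + 12 * 1) = 381. Note that the first
dimension (here 5) never enters the result; it only bounds the first
index.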

Note that, for the ordinary complex DFT, each element of the array
must be of type @code{fftw_complex}; i.e., a (real, imaginary) pair of
(double-precision) numbers.

In the advanced FFTW interface, the physical dimensions @math{n} from
which the indices are computed can be different from (larger than)
the logical dimensions of the transform to be computed, in order to
transform a subset of a larger array.
@cindex advanced interface
Note also that, in the advanced interface, the expression above is
multiplied by a @dfn{stride} to get the actual array index---this is
useful in situations where each element of the multi-dimensional array
is actually a data structure (or another array), and you just want to
transform a single field. In the basic interface, however, the stride
is 1.
@cindex stride
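
The role of the stride can be illustrated without FFTW itself. In the
sketch below, several signals (``channels'') are interleaved in one
array, so each element of a 2 x 3 array is itself a small array and the
stride equals the number of channels. The names @code{strided_index}
and @code{read_channel}, and the interleaved layout, are assumptions
chosen for illustration; they are not part of the FFTW API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: actual array index of the element whose
   row-major position is `offset' when consecutive elements are
   `stride' array slots apart. */
size_t strided_index(size_t offset, size_t stride)
{
    assert(stride >= 1);
    return offset * stride;
}

/* Read element (i, j) of one channel from a 2 x 3 array in which
   `nchan' channels are interleaved.  `channel' selects the starting
   slot and nchan acts as the stride. */
double read_channel(const double *data, size_t nchan, size_t channel,
                    size_t i, size_t j)
{
    size_t offset = j + 3 * i;              /* row-major, n[1] = 3 */
    assert(channel < nchan);
    return data[channel + strided_index(offset, nchan)];
}
```

With a stride of 1, as in the basic interface, @code{strided_index}
returns the row-major position unchanged.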

@c =========>
@node Column-major Format, Fixed-size Arrays in C, Row-major Format, Multi-dimensional Array Format
@subsection Column-major Format
@cindex column-major

Readers from the Fortran world are used to arrays stored in
@dfn{column-major} order (sometimes called ``Fortran order''). This is
essentially the opposite of row-major order in that, here, the
@emph{first} dimension's index varies most quickly.

If you have an array stored in column-major order and wish to
transform it using FFTW, it is quite easy to do. When creating the
plan, simply pass the dimensions of the array to the planner in
@emph{reverse order}. For example, if your array is a rank-three
@code{N x M x L} matrix in column-major order, you should pass the
dimensions of the array as if it were an @code{L x M x N} matrix
(which it is, from the perspective of FFTW). This is done for you
@emph{automatically} by the FFTW legacy-Fortran interface
(@pxref{Calling FFTW from Legacy Fortran}), but you must do it
manually with the modern Fortran interface (@pxref{Reversing array
dimensions}).
@cindex Fortran interface
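
Why reversing the dimensions works can be checked directly: the
column-major position of element (i, j, k) in an @code{N x M x L} array
is the same as the row-major position of element (k, j, i) in an
@code{L x M x N} array. The sketch below is plain C written for this
illustration; the function names are hypothetical and nothing here
calls FFTW.

```c
#include <assert.h>
#include <stddef.h>

/* Position of element (i, j, k) in an N x M x L array stored in
   column-major order: the first index varies most quickly. */
size_t col_major_3d(size_t i, size_t j, size_t k, size_t N, size_t M)
{
    return i + N * (j + M * k);
}

/* Position of element (a, b, c) in an n0 x n1 x n2 array stored in
   row-major order: the last index varies most quickly. */
size_t row_major_3d(size_t a, size_t b, size_t c, size_t n1, size_t n2)
{
    return c + n2 * (b + n1 * a);
}

/* Hypothetical helper: reverse a list of dimensions in place, so that
   (N, M, L) becomes (L, M, N) before it is handed to the planner. */
void reverse_dims(int rank, int *n)
{
    int lo, hi;
    assert(rank >= 0);
    for (lo = 0, hi = rank - 1; lo < hi; ++lo, --hi) {
        int tmp = n[lo];
        n[lo] = n[hi];
        n[hi] = tmp;
    }
}
```

Only the description of the array changes; the data in memory is left
exactly where it is.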

@c =========>
@node Fixed-size Arrays in C, Dynamic Arrays in C, Column-major Format, Multi-dimensional Array Format
@subsection Fixed-size Arrays in C
@cindex C multi-dimensional arrays

A multi-dimensional array whose size is declared at compile time in C
is @emph{already} in row-major order. You don't have to do anything
special to transform it. For example:

@example
@{
     fftw_complex data[N0][N1][N2];
     fftw_plan plan;
     ...
     plan = fftw_plan_dft_3d(N0, N1, N2, &data[0][0][0], &data[0][0][0],
                             FFTW_FORWARD, FFTW_ESTIMATE);
     ...
@}
@end example

This will plan a 3d in-place transform of size @code{N0 x N1 x N2}.
Notice how we took the address of the zero-th element to pass to the
planner (we could also have used a typecast).

However, we tend to @emph{discourage} users from declaring their
arrays in this way, for two reasons. First, this allocates the array
on the stack (``automatic'' storage), which has a very limited size on
most operating systems (declaring an array with more than a few
thousand elements will often cause a crash). (You can get around this
limitation on many systems by declaring the array as
@code{static} and/or global, but that has its own drawbacks.)
Second, it may not optimally align the array for use with a SIMD
FFTW (@pxref{SIMD alignment and fftw_malloc}). Instead, we recommend
using @code{fftw_malloc}, as described below.

@c =========>
@node Dynamic Arrays in C, Dynamic Arrays in C-The Wrong Way, Fixed-size Arrays in C, Multi-dimensional Array Format
@subsection Dynamic Arrays in C

We recommend allocating most arrays dynamically, with
@code{fftw_malloc}. This isn't too hard to do, although it is not as
straightforward for multi-dimensional arrays as it is for
one-dimensional arrays.

Creating the array is simple: using a dynamic-allocation routine like
@code{fftw_malloc}, allocate an array big enough to store N
@code{fftw_complex} values (for a complex DFT), where N is the product
of the sizes of the array dimensions (i.e., the total number of complex
values in the array). For example, here is code to allocate a
@threedims{5,12,27} rank-3 array:
@findex fftw_malloc

@example
fftw_complex *an_array;
an_array = (fftw_complex*) fftw_malloc(5*12*27 * sizeof(fftw_complex));
@end example

Accessing the array elements, however, is more tricky---you can't
simply use multiple applications of the @samp{[]} operator like you
could for fixed-size arrays. Instead, you have to explicitly compute
the offset into the array using the formula given earlier for
row-major arrays. For example, to reference the @math{(i,j,k)}-th
element of the array allocated above, you would use the expression
@code{an_array[k + 27 * (j + 12 * i)]}.

This pain can be alleviated somewhat by defining appropriate macros,
or, in C++, creating a class and overloading the @samp{()} operator.
The C99 standard provides a way to reinterpret the dynamic
array as a ``variable-length'' multi-dimensional array amenable to
@samp{[]}, but variable-length arrays were made an optional feature in
C11 and are not supported by every compiler.
@cindex C99
@cindex C++
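
One possible macro of the kind mentioned above is sketched here. So
that the example compiles without the FFTW headers, a local typedef
@code{cplx} stands in for @code{fftw_complex}, and
@code{malloc}/@code{free} stand in for
@code{fftw_malloc}/@code{fftw_free}; the names @code{IDX3} and
@code{macro_roundtrip} are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for fftw_complex (a real/imaginary pair of doubles) so that
   the sketch compiles without the FFTW headers. */
typedef double cplx[2];

/* Hypothetical macro: row-major position of element (i,j,k) in an
   array whose last two dimensions are n1 and n2. */
#define IDX3(i, j, k, n1, n2) \
    ((size_t) (k) + (size_t) (n2) * ((size_t) (j) + (size_t) (n1) * (size_t) (i)))

/* Allocate a 5 x 12 x 27 array, store a value through the macro, and
   read it back through the explicit expression used in the text.
   malloc/free stand in for fftw_malloc/fftw_free here. */
double macro_roundtrip(double value)
{
    cplx *an_array = (cplx *) malloc(5 * 12 * 27 * sizeof(cplx));
    double result;
    assert(an_array != NULL);
    an_array[IDX3(1, 2, 3, 12, 27)][0] = value;    /* real part */
    result = an_array[3 + 27 * (2 + 12 * 1)][0];
    free(an_array);
    return result;
}
```

The macro takes the dimensions as arguments rather than hard-coding 12
and 27, so the same definition serves arrays of other sizes.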

@c =========>
@node Dynamic Arrays in C-The Wrong Way, , Dynamic Arrays in C, Multi-dimensional Array Format
@subsection Dynamic Arrays in C---The Wrong Way

A different method for allocating multi-dimensional arrays in C is
often suggested, but it is incompatible with FFTW: @emph{using it will
cause FFTW to die a painful death}. We discuss the technique here,
however, because it is so commonly known and used. This method is to
create arrays of pointers to arrays of pointers to @dots{} etcetera.
For example, the analogue in this method to the example above is:

@example
int i,j;
fftw_complex ***a_bad_array;  /* @r{another way to make a 5x12x27 array} */

a_bad_array = (fftw_complex ***) malloc(5 * sizeof(fftw_complex **));
for (i = 0; i < 5; ++i) @{
     a_bad_array[i] =
        (fftw_complex **) malloc(12 * sizeof(fftw_complex *));
     for (j = 0; j < 12; ++j)
          a_bad_array[i][j] =
                (fftw_complex *) malloc(27 * sizeof(fftw_complex));
@}
@end example

As you can see, this sort of array is inconvenient to allocate (and
deallocate). On the other hand, it has the advantage that the
@math{(i,j,k)}-th element can be referenced simply by
@code{a_bad_array[i][j][k]}.

If you like this technique and want to maximize convenience in accessing
the array, but still want to pass the array to FFTW, you can use a
hybrid method. Allocate the array as one contiguous block, but also
declare an array of arrays of pointers that point to appropriate places
in the block. That sort of trick is beyond the scope of this
documentation; for more information on multi-dimensional arrays in C,
see the @code{comp.lang.c}
@uref{http://c-faq.com/aryptr/dynmuldimary.html, FAQ}.
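
For orientation only, one possible arrangement of the hybrid method is
sketched below; it is not necessarily the one the FAQ recommends. The
type @code{hybrid3d} and its functions are hypothetical, a local typedef
@code{cplx} stands in for @code{fftw_complex}, and @code{malloc} stands
in for @code{fftw_malloc}. In a real program the contiguous
@code{block} member is what would be passed to the planner.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for fftw_complex so the sketch compiles without FFTW. */
typedef double cplx[2];

/* Hypothetical container for the three allocations of the layout. */
typedef struct {
    cplx *block;     /* contiguous n0*n1*n2 data, in row-major order */
    cplx **rows;     /* n0*n1 pointers, one per (i,j) pair */
    cplx ***planes;  /* n0 pointers, one per i: use planes[i][j][k] */
} hybrid3d;

/* Build the layout; returns 0 on success, -1 if an allocation fails. */
int hybrid3d_init(hybrid3d *h, size_t n0, size_t n1, size_t n2)
{
    size_t i, j;
    assert(h != NULL && n0 > 0 && n1 > 0 && n2 > 0);
    h->block = (cplx *) malloc(n0 * n1 * n2 * sizeof(cplx));
    h->rows = (cplx **) malloc(n0 * n1 * sizeof(cplx *));
    h->planes = (cplx ***) malloc(n0 * sizeof(cplx **));
    if (!h->block || !h->rows || !h->planes) {
        free(h->block);
        free(h->rows);
        free(h->planes);
        return -1;
    }
    for (i = 0; i < n0; ++i) {
        h->planes[i] = h->rows + i * n1;
        for (j = 0; j < n1; ++j)
            h->rows[i * n1 + j] = h->block + (i * n1 + j) * n2;
    }
    return 0;
}

/* Release all three allocations. */
void hybrid3d_free(hybrid3d *h)
{
    free(h->planes);
    free(h->rows);
    free(h->block);
}
```

Because every pointer refers into the single block,
@code{planes[i][j][k]} and @code{block[k + n2 * (j + n1 * i)]} name the
same element, unlike in @code{a_bad_array}, whose rows are separate
allocations.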

@c ------------------------------------------------------------
@node Words of Wisdom-Saving Plans, Caveats in Using Wisdom, Multi-dimensional Array Format, Other Important Topics
@section Words of Wisdom---Saving Plans
@cindex wisdom
@cindex saving plans to disk

FFTW implements a method for saving plans to disk and restoring them.
In fact, what FFTW does is more general than just saving and loading
plans. The mechanism is called @dfn{wisdom}. Here, we describe
this feature at a high level. @xref{FFTW Reference}, for a less casual
but more complete discussion of how to use wisdom in FFTW.

Plans created with the @code{FFTW_MEASURE}, @code{FFTW_PATIENT}, or
@code{FFTW_EXHAUSTIVE} options produce near-optimal FFT performance,
but may require a long time to compute because FFTW must measure the
runtime of many possible plans and select the best one. This setup is
designed for situations where so many transforms of the same size
must be computed that the start-up time is irrelevant. For short
initialization times, but slower transforms, we have provided
@code{FFTW_ESTIMATE}. The @code{wisdom} mechanism is a way to get the
best of both worlds: you compute a good plan once, save it to
disk, and later reload it as many times as necessary. The wisdom
mechanism can actually save and reload many plans at once, not just
one.
@ctindex FFTW_MEASURE
@ctindex FFTW_PATIENT
@ctindex FFTW_EXHAUSTIVE
@ctindex FFTW_ESTIMATE


Whenever you create a plan, the FFTW planner accumulates wisdom, which
is information sufficient to reconstruct the plan. After planning,
you can save this information to disk by means of the function:
@example
int fftw_export_wisdom_to_filename(const char *filename);
@end example
@findex fftw_export_wisdom_to_filename
(This function returns non-zero on success.)

The next time you run the program, you can restore the wisdom with
@code{fftw_import_wisdom_from_filename} (which also returns non-zero on success),
and then recreate the plan using the same flags as before.
@example
int fftw_import_wisdom_from_filename(const char *filename);
@end example
@findex fftw_import_wisdom_from_filename

Wisdom is automatically used for any size to which it is applicable, as
long as the planner flags are not more ``patient'' than those with which
the wisdom was created. For example, wisdom created with
@code{FFTW_MEASURE} can be used if you later plan with
@code{FFTW_ESTIMATE} or @code{FFTW_MEASURE}, but not with
@code{FFTW_PATIENT}.
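
The ``patience'' rule can be stated as a single comparison. The sketch
below is a toy model written for this illustration, not FFTW code: the
@code{patience} enumeration and @code{wisdom_applies} are hypothetical,
and the enumerators are deliberately @emph{not} the numeric values of
the real planner flags. It assumes the ordering from least to most
patient is estimate, measure, patient, exhaustive.

```c
#include <assert.h>

/* Hypothetical model of the planner flags, ordered from least to most
   patient.  These are NOT the numeric values of the real FFTW flags. */
typedef enum {
    MODEL_ESTIMATE,
    MODEL_MEASURE,
    MODEL_PATIENT,
    MODEL_EXHAUSTIVE
} patience;

/* Nonzero if wisdom created at patience `created' may be used when
   planning at patience `requested', i.e. when the request is not more
   patient than the wisdom. */
int wisdom_applies(patience created, patience requested)
{
    assert(created <= MODEL_EXHAUSTIVE && requested <= MODEL_EXHAUSTIVE);
    return requested <= created;
}
```

In this model, wisdom created at the measure level applies to requests
at the estimate and measure levels but not at the patient level, which
matches the example in the text.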

Wisdom is cumulative and is stored in a global, private
data structure managed internally by FFTW. The storage space required
is minimal, proportional to the logarithm of the sizes the wisdom was
generated from. If memory usage is a concern, however, the wisdom can
be forgotten and its associated memory freed by calling:
@example
void fftw_forget_wisdom(void);
@end example
@findex fftw_forget_wisdom

Wisdom can be exported to a file, a string, or any other medium.
For details, see @ref{Wisdom}.

@node Caveats in Using Wisdom, , Words of Wisdom-Saving Plans, Other Important Topics
@section Caveats in Using Wisdom
@cindex wisdom, problems with

@quotation
@html
<i>
@end html
For in much wisdom is much grief, and he that increaseth knowledge
increaseth sorrow.
@html
</i>
@end html
[Ecclesiastes 1:18]
@cindex Ecclesiastes
@end quotation
@iftex
@medskip
@end iftex

@cindex portability
There are pitfalls to using wisdom, in that it can negate FFTW's
ability to adapt to changing hardware and other conditions. For
example, it would be perfectly possible to export wisdom from a
program running on one processor and import it into a program running
on another processor. Doing so, however, would mean that the second
program would use plans optimized for the first processor, instead of
the one it is running on.

It should be safe to reuse wisdom as long as the hardware and program
binaries remain unchanged. (Actually, the optimal plan may change even
between runs of the same binary on identical hardware, due to
differences in the virtual memory environment, etcetera. Users
seriously interested in performance should worry about this problem,
too.) If the same wisdom is used for two different program binaries,
even running on the same machine, the plans may well be sub-optimal
because of differing code alignments. It is therefore wise to recreate
wisdom every time an application is recompiled. The more the
underlying hardware and software change between the creation of wisdom
and its use, the greater the risk of sub-optimal plans.

Nevertheless, if the choice is between using @code{FFTW_ESTIMATE} or
using possibly-suboptimal wisdom (created on the same machine, but for a
different binary), the wisdom is likely to be better. For this reason,
we provide a function to import wisdom from a standard system-wide
location (@code{/etc/fftw/wisdom} on Unix):
@cindex wisdom, system-wide

@example
int fftw_import_system_wisdom(void);
@end example
@findex fftw_import_system_wisdom

FFTW also provides a standalone program, @code{fftw-wisdom} (described
by its own @code{man} page on Unix) with which users can create wisdom,
e.g. for a canonical set of sizes to store in the system wisdom file.
@xref{Wisdom Utilities}.
@cindex fftw-wisdom utility
