Mercurial > hg > sv-dependency-builds
comparison src/fftw-3.3.5/doc/FAQ/fftw-faq.html/section3.html @ 127:7867fa7e1b6b
Current fftw source
author | Chris Cannam <cannam@all-day-breakfast.com> |
---|---|
date | Tue, 18 Oct 2016 13:40:26 +0100 |
parents | |
children |
comparison
equal
deleted
inserted
replaced
126:4a7071416412 | 127:7867fa7e1b6b |
---|---|
1 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN"> | |
2 <html> | |
3 <head><title> | |
4 FFTW FAQ - Section 3 | |
5 </title> | |
6 <link rev="made" href="mailto:fftw@fftw.org"> | |
7 <link rel="Contents" href="index.html"> | |
8 <link rel="Start" href="index.html"> | |
9 <link rel="Next" href="section4.html"><link rel="Previous" href="section2.html"><link rel="Bookmark" title="FFTW FAQ" href="index.html"> | |
10 </head><body text="#000000" bgcolor="#FFFFFF"><h1> | |
11 FFTW FAQ - Section 3 <br> | |
12 Using FFTW | |
13 </h1> | |
14 | |
15 <ul> | |
16 <li><a href="#fftw2to3" rel=subdocument>Q3.1. Why not support the FFTW 2 interface in FFTW | |
17 3?</a> | |
18 <li><a href="#planperarray" rel=subdocument>Q3.2. Why do FFTW 3 plans encapsulate the input/output arrays and not just | |
19 the algorithm?</a> | |
20 <li><a href="#slow" rel=subdocument>Q3.3. FFTW seems really slow.</a> | |
21 <li><a href="#slows" rel=subdocument>Q3.4. FFTW slows down after repeated calls.</a> | |
22 <li><a href="#segfault" rel=subdocument>Q3.5. An FFTW routine is crashing when I call it.</a> | |
23 <li><a href="#fortran64" rel=subdocument>Q3.6. My Fortran program crashes when calling FFTW.</a> | |
24 <li><a href="#conventions" rel=subdocument>Q3.7. FFTW gives results different from my old | |
25 FFT.</a> | |
26 <li><a href="#nondeterministic" rel=subdocument>Q3.8. FFTW gives different results between runs</a> | |
27 <li><a href="#savePlans" rel=subdocument>Q3.9. Can I save FFTW's plans?</a> | |
28 <li><a href="#whyscaled" rel=subdocument>Q3.10. Why does your inverse transform return a scaled | |
29 result?</a> | |
30 <li><a href="#centerorigin" rel=subdocument>Q3.11. How can I make FFTW put the origin (zero frequency) at the center of | |
31 its output?</a> | |
32 <li><a href="#imageaudio" rel=subdocument>Q3.12. How do I FFT an image/audio file in <i>foobar</i> format?</a> | |
33 <li><a href="#linkfails" rel=subdocument>Q3.13. My program does not link (on Unix).</a> | |
34 <li><a href="#linkheader" rel=subdocument>Q3.14. I included your header, but linking still | |
35 fails.</a> | |
36 <li><a href="#nostack" rel=subdocument>Q3.15. My program crashes, complaining about stack | |
37 space.</a> | |
38 <li><a href="#leaks" rel=subdocument>Q3.16. FFTW seems to have a memory leak.</a> | |
39 <li><a href="#allzero" rel=subdocument>Q3.17. The output of FFTW's transform is all zeros.</a> | |
40 <li><a href="#vbetalia" rel=subdocument>Q3.18. How do I call FFTW from the Microsoft language du | |
41 jour?</a> | |
42 <li><a href="#pruned" rel=subdocument>Q3.19. Can I compute only a subset of the DFT outputs?</a> | |
43 <li><a href="#transpose" rel=subdocument>Q3.20. Can I use FFTW's routines for in-place and out-of-place matrix | |
44 transposition?</a> | |
45 </ul><hr> | |
46 | |
47 <h2><A name="fftw2to3"> | |
48 Question 3.1. Why not support the FFTW 2 interface in FFTW | |
49 3? | |
50 </A></h2> | |
51 | |
52 FFTW 3 has semantics incompatible with earlier versions: its plans can | |
53 only be used for a given stride, multiplicity, and other | |
54 characteristics of the input and output arrays; these stronger | |
55 semantics are necessary for performance reasons. Thus, it is | |
56 impossible to efficiently emulate the older interface (whose plans can | |
57 be used for any transform of the same size). We believe that it | |
58 should be possible to upgrade most programs without any difficulty, | |
59 however. | |
60 <h2><A name="planperarray"> | |
61 Question 3.2. Why do FFTW 3 plans encapsulate the input/output arrays | |
62 and not just the algorithm? | |
63 </A></h2> | |
64 | |
65 There are several reasons: | |
66 <ul> | |
67 <li>It was important for performance reasons that the plan be specific to | |
68 array characteristics like the stride (and alignment, for SIMD), and | |
69 requiring that the user maintain these invariants is error prone. | |
70 | |
71 <li>In most high-performance applications, as far as we can tell, you are | |
72 usually transforming the same array over and over, so FFTW's semantics | |
73 should not be a burden. | |
74 <li>If you need to transform another array of the same size, creating a | |
75 new plan once the first exists is a cheap operation. | |
76 | |
77 <li>If you need to transform many arrays of the same size at once, you | |
78 should really use the <code>plan_many</code> routines in FFTW's "advanced" | |
79 interface. | |
80 <li>If the abovementioned array characteristics are the same, you are | |
81 willing to pay close attention to the documentation, and you really | |
82 need to, we provide a "new-array execution" interface to | |
83 apply a plan to a new array. | |
84 </ul> | |
85 | |
86 <h2><A name="slow"> | |
87 Question 3.3. FFTW seems really slow. | |
88 </A></h2> | |
89 | |
90 You are probably recreating the plan before every transform, rather | |
91 than creating it once and reusing it for all transforms of the same | |
92 size. FFTW is designed to be used in the following way: | |
93 | |
94 <ul> | |
95 <li>First, you create a plan. This will take several seconds. | |
96 | |
97 <li>Then, you reuse the plan many times to perform FFTs. These are fast. | |
98 | |
99 </ul> | |
100 If you don't need to compute many transforms and the time for the | |
101 planner is significant, you have two options. First, you can use the | |
102 <code>FFTW_ESTIMATE</code> option in the planner, which uses heuristics | |
103 instead of runtime measurements and produces a good plan in a short | |
104 time. Second, you can use the wisdom feature to precompute the plan; | |
105 see <A href="#savePlans">Q3.9 `Can I save FFTW's plans?'</A> | |
106 <h2><A name="slows"> | |
107 Question 3.4. FFTW slows down after repeated | |
108 calls. | |
109 </A></h2> | |
110 | |
111 Probably, NaNs or similar are creeping into your data, and the | |
112 slowdown is due to the resulting floating-point exceptions. For | |
113 example, be aware that repeatedly FFTing the same array is a diverging | |
114 process (because FFTW computes the unnormalized transform). | |
115 | |
116 <h2><A name="segfault"> | |
117 Question 3.5. An FFTW routine is crashing when I call | |
118 it. | |
119 </A></h2> | |
120 | |
121 Did the FFTW test programs pass (<code>make check</code>, or <code>cd tests; make bigcheck</code> if you want to be paranoid)? If so, you almost | |
122 certainly have a bug in your own code. For example, you could be | |
123 passing invalid arguments (such as wrongly-sized arrays) to FFTW, or | |
124 you could simply have memory corruption elsewhere in your program that | |
125 causes random crashes later on. Please don't complain to us unless | |
126 you can come up with a minimal self-contained program (preferably | |
127 under 30 lines) that illustrates the problem. | |
128 | |
129 <h2><A name="fortran64"> | |
130 Question 3.6. My Fortran program crashes when calling | |
131 FFTW. | |
132 </A></h2> | |
133 | |
134 As described in the manual, on 64-bit machines you must store the | |
135 plans in variables large enough to hold a pointer, for example | |
136 <code>integer*8</code>. We recommend using <code>integer*8</code> on 32-bit machines as well, to simplify porting. | |
137 | |
138 <h2><A name="conventions"> | |
139 Question 3.7. FFTW gives results different from my old | |
140 FFT. | |
141 </A></h2> | |
142 | |
143 People follow many different conventions for the DFT, and you should | |
144 be sure to know the ones that we use (described in the FFTW manual). | |
145 In particular, you should be aware that the | |
146 <code>FFTW_FORWARD</code>/<code>FFTW_BACKWARD</code> directions correspond to signs of -1/+1 in the exponent of the DFT definition. | |
147 (<i>Numerical Recipes</i> uses the opposite convention.) | |
148 <p> | |
149 You should also know that we compute an unnormalized transform. In | |
150 contrast, Matlab is an example of program that computes a normalized | |
151 transform. See <A href="#whyscaled">Q3.10 `Why does your inverse transform return a scaled | |
152 result?'</A>. | |
153 <p> | |
154 Finally, note that floating-point arithmetic is not exact, so | |
155 different FFT algorithms will give slightly different results (on the | |
156 order of the numerical accuracy; typically a fractional difference of | |
157 1e-15 or so in double precision). | |
158 <h2><A name="nondeterministic"> | |
159 Question 3.8. FFTW gives different results between | |
160 runs | |
161 </A></h2> | |
162 | |
163 If you use <code>FFTW_MEASURE</code> or <code>FFTW_PATIENT</code> mode, then the algorithm FFTW employs is not deterministic: it depends on | |
164 runtime performance measurements. This will cause the results to vary | |
165 slightly from run to run. However, the differences should be slight, | |
166 on the order of the floating-point precision, and therefore should | |
167 have no practical impact on most applications. | |
168 | |
169 <p> | |
170 If you use saved plans (wisdom) or <code>FFTW_ESTIMATE</code> mode, however, then the algorithm is deterministic and the results should be | |
171 identical between runs. | |
172 <h2><A name="savePlans"> | |
173 Question 3.9. Can I save FFTW's plans? | |
174 </A></h2> | |
175 | |
176 Yes. Starting with version 1.2, FFTW provides the | |
177 <code>wisdom</code> mechanism for saving plans; see the FFTW manual. | |
178 | |
179 <h2><A name="whyscaled"> | |
180 Question 3.10. Why does your inverse transform return a scaled | |
181 result? | |
182 </A></h2> | |
183 | |
184 Computing the forward transform followed by the backward transform (or | |
185 vice versa) yields the original array scaled by the size of the array. | |
186 (For multi-dimensional transforms, the size of the array is the | |
187 product of the dimensions.) We could, instead, have chosen a | |
188 normalization that would have returned the unscaled array. Or, to | |
189 accomodate the many conventions in this matter, the transform routines | |
190 could have accepted a "scale factor" parameter. We did not | |
191 do this, however, for two reasons. First, we didn't want to sacrifice | |
192 performance in the common case where the scale factor is 1. Second, in | |
193 real applications the FFT is followed or preceded by some computation | |
194 on the data, into which the scale factor can typically be absorbed at | |
195 little or no cost. | |
196 <h2><A name="centerorigin"> | |
197 Question 3.11. How can I make FFTW put the origin (zero frequency) at | |
198 the center of its output? | |
199 </A></h2> | |
200 | |
201 For human viewing of a spectrum, it is often convenient to put the | |
202 origin in frequency space at the center of the output array, rather | |
203 than in the zero-th element (the default in FFTW). If all of the | |
204 dimensions of your array are even, you can accomplish this by simply | |
205 multiplying each element of the input array by (-1)^(i + j + ...), | |
206 where i, j, etcetera are the indices of the element. (This trick is a | |
207 general property of the DFT, and is not specific to FFTW.) | |
208 | |
209 <h2><A name="imageaudio"> | |
210 Question 3.12. How do I FFT an image/audio file in | |
211 <i>foobar</i> format? | |
212 </A></h2> | |
213 | |
214 FFTW performs an FFT on an array of floating-point values. You can | |
215 certainly use it to compute the transform of an image or audio stream, | |
216 but you are responsible for figuring out your data format and | |
217 converting it to the form FFTW requires. | |
218 | |
219 <h2><A name="linkfails"> | |
220 Question 3.13. My program does not link (on | |
221 Unix). | |
222 </A></h2> | |
223 | |
224 The libraries must be listed in the correct order | |
225 (<code>-lfftw3 -lm</code> for FFTW 3.x) and <i>after</i> your program sources/objects. (The general rule is that if <i>A</i> uses <i>B</i>, then <i>A</i> must be listed before <i>B</i> in the link command.). | |
226 <h2><A name="linkheader"> | |
227 Question 3.14. I included your header, but linking still | |
228 fails. | |
229 </A></h2> | |
230 | |
231 You're a C++ programmer, aren't you? You have to compile the FFTW | |
232 library and link it into your program, not just | |
233 <code>#include <fftw3.h></code>. (Yes, this is really a FAQ.) | |
234 <h2><A name="nostack"> | |
235 Question 3.15. My program crashes, complaining about stack | |
236 space. | |
237 </A></h2> | |
238 | |
239 You cannot declare large arrays with automatic storage (e.g. via | |
240 <code>fftw_complex array[N]</code>); you should use <code>fftw_malloc</code> (or equivalent) to allocate the arrays you want | |
241 to transform if they are larger than a few hundred elements. | |
242 | |
243 <h2><A name="leaks"> | |
244 Question 3.16. FFTW seems to have a memory | |
245 leak. | |
246 </A></h2> | |
247 | |
248 After you create a plan, FFTW caches the information required to | |
249 quickly recreate the plan. (See <A href="#savePlans">Q3.9 `Can I save FFTW's plans?'</A>) It also maintains a small amount of other persistent memory. You can deallocate all of | |
250 FFTW's internally allocated memory, if you wish, by calling | |
251 <code>fftw_cleanup()</code>, as documented in the manual. | |
252 <h2><A name="allzero"> | |
253 Question 3.17. The output of FFTW's transform is all | |
254 zeros. | |
255 </A></h2> | |
256 | |
257 You should initialize your input array <i>after</i> creating the plan, unless you use <code>FFTW_ESTIMATE</code>: planning with <code>FFTW_MEASURE</code> or <code>FFTW_PATIENT</code> overwrites the input/output arrays, as described in the manual. | |
258 | |
259 <h2><A name="vbetalia"> | |
260 Question 3.18. How do I call FFTW from the Microsoft language du | |
261 jour? | |
262 </A></h2> | |
263 | |
264 Please <i>do not</i> ask us Windows-specific questions. We do not | |
265 use Windows. We know nothing about Visual Basic, Visual C++, or .NET. | |
266 Please find the appropriate Usenet discussion group and ask your | |
267 question there. See also <A href="section2.html#runOnWindows">Q2.2 `Does FFTW run on Windows?'</A>. | |
268 <h2><A name="pruned"> | |
269 Question 3.19. Can I compute only a subset of the DFT | |
270 outputs? | |
271 </A></h2> | |
272 | |
273 In general, no, an FFT intrinsically computes all outputs from all | |
274 inputs. In principle, there is something called a | |
275 <i>pruned FFT</i> that can do what you want, but to compute K outputs out of N the | |
276 complexity is in general O(N log K) instead of O(N log N), thus saving | |
277 only a small additive factor in the log. (The same argument holds if | |
278 you instead have only K nonzero inputs.) | |
279 | |
280 <p> | |
281 There are some specific cases in which you can get the O(N log K) | |
282 performance benefits easily, however, by combining a few ordinary | |
283 FFTs. In particular, the case where you want the first K outputs, | |
284 where K divides N, can be handled by performing N/K transforms of size | |
285 K and then summing the outputs multiplied by appropriate phase | |
286 factors. For more details, see <A href="http://www.fftw.org/pruned.html">pruned FFTs with FFTW</A>. | |
287 <p> | |
288 There are also some algorithms that compute pruned transforms | |
289 <i>approximately</i>, but they are beyond the scope of this FAQ. | |
290 | |
291 <h2><A name="transpose"> | |
292 Question 3.20. Can I use FFTW's routines for in-place and | |
293 out-of-place matrix transposition? | |
294 </A></h2> | |
295 | |
296 You can use the FFTW guru interface to create a rank-0 transform of | |
297 vector rank 2 where the vector strides are transposed. (A rank-0 | |
298 transform is equivalent to a 1D transform of size 1, which. just | |
299 copies the input into the output.) Specifying the same location for | |
300 the input and output makes the transpose in-place. | |
301 | |
302 <p> | |
303 For double-valued data stored in row-major format, plan creation looks | |
304 like this: <pre> | |
305 fftw_plan plan_transpose(int rows, int cols, double *in, double *out) | |
306 { | |
307 const unsigned flags = FFTW_ESTIMATE; /* other flags are possible */ | |
308 fftw_iodim howmany_dims[2]; | |
309 | |
310 howmany_dims[0].n = rows; | |
311 howmany_dims[0].is = cols; | |
312 howmany_dims[0].os = 1; | |
313 | |
314 howmany_dims[1].n = cols; | |
315 howmany_dims[1].is = 1; | |
316 howmany_dims[1].os = rows; | |
317 | |
318 return fftw_plan_guru_r2r(/*rank=*/ 0, /*dims=*/ NULL, | |
319 /*howmany_rank=*/ 2, howmany_dims, | |
320 in, out, /*kind=*/ NULL, flags); | |
321 } | |
322 </pre> | |
323 (This entry was written by Rhys Ulerich.) | |
324 <hr> | |
325 Next: <a href="section4.html" rel=precedes>Internals of FFTW</a>.<br> | |
326 Back: <a href="section2.html" rev=precedes>Installing FFTW</a>.<br> | |
327 <a href="index.html" rev=subdocument>Return to contents</a>.<p> | |
328 <address> | |
329 <A href="http://www.fftw.org">Matteo Frigo and Steven G. Johnson</A> / <A href="mailto:fftw@fftw.org">fftw@fftw.org</A> | |
330 - 30 July 2016 | |
331 </address><br> | |
332 Extracted from FFTW Frequently Asked Questions with Answers, | |
333 Copyright © 2016 Matteo Frigo and Massachusetts Institute of Technology. | |
334 </body></html> |