Mercurial > hg > sv-dependency-builds
comparison src/fftw-3.3.3/doc/upgrading.texi @ 10:37bf6b4a2645
Add FFTW3
author | Chris Cannam |
---|---|
date | Wed, 20 Mar 2013 15:35:50 +0000 |
parents | |
children |
comparison
equal
deleted
inserted
replaced
9:c0fb53affa76 | 10:37bf6b4a2645 |
---|---|
1 @node Upgrading from FFTW version 2, Installation and Customization, Calling FFTW from Legacy Fortran, Top | |
2 @chapter Upgrading from FFTW version 2 | |
3 | |
4 In this chapter, we outline the process for updating codes designed for | |
5 the older FFTW 2 interface to work with FFTW 3. The interface for FFTW | |
6 3 is not backwards-compatible with the interface for FFTW 2 and earlier | |
7 versions; codes written to use those versions will fail to link with | |
8 FFTW 3. Nor is it possible to write ``compatibility wrappers'' to | |
9 bridge the gap (at least not efficiently), because FFTW 3 has different | |
10 semantics from previous versions. However, upgrading should be a | |
11 straightforward process because the data formats are identical and the | |
12 overall style of planning/execution is essentially the same. | |
13 | |
14 Unlike FFTW 2, there are no separate header files for real and complex | |
15 transforms (or even for different precisions) in FFTW 3; all interfaces | |
16 are defined in the @code{<fftw3.h>} header file. | |
17 | |
18 @heading Numeric Types | |
19 | |
20 The main difference in data types is that @code{fftw_complex} in FFTW 2 | |
21 was defined as a @code{struct} with macros @code{c_re} and @code{c_im} | |
22 for accessing the real/imaginary parts. (This is binary-compatible with | |
23 FFTW 3 on any machine except perhaps for some older Crays in single | |
24 precision.) The equivalent macros for FFTW 3 are: | |
25 | |
26 @example | |
27 #define c_re(c) ((c)[0]) | |
28 #define c_im(c) ((c)[1]) | |
29 @end example | |
30 | |
31 This does not work if you are using the C99 complex type, however, | |
32 unless you insert a @code{double*} typecast into the above macros | |
33 (@pxref{Complex numbers}). | |
34 | |
35 Also, FFTW 2 had an @code{fftw_real} typedef that was an alias for | |
36 @code{double} (in double precision). In FFTW 3 you should just use | |
37 @code{double} (or whatever precision you are employing). | |
38 | |
39 @heading Plans | |
40 | |
41 The major difference between FFTW 2 and FFTW 3 is in the | |
42 planning/execution division of labor. In FFTW 2, plans were found for a | |
43 given transform size and type, and then could be applied to @emph{any} | |
44 arrays and for @emph{any} multiplicity/stride parameters. In FFTW 3, | |
45 you specify the particular arrays, stride parameters, etcetera when | |
46 creating the plan, and the plan is then executed for @emph{those} arrays | |
47 (unless the guru interface is used) and @emph{those} parameters | |
48 @emph{only}. (FFTW 2 had ``specific planner'' routines that planned for | |
49 a particular array and stride, but the plan could still be used for | |
50 other arrays and strides.) That is, much of the information that was | |
51 formerly specified at execution time is now specified at planning time. | |
52 | |
53 Like FFTW 2's specific planner routines, the FFTW 3 planner overwrites | |
54 the input/output arrays unless you use @code{FFTW_ESTIMATE}. | |
55 | |
56 FFTW 2 had separate data types @code{fftw_plan}, @code{fftwnd_plan}, | |
57 @code{rfftw_plan}, and @code{rfftwnd_plan} for complex and real one- and | |
58 multi-dimensional transforms, and each type had its own @samp{destroy} | |
59 function. In FFTW 3, all plans are of type @code{fftw_plan} and all are | |
60 destroyed by @code{fftw_destroy_plan(plan)}. | |
61 | |
62 Where you formerly used @code{fftw_create_plan} and @code{fftw_one} to | |
63 plan and compute a single 1d transform, you would now use | |
64 @code{fftw_plan_dft_1d} to plan the transform. If you used the generic | |
65 @code{fftw} function to execute the transform with multiplicity | |
66 (@code{howmany}) and stride parameters, you would now use the advanced | |
67 interface @code{fftw_plan_many_dft} to specify those parameters. The | |
68 plans are now executed with @code{fftw_execute(plan)}, which takes all | |
69 of its parameters (including the input/output arrays) from the plan. | |
70 | |
71 In-place transforms no longer interpret their output argument as scratch | |
72 space, nor is there an @code{FFTW_IN_PLACE} flag. You simply pass the | |
73 same pointer for both the input and output arguments. (Previously, the | |
74 output @code{ostride} and @code{odist} parameters were ignored for | |
75 in-place transforms; now, if they are specified via the advanced | |
76 interface, they are significant even in the in-place case, although they | |
77 should normally equal the corresponding input parameters.) | |
78 | |
79 The @code{FFTW_ESTIMATE} and @code{FFTW_MEASURE} flags have the same | |
80 meaning as before, although the planning time will differ. You may also | |
81 consider using @code{FFTW_PATIENT}, which is like @code{FFTW_MEASURE} | |
82 except that it takes more time in order to consider a wider variety of | |
83 algorithms. | |
84 | |
85 For multi-dimensional complex DFTs, instead of @code{fftwnd_create_plan} | |
86 (or @code{fftw2d_create_plan} or @code{fftw3d_create_plan}), followed by | |
87 @code{fftwnd_one}, you would use @code{fftw_plan_dft} (or | |
88 @code{fftw_plan_dft_2d} or @code{fftw_plan_dft_3d}). followed by | |
89 @code{fftw_execute}. If you used @code{fftwnd} to to specify strides | |
90 etcetera, you would instead specify these via @code{fftw_plan_many_dft}. | |
91 | |
92 The analogues to @code{rfftw_create_plan} and @code{rfftw_one} with | |
93 @code{FFTW_REAL_TO_COMPLEX} or @code{FFTW_COMPLEX_TO_REAL} directions | |
94 are @code{fftw_plan_r2r_1d} with kind @code{FFTW_R2HC} or | |
95 @code{FFTW_HC2R}, followed by @code{fftw_execute}. The stride etcetera | |
96 arguments of @code{rfftw} are now in @code{fftw_plan_many_r2r}. | |
97 | |
98 Instead of @code{rfftwnd_create_plan} (or @code{rfftw2d_create_plan} or | |
99 @code{rfftw3d_create_plan}) followed by | |
100 @code{rfftwnd_one_real_to_complex} or | |
101 @code{rfftwnd_one_complex_to_real}, you now use @code{fftw_plan_dft_r2c} | |
102 (or @code{fftw_plan_dft_r2c_2d} or @code{fftw_plan_dft_r2c_3d}) or | |
103 @code{fftw_plan_dft_c2r} (or @code{fftw_plan_dft_c2r_2d} or | |
104 @code{fftw_plan_dft_c2r_3d}), respectively, followed by | |
105 @code{fftw_execute}. As usual, the strides etcetera of | |
106 @code{rfftwnd_real_to_complex} or @code{rfftwnd_complex_to_real} are no | |
107 specified in the advanced planner routines, | |
108 @code{fftw_plan_many_dft_r2c} or @code{fftw_plan_many_dft_c2r}. | |
109 | |
110 @heading Wisdom | |
111 | |
112 In FFTW 2, you had to supply the @code{FFTW_USE_WISDOM} flag in order to | |
113 use wisdom; in FFTW 3, wisdom is always used. (You could simulate the | |
114 FFTW 2 wisdom-less behavior by calling @code{fftw_forget_wisdom} after | |
115 every planner call.) | |
116 | |
117 The FFTW 3 wisdom import/export routines are almost the same as before | |
118 (although the storage format is entirely different). There is one | |
119 significant difference, however. In FFTW 2, the import routines would | |
120 never read past the end of the wisdom, so you could store extra data | |
121 beyond the wisdom in the same file, for example. In FFTW 3, the | |
122 file-import routine may read up to a few hundred bytes past the end of | |
123 the wisdom, so you cannot store other data just beyond it.@footnote{We | |
124 do our own buffering because GNU libc I/O routines are horribly slow for | |
125 single-character I/O, apparently for thread-safety reasons (whether you | |
126 are using threads or not).} | |
127 | |
128 Wisdom has been enhanced by additional humility in FFTW 3: whereas FFTW | |
129 2 would re-use wisdom for a given transform size regardless of the | |
130 stride etc., in FFTW 3 wisdom is only used with the strides etc. for | |
131 which it was created. Unfortunately, this means FFTW 3 has to create | |
132 new plans from scratch more often than FFTW 2 (in FFTW 2, planning | |
133 e.g. one transform of size 1024 also created wisdom for all smaller | |
134 powers of 2, but this no longer occurs). | |
135 | |
136 FFTW 3 also has the new routine @code{fftw_import_system_wisdom} to | |
137 import wisdom from a standard system-wide location. | |
138 | |
139 @heading Memory allocation | |
140 | |
141 In FFTW 3, we recommend allocating your arrays with @code{fftw_malloc} | |
142 and deallocating them with @code{fftw_free}; this is not required, but | |
143 allows optimal performance when SIMD acceleration is used. (Those two | |
144 functions actually existed in FFTW 2, and worked the same way, but were | |
145 not documented.) | |
146 | |
147 In FFTW 2, there were @code{fftw_malloc_hook} and @code{fftw_free_hook} | |
148 functions that allowed the user to replace FFTW's memory-allocation | |
149 routines (e.g. to implement different error-handling, since by default | |
150 FFTW prints an error message and calls @code{exit} to abort the program | |
151 if @code{malloc} returns @code{NULL}). These hooks are not supported in | |
152 FFTW 3; those few users who require this functionality can just | |
153 directly modify the memory-allocation routines in FFTW (they are defined | |
154 in @code{kernel/alloc.c}). | |
155 | |
156 @heading Fortran interface | |
157 | |
158 In FFTW 2, the subroutine names were obtained by replacing @samp{fftw_} | |
159 with @samp{fftw_f77}; in FFTW 3, you replace @samp{fftw_} with | |
160 @samp{dfftw_} (or @samp{sfftw_} or @samp{lfftw_}, depending upon the | |
161 precision). | |
162 | |
163 In FFTW 3, we have begun recommending that you always declare the type | |
164 used to store plans as @code{integer*8}. (Too many people didn't notice | |
165 our instruction to switch from @code{integer} to @code{integer*8} for | |
166 64-bit machines.) | |
167 | |
168 In FFTW 3, we provide a @code{fftw3.f} ``header file'' to include in | |
169 your code (and which is officially installed on Unix systems). (In FFTW | |
170 2, we supplied a @code{fftw_f77.i} file, but it was not installed.) | |
171 | |
172 Otherwise, the C-Fortran interface relationship is much the same as it | |
173 was before (e.g. return values become initial parameters, and | |
174 multi-dimensional arrays are in column-major order). Unlike FFTW 2, we | |
175 do provide some support for wisdom import/export in Fortran | |
176 (@pxref{Wisdom of Fortran?}). | |
177 | |
178 @heading Threads | |
179 | |
180 Like FFTW 2, only the execution routines are thread-safe. All planner | |
181 routines, etcetera, should be called by only a single thread at a time | |
182 (@pxref{Thread safety}). @emph{Unlike} FFTW 2, there is no special | |
183 @code{FFTW_THREADSAFE} flag for the planner to allow a given plan to be | |
184 usable by multiple threads in parallel; this is now the case by default. | |
185 | |
186 The multi-threaded version of FFTW 2 required you to pass the number of | |
187 threads each time you execute the transform. The number of threads is | |
188 now stored in the plan, and is specified before the planner is called by | |
189 @code{fftw_plan_with_nthreads}. The threads initialization routine used | |
190 to be called @code{fftw_threads_init} and would return zero on success; | |
191 the new routine is called @code{fftw_init_threads} and returns zero on | |
192 failure. @xref{Multi-threaded FFTW}. | |
193 | |
194 There is no separate threads header file in FFTW 3; all the function | |
195 prototypes are in @code{<fftw3.h>}. However, you still have to link to | |
196 a separate library (@code{-lfftw3_threads -lfftw3 -lm} on Unix), as well as | |
197 to the threading library (e.g. POSIX threads on Unix). | |
198 |