annotate src/fftw-3.3.3/doc/upgrading.texi @ 83:ae30d91d2ffe

Replace these with versions built using an older toolset (so as to avoid ABI incompatibilities when linking on Ubuntu 14.04 for packaging purposes)
author Chris Cannam
date Fri, 07 Feb 2020 11:51:13 +0000
parents 37bf6b4a2645
children
rev   line source
Chris@10 1 @node Upgrading from FFTW version 2, Installation and Customization, Calling FFTW from Legacy Fortran, Top
Chris@10 2 @chapter Upgrading from FFTW version 2
Chris@10 3
Chris@10 4 In this chapter, we outline the process for updating codes designed for
Chris@10 5 the older FFTW 2 interface to work with FFTW 3. The interface for FFTW
Chris@10 6 3 is not backwards-compatible with the interface for FFTW 2 and earlier
Chris@10 7 versions; codes written to use those versions will fail to link with
Chris@10 8 FFTW 3. Nor is it possible to write ``compatibility wrappers'' to
Chris@10 9 bridge the gap (at least not efficiently), because FFTW 3 has different
Chris@10 10 semantics from previous versions. However, upgrading should be a
Chris@10 11 straightforward process because the data formats are identical and the
Chris@10 12 overall style of planning/execution is essentially the same.
Chris@10 13
Chris@10 14 Unlike FFTW 2, there are no separate header files for real and complex
Chris@10 15 transforms (or even for different precisions) in FFTW 3; all interfaces
Chris@10 16 are defined in the @code{<fftw3.h>} header file.
Chris@10 17
Chris@10 18 @heading Numeric Types
Chris@10 19
Chris@10 20 The main difference in data types is that @code{fftw_complex} in FFTW 2
Chris@10 21 was defined as a @code{struct} with macros @code{c_re} and @code{c_im}
Chris@10 22 for accessing the real/imaginary parts. (This is binary-compatible with
Chris@10 23 FFTW 3 on any machine except perhaps for some older Crays in single
Chris@10 24 precision.) The equivalent macros for FFTW 3 are:
Chris@10 25
Chris@10 26 @example
Chris@10 27 #define c_re(c) ((c)[0])
Chris@10 28 #define c_im(c) ((c)[1])
Chris@10 29 @end example
Chris@10 30
Chris@10 31 This does not work if you are using the C99 complex type, however,
Chris@10 32 unless you insert a @code{double*} typecast into the above macros
Chris@10 33 (@pxref{Complex numbers}).
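
The typecast mentioned above can be sketched as follows. This is a minimal illustration, not code from FFTW: the names demo_complex and conjugate_in_place are hypothetical, and demo_complex stands in for the fftw_complex type you get when <complex.h> is included before <fftw3.h>. It relies on the C99 guarantee that a complex value is laid out like an array of two reals.

```c
#include <assert.h>
#include <complex.h>
#include <stddef.h>

/* Hypothetical stand-in for fftw_complex in C99-complex mode
   (i.e. when <complex.h> is included before <fftw3.h>). */
typedef double complex demo_complex;

/* FFTW 2 style accessors, with the double* typecast added so that they
   also work on the C99 complex type. The argument must be an lvalue. */
#define c_re(c) (((double *) &(c))[0])
#define c_im(c) (((double *) &(c))[1])

/* Hypothetical helper: conjugate n values in place through the macros. */
void conjugate_in_place(demo_complex *a, size_t n)
{
    assert(a != NULL || n == 0);
    for (size_t i = 0; i < n; ++i)
        c_im(a[i]) = -c_im(a[i]);
}
```

Because the macros take the address of their argument, they should also work unchanged on the default double[2] definition of fftw_complex, so legacy code that uses c_re and c_im need not care which definition is in effect.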
Chris@10 34
Chris@10 35 Also, FFTW 2 had an @code{fftw_real} typedef that was an alias for
Chris@10 36 @code{double} (in double precision). In FFTW 3 you should just use
Chris@10 37 @code{double} (or whatever precision you are employing).
Chris@10 38
Chris@10 39 @heading Plans
Chris@10 40
Chris@10 41 The major difference between FFTW 2 and FFTW 3 is in the
Chris@10 42 planning/execution division of labor. In FFTW 2, plans were found for a
Chris@10 43 given transform size and type, and then could be applied to @emph{any}
Chris@10 44 arrays and for @emph{any} multiplicity/stride parameters. In FFTW 3,
Chris@10 45 you specify the particular arrays, stride parameters, etcetera when
Chris@10 46 creating the plan, and the plan is then executed for @emph{those} arrays
Chris@10 47 (unless the guru interface is used) and @emph{those} parameters
Chris@10 48 @emph{only}. (FFTW 2 had ``specific planner'' routines that planned for
Chris@10 49 a particular array and stride, but the plan could still be used for
Chris@10 50 other arrays and strides.) That is, much of the information that was
Chris@10 51 formerly specified at execution time is now specified at planning time.
Chris@10 52
Chris@10 53 Like FFTW 2's specific planner routines, the FFTW 3 planner overwrites
Chris@10 54 the input/output arrays unless you use @code{FFTW_ESTIMATE}.
Chris@10 55
Chris@10 56 FFTW 2 had separate data types @code{fftw_plan}, @code{fftwnd_plan},
Chris@10 57 @code{rfftw_plan}, and @code{rfftwnd_plan} for complex and real one- and
Chris@10 58 multi-dimensional transforms, and each type had its own @samp{destroy}
Chris@10 59 function. In FFTW 3, all plans are of type @code{fftw_plan} and all are
Chris@10 60 destroyed by @code{fftw_destroy_plan(plan)}.
Chris@10 61
Chris@10 62 Where you formerly used @code{fftw_create_plan} and @code{fftw_one} to
Chris@10 63 plan and compute a single 1d transform, you would now use
Chris@10 64 @code{fftw_plan_dft_1d} to plan the transform. If you used the generic
Chris@10 65 @code{fftw} function to execute the transform with multiplicity
Chris@10 66 (@code{howmany}) and stride parameters, you would now use the advanced
Chris@10 67 interface @code{fftw_plan_many_dft} to specify those parameters. The
Chris@10 68 plans are now executed with @code{fftw_execute(plan)}, which takes all
Chris@10 69 of its parameters (including the input/output arrays) from the plan.
Chris@10 70
Chris@10 71 In-place transforms no longer interpret their output argument as scratch
Chris@10 72 space, nor is there an @code{FFTW_IN_PLACE} flag. You simply pass the
Chris@10 73 same pointer for both the input and output arguments. (Previously, the
Chris@10 74 output @code{ostride} and @code{odist} parameters were ignored for
Chris@10 75 in-place transforms; now, if they are specified via the advanced
Chris@10 76 interface, they are significant even in the in-place case, although they
Chris@10 77 should normally equal the corresponding input parameters.)
Chris@10 78
Chris@10 79 The @code{FFTW_ESTIMATE} and @code{FFTW_MEASURE} flags have the same
Chris@10 80 meaning as before, although the planning time will differ. You may also
Chris@10 81 consider using @code{FFTW_PATIENT}, which is like @code{FFTW_MEASURE}
Chris@10 82 except that it takes more time in order to consider a wider variety of
Chris@10 83 algorithms.
Chris@10 84
Chris@10 85 For multi-dimensional complex DFTs, instead of @code{fftwnd_create_plan}
Chris@10 86 (or @code{fftw2d_create_plan} or @code{fftw3d_create_plan}), followed by
Chris@10 87 @code{fftwnd_one}, you would use @code{fftw_plan_dft} (or
Chris@10 88 @code{fftw_plan_dft_2d} or @code{fftw_plan_dft_3d}), followed by
Chris@10 89 @code{fftw_execute}. If you used @code{fftwnd} to specify strides
Chris@10 90 etcetera, you would instead specify these via @code{fftw_plan_many_dft}.
Chris@10 91
Chris@10 92 The analogues to @code{rfftw_create_plan} and @code{rfftw_one} with
Chris@10 93 @code{FFTW_REAL_TO_COMPLEX} or @code{FFTW_COMPLEX_TO_REAL} directions
Chris@10 94 are @code{fftw_plan_r2r_1d} with kind @code{FFTW_R2HC} or
Chris@10 95 @code{FFTW_HC2R}, followed by @code{fftw_execute}. The stride etcetera
Chris@10 96 arguments of @code{rfftw} are now in @code{fftw_plan_many_r2r}.
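
The FFTW_R2HC kind produces output in the ``halfcomplex'' layout, which for a transform of size n stores r0, r1, ..., r(n/2) followed by the imaginary parts in reverse order, ending with i1. The accessors below are a small sketch of how to read one output bin from such an array; the names hc_real and hc_imag are hypothetical and not part of FFTW.

```c
#include <assert.h>

/* Hypothetical accessors (not part of FFTW) for an array hc of n doubles
   in halfcomplex order: r0, r1, ..., r(n/2), i((n+1)/2-1), ..., i2, i1.
   k is the frequency index, 0 <= k <= n/2. */

double hc_real(const double *hc, int n, int k)
{
    assert(n > 0 && k >= 0 && k <= n / 2);
    return hc[k];
}

double hc_imag(const double *hc, int n, int k)
{
    assert(n > 0 && k >= 0 && k <= n / 2);
    /* The DC term, and the Nyquist term when n is even, are purely real
       and have no stored imaginary part. */
    if (k == 0 || 2 * k == n)
        return 0.0;
    return hc[n - k];
}
```

For example, with n = 5 the array holds r0, r1, r2, i2, i1, so hc_imag(hc, 5, 1) reads the last element.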
Chris@10 97
Chris@10 98 Instead of @code{rfftwnd_create_plan} (or @code{rfftw2d_create_plan} or
Chris@10 99 @code{rfftw3d_create_plan}) followed by
Chris@10 100 @code{rfftwnd_one_real_to_complex} or
Chris@10 101 @code{rfftwnd_one_complex_to_real}, you now use @code{fftw_plan_dft_r2c}
Chris@10 102 (or @code{fftw_plan_dft_r2c_2d} or @code{fftw_plan_dft_r2c_3d}) or
Chris@10 103 @code{fftw_plan_dft_c2r} (or @code{fftw_plan_dft_c2r_2d} or
Chris@10 104 @code{fftw_plan_dft_c2r_3d}), respectively, followed by
Chris@10 105 @code{fftw_execute}. As usual, the strides etcetera of
Chris@10 106 @code{rfftwnd_real_to_complex} or @code{rfftwnd_complex_to_real} are now
Chris@10 107 specified in the advanced planner routines,
Chris@10 108 @code{fftw_plan_many_dft_r2c} or @code{fftw_plan_many_dft_c2r}.
Chris@10 109
Chris@10 110 @heading Wisdom
Chris@10 111
Chris@10 112 In FFTW 2, you had to supply the @code{FFTW_USE_WISDOM} flag in order to
Chris@10 113 use wisdom; in FFTW 3, wisdom is always used. (You could simulate the
Chris@10 114 FFTW 2 wisdom-less behavior by calling @code{fftw_forget_wisdom} after
Chris@10 115 every planner call.)
Chris@10 116
Chris@10 117 The FFTW 3 wisdom import/export routines are almost the same as before
Chris@10 118 (although the storage format is entirely different). There is one
Chris@10 119 significant difference, however. In FFTW 2, the import routines would
Chris@10 120 never read past the end of the wisdom, so you could store extra data
Chris@10 121 beyond the wisdom in the same file, for example. In FFTW 3, the
Chris@10 122 file-import routine may read up to a few hundred bytes past the end of
Chris@10 123 the wisdom, so you cannot store other data just beyond it.@footnote{We
Chris@10 124 do our own buffering because GNU libc I/O routines are horribly slow for
Chris@10 125 single-character I/O, apparently for thread-safety reasons (whether you
Chris@10 126 are using threads or not).}
Chris@10 127
Chris@10 128 Wisdom has been enhanced by additional humility in FFTW 3: whereas FFTW
Chris@10 129 2 would re-use wisdom for a given transform size regardless of the
Chris@10 130 stride etc., in FFTW 3 wisdom is only used with the strides etc. for
Chris@10 131 which it was created. Unfortunately, this means FFTW 3 has to create
Chris@10 132 new plans from scratch more often than FFTW 2 (in FFTW 2, planning
Chris@10 133 e.g. one transform of size 1024 also created wisdom for all smaller
Chris@10 134 powers of 2, but this no longer occurs).
Chris@10 135
Chris@10 136 FFTW 3 also has the new routine @code{fftw_import_system_wisdom} to
Chris@10 137 import wisdom from a standard system-wide location.
Chris@10 138
Chris@10 139 @heading Memory allocation
Chris@10 140
Chris@10 141 In FFTW 3, we recommend allocating your arrays with @code{fftw_malloc}
Chris@10 142 and deallocating them with @code{fftw_free}; this is not required, but
Chris@10 143 allows optimal performance when SIMD acceleration is used. (Those two
Chris@10 144 functions actually existed in FFTW 2, and worked the same way, but were
Chris@10 145 not documented.)
Chris@10 146
Chris@10 147 In FFTW 2, there were @code{fftw_malloc_hook} and @code{fftw_free_hook}
Chris@10 148 functions that allowed the user to replace FFTW's memory-allocation
Chris@10 149 routines (e.g. to implement different error-handling, since by default
Chris@10 150 FFTW prints an error message and calls @code{exit} to abort the program
Chris@10 151 if @code{malloc} returns @code{NULL}). These hooks are not supported in
Chris@10 152 FFTW 3; those few users who require this functionality can just
Chris@10 153 directly modify the memory-allocation routines in FFTW (they are defined
Chris@10 154 in @code{kernel/alloc.c}).
Chris@10 155
Chris@10 156 @heading Fortran interface
Chris@10 157
Chris@10 158 In FFTW 2, the subroutine names were obtained by replacing @samp{fftw_}
Chris@10 159 with @samp{fftw_f77_}; in FFTW 3, you replace @samp{fftw_} with
Chris@10 160 @samp{dfftw_} (or @samp{sfftw_} or @samp{lfftw_}, depending upon the
Chris@10 161 precision).
Chris@10 162
Chris@10 163 In FFTW 3, we have begun recommending that you always declare the type
Chris@10 164 used to store plans as @code{integer*8}. (Too many people didn't notice
Chris@10 165 our instruction to switch from @code{integer} to @code{integer*8} for
Chris@10 166 64-bit machines.)
Chris@10 167
Chris@10 168 In FFTW 3, we provide a @code{fftw3.f} ``header file'' to include in
Chris@10 169 your code (and which is officially installed on Unix systems). (In FFTW
Chris@10 170 2, we supplied a @code{fftw_f77.i} file, but it was not installed.)
Chris@10 171
Chris@10 172 Otherwise, the C-Fortran interface relationship is much the same as it
Chris@10 173 was before (e.g. return values become initial parameters, and
Chris@10 174 multi-dimensional arrays are in column-major order). Unlike FFTW 2, we
Chris@10 175 do provide some support for wisdom import/export in Fortran
Chris@10 176 (@pxref{Wisdom of Fortran?}).
Chris@10 177
Chris@10 178 @heading Threads
Chris@10 179
Chris@10 180 As in FFTW 2, only the execution routines are thread-safe. All planner
Chris@10 181 routines, etcetera, should be called by only a single thread at a time
Chris@10 182 (@pxref{Thread safety}). @emph{Unlike} FFTW 2, there is no special
Chris@10 183 @code{FFTW_THREADSAFE} flag for the planner to allow a given plan to be
Chris@10 184 usable by multiple threads in parallel; this is now the case by default.
Chris@10 185
Chris@10 186 The multi-threaded version of FFTW 2 required you to pass the number of
Chris@10 187 threads each time you execute the transform. The number of threads is
Chris@10 188 now stored in the plan, and is specified before the planner is called by
Chris@10 189 @code{fftw_plan_with_nthreads}. The threads initialization routine used
Chris@10 190 to be called @code{fftw_threads_init} and would return zero on success;
Chris@10 191 the new routine is called @code{fftw_init_threads} and returns zero on
Chris@10 192 failure. @xref{Multi-threaded FFTW}.
Chris@10 193
Chris@10 194 There is no separate threads header file in FFTW 3; all the function
Chris@10 195 prototypes are in @code{<fftw3.h>}. However, you still have to link to
Chris@10 196 a separate library (@code{-lfftw3_threads -lfftw3 -lm} on Unix), as well as
Chris@10 197 to the threading library (e.g. POSIX threads on Unix).
Chris@10 198