cannam@127:
cannam@127: The innovation (if it can be so called) in FFTW consists in having a
cannam@127: variety of composable solvers, representing different FFT algorithms and implementation strategies, whose combination into a
cannam@127: particular plan for a given size can be determined at runtime according to the characteristics of your machine/compiler.
cannam@127: This peculiar software architecture allows FFTW to adapt itself to
cannam@127: almost any machine.
cannam@127:
cannam@127: For more details (albeit somewhat outdated), see the paper "FFTW:
cannam@127: An Adaptive Software Architecture for the FFT", by M. Frigo and
cannam@127: S. G. Johnson, Proc. ICASSP 3, 1381 (1998), also available at the FFTW web page.
cannam@127:
cannam@127:
cannam@127: This is a complex question, and there is no simple answer. In fact,
cannam@127: the authors do not fully know the answer, either. In addition to many
cannam@127: small performance hacks throughout FFTW, there are three general
cannam@127: reasons for FFTW's speed.
cannam@127:
cannam@127:
FFTW uses a variety of FFT algorithms and implementation styles
cannam@127: that can be arbitrarily composed to adapt itself to
cannam@127: a machine. See Q4.1 `How does FFTW work?'.
cannam@127:
FFTW uses a code generator to produce highly-optimized
cannam@127: routines for computing small transforms.
cannam@127:
cannam@127:
FFTW uses explicit divide-and-conquer to take advantage
cannam@127: of the memory hierarchy.
cannam@127:
cannam@127: For more details (albeit somewhat outdated), see the paper "FFTW:
cannam@127: An Adaptive Software Architecture for the FFT", by M. Frigo and
cannam@127: S. G. Johnson, Proc. ICASSP 3, 1381 (1998), available along with other references at
cannam@127: the FFTW web page.
cannam@127: Next: Known bugs.
cannam@127: Back: Using FFTW.
cannam@127: Return to contents.