Multi-threaded FFTW
In this chapter we document the parallel FFTW routines for shared-memory parallel hardware. These routines, which support parallel one- and multi-dimensional transforms of both real and complex data, are the easiest way to take advantage of multiple processors with FFTW. They work just like the corresponding uniprocessor transform routines, except that you have an extra initialization routine to call, and there is a routine to set the number of threads to employ. Any program that uses the uniprocessor FFTW can therefore be trivially modified to use the multi-threaded FFTW.
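As a brief illustration of how little changes relative to the uniprocessor interface, here is a minimal sketch of a threaded program. It assumes FFTW 3 built with thread support; the transform size (1024) and thread count (4) are illustrative choices, and the routines themselves are documented in the sections listed below.

/* Minimal sketch: the only additions relative to a single-threaded program
 * are fftw_init_threads(), fftw_plan_with_nthreads(), and
 * fftw_cleanup_threads().
 * Link with: -lfftw3_threads -lfftw3 -lm (plus -lpthread on POSIX systems). */
#include <fftw3.h>

int main(void)
{
    const int N = 1024;

    fftw_init_threads();          /* one-time initialization of the threads system */
    fftw_plan_with_nthreads(4);   /* subsequent plans may use up to 4 threads */

    fftw_complex *in  = fftw_malloc(sizeof(fftw_complex) * N);
    fftw_complex *out = fftw_malloc(sizeof(fftw_complex) * N);

    fftw_plan p = fftw_plan_dft_1d(N, in, out, FFTW_FORWARD, FFTW_ESTIMATE);
    /* ... fill in[] with data ... */
    fftw_execute(p);              /* the transform itself runs in parallel */

    fftw_destroy_plan(p);
    fftw_free(in);
    fftw_free(out);
    fftw_cleanup_threads();       /* release threads-related resources */
    return 0;
}

Everything else (planning, execution, cleanup of plans and arrays) is exactly as in the uniprocessor interface.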
A shared-memory machine is one in which all CPUs can directly access the same main memory, and such machines are now common due to the ubiquity of multi-core CPUs. FFTW's multi-threading support allows you to utilize these additional CPUs transparently from a single program. However, this does not necessarily translate into performance gains: when multiple threads/CPUs are employed, there is an overhead required for synchronization that may outweigh the computational parallelism. Therefore, you can only benefit from threads if your problem is sufficiently large.
• Installation and Supported Hardware/Software
• Usage of Multi-threaded FFTW
• How Many Threads to Use?
• Thread safety