Next: Calling FFTW from Modern Fortran, Previous: Multi-threaded FFTW, Up: Top   [Contents][Index]

6 Distributed-memory FFTW with MPI
In this chapter we document the parallel FFTW routines for parallel systems supporting the MPI message-passing interface.  Unlike the shared-memory threads described in the previous chapter, MPI allows you to use distributed-memory parallelism, where each CPU has its own separate memory, and which can scale up to clusters of many thousands of processors.  This capability comes at a price, however: each process only stores a portion of the data to be transformed, which means that the data structures and programming interface are quite different from the serial or threads versions of FFTW.
Distributed-memory parallelism is especially useful when you are transforming arrays so large that they do not fit into the memory of a single processor.  The storage per process required by FFTW’s MPI routines is proportional to the total array size divided by the number of processes.  Conversely, distributed-memory parallelism can easily pose an unacceptably high communications overhead for small problems; the threshold problem size for which parallelism becomes advantageous will depend on the precise problem you are interested in, your hardware, and your MPI implementation.
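For example, the following minimal sketch (illustrative only; the problem size and printed message are not from the manual) uses fftw_mpi_local_size_2d to ask how many complex values the calling process must allocate for a hypothetical N0 × N1 array distributed over MPI_COMM_WORLD, the predefined communicator of all processes (see the terminology note below).  Each process typically ends up storing roughly N0*N1 divided by the number of processes.

  #include <fftw3-mpi.h>
  #include <stdio.h>

  int main(int argc, char **argv)
  {
      const ptrdiff_t N0 = 1024, N1 = 1024;   /* illustrative global array size */
      ptrdiff_t alloc_local, local_n0, local_0_start;
      int nprocs;

      MPI_Init(&argc, &argv);
      fftw_mpi_init();   /* call after MPI_Init, before any other FFTW MPI routine */

      /* how many complex values must *this* process allocate? */
      alloc_local = fftw_mpi_local_size_2d(N0, N1, MPI_COMM_WORLD,
                                           &local_n0, &local_0_start);

      MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
      printf("%d processes: this rank stores %ld of %ld complex values\n",
             nprocs, (long) alloc_local, (long) (N0 * N1));

      fftw_mpi_cleanup();
      MPI_Finalize();
      return 0;
  }

(Link against the MPI FFTW library in addition to the serial one, e.g. -lfftw3_mpi -lfftw3 -lm in double precision.)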
A note on terminology: in MPI, you divide the data among a set of “processes” which each run in their own memory address space.  Generally, each process runs on a different physical processor, but this is not required.  A set of processes in MPI is described by an opaque data structure called a “communicator,” the most common of which is the predefined communicator MPI_COMM_WORLD which refers to all processes.  For more information on these and other concepts common to all MPI programs, we refer the reader to the documentation at the MPI home page.
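As a quick illustration of these terms (generic MPI, nothing FFTW-specific; see the MPI documentation for details), the sketch below has each process report its rank within MPI_COMM_WORLD and the total number of processes in that communicator.

  #include <mpi.h>
  #include <stdio.h>

  int main(int argc, char **argv)
  {
      int rank, size;

      MPI_Init(&argc, &argv);

      /* MPI_COMM_WORLD is the predefined communicator containing all processes */
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* this process's index, 0..size-1 */
      MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total number of processes */

      printf("I am process %d of %d\n", rank, size);

      MPI_Finalize();
      return 0;
  }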
We assume in this chapter that the reader is familiar with the usage of the serial (uniprocessor) FFTW, and focus only on the concepts new to the MPI interface.
• FFTW MPI Installation
• Linking and Initializing MPI FFTW
• 2d MPI example
• MPI Data Distribution
• Multi-dimensional MPI DFTs of Real Data
• Other Multi-dimensional Real-data MPI Transforms
• FFTW MPI Transposes
• FFTW MPI Wisdom
• Avoiding MPI Deadlocks
• FFTW MPI Performance Tips
• Combining MPI and Threads
• FFTW MPI Reference
• FFTW MPI Fortran Interface