6 Distributed-memory FFTW with MPI

In this chapter we document the parallel FFTW routines for parallel hardware supporting the MPI message-passing interface. Unlike the shared-memory threads described in the previous chapter, MPI allows you to use distributed-memory parallelism, where each CPU has its own separate memory, and which can scale up to clusters of many thousands of processors. This capability comes at a price, however: each process stores only a portion of the data to be transformed, which means that the data structures and programming interface are quite different from the serial or threads versions of FFTW.

Distributed-memory parallelism is especially useful when you are transforming arrays so large that they do not fit into the memory of a single processor. The storage required per process by FFTW's MPI routines is proportional to the total array size divided by the number of processes. Conversely, distributed-memory parallelism can easily impose an unacceptably high communications overhead for small problems; the threshold problem size for which parallelism becomes advantageous will depend on the precise problem you are interested in, your hardware, and your MPI implementation.
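For concreteness, the following is a rough sketch of a distributed two-dimensional complex DFT using the fftw3-mpi interface; routine names such as fftw_mpi_local_size_2d and fftw_mpi_plan_dft_2d are taken from that interface (consult the rest of this chapter for their precise semantics), and the array dimensions and initialization values are arbitrary illustrations. The key point is that each process allocates and initializes only the block of rows that FFTW assigns to it:

     /* Sketch only: each process transforms its own slab of an N0 x N1
        complex array distributed over MPI_COMM_WORLD. */
     #include <mpi.h>
     #include <fftw3-mpi.h>

     int main(int argc, char **argv)
     {
         const ptrdiff_t N0 = 256, N1 = 256;   /* overall 2d array dimensions */
         fftw_plan plan;
         fftw_complex *data;
         ptrdiff_t alloc_local, local_n0, local_0_start, i, j;

         MPI_Init(&argc, &argv);
         fftw_mpi_init();

         /* Ask FFTW how much of the N0 x N1 array this process stores:
            local_n0 rows beginning at row local_0_start, requiring
            alloc_local complex numbers of storage. */
         alloc_local = fftw_mpi_local_size_2d(N0, N1, MPI_COMM_WORLD,
                                              &local_n0, &local_0_start);
         data = fftw_malloc(sizeof(fftw_complex) * alloc_local);

         /* Plan an in-place forward DFT of the whole distributed array. */
         plan = fftw_mpi_plan_dft_2d(N0, N1, data, data, MPI_COMM_WORLD,
                                     FFTW_FORWARD, FFTW_ESTIMATE);

         /* Initialize only the locally stored rows. */
         for (i = 0; i < local_n0; ++i)
             for (j = 0; j < N1; ++j) {
                 data[i*N1 + j][0] = local_0_start + i;  /* real part */
                 data[i*N1 + j][1] = j;                  /* imaginary part */
             }

         fftw_execute(plan);

         fftw_destroy_plan(plan);
         fftw_free(data);
         MPI_Finalize();
         return 0;
     }

Note that no process ever allocates the full N0-by-N1 array; its per-process storage is roughly the total size divided by the number of processes, as described above.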

A note on terminology: in MPI, you divide the data among a set of “processes” which each run in their own memory address space. Generally, each process runs on a different physical processor, but this is not required. A group of processes in MPI is described by an opaque data structure called a “communicator,” the most common of which is the predefined communicator MPI_COMM_WORLD which refers to all processes. For more information on these and other concepts common to all MPI programs, we refer the reader to the documentation at the MPI home page.

We assume in this chapter that the reader is familiar with the usage of the serial (uniprocessor) FFTW, and focus only on the concepts new to the MPI interface.
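As a concrete illustration of the terminology above (and independent of FFTW), a minimal MPI program simply asks the MPI_COMM_WORLD communicator for this process's rank and for the total number of processes:

     /* Minimal MPI example: report this process's rank within
        MPI_COMM_WORLD and the total number of processes. */
     #include <stdio.h>
     #include <mpi.h>

     int main(int argc, char **argv)
     {
         int rank, size;

         MPI_Init(&argc, &argv);

         MPI_Comm_rank(MPI_COMM_WORLD, &rank); /* this process's index: 0 .. size-1 */
         MPI_Comm_size(MPI_COMM_WORLD, &size); /* total number of processes */

         printf("I am process %d of %d\n", rank, size);

         MPI_Finalize();
         return 0;
     }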
