In this chapter we document the parallel FFTW routines for parallel systems supporting the MPI message-passing interface. Unlike the shared-memory threads described in the previous chapter, MPI allows you to use distributed-memory parallelism, where each CPU has its own separate memory, and which can scale up to clusters of many thousands of processors. This capability comes at a price, however: each process only stores a portion of the data to be transformed, which means that the data structures and programming interface are quite different from the serial or threads versions of FFTW.

Distributed-memory parallelism is especially useful when you are transforming arrays so large that they do not fit into the memory of a single processor. The storage per process required by FFTW's MPI routines is proportional to the total array size divided by the number of processes. Conversely, distributed-memory parallelism can easily pose an unacceptably high communications overhead for small problems; the threshold problem size for which parallelism becomes advantageous will depend on the precise problem you are interested in, your hardware, and your MPI implementation.

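To make the storage claim concrete, the following is a minimal sketch in plain C (no MPI calls) of how the rows of an n0 x n1 array might be divided among processes. It assumes a simple block partition along the first dimension, with the block size rounded up; the helper names block_size, local_start, local_rows, and local_elements are hypothetical and are not part of the FFTW API. The distribution actually used in a given transform is reported by the library itself and may differ from this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helpers for illustration only; these are NOT FFTW functions.
   Assumption: rows are split into contiguous blocks along the first
   dimension, with the block size rounded up. */

/* Rows per block when n0 rows are shared among nprocs processes. */
ptrdiff_t block_size(ptrdiff_t n0, int nprocs)
{
    assert(n0 >= 0 && nprocs > 0);
    return (n0 + nprocs - 1) / nprocs;
}

/* Index of the first row owned by `rank`, clamped to n0 so that
   trailing processes with no data get an empty range. */
ptrdiff_t local_start(ptrdiff_t n0, int nprocs, int rank)
{
    ptrdiff_t start = block_size(n0, nprocs) * rank;
    return start < n0 ? start : n0;
}

/* Number of rows owned by `rank` (possibly zero). */
ptrdiff_t local_rows(ptrdiff_t n0, int nprocs, int rank)
{
    ptrdiff_t block = block_size(n0, nprocs);
    ptrdiff_t remaining = n0 - local_start(n0, nprocs, rank);
    return remaining < block ? remaining : block;
}

/* Number of array elements stored by `rank` for an n0 x n1 array. */
ptrdiff_t local_elements(ptrdiff_t n0, ptrdiff_t n1, int nprocs, int rank)
{
    return local_rows(n0, nprocs, rank) * n1;
}
```

Under this assumed partition, 10 rows on 4 processes give ranks 0 through 3 a share of 3, 3, 3, and 1 rows, so no process stores more than roughly the total size divided by the number of processes. When there are more processes than rows, some processes hold no data at all.
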
A note on terminology: in MPI, you divide the data among a set of “processes” which each run in their own memory address space. Generally, each process runs on a different physical processor, but this is not required. A set of processes in MPI is described by an opaque data structure called a “communicator,” the most common of which is the predefined communicator MPI_COMM_WORLD, which refers to all processes. For more information on these and other concepts common to all MPI programs, we refer the reader to the documentation at the MPI home page.

We assume in this chapter that the reader is familiar with the usage of the serial (uniprocessor) FFTW, and focus only on the concepts new to the MPI interface.