In this chapter we document the parallel FFTW routines for parallel systems supporting the MPI message-passing interface. Unlike the shared-memory threads described in the previous chapter, MPI allows you to use distributed-memory parallelism, where each CPU has its own separate memory, and which can scale up to clusters of many thousands of processors. This capability comes at a price, however: each process only stores a portion of the data to be transformed, which means that the data structures and programming-interface are quite different from the serial or threads versions of FFTW.

Distributed-memory parallelism is especially useful when you are transforming arrays so large that they do not fit into the memory of a single processor. The storage per-process required by FFTW’s MPI routines is proportional to the total array size divided by the number of processes. Conversely, distributed-memory parallelism can easily pose an unacceptably high communications overhead for small problems; the threshold problem size for which parallelism becomes advantageous will depend on the precise problem you are interested in, your hardware, and your MPI implementation.

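To make the storage claim above concrete, here is a minimal sketch in plain C (no MPI calls) of how per-process storage shrinks as processes are added. It assumes a simple block split of an n0 x n1 double-complex array along its first dimension, with blocks of ceil(n0 / nprocs) rows. The names `block_range`, `block_rows`, and `local_bytes` are hypothetical helpers for illustration, not FFTW functions, and the figures ignore any extra workspace a real transform may need; in a real program the local sizes should come from the library's own routines, as covered under MPI Data Distribution.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper type, not part of FFTW: the slab of rows that one
   process would own under a simple block split of the first dimension. */
typedef struct {
    ptrdiff_t local_n0;     /* number of rows stored on this process */
    ptrdiff_t local_start;  /* index of the first row stored here */
} block_range;

/* Rows of an n0-row array owned by `rank` when the rows are split into
   blocks of ceil(n0 / nprocs). This is an illustrative assumption only. */
block_range block_rows(ptrdiff_t n0, int nprocs, int rank)
{
    assert(n0 >= 0 && nprocs > 0 && rank >= 0 && rank < nprocs);

    ptrdiff_t block = (n0 + nprocs - 1) / nprocs;  /* ceil(n0 / nprocs) */
    ptrdiff_t start = block * rank;
    if (start > n0) start = n0;                    /* trailing ranks may be empty */
    ptrdiff_t end = start + block;
    if (end > n0) end = n0;

    block_range r = { end - start, start };
    return r;
}

/* Bytes of array data held by one process, for double-complex elements
   (two doubles each) and n1 elements per row. */
size_t local_bytes(block_range r, ptrdiff_t n1)
{
    return (size_t)r.local_n0 * (size_t)n1 * 2 * sizeof(double);
}
```

Under this assumed split, n0 = 10 rows over 4 processes gives 3, 3, 3, and 1 rows, so no process holds more than ceil(n0 / nprocs) rows. When there are more processes than rows, some processes hold no data at all.
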
A note on terminology: in MPI, you divide the data among a set of “processes” which each run in their own memory address space. Generally, each process runs on a different physical processor, but this is not required. A set of processes in MPI is described by an opaque data structure called a “communicator,” the most common of which is the predefined communicator MPI_COMM_WORLD which refers to all processes. For more information on these and other concepts common to all MPI programs, we refer the reader to the documentation at the MPI home page.

We assume in this chapter that the reader is familiar with the usage of the serial (uniprocessor) FFTW, and focus only on the concepts new to the MPI interface.

• FFTW MPI Installation
• Linking and Initializing MPI FFTW
• 2d MPI example
• MPI Data Distribution
• Multi-dimensional MPI DFTs of Real Data
• Other Multi-dimensional Real-data MPI Transforms
• FFTW MPI Transposes
• FFTW MPI Wisdom
• Avoiding MPI Deadlocks
• FFTW MPI Performance Tips
• Combining MPI and Threads
• FFTW MPI Reference
• FFTW MPI Fortran Interface